How to implement TinyURL

Jumei Lin

The system design of a TinyURL-like service mainly focuses on scalability and durability: if the website gets millions of requests, how do we keep it fast and reliable?


Purpose of a TinyURL-like feature:

A URL shortener ensures that you get the right message out to your audience without taking up too much space in your social posts. It also makes your content easier to share: simplified, branded URLs tell customers everything they need to know about your site.

Table of Contents

  • The functional requirements
  • The non-functional requirements
  • How it works
  • Capacity estimation: Traffic, Bandwidth, Memory, and Cache estimates
  • Load balancer
  • Database design
  • Core Features

⏩The functional requirements:

  1. An endpoint that encodes a URL into a shortened URL.
  2. An endpoint that decodes a previously shortened URL back to the original.
  3. An endpoint that redirects a shortened URL to the original URL.
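
For concreteness, here is a minimal sketch of these three endpoints using Flask. The route names, the in-memory dicts, and the placeholder key generator are illustrative assumptions; real key generation and storage are discussed later in this article.

```python
from flask import Flask, jsonify, redirect, request

app = Flask(__name__)
url_by_key: dict[str, str] = {}   # short key -> original URL
key_by_url: dict[str, str] = {}   # original URL -> short key

@app.post("/encode")
def encode():
    original = request.json["url"]
    if original not in key_by_url:        # same URL -> same short key
        key = str(len(url_by_key) + 1)    # placeholder; see key generation below
        url_by_key[key] = original
        key_by_url[original] = key
    return jsonify(short=f"https://tiny.example/{key_by_url[original]}")

@app.get("/decode")
def decode():
    key = request.args["short"].rsplit("/", 1)[-1]
    return jsonify(url=url_by_key[key])

@app.get("/<key>")
def follow(key):
    # Redirect a shortened URL to the original URL.
    return redirect(url_by_key[key], code=301)
```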

⏩The non-functional requirements:

These are a set of specifications that describe the feature’s operational capabilities and constraints and attempt to improve its functionality. They are basically the requirements that outline how well the system will operate, including things like speed, security, reliability, and data integrity.

Key questions that we can ask include:

  1. Performance and scalability. How fast does the application return a shortened URL or decode a URL? How much will this performance change with higher workloads?
  2. Portability and compatibility. Which hardware, operating systems, and browsers, along with their versions does the software run on? Does it conflict with other applications and processes within these environments?
  3. Reliability, maintainability, availability. How often does the system experience critical failures? How much time does it take to fix the issue when it arises? And how is user availability time compared to downtime?
  4. Security. How well are the system and its data protected against attacks?
  5. Localization. Is the system compatible with local specifics?
  6. Usability. How easy is it for a customer to use the system?

The non-functional requirements of this feature would be:

  1. Different users should receive the same shortened URL for a URL that has already been shortened.
  2. The shortened URL should not be duplicated, i.e., each URL should map to exactly one unique shortened URL.
  3. Shortened links should not be predictable in any manner.
  4. The length of the entire shortened URL should be less than 40 characters.
  5. Once a URL is submitted, encoding, decoding, or redirection should complete within 3 seconds.
  6. The data retention period of a shortened URL is one year.
  7. The system should be highly available, with only a minimal, acceptable amount of downtime in any 24-hour period.

⏩How it works:

Option 1: Instead of encoding the URL itself, we encode a unique identifier, such as a UUID that is generated automatically and independently on any node, or we use an MD5 hash of the URL to avoid duplicates and collisions.

In this case, a UUID gives us a user-readable alphanumeric string that appears random while avoiding the collisions inherent in other random schemes. A GUID contains 16 bytes, or 128 bits, which translates to roughly 22 characters in a full Base64 encoding.

The downside is the hefty key length if we roll with a GUID; implementing our own globally unique byte stream instead would be error-prone.
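
A minimal sketch of both variants of Option 1, using Python's standard uuid, hashlib, and base64 modules (function names are illustrative):

```python
import base64
import hashlib
import uuid

def key_from_uuid() -> str:
    # 16 random bytes -> 22 Base64 characters once the "==" padding is stripped.
    return base64.urlsafe_b64encode(uuid.uuid4().bytes).rstrip(b"=").decode()

def key_from_md5(url: str, length: int = 7) -> str:
    # Hashing the URL means the same input always yields the same key.
    # Keeping only a prefix shortens the key but raises the collision risk.
    digest = hashlib.md5(url.encode()).digest()
    return base64.urlsafe_b64encode(digest).decode()[:length]

print(key_from_uuid())                                   # 22-character key
print(key_from_md5("https://example.com/some/long/url")) # 7-character key
```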

Option 2: Use ZooKeeper, a distributed coordination service, to manage a cluster of ID generators, giving each server its own range of unique identifiers so that no duplicates are produced.

➡️Step 1: users connect to a server chosen by the load-balancing algorithm, which could be least bandwidth or least response time.

➡️Step 2: the application on the server checks the cache first to speed up the read response and see if the URL is already there. If the URL is not in the cache, the application falls back to the database.

Whenever we add a new server, or a server runs out of its range of IDs for generating unique shortened URLs, it contacts ZooKeeper.

ZooKeeper is a centralized service that coordinates and manages configuration information.

Also, to avoid ZooKeeper being a single point of failure, we run multiple instances with a failover mechanism, so that if one instance goes down, servers connect to another.

If a new server is added, ZooKeeper gives it an unused range of IDs. And if a server's range runs out, it can go back to ZooKeeper and ask for a new unused range.

However, if we generate URLs sequentially, the keys become predictable, which is a security risk. So we can append another 10-15 random bits after the sequential number to increase entropy, as in the sketch below.
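
Putting Option 2 together, here is a rough sketch of a per-server generator. The fetch_range callback is a hypothetical stand-in for leasing a range from ZooKeeper, and the 12 extra random bits implement the entropy trick above:

```python
import random
import threading

class RangeKeyGenerator:
    """Hands out IDs from a range leased from a coordinator such as ZooKeeper.
    `fetch_range` is a hypothetical callback standing in for the real lease."""

    def __init__(self, fetch_range):
        self._fetch_range = fetch_range
        self._lock = threading.Lock()
        self._next, self._end = fetch_range()

    def next_id(self) -> int:
        with self._lock:
            if self._next >= self._end:                      # range exhausted:
                self._next, self._end = self._fetch_range()  # lease a new one
            sequential, self._next = self._next, self._next + 1
        # Append 12 random bits so consecutive keys are not predictable.
        return (sequential << 12) | random.getrandbits(12)

# Fake coordinator that leases ranges of 1,000 IDs, for demonstration only.
def fake_fetch_range(_state={"next": 0}):
    start = _state["next"]
    _state["next"] += 1000
    return start, start + 1000

gen = RangeKeyGenerator(fake_fetch_range)
print(gen.next_id(), gen.next_id())  # unique IDs that do not look consecutive
```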

Capacity estimation:

This application is read-heavy. We make two assumptions:

  1. there will be many more redirection requests than new URL-shortening requests.
  2. the ratio between reads and writes is 100:1.

Traffic estimates:

Assume 500 million new URL-shortening requests per month. That works out to roughly 200 write requests per second (500 million / (30 days * 24 hours * 3,600 seconds) ≈ 193), and with the 100:1 read-to-write ratio, about 20K redirection requests per second.

Storage estimates:

We expect 500 million new URLs per month, so the total number of objects to store will be 6 billion per year:

  • 500 million * 12 months = 6 billion per year

Assuming each stored object is approximately 1500 bytes, we will need 9 TB of storage per year:

  • 6 billion * 1500 bytes = 9 TB

Each object is actually about 1524 bytes; we round to 1500 bytes to estimate the capacity:

Id (16 bytes) + OriginalUrl (1500 bytes) + ExpirationDate (8 bytes) = 1524 bytes

Bandwidth estimates:

  1. Write requests: we expect 200 new URLs every second, so the total incoming data for our service will be 300 KB per second:
  • 200 * 1500 bytes = 300 KB/s
  2. Read requests: we expect ~20K URL redirections every second, so the total outgoing data for our service will be ~30 MB per second:
  • 20K * 1500 bytes = ~30 MB/s

Memory estimates

Following the 80/20 rule, we assume that 20% of the URLs generate 80% of the traffic, so we cache those frequently queried URLs.

  • We have 20K redirection requests per second, and we will be getting 1.7 billion requests per day: 20K * 3600 seconds * 24 hours = ~1.7 billion
  • To cache 20% of these requests, we will need 510GB of memory: 0.2 * 1.7 billion * 1500 bytes = ~510GB

If there are many duplicate requests (of the same URL), the actual memory usage will be less than 510GB.
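
As a sanity check, all of the numbers above can be reproduced with a few lines of arithmetic (every input is one of the stated assumptions):

```python
# Reproducing the capacity estimates above.
new_urls_per_month = 500_000_000
object_size = 1500                                        # bytes, rounded from 1524

writes_per_sec = new_urls_per_month / (30 * 24 * 3600)    # ~193, rounded to 200
reads_per_sec = 200 * 100                                 # 100:1 ratio -> 20K

storage_per_year = new_urls_per_month * 12 * object_size  # 9e12 bytes = 9 TB
write_bandwidth = 200 * object_size                       # 300,000 B/s = 300 KB/s
read_bandwidth = 20_000 * object_size                     # 30,000,000 B/s = 30 MB/s

requests_per_day = 20_000 * 3600 * 24                     # ~1.7 billion
cache_memory = 0.2 * requests_per_day * object_size       # ~518 GB, quoted as ~510 GB
```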

Cache:

In order to speed up read requests, we can cache frequently accessed URLs. When a heavy load of requests hits a single link, such as a popular meme, we can keep the redirect URL in memory and process the redirect quickly.

  • Solution: We can use a solution like Memcached to keep full URLs with respective hashes.
  • How much cache memory should we have: we start with 20% of the daily traffic and adjust based on usage patterns. Based on our previous estimations, we will need 510GB of memory to cache 20% of the daily traffic.
  • Cache eviction policy: we can go with a Least Recently Used (LRU) policy, which evicts the item that has gone unused the longest.
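
For illustration, the LRU policy itself fits in a few lines of Python; in production a cache like Memcached or Redis plays this role:

```python
from collections import OrderedDict

class LRUCache:
    """A minimal in-process LRU cache keyed by short URL hash."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data: OrderedDict[str, str] = OrderedDict()

    def get(self, key: str) -> str | None:
        if key not in self._data:
            return None
        self._data.move_to_end(key)         # mark as most recently used
        return self._data[key]

    def put(self, key: str, value: str) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used item
```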

⏩Load balancer (LB)

Initially, we can use a Round Robin approach that distributes incoming requests equally among backend servers.

Round‑robin load balancing is one of the simplest methods for distributing client requests across a group of servers. Going down the list of servers in the group, the round‑robin load balancer forwards a client request to each server in turn.

However, the Round Robin approach does not take server load into consideration, so the LB would still forward requests to an overloaded or slow server.

If the server becomes overloaded with the Round Robin approach, we can adjust and try dynamic load balancing algorithms such as Least connection that checks which servers have the fewest connections open at the time and sends traffic to those servers. This assumes all connections require roughly equal processing power.
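
A toy sketch contrasting the two strategies; in a real load balancer the connection counts would come from health checks rather than a local dict:

```python
import itertools

class LoadBalancer:
    def __init__(self, servers):
        self._servers = list(servers)
        self._rr = itertools.cycle(self._servers)       # round-robin iterator
        self.open_connections = {s: 0 for s in self._servers}

    def pick_round_robin(self):
        # Static: next server in turn, regardless of its current load.
        return next(self._rr)

    def pick_least_connections(self):
        # Dynamic: the server with the fewest open connections right now.
        return min(self._servers, key=self.open_connections.get)

lb = LoadBalancer(["app-1", "app-2", "app-3"])
lb.open_connections["app-1"] = 7        # pretend app-1 is busy
print(lb.pick_round_robin())            # app-1: round robin ignores load
print(lb.pick_least_connections())      # app-2: fewest open connections
```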

⏩Database design:

Key requirements:

  • This application is read-heavy.
  • There will be a one-to-one relationship between a given URL and its shortened URL, and there are no relationships between records other than tracking which user created which short link.
  • The service needs to store billions of records.
  • Each object is small (on the order of 1 KB).

Tables:

We need two tables including:

  1. The ShortenedUrls table stores all shortened URLs, with a scheduled job that purges data older than one year.
  2. The Users table may not be needed unless we want to save user details for marketing purposes, or to stop malicious users from consuming all the URLs by rate-limiting them through an API key. A sketch of both record shapes follows.
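
For concreteness, records in these two tables might look something like the following; all field names are assumptions rather than a fixed schema:

```python
# Illustrative record shapes for the two tables (all field names assumed).
shortened_url = {
    "id": "a8F3kQ1",                       # short key, primary key
    "original_url": "https://example.com/some/very/long/path",
    "created_at": "2023-02-07T00:00:00Z",
    "expires_at": "2024-02-07T00:00:00Z",  # purged by the scheduled cleanup job
    "user_id": "u-42",                     # optional back-reference to Users
}

user = {
    "id": "u-42",
    "api_key": "k-9f2c0b",                 # used to rate-limit malicious clients
    "created_at": "2023-02-07T00:00:00Z",
}
```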

Database types:

  • NoSQL databases such as MongoDB, Redis, HBase, Neo4j
  • Relational databases such as Oracle, SQL server

Given that the requirements do not call for relational queries, and the system needs to keep scaling as time goes on, we will go with NoSQL.

NoSQL is known for three benefits: low cost, scalability, and a flexible data structure. The current market has four main types of NoSQL databases: document stores, key-value stores, column stores, and graph stores.

Data partitioning and replication:

There are two primary methods to scale the database and distribute the data: hash partitioning and range partitioning.

Firstly, data needs to be distributed using a partitioning key in the data. The partitioning key can be the primary key or any other unique key. Once we have identified the partitioning key, we need to decide how to assign a key to a given shard.

➡️One way to do this would be to take a key and apply a hash function. Based on the hash bucket and the number of shards to map keys into, the key would be assigned to a shard. There’s a bit of nuance here in the sense that an assignment scheme based on a modulo by the number of nodes currently in the cluster will result in a lot of data movement when nodes join or leave the cluster (since all of the assignments need to be recalculated). This is addressed by something called consistent hashing.
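
Here is a compact sketch of consistent hashing with virtual nodes; the MD5-based hash and the vnode count are illustrative choices, not requirements:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Each node owns the arc of the ring up to its position; adding or
    removing a node only remaps the keys on the neighbouring arc."""

    def __init__(self, nodes, vnodes=100):
        # Place `vnodes` virtual points per node around the ring.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes for i in range(vnodes)
        )

    @staticmethod
    def _hash(value: str) -> int:
        return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")

    def node_for(self, key: str) -> str:
        positions = [h for h, _ in self._ring]
        i = bisect.bisect(positions, self._hash(key)) % len(self._ring)
        return self._ring[i][1]

ring = ConsistentHashRing(["db-1", "db-2", "db-3"])
print(ring.node_for("a8F3kQ1"))   # the same key always maps to the same node
```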

➡️Another way to do assignments would be to take the entire keyspace and break it up into a set of ranges. Each range corresponds to a shard and is assigned to a given node in the cluster. Given a key, you would then do a binary search to find out the node it is meant to be assigned to. A range partition doesn’t have the churn issue that a naive hashing scheme would have. When a node joins, shards from existing nodes will migrate onto the new node. When a node leaves, the shards on the node will migrate to one or more of the existing nodes.
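
The range lookup itself is just a binary search over the range boundaries. A minimal sketch with made-up boundaries:

```python
import bisect

# Ranges are defined by their upper bounds; each maps to a shard (values assumed).
upper_bounds = ["g", "n", "t", "\uffff"]   # four key ranges over the keyspace
shards = ["shard-1", "shard-2", "shard-3", "shard-4"]

def shard_for(key: str) -> str:
    # Binary search for the first range whose upper bound is >= the key.
    return shards[bisect.bisect_left(upper_bounds, key)]

print(shard_for("abc123"))   # shard-1
print(shard_for("zzz"))      # shard-4
```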

A hash-based assignment can be built in a decentralized manner, where all nodes are peers of each other and there are no special master-slave relationships between nodes.

On the other hand, a range-based partitioning scheme requires that range assignments be kept in some special service. Hence, databases that do range-based partitioning, such as Bigtable and HBase, tend not to be decentralized peer-to-peer systems; instead they have nodes with special roles and responsibilities.

⏩Core Features:

Algorithms: The goal is to generate a unique and short key for a given URL. Hence, the algorithm options could include but are not limited to:

  1. A base10 system ([0-9]): with 6 digits we will have 1 million possible shortened URLs: 10^6 = 1 million
  2. A base62 system ([A-Z, a-z, 0-9]): we will have 56.8 billion possible shortened URLs: 62^6 = 56.8 billion
  3. A base64 system ([A-Z, a-z, 0-9, +, /]): we will have 68.7 billion possible shortened URLs: 64^6 = 68.7 billion

If we have 1,000 requests per second and go with 6-character keys in the base64 system, we will run out of keys in 2.17 years.

If we instead go with 7-character keys, we will run out of keys in roughly 139 years:

  1. A base62 system ([A-Z, a-z, 0-9]): we will have 3.52 trillion possible shortened URLs: 62^7 = 3.52 trillion
  2. A base64 system ([A-Z, a-z, 0-9, +, /]): we will have 4.39 trillion possible shortened URLs: 64^7 = 4.39 trillion
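
For reference, encoding a numeric ID into base62 takes only a few lines; the alphabet below covers [0-9, A-Z, a-z]:

```python
import string

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase  # 62 chars

def encode_base62(n: int) -> str:
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))

# A 7-character key space holds 62**7 ≈ 3.52 trillion unique short URLs.
print(encode_base62(125))                 # '21'
print(encode_base62(62**7 - 1))           # 'zzzzzzz', the largest 7-char key
```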
