URL Shortener API with NodeJS and MongoDB
What’s the best way to start blogging other than looking for interesting problems and trying to find solutions for them? After Googling design problems that I could read up on and build a solution for, I decided to tackle the famous URL shortening problem (yes, I had never attempted it before).

URL shortening is a mechanism for turning a Uniform Resource Locator (URL) into something easier to remember and share. The basic challenge is to make the result noticeably short and, of course, unique. My very first thought was to hash the URL, but once I looked at how hashing functions work, I realized that this isn’t the best way to produce a short and unique URL, mainly because of the following:
- Hashing functions can produce collisions (different URLs mapping to the same hash, so uniqueness is not guaranteed)
- Hash lengths are usually not short enough
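To make the length point concrete, here is a minimal sketch using Node's built-in crypto module. The hashUrl helper and the sample URL are my own illustration and are not part of the API built later.

```javascript
const crypto = require('crypto');

// Hypothetical helper: hex-encoded SHA-256 digest of a URL
function hashUrl(url) {
  return crypto.createHash('sha256').update(url).digest('hex');
}

const digest = hashUrl('https://www.google.com');

// A SHA-256 digest is 32 bytes, so its hex form is always 64 characters,
// far longer than a 7-character code such as 1bdDlXc.
console.log(digest.length); // 64

// Truncating the digest makes it short, but then different URLs
// can end up sharing the same prefix.
console.log(digest.slice(0, 7).length); // 7
```

Truncation is the usual way to shorten a hash, and it is exactly what makes collisions more likely, which is why the rest of the article goes a different route.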
I’m talking here about a solution similar to Bitly or TinyURL. Bitly, for example, is a very sophisticated service that specializes in managing URLs and analyzing them. What we want is an API that takes in a URL and returns a shortened one, or takes in a shortened URL and returns the original one.
So, what would the requirements be?
- As a user, I want to be able to substitute a URL with another one that is shorter than the original
- As a user, I want to submit a shortened URL and get back the original one
- As a user, I want to have multiple short URLs that map to the same original URL
The third requirement implies that every short URL must be unique, even when the same original URL is submitted more than once. Let’s get to the meat of this problem.
On Bitly, google.com becomes http://bit.ly/1bdDlXc. What is happening here?
First of all, the domain name is relatively short. Second, the path of the URL (/1bdDlXc) looks like a resource ID you can use to query the database for the URL you initially submitted. That’s correct. But it looks like a randomly generated string. How do we confirm uniqueness in this case?
This article assumes you have NodeJS and MongoDB installed on your machine
Algorithm
Inspired by this SO answer, encryption and decryption methods are required to ensure a direct mapping of a resource ID to a URL and vice versa (strictly speaking this is encoding and decoding, since no secret key is involved). The encryption process would produce 1bdDlXc and the decryption would return the URL that was submitted or the identifier for that URL (google.com or ID 256).
The encryption process returns a string of characters drawn from a fixed set. Which characters are allowed, and how many of them, determine what the codes look like and how short they can be. If we store the URLs in a database and allow multiple identical ones to exist, then the best candidate for encryption is a generated field that is unique and changes with every entry, such as a database ID.
An incremental ID works in this scenario. For example, if we tried inserting the 120th URL, the ID would be 120 and in order to transform this number to a set of characters that is unique we would require a defined set like the following:
23456789bcdfghjkmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ-_
The length of this set is 51. So the result would consist only of the characters declared, and this would require converting the ID to Base 51. The general formula to convert to Base X (encryption) is the following:
Let X be the base, SET be the character set and ID be the number to convert
s = empty string
while ID is greater than 0:
    index = ID % X
    s = SET[index] + s
    ID = floor(ID / X)
print s --> converted ID
Decryption is exactly the reverse of the above:
Given a string S
ID = 0
for every character c in S:
    ID = ID * X + (index of c in SET)
print ID --> original ID
The ID returned from decryption would be used to query the database for the original URL.
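The two procedures translate almost line for line into JavaScript. This is a minimal sketch rather than the exact code from the repo: the function names encode and decode and the input checks are my own assumptions.

```javascript
const SET = '23456789bcdfghjkmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ-_';
const BASE = SET.length; // 51

// Convert a positive integer ID to its Base 51 string
function encode(id) {
  if (!Number.isInteger(id) || id < 1) {
    throw new RangeError('id must be a positive integer');
  }
  let s = '';
  while (id > 0) {
    s = SET[id % BASE] + s;      // prepend the digit for the current remainder
    id = Math.floor(id / BASE);  // move on to the next digit
  }
  return s;
}

// Convert a Base 51 string back to the numeric ID
function decode(code) {
  let id = 0;
  for (const c of code) {
    const index = SET.indexOf(c);
    if (index === -1) {
      throw new RangeError(`invalid character: ${c}`);
    }
    id = id * BASE + index;
  }
  return id;
}

// 120 = 2 * 51 + 18, so the code is SET[2] followed by SET[18]
console.log(encode(120));  // 4p
console.log(decode('4p')); // 120
```

Because every ID maps to exactly one string and back, uniqueness of the short code follows directly from uniqueness of the ID.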
Now that we have the algorithm, let’s build our API!
Implementation
There will be only two HTTP methods (or endpoints) for our API:
- POST /: Encrypting the URL and storing it in the database
- GET /:code: Decrypting the Base X string (code) to an ID and querying the URL
Usually, the GET method would redirect us to the URL immediately, but for the sake of this article, we will just return the URL
Create a NodeJS Application
I used Express Generator to scaffold the application. You can use whatever you want, but if you opt for the generator, follow the steps below.
- Install Express Generator by running npm install express-generator -g
- Run express myapp. This will generate all files in a directory called myapp
- Remove all “view” related components from the generated files and their dependencies from package.json. You can keep the HTML files (except for the index file), but that would require handling errors differently
- Run npm install
Try running the app with npm start
Create Mock Endpoints
Now that the app is running, let’s create two mock endpoints. Open the routes/index.js file and you will find an endpoint created for you as GET / .
Create another two:
const getUrl = (req, res) => {
  res.status(200).json({ message: 'get url' });
};
const saveUrl = (req, res) => {
  res.status(200).json({ message: 'save url' });
};
/* Create a short URL */
router.post('/', saveUrl);
/* Get original URL */
router.get('/:code', getUrl);
Re-run the app and test the endpoints with a tool like Postman.
The actual work happens in the saveUrl and getUrl functions, so we are going to move the logic to a controller and implement the functions there. Our routes/index.js then only imports saveUrl and getUrl from the controller and wires them to the two routes.
Before I get to controllers/index.js, let’s set up Mongoose.
Set up Mongoose
Install Mongoose by running npm install --save mongoose
Inside app.js, import the mongoose package and add the following code right after the imports:
mongoose.connect('mongodb://localhost:27017/myapp');
mongoose.connection.on('open', () => {
  console.log(`MongoDB connected: ${mongoose.connection.db.databaseName}`);
});
mongoose.connection.on('error', (err) => {
  console.error(`MongoDB error: ${err}`);
});
This connects to MongoDB, which creates a database called myapp on the first write if it doesn’t exist yet, and logs (optional) when the connection opens or an error occurs.
Test it by re-running the app.
The next step would be creating a Mongoose Schema to store the URLs
Create a Schema Model
The schema for URLs is simple: the only fields needed are url and _id. The ID is incremental from 1…N, where N is the number of links.
By default, _id is of type ObjectID, and what we are looking for is an incremental ID. Unfortunately, there is no built-in way to choose how the _id is generated, but there is a workaround according to the official MongoDB docs.
The idea is to keep a collection as a key-value pair and keep count of how many URLs exist so that the next URL to store is the current count of URLs + 1. This is possible to create with Mongoose “pre” hooks whenever we try to save to the URL schema.
So, there are two schemas to create: CounterSchema and LinkSchema. Create a directory called models.
CounterSchema
Create a new file under models called counter.js
LinkSchema
Create a new file under models called link.js
A single entry in LinkSchema holds the _id, which is an incremental number (overridden, ObjectID by default), and url, which is the original URL submitted. Notice that we do not store the encrypted ID, since it can always be recomputed from the _id, although you could store it if you wanted to.
What’s happening here?
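...
LinkSchema.pre('save', function (next) {
  const link = this;
  // Increment the counter and use the new count as this link's _id.
  // The promise form is used because recent Mongoose versions no longer accept callbacks.
  CounterSchema.findByIdAndUpdate('linkEntryCount', { $inc: { count: 1 } }, { new: true, upsert: true })
    .then((counter) => {
      link._id = counter.count;
      next();
    })
    .catch(next);
});
...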
Before the new URL is saved, a pre hook fires which increments the count by 1 in the CounterSchema and assigns it as the _id of the soon-to-be-saved link. The { new: true, upsert: true } options ensure, respectively, that the call returns the updated document and that an entry is created if it doesn’t exist in the CounterSchema collection yet.
Note: the linkEntryCount entry does not exist the first time the update runs, but the upsert option creates it on that first call.
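If you want to see the counter pattern in isolation, it can be mimicked without a database. The sketch below is an in-memory stand-in, not Mongoose code: the Map, incrementCounter and saveLink are hypothetical names that only imitate what the update with $inc, new and upsert does.

```javascript
// In-memory stand-in for the counters collection: id -> { count }
const counters = new Map();

// Imitates findByIdAndUpdate(id, { $inc: { count: 1 } }, { new: true, upsert: true })
function incrementCounter(id) {
  if (!counters.has(id)) {
    counters.set(id, { count: 0 }); // upsert: create the entry on first use
  }
  const counter = counters.get(id);
  counter.count += 1;               // $inc: { count: 1 }
  return { ...counter };            // new: true, so the updated entry is returned
}

// Imitates the pre-save hook: the next count becomes the link's _id
function saveLink(links, url) {
  const link = { _id: incrementCounter('linkEntryCount').count, url };
  links.push(link);
  return link;
}

const links = [];
console.log(saveLink(links, 'google.com')); // { _id: 1, url: 'google.com' }
console.log(saveLink(links, 'google.com')); // { _id: 2, url: 'google.com' }
```

The same URL saved twice gets two different IDs, which is what lets multiple short URLs point at one original URL.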
Create a Controller
Create controllers/index.js
The saveUrl function expects the field url in the request body. When link.save() is called, the pre hook in the LinkSchema fires to set the _id. If the link is created successfully, the ID is encoded and returned as part of the response payload along with the formed link.
For the isUrl() helper method, run npm install --save validator
The getUrl function expects the encoded code as a parameter, which it then converts to a number that represents the ID in the database. The projection ensures that we only return the URL.
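To show how the two handlers fit together, here is a rough sketch of the flow that runs without MongoDB. It is not the controller from the repo: createController is a hypothetical factory, a Map stands in for the Link model and its counter, the URL check is reduced to a non-empty string instead of the validator package, and the status codes and messages are my own assumptions.

```javascript
const SET = '23456789bcdfghjkmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ-_';
const BASE = SET.length;

// ID -> Base 51 code
const encode = (id) => {
  let s = '';
  for (let n = id; n > 0; n = Math.floor(n / BASE)) s = SET[n % BASE] + s;
  return s;
};

// Base 51 code -> ID, or NaN when a character is outside SET
const decode = (code) =>
  [...code].reduce((id, c) => {
    const i = SET.indexOf(c);
    return i === -1 ? NaN : id * BASE + i;
  }, 0);

// Hypothetical factory: `links` is a Map standing in for the Link collection
function createController(links = new Map()) {
  const saveUrl = (req, res) => {
    const url = req.body && req.body.url;
    if (typeof url !== 'string' || url.trim() === '') {
      return res.status(400).json({ message: 'url is required' });
    }
    const id = links.size + 1; // stand-in for the counter-based _id
    links.set(id, url);
    const code = encode(id);
    const link = `${req.protocol}://${req.get('host')}/${code}`;
    return res.status(200).json({ code, link });
  };

  const getUrl = (req, res) => {
    const id = decode(req.params.code);
    if (!links.has(id)) {
      return res.status(404).json({ message: 'not found' });
    }
    return res.status(200).json({ url: links.get(id) });
  };

  return { saveUrl, getUrl };
}
```

The handlers keep the Express (req, res) signature, so they can be exercised with plain objects before being wired to real routes.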
That’s it. Re-run the app to ensure nothing is breaking.
Testing the Application
I will be using Postman to test the endpoints created.
POST localhost:3000/
- Click on the Body tab
- Select x-www-form-urlencoded
- For the key-value pair, the key would be url and the value would be any valid URL
- Click Send
The response would look something like this:
{
  "code": "7",
  "link": "http://localhost:3000/7"
}
The code 7 represents an ID when decoded, and when you:
GET localhost:3000/7
The response would look something like this:
{
  "url": "google.com"
}
That’s it!
Conclusion
In this article we have learned the simplest way to shorten a URL and how to implement auto-incrementing IDs in MongoDB using Mongoose. Next steps for you as an exercise:
- Redirect the user on GET with proper HTTP code
- Add analytics! Every time a user tries to GET the URL, add some information such as number of clicks and location. Add an endpoint to return the data for a specific shortened URL like in Bitly.
The code is on https://github.com/MohamadAtieh/shortify . You are more than welcome to open pull requests (fixes, unit tests, features, etc.), raise issues or follow the repo, as I will be adding more enhancements :)
