Week 1: Creating a Readable URL Shortener

Aaron Vontell
Published in 365 Days of Coding
8 min read, Jan 16, 2018

For week 1 of the 365 Days of Coding series, I thought it would be a great idea to create a custom URL shortener with an emphasis on human readability. What exactly is a URL shortener? In its simplest form, a URL shortener takes in a long URL, such as http://yourdomain.com/information/pages/terms-and-service.html, and provides a new, much shorter URL that redirects to the same page, such as http://yourdomain.com/tDxWof.

Taking this idea further, a human-readable URL shortener does the same thing, but provides a shortened URL consisting of words that are human readable. For example, instead of having that tDxWof extension that could be difficult to remember, your shortened URL may come out as http://yourdomain.com/apples.

Note that this tutorial will not cover the routing and request-side of the server that would be needed to put this URL shortener on the web; tutorials along those lines will come in the future! This tutorial covers the design decisions and basic functionality of such a server. If you are new to JavaScript, this lesson will also serve as a gentle primer to asynchronous functions and buffered reading in JavaScript.

Motivation

Why would you want to create a human-readable URL shortener? Here are a few reasons why:

  • Sharing URLs with other people by means of verbal communication (rather than sending them a link electronically)
  • Sharing URLs that are more pleasing to the eye
  • Allowing users to use an easily remembered word to complete an action. This is the selfish reason that I am creating this module; in one of my applications, users are asked to join a group. Rather than typing an annoying word into their application as a group identifier, their friends can share with them the human word which represents their group.

Design Principles

There are two principles and guidelines that we will follow in building this module. Note that these principles guide the decisions that we make in the following code, so if the principles of your application are not aligned, make sure to change the respective parts of the code!

  1. The URL shortener focuses on readability and usability, not persistence. In other words, our URL shortener will generate readable but short-lived URLs, since there are only a limited number of short but easily-remembered words in the English language.
  2. Performance is important: we shouldn’t read from files more often than necessary, and when we do read a file, we should read it in a buffered and asynchronous way.

Coding Time!

Let’s finally get down to business! The code for this lesson will be written in JavaScript, since many servers and websites are written in JavaScript, but please feel free to convert the code to your own platform. If you do, make sure to share in the comments!

If you would like to follow along, the full source code can be found here:

0. Infrastructure

Within the root directory, create a file named shortener.js, as well as a folder named resources/. Within the root project folder, also create a basic NodeJS package with the following command:

npm init

Follow the given prompts to generate the project with a chosen name. This will simply allow us to run the project in a seamless manner later on. When asked for the entry point, I simply used the default (shortener.js) for the purpose of this project.

1. Getting a list of human-readable words

The first task in developing this module is to load up a list of words that can be used in the shortener. After searching “common English words list” on Google, I found a great GitHub repository from Google and first20hours that contains a list of 10,000 common English words. I downloaded the file with swears and curse words omitted (google-10000-english-usa-no-swears.txt) and put it within the resources/ folder. This file contains about 10,000 words, each on a new line.

Next, we need to load these words for use in our program. In the spirit of JavaScript, we want to do this asynchronously, so that this operation does not halt other processes on our server. Additionally, we want to read this file line by line, rather than loading the entire file at once, in case your application warrants the use of a file which contains hundreds of thousands of words.

First, we include two modules (which are built into NodeJS), fs and readline. The former is used to open and manipulate files in our filesystem, while the latter is used to read the file line by line in a buffered fashion.

Next, we write the following function, called loadValidWords, as well as define a few important global variables. This function completes the task of loading the words within the text file.

First, we define a path that leads to the file of words. Additionally, we define a minimum and maximum word length; this means that we will only load words from the file which are of length between 3 and 6 (inclusive), which helps us avoid single-letter words and long words.

Next, we define our function, which takes in a callback function that gets called once this process finishes. We first check if the list of words is populated; if so, then we already loaded the file, so we can just return the existing word list. Note that if you want to refresh the word list, you can simply remove this conditional; in this case we do this check for performance guarantees.

Finally, we create the buffered reader instance. Lines 21 through 24 in the above snippet create the instance of this buffered reader, while lines 27 through 31 read a line (i.e. a word from the file), check whether the word is of an acceptable length, and then add that word to our word list. Lastly, lines 34 through 36 execute the given callback once the end of the file is reached.
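For readers without the embedded snippet at hand, here is a minimal sketch of what a loader along these lines might look like. The constant names, the optional filePath parameter, and the trimming of each line are assumptions made for illustration, and the line numbers cited above refer to the author's original snippet rather than to this sketch.

```javascript
const fs = require('fs');
const readline = require('readline');

// Path to the word file described above, relative to the project root.
const WORD_FILE = 'resources/google-10000-english-usa-no-swears.txt';

// Only words whose length falls within these bounds (inclusive) are kept.
const MIN_WORD_LENGTH = 3;
const MAX_WORD_LENGTH = 6;

// Cache of loaded words; stays empty until the file has been read once.
let validWords = [];

// Reads the word file line by line and passes the filtered list to `callback`.
// The optional `filePath` argument is an assumption added for illustration.
function loadValidWords(callback, filePath = WORD_FILE) {
  // If the words were already loaded, reuse them instead of re-reading the file.
  if (validWords.length > 0) {
    callback(validWords);
    return;
  }

  // Buffered reader that emits one 'line' event per line of the file.
  const reader = readline.createInterface({
    input: fs.createReadStream(filePath),
    crlfDelay: Infinity,
  });

  // Keep only the words of an acceptable length.
  reader.on('line', (line) => {
    const word = line.trim();
    if (word.length >= MIN_WORD_LENGTH && word.length <= MAX_WORD_LENGTH) {
      validWords.push(word);
    }
  });

  // 'close' fires once the end of the file has been reached.
  reader.on('close', () => callback(validWords));
}
```

A call such as loadValidWords((words) => console.log(words.length)) would then print the number of usable words once the file has been read.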

At this point, we now have a function which grabs all valid words for us to use in our URL shortener. Depending on your project, you might use a different word file (such as a list of themed words), but in this setting, we end up having 4,620 words to work with!

2. Shortening a URL

Our next step is to write a function that actually shortens the URL for us, and saves the result as a record to consult when the shortened URL gets called. We define the shortenURL method, along with more useful constants and variables, below:

First, we define the base domain of our website, since this will be the base URL used for our URL shortener. We also define a dictionary which is used to hold the mappings of shortened URLs to real URLs. Note that in a production environment, you would want to store this mapping in a database, such as a SQL or MongoDB server. However, that is not in the scope of this project, although there will most likely be a lesson in this series that covers the creation and management of a DB-backed server.

Our shortenURL function is pretty straightforward: we first grab a random word from the word list, store a record in the dictionary saying that this word maps to the given URL (which is passed into the function), and return the new shortened URL, which is the concatenation of our domain and the random word.

There is something important to notice here: by using the splice method, we remove the used word from the word list. We do this because we don’t want to use a word more than once, since the old mapping would then be overwritten. Also note that we are using a helper function which generates a random integer in a given range; this was found on Stack Overflow, which you should visit if you would like to learn more about how this function works.
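As a rough sketch of the steps just described, and not necessarily the author's exact code, the function might look like the following. The names BASE_DOMAIN, urlMap, and getRandomInt, the sample word list, the use of a Map as the dictionary, and the guard against an empty word list are all assumptions made for illustration.

```javascript
// Base domain of the site hosting the shortener (placeholder value).
const BASE_DOMAIN = 'https://mywebsite.com/';

// In-memory dictionary mapping each short word to its original URL.
const urlMap = new Map();

// Stand-in for the list that loadValidWords would produce.
const validWords = ['apples', 'grant', 'river', 'cloud'];

// Returns a random integer between min and max, both inclusive.
function getRandomInt(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

function shortenURL(url) {
  // Assumed guard: with no words left there is nothing to hand out.
  if (validWords.length === 0) {
    throw new Error('No words left to assign');
  }
  const index = getRandomInt(0, validWords.length - 1);
  // splice removes the chosen word from the pool so it cannot be reused.
  const [word] = validWords.splice(index, 1);
  urlMap.set(word, url);
  return BASE_DOMAIN + word;
}
```

Calling shortenURL('https://medium.com/365-days-of-coding') with this sample list returns the base domain followed by one of the four words, and leaves three words in the pool.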

3. Getting the Original URL

Now, the whole point of this program is to receive a shortened URL and redirect the user to the original URL. In order to do this, we define the simple function below:

This function takes in the whole shortened URL generated from section two and returns the original URL, by trimming off the domain and getting the mapped URL from the random word at the end of the shortened URL.
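A lookup of this kind could be sketched as below. The function name getOriginalURL, the BASE_DOMAIN constant, and the pre-filled urlMap entry are hypothetical and only serve to make the example runnable on its own.

```javascript
// Base domain of the site hosting the shortener (placeholder value).
const BASE_DOMAIN = 'https://mywebsite.com/';

// Dictionary of short word -> original URL, pre-filled with one sample record.
const urlMap = new Map([['grant', 'https://medium.com/365-days-of-coding']]);

// Returns the original URL for a shortened URL, or undefined if none is recorded.
function getOriginalURL(shortUrl) {
  // Trim the domain off the front, leaving only the word at the end.
  const word = shortUrl.startsWith(BASE_DOMAIN)
    ? shortUrl.slice(BASE_DOMAIN.length)
    : shortUrl;
  return urlMap.get(word);
}

console.log(getOriginalURL('https://mywebsite.com/grant'));
// prints https://medium.com/365-days-of-coding
```

Returning undefined for an unknown word is one possible design choice; a server would likely turn that case into a "not found" response.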

4. Removing Old Shortened URLs

Now, considering that we only have about 4,600 words to work with, it would be nice to remove existing shortened URLs from the record, and put those random words back into the list of usable words. The following function does exactly that:

Like in section three, we first get the short word used for this record, and then delete the existing record, making sure to also add the word back to the word list.
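One way this removal step might be written is sketched below. The function name removeURL, the sample data, and the boolean return value are assumptions for illustration; the check that a record actually existed is also an addition, so that unknown words are not pushed into the pool.

```javascript
// Base domain of the site hosting the shortener (placeholder value).
const BASE_DOMAIN = 'https://mywebsite.com/';

// Sample state: one active record and two unused words.
const urlMap = new Map([['grant', 'https://medium.com/365-days-of-coding']]);
const validWords = ['apples', 'river'];

// Deletes the record for a shortened URL and makes its word available again.
// Returns true if a record was removed, false if none existed.
function removeURL(shortUrl) {
  // Get the short word by trimming off the domain.
  const word = shortUrl.slice(BASE_DOMAIN.length);
  // Map.delete returns true only when the key was present.
  if (urlMap.delete(word)) {
    validWords.push(word);
    return true;
  }
  return false;
}
```

After removeURL('https://mywebsite.com/grant') runs against this sample state, the record is gone and 'grant' is back in the list of usable words.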

Putting it all together!

Finally, let’s put together an example! The following function startServer will be the entry point of the example.
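A compact sketch of such an entry point is shown here. To keep it runnable on its own, loadValidWords is replaced by a stand-in that delivers a fixed three-word list asynchronously, and the helper names getOriginalURL and removeURL are hypothetical; with the real word file the word count would differ.

```javascript
const BASE_DOMAIN = 'https://mywebsite.com/';
const urlMap = new Map();
let validWords = [];

// Stand-in loader: hands over a fixed list asynchronously instead of reading the file.
function loadValidWords(callback) {
  setImmediate(() => {
    validWords = ['apples', 'grant', 'river'];
    callback(validWords);
  });
}

// Picks a random unused word, records the mapping, and returns the short URL.
function shortenURL(url) {
  const index = Math.floor(Math.random() * validWords.length);
  const [word] = validWords.splice(index, 1);
  urlMap.set(word, url);
  return BASE_DOMAIN + word;
}

// Looks up the original URL from the word at the end of the short URL.
function getOriginalURL(shortUrl) {
  return urlMap.get(shortUrl.slice(BASE_DOMAIN.length));
}

// Deletes the record and returns its word to the pool.
function removeURL(shortUrl) {
  const word = shortUrl.slice(BASE_DOMAIN.length);
  if (urlMap.delete(word)) validWords.push(word);
}

// Entry point: load the words, then exercise each function once.
function startServer() {
  loadValidWords((words) => {
    console.log('Available Words: ' + words.length);
    const shortUrl = shortenURL('https://medium.com/365-days-of-coding');
    console.log('New URL: ' + shortUrl);
    console.log('Old URL: ' + getOriginalURL(shortUrl));
    removeURL(shortUrl);
  });
}

startServer();
```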

First, we start the asynchronous loadValidWords function. Once the words are finished being loaded, that internal anonymous function is called. It prints out the number of available words, shortens a URL that I gave it, prints out the new URL, attempts to get the original URL back from the shortened URL, prints that result, and then removes this shortened record from the “database”. If you followed along and put all of this code into the shortener.js file, run the command node shortener.js in your terminal/shell, and you should get output similar to this:

Available Words: 4620
New URL: https://mywebsite.com/grant
Old URL: https://medium.com/365-days-of-coding

Your short URL may look different since the word is chosen randomly, but we did it! Remember that you can find all of the code for this lesson at the link below (just in case you weren’t following along).

Conclusion

This module provides some basic features to create a URL shortener, with a focus on generating human-readable URLs. We walked through the process of developing the module, and explained some of the design decisions along the way. While this tutorial doesn’t include the hosting-part of the code, this shortener can be easily integrated into existing or future JavaScript-based projects. If you are new to JavaScript, this lesson will also serve as a gentle primer to asynchronous functions and buffered reading in JavaScript.

If you have any questions, comments, or improvements, please let me know in the comments below. As always, thanks for reading, and see you next week!

Aaron Vontell
365 Days of Coding

Software Engineer @ Instabase, Founder of Vontech Software, LLC | Developer, Aspiring Entrepreneur, MIT MEng ‘18