Building an API Cache with Redis and Node

How to optimize API usage in a college search example application

Fetching data from an API can be costly, especially if your application makes multiple HTTP requests to an API endpoint. Problems like request throttling and network latency can quickly become issues. This makes your application’s efficiency contingent on the reliability of that API, but Redis can help mitigate these issues.

This article is a walk-through demonstrating how to use Redis as a cache in an application that searches for colleges and their locations. The application uses IBM Compose for Redis and Node.js. The data we’re caching into Redis is from the US Department of Education (DoED) College Scorecard, which contains a variety of information on colleges throughout the United States. What we’ll cache are user queries, college names, and college locations. To make the application a little more interesting, we’ve incorporated Mapbox to visualize the college locations on a map.

Getting set up

The application requires a little setup, so head over to the application’s GitHub repository. Clone the repository, then follow the directions to set up your IBM Compose for Redis database and to get an API key for College Scorecard from Data.Gov and a Mapbox access token.

Here’s a breakdown of what you’ll need to start:

  1. A recent version of Node.js installed along with the npm package manager.
  2. The code from the GitHub repository, which includes all the Node.js libraries we’ll be using.
  3. An IBM Cloud or Compose.com account to provision a Compose for Redis deployment.
  4. A Data.Gov API token. With this token, you can use any of the APIs from Data.Gov, not only College Scorecard.
  5. A Mapbox access token. You just have to sign up for Mapbox and create a token for your account.

How it works

When a user makes a request for a college, the application first sends a Redis GET command to see if the college has been requested before. If the college is not in Redis, then the application queries the College Scorecard API for that college and stores the college name as the key with the API response as the value. The API response is also returned to the user. If the college is in Redis, then the application retrieves the cached value from Redis and returns it to the user.
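The flow above can be sketched in a few lines. This is a minimal illustration rather than the application's actual code: the `cache` Map is a hypothetical in-memory stand-in for Redis, and `fetchFromApi` is a hypothetical function standing in for the College Scorecard request.

```javascript
// Hypothetical in-memory stand-in for Redis, for illustration only.
const cache = new Map();

// Looks up a college in the cache first and falls back to the API.
// `fetchFromApi` is an assumed async function that returns the API's JSON.
async function getColleges(name, fetchFromApi) {
  // 1. Check the cache (the Redis GET step).
  if (cache.has(name)) {
    return { source: 'cache', body: cache.get(name) };
  }

  // 2. Cache miss: query the API and store the serialized response
  //    under the college name before returning it.
  const json = await fetchFromApi(name);
  const body = JSON.stringify(json);
  cache.set(name, body);
  return { source: 'api', body };
}
```

The first call for a given name reaches the API; later calls for the same name are answered from the cache without another HTTP request.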

Once a response is delivered to the user from the API or Redis, a list of the colleges will appear along with their locations on a map.

Very simple application showing the list of colleges that include “Washington” in their name. On the right are the locations of those colleges using a Mapbox map.

Retrieving a list of colleges

The College Scorecard API provides a wealth of information about colleges in the United States. The API allows you to query and select the fields you want returned, which are defined in its documentation. A basic query uses the following URL with your Data.Gov API key:

https://api.data.gov/ed/collegescorecard/v1/schools?api_key=<your_api_key>

To get just the name and location of the colleges, we’ll use:

https://api.data.gov/ed/collegescorecard/v1/schools?api_key=<your_api_key>&school.name=<college_name>&fields=school.name,location.lon,location.lat&per_page=100

For school.name, we indicate the name, or partial name, of a college we’re searching for. Using fields, we indicate the fields we want returned. And per_page=100 indicates that we want up to 100 colleges returned per page.
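As a small sketch, the query URL can be assembled with a helper like the one below. The function name `buildCollegeUrl` is our own invention for illustration, and the use of `encodeURIComponent` is an addition so that names containing spaces or special characters produce a valid URL.

```javascript
const BASE_URL = 'https://api.data.gov/ed/collegescorecard/v1/schools';

// Builds the College Scorecard query URL for a (partial) college name.
// `buildCollegeUrl` is a hypothetical helper, not part of the application.
function buildCollegeUrl(apiKey, collegeName) {
  return (
    `${BASE_URL}?api_key=${encodeURIComponent(apiKey)}` +
    `&school.name=${encodeURIComponent(collegeName)}` +
    '&fields=school.name,location.lon,location.lat' +
    '&per_page=100'
  );
}

// "MY_KEY" is a placeholder; substitute your own Data.Gov API key.
console.log(buildCollegeUrl('MY_KEY', 'Harvard'));
```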

Running this query with Harvard, we’ll get back a JSON object containing the college name “Harvard University” and its coordinates.

Now that we know what we’ll get back from the API, let’s go through how to fetch the data and cache it in Redis using the Express web framework for Node.js.

Exploring the API-caching application

Let’s start by setting up IBM Compose for Redis, which has two modes: storage and cache. Storage mode saves the data to disk and scales up your database as the data grows. Cache mode doesn’t write data to disk; instead, it uses a data eviction strategy to delete old data and free up memory for new data. For this application, your Redis deployment should be in cache mode, so after provisioning it you’ll need to contact support to have cache mode enabled.

The application code queries and retrieves data from the College Scorecard API, caches that data in Redis, and retrieves it from Redis.

The application is built on the Express web framework. We import the Express library as well as a few other Node.js libraries that have been installed from npm. These libraries allow us to connect to Redis, fetch data from the College Scorecard API, handle the environment variables that contain the API key and Redis connection string, and read in the Redis self-signed certificate.

For the next part, we’ll set up a connection to IBM Compose for Redis which connects using TLS.
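The connection code itself isn't reproduced here, so what follows is only a sketch of how the TLS options might be prepared. It assumes a `rediss://` connection string and a certificate that has already been read into a string; the helper name `redisTlsOptions` is hypothetical, and the commented-out `createClient` call assumes the callback-style `redis` npm client used elsewhere in this article.

```javascript
// Builds TLS options from a Redis connection string and a CA certificate.
// `redisTlsOptions` is a hypothetical helper written for this sketch.
function redisTlsOptions(connectionString, caCert) {
  const { protocol, hostname } = new URL(connectionString);
  if (protocol !== 'rediss:') {
    throw new Error('Expected a rediss:// (TLS) connection string');
  }
  // `ca` lets Node trust the self-signed certificate; `servername`
  // sets the host name used during the TLS handshake.
  return { tls: { ca: caCert, servername: hostname } };
}

// Assumed usage (not run here; the variable names are placeholders):
// const client = redis.createClient(connectionString,
//   redisTlsOptions(connectionString, caCert));
```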

Now, we use Express to serve the directory that contains the frontend of the application, so that the HTML page gets loaded when we run the application. We also set up a GET route, using app.get, that contains the code to fetch data from the API and Redis; it’s accessed from the /api/colleges endpoint.

Let’s run through the code here a little. On the frontend of the application, we’re using an HTML form to capture the college name a user enters. When a user enters a college name in the input box, it’s sent to the /api/colleges endpoint and can be retrieved using req.query.college, where college is the name assigned to the HTML input tag.

We’re using Redis as a key/value store to store the user queries as keys and the results as values. If we’re using Redis with other applications, we want to make sure that our keys are unique so we don’t accidentally try to store two different values for the same key. Therefore, we prefix college/ to the query to make it unique.

let college = `college/${query.trim().toLowerCase()}`;

The JavaScript trim() and toLowerCase() functions are added so that we don’t have multiple keys that contain the same results like “Washington”, “washington”, and “washington ”, for instance.
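To see the effect of the normalization, here is a small sketch. The helper name `cacheKey` is hypothetical; it trims and lowercases the query before adding the prefix, so leading whitespace is removed as well.

```javascript
// Normalizes a user query into a Redis key.
// `cacheKey` is a hypothetical helper used only for this illustration.
function cacheKey(query) {
  return `college/${query.trim().toLowerCase()}`;
}

// All three spellings collapse into the same key.
const keys = ['Washington', 'washington', 'washington '].map(cacheKey);
console.log(new Set(keys).size); // prints 1
```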

When a query comes in from the user, a Redis GET command is triggered to check whether the key is in the database. If it is in Redis, then the data is sent.

client.get(college, (err, data) => {
  if (err) throw err;
  // Cache hit: reply with the stored JSON string right away.
  if (data !== null) {
    return res.send(data);
  }

If Redis doesn’t have the key, a request will be sent to the API and the JSON response sent to the application.

fetch(
  "https://api.data.gov/ed/collegescorecard/v1/schools?api_key=" +
    apiKey +
    "&school.name=" +
    encodeURIComponent(query) +
    "&fields=school.name,location.lon,location.lat&_per_page=100"
)
  .then(response => response.json())
  .then(json => {
    // Cache the serialized response for one day, then reply.
    client.setex(college, 86400, JSON.stringify(json));
    res.send(json);
  })
  .catch(err => {
    console.error(err);
    // The upstream API request failed, so report a gateway error.
    res.sendStatus(502);
  });

The Redis SETEX command is used to set the key with the prefix, set an expiry time for the key, and set the serialized API JSON response as the value. The SETEX command is similar to the SET command except it expires the key after a given amount of time. We’ve set the key for 86,400 seconds so it will expire after one day. You can set it to whatever time is appropriate for your use case.
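To make the expiry behavior concrete, here is a tiny in-memory imitation of SETEX and GET. It is only a model of the semantics, not how Redis is implemented: the `TtlCache` class and its injectable clock are assumptions made for this sketch.

```javascript
// A toy model of SETEX/GET semantics, for illustration only.
class TtlCache {
  // `now` returns the current time in milliseconds; it is injectable
  // so that expiry can be demonstrated without waiting a full day.
  constructor(now = () => Date.now()) {
    this.now = now;
    this.entries = new Map();
  }

  // Stores `value` under `key` and records when it should expire.
  setex(key, seconds, value) {
    this.entries.set(key, { value, expiresAt: this.now() + seconds * 1000 });
  }

  // Returns the value, or null if the key is missing or has expired.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return null;
    }
    return entry.value;
  }
}
```

With a TTL of 86400 seconds, a lookup just before the one-day mark still returns the cached value, while a lookup at or after it returns null, which is what sends the application back to the API.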

Testing it out

Before running the code, make sure that you’ve added your Mapbox access token by substituting the given Mapbox token in the frontend code with your own. Once you’ve done that, start the Express application, then navigate to http://localhost:9000 in your preferred browser.

In the input box, type in a college name; we’ve typed in “technical”. Once that query has been sent, you should have a page that looks like the following showing the first 100 colleges that contain the name “technical”.

Searching for college names containing the string “technical”.

While the locations of all the colleges might be interesting to look at, let’s compare the performance of the API request with that of Redis. Depending on the browser you’re using, open up the developer tools to view the network activity. On Firefox for macOS, it’s found using Command+Option+i and then selecting the “Network” tab.

Searching for a college with the name “Illinois”, we’ll get the following:

This API request takes 1477ms.

The response took 1477ms because it came from the API. Since this is a new query, the result is also stored in Redis, so let’s look at the difference in response time when we search for it ten times.

Memoization speeds up subsequent API calls for the same search.

As you can see, there’s a significant difference in the response times between the API providing us with a response and Redis searching for the key and providing the value back.

We can also replicate these results from the terminal when sending a request using cURL. For instance when searching for a college in “Ohio”, we’d get something like:

for n in 1 2 3 4 5 6 7 8 9 10; do
  curl -s "http://localhost:9000/api/colleges?college=ohio" -o /dev/null -w "\n%{time_total}"
done

1.155697
0.135714
0.047676
0.048619
0.056466
0.053072
0.047367
0.049279
0.051338
0.050380

As you can see, the first result has the longest time, but subsequent responses are a lot shorter.
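To put a number on that difference, the short sketch below averages the nine cached responses from the run above and compares them with the first request. The timings are copied from that output; the variable names are our own.

```javascript
// Timings in seconds, copied from the cURL run above.
const timings = [
  1.155697, 0.135714, 0.047676, 0.048619, 0.056466,
  0.053072, 0.047367, 0.049279, 0.051338, 0.050380,
];

// The first request went to the API; the rest were served from Redis.
const [firstRequest, ...cachedRequests] = timings;
const meanCached =
  cachedRequests.reduce((sum, t) => sum + t, 0) / cachedRequests.length;
const speedup = firstRequest / meanCached;

console.log(`mean cached response: ${(meanCached * 1000).toFixed(0)}ms`);
console.log(`speedup over the API request: ${speedup.toFixed(1)}x`);
```

For this particular run, the cached responses come out roughly 19 times faster than the initial API request, although the exact ratio will vary with network conditions.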

Since Redis caches the data until it expires, we can stop the application and return to it before the expiry time and get the same, fast results. Only when the data expires will we lose the data.

Cache you later

With Redis, you can significantly speed up the time it takes your application to deliver data to users who need it. This sample application is just a simple example of caching data, but try the database out to see how easy it is to use a managed database like IBM Compose for Redis as part of your caching solution.

IBM CODAIT

Things we made with data at IBM’s Center for Open Source Data and AI Technologies.

Written by Dr. Abdullah J. Alger

Developer Evangelist @ IBM Cloud | I write about building things, breaking things, databases things, cloud things, and thing things.
