An Introduction to APIs (for R users)

Published in The Startup · 6 min read · Nov 22, 2019

Application Programming Interfaces, or APIs, are a powerful thing. As an analyst and not a computer scientist, however, I always thought they were complicated (turns out they’re not too bad after all). At the beginning of my data science career, I was mostly generating reports and PowerPoints. Then I discovered the Shiny package and was excited to create web applications users could actually interact with. Time went by, and after creating many more reports and applications for people, I started wondering: what if I could share my applications with machines too?

Building an API is like building an interface for your program that other programs, applications, and systems can interact with. Building an interface for your R script opens up a world of new possibilities. With APIs, information can be transferred from one program to another, even if those programs are written in different languages. Think about all the ways this allows you to integrate your R code with other services. As just one example, you could build an API to expose your prediction models to a mobile application, making your predictions a core feature of the app and getting them into the hands of thousands of users.

But building APIs is not the only option. There are many, many APIs out there that you can incorporate into your own programs, so learning what they are and how to use them is a huge asset.

API Standardization

APIs have been around since network-connected computers appeared in the ’70s and ’80s. Then, with the emergence of the internet in the ’90s, companies began to standardize the way they built and exposed APIs. Over time, this standardization improved (thankfully). Today, there are different standardized API types to choose from, often referred to as “API paradigms”. These include Representational State Transfer (REST), Remote Procedure Call (RPC), and GraphQL APIs. In this post, I will focus on REST since it is the most commonly used today.

Calling a REST API (example)

REST APIs work by way of requests and responses. The “client” (whatever program is making the request) sends a request to the API server via an HTTP URL. This request is sent in the same way you might type a URL into a web browser to see a webpage. The request is the URL you typed in, and the webpage the server returns is the response. With REST APIs, however, the response is usually sent back as JSON or XML, standard formats for exchanging data over the web. R, like other programming languages, has many functions for working with JSON and XML, which make it easy to extract the data.

Many APIs require some sort of authentication for you to use them, but others, like Wikipedia’s Pageview API, are open to the public. Click on the URL below to send a request to the server and see what it returns:

https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/all-agents/Albert_Einstein/daily/2015100100/2015103100

This HTTP request to the Pageviews API returns a daily pageview count time series for Wikipedia’s Albert Einstein article for the month of October 2015 (in JSON format). Just like entering different website URLs will return different pages for that site, REST APIs have different “endpoints”. Let’s try a different request URL this time:

https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/all-agents/Barack_Obama/monthly/2018010100/2018123100

This HTTP request gets a monthly pageview count of Barack Obama’s article for the year 2018. As you can imagine, good API documentation is important to help developers understand what each API endpoint does and what query parameters should be used.

The Anatomy of a Request

A request is made up of four parts: the request URL (which you’ve already seen), the method, the headers, and the data (or body).

The URL, also referred to as the endpoint, is how you access the API resource you’re looking for. In addition to the path to the API resource, you may also tack on some query parameters, which modify your request using key-value pairs. Query parameters are added at the end of your URL like so: https://api-endpoint.org/api/rest_v1/resource_name?query1=value1&query2=value2.

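Or the API may use path variables instead, in which case your HTTP request would look something like this: https://api-endpoint.org/api/rest_v1/resource_name/query1/value1. The Wikipedia URLs above work this way: the article name, the granularity, and the date range are all segments of the path.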
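To make the two URL styles concrete, here is a minimal sketch in base R that assembles both kinds of URL. The helper names pageviews_url() and add_query() are made up for this example (they are not part of any package), and the api-endpoint.org address is the same placeholder used above.

```r
# Base R only; no packages needed.

# Path variables: each value becomes one segment of the URL path.
# pageviews_url() is a hypothetical helper written for this example.
pageviews_url <- function(article, granularity, start, end) {
  paste(
    "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article",
    "en.wikipedia", "all-access", "all-agents",
    utils::URLencode(article, reserved = TRUE),  # escape unsafe characters
    granularity, start, end,
    sep = "/"
  )
}

# Query parameters: key-value pairs appended after a "?" and joined by "&".
# add_query() is also a hypothetical helper.
add_query <- function(url, params) {
  values <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  pairs <- paste0(names(params), "=", values)
  paste0(url, "?", paste(pairs, collapse = "&"))
}

einstein_url <- pageviews_url("Albert_Einstein", "daily",
                              "2015100100", "2015103100")
query_url <- add_query("https://api-endpoint.org/api/rest_v1/resource_name",
                       list(query1 = "value1", query2 = "value2"))

print(einstein_url)
print(query_url)
```

In practice you rarely need to paste query strings together yourself: httr::GET() accepts a query argument (a named list) and builds the query string for you.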

So where were the method, headers, and body when you called the Wikipedia API? Your web browser actually took care of generating the full HTTP request for you, with all of the required parts, and sending that request to the Wikipedia server. In Chrome, you can see the HTTP request for yourself by right-clicking on the webpage, selecting “Inspect”, and then clicking the “Network” tab.

The request method and the URL are both mandatory. The request method informs the server about the action to be performed. The GET method is used to request data from the API. Different HTTP methods invoked on the same URL provide different functionality for that API resource. POST, for example, is used to send data to a server to create or update a resource.

Headers are used to pass along additional information. Some common header examples might include access credentials like your user name, password, or API key which may be required to gain access to the API resource.

The body, like the headers, is optional. Because the Wikipedia request was a GET request, no data needed to be passed, so the body was simply left blank.
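As a rough illustration of the four parts together, the sketch below describes a POST request as a plain R list and then hands it to httr. Everything specific here is an assumption made for the example: the endpoint is the placeholder from earlier, the Authorization header value and body fields are invented, and send_request() is a hypothetical helper, not an httr function.

```r
# A request described by its four parts. The endpoint, header value, and
# body fields are all placeholders, not a real API.
request <- list(
  method  = "POST",
  url     = "https://api-endpoint.org/api/rest_v1/resource_name",
  headers = c(Authorization = "Bearer YOUR_API_KEY"),
  body    = list(name = "example", value = 42)
)

# send_request() is a hypothetical helper that passes the four parts to httr.
send_request <- function(req) {
  httr::VERB(
    req$method,
    url = req$url,
    httr::add_headers(.headers = req$headers),
    body = req$body,
    encode = "json"  # serialize the body list as JSON
  )
}

# Not run here, because the endpoint above does not exist:
# resp <- send_request(request)
```

For a GET request like the Wikipedia one, the same list would have method = "GET" and body = NULL.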

Making API calls in R

The GET() function from the httr package helps you construct the query URL and make API calls. Its general format is the following: httr::GET(url, add_headers(a = 1, b = 2), authenticate("username", "password")). If done correctly, this will give you a response object from the API resource. Read the API documentation to identify the method, URL endpoint, query parameters, and headers needed to call the API.
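Here is a short sketch of what that looks like for the Einstein pageviews URL from earlier. The wrapper name get_pageviews() and the contact string in the user agent are placeholders I made up; the httr functions themselves (GET, user_agent, stop_for_status, status_code, http_type) are part of the package. The call is wrapped in interactive() so that nothing is sent over the network unless you run it yourself.

```r
# Requires the httr package: install.packages("httr")

url <- paste0(
  "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/",
  "en.wikipedia/all-access/all-agents/Albert_Einstein/daily/",
  "2015100100/2015103100"
)

# get_pageviews() is a hypothetical wrapper written for this example.
get_pageviews <- function(url) {
  # The User-Agent header identifies your script to the server;
  # the contact string below is a placeholder.
  resp <- httr::GET(
    url,
    httr::user_agent("my-r-script (contact: you@example.com)")
  )
  httr::stop_for_status(resp)  # turn HTTP errors (4xx/5xx) into R errors
  resp
}

# Only send the request when run interactively.
if (interactive()) {
  resp <- get_pageviews(url)
  print(httr::status_code(resp))  # 200 means the request succeeded
  print(httr::http_type(resp))    # content type reported by the server
}
```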

Processing the Response

R has functions which can easily convert the JSON or XML responses from the API you are calling. These include httr::content() to retrieve the contents, jsonlite::fromJSON() to convert JSON objects into R objects (such as a list), and the rlist package to work with the JSON response which was converted to a list.
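The sketch below shows the parsing step on a small hand-written JSON string shaped like a Pageviews API response. The two view counts are made up for illustration; with a live response you would get the text from httr::content(resp, as = "text", encoding = "UTF-8") instead.

```r
# Requires the jsonlite package: install.packages("jsonlite")

# A hand-written sample in the shape the Pageviews API returns.
# The view counts are invented for illustration.
json_text <- '{
  "items": [
    {"project": "en.wikipedia", "article": "Albert_Einstein",
     "granularity": "daily", "timestamp": "2015100100",
     "access": "all-access", "agent": "all-agents", "views": 100},
    {"project": "en.wikipedia", "article": "Albert_Einstein",
     "granularity": "daily", "timestamp": "2015100200",
     "access": "all-access", "agent": "all-agents", "views": 150}
  ]
}'

# fromJSON() simplifies an array of same-shaped objects into a data frame.
parsed <- jsonlite::fromJSON(json_text)
views <- parsed$items

# Timestamps arrive as "YYYYMMDDHH" strings; keep the date part.
views$date <- as.Date(substr(views$timestamp, 1, 8), format = "%Y%m%d")

print(views[, c("article", "date", "views")])
```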

Use an API client if there is one

Many APIs have “clients” which make interacting with the APIs much easier. When working with an API, it’s a good idea to check if a client already exists for it. Clients are essentially R packages that others have already written that allow you to interact with the API without having to worry about the details of its structure or cleaning up the API response into workable data. For example, you can use the pageviews package to query the Wikipedia API from our earlier example. Running the function pageviews::article_pageviews(project = "en.wikipedia", "Barack Obama", start = "2018010100", end = "2018123100") returns a nice R data frame of Barack Obama’s article pageview counts for every single day in 2018. By using this client, you can pretty much forget about everything you read up to this point (the anatomy of an API request, making API calls, processing the response) and instead use the simple R functions the client provides to get the information you need.

Build your own API

The plumber package makes it easy to convert R code into an API, which can then be published to the web. It works by annotating your R functions with special comments to define your API’s parameters, request methods, etc.
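As a minimal sketch of what such an annotated file might look like, the example below exposes a toy prediction model as a GET endpoint. The model (miles per gallon predicted from weight, using R’s built-in mtcars data), the /predict path, and the predict_mpg() helper are all assumptions chosen for illustration; the #* comments are plumber’s annotation syntax.

```r
# plumber.R
# Requires the plumber package to serve: install.packages("plumber")

# A toy model standing in for "your prediction model": fuel economy (mpg)
# predicted from car weight, using R's built-in mtcars data.
fit <- lm(mpg ~ wt, data = mtcars)

# predict_mpg() is a hypothetical helper written for this example.
# Query parameters arrive as strings, so convert to numeric first.
predict_mpg <- function(wt) {
  wt <- as.numeric(wt)
  pred <- predict(fit, newdata = data.frame(wt = wt))
  list(wt = wt, mpg = unname(pred))
}

#* Predict miles per gallon from car weight (in 1000 lbs)
#* @param wt Car weight
#* @get /predict
function(wt) {
  predict_mpg(wt)
}

# To serve the API locally, run this from the console or another script:
#   plumber::plumb("plumber.R")$run(port = 8000)
# and then request http://localhost:8000/predict?wt=3
```

Keeping the model logic in an ordinary function and letting the annotated function just call it means you can test the logic in plain R without starting a server.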

Conclusion

Whether you are looking to build your own APIs or use someone else’s, I hope this post helped you better understand how they work and why they are such a powerful tool!