How to access an API for first-time API users

This article presents an overview of APIs and serves as a step-by-step guide on how to use APIs for first-time API users, while using our Hackathon project as an example.

Riesling Walker
Data Science at Microsoft
13 min read · Nov 22, 2022


By Riesling Walker and Deepsha Menghani

Deepsha and I worked together to learn about APIs for the 2022 Microsoft Hackathon, a company-wide, multi-day, global event in which employees are encouraged to be curious and solve problems that they are passionate about. It’s encouraged that these “hacks” be work related, but they don’t have to be! Some Hackathon projects even end up being new Microsoft products like the Xbox adaptive controller!

For our Hackathon, we decided to use our data science skills to analyze our knitting queues on Ravelry, a social networking and organizational website for yarn-related crafts, by pulling data through the Ravelry API. We had been talking about ways we might be able to fit knitting into a Data Science @ Microsoft article, and we finally found our chance! But don’t worry — you don’t need to know anything about knitting to understand this article.

Through this project, our goals were to:

  1. Learn to pull data with API calls in Python using Flask (this article!)
  2. Collaborate between Python and R using Reticulate
  3. Package a Data Science portfolio project

The remaining articles will be published soon! So, stay tuned!

Why this project?

We had many reasons for wanting to pursue this, including:

  • APIs are cool. They are not something either of us has implemented at work before, but they always seem like they could be useful.
  • APIs are a self-contained portfolio piece. They make great portfolio pieces for a resume because they involve data collection, data cleaning, debugging, data visualization, reproducibility, and more.
  • APIs represent an opportunity for trying new things (and writing multiple articles). We liked this idea because we knew we could try a bunch of new things and write a series of articles about it! (More on this at the tail end of the article.)
  • We love knitting! Enough said.

So, what is an API?

An API, or application programming interface, is a way for computer programs to “talk” to each other. The most common and flexible APIs are Representational State Transfer APIs (REST APIs), where the client sends a request to the server, and the server returns an output to the client.

But this technical definition doesn’t quite explain the concept to someone who isn’t familiar with APIs already. A common analogy that’s used to explain APIs involves going to a restaurant.

  • A customer (i.e., a client) might call get_menu(), leading the waiter to bring the customer a menu of options.
  • A customer might call order(item = “chicken parm”, side = “pasta”), leading the waiter to bring the customer a chicken parmesan with pasta.
  • A customer might call get_check(), leading the waiter to bring the check.

As you can see in this analogy, there are different “requests” that a client can call, each with different parameters, and each with different outputs. This is much like how a typical API operates.
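To make the analogy concrete, here is a toy sketch in Python. The `Restaurant` class, its menu, and its prices are invented purely for illustration; they mirror the three calls above and do not correspond to any real API.

```python
class Restaurant:
    """Toy stand-in for a server: each method plays the role of one API request."""

    def __init__(self):
        # Invented menu data, for illustration only.
        self._menu = {"chicken parm": 18.0, "pasta": 6.0}
        self._tab = []

    def get_menu(self):
        """No parameters; returns the names of the available items."""
        return sorted(self._menu)

    def order(self, item, side=None):
        """Parameters select what is returned; unknown items are rejected."""
        requested = [item] + ([side] if side else [])
        for name in requested:
            if name not in self._menu:
                raise ValueError(f"{name!r} is not on the menu")
        self._tab.extend(requested)
        return requested

    def get_check(self):
        """Returns the total for everything ordered so far."""
        return sum(self._menu[name] for name in self._tab)


waiter = Restaurant()
print(waiter.get_menu())
print(waiter.order(item="chicken parm", side="pasta"))
print(waiter.get_check())
```

Each method takes different inputs and returns a different kind of output, which is the same pattern a real API follows with its requests.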

Now that we’re familiar with the concept of client calls, let’s talk about some examples of APIs that you might use every day without realizing it:

  • When searching for flights on a flight aggregator like Expedia, you’re really providing a series of inputs including dates, origin city, and destination city to the website, which uses the APIs from a variety of airlines to retrieve the available flights that meet those parameters.
  • When checking the weather, you provide the input of ZIP code to an API function that returns a description of the weather in your area that it pulls from the weather bureau’s software system.

These are two examples of app or website functionality that pull their data from external APIs. But often a single app or website may use many private APIs as ways for different internal teams’ products to interact with each other to reduce dependencies and make integration cleaner.

APIs versus relational databases

One thing that we learned during this process is that doing analysis with an API is not the same as doing it using a relational database.

With a relational database, you start with everything and then whittle down to what you want, and each table that you start with has a clear and defined structure, meaning that you can expect every row to follow the same format. You can think of this as being similar to how an artist starts with a block of marble and carves out a sculpture, or how a carpenter does woodturning with a block of wood to turn it into a furniture piece. Both of these instances involve starting with a clear, defined block and then filtering down to get the result.

In contrast, when working with APIs, you start with nothing, and you must carefully request new data or material to build up to what you’re trying to create. Additionally, the typical API output is a JSON object, which means that for a single API call, the output might be different depending on the parameters — which in turn means that it might take more thought to combine objects together. You can think of this as similar to building a house: Even if a worker requests “wood” or “nail,” there are many shapes, types, and characteristics involved that are not interchangeable, and so it takes many types of components to construct something useful.
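As a small sketch of why combining JSON outputs takes some thought, the two hypothetical records below have different keys, so each one is flattened into a row with explicit defaults. The field names (`name`, `designer`, `yardage`) are made up for illustration and are not taken from a real API response.

```python
import json

# Two hypothetical JSON responses whose shapes differ.
response_a = '{"name": "Cabled Hat", "designer": {"name": "A. Knitter"}}'
response_b = '{"name": "Plain Scarf", "yardage": 400}'


def to_row(raw):
    """Flatten one JSON object into a fixed set of columns, filling gaps with None."""
    record = json.loads(raw)
    designer = record.get("designer") or {}
    return {
        "name": record.get("name"),
        "designer_name": designer.get("name"),
        "yardage": record.get("yardage"),
    }


rows = [to_row(response_a), to_row(response_b)]
for row in rows:
    print(row)
```

Using `.get()` instead of square-bracket lookups means a missing key becomes `None` rather than an exception, so rows from differently shaped outputs can still be stacked into one table.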

Because we are used to working with relational databases, the questions that we came up with were ones that could be easily answered with a relational database, but not quite as easily with data from APIs. For example, we wanted an answer to the question “What percentage of knitting patterns are made by the top 10% of designers?” but, because there is no API call to get “all designers” or “all patterns,” we would have had to limit our universe to “the top n patterns.” As a result, we needed to start thinking about our questions with a limited universe up front, such as “Of designs queued by Riesling…” to ensure that we could pull all results with the API.

Getting started

As we mentioned earlier, this was a hackathon project, so we had a limited amount of time to accomplish our goal! So, we divided and conquered.

I had a homework assignment on APIs (using The Movie Database) from a class in Georgia Tech’s Online MS in Analytics to refer to, so I worked on the API calls while Deepsha focused on the analysis and data visualization. The rest of this article, which covers APIs, is therefore about my experience; future articles will cover Deepsha’s.

API Components

API key or authentication token

Think of an API key as credentials that allow you to access an API. Sometimes this is in the form of only one token or key, and sometimes it comes in the form of a username and password. This allows the server that returns the information to the client to make sure that the client is authorized to get this information and to track how many requests are coming in per key. (The latter allows the API owners to charge per API call or set limits on API calls to reduce costs.)

An authentication token is basically an API key, but it identifies an individual user. This allows someone to add, modify, and delete data about their account, similar to what they could do logged into a website or UI.

Because API keys and authentication tokens can be used to track costs and even access secure account details, you should never share these credentials. Applications and programs often store them in a separate file or prompt the user to enter them at run time. This is why tokens are not included in the provided GitHub Repository or code snippets below, but you are welcome to create your own to try out our code.
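One common way to keep credentials out of source code is to read them from environment variables and fall back to a prompt. The sketch below assumes the hypothetical variable names `RAVELRY_USERNAME` and `RAVELRY_PASSWORD`; nothing in this article or in the Ravelry documentation prescribes those names.

```python
import os
from getpass import getpass


def load_credentials(env=None):
    """Return (username, password) from the environment, prompting if either is missing."""
    env = os.environ if env is None else env
    # Variable names are assumptions made for this sketch.
    username = env.get("RAVELRY_USERNAME")
    password = env.get("RAVELRY_PASSWORD")
    if not username:
        username = input("API username: ")
    if not password:
        password = getpass("API password: ")  # typed input is not echoed to the terminal
    return username, password
```

With this approach the credentials live in your shell session or a local, untracked file, and the code that you share contains no secrets.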

Getting an API key or authentication token is different for each API. For the Ravelry API, you can find instructions here.

API HTTP client

A Hypertext Transfer Protocol (HTTP) client allows you to make HTTP requests to an API. In our restaurant example above, this is the waiter: you can’t tell the kitchen directly what you want to eat, so you need someone to carry the request from your table to the kitchen and bring the result back.

For this project I used http.client in Python.

API request and parameters

An API request is a call that you make to the API, letting it know what action you want to take or what information you hope to receive. This differs for each API that you use, so make sure to refer to that API’s specific documentation. Thankfully, Ravelry has clear documentation about all possible API calls, their parameters, and their outputs: https://www.ravelry.com/api

How can you tell whether an API call is performing as expected? APIs return HTTP status codes with each response, and when a call fails, the code and its description tell you what went wrong. I will get into this in the “Debugging API calls” section further below.
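Python’s standard library includes the standard HTTP status codes, which is handy for turning a numeric code into a readable label. This is a minimal sketch; the helper name `describe_status` is made up for this example.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short, human-readable label for an HTTP status code."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        # Codes outside the standard set have no registered phrase.
        return f"{code} (non-standard status code)"
    return f"{status.value} {status.phrase}"


for code in (200, 403, 404):
    print(describe_status(code))
```

Codes in the 200s indicate success, those in the 400s usually point to a problem with the request or credentials, and those in the 500s point to a problem on the server.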

API example in Python

This example uses the Ravelry API, but I hope these steps help you get started with any API. You can find all Python code used in the associated GitHub Repository.

As I mentioned earlier, Ravelry is a social network and organizational website for yarn crafts. It allows users to keep track of their own yarn, the patterns they want to make, and the patterns they’ve made, as well as to search for patterns and interact with other members. You can think of it like Goodreads or other websites that allow you to keep a log of what you’ve done, a queue of what you would like to do, search for things, and interact with other users.

A first API call

First, I need to import the needed packages:
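A minimal sketch of the imports, assuming only Python’s standard library is needed for the calls in this section (`base64` is included on the assumption that credentials are sent with HTTP Basic authentication):

```python
import base64       # encode credentials for HTTP Basic authentication
import http.client  # open connections and send HTTP requests
import json         # parse and pretty-print JSON responses
```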

Next, I call an API that does not require an API key to ensure that the http.client is working as expected, and the output can be read:
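A sketch of such a check is below. The key-free API used for this step is not named here, so the host and path in the commented usage lines are placeholders to replace with any public endpoint you trust. The `connection_cls` parameter is an addition of this sketch so that the function can be exercised without a network connection.

```python
import http.client


def get_text(host, path, headers=None, connection_cls=http.client.HTTPSConnection):
    """Send a GET request over HTTPS and return (status code, response text)."""
    conn = connection_cls(host, timeout=10)
    try:
        conn.request("GET", path, headers=headers or {})
        response = conn.getresponse()
        body = response.read().decode("utf-8")
        return response.status, body
    finally:
        conn.close()  # always release the connection, even if the request fails


# Example usage (requires network access; host and path are placeholders):
# status, body = get_text("api.example.com", "/status.json")
# print(status)
# print(body)
```

A status of 200 together with readable text in the body is enough to confirm that the client itself works before credentials enter the picture.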

Query output

Great! It’s working!

Calling an API with an API token

Now, I want to try it with Ravelry. I will not be providing a personal token, so please create your own credentials if you would like to try this code.

To start, I chose the GET /color_families.json request because it has no parameters (which could cause complexity):
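A sketch of how the credentials might be attached, assuming the read-only Ravelry credentials are sent with HTTP Basic authentication to the host `api.ravelry.com` (check the Ravelry documentation for your credential type). The helper names are invented for this example.

```python
import base64
import http.client


def basic_auth_header(username, password):
    """Build the Authorization header used by HTTP Basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def get_color_families(auth_username, auth_password):
    """Call GET /color_families.json and return (status code, raw response text)."""
    conn = http.client.HTTPSConnection("api.ravelry.com", timeout=10)  # assumed host
    try:
        conn.request(
            "GET",
            "/color_families.json",
            headers=basic_auth_header(auth_username, auth_password),
        )
        response = conn.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        conn.close()
```

Note that Basic authentication only encodes the credentials; it does not encrypt them, which is one reason the request goes over HTTPS.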

Query output

Gosh, that’s hard to read. So, let me use the json package to display it in a more readable way:
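A minimal sketch of that step: parse the raw text with `json.loads` and write it back out with indentation using `json.dumps`. The `raw` string here is a shortened, hypothetical stand-in for the real response.

```python
import json

# Hypothetical, shortened response text for illustration only.
raw = '{"color_families": [{"id": 1, "name": "Yellow"}, {"id": 2, "name": "Blue"}]}'


def pretty(raw_json):
    """Parse a JSON string and re-serialize it with two-space indentation."""
    return json.dumps(json.loads(raw_json), indent=2)


print(pretty(raw))
```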

Query output

Beautiful!

Calling an API with parameters

Next, I wrote a request that has parameters: GET /patterns/search.json

First, I ran it without any parameters, because no parameters were required:

Both the top and bottom of the query output

Great!

I also noticed that this JSON output has a “patterns” key that contains a list of pattern details and a “paginator” key that contains information about how many search results there were and which page this output came from. This could be useful later for iterating through pages and validating request parameters.

Next, I wanted to add some parameters. I added search query, page, page_size, and — although only alluded to in the documentation — the type of craft:
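Parameters for a GET request travel in the query string at the end of the path. The sketch below builds that path with `urllib.parse.urlencode`; the parameter names follow the ones listed above (with `query` and `craft` assumed as the exact spellings), and the search term is a made-up example.

```python
from urllib.parse import urlencode


def build_search_path(**params):
    """Return the request path for GET /patterns/search.json with a query string."""
    # Drop parameters left as None so they are not sent at all.
    supplied = {key: value for key, value in params.items() if value is not None}
    path = "/patterns/search.json"
    return f"{path}?{urlencode(supplied)}" if supplied else path


# The search term below is a made-up example.
print(build_search_path(query="cabled hat", page=6, page_size=2, craft="knitting"))
```

`urlencode` also takes care of escaping, so the space in the search term is encoded rather than breaking the request.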

Both the top and bottom of the query output

Here, from the “paginator” JSON key, we can see that this output is page 6 of 116 for these parameters: there are two results per page and 232 total results, so the last page is page 116.
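The paginator values can be sanity-checked and used to plan the remaining requests. In this sketch the numbers come from the output described above, while the key names (`page`, `page_size`, `results`, `page_count`, `last_page`) are assumptions about how the paginator is laid out.

```python
import math

# Numbers from the output described above; key names are assumptions.
paginator = {"page": 6, "page_size": 2, "results": 232, "page_count": 116, "last_page": 116}


def expected_page_count(paginator):
    """Recompute the number of pages from the result count and the page size."""
    return math.ceil(paginator["results"] / paginator["page_size"])


def remaining_pages(paginator):
    """Page numbers still to be requested after the current page."""
    return range(paginator["page"] + 1, paginator["last_page"] + 1)


print(expected_page_count(paginator) == paginator["page_count"])
print(len(remaining_pages(paginator)))
```

Looping over `remaining_pages(paginator)` and passing each number as the `page` parameter is one way to collect every result for a search.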

Creating a function for an API call

The next thing to do is make this repeatable. If I wanted to use this multiple times in my code, rather than copying and pasting the code, it is much easier, more readable, and less error prone to define a function:
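A sketch of what such a function might look like, assuming HTTP Basic authentication and the host `api.ravelry.com`. The function name and the `connection_cls` parameter are choices made for this example; the version actually used for the project is in the GitHub Repository.

```python
import base64
import http.client
import json
from urllib.parse import urlencode


def search_patterns(auth_username, auth_password,
                    connection_cls=http.client.HTTPSConnection, **params):
    """Call GET /patterns/search.json and return the parsed JSON response."""
    token = base64.b64encode(
        f"{auth_username}:{auth_password}".encode("utf-8")
    ).decode("ascii")
    path = "/patterns/search.json"
    if params:
        path += "?" + urlencode(params)  # e.g. ?query=...&page=...
    conn = connection_cls("api.ravelry.com", timeout=10)  # assumed host
    try:
        conn.request("GET", path, headers={"Authorization": f"Basic {token}"})
        response = conn.getresponse()
        return json.loads(response.read().decode("utf-8"))
    finally:
        conn.close()


# Example usage (requires network access and your own credentials):
# results = search_patterns(authUsername, authPassword, query="hat", page=1, page_size=2)
```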

Query output

Fabulous!

Creating a class for API call functions

Now, what if I want to add more functions? Should authUsername and authPassword be parameters in each function that I create? And what if I want to expand this to an app or package that multiple people could use to make requests — especially if people have different levels of authentication credentials?

The best way to expand this would be to build a class.

What is a class? A class is an outline for an object that defines all of the characteristics the object might have, and these characteristics are available wherever an object of that class is used. An analogy might be to think of a pet. Each pet has an “animal_type”, an “animal_name”, and an “animal_owner”. These attributes can then be used within any function defined in the class. Additionally, a function defined in the class is normally called through an object of that class.
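The pet analogy can be sketched as a small Python class. The attribute names come from the description above; the `introduce` method is invented for illustration.

```python
class Pet:
    """A pet with a type, a name, and an owner."""

    def __init__(self, animal_type, animal_name, animal_owner):
        # Attributes set here are available to every method through `self`.
        self.animal_type = animal_type
        self.animal_name = animal_name
        self.animal_owner = animal_owner

    def introduce(self):
        """Use the stored attributes without asking for them again."""
        return f"{self.animal_name} is a {self.animal_type} who lives with {self.animal_owner}."


hamilton = Pet("cat", "Hamilton", "Riesling's sister-in-law")
print(hamilton.introduce())
```

The attributes are supplied once, when the object is created, and `introduce` is called on that object rather than on its own.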

Query output
Cat tax: here is Riesling’s sister-in-law’s cat Hamilton! Photo used with her (and the cat’s) permission.

So, I could do the same thing and define a class that has the characteristics of the authUsername and authPassword, allowing the user to enter credentials only once, and allowing the user to access functions with different credentials throughout the code if needed.

Here, I define the class and functions that each perform a different API call:
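A condensed sketch of such a class is below, with two of the calls shown. The class name, method names, and the `connection_cls` parameter are assumptions made for this example, as are the use of HTTP Basic authentication and the host `api.ravelry.com`; the full class used for the project is in the GitHub Repository.

```python
import base64
import http.client
import json
from urllib.parse import urlencode


class RavelryClient:
    """Stores credentials once and exposes one method per API request."""

    host = "api.ravelry.com"  # assumed API host

    def __init__(self, auth_username, auth_password,
                 connection_cls=http.client.HTTPSConnection):
        token = base64.b64encode(
            f"{auth_username}:{auth_password}".encode("utf-8")
        ).decode("ascii")
        self._headers = {"Authorization": f"Basic {token}"}
        self._connection_cls = connection_cls

    def _get(self, path, **params):
        """Shared GET helper: returns (status code, response text)."""
        if params:
            path += "?" + urlencode(params)
        conn = self._connection_cls(self.host, timeout=10)
        try:
            conn.request("GET", path, headers=self._headers)
            response = conn.getresponse()
            return response.status, response.read().decode("utf-8")
        finally:
            conn.close()

    def color_families(self):
        """GET /color_families.json, parsed from JSON."""
        _status, body = self._get("/color_families.json")
        return json.loads(body)

    def search_patterns(self, **params):
        """GET /patterns/search.json with optional search parameters."""
        _status, body = self._get("/patterns/search.json", **params)
        return json.loads(body)
```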

Now, I can define an instance of the class:

And then call each function using that instance:

Query output
Query output
Query output

Debugging API calls

You may have noticed that I defined another function in that class that I haven’t called yet. That is because it doesn’t work!

As you can see in the documentation, the “favorites” call needs authenticated credentials, and I only created read-only credentials.

Query output

Here we get two errors:

  1. 403 Forbidden, because this is not a read-only API method and requires authenticated credentials. You can set up authenticated credentials by following Ravelry’s instructions to make this function work.
  2. Because I assumed the output would be in a specific format, I get a JSONDecodeError. This could be solved with a try/except block, or with a conditional check that the API returned a 200 status code before the output is parsed.
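Both fixes can be combined in one small helper. This is a sketch with made-up names (`ApiError`, `parse_json_response`), not the code from the repository: it checks the status code first and only then attempts to parse the body.

```python
import json


class ApiError(Exception):
    """Raised when a response cannot be used as JSON data."""


def parse_json_response(status, body):
    """Return parsed JSON for a 200 response; otherwise raise ApiError with details."""
    if status != 200:
        # Report the status code and the start of the body instead of trying to parse it.
        raise ApiError(f"request failed with status {status}: {body[:200]}")
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        raise ApiError(f"response was not valid JSON: {exc}") from exc
```

With this in place, the 403 response surfaces as a single, readable error that names the status code, rather than as a JSONDecodeError from deeper in the code.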

One of my favorite parts of working with APIs is that the error codes are very clear and include descriptions to help debug the issues. They make using APIs a lot more approachable.

As a reminder, all of this code is available in the associated GitHub Repository.

What’s next?

All right! I can pull data using an API! So, what’s next?!

As I mentioned earlier, Deepsha and I divided the work on this project so that I would work on the API calls, and she would do the analysis and visualization. As is evident from some of her previous articles, Deepsha is more comfortable using R, but I used Python for these API functions. Welp. Guess we’re each on our own then… And Deepsha will have to rework all of this in R….

Just kidding! We planned this! Follow Data Science @ Microsoft or Deepsha Menghani on Medium to get notified about our next article on how to collaborate on a project with both Python and R, where Deepsha will walk through how we worked together, along with the challenges that we faced.

We’re also sure that you’re dye-ing (yarn pun) to know about the analysis that we did with our Ravelry data! For that, follow Data Science @ Microsoft or both of us to see the output of our analyses and how we might package them together for a portfolio piece on our resumes. While you’re waiting, you can check out my other resume tips.

Knitting photos

Lastly, I couldn’t end this article without showing off some of my favorite knitting projects! I hope Deepsha will show off some of hers in the next article.

Project details: rieslingm’s Don’t Panic on Ravelry. Photographer: Rachel Betson (@rbetsonphotography_weddings on Instagram).

Riesling Walker
Data Science at Microsoft

Senior Data Scientist @ Microsoft. I like to talk about data, professional development, gender, the podcasts I’m listening to, and what I’m knitting.