Listen Notes API

Chava Gourarie
Published in Antenna
Nov 15, 2018

I’ve been messing around with the API from Listen Notes, a new-ish podcast database that appears to be making some headway in correcting the terrible state of podcast search.

As of June 2018, the database had 484,719 podcasts, and 29,732,875 episodes (according to the Listen Notes website), categorized into 18 categories and 81 subcategories (according to the API and the count function in R).

Listen Notes doesn’t pull from iTunes, but the two are comparable in total podcasts: a Variety article in February put the number of podcasts on iTunes at roughly 500,000.

I ran some searches to get a sense of what data the API will return. A search for ‘ta nehisi coates’ since January 1, 2018 returns 120 results, with 17 columns, including publisher, episode description, date and genre labels. Here are the genre results for the Coates search:

The genre tags suggest that Coates has recently been coming up a lot more on the Black Panther circuit than in news and politics. This is borne out by a very basic NLP attempt, in which I did a named entity search on the episode descriptions (using a bastardized version of this tutorial). Here are the results:

The three names after Coates and James Baldwin are all comic book writers/artists. Priest worked on Black Panther, while Waid and Samnee show up on episodes about Marvel that make mention of Coates but are not specifically about him.

But honestly, the descriptions are so inconsistent that all this chart means is that someone wrote a really detailed episode description, or has a boilerplate graf that includes a bunch of names/keywords. Still, podcast descriptions are the only way to learn anything about a podcast other than the metadata, so it was worth a try.

In general, it took me a while to wrangle the Search API through R, because a. I don’t know what I’m doing, and b. the API provides syntax and headers for like 7 different programming languages but not R. The more specific challenges were working with UNIX timestamps (in milliseconds!) and figuring out how to do pagination, because the API returns 10 results at a time.
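For anyone hitting the same two snags, here is a minimal sketch of the arithmetic involved, written in Python rather than R since it only needs the standard library. The function names are made up for illustration, and the page size of 10 is simply the figure mentioned above, not something the sketch checks against the API.

```python
from datetime import datetime, timezone


def to_epoch_ms(year, month, day):
    """Return midnight UTC on the given date as a Unix timestamp in milliseconds."""
    moment = datetime(year, month, day, tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)


def from_epoch_ms(ms):
    """Turn a millisecond timestamp back into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def page_offsets(total, page_size=10):
    """List the starting offset of every page needed to cover `total` results."""
    return list(range(0, total, page_size))


start_of_2018 = to_epoch_ms(2018, 1, 1)
print(start_of_2018)                 # 1514764800000
print(from_epoch_ms(start_of_2018))  # 2018-01-01 00:00:00+00:00
print(page_offsets(35))              # [0, 10, 20, 30]
```

The easy mistake is passing seconds where milliseconds are expected (or the reverse), which lands the date range in 1970 or tens of thousands of years in the future, so converting a value back with `from_epoch_ms` is a cheap sanity check.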

Eventually, I figured out how to run a search query and limit it to a range of time, though I suspect there’s a cap on the results, since the Coates search returned exactly 120 results, or 12 full pages of 10.
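The loop I ended up with looks roughly like the sketch below, shown in Python rather than R. Everything specific in it is an assumption to check against the API documentation: the parameter names `q`, `published_after` and `offset`, and the response fields `results` and `next_offset`. The HTTP call is left out on purpose; `fetch_page` is a hypothetical function you would supply, and `make_fake_fetch` is a stand-in that serves made-up rows so the loop can run without a key.

```python
def collect_results(fetch_page, query, published_after_ms, max_pages=50):
    """Follow the offset from page to page and return every result as one list."""
    results = []
    offset = 0
    for _ in range(max_pages):  # hard stop so a misbehaving API cannot loop forever
        # Assumed parameter names; confirm them in the API docs.
        params = {"q": query, "published_after": published_after_ms, "offset": offset}
        page = fetch_page(params)
        batch = page.get("results", [])
        results.extend(batch)
        next_offset = page.get("next_offset")
        # Stop on an empty page or when the offset fails to advance.
        if not batch or next_offset is None or next_offset <= offset:
            break
        offset = next_offset
    return results


def make_fake_fetch(total, page_size=10):
    """Build a stand-in for the HTTP call that serves `total` made-up rows."""
    rows = [{"id": i} for i in range(total)]

    def fetch_page(params):
        start = params["offset"]
        return {"results": rows[start:start + page_size], "next_offset": start + page_size}

    return fetch_page


episodes = collect_results(make_fake_fetch(25), "ta nehisi coates", 1514764800000)
print(len(episodes))
```

Counting the rows that come back against whatever total the API reports is one way to tell a real cap from a search that simply ran out of matches.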

This was just a way to learn what was in the API, and how to work with it in R. Hopefully I’ll get to some more interesting analysis next.
