It’s not often that you find a page holding a top position for a highly coveted keyword that doesn’t simultaneously hold top positions for various similar keywords. As SEOs, we look to create content targeting a specific query that also attracts terms similar to that core keyword. To find these correlating keywords, we are often left to scrape the “People also search for” section of the SERP or lean on third-party platforms’ keyword research tools. But these approaches are limited in scope, lack scalability, and can leave us wanting much more from the data. Instead, we can leverage Python for a more streamlined solution.
The idea for this script came up in a conversation between two former colleagues of mine that I happened to eavesdrop my way into. Discussing searches around “ADHD quizzes,” we were aiming to interpret the user’s true intent and find other ways users were looking to understand ADHD symptoms. A user can search for “ADHD quiz” with the intent of finding out whether they have ADHD. Then again, the same search can be about your knowledge of ADHD, whether you are looking to diagnose someone else or simply trying to become more informed without having it yourself. How do we create distinctions between queries to understand their relevance, and how can we quantify these learnings to expedite both keyword research and keyword insights gathering? I took that conversation as a challenge and gave it my shot.
For that, we will aim to emulate Google’s interpretation of a keyword and how it connects multiple queries to a single result. By inputting a core term, we will gather the top results, collect the highly ranked keywords associated with those SERP listings, look for the top results associated with THOSE keywords, and then quantify relevancy against the original input. In other words, let’s build a spider web from the primary term to its associated terms to see what other relevant queries Google ties to our search.
To do this, we will be using the SEMRush API (of which I am a huge, unpaid advocate). Note that with this script, API credit usage can vary based on the original input keyword. Safeguards have been placed within the functions to limit credit usage, but I have seen this script consume many (MANY) API credits on more intensive projects.
We will begin with importing the necessary libraries and functions.
- build_seo_urls(): allows us to ping the SEMRush API and return the top 10 results for a keyword.
- parse_response(): parser for extracting data from SEMRush API call.
- url_org(): pings the SEMRush API to give us every keyword in positions 1–10 for a given URL.
- secondary_later(): runs through the process of collecting keywords across the URL set found via url_org().
- third_layer_setup(): takes the second stage and compares it to the core results, calculating relevancy of semantically similar keywords through an occurrence score.
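To make the first two steps above concrete, here is a minimal sketch of what build_seo_urls() and parse_response() might look like. This is my own illustration, not the script itself: it assumes SEMRush's standard analytics endpoint, the `phrase_organic` report type, and the semicolon-delimited text format the API returns.

```python
import urllib.parse

API_BASE = "https://api.semrush.com/"

def build_seo_urls(keyword, api_key, database="us"):
    """Build the SEMRush request URL for the top 10 organic results of a keyword.
    (Endpoint and parameter names here follow SEMRush's public analytics API.)"""
    params = {
        "type": "phrase_organic",   # top organic listings for a phrase
        "key": api_key,
        "phrase": keyword,
        "database": database,       # market to search within, e.g. "us", "uk"
        "display_limit": 10,        # cap at the top 10 to limit credit usage
        "export_columns": "Dn,Ur",  # domain and ranking URL
    }
    return API_BASE + "?" + urllib.parse.urlencode(params)

def parse_response(raw):
    """Parse SEMRush's semicolon-delimited text response into a list of dicts,
    one per result row, keyed by the header line."""
    lines = raw.strip().splitlines()
    headers = lines[0].split(";")
    return [dict(zip(headers, line.split(";"))) for line in lines[1:]]
```

A `url_organic`-style call for url_org() would follow the same pattern, swapping the `phrase` parameter for a `url` one and requesting keyword, position, and volume columns.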
Running the Script
You will want to add your SEMRush API key to the api_key variable and set the “database” key in build_seo_urls to whatever market you are looking to search within.
The core of the script is around 15 lines, in which each function executes in a spider-like fashion. When you run the script, enter your starting keyword at the prompt to begin. Once the script finishes, it will tell you how many semantically similar keywords it found and export them to a CSV with correlating search volume and an occurrence score from 1 to 10. Your starting query will always score 10 (10 out of 10 URLs rank for your original input because…you know…that’s where we started). From there, every other keyword is scored on the same basis.
The rule of thumb I have leveraged with this script is that any keyword with an occurrence score above 7 is highly relevant to your source. Scores of 6 through 3 surface great added content opportunities or new areas of exploration. Scores of 2 through 1 give you a good sense of what other relevant searches exist in the overall space of your original query.
There are many use cases for this script beyond traditional SEO research: expedited keyword exploration (as often needed for new business initiatives, enterprise-scale keyword research, and researching queries that are brand new to you), content ideation, gauging consumer sentiment around a particular subset of searches, and gathering data for training clustering models. The world is your oyster! For the full script, check it out via my GitHub!