The purpose of this article is to provide you with a useful Python program I’ve created that connects to Google’s Knowledge Graph API. The program is built on top of the basic skeletal model provided by Google. This updated program allows the user to:
- Query the API for a specific term and see which knowledge graphs currently show up in search for it.
- Select how many results you want to be returned.
- Filter the results by schema.org types (more on this later).
- Save the results to a local file for analysis.
What are Knowledge Graphs?
Chances are that if you’re reading this for SEO purposes, you already know what knowledge graphs are, so I’ll refrain from spending too much time on that subject. In essence, knowledge graphs are the panels seen on the right side of search results for a myriad of entity types. These “types” can vary from people, places, and music to scientific theories, art, and more. Depending on the type, knowledge graphs provide the user with information about the subject. If the knowledge graph is for a corporation, there is typically information on the current CEO, headquarters, year founded, etc.
The data pulled into the knowledge graphs can be influenced by the structured data present on your web pages, but we won’t go too much into those specifics.
What is the Google Knowledge Graph API?
Google’s Knowledge Graph API provides direct access to the database, where we can see which knowledge graphs show up for a given query. The results appear to be independent of the user’s location, giving us a truer idea of what the ecosystem looks like in general. The API returns a ranked list of knowledge graphs that regularly appear for a certain query, providing a high-level view without having to perform a sequence of arbitrary searches in a browser. For a more in-depth explanation of the API’s capabilities, check out the official Google Knowledge Graph API documentation.
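To make the API concrete, here is a minimal sketch of how a request to it can be built. The endpoint URL and the parameter names (`query`, `key`, `limit`, `indent`) come from Google’s documentation; the function name and the placeholder key are mine.

```python
from urllib.parse import urlencode

# Documented endpoint of the Knowledge Graph Search API.
ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"

def build_search_url(query, api_key, limit=10):
    """Return the full request URL for a Knowledge Graph search."""
    params = {
        "query": query,   # the search term
        "key": api_key,   # your Google Cloud API key
        "limit": limit,   # max number of entities to return
        "indent": True,   # pretty-print the JSON response
    }
    return ENDPOINT + "?" + urlencode(params)
```

Fetching that URL (for example with the requests module installed later in this article) returns a JSON response containing the fields described below.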
How is this API Useful for SEO?
Since knowledge graphs are highly sought after by businesses and SEOs due to their large visible footprint in search, being able to identify areas of opportunity for winning knowledge graphs is an enticing proposition.
This is where my knowledge graph program can help run quick audits to identify these opportunities.
As the program stands today, I am only pulling a few chief metrics from the queries:
- Result name — Name present on knowledge graph.
- Result ID (MID) — Unique entity ID of the knowledge graph.
- Result Score — Score of the result; the higher the score, the more often it appears for the given query.
- Result Description — Provides the description on the knowledge graph, if present; descriptions are typically sourced from Wikipedia. I’ve made this an optional field because many knowledge graph results don’t have descriptions, and making it mandatory would shorten the result list.
There are many more fields you can have returned for your queries; I may add these as optional fields later on. You can see the full list of response fields in Google’s documentation.
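As a rough sketch of working with these metrics, here is how they can be pulled out of the API’s JSON response. The field names (`itemListElement`, `result`, `@id`, `resultScore`, `detailedDescription.articleBody`) are from Google’s documented response format; the sample response is illustrative, not real API output.

```python
def extract_metrics(response):
    """Pull name, MID, score, and description from a
    Knowledge Graph Search API response dict."""
    rows = []
    for element in response.get("itemListElement", []):
        result = element.get("result", {})
        rows.append({
            "name": result.get("name"),
            "mid": result.get("@id"),          # entity ID, e.g. "kg:/m/07c0j"
            "score": element.get("resultScore"),
            # Descriptions are often missing, hence the defensive .get() chain.
            "description": result.get("detailedDescription", {}).get("articleBody"),
        })
    return rows

# Illustrative sample shaped like the documented response format.
sample = {
    "itemListElement": [
        {"result": {"name": "The Beatles", "@id": "kg:/m/07c0j"},
         "resultScore": 1234.5},
    ]
}
```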
Now for the nuts and bolts. In this section I’ll cover the prerequisites you’ll need in order to use the program. Here is the basic breakdown:
- Install the latest Python release (3.7.2 at the time of writing).
- Install git.
- Download Visual Studio Code (or a suitable equivalent to run the code).
- Clone the program from my Github repository to your machine.
- Install the Python ‘requests’ module.
- Get Google Cloud API key.
I will not be going into minute detail on all aspects of peripheral configuration. If you hit a snag, a simple Google search should help you as most of these issues are heavily documented. Or ask a developer for assistance if you have access to one.
In order to install the latest Python release, simply navigate to Python.org and look under “Downloads”. There should be releases available for all major operating systems.
In this example, we’ll download for Windows and select one with an executable installer. Simply click the option you want and follow the installation prompts until it indicates Python has been installed onto your machine.
We’ll be using git in order to communicate with GitHub, so we need to first install git. You can find the directions on installing git on different OS via their official website.
If you aren’t very familiar with installing packages directly from the command line, they also provide downloadable versions for ease of use.
Downloading Visual Studio Code (or equivalent)
Next we need a platform in which to run the program. I typically work in Visual Studio Code since it’s free and very robust; however, if you prefer an equivalent, feel free to use that. All you need is a platform in which to open and run the program.
Clone the Repository from GitHub
Now we need to clone the repository from my GitHub that contains the program. For this, we’ll be using git (which we installed earlier) as well as using the command terminal.
Here are the steps:
1. On your machine, search for the command terminal native to your OS. On Windows it’s PowerShell, and on macOS it’s Terminal. A quick Google search should help you determine which one you have.
2. Navigate to my GitHub repository containing the program. On the far right you’ll see a green button that says “Clone or download”; we’re going to be cloning. Click the little clipboard icon, which will copy the repository’s path.
3. Open the command line terminal native to your machine. We’re going to clone the project to the desktop for ease of access, so navigate there by entering:
cd Desktop
Hit enter and you should see the desktop folder appended to the path. This is how you navigate in and out of folders via the command line; for our purposes, this is as far as we need to go.
4. Next we’ll clone the GitHub project to our desktop. Enter the git clone command into the terminal, pasting the repository path you copied in step 2, and hit enter:
git clone <paste the copied repository path here>
You should see some activity and a success message in your terminal. It worked! Now if you look at your machine’s desktop you should see a folder called ‘google-knowledge-graph-api’ containing the Python file.
Installing the ‘Requests’ Module
This is a quick one. In order for the Python program to run, all dependencies need to be installed. The program leverages the requests module, which doesn’t come with the standard library and thus needs to be installed separately.
In your command terminal enter this command and hit enter:
pip install requests
You should see activity and a success message; the requests dependency should now be installed on your machine.
Getting Google Cloud API Key
Finally, in order to interact with many of Google’s APIs you are required to obtain an API key. This is actually quite simple: just follow the steps outlined in Google’s official documentation to create an account and get your API key. The key will typically be a long string of random letters and numbers, and you’ll need it in order to use the Python program.
**NOTE** Make sure to keep your API key safe; never publish it anywhere public where it can be exploited.
Using the Python Knowledge Graph API Program
Now that setup is complete, we can move on to actually using the program.
Follow these steps:
1. Open Visual Studio Code.
2. Look at the file path in the terminal at the bottom of the interface.
3. Move the Python file from the desktop to the folder the terminal is pointing to. This ensures Visual Studio Code can find the file. In this example the platform is looking in the ‘python-practice’ folder. It’s worth noting that this path can be changed at any time; however, for simplicity we’ll just keep the default.
4. Once the file has been moved, in the top left click File > Open File…
5. Navigate to the folder and select the Python program you cloned.
6. You should now see the file open in Visual Studio Code.
7. Next we need to insert your unique API key into the program so you can start using it.
8. Save the API key string to a .txt file named ‘api-key.txt’. It’s important that the name is exact, since this is the file the program will look for. Now move this .txt file to the same folder as the Python program (the same process as step 3). Visual Studio Code looks for the required files in this folder, so it’s important to make sure both are present.
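For context, here is a minimal sketch of how a program might read the key from that file. This is my assumption of the mechanism, not necessarily the exact code in the repository.

```python
def load_api_key(path="api-key.txt"):
    """Read the API key from a text file sitting next to the script."""
    with open(path) as f:
        return f.read().strip()  # drop any trailing newline or spaces
```

Because the file is opened by its bare name, both the script and ‘api-key.txt’ must sit in the folder the terminal is pointing to, which is exactly why step 8 matters.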
Now we’re ready to use the program.
Running the Program
Once all of the steps above have been completed, navigate to the program file open in Visual Studio Code. Right-click in the window and select “Run Python File in Terminal”.
This will run the program, and you should see activity in the terminal at the bottom of the screen. This is where you will interact with the program.
**NOTE** Since this program is terminal based, all you need in order to interact with it is your keyboard.
Here is the flow of the program:
1. First you’ll be asked to enter a query. This can be anything you want (e.g., cyber security, McDonald's, The Beatles, etc.).
2. Second, you’ll be asked to enter the number of results you want returned; this can be any numeric value. I typically opt for 10–20.
3. Third, you’ll be asked if you want to apply an ‘entity filter’. This refers to the knowledge graph types mentioned earlier, and is useful if you only want results for, say, a “Person” or an “Organization”. Enter either “Y” or “N”.
If you entered “Y”, you’ll be asked which type you wish to filter by. I’ve provided a link in the program to the schema.org page where you can find many types to search by.
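Under the hood, this entity filter maps onto the API’s documented `types` parameter. A sketch of how the request parameters might be assembled (the function name is mine):

```python
from urllib.parse import urlencode

def build_params(query, api_key, limit, entity_type=None):
    """Assemble query-string parameters, optionally restricting
    results to a single schema.org type."""
    params = {"query": query, "key": api_key, "limit": limit}
    if entity_type:
        # e.g. "Person" or "Organization" from schema.org
        params["types"] = entity_type
    return urlencode(params)
```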
4. Fourth, you’ll be asked if you want a knowledge graph description returned in the results. This is the text present on the knowledge graph, usually sourced from Wikipedia, and is useful for matching results with the knowledge graphs you see in search.
**NOTE** It’s worth reiterating that for many queries, only one or two results will have descriptions present on their knowledge graphs. If a result lacks the description field, the program catches the resulting error and displays it below the result, which is why your result lists tend to be far smaller when opting for descriptions. If you just want an aggregate list of knowledge graphs, enter ’N’ for this field.
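The behaviour described above can be sketched like this (my reading of it, not the program’s exact code): accessing a description that isn’t in the response raises a `KeyError`, which is caught and reported instead of crashing the program.

```python
def description_or_error(element):
    """Return the result's description, or an error message if absent."""
    try:
        return element["result"]["detailedDescription"]["articleBody"]
    except KeyError:
        return "ERROR: no description available for this result."
```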
5. Fifth, you’ll see results appear for the query you entered. You’ll notice that, depending on the query and filter, there is an interesting smattering of results. In many cases you’ll see knowledge graphs that are related to the query but not a direct match. These could be founders of a company, actors in a film, musicians in a band etc.
Next to each result you’ll see an “ID” metric; this is the unique entity ID assigned to knowledge graph results, currently based on Freebase data.
Here’s an excerpt from an article on the topic of these IDs:
The “ids” field refers to the machine-generated identifier (MID), which is key to understanding concepts in both Freebase and the Google Knowledge Graph. A MID is, as per the Freebase wiki, a unique identifier for an entity. It just so happens that Google appears to leverage them and be dependent upon them in some aspects.
You can match up this ID in Google Trends as well to see it in action.
As you’ll notice, next to each result is a numeric value in parentheses. This is the ‘Result Score’: the higher the score, the more often that knowledge graph is returned for the given query. This is useful to know when analyzing queries for knowledge graph opportunities.
If you notice consistently low scores for knowledge graph results for a given query, there’s a potential to optimize for the knowledge graph and overtake the weaker performing results.
6. Sixth, you’ll be asked if you wish to save the results to a local .csv file. If you enter ‘Y’, you’ll see a success message which provides the name of the file so you can find it. The file will be named ‘knowledge-graph-results.csv’. Once saved, simply search your computer for this file and open it to view the saved query results. If you enter ’N’, no results will be saved.
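For reference, saving rows to that file can be done with Python’s standard-library csv module. A minimal sketch follows; the file name comes from the article, while the column layout is my assumption.

```python
import csv

def save_results(rows, path="knowledge-graph-results.csv"):
    """Write result rows to a CSV file with a simple header."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Name", "MID", "Result Score"])  # header row
        writer.writerows(rows)
```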
7. Seventh, you’ll be asked if you want to enter another query. If you enter ‘Y’ the program will loop and you can enter another query to start the process again. This allows you to perform as many queries as you want in a short amount of time. To exit the program, simply enter ’N’ when asked if you want to enter another query.
Hopefully you were able to follow this setup successfully and can put this program to use in your own SEO analyses.
If you have any questions, feel free to reach out. I’ll be making updates to this program to add new features and to ensure everything runs smoothly. If you don’t use the program for a while and then come back, simply clone a fresh copy to your desktop to ensure you have the latest version.