Automating URL Submission to Google for Indexing with Custom Search API and Indexing API

Sadith Vithanage
4 min read · Sep 21, 2023

Introduction

Ensuring that your website’s pages are indexed by Google is crucial for improving your online visibility and driving organic traffic. When you create new content or update existing pages, it’s essential to check whether these URLs are indexed. In this article, we’ll explore how to use the Google Custom Search API to determine if URLs are indexed and, if not, how to submit them to the Google Search Console using the Indexing API.

Part 1: Checking URL Indexation with Google Custom Search API

The Google Custom Search API is a powerful tool that lets you programmatically search Google’s index for specific URLs. With just a few lines of code, you can determine whether a URL is indexed or not.

Step 1: Set Up Your Custom Search Engine

Before we dive into the code, you’ll need to set up a Custom Search Engine (CSE) on Google:

  1. Go to the Google Custom Search page.
  2. Create a new custom search engine and configure it to search the entire web.
  3. Retrieve your Custom Search Engine ID from the setup.

Step 2: Obtain an API Key

Next, you’ll need to obtain an API key to use the Google Custom Search API:

  1. Go to the Google Developers Console.
  2. Create a new project or use an existing one.
  3. Enable the “Custom Search API” for your project.
  4. Generate an API key for your project.
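
With both values in hand, it can help to see what a single lookup looks like on the wire. The sketch below only builds the request URL for the Custom Search JSON API's documented REST endpoint and does not send anything. The helper name build_cse_request_url and the example.com page are illustrative assumptions, not part of the original script; the key, cx, and q parameters correspond to developerKey, cx, and q in the client library call.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Documented REST endpoint of the Custom Search JSON API
CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_cse_request_url(api_key, cse_id, page_url):
    """Return the GET URL for a `site:` lookup of page_url (hypothetical helper)."""
    params = {
        "key": api_key,            # API key from Step 2
        "cx": cse_id,              # Custom Search Engine ID from Step 1
        "q": f"site:{page_url}",   # restrict results to the page being checked
    }
    return f"{CSE_ENDPOINT}?{urlencode(params)}"


request_url = build_cse_request_url(
    "YOUR_API_KEY", "CUSTOM_SEARCH_ENGINE_ID", "https://example.com/blog/post"
)
print(request_url)

# Decode the query string again to confirm what would be sent
sent = parse_qs(urlsplit(request_url).query)
print(sent["q"][0])
```

Sending a GET request to that URL returns the same kind of JSON response that the client library call in the script below works with.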

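Now that you have your Custom Search Engine ID and API key, you can use them in the code below to check if a URL is indexed. The script uses the google-api-python-client package and reads the URLs to check from a CSV file named input_urls.csv with a column named URL: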

import csv
from googleapiclient.discovery import build
import time
import random
import logging  # Import the logging module

# Replace with your own API key and Custom Search Engine ID
api_key = 'YOUR_API_KEY'
custom_search_engine_id = 'CUSTOM_SEARCH_ENGINE_ID'

def is_indexed(url):
    try:
        service = build("customsearch", "v1", developerKey=api_key)
        result = service.cse().list(q=f"site:{url}", cx=custom_search_engine_id).execute()

        if 'items' in result and len(result['items']) > 0:
            return "Indexed"
        else:
            return "Not Indexed"
    except Exception as e:
        # Capture and log the exception
        error_message = str(e)
        logging.error(f"Error for URL {url}: {error_message}")
        return f"Error: {error_message}"

# Read URLs from the input CSV file
urls_to_check = []
with open('input_urls.csv', 'r') as csvfile:
    csvreader = csv.DictReader(csvfile)
    for row in csvreader:
        url = row['URL']
        urls_to_check.append(url)

# Create a new CSV file to store the results
output_file = 'indexed_urls.csv'
with open(output_file, 'w', newline='') as csvfile:
    fieldnames = ['URL', 'Indexing Status']
    csvwriter = csv.DictWriter(csvfile, fieldnames=fieldnames)

    csvwriter.writeheader()

    for url in urls_to_check:
        try:
            indexing_status = is_indexed(url)
            csvwriter.writerow({'URL': url, 'Indexing Status': indexing_status})

            # Add a random delay between 5 and 10 seconds before the next request
            delay = random.uniform(5, 10)
            time.sleep(delay)
        except Exception as e:
            error_message = str(e)
            csvwriter.writerow({'URL': url, 'Indexing Status': f"Error: {error_message}"})
            logging.error(f"Error for URL {url}: {error_message}")

print(f"Results saved to {output_file}")

This code defines the is_indexed function, which takes a URL as input and checks its indexing status using the Google Custom Search API. It returns "Indexed" if the URL is indexed, "Not Indexed" if it's not, or an error message in case of an exception.
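
To make the decision rule concrete, here is a minimal sketch that applies it to hand-written response dictionaries instead of live API calls. The sample responses are trimmed, hypothetical stand-ins for real ones, and has_exact_match is an optional stricter check that we are assuming you might want, since a site: query can also match other pages under the same path.

```python
def indexing_status(result):
    """Apply the article's rule: any returned item counts as indexed."""
    return "Indexed" if result.get("items") else "Not Indexed"


def has_exact_match(result, url):
    """Optional, stricter check: is this exact URL among the returned links?"""
    wanted = url.rstrip("/")  # ignore a trailing slash when comparing
    return any(
        item.get("link", "").rstrip("/") == wanted
        for item in result.get("items", [])
    )


# Hypothetical, heavily trimmed responses (real ones carry many more fields)
hit = {"items": [{"link": "https://example.com/blog/post"}]}
miss = {"searchInformation": {"totalResults": "0"}}

print(indexing_status(hit))
print(indexing_status(miss))
print(has_exact_match(hit, "https://example.com/blog/post/"))
```

Whether the looser or the stricter rule fits depends on how your URLs are structured; the original script uses the looser one.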

You can incorporate this code into your workflow to check the indexing status of multiple URLs programmatically.

In the next part of this article, we’ll explore how to submit URLs that are not indexed to the Google Search Console using the Indexing API. This two-step process will help you maintain an up-to-date index of your website’s content on Google.
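
One way to connect the two steps is to filter the Part 1 results down to the URLs marked "Not Indexed" and write them to the data.csv file that Part 2 reads. The sketch below is an assumption about how you might glue the scripts together, not part of the original code; write_pending_urls is a hypothetical helper, and the demo runs against sample rows in a temporary directory so it does not touch your real files.

```python
import csv
import os
import tempfile


def write_pending_urls(results_path, pending_path):
    """Copy URLs whose status is "Not Indexed" into a one-column CSV.

    Rows with an "Error: ..." status are skipped so they can be re-checked
    later instead of being submitted blindly. Returns the number of URLs written.
    """
    with open(results_path, newline="") as src:
        pending = [
            row["URL"]
            for row in csv.DictReader(src)
            if row["Indexing Status"] == "Not Indexed"
        ]
    with open(pending_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(["URL"])  # header expected by the Part 2 script
        writer.writerows([url] for url in pending)
    return len(pending)


# Demo with made-up rows in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    results_path = os.path.join(tmp, "indexed_urls.csv")
    pending_path = os.path.join(tmp, "data.csv")
    with open(results_path, "w", newline="") as f:
        csv.writer(f).writerows([
            ["URL", "Indexing Status"],
            ["https://example.com/a", "Indexed"],
            ["https://example.com/b", "Not Indexed"],
            ["https://example.com/c", "Error: quota exceeded"],
        ])
    count = write_pending_urls(results_path, pending_path)
    with open(pending_path, newline="") as f:
        pending_rows = list(csv.reader(f))

print(count, pending_rows)
```

In a real run you would call write_pending_urls("indexed_urls.csv", "data.csv") after the Part 1 script finishes.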

Part 2: Submitting Not Indexed URLs to Google Search Console

In Part 1, we learned how to check the indexing status of URLs using the Google Custom Search API. Now, in Part 2, we will explore how to submit URLs that are not indexed to the Google Search Console using the Indexing API.

Prerequisites

Before we dive into the code, you’ll need to prepare a few things:

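  1. Service Account JSON Key: In the Google Cloud Console, create a service account in a project that has the Indexing API enabled, and download its JSON key file. Add the service account’s email address as an owner of your property in Google Search Console; otherwise the API rejects requests with a permission error. Replace "YOUR_SERVICE_ACCOUNT_JSON_FILE" in the code below with the path to your JSON key file.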
  2. Python Libraries: Make sure the libraries the script imports are installed: oauth2client, httplib2, and pandas.

from oauth2client.service_account import ServiceAccountCredentials
import httplib2
import json
import pandas as pd

# Replace with your service account JSON key file path
JSON_KEY_FILE = "YOUR_SERVICE_ACCOUNT_JSON_FILE"
SCOPES = ["https://www.googleapis.com/auth/indexing"]

# Initialize credentials and HTTP client
credentials = ServiceAccountCredentials.from_json_keyfile_name(JSON_KEY_FILE, scopes=SCOPES)
http = credentials.authorize(httplib2.Http())

def indexURL(urls, http):
    ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"

    for u in urls:
        # Build the notification payload for this URL
        payload = json.dumps({"url": u.strip(), "type": "URL_UPDATED"})

        response, content = http.request(ENDPOINT, method="POST", body=payload)

        result = json.loads(content.decode())

        # For debug purposes
        if "error" in result:
            print("Error({} - {}): {}".format(result["error"]["code"], result["error"]["status"], result["error"]["message"]))
        else:
            print("urlNotificationMetadata.url: {}".format(result["urlNotificationMetadata"]["url"]))
            print("urlNotificationMetadata.latestUpdate.url: {}".format(result["urlNotificationMetadata"]["latestUpdate"]["url"]))
            print("urlNotificationMetadata.latestUpdate.type: {}".format(result["urlNotificationMetadata"]["latestUpdate"]["type"]))
            print("urlNotificationMetadata.latestUpdate.notifyTime: {}".format(result["urlNotificationMetadata"]["latestUpdate"]["notifyTime"]))

# Read URLs from a CSV file (data.csv) and submit each one
df = pd.read_csv("data.csv")
indexURL(df["URL"], http)

  • Replace "YOUR_SERVICE_ACCOUNT_JSON_FILE" with the path to your service account JSON key file.
  • Ensure you have a CSV file named “data.csv” containing a column named “URL” with the URLs you want to submit for indexing.

When you run this code, it submits each URL in the file to the Indexing API. For every URL, the script prints either the notification metadata that Google returns (the URL, the update type, and the notify time) or the error code, status, and message. A successful response means the notification was accepted, not that the page has already been indexed.
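
If you would rather collect these outcomes than print them, the response handling can be pulled into a small function. The sketch below is one possible way to do that, under the assumption that the response body has the same shape the script above reads; summarize_publish_response is a hypothetical helper, and both sample bodies are illustrative rather than captured from a real request.

```python
import json


def summarize_publish_response(body):
    """Flatten a urlNotifications:publish response body (bytes or str) into a dict."""
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    result = json.loads(body)

    if "error" in result:
        err = result["error"]
        return {
            "ok": False,
            "code": err.get("code"),
            "status": err.get("status"),
            "message": err.get("message"),
        }

    latest = result.get("urlNotificationMetadata", {}).get("latestUpdate", {})
    return {
        "ok": True,
        "url": latest.get("url"),
        "type": latest.get("type"),
        "notify_time": latest.get("notifyTime"),
    }


# Illustrative bodies only; real responses may include additional fields
ok_body = (
    b'{"urlNotificationMetadata": {"url": "https://example.com/b", '
    b'"latestUpdate": {"url": "https://example.com/b", "type": "URL_UPDATED", '
    b'"notifyTime": "2023-09-21T10:00:00Z"}}}'
)
err_body = (
    b'{"error": {"code": 403, "status": "PERMISSION_DENIED", '
    b'"message": "Permission denied."}}'
)

ok_summary = summarize_publish_response(ok_body)
err_summary = summarize_publish_response(err_body)
print(ok_summary)
print(err_summary)
```

Returning a dictionary per URL makes it straightforward to write the outcomes to a CSV file, in the same way Part 1 records its indexing statuses.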

This two-part process, combining the Google Custom Search API and the Google Search Console Indexing API, allows you to efficiently manage the indexing of your website’s content on Google. It’s a valuable technique for maintaining an up-to-date presence in Google’s search results.
