Ingesting Zoho CRM Data Using API in Python

Sheranga Gamwasam
5 min readFeb 25, 2024

--

CRM (Customer relationship management) software solutions play important role in businesses of any scale to manage and track their daily business activities across different modules such as Users, Calls, Meetings, Emails. Leads, Products, Sales and more. This helps in making better decision to improve the business to next level.

Zoho CRM is one of the top CRM software solution in the market currently. Today I’ll address the steps to be followed to ingest Zoho CRM data using COQL (CRM Object Query Language) in python.

  1. Create a Self Client Application and Get the Client ID and Secret.
  2. Add required Scopes and Retrieve authorization code.
  3. Generate Refresh Token.
  4. Generate Access Token.
  5. Download Zoho CRM data.

Create a Self Client Application and Get the Client Id and Secret

First you need to register your application at the Zoho API Console to receive your Client ID and Client Secret.

Then click on the Get Started and Hover the Self Client and click CREATE NOW, After that click OK. Once you create a Self Client Application, it will generate a Client ID and Secret.

Add required Scopes and Retrieve authorization code

You can generate an authorization code by providing valid and required scope details and setting the time duration under Generate Code.

Now, you have successfully completed the first two stages and obtained the required client credentials. Let’s proceed to the next step to generate an refresh token.

Generate Refresh Token

When you run the code snippet below with client credentials, it will generate refresh token. (The Code expires within Time Duration, defaulting to 3 minutes that you selected when creating Code)

Generate Access Token

An Access token is required to download Zoho CRM data, but it expiries every hour. If you need to generate a new access token, you have to use a refresh token.

Download Zoho CRM data

The Initial Step in Download section is to identify the required data points that need to be ingested using COQL because Zoho CRM have created different modules based different objectives. The table below shows some example modules with their description.

COQL for User modules

SELECT id, first_name, last_name, email, alias, phone, mobile, website, fax, street, city, state, country, zip, locale, country_locale, time_zone,
role.id, dob, created_by.id, Modified_By.id, created_time, Modified_Time,profile.id, time_format, Currency, status,confirm, zuid,
date_format, Current_Shift, Next_Shift, Shift_Effective_From, number_separator, decimal_separator, sort_order_preference__s,
name_format__s, status_reason__s, Source__s
FROM users
WHERE created_time > '{watermark_date}'
ORDER BY created_time limit {offset}, {limit}

COQL for Calls Modules

SELECT  id, Owner.id, Subject, Call_Type, Call_Purpose, Who_Id.id, What_Id.id, Call_Start_Time, Call_Duration, Call_Duration_in_seconds,
Description, Call_Result, Created_By.id, Created_Time, Modified_Time,Reminder, Layout, Tag, User_Modified_Time,
System_Related_Activity_Time, User_Related_Activity_Time, System_Modified_Time, Outgoing_Call_Status, Call_Agenda, Caller_ID, Dialled_Number, Customer_Number, Scheduled_In_CRM, Voice_Recording__s,
$se_module
FROM Calls
WHERE Created_Time > '{watermark_date}'
ORDER BY Created_Time limit {offset}, {limit}

COQL for Events Module

SELECT id, Owner.id, Event_Title, Venue, Who_Id.id, What_Id.id, Created_By.id, Modified_By.id, Created_Time, Modified_Time, Description,
Recurring_Activity, Remind_At, Currency, Exchange_Rate, All_day, Event_Purpose, Event_Type,Layout, Check_In_Time, Check_In_By.id,
Check_In_Comment, Check_In_Sub_Locality, Check_In_City, Check_In_State,
Check_In_Country, Latitude, Longitude, ZIP_Code, Check_In_Address, Check_In_Status, Start_DateTime, End_DateTime, Meeting_Notes_Test,
Tag, User_Modified_Time, System_Related_Activity_Time, User_Related_Activity_Time, System_Modified_Time, msexchangesync__Event_Source, msexchangesync__Events_MS_Id, $se_module,
$u_id
FROM Events
WHERE Created_Time > '{watermark_date}'
ORDER BY Created_Time limit {offset}, {limit}

COQL for Emails

SELECT id, Subject, Status, Sent_On, Owner.id, Sent_To, Created_Time
FROM Emails
WHERE Created_Time > '{watermark_date}'
ORDER BY Created_Time limit {offset}, {limit}

The sample code snippets for download are shown below.

from datetime import datetime, timedelta
import json
import requests
import pandas as pd

def __get_payload_query__(path, watermark_date, offset, limit):
"""Function to get the COQL payload.

Args:
path: the path where COQL payload file is available.
"""
select_query = ""
try:
with open(path, 'r') as file:
query = file.read()

select_query = query.replace('{watermark_date}',watermark_date)
select_query = select_query.replace('{offset}',str(offset))
select_query = select_query.replace('{limit}',str(limit))
except Exception as Error:
print(Error)
return select_query


def __get_data_from_zoho_api__(watermark_date,path, access_token):

""" Get the API response by passing Created_Time, limit(max is 2000) and offset(pagination)
Note: we can interate maximum 10,000 records with offset parameter
"""

url = "https://www.zohoapis.com/crm/v5/coql"
limit = 2000

offset = 0
more_records_status = True
yesterday_date = (datetime.now() - timedelta(1)).strftime('%Y-%m-%d')

if watermark_date.split('T')[0] < yesterday_date:
while more_records_status == True:

select_query = __get_payload_query__(
path=path,
watermark_date=watermark_date,
offset=offset,
limit=limit)

payload = json.dumps({
"select_query": select_query
})

headers = {
'Content-Type': 'application/json',
'Authorization': 'Bearer ' + str(access_token)
}

response = requests.request("POST", url, headers=headers, data=payload)

if response.status_code==200:

json_response = response.json()

if 'data' in json_response:

data_df = pd.json_normalize(response.json()['data'])
data_df.columns = [x.lower() for x in data_df.columns]

file_name = "zoho/zoho_" + str(data_df['created_time'].max()) + "_" + str(datetime.now()) + "_.parquet"
data_df.to_parquet(file_name)

if 'info' in json_response:

more_records_status = json_response['info']['more_records']

if more_records_status==True:
offset = offset + limit

else:
more_records_status=False

Reference

--

--