Zendesk to Neo4j Integration
Better control over you reporting needs
Zendesk, one of the leading customer service and engagement platforms, provides built in reporting functionality which will satisfy some reporting metrics. However at some point you are simply going to want more, whether it be entirely unique to your organization or something that the default reporting interface can not satisfy.
All hope should not be lost for they also provide a well documented and robust REST API . The REST APi returns all its results in JSON syntax.
The following details how to use the REST API to export data from Zendesk into Neo4 primarily using APOC through the use of apoc.periodic.commit
and apoc.load.jsonParams
The data model we will construct in Neo4j will be
Whereby the ZendeskTicket is the center of the model represented by the green node (circle).
Each ZendeskTicket has a relationship to the ZendeskUser, represented by the yellow node, who opened the ZendeskTicket as well as the agent who will solve the ZendeskTicket.
Each ZendeskUser has a relationship to the ZendeskOrg, represented by the pink node, the user is a member of.
Additionally, each ZendeskUser has a relationship to any KnowledgeBase Sections, represented by the grey node, the user has choosen to [Follows].
The remaining labels, namely Quarter represented by the blue node and Hour represented by the magenta node are created from the imported data and are used to facilitate reporting.
Finally the Import node, represented by the red node serves as meta-information to support pagination through the Zendesk API. This will be described in further detail.
As to usage of the Zendesk API the following needs to be be understood
Authentication
In the Cypher examples provided below, the calls to the Zendesk REST API will authenticate using a OAUTH token. Generating a token and its use is described at the Zendesk API
Rate Limits
Depending on your Zendesk subscription type, you are limited to the number of REST requests per minute. The Cypher to export the data assume the lowest request rate and to accommodate this you will see references to call apoc.util.sleep(x)
which will cause the Cypher to sleep/pause for x milliseconds
Pagination
By default, most Zendesk Rest endpoints will return at most 100 results in a single call. As such the Cypher script will paginate and request the next 100 results until all results have been returned. This is primarily accomplished through the Neo4j meta-data node with label :Import.
Used Tools
The example below has been tested and implemented with Neo4j 3.4.0 and APOC 3.4.0.1 though would expect this to work with most any APOC supported Neo4j version.
For our Neo4j-Support Zendesk environment this will result in almost 10k nodes, and supporting indexes, with properties and associated relationships on the order of 7 minutes to complete on a empty graph.
Part of the slowness is due to the Rate Limits Zendesk imposes and thus the Cypher purposely slows itself down through 3 calls to apoc.util.sleep(60000)
, thus sleeping for 3 minutes in total.
Preparation
The script will create the following indexes namely for the benefit of MERGE
statements.
create index on :ZendeskOrg(id);
create index on :ZendeskUser(id);
create index on :ZendeskTicket(id);
create index on :ZendeskTicket(initial_severity);
create index on :Hour(hour);
create index on :Quarter(quarter);
Finally, the script will create 4 Import nodes which are used to support pagination.
Zendesk API Usage
Documenting input and output fields and sample JSON return string
- Zendesk Organizations.
- Zendesk Users
- Zendesk Incremental Tickets Export
- Zendesk KB Section and Subscription
For each type of object to be extracted from Zendesk the Cypher is constructed such that we first define the :Import node for the object, representing the starting ‘page’ in Zendesk to request and then will perform a apoc.load.jsonParams within a apoc.periodic.commit which will thus allow us to paginate. For example the following Cypher is used to load Zendesk Organizations
In all Cypher example below, references to https://neotechnology.zendesk.com/api/v2/
will need to be replaced with your Zendesk domain. We make this easy by defining an “url”
parameter for it. Same goes for the :param zd_identifier: “Basic ZGFuYSTlnVlVDNnQ=”
OAUTH Zendesk token
:param url: "https://neotechnology.zendesk.com/api/v2/"
:param zd_identifier: "Basic ZGFuYSTlnVlVDNnQ="
Importing Organizations
MERGE (import:Import {id: 1})
SET import.page = $url + 'organizations.json?page=1',
import.token = $zd_identifier;
CALL apoc.periodic.commit('
match (import:Import {id:1})
CALL apoc.load.jsonParams(import.page,{Authorization: import.token},null)
YIELD value as orgs
WITH orgs, orgs.next_page as next_page
UNWIND orgs.organizations as oneorg
UNWIND oneorg.organization_fields as orgf
MERGE (n:ZendeskOrg {id:oneorg.id})
SET n.name=oneorg.name,
n.url=oneorg.url,
n.created=oneorg.created_at,
n.domain_name=oneorg.domain_names
WITH next_page,
CASE WHEN next_page is null then 0 ELSE 1 END AS count
FOREACH(_ in CASE WHEN count = 0 THEN [] ELSE [1] END |
MERGE (import:Import {id:1})
SET import.page = next_page
)
RETURN count', null);
In the above example the first statement sets the starting page for the Zendesk API to page=1
for requesting Organizations.
The statement withinapoc.periodic.commit
is responsible for allowing the Zendesk pagination to work. In this case we are evaluating the returned JSON parameter named next_page
and if it is set we update the Import node to reflect the next page from the API to fetch.
Importing Users
The same process occurs for loading Users as the Cypher is:
MERGE (import:Import {id: 2})
SET import.page = $url + 'users.json?page=1',
import.token = $zd_identifier;
CALL apoc.periodic.commit('
MATCH (import:Import {id:2})
CALL apoc.load.jsonParams(import.page,{Authorization: import.token},null)
YIELD value AS users
WITH users,users.next_page as next_page
UNWIND users.users as oneuser
MERGE (n:ZendeskUser {id:oneuser.id})
SET n.name=oneuser.name,
n.email=oneuser.email,
n.url=oneuser.url,
n.last_login=oneuser.last_login_at,
n.organization_id=oneuser.organization_id,
n.suspended=oneuser.suspended,
n.created=oneuser.date_created
with next_page,
CASE WHEN next_page is null then 0 ELSE 1 END AS count
FOREACH(_ in CASE WHEN count = 0 THEN [] ELSE [1] END |
MERGE (import:Import {id:2})
SET import.page = next_page
)
RETURN count ', null);
Exporting of tickets is a bit more involved since rather than a next_page attribute on the API URL it passes a unix start time with parameter start_time
. Further the api includes parameter include=metric_sets
which will return internal data about the ticket.
Importing Tickets
The Cypher statement to import tickets:
MERGE (import:Import {id: 3})
SET import.page = $url +
'/incremental/tickets.json?start_time=1293840000&include=metric_sets',
import.token = $zd_identifier;
CALL apoc.periodic.COMMIT('
MATCH (import:Import {id:3})
// so as to not exceed ZD throttle mechanism
CALL apoc.util.sleep(10000)
CALL apoc.load.jsonParams(import.page,{Authorization: import.token},null)
YIELD value as tickets
WITH tickets, tickets.next_page as next_page, tickets.count as APICount
unwind tickets.tickets as oneticket
MERGE (n:ZendeskTicket {id:oneticket.id})
SET n.user_id=oneticket.requester_id,
n.org_id=oneticket.organization_id,
n.url=oneticket.url,
n.ticket_id=oneticket.id,
n.status=oneticket.status,
n.org_id=oneticket.organization_id,
n.ticket_subject=oneticket.subject,
n.ticket_description=oneticket.description,
n.date_created=oneticket.created_at,
n.date_updated=oneticket.updated_at,
n.assigned_id=oneticket.assignee_id,
n.priority=oneticket.priority
WITH n,oneticket, next_page, APICount
UNWIND oneticket.fields as oneticketfields
// take oneticketfields as a json value similar to
// fields:[{id:24134566,value:severity_2},{id:80611248, ... ... ....
// unwind it so it becomes
// {id: 24134566, value: severity_2}
// {id: 80611248, value: ... .... ....
//
// ticket severity id=24134566
SET n.severity = CASE WHEN oneticketfields.id=24134566
THEN oneticketfields.value ELSE null END
// ticket cause id=24134536
SET n.cause = CASE WHEN oneticketfields.id=24134536
THEN oneticketfields.value ELSE null END
// ticket category id=24087433
SET n.category = CASE WHEN oneticketfields.id=24087433
THEN oneticketfields.value ELSE null END
// ticket version id=20781041
SET n.version = CASE WHEN oneticketfields.id=20781041
THEN oneticketfields.value ELSE null END
WITH n,oneticket, next_page, APICount
UNWIND oneticket.metric_set as metrics
SET n.reopens=metrics.reopens,
n.date_solved=metrics.solved_at,
n.assignee_stations=metrics.assignee_stations
WITH n, next_page, APICount,
metrics.first_resolution_time_in_minutes as first_res_time,
metrics.requester_wait_time_in_minutes as req_wait_time,
metrics.full_resolution_time_in_minutes as full_res_time,
metrics.reply_time_in_minutes as reply_time
// need to divide by ##.0 so as to force float division thus 59/60 will return 0.9833 and not 0
SET n.first_resolution_time_in_business_days=
first_res_time.business/1440.0
SET n.first_resolution_time_in_calendar_days=
first_res_time.calendar/1440.0
SET n.requester_wait_time_in_business_hours=
req_wait_time.business/60.0
SET n.requester_wait_time_in_calendar_hours=
req_wait_time.calendar/60.0
SET n.full_resolution_time_in_business_days=
full_res_time.business/1440.0
SET n.full_resolution_time_in_calendar_days=
full_res_time.calendar/1440.0
SET n.reply_time_in_business_hours=reply_time.business/60.0
SET n.reply_time_in_calendar_hours=reply_time.calendar/60.0
WITH next_page, APICount,
CASE WHEN (APICount<1000) then 0 ELSE 1 END AS count
WITH count, next_page, APICount
MERGE (import:Import {id:3}) SET import.page = next_page, import.APICount=APICount
RETURN count ', null);
Unlike exporting of Zendesk Organizations and Users, with Tickets we need to paginate until the JSON count value, now aliased as APICount, is less than 1000. If it the APICount is 1000 then this would indicate the JSON return has not consumed all the data and we need to fetch the next 1000. Also the references similar to FOREACH(_ IN CASE WHEN oneticketfields.id=24134566
may need to be modified as they represent custom fields in Zendesk.
KB Sections and Subscriptions
Finally loading of KB Sections and Subscriptions is performed via
MERGE (import:Import {id: 4})
SET import.page=$url + '/help_center/sections.json?page=1', import.token = $zd_identifier;
MATCH (import:Import {id:4})
CALL apoc.load.jsonParams(import.page,
{Authorization: import.token},null)
YIELD value AS sections
WITH import, sections, sections.next_page AS next_page
UNWIND sections.sections AS section_item
// create the Section Node
MERGE (n:ZendeskSection {id:section_item.id})
ON CREATE SET n += {name:section_item.name, date_created:section_item.created_at,
date_updated:section_item.updated_at,url:section_item.html_url}
WITH import, section_item,
$url + '/help_center/sections/'+section_item.id+
'/subscriptions.json?per_page=200' AS url2
// foreach section then find the subscribers
CALL apoc.load.jsonParams(url2,{Authorization: import.token},null)
YIELD value AS subscribervalue
WITH subscribervalue, section_itemUNWIND subscribervalue.subscriptions AS subscription_itemMATCH (s:ZendeskSection {id:section_item.id})
WITH s,subscription_item
MATCH (n:ZendeskUser {id: subscription_item.user_id})
MERGE (n)-[f:Follows]->(s)
SET f.subscribed_on=subscription_item.created_at;
Create Additional Relationships
Once the data is loaded appropriate additional relationships are created.
// connect user to org
MATCH (u:ZendeskUser) WITH u, u.organization_id AS uorg
MATCH (o:ZendeskOrg) WHERE o.id=uorg
MERGE (u)-[:IS_MEMBER_OF_ORG]->(o);// connect ticket to user and org
MATCH (t:ZendeskTicket)
WITH t, t.user_id AS tsubmitter, t.assigned_id AS towner
MATCH (u:ZendeskUser) WHERE u.id=towner
MATCH (u2:ZendeskUser)-[:IS_MEMBER_OF_ORG]->(org:ZendeskOrg) WHERE u2.id=tsubmitter
MERGE (t)-[:IS_ASSIGNED_TO]->(u)
MERGE (t)<-[:CREATED_ZENDESK_TICKET]-(u2)
MERGE (t)-[:ZENDESK_TICKET_ORG]->(org);// connect to open quarter
MATCH (t:ZendeskTicket)
WITH t, substring(t.date_created,0,4) AS zdyear, toInteger(substring(t.date_created,5,2)) AS zdmonth
WITH t, toString(((zdmonth-1)/3)+1) AS zdquarter, zdyear
WITH t, 'q'+zdquarter+zdyear AS total,
zdyear+'q'+zdquarter AS total2MERGE (zdq:Quarter {quarter:total2})
MERGE (t)-[:ZENDESK_TICKET_OPENED_QUARTER]->(zdq)
RETURN COUNT(*);
// connect to closed quarter
MATCH (t:ZendeskTicket)
WHERE t.status='closed'
WITH t, substring(t.date_solved,0,4) AS zdyear, toInteger(substring(t.date_solved,5,2)) AS zdmonth
WITH t, toString(((zdmonth - 1)/3)+1) AS zdquarter, zdyear
WITH t, 'q'+zdquarter+zdyear AS total, zdyear+'q'+zdquarter AS total2
MERGE (zdq:Quarter {quarter:total2})
MERGE (t)-[:ZENDESK_TICKET_CLOSED_QUARTER]->(zdq)
RETURN COUNT(*);
// connect to "open" hour nodes
MATCH (t:ZendeskTicket)
WITH t, time(datetime(t.date_created)).hour AS total2
MERGE (zdq:Hour {hour:total2})
MERGE (t)-[:ZENDESK_TICKET_HOUR]->(zdq)
RETURN COUNT(*);
Now that the graph is loaded, what data can we get out of it?
Lets look for users who have never logged in or haven’t logged in for more than 6 months?
These user accounts may want to be deleted/suspended due to inactivity.
MATCH (u:ZendeskUser)-[:IS_MEMBER_OF_ORG]->(o:ZendeskOrg)
WHERE NOT u.suspended
AND (NOT exists(u.last_login)
OR duration.inDays(datetime(u.last_login),datetime()).days>180)
RETURN u.name, u.email, o.name, u.last_login, u.suspended,
duration.inDays(datetime(u.last_login),datetime()).days;
Users subscribed per section
MATCH (s:ZendeskSection)<-[:Follows]-(u:ZendeskUser)
RETURN s.name, u.name
Tickets closed per ticket owner and percent of total
MATCH (t1:ZendeskTicket)
WHERE date(datetime(t1.date_created)).year=2018
WITH count(t1) AS ttotal
MATCH (t2:ZendeskTicket)-[:IS_ASSIGNED_TO]-(u:ZendeskUser)
WHERE date(datetime(t2.date_created)).year=2018
WITH ttotal,u.name AS supportmember, count(t) AS utotal
RETURN supportmember, utotal,
(utotal/toFloat(ttotal))*100 AS PercentClosed
ORDER BY utotal DESC;
Tickets closed per ticket owner timezoned when ticket was opened
MATCH (zdq:Quarter)
WITH zdq ORDER BY zdq.quarter DESC SKIP 1 LIMIT 5
MATCH (h:Hour)<-[:ZENDESK_TICKET_HOUR]-(ticket:ZendeskTicket)-[:ZENDESK_TICKET_OPENED_QUARTER]->(zdq)
WITH *, CASE WHEN h.hour IN range(0,6) THEN 'APAC'
WHEN h.hour IN range(7,14) THEN 'EMEA' ELSE 'AMER' END AS region
WITH region, zdq.quarter AS quarter, count(ticket) AS Total_Opened_Tickets
ORDER BY quarter ASC, region ASC
WITH region, collect(quarter) AS quarters, collect(Total_Opened_Tickets) AS tix
RETURN region, tix[0] AS q1, tix[1] AS q2, tix[2] AS q3,
tix[3] AS q4, tix[4] AS q5;
Resulting in (not the actual numbers):
+--------------------------------------+
| region | q1 | q2 | q3 | q4 | q5 |
+--------------------------------------+
| "EMEA" | 42 | 43 | 44 | 45 | 46 |
| "AMER" | 52 | 53 | 54 | 55 | 56 |
| "APAC" | 32 | 33 | 34 | 35 | 36 |
+--------------------------------------+
I hope this is useful for you, if so, please let me know which other queries you run on top of your data or if you have other improvement suggestions.
You can also find the full script in this GitHub gist