Zendesk to Neo4j Integration

Better control over you reporting needs

Published in

Neo4j Developer Blog

8 min readAug 14, 2018

Zendesk, one of the leading customer service and engagement platforms, provides built in reporting functionality which will satisfy some reporting metrics. However at some point you are simply going to want more, whether it be entirely unique to your organization or something that the default reporting interface can not satisfy.

All hope should not be lost for they also provide a well documented and robust REST API . The REST APi returns all its results in JSON syntax.

The following details how to use the REST API to export data from Zendesk into Neo4 primarily using APOC through the use of apoc.periodic.commit and apoc.load.jsonParams

The data model we will construct in Neo4j will be

Whereby the ZendeskTicket is the center of the model represented by the green node (circle).

Each ZendeskTicket has a relationship to the ZendeskUser, represented by the yellow node, who opened the ZendeskTicket as well as the agent who will solve the ZendeskTicket.

Each ZendeskUser has a relationship to the ZendeskOrg, represented by the pink node, the user is a member of.

Additionally, each ZendeskUser has a relationship to any KnowledgeBase Sections, represented by the grey node, the user has choosen to [Follows].

The remaining labels, namely Quarter represented by the blue node and Hour represented by the magenta node are created from the imported data and are used to facilitate reporting.

Finally the Import node, represented by the red node serves as meta-information to support pagination through the Zendesk API. This will be described in further detail.

As to usage of the Zendesk API the following needs to be be understood

Authentication

In the Cypher examples provided below, the calls to the Zendesk REST API will authenticate using a OAUTH token. Generating a token and its use is described at the Zendesk API

Rate Limits

Depending on your Zendesk subscription type, you are limited to the number of REST requests per minute. The Cypher to export the data assume the lowest request rate and to accommodate this you will see references to call apoc.util.sleep(x) which will cause the Cypher to sleep/pause for x milliseconds

Pagination

By default, most Zendesk Rest endpoints will return at most 100 results in a single call. As such the Cypher script will paginate and request the next 100 results until all results have been returned. This is primarily accomplished through the Neo4j meta-data node with label :Import.

Used Tools

The example below has been tested and implemented with Neo4j 3.4.0 and APOC 3.4.0.1 though would expect this to work with most any APOC supported Neo4j version.

For our Neo4j-Support Zendesk environment this will result in almost 10k nodes, and supporting indexes, with properties and associated relationships on the order of 7 minutes to complete on a empty graph.

Part of the slowness is due to the Rate Limits Zendesk imposes and thus the Cypher purposely slows itself down through 3 calls to apoc.util.sleep(60000), thus sleeping for 3 minutes in total.

Preparation

The script will create the following indexes namely for the benefit of MERGE statements.

create index on :ZendeskOrg(id);
create index on :ZendeskUser(id);
create index on :ZendeskTicket(id);
create index on :ZendeskTicket(initial_severity);
create index on :Hour(hour);
create index on :Quarter(quarter);

Finally, the script will create 4 Import nodes which are used to support pagination.

Zendesk API Usage

Documenting input and output fields and sample JSON return string

For each type of object to be extracted from Zendesk the Cypher is constructed such that we first define the :Import node for the object, representing the starting ‘page’ in Zendesk to request and then will perform a apoc.load.jsonParams within a apoc.periodic.commit which will thus allow us to paginate. For example the following Cypher is used to load Zendesk Organizations

In all Cypher example below, references to https://neotechnology.zendesk.com/api/v2/ will need to be replaced with your Zendesk domain. We make this easy by defining an “url” parameter for it. Same goes for the :param zd_identifier: “Basic ZGFuYSTlnVlVDNnQ=” OAUTH Zendesk token

:param url: "https://neotechnology.zendesk.com/api/v2/"
:param zd_identifier: "Basic ZGFuYSTlnVlVDNnQ="

Importing Organizations

MERGE (import:Import {id: 1})
SET import.page = $url + 'organizations.json?page=1', 
    import.token = $zd_identifier;

CALL apoc.periodic.commit('
 match (import:Import {id:1})
 CALL apoc.load.jsonParams(import.page,{Authorization: import.token},null)
  YIELD value as orgs
  WITH orgs, orgs.next_page as next_page
  UNWIND orgs.organizations as oneorg
  UNWIND oneorg.organization_fields as orgf
  MERGE (n:ZendeskOrg {id:oneorg.id})
   SET n.name=oneorg.name,
       n.url=oneorg.url,
       n.created=oneorg.created_at,
       n.domain_name=oneorg.domain_names
 WITH next_page,
 CASE WHEN next_page is null then 0 ELSE 1 END AS count
 FOREACH(_ in CASE WHEN count = 0 THEN [] ELSE [1] END |
     MERGE (import:Import {id:1})
        SET import.page = next_page
 )
 RETURN count', null);

In the above example the first statement sets the starting page for the Zendesk API to page=1 for requesting Organizations.

The statement withinapoc.periodic.commit is responsible for allowing the Zendesk pagination to work. In this case we are evaluating the returned JSON parameter named next_page and if it is set we update the Import node to reflect the next page from the API to fetch.

Importing Users

The same process occurs for loading Users as the Cypher is:

MERGE (import:Import {id: 2})
SET import.page = $url + 'users.json?page=1', 
    import.token = $zd_identifier;

CALL apoc.periodic.commit('
MATCH (import:Import {id:2})
CALL apoc.load.jsonParams(import.page,{Authorization: import.token},null)
YIELD value AS users
WITH users,users.next_page as next_page
UNWIND users.users as oneuser
MERGE (n:ZendeskUser {id:oneuser.id})
 SET n.name=oneuser.name,
     n.email=oneuser.email,
     n.url=oneuser.url,
     n.last_login=oneuser.last_login_at,
     n.organization_id=oneuser.organization_id,
     n.suspended=oneuser.suspended,
     n.created=oneuser.date_created

     with next_page,
 CASE WHEN next_page is null then 0 ELSE 1 END AS count
 FOREACH(_ in CASE WHEN count = 0 THEN [] ELSE [1] END |
     MERGE (import:Import {id:2})
     SET import.page = next_page
 )
 RETURN count ', null);

Exporting of tickets is a bit more involved since rather than a next_page attribute on the API URL it passes a unix start time with parameter start_time. Further the api includes parameter include=metric_sets which will return internal data about the ticket.

Importing Tickets

The Cypher statement to import tickets:

MERGE (import:Import {id: 3})
SET import.page = $url + 
'/incremental/tickets.json?start_time=1293840000&include=metric_sets',
    import.token = $zd_identifier;

CALL apoc.periodic.COMMIT('
  MATCH (import:Import {id:3})
  // so as to not exceed ZD throttle mechanism
  CALL apoc.util.sleep(10000)
  CALL apoc.load.jsonParams(import.page,{Authorization: import.token},null)
  YIELD value as tickets
  WITH tickets, tickets.next_page as next_page, tickets.count as APICount
   unwind tickets.tickets as oneticket
   MERGE (n:ZendeskTicket {id:oneticket.id})

   SET n.user_id=oneticket.requester_id,
        n.org_id=oneticket.organization_id,
        n.url=oneticket.url,
        n.ticket_id=oneticket.id,
        n.status=oneticket.status,
        n.org_id=oneticket.organization_id,
        n.ticket_subject=oneticket.subject,
        n.ticket_description=oneticket.description,
        n.date_created=oneticket.created_at,
        n.date_updated=oneticket.updated_at,
        n.assigned_id=oneticket.assignee_id,
        n.priority=oneticket.priority

   WITH n,oneticket, next_page, APICount
   UNWIND oneticket.fields as oneticketfields

   // take oneticketfields as a json value similar to
   // fields:[{id:24134566,value:severity_2},{id:80611248, ... ... ....
   // unwind it so it becomes
   //  {id: 24134566, value: severity_2}
   //  {id: 80611248, value: ... .... ....
   //
   // ticket severity          id=24134566
   SET n.severity = CASE WHEN oneticketfields.id=24134566 
       THEN oneticketfields.value ELSE null END
   // ticket cause             id=24134536
   SET n.cause = CASE WHEN oneticketfields.id=24134536 
       THEN oneticketfields.value ELSE null END
   // ticket category          id=24087433
   SET n.category = CASE WHEN oneticketfields.id=24087433 
       THEN oneticketfields.value ELSE null END
   // ticket version           id=20781041
   SET n.version = CASE WHEN oneticketfields.id=20781041 
       THEN oneticketfields.value ELSE null END
   WITH n,oneticket, next_page, APICount

   UNWIND oneticket.metric_set as metrics

   SET n.reopens=metrics.reopens,
       n.date_solved=metrics.solved_at,
       n.assignee_stations=metrics.assignee_stations

   WITH n, next_page, APICount,
        metrics.first_resolution_time_in_minutes as first_res_time,
        metrics.requester_wait_time_in_minutes as req_wait_time,
        metrics.full_resolution_time_in_minutes as full_res_time,
        metrics.reply_time_in_minutes as reply_time
   // need to divide by ##.0 so as to force float division thus 59/60 will return 0.9833 and not 0
   SET n.first_resolution_time_in_business_days=
         first_res_time.business/1440.0
   SET n.first_resolution_time_in_calendar_days=
         first_res_time.calendar/1440.0

   SET n.requester_wait_time_in_business_hours=
         req_wait_time.business/60.0
   SET n.requester_wait_time_in_calendar_hours=
         req_wait_time.calendar/60.0

   SET n.full_resolution_time_in_business_days=
         full_res_time.business/1440.0
   SET n.full_resolution_time_in_calendar_days=
         full_res_time.calendar/1440.0

   SET n.reply_time_in_business_hours=reply_time.business/60.0
   SET n.reply_time_in_calendar_hours=reply_time.calendar/60.0

 WITH next_page, APICount,
 CASE WHEN (APICount<1000) then 0 ELSE 1 END AS count
 WITH count, next_page, APICount
 MERGE (import:Import {id:3}) SET import.page = next_page, import.APICount=APICount
 RETURN count ', null);

Unlike exporting of Zendesk Organizations and Users, with Tickets we need to paginate until the JSON count value, now aliased as APICount, is less than 1000. If it the APICount is 1000 then this would indicate the JSON return has not consumed all the data and we need to fetch the next 1000. Also the references similar to FOREACH(_ IN CASE WHEN oneticketfields.id=24134566 may need to be modified as they represent custom fields in Zendesk.

KB Sections and Subscriptions

Finally loading of KB Sections and Subscriptions is performed via

MERGE (import:Import {id: 4})
SET import.page=$url + '/help_center/sections.json?page=1', import.token = $zd_identifier;

MATCH (import:Import {id:4})

CALL apoc.load.jsonParams(import.page,
          {Authorization: import.token},null)
YIELD value AS sections

WITH import, sections, sections.next_page AS next_page

UNWIND sections.sections AS section_item
// create the Section Node
MERGE (n:ZendeskSection {id:section_item.id})
ON CREATE SET n += {name:section_item.name, date_created:section_item.created_at,
                    date_updated:section_item.updated_at,url:section_item.html_url}
WITH import, section_item,
     $url + '/help_center/sections/'+section_item.id+ 
     '/subscriptions.json?per_page=200' AS url2

// foreach section then find the subscribers
CALL apoc.load.jsonParams(url2,{Authorization: import.token},null)
YIELD value AS subscribervalue
WITH subscribervalue, section_itemUNWIND subscribervalue.subscriptions AS subscription_itemMATCH (s:ZendeskSection {id:section_item.id}) 
WITH s,subscription_item
MATCH (n:ZendeskUser {id: subscription_item.user_id})
MERGE (n)-[f:Follows]->(s)
SET f.subscribed_on=subscription_item.created_at;

Create Additional Relationships

Once the data is loaded appropriate additional relationships are created.

// connect user to org
MATCH (u:ZendeskUser) WITH u, u.organization_id AS uorg
MATCH (o:ZendeskOrg) WHERE o.id=uorg
MERGE (u)-[:IS_MEMBER_OF_ORG]->(o);// connect ticket to user and org
MATCH (t:ZendeskTicket)
WITH t, t.user_id AS tsubmitter, t.assigned_id AS towner
MATCH (u:ZendeskUser) WHERE u.id=towner
MATCH (u2:ZendeskUser)-[:IS_MEMBER_OF_ORG]->(org:ZendeskOrg) WHERE u2.id=tsubmitter
MERGE (t)-[:IS_ASSIGNED_TO]->(u)
MERGE (t)<-[:CREATED_ZENDESK_TICKET]-(u2)
MERGE (t)-[:ZENDESK_TICKET_ORG]->(org);// connect to open quarter
MATCH (t:ZendeskTicket)
WITH t, substring(t.date_created,0,4) AS zdyear, toInteger(substring(t.date_created,5,2)) AS zdmonth
WITH t, toString(((zdmonth-1)/3)+1) AS zdquarter, zdyear
WITH t, 'q'+zdquarter+zdyear AS total, 
 zdyear+'q'+zdquarter AS total2MERGE (zdq:Quarter {quarter:total2})
MERGE (t)-[:ZENDESK_TICKET_OPENED_QUARTER]->(zdq)
RETURN COUNT(*);

// connect to closed quarter
MATCH (t:ZendeskTicket)
  WHERE t.status='closed'
WITH t, substring(t.date_solved,0,4) AS zdyear, toInteger(substring(t.date_solved,5,2)) AS zdmonth
WITH t, toString(((zdmonth - 1)/3)+1) AS zdquarter, zdyear
WITH t, 'q'+zdquarter+zdyear AS total, zdyear+'q'+zdquarter AS total2
MERGE (zdq:Quarter {quarter:total2})
MERGE (t)-[:ZENDESK_TICKET_CLOSED_QUARTER]->(zdq)
RETURN COUNT(*);

// connect to "open" hour nodes
MATCH (t:ZendeskTicket)
WITH t, time(datetime(t.date_created)).hour AS total2
MERGE (zdq:Hour {hour:total2})
MERGE (t)-[:ZENDESK_TICKET_HOUR]->(zdq)
RETURN COUNT(*);

Now that the graph is loaded, what data can we get out of it?

Lets look for users who have never logged in or haven’t logged in for more than 6 months?

These user accounts may want to be deleted/suspended due to inactivity.

MATCH (u:ZendeskUser)-[:IS_MEMBER_OF_ORG]->(o:ZendeskOrg)
  WHERE NOT u.suspended
    AND (NOT exists(u.last_login)
     OR duration.inDays(datetime(u.last_login),datetime()).days>180)

RETURN u.name, u.email,  o.name, u.last_login, u.suspended,
       duration.inDays(datetime(u.last_login),datetime()).days;

Users subscribed per section

MATCH (s:ZendeskSection)<-[:Follows]-(u:ZendeskUser)
RETURN s.name, u.name

Tickets closed per ticket owner and percent of total

MATCH (t1:ZendeskTicket)
WHERE date(datetime(t1.date_created)).year=2018
WITH count(t1) AS ttotal

MATCH (t2:ZendeskTicket)-[:IS_ASSIGNED_TO]-(u:ZendeskUser)
WHERE date(datetime(t2.date_created)).year=2018
WITH  ttotal,u.name AS supportmember, count(t) AS utotal

RETURN supportmember, utotal,
       (utotal/toFloat(ttotal))*100 AS PercentClosed
ORDER BY utotal DESC;

Tickets closed per ticket owner timezoned when ticket was opened

MATCH (zdq:Quarter)
WITH zdq ORDER BY zdq.quarter DESC SKIP 1 LIMIT 5

MATCH (h:Hour)<-[:ZENDESK_TICKET_HOUR]-(ticket:ZendeskTicket)-[:ZENDESK_TICKET_OPENED_QUARTER]->(zdq)
WITH *, CASE WHEN h.hour IN range(0,6) THEN 'APAC'
        WHEN h.hour IN range(7,14) THEN 'EMEA' ELSE 'AMER' END AS region

WITH region, zdq.quarter AS quarter, count(ticket) AS Total_Opened_Tickets
ORDER BY quarter ASC, region ASC

WITH region, collect(quarter) AS quarters, collect(Total_Opened_Tickets) AS tix
RETURN region, tix[0] AS q1, tix[1] AS q2, tix[2] AS q3, 
       tix[3] AS q4, tix[4] AS q5;

Resulting in (not the actual numbers):

+--------------------------------------+
| region | q1  | q2  | q3  | q4  | q5  |
+--------------------------------------+
| "EMEA" | 42  | 43  | 44  | 45  | 46  |
| "AMER" | 52  | 53  | 54  | 55  | 56  |
| "APAC" | 32  | 33  | 34  | 35  | 36  |
+--------------------------------------+

I hope this is useful for you, if so, please let me know which other queries you run on top of your data or if you have other improvement suggestions.

You can also find the full script in this GitHub gist