How KPMG UK Leverage Jira Data Connections

Published in

KPMG UK Engineering

6 min read · Oct 25, 2023

Table of Contents

What are Data Connections?
Why we need Proxy Lambdas
What if the Data Connection Times Out?
The Good and the Bad
Next Steps in our Work
Conclusion

Within Engineering at KPMG UK, we leverage Jira Service Management (JSM) to automate most user access requests. Examples include automated:

Creation of new Services (Jira projects, Confluence spaces, GitHub repositories)
Provisioning of new users on various platforms, such as Jira and Azure Active Directory
Group management tooling that allows Jira users to manage their own group memberships

We could go into plenty of detail about these automations, and we have already presented how they have benefited us, but this blog is about another aspect of Jira Service Management that we have recently explored: Data Connections. Data Connections allow JSM form fields to be populated with external data in real time.

Here are some quick-fire examples of where we’re leveraging Data Connections. We now have form fields that, in real time:

Populate with data from GitHub, which we use for Source Code Management:
- A field for all GitHub repositories, so users can pick a specific repo in their form
- A field for all GitHub teams in the organisation, so users can pick the team to be added to
List all Jira Projects and all Confluence Spaces. There are no native fields for either of these, and our Data Connection fields can be used to select Projects to be deleted, or Spaces that a user requires access to.
List all groups in Atlassian or Azure Active Directory (our IDP). This allows users to request access to certain groups, or request a list of members from these groups. This is important as regular Jira users do not have visibility of group memberships without going to organisation admins.

So, how did we set all this up? Let’s dive in…!

What are Data Connections?

Data Connections are a Jira Service Management feature that populates form fields with live data gathered from API calls. They can back radio buttons, checkboxes and select dropdown fields.

When a form containing a Data Connection is accessed, an API request is made behind the scenes and the field’s data is refreshed. This is the case unless caching is configured, which lets admins tell a Data Connection to remember its results for a period of time. Caching reduces the number of API calls made by the Data Connection and can lead to a better user experience, since users will not need to wait for the data to be gathered as often.

Without Data Connections, the next best thing is to have custom fields with static data that administrators will need to regularly update; not ideal!

Data Connections are configured within the “Issues” section of the Jira Admin area.

Image taken from Atlassian’s documentation

Why we need Proxy Lambdas

Data Connections are great. Except they have a few pitfalls:

They cannot handle paginated responses
They require a persistent method of authentication
They cannot access API endpoints in private services. In our case, some of the data we wanted to grab sits behind a private network. We don’t want to allowlist all of Jira’s IP addresses to get it working!

These factors, even individually, badly limit the versatility and usefulness of Data Connections.

So what is a Jira Admin to do…?

The answer: ⭐ Proxy Lambdas! ⭐

For those unfamiliar, AWS Lambda is a serverless compute service from AWS that runs your code, in a variety of languages, in response to events. For our use case, we specifically use Lambdas that sit behind an API Gateway with proxy integration, which we call Proxy Lambdas. This means that when the Data Connection calls the API endpoint, the API Gateway does not process the request itself; it passes the whole request straight through to the Lambda function. Once the Lambda has run and generated a response, the API Gateway passes it straight back to the initial caller.

Illustration of how proxy lambdas work in our setup
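
To make the request/response flow more concrete, here is a minimal sketch of a handler for what AWS calls Lambda proxy integration. In that mode, API Gateway passes the incoming request to the function as an `event` dictionary and expects a dictionary containing `statusCode`, `headers` and a string `body` in return. The `fetch_items` helper and the shape of the JSON it returns are hypothetical placeholders; the shape a Data Connection actually needs depends on how the field is configured in Jira.

```python
import json


def fetch_items():
    """Hypothetical placeholder for the real data source (GitHub, Azure AD, ...)."""
    return [{"id": "1", "name": "example-repo"}]


def lambda_handler(event, context):
    """Entry point invoked by API Gateway through Lambda proxy integration."""
    try:
        status, payload = 200, fetch_items()
    except Exception as exc:  # report upstream failures as a JSON error body
        status, payload = 502, {"error": str(exc)}

    # Proxy integration expects this envelope, and the body must be a string.
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }
```

Keeping the data gathering in its own function means the envelope code stays the same no matter which service the Lambda talks to.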

OK but how does this help?

By moving the processing of the API endpoints to a Lambda function (specifically running a Python script in our case), we can:

Handle pagination ourselves through our Python script
Handle non-persistent API tokens by regenerating them through code
Handle private endpoints by allowlisting the IP address of the Lambda on the service we need. The Lambda function sits outside of the private VPC, so it can talk to both Jira and the target service.
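
The pagination point in the list above can be sketched as follows. This is a minimal illustration, assuming an API that, like GitHub's REST API, advertises the next page in a `Link` response header marked `rel="next"`. The names `http_get` and `fetch_all`, the injectable `get` parameter and the `max_pages` cap are illustrative choices, not part of any library.

```python
import json
import re
import urllib.request

# Matches the URL of the rel="next" entry in a Link header.
_NEXT_LINK = re.compile(r'<([^>]+)>;\s*rel="next"')


def http_get(url, token):
    """Fetch one page; return (parsed JSON list, Link header or '')."""
    request = urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response), response.headers.get("Link", "")


def fetch_all(first_url, token, get=http_get, max_pages=200):
    """Follow rel="next" links until the API stops returning one."""
    items, url = [], first_url
    for _ in range(max_pages):  # hard cap so a bad Link header cannot loop forever
        page, link_header = get(url, token)
        items.extend(page)
        match = _NEXT_LINK.search(link_header)
        if not match:
            break
        url = match.group(1)
    return items
```

For GitHub repositories, a call might look like `fetch_all("https://api.github.com/orgs/<org>/repos?per_page=100", token)`, where `<org>` is a placeholder for the organisation name. The Data Connection then receives one complete list instead of a single page.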

So, the proxy lambda handles all the heavy lifting, grabbing the data from any source. The Data Connection simply triggers that lambda and receives data from its response.
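
Part of that heavy lifting is keeping a short-lived token fresh. The sketch below shows one way to do it, under stated assumptions: `request_new_token` is a hypothetical function standing in for whatever call your identity provider requires, and it is assumed to return a token together with its lifetime in seconds. The token is cached in a module-level dictionary, which a Lambda execution environment may reuse between warm invocations, although that reuse is not guaranteed.

```python
import time

# Module-level cache; may survive between warm Lambda invocations.
_cache = {"token": None, "expires_at": 0.0}


def request_new_token():
    """Hypothetical: call the identity provider, return (token, lifetime_seconds)."""
    raise NotImplementedError("replace with a real token request")


def get_token(request=request_new_token, now=time.time, safety_margin=60):
    """Return a cached token, regenerating it shortly before it expires."""
    if _cache["token"] is None or now() >= _cache["expires_at"] - safety_margin:
        token, lifetime = request()
        _cache["token"] = token
        _cache["expires_at"] = now() + lifetime
    return _cache["token"]
```

The `safety_margin` avoids handing out a token that would expire midway through a long, paginated fetch.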

What if the Data Connection Times Out?

Whilst proxy lambdas solve a whole host of issues, they do come with their own limitations. One that we have personally encountered is the roughly 30-second time-out that API Gateway applies to a Lambda sitting behind an API endpoint. This has only been a problem for one of our Data Connections so far, but we fully expect to hit it again as we expand our repertoire.

This becomes an issue when the lambda needs to grab a lot of data. In our case, we have a data connection that grabs all internal repositories from GitHub. That’s over 5,000 repositories! As expected, it takes longer than 30 seconds to grab that data and we see the request timing out.

Our solution is to introduce a ⭐ second lambda. ⭐

In this two-Lambda solution, a standalone Lambda gathers, parses and stores the data in an S3 bucket. As this Lambda is not connected to an API Gateway, it does not face the same time-out restriction. In simple terms, its sole job is to fetch the data and store it as a JSON file.
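
A minimal sketch of such a collector is shown below. It assumes the standard `boto3` S3 client; the bucket name, object key and the `fetch_repositories` placeholder are hypothetical, and how the function is scheduled (for example with an Amazon EventBridge rule) is left out. The S3 client is passed in as a parameter so the storage step can be exercised without AWS access.

```python
import json


def collect(fetch, s3_client, bucket, key):
    """Fetch the full data set and store it in S3 as one JSON document."""
    items = fetch()
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(items).encode("utf-8"),
        ContentType="application/json",
    )
    return len(items)


def lambda_handler(event, context):
    """Scheduled entry point; not behind API Gateway, so the ~30 s cap does not apply."""
    import boto3  # bundled with the AWS Lambda Python runtime

    def fetch_repositories():
        # Hypothetical placeholder for the slow, paginated GitHub crawl.
        return []

    count = collect(
        fetch_repositories,
        boto3.client("s3"),
        bucket="example-data-connection-cache",  # assumed bucket name
        key="github/repositories.json",  # assumed object key
    )
    return {"stored": count}
```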

Our second lambda then accesses that file and sends the text of the JSON file back to the data connection. This is much faster than grabbing the data directly from GitHub. By running the first lambda on a regular cron schedule, we ensure that the data is never older than the schedule interval, and by storing it as JSON, the second lambda has no extra legwork to do beyond sending the requested data.
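
The serving side can be sketched in a few lines. As before, this is an illustration under assumptions rather than a definitive implementation: the bucket name and object key are hypothetical, and the response envelope is the one Lambda proxy integration expects.

```python
def read_cached_json(s3_client, bucket, key):
    """Return the stored JSON text exactly as it was written to S3."""
    obj = s3_client.get_object(Bucket=bucket, Key=key)
    return obj["Body"].read().decode("utf-8")


def lambda_handler(event, context):
    """Entry point behind API Gateway (Lambda proxy integration)."""
    import boto3  # bundled with the AWS Lambda Python runtime

    body = read_cached_json(
        boto3.client("s3"),
        bucket="example-data-connection-cache",  # assumed bucket name
        key="github/repositories.json",  # assumed object key
    )
    # The file is already JSON text, so it is passed through without re-parsing.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": body,
    }
```

Because the function performs a single S3 read, its response time no longer depends on how many repositories exist.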

The Good and the Bad

Data Connections have proved invaluable for our JSM site so far. Being able to offer an ever-changing list of resources to be selected in JSM forms has freed up the time of our JSM Agents, as they no longer have to maintain lists that were previously basic custom fields. Data Connections have also allowed us to introduce a wider variety of automated tasks.

Chief amongst these are our Group Management tools, which devolve group management to regular users without giving them site-admin access. We have Data Connections for group names and users from our Atlassian directory that allow users to interact with newly created groups or newly added users very shortly after they appear on the site.

Based on our experience, we have one main area of improvement we’d like to see: more authentication options when configuring the data connection.

The current offerings are “Basic”, “Digest” and “Custom”. Since we are utilising proxy lambdas, we have not been able to make any of those work directly with a private API Endpoint from AWS and hope that Atlassian expands the range of options for wider usage in the future.

Additionally, we would really like to see Data Connections become available for regular custom fields, rather than being restricted to JSM forms. This would give us even more versatility across a larger portion of our Atlassian suite.

Next Steps in our Work

We aim to expand our suite of Data Connections, primarily focusing on pulling data from Inventory.

For context, Inventory is a custom-built internal service which contains an ⭐ inventory ⭐ of all our cloud accounts, SSLs and other important services. Inventory lives within a private VPC, but has an API that we would like to make available for JSM. As Inventory is a source of truth for so much internal data, it is a great source for Data Connections.

Some of our plans for future Data Connections include:

Pulling existing “Services” from Inventory, which would help us to enable a number of applications. Services in Inventory group together cloud (AWS, Azure, GCP) accounts/subscriptions that belong to a single project or “service”
Pulling business owners of accounts and Services from Inventory so we can have targeted approvals for the decommissioning of accounts

Conclusion

To conclude, Jira’s Data Connection feature has become an integral part of much of our JSM estate and we only see our usage increasing over time, especially with the workarounds provided by proxy lambdas. Data Connections aren’t perfect, but with a little know-how, they can be a powerful addition to a JSM instance.