Turn your organisation’s data exhaust into fuel

How the historical data left lying around in workplace tools such as Teams, Slack, Jira and Salesforce can help you make your organisation better

Published in ConnectedCompany · 6 min read · Apr 23, 2021

An organisation is not the sum of its parts. It’s the product of its interactions.

In today’s organisations, these interactions extend beyond social and collaborative dynamics, because technology plays an increasingly critical role in how work gets done.

This post will introduce you to a few techniques we regularly use at ConnectedCompany to bring a different perspective to organisational performance: taking data, putting it into context, and helping organisations make better, more informed decisions.

We will describe how the activity data dispersed across our day-to-day work tools, the sort in widespread use across companies of all sizes, can be used to create meaningful insights.

It is about recycling the data you have left lying around and combining it into a powerful asset for change: a new fuel for growth and improvement.

The Theory

The network of any organisation is a combination of specific ‘signatures’ or ‘motifs’: shapes and patterns that reveal certain insights and characteristics.

A few examples of popular network motifs. Nodes that bridge groups, or hold them together, are quickly revealed, and the surrounding context then suggests an insight.

These signatures are an effective way to prospect and assess the landscape of an organisation, as they help you to shortlist focus areas to explore in more detail. They also act as a baseline to track evolution over time, alongside other metrics and trends.
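To make one of these motifs concrete, here is a minimal sketch of how a node that holds groups together might be detected. It treats such a node as one whose removal splits the network into more pieces. The function names and the sample edges are hypothetical, and this is one simple reading of the motif rather than the method we use in production.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Turn undirected (a, b) pairs into a node -> neighbours map."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def count_components(adjacency, skip=None):
    """Count connected components, optionally pretending one node is absent."""
    seen = set() if skip is None else {skip}
    components = 0
    for start in adjacency:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
    return components


def bridging_nodes(edges):
    """Return nodes whose removal increases the number of components."""
    adjacency = build_adjacency(edges)
    baseline = count_components(adjacency)
    return sorted(
        node for node in adjacency
        if count_components(adjacency, skip=node) > baseline
    )


# Hypothetical data: two tight groups joined only through "dana".
edges = [
    ("ana", "ben"), ("ben", "cy"), ("ana", "cy"),
    ("cy", "dana"), ("dana", "eli"),
    ("eli", "fay"), ("eli", "gus"), ("fay", "gus"),
]
bridges = bridging_nodes(edges)
```

This brute-force check is fine for small networks. For larger ones, a dedicated graph library would be the more sensible choice.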

The Reality

Garbage in, garbage out, of course. Building these networks requires good data.

Historically, work tool data was hard to access, so networks were built up manually through surveys (“who do you frequently collaborate with in your role?” etc.). Whilst effective, these techniques are slow to produce results, and the results quickly become outdated.

In part, that is because survey results are subject to recency bias (weighting ‘now’ disproportionately against the longer-term trend), and if the wrong questions are asked then the whole survey needs repeating.

Thankfully, organisations are now awash with relevant data that allows these networks to be drawn and inferred dynamically: chat tools like Slack or MS Teams, collaborative planning tools like Jira or Asana, customer support tools like Intercom, and CRM suites like HubSpot or Salesforce. These tools capture interactions between colleagues and customers, and between internal teams and departments. Their data can be combined and analysed to generate valuable insights.
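As a rough illustration of combining interactions from several tools, the sketch below counts how often each pair of people interacts, regardless of which tool recorded it. The record shape `(tool, person_a, person_b)` and the sample rows are assumptions made for the example. Real exports from these tools look different and need their own parsing.

```python
from collections import Counter


def build_edge_weights(interactions):
    """Count interactions per unordered pair of people, across all tools."""
    weights = Counter()
    for _tool, person_a, person_b in interactions:
        if person_a == person_b:
            continue  # ignore self-interactions, e.g. notes to self
        pair = tuple(sorted((person_a, person_b)))
        weights[pair] += 1
    return weights


# Hypothetical records, already normalised to (tool, person_a, person_b).
interactions = [
    ("slack", "@matt", "@priya"),
    ("slack", "@priya", "@matt"),
    ("jira", "@matt", "@priya"),
    ("jira", "@priya", "@lee"),
    ("salesforce", "@lee", "@lee"),
]
edge_weights = build_edge_weights(interactions)
```

Sorting each pair makes the edge undirected, so a reply and the message it answers both strengthen the same connection.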

However, the data inside these tools is often messy and noisy — not all interactions are about work, and not all collaboration is meaningful. There is work to do before it’s useful for decision making.

*A real (and typical) network generated before any data processing is applied. All nodes cluster together into the centre, creating a ‘hairball’ effect. No meaningful signal exists.*

So how do we fix ‘the hairball problem’?

Here are a few of the techniques we frequently use at ConnectedCompany to generate insights from messy datasets. They can be applied individually or in combination, depending on the problem to be solved.

Labels — structural properties that don’t change much

Labels are helpful as they allow nodes to be ‘grouped’, which then allows a network to be visualised more meaningfully and in ways that are easy for any stakeholder across the organisation to consume. Grouping allows abstraction through aggregation: zooming in and zooming out.

One common example is labelling usernames (e.g. @Matt) with common organisational data from HR (e.g. Role: Analyst, Product Department: Search, Location: London).

Zooming out: this bird’s-eye view of an organisation is much simpler to interpret because it matches the formal language and structure used internally. Each group is a team, each dot is a colleague. Everyone you show it to should ‘get’ this image.
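The zooming out described above can be sketched as a simple aggregation: person-to-person edges are rolled up into group-to-group edges using a label. The label fields, usernames and weights below are hypothetical, and the sketch assumes the person-level edge weights have already been computed.

```python
from collections import Counter


def aggregate_by_label(edge_weights, labels, key):
    """Roll person-level edges up to the level of a label such as department."""
    grouped = Counter()
    for (a, b), weight in edge_weights.items():
        # People without HR data are kept, under a placeholder group.
        group_a = labels.get(a, {}).get(key, "unlabelled")
        group_b = labels.get(b, {}).get(key, "unlabelled")
        grouped[tuple(sorted((group_a, group_b)))] += weight
    return grouped


# Hypothetical HR labels keyed by username.
labels = {
    "@matt": {"role": "Analyst", "department": "Search", "location": "London"},
    "@priya": {"role": "Engineer", "department": "Search", "location": "London"},
    "@lee": {"role": "Designer", "department": "Checkout", "location": "Leeds"},
}
# Hypothetical person-level interaction counts.
edge_weights = {
    ("@matt", "@priya"): 3,
    ("@lee", "@priya"): 1,
    ("@lee", "@zoe"): 2,
}
by_department = aggregate_by_label(edge_weights, labels, "department")
by_location = aggregate_by_label(edge_weights, labels, "location")
```

Because the label key is a parameter, the same edges can be viewed by department, by location or by role without rebuilding the network.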

Tags — dynamic properties that are inferred from activity data

Applying methods that infer the intent or focus of an interaction is a powerful tool to generate insight from work data.

A common approach is to subset the data by tagging interactions that are work related. For example, focussing on technical questions and answers (e.g. ‘will this still work on Release 3.1?’) inherently generates a work-centric collaboration network, and produces similar results to a manual survey asking ‘who do you frequently turn to for help?’.

The example below shows how tags can be inferred from messages using combinations of Natural Language Processing (NLP) techniques and keyword matching, derived from the unique language used internally by an organisation.

Dimensionality reduction helps to cluster similar text strings together, creating a common property that can be ‘tagged’ to an event or an entity (colleague, code repository, team, division etc.).

Tracking tags over time is a powerful way to see ‘what’ information is flowing across an organisational network: which themes and focus areas have been consuming the most bandwidth.
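The keyword-matching half of this approach can be sketched with a few regular expressions. The sketch leaves out the NLP and dimensionality-reduction steps entirely. The tag names, patterns and sample messages are hypothetical, and a real set of patterns would be derived from the organisation’s own language.

```python
import re
from collections import Counter

# Hypothetical patterns; real ones would come from the organisation's own vocabulary.
TAG_PATTERNS = {
    "release": re.compile(r"\brelease\s+\d+(\.\d+)*\b", re.IGNORECASE),
    "bug": re.compile(r"\b(bug|error|broken|crash\w*)\b", re.IGNORECASE),
    "question": re.compile(r"\?\s*$"),
}


def tag_message(text):
    """Return the set of tags whose pattern matches the message."""
    return {tag for tag, pattern in TAG_PATTERNS.items() if pattern.search(text)}


def tags_by_month(messages):
    """Count tag occurrences per (YYYY-MM, tag) from (ISO date, text) pairs."""
    counts = Counter()
    for date, text in messages:
        month = date[:7]  # "2021-03-02" -> "2021-03"
        for tag in tag_message(text):
            counts[(month, tag)] += 1
    return counts


# Hypothetical messages as (ISO date, text).
messages = [
    ("2021-03-02", "will this still work on Release 3.1?"),
    ("2021-03-09", "found a bug in the search indexer"),
    ("2021-04-01", "Release 3.1 crashed on staging, is it a known bug?"),
    ("2021-04-02", "great game last night"),
]
monthly = tags_by_month(messages)
```

Messages that match no pattern, like the last one, simply carry no tags and drop out of any work-centric subset.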

Contexts — logical abstractions to connect work to ideas & themes

Building on the example above, significant tags, phrases and keywords can be grouped into ‘contexts’ that make logical sense for a business. For example, activity can be mapped and aggregated against an idea or theme that’s of interest to teams or leaders (e.g. Release 3.1).

This is similar to how companies like Facebook and Instagram use activity data to generate prospects for advertisers (e.g. ‘who likes dogs & ice cream?’).

When applied to organisational knowledge, it enables queries like ‘who can I talk to about a bug raised by the Redbull account?’ or ‘which Lead Engineer has most experience with ServerCore recently?’.

A bimodal (two-mode) graph, like the one above in which both ‘people’ and ‘contexts’ (green) are plotted together, goes beyond collaboration to reveal the significance of specific ideas and themes uniting teams.
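A query like ‘who can I talk to about…?’ can be sketched as a lookup over person-to-context links. The context definitions, tags and activity records below are hypothetical, and the sketch assumes each piece of activity has already been tagged.

```python
from collections import Counter, defaultdict


def build_context_index(activity, contexts):
    """Link people to contexts: context -> Counter of person -> matching activity."""
    index = defaultdict(Counter)
    for person, tags in activity:
        for context, context_tags in contexts.items():
            if context_tags & set(tags):  # any shared tag links person and context
                index[context][person] += 1
    return index


def who_to_ask(index, context, top_n=3):
    """Return the people with the most activity in a context, most active first."""
    ranked = index.get(context, Counter()).most_common(top_n)
    return [person for person, _count in ranked]


# Hypothetical contexts, each defined as a set of tags.
contexts = {
    "Release 3.1": {"release-3.1", "rc-build"},
    "Redbull account": {"redbull"},
}
# Hypothetical tagged activity as (person, tags).
activity = [
    ("@priya", {"release-3.1", "bug"}),
    ("@priya", {"rc-build"}),
    ("@matt", {"release-3.1"}),
    ("@lee", {"redbull", "bug"}),
    ("@matt", {"lunch"}),
]
index = build_context_index(activity, contexts)
release_experts = who_to_ask(index, "Release 3.1")
```

The index is the two-mode structure in miniature: every non-zero count is an edge between a person and a context.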

Metrics — add depth and highlight potential focus areas

Instinctively, most people think of metrics as ‘outputs’. They manifest as KPIs and trends on management dashboards.

However, we find they are incredibly powerful as ‘inputs’: numeric scores that can be applied across a network to highlight key traits and characteristics. This makes it easy to dynamically explore relationships in more detail, and to test hypotheses you have formed.

Metrics add emphasis to a visualisation, clearly highlighting areas of interest.
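As a small sketch of a metric used as an input, the code below scores each node by its total interaction weight and then scales that score into a dot size for drawing. Weighted degree is just one possible metric, chosen here because it is simple. The function names, size range and sample weights are hypothetical.

```python
from collections import Counter


def weighted_degree(edge_weights):
    """Score each node by the total weight of the edges it takes part in."""
    scores = Counter()
    for (a, b), weight in edge_weights.items():
        scores[a] += weight
        scores[b] += weight
    return scores


def node_sizes(scores, min_size=4.0, max_size=20.0):
    """Scale scores linearly into a size range for a visualisation."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    span = high - low
    if span == 0:
        return {node: min_size for node in scores}
    return {
        node: min_size + (score - low) / span * (max_size - min_size)
        for node, score in scores.items()
    }


# Hypothetical person-level interaction counts.
edge_weights = {
    ("@matt", "@priya"): 3,
    ("@lee", "@priya"): 1,
    ("@lee", "@zoe"): 2,
}
scores = weighted_degree(edge_weights)
sizes = node_sizes(scores)
```

Swapping in a different scoring function changes what the picture emphasises without changing the underlying network.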

Timelines — plot related activities together over time

A network united around a specific context, or a set of tags or labels, provides the backbone for other forms of analysis too, such as analysing a process. Multiple tools are often used at different stages of a particular process (e.g. Slack, Jira and GitHub for software delivery), and their data can be combined to create a perspective on the end-to-end activity.

This technique is often particularly insightful for teams, as it reveals the causes of hidden bottlenecks or problem areas. It also provides the dataset to test hypotheses about how changes to individual steps or activities could influence baseline metrics, organisational goals and objectives.

The end-to-end journey of a new website launch. Each dot represents a signal of confusion, a problem or an issue raised along the timeline, highlighting the opportunities to improve handoffs between teams.

Resolving bottlenecks affecting software releases — each dot is an event, its size determined by a metric which calculates the addressable wait time. The large dots are events driving the critical path.
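Combining events from several tools into one timeline can be sketched as a merge, a sort and a measurement of the gaps between consecutive events. The gap here is a plain elapsed-time figure, not the addressable wait time metric mentioned above, whose calculation is not covered in this post. The event shape and sample events are hypothetical.

```python
from datetime import datetime


def build_timeline(*sources):
    """Merge events from several tools into one list ordered by time."""
    events = [event for source in sources for event in source]
    return sorted(events, key=lambda event: event["at"])


def gaps_between_events(timeline):
    """Measure the hours that pass between each pair of consecutive events."""
    gaps = []
    for earlier, later in zip(timeline, timeline[1:]):
        hours = (later["at"] - earlier["at"]).total_seconds() / 3600
        gaps.append({"hours": hours, "after": earlier["what"], "before": later["what"]})
    return gaps


def ts(text):
    """Parse an ISO 8601 timestamp."""
    return datetime.fromisoformat(text)


# Hypothetical events for one software release, one list per tool.
slack = [{"tool": "slack", "at": ts("2021-04-01T09:00"), "what": "release discussed"}]
jira = [
    {"tool": "jira", "at": ts("2021-04-01T11:00"), "what": "ticket opened"},
    {"tool": "jira", "at": ts("2021-04-03T12:00"), "what": "ticket closed"},
]
github = [
    {"tool": "github", "at": ts("2021-04-01T15:00"), "what": "pull request opened"},
    {"tool": "github", "at": ts("2021-04-03T11:00"), "what": "pull request merged"},
]

timeline = build_timeline(slack, jira, github)
gaps = gaps_between_events(timeline)
longest = max(gaps, key=lambda gap: gap["hours"])
```

In this made-up example the longest wait sits between opening and merging the pull request, which is the kind of handoff a team might then look into.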

Lastly — don’t remove data, anonymise it

Some organisations are comfortable sharing data in return for the benefit it can yield, others are not. At ConnectedCompany, pragmatism is definitely a critical component of what we do and how we generate results.

Global organisations will often use the same tools everywhere, but will only want to analyse a particular region or section of the business. As a result, they will only commission data for a specific subset. Likewise, permissions and policies may differ across geographies (e.g. under CCPA or GDPR), rendering certain individuals ‘out of scope’ until we have their explicit permission.

However, because insights are derived from the network as a whole, every node is valuable. It is always better to show the existence of nodes than to remove them from the dataset. For example, we commonly anonymise names and usernames and strip away all associated PII during data ingest, replacing each name with a pseudonym and therefore preserving the true structure of the network.

In practice, this results in ‘backgrounding’ or greying out the out-of-scope nodes, which maintains the integrity of the analysis.
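One way to replace names with pseudonyms while keeping the network intact is a keyed hash, sketched below. This illustrates the idea and is not a description of our ingest pipeline. The function names, the `person-` prefix, the 12-character truncation and the sample key are all assumptions. In practice the key would need to be stored securely and separately from the data.

```python
import hashlib
import hmac


def pseudonym(username, secret_key):
    """Derive a stable pseudonym from a username using a keyed hash."""
    digest = hmac.new(secret_key, username.encode("utf-8"), hashlib.sha256)
    return "person-" + digest.hexdigest()[:12]


def pseudonymise_edges(edges, in_scope, secret_key):
    """Replace usernames with pseudonyms and collect out-of-scope nodes.

    Every node and edge is kept, so the structure of the network is unchanged.
    """
    new_edges = []
    background = set()  # pseudonyms to grey out in the visualisation
    for a, b in edges:
        alias_a, alias_b = pseudonym(a, secret_key), pseudonym(b, secret_key)
        new_edges.append((alias_a, alias_b))
        for original, alias in ((a, alias_a), (b, alias_b)):
            if original not in in_scope:
                background.add(alias)
    return new_edges, background


# Hypothetical data: "@zoe" works in a region that is out of scope.
secret_key = b"replace-with-a-securely-stored-key"
edges = [("@matt", "@priya"), ("@priya", "@zoe"), ("@zoe", "@matt")]
in_scope = {"@matt", "@priya"}
safe_edges, background = pseudonymise_edges(edges, in_scope, secret_key)
```

Because the same username always maps to the same pseudonym under a given key, edges still line up across tools and over time.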

Final thoughts

We’ve found that the most impactful way to work with activity data is in close collaboration with teams in pursuit of a collective goal or a target outcome. The human input is essential. Without the context to guide investigations, you’ll just be another person with data.

The good news is that the breadth of techniques described above means you’ll always be able to accelerate or assist a key decision, regardless of which team or department it originated from.

The work-graph is a powerful asset because of its versatility, and has huge potential to help everyone build a better workplace for themselves and others.