Distilling Business Domains from a Ubiquitous Language


A ubiquitous language grows out of blending technological expertise with business expertise in reasonable proportions. This article discusses how to ‘distill’ business domains from the ubiquitous language of a business or project.

The “ubiquitous language” is a language cultivated in the intersection of technical and business jargons. Each project has its own ubiquitous language. It isn’t a language of the business, nor is it the language of technology. It’s a mix of both.

Figure 1. The sources of a ubiquitous language: a language created by mixing the jargons of technology and business to build a unified linguistic model that can be used by both domain and technology experts.

The domain experts, who are people with authority in an area or topic, have their domain jargon. The technology experts, such as designers, testers, analysts, or engineers, have their own jargon, too. The problem is that domain jargon is as bad as technical jargon. When they talk too much about their work at the dinner table, bankers can be as annoying as programmers, designers, or testers. Anyone talking in professional or technical jargon can be difficult to understand for a layperson inexperienced in their domain.

We shall now explore how analyzing a ubiquitous language can be used to derive a model of the business domains used within a specification suite. Deriving the model is the first step in getting to our end goal — which is to create a proper system for organizing large specification suites according to the business domain.

If we compare two projects from two different fields, they’ll have vastly different ubiquitous languages. The narrower the business domain of a project, the more difficult the language seems to outsiders. For example, most people easily understand consumer-facing aspects of many businesses. Pop culture can help, too. Movies about Wall Street have taught many people basic finance terms such as shorting, popularized by the 2015 movie The Big Short, or IPO (initial public offering), depicted in the critically acclaimed 2013 movie The Wolf of Wall Street. Most businesses, such as Activitee, can only dream of similar publicity. Before we continue our work on Activitee’s specification suite, let’s analyze a simpler example.

Spotting different domains in your scenarios

The ubiquitous language lies at the heart of Domain-Driven Design. Gherkin scenarios can be a great source of a ubiquitous language, because each scenario is a recording of a conversation business experts had with the technology experts about the requirements. If the recording is truthful, the scenario should easily capture a ubiquitous language which the experts created during their conversations, both face-to-face and in writing. We can imagine that a raw record won’t be perfect. Let’s say that we’re sitting in a specification workshop for a cloud storage service like Dropbox, Google Drive, Apple iCloud, or Microsoft OneDrive. A domain expert is talking. You’re writing down the behavior explained by the expert as a Gherkin draft. The record ends up on the whiteboard:

Given a 2 GB limit on free cloud drive accounts
And 2 GB of files on Simona's free cloud drive account
When she upgrades to the premium plan
Then her credit card should be charged $5
And her storage should be upgraded to a 40 GB limit

It’s a good enough draft, at least for the brainstorming stage. If we look at the scenario more closely, we’ll see that it introduces terms from several domains into the ubiquitous language of our team:

  • words like cloud drive are from the cloud domain
  • gigabytes, files, or storage are from the more general storage domain, which, nonetheless, must stay connected to the cloud domain
  • and words like premium plan, free, or credit card are from the payments domain or commercial offering domain, a domain that all non-free products share

In fact, we could easily chart how the different domains map onto our scenario.

Figure 2. Different domains can appear in a single scenario depending on how we formulate the behavior using a ubiquitous language.
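The mapping can also be sketched in a few lines of Python. This is only an illustration, not a tool the article prescribes: the GLOSSARY dictionary and the function names are hypothetical, and the term-to-domain assignments simply repeat the list above (with "GB" standing in for gigabytes, since that's the form the draft scenario uses).

```python
import re

# Hypothetical glossary: each term maps to the domain the list above assigns it to.
GLOSSARY = {
    "cloud drive": "cloud",
    "GB": "storage",
    "files": "storage",
    "storage": "storage",
    "premium plan": "payments",
    "free": "payments",
    "credit card": "payments",
}

# The draft scenario from the whiteboard, one step per entry.
SCENARIO = [
    "Given a 2 GB limit on free cloud drive accounts",
    "And 2 GB of files on Simona's free cloud drive account",
    "When she upgrades to the premium plan",
    "Then her credit card should be charged $5",
    "And her storage should be upgraded to a 40 GB limit",
]


def domains_in_step(step, glossary=GLOSSARY):
    """Return the sorted list of domains whose terms occur in a step."""
    found = set()
    for term, domain in glossary.items():
        # Whole-word, case-sensitive match, so "GB" cannot match inside another word.
        if re.search(r"\b" + re.escape(term) + r"\b", step):
            found.add(domain)
    return sorted(found)


def domains_in_scenario(steps, glossary=GLOSSARY):
    """Map every step to the domains it touches."""
    return {step: domains_in_step(step, glossary) for step in steps}


if __name__ == "__main__":
    for step, domains in domains_in_scenario(SCENARIO).items():
        print(f"{step}  ->  {', '.join(domains)}")
```

Run against the draft, the two Given steps touch all three domains, while the When step and the first Then step touch only payments and the last step touches only storage. A glossary like this is a simplification: real domain concepts are agreed on in conversation, not discovered by string matching.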

We can notice that some of these domains don’t have to be tightly coupled. For example, the payments domain doesn’t have to be connected closely to the storage domain. Upgrading your account to the premium plan will make other non-storage features, such as team collaboration, available, too, so maybe we should look for a more general way to specify the details of our business model. But that doesn’t mean we can uncouple all domains. The storage domain, for example, often stays connected to the cloud domain, because there can be no cloud storage without any storage in the first place.

Building a domain-driven specification suite in practice

We can now get back to our Activitee example. In this section, we’ll try to derive domains from a few specifications for Activitee. Thanks to the distillation process, we’ll be able to add another layer to our suite-building skills and organize Activitee’s specifications and actors by their subdomains.

Domain-driven specification suite — A specification suite organized according to its various domains. Domain-driven suites are easier to manage when we’ve got too many actors or when our actors have too many Abilities.

Distilling subdomains from unequivocal scenarios

We shall begin with an easy and unequivocal example. Here’s an Ability which specifies how employees can authenticate and manage their identity on the Activitee platform (see listing 2).

Ability: Employees can authenticate and manage their identity
  Scenario Outline: Users should be able to use two-factor authentication
    Given <authentication> for Simona
    And Simona's desire to log in
    When she provides her username
    And she provides her password
    And she enters <code>
    Then she should be <authenticated>
    Examples: Two-factor authentication
      | authentication  | code         | authenticated |
      | two-factor auth | correct code | logged in     |
      | two-factor auth | wrong code   | not logged in |
    Examples: Make sure single-factor authentication still works
      | authentication   | code    | authenticated |
      | traditional auth | no code | logged in     |
  Scenario Outline: Users should be able to review their login history
    Given Simona's previous logins were from <country>
    When somebody logs in to her account from <login>
    Then her account should be considered <compromised>
    And Simona should be notified about the security breach
    Examples:
      | country | login   | compromised |
      | the USA | the USA | secure      |
      | the USA | the UK  | compromised |
      | the UK  | the UK  | secure      |
      | the UK  | the USA | compromised |
  Scenario: Users should be able to log out remotely
    Given Simona was notified about the security breach
    When she confirms her identity via two-factor identification
    Then she should be able to log herself out from all devices remotely

The three scenarios from listing 2 are focused. They talk about one thing and one thing only — the topic of security (see table 1). Their business context explains that. They were created when Activitee signed a new client who had an active Chief Information Officer responsible for the information technology and computer systems used in their company.

Specifications like this are easy to analyze, because they’re unequivocal. We can derive the domain easily as the scenarios only talk about a single topic. If all Gherkin specifications maintained this one-to-one relationship between scenarios and domains, our job in this section would almost be done. Unfortunately, as we saw in figure 2 and as we’ll see in the paragraphs to come, reality isn’t always organized this neatly.

Distilling subdomains from mixed scenarios

Here are some specifications that allow HR representatives to target their programs at specific audiences, so that engineers, for example, see different events than sales representatives, as their interests may differ (see listing 3).

Ability: HR reps can target wellness programs by choosing specific audiences
  Scenario: Managing participants based on selected people
    Given Mike drafted a new event
    And he chose to invite only selected participants
    When he invites Simona for an annual review
    Then only Simona should be notified
    But other employees such as Jane should not be notified
  Scenario: Managing participants based on their location
    Given Mike drafted a new event
    And he chose to invite only selected participants
    When he invites employees from Atlanta
    Then employees in all departments from Atlanta should be notified
    But employees from New York should not be notified
  Scenario: Managing participants based on their department
    Given Mike drafted a new event
    And he chose to invite only selected participants
    When he invites employees from Engineering
    Then employees in all locations who work in Engineering should be notified
    But employees who work in Sales should not be notified
  Scenario: Managing participants based on their location and department
    Given Mike drafted a new event
    And he chose to invite only selected participants
    When he invites the Engineering team from Atlanta
    Then employees in the Engineering team in Atlanta should be notified
    But employees who work in Engineering in New York should not be notified
    And employees who work in Sales in Atlanta should not be notified
    And employees who work in Sales in New York should not be notified

Table 2. Domains in listing 3

The language in the four scenarios from listing 3 isn’t as unequivocal as in the previous specification we analyzed (see table 2). At least two domains can be derived from our new steps. Some domain concepts refer to the events domain, and others to the domain of organization.

If we only wanted to analyze the ubiquitous language of our specification suite, saying there are two distinct domains at play here would be fine. The problem is that we want something different. We want to organize the suite by its domains. As the file system is a hierarchical structure, we’ll only be able to assign the feature file including the Ability from listing 3 to a single domain (if we want various directories to represent various domains). Therefore, we must choose between the two domains we’ve distilled.

We can make our choice based on multiple approaches:

we can treat the core domain as more important than other subdomains

  • in Activitee’s case the events domain is clearly the core domain
  • the downside of this approach is that the core domain would always take precedence in every scenario it appears in — which isn’t something we’d want, as most scenarios must deal with the core domain in some way or another

we can count the domain concepts (words) in each domain and assume that the more popular domain is also more important

  • the first downside is that we have no guidelines for dealing with draws
  • the second downside is that some domains will be more popular only because they’re more generic than others or because they have more non-functional aspects, making it easier for them to spread horizontally across the suite

we can combine the two previous approaches to “calculate” the relative weight of each domain

  • this way, we acknowledge that some domains (such as the core domain) are inherently more important than others
  • but we also acknowledge that domains with more domain concepts present can, in some cases, be more essential than even the core domain
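The combined approach can be sketched as a small Python function. The article doesn’t give a formula, so everything here is an assumption made for illustration: the function name choose_domain, the idea of multiplying the core domain’s count by a weight, and the value 1.2 for that weight are all hypothetical.

```python
from collections import Counter


def choose_domain(concept_domains, core_domain, core_weight=1.2):
    """Pick a home domain for a specification.

    concept_domains: one domain label per domain concept found in the spec.
    core_domain: the domain treated as inherently more important.
    core_weight: assumed multiplier applied to the core domain's count
                 (a hypothetical value, not taken from the article).
    Returns (winner, scores); winner is None for empty input or a tie.
    """
    counts = Counter(concept_domains)
    scores = {
        domain: count * (core_weight if domain == core_domain else 1.0)
        for domain, count in counts.items()
    }
    if not scores:
        return None, {}
    best = max(scores.values())
    leaders = [domain for domain, score in scores.items() if score == best]
    # A tie is reported as None so that a person makes the final call.
    winner = leaders[0] if len(leaders) == 1 else None
    return winner, scores


if __name__ == "__main__":
    # Illustrative tallies: a near-draw that leans towards the core domain ...
    print(choose_domain(["events"] * 5 + ["organization"] * 4, "events"))
    # ... and a near-draw that leans away from it.
    print(choose_domain(["events"] * 4 + ["organization"] * 5, "events"))
```

With a modest weight like this, a 5–4 tally in favor of the core domain stays with the core domain, while a 4–5 tally against it goes to the other domain, so the core domain gets a head start without winning every time. Returning None on a tie is a design choice: the count gives no guidance for draws, so the sketch hands those cases back to the team.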

Table 2 shows that the specification from listing 3 is close to a draw, with a 5–4 result. We can assume that both distilled domains are almost equally important to our distillate. Personally, I’d recommend choosing the events domain because the scenarios talk only about managing participants and the core domain can take precedence in this case.

Let’s talk about an example where the opposite would be true. This specification says that employees should only see Activitee content which is relevant to them, and that the relevancy should be calculated based on their branches and departments (see listing 4).

Ability: Employees can see content they like in the news feeds for their branches
  Scenario Outline: Employees should only see content relevant to them
    Given <person> from <branch> who works in <job>
    When <person> looks at the company dashboard on Activitee
    Then <person> should see <type> <content>
    Examples: Employees should see after-work content only from their location
      | person | branch  | job   | type               | content |
      | Jane   | NY      | Dev   | NY after-work      | events  |
      | Mike   | NY      | HR    | NY after-work      | posts   |
      | Tom    | Atlanta | Sales | Atlanta after-work | events  |
      | Ramona | Atlanta | Dev   | Atlanta after-work | posts   |
    Examples: Employees should see work-related content from all locations
      | person | branch  | job   | type          | content |
      | Mike   | NY      | HR    | Atlanta HR    | events  |
      | Jane   | NY      | Dev   | Atlanta dev   | posts   |
      | Jane   | NY      | Dev   | NY dev        | events  |
      | Tom    | Atlanta | Sales | Atlanta sales | events  |
      | Ramona | Atlanta | Dev   | Atlanta dev   | posts   |
      | Ramona | Atlanta | Dev   | NY dev        | events  |

Table 3. Domains in listing 4
I counted posts as a part of the organization subdomain because Activitee’s news feed treats posts as pieces of content published in various interest groups — and groups belong to the organization domain.

If we have a look at table 3, we’ll see that listing 4 is the opposite of listing 3. We’re close to a draw again, but this time it’s a 4–5 result in favor of the organization subdomain. This, and the fact that the examples talk not only about events but also about posts in interest groups, encourage me to recommend adding this specification to the organization subdomain. Here’s how that might look:
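One way to picture a suite organized this way is a short Python sketch that places each feature file in exactly one domain directory. The root directory name, the feature file names, and the ASSIGNMENTS mapping are all hypothetical; only the three domains (security, events, organization) and the choices made for listings 2, 3, and 4 come from the discussion above.

```python
from pathlib import PurePosixPath

# Hypothetical assignment of feature files to domains. The file names are
# illustrative; the domains follow the choices discussed in the text.
ASSIGNMENTS = {
    "authentication.feature": "security",        # listing 2
    "audience_targeting.feature": "events",      # listing 3
    "news_feed.feature": "organization",         # listing 4
}


def target_path(feature_file, assignments=ASSIGNMENTS, root="specifications"):
    """Return the single directory path a feature file lives under."""
    domain = assignments[feature_file]
    return PurePosixPath(root) / domain / feature_file


def layout(assignments=ASSIGNMENTS, root="specifications"):
    """Return all target paths as strings, sorted alphabetically."""
    return sorted(str(target_path(name, assignments, root)) for name in assignments)


if __name__ == "__main__":
    print("\n".join(layout()))
```

Because the mapping is a plain dictionary, a feature file can only ever have one domain, which mirrors the constraint the hierarchical file system imposes on the suite.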

That’s all for this article. Hopefully this has been a thought-provoking read on deriving business domains from ubiquitous language. For more on scenarios and Gherkin and what they can do to help your business run like a top, download the free first chapter of Writing Great Specifications and see this Slideshare presentation.