Behaviour Driven Development with Scala and Cucumber

Published in Wix Engineering · Mar 19, 2021 · 17 min read

Computers without instructions are not very useful; they just consume power and produce heat. It’s the responsibility of programmers to write those instructions so that the computer behaves as intended. But writing the instructions is not enough. Programmers must also keep those instructions simple, and this is what makes the job so difficult. They must think of all the events that might happen in the system and write instructions to handle each of them, which means there are a lot of details to consider. The programmer has to organise and manage all those details so that they can be understood by both the computer and other programmers.

This is drastically different from how the computer user sees the system. The user does not care about programming details; the user cares that the computer behaves in a way that lets them achieve certain goals. Since the user and the programmer see the system from different perspectives, it’s difficult for them to understand each other’s view of the same requirements, problems and solutions. This creates a communication barrier, because the same problems and solutions can be communicated and understood differently by each side.

To bridge this communication gap, one of the Product Owner’s (PO) many responsibilities is to understand both the goals of the user and the details that matter to the programmer. The PO must describe the system in sufficient detail that the programmer can write instructions for it. The PO must also be able to describe the system without those details so that the user can understand it. In some organisations the PO carries this responsibility; in others there’s a dedicated role for it called Business Analyst (BA).

So what language does the PO use to describe the system’s behaviour? One approach is to use storyboards and wireframes, but these tend to change frequently, and programmers need the business rules behind them, which don’t change as often. What we need is some kind of specification that is detailed enough for programmers to write code from and, at the same time, simple enough for users to read and understand.

Writing Scenarios in Gherkin Language

Given, When, Then

One solution is to write the specification in a language called Gherkin. The most useful keywords in this language are Feature, Scenario, Given, When and Then. Let’s say we want to create a program that answers a question about a chicken crossing the road. We would write the following scenario:

The Feature keyword introduces the name of the feature and can be followed by a short description. Each Feature can contain multiple Scenarios, each of which has a name and several steps: Given, When and Then. The Given step describes the initial state of the system. The When step is the action, which might be invoked by a user or by an event from another system. The Then step is the expected result or expected state of the system. We also see And used right after the Given step; it repeats the keyword of the previous step, which keeps steps shorter. Without And, the Given step would read “Given there is a answerer and there is a user”, which would be too long for one step.

A good rule of thumb is to keep steps small so that they can be reused in other scenarios or features, possibly even ones unrelated to this feature. And also avoids repeating the same keyword several times, like the following:

...
Given there is a answerer
Given there is a user
...

To make the scenario easier to read, And is preferred in such cases.

Background

Let’s take a look at another scenario:

We see from this example that the first two steps are duplicated in both scenarios. That’s not an issue when there are only two scenarios and only two duplicated steps, but when many more steps repeat across multiple scenarios, you can move them to a Background. The result would look like this:

Scenario Outline

We also see that the When and Then steps are actually the same; the only difference is the question and answer text. This becomes a problem when there are even more Scenarios with this issue. To solve it, we can use a Scenario Outline. Take a look at how that works:

Steps are declared only once and parameters are passed in from the Examples table, each parameter from its own column. The Scenario Outline acts as a template and the Examples rows as the actual parameters for that template. The result and process of execution are the same as before: there are two rows in the Examples table, so two scenarios will be executed, one for each row.
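To make the template idea concrete, here is a minimal sketch in plain Scala of how an outline could be expanded into concrete scenarios. It is only a model for illustration, not how Cucumber is implemented, and the object name, step wording and example rows are all assumptions made up for this sketch.

```scala
// Hypothetical model of Scenario Outline expansion; not Cucumber's real implementation.
object OutlineSketch {
  // Replace every <name> placeholder in a step with the value from one Examples row.
  def fill(step: String, row: Map[String, String]): String =
    row.foldLeft(step) { case (text, (name, value)) =>
      text.replace(s"<$name>", value)
    }

  // One concrete scenario (a list of steps) is produced per Examples row.
  def expand(template: List[String], examples: List[Map[String, String]]): List[List[String]] =
    examples.map(row => template.map(step => fill(step, row)))

  // Illustrative outline steps; the wording is assumed, not taken from a real feature file.
  val template: List[String] = List(
    "When the user asks \"<question>\"",
    "Then the answerer replies \"<answer>\""
  )

  // Illustrative Examples table: one Map per row, keyed by column name.
  val examples: List[Map[String, String]] = List(
    Map("question" -> "Why did the chicken cross the road?", "answer" -> "To get to the other side"),
    Map("question" -> "Why did the robot chicken cross the road?", "answer" -> "It was programmed to")
  )

  def main(args: Array[String]): Unit =
    expand(template, examples).foreach(scenario => println(scenario.mkString("\n") + "\n"))
}
```

Running main prints two scenarios, one per row, which mirrors the "one scenario for each row" behaviour described above.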

This becomes especially useful when dealing with numbers. Here’s another example, in which we describe the behaviour of these chickens when searching for food such as insects. Let’s say a typical chicken finds 3 insects per minute, but our robot chicken, with its high-tech motion detectors and thermal cameras, can detect 10 insects per minute. In fact, this chicken is so advanced that it can attract even more bugs; sadly for the robot chicken, they’re software based. Take a look at the scenario for such a case:

Specifying system behaviour through these scenarios is a useful way to document how the system should behave, and writing them enhances collaboration between developers, QA and the PO. Documenting system behaviour through use cases that also act as acceptance criteria is not new; there are many formats for it, such as the User Story. Gherkin is different because it uses the business-friendly words Given, When and Then, which unsurprisingly correspond to Arrange, Act and Assert, the groups into which steps are arranged in Unit Tests. This fits well, because in the end scenarios are test cases.
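The correspondence between Given, When, Then and Arrange, Act, Assert can be sketched with a plain Scala check. The insectsFound function below is a hypothetical stand-in for the system under test, and the numbers come from the robot chicken example.

```scala
object ArrangeActAssertSketch {
  // Hypothetical stand-in for the system under test.
  def insectsFound(insectsPerMinute: Int, minutes: Int): Int = insectsPerMinute * minutes

  def robotChickenSearchesForThreeMinutes(): Int = {
    // Arrange / Given: a robot chicken detects 10 insects per minute
    val insectsPerMinute = 10
    // Act / When: the chicken searches for 3 minutes
    val found = insectsFound(insectsPerMinute, minutes = 3)
    // Assert / Then: the chicken has found 30 insects
    assert(found == 30, s"expected 30 insects but found $found")
    found
  }
}
```

Each comment pairs the unit-testing term with the Gherkin keyword that plays the same role.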

Executing Scenarios with Cucumber

But documenting system behaviour with Gherkin is only part of the story. Where Gherkin really becomes useful is in its ability to act as a specification that can be executed, thanks to a tool called Cucumber. Each step in a scenario can invoke code and even forward parameters from the scenario to that code. When scenarios are written as Acceptance Criteria for the system, Cucumber executes them as Automated Acceptance Tests. Gherkin lets us define a specification of the system that acts as documentation for the user, PO or BA, and as a requirement specification from which the programmer writes code. It also helps all of these parties collaborate and communicate. Cucumber executes that specification as Automated Acceptance Tests in order to verify that the documentation is up to date with the actual system behaviour.

Since there are many implementations of Cucumber for various programming languages, we will examine how Cucumber works in Scala.

How to use

To use Cucumber in a Scala project built with sbt, we need to add these dependencies to build.sbt:

scalaVersion := "2.13.5"

libraryDependencies += "io.cucumber" %% "cucumber-scala" % "6.10.1" % Test
libraryDependencies += "io.cucumber" % "cucumber-junit" % "6.10.1" % Test
libraryDependencies += "junit" % "junit" % "4.13.2"
libraryDependencies += "com.novocode" % "junit-interface" % "0.11" % Test

These are the latest versions of Cucumber at the time of writing.

Apart from the obvious cucumber-scala dependency, we see cucumber-junit and junit. Cucumber runs feature files as JUnit tests, so cucumber-junit is needed: it provides a custom JUnit runner called Cucumber. This runner looks for .feature files and glue classes on the classpath and accepts configuration options. The junit-interface package is needed for the sbt test command to execute feature files; without it, sbt will skip them. If you’re going to execute feature files only through IntelliJ, this dependency is not needed.

After running the sbt update command to fetch the declared dependencies, we can create our first feature file at src/test/resources/features/ChickenCalculator.feature with the following content:

After that, let’s create a test runner class that will parse our feature file and run each scenario as a test case. We will create the runner class at src/test/scala/TestRunner.scala with the following contents:

import io.cucumber.junit.{Cucumber, CucumberOptions}
import org.junit.runner.RunWith

@RunWith(classOf[Cucumber])
@CucumberOptions()
class TestRunner {}

We created a JUnit test class that uses the Cucumber runner, which is responsible for parsing each feature file found on the classpath and finding the glue classes that implement the steps in those feature files in order to execute them. Since we didn’t define any parameters in @CucumberOptions, defaults will be used. If needed, we can provide custom paths for feature files or glue classes, along with many other configuration parameters, through this annotation.
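As an illustration of such configuration, a variant of the runner with explicit options might look like the sketch below. The features, glue and plugin attributes are part of the cucumber-junit @CucumberOptions annotation; the concrete values are assumptions that match the paths used in this article (feature files under src/test/resources/features and step definitions in a package named steps).

```scala
import io.cucumber.junit.{Cucumber, CucumberOptions}
import org.junit.runner.RunWith

// Variant of TestRunner with explicit options; use it instead of, not alongside, the default one.
@RunWith(classOf[Cucumber])
@CucumberOptions(
  features = Array("classpath:features"), // where to look for .feature files (assumed location)
  glue = Array("steps"),                  // package that contains the step definitions (assumed name)
  plugin = Array("pretty")                // print each step to the console as it runs
)
class TestRunner {}
```

Keeping two runner classes in the same project would execute every scenario twice, so this is meant as a replacement for the default runner.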

Once we have the TestRunner class in place, let’s run it to see it in action. It will fail, as we haven’t provided any step definitions yet. We can run it with the sbt test command and the result would look like this:

We can also run this test from IntelliJ, by running the TestRunner as follows:

We see it failed with an error stating that the step definitions for the discovered feature files are not implemented, and it also suggested what these step definitions should look like. So let’s copy them into a class located at src/test/scala/steps/StepDefinitions.scala.

Some of those step definitions are suggested several times, so make sure each is declared only once, otherwise you will get a DuplicateStepDefinitionException. The step definition class should look like this:
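As a rough sketch of the shape of such a class: the first step text below appears later in this article, while the When and Then wording is assumed purely for illustration and would have to match your own feature file.

```scala
package steps

import io.cucumber.scala.{EN, PendingException, ScalaDsl}

// Skeleton step definitions: every step is still pending.
class StepDefinitions extends ScalaDsl with EN {

  Given("""a chicken collects {int} insects per minute""") { (insectsPerMinute: Int) =>
    // Write code here that turns the phrase above into concrete actions
    throw new PendingException()
  }

  // Assumed step wording, for illustration only.
  When("""the chicken searches for {int} minutes""") { (minutes: Int) =>
    throw new PendingException()
  }

  // Assumed step wording, for illustration only.
  Then("""the chicken has found {int} insects""") { (expected: Int) =>
    throw new PendingException()
  }
}
```

Each step text is declared exactly once, even if several scenarios use it.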

And now, running sbt test provides different output:

Now it says that the steps are not implemented, which we are about to fix, but before that, let’s examine the step definitions for a bit.

How Step Definitions work

Each step definition maps to a step in the feature file, because the first parameter of the functions Given(), When() and Then() is a string that matches the step text in the feature file. These strings can also be Regular Expressions, so you can write the step definition as follows and have the same effect:

Given("""^a chicken collects (\d+) insects per minute$""") { (int1: Int) =>
  // Write code here that turns the phrase above into concrete actions
  throw new io.cucumber.scala.PendingException()
}

This gives more flexibility, since a single definition can match multiple steps in various feature files. In earlier versions of Cucumber, Regular Expressions were the only way to map steps from feature files to step definitions. Later came a more convenient format called Cucumber Expressions, which are used by default in generated step definitions, as you saw in our sample. Still, if there’s a need, it’s possible to use Regular Expressions.
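To see what the regular expression in that step definition does with a step’s text, here is a small standalone sketch using only the Scala standard library. It illustrates the matching and the capture group; it is not how Cucumber itself is implemented, and the object and method names are made up for this example.

```scala
object StepMatchingSketch {
  // The same regular expression as in the step definition above.
  private val InsectsPerMinute = """^a chicken collects (\d+) insects per minute$""".r

  // Returns the captured number when the step text matches, None otherwise.
  def insectsPerMinute(stepText: String): Option[Int] = stepText match {
    case InsectsPerMinute(count) => Some(count.toInt)
    case _                       => None
  }
}
```

The capture group (\d+) plays the role that {int} plays in a Cucumber Expression: it selects the part of the step text that is handed to the step function as a parameter.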

Another thing you will notice is that a parameter type is declared inside the string passed as the first parameter to Given(), When() and Then(). The matched value is passed into the function given as the second parameter. The parameter type matches the kind of value provided in the feature file. In our case a number is provided (insect count, minutes or hours), so we use {int} in our step definitions, but there are other types such as {string}, {float}, {word} or even {}, which matches anything. If needed, you can define your own parameter type with a regexp and write a transformer that converts the matched text to the actual type. There are many possibilities for customization here, so for more info, take a look at Cucumber Expressions.

Notice the second trait that the StepDefinitions class extends. It’s a language code, in our case EN. If you were to extend another trait instead, such as LT for the Lithuanian language, your step definitions would look like this:

This allows feature files to be written in Lithuanian, with Given, When, Then and the other keywords translated to Duota, Kai, Tada and so on. Many languages are supported.

Making tests pass

Let’s implement our step definitions with the following code:

As you can see, a ChickenCalculator instance is created with parameters passed in from the feature file. The expected result is compared with the actual result in the Then step. For our simple application, assert is enough, but for more advanced cases it’s recommended to use an assertion library. In a real-world scenario, ChickenCalculator would be some business-related class you want to test.
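A minimal sketch of step definitions along those lines is shown here. The When and Then step wording and the field names are assumptions for illustration; ChickenCalculator and its searchInsects method are the ones used in this article.

```scala
package steps

import chicken.ChickenCalculator
import io.cucumber.scala.{EN, ScalaDsl}

class StepDefinitions extends ScalaDsl with EN {
  // State shared between the steps of a single scenario.
  private var calculator: ChickenCalculator = _
  private var found: Int = 0

  Given("""a chicken collects {int} insects per minute""") { (insectsPerMinute: Int) =>
    calculator = new ChickenCalculator(insectsPerMinute)
  }

  // Assumed step wording, for illustration only.
  When("""the chicken searches for {int} minutes""") { (minutes: Int) =>
    found = calculator.searchInsects(minutes)
  }

  // Assumed step wording, for illustration only.
  Then("""the chicken has found {int} insects""") { (expected: Int) =>
    assert(found == expected, s"expected $expected insects but found $found")
  }
}
```

The Given step arranges the object under test, the When step calls it, and the Then step compares the stored result with the value from the feature file.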

Our tests won’t compile at this point, because ChickenCalculator doesn’t exist, so let’s create an implementation. Create a class at src/main/scala/chicken/ChickenCalculator.scala with the following contents:

package chicken

class ChickenCalculator(insectsPerMinute: Int) {
  def searchInsects(minutes: Int): Int = insectsPerMinute * minutes
}

This makes tests pass:

And we can verify that the tests are indeed working by replacing multiplication with addition, changing def searchInsects(minutes: Int): Int = insectsPerMinute * minutes to def searchInsects(minutes: Int): Int = insectsPerMinute + minutes. Soon enough we will see the tests failing, confirming that they react to changes in business logic:

From the error message we can see which Scenario failed, and IntelliJ even lets us navigate from the error message to the step definition where the failure occurred.

Scenario Tags

Another interesting feature you can use is tags. You can tag some scenarios and run only those. For example, let’s tag the first scenario as @wip like this:

And we define which scenario we would like to run by specifying a tag in TestRunner:

import io.cucumber.junit.{Cucumber, CucumberOptions}
import org.junit.runner.RunWith

@RunWith(classOf[Cucumber])
@CucumberOptions(tags = "@wip")
class TestRunner {}

From the results we see that only 1 test case was executed, which means it was our tagged scenario.

Scenario Outline

Rearranging commonly used steps into a Background or a Scenario Outline doesn’t require any changes on the step definition side, so let’s rewrite the scenarios as a Scenario Outline and see how that works:

If you run the tests now, you will see that they all pass. We didn’t change the step definitions, only how parameters are passed into the scenarios.

IntelliJ support

IntelliJ has great Gherkin support out of the box, and for Cucumber support you can install the Cucumber for Scala plugin. It highlights keywords, validates statements, warns when step definitions are not implemented and even offers to create a default implementation for you, so you don’t need to copy-paste from the console.

In the end, project structure should look like this:

And if you need the project as a reference or as a starting point, you can find these examples in this GitHub repository.

Specification Tests

Writing Scenarios in a business-friendly language with Gherkin and executing them as acceptance tests with Cucumber was at first a complementary feature of the then-existing RSpec testing tool. RSpec belongs to a category of testing tools that use functional specifications to describe system behaviour, rather than user stories as in Cucumber. These specifications are technical: they were developed as a BDD-friendly approach to unit testing and offered as an alternative to Unit Tests, which is why they are not suitable for business people. For that reason, Cucumber was added to RSpec as a BDD-friendly approach to Acceptance Testing, and was later spun off into its own product.

Specification-based Testing takes the ideas of BDD and uses them to solve issues that naturally appeared with Unit Tests, for example by naming test cases with sentences and using a modal verb such as “should”. Furthermore, Specification-based Testing focuses on testing only the behaviour of the system. For example, if you are designing a list-sorting function and writing a unit test, you could write a test case that checks whether the output list type matches the expected type, or whether a single entity inside the list matches the expected structure. Since Specification-based Tests are influenced by Behaviour Driven Development, the only test cases we would write are the ones where we pass in an unsorted list and expect a sorted list to come out of the function, because test cases like these verify the behaviour of the system. Likewise, when creating a calculator program, BDD-based tests don’t care about the binary representation and structure of the integer; what they care about is that 1 + 1 returns 2.
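The sorting example can be sketched in plain Scala without any testing library. The sortAscending function and the example names are hypothetical; the point is only that each example is named with a sentence and checks behaviour rather than structure.

```scala
object SortingSpecSketch {
  // Hypothetical function under test; here it simply delegates to the standard library.
  def sortAscending(numbers: List[Int]): List[Int] = numbers.sorted

  // Each example is named with a sentence that describes behaviour, not implementation details.
  val examples: List[(String, () => Boolean)] = List(
    "sorting should return the numbers in ascending order" ->
      (() => sortAscending(List(3, 1, 2)) == List(1, 2, 3)),
    "sorting should leave an already sorted list unchanged" ->
      (() => sortAscending(List(1, 2, 3)) == List(1, 2, 3))
  )

  def main(args: Array[String]): Unit =
    examples.foreach { case (name, check) =>
      println(s"${if (check()) "PASS" else "FAIL"}: $name")
    }
}
```

Nothing here inspects the type of the returned list or the structure of its elements; only the observable behaviour is verified.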

This doesn’t mean you cannot write Specification-based Tests that act as Unit Tests; you can. The same goes for Unit Tests: you can write them like specifications, using the same vocabulary and focusing only on behaviour, without any Spec testing library. But since BDD suggests focusing on the behavioural side of testing, Specification Testing tools offer a better built-in vocabulary and functionality tuned towards expressing tests as specifications than Unit Testing libraries do.

Since there are Specification-based tools for most programming languages, let’s see how it looks in one of Scala’s testing tools, called Specs2.

Specs2 supports two specification styles: the last example shows a Unit Specification, and the other style is called an Acceptance Specification. Regardless of which is used, we can see that the tool supports keywords like ‘should’, ‘must’ and ‘have’ so that a test can be defined as a specification, explaining the system through examples of expected behaviour. The library supports even more keywords and features, so take a look at it here.
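For reference, a Unit Specification in Specs2 might look roughly like the following sketch. It assumes specs2-core is available on the test classpath (it is not among the dependencies listed earlier), and it repeats a ChickenCalculator stand-in so that the snippet is self-contained; the spec name and example sentences are made up for illustration.

```scala
import org.specs2.mutable.Specification

// Stand-in for the class under test, repeated here so the sketch is self-contained.
class ChickenCalculator(insectsPerMinute: Int) {
  def searchInsects(minutes: Int): Int = insectsPerMinute * minutes
}

class ChickenCalculatorSpec extends Specification {
  "A chicken that collects 3 insects per minute" should {
    "find 6 insects in 2 minutes" in {
      new ChickenCalculator(3).searchInsects(2) must beEqualTo(6)
    }
    "find nothing in 0 minutes" in {
      new ChickenCalculator(3).searchInsects(0) must beEqualTo(0)
    }
  }
}
```

The should and in blocks compose the sentences that name each example, and must beEqualTo expresses the expectation.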

Best practices

To use Cucumber efficiently, there are some common pitfalls that should be avoided.

Don’t mention UI in the Scenarios

One common pitfall is that people who start writing scenarios tend to mention User Interface elements, like buttons, input forms, etc. It’s natural to think of a feature as a set of interactions through the UI, but this approach has several problems.

The first problem is that the UI tends to change frequently, which means the scenarios will have to change too, when in fact they shouldn’t, since the business logic underneath didn’t change. This makes tests fragile. Tests need to fail when business logic behaves in undesired ways. If a test fails because of a UI change, you won’t immediately know that the UI change was the culprit, and the failure is a false positive.

The second problem is that tests executed through the UI are very slow, since it takes time for the UI to render and for requests and responses to travel through all layers of the system. By writing scenarios that don’t mention the UI and only operate on business entities, you can make them interact only with the business layer of the application, which is agnostic of framework, UI and even database, making the tests faster.

This doesn’t mean you shouldn’t write scenarios that test the UI. You can, but there should be only a small number of these tests. Even then, you shouldn’t mention the UI in the scenarios. When scenarios are written in a UI-agnostic manner, the step definitions in code are free to perform either UI actions or business actions in the business module of the application. When the UI is mentioned in the scenarios, you don’t have this freedom.

The typical setup for step definitions that test a web application through its UI is to use some sort of web driver, like Selenium, to interact with the application.
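One way to keep that freedom is to let step bodies depend on a small interface rather than on the UI directly. The sketch below is plain Scala with hypothetical names (InsectSearch, DirectSearch, UiSearch); the UI driver is only simulated, so no browser or web driver is involved.

```scala
object UiAgnosticSteps {
  // What the scenario needs, expressed in business terms only.
  trait InsectSearch {
    def search(insectsPerMinute: Int, minutes: Int): Int
  }

  // Calls the business logic directly: fast, no UI involved.
  object DirectSearch extends InsectSearch {
    def search(insectsPerMinute: Int, minutes: Int): Int = insectsPerMinute * minutes
  }

  // Placeholder for a driver that would fill in a form through a web driver and read the result back.
  // Simulated here by delegating, so the sketch runs without a browser.
  object UiSearch extends InsectSearch {
    def search(insectsPerMinute: Int, minutes: Int): Int =
      DirectSearch.search(insectsPerMinute, minutes)
  }

  // The step body depends only on the trait, so the scenario text stays the same
  // whichever driver is plugged in.
  def chickenHasFound(driver: InsectSearch, insectsPerMinute: Int, minutes: Int, expected: Int): Boolean =
    driver.search(insectsPerMinute, minutes) == expected
}
```

Because the scenario never mentions buttons or forms, switching from DirectSearch to a real UI driver would require no change to the feature file.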

Write scenarios in such a way that the UI is not a factor, since the UI is not core to the business. It’s actually easier to think about business requirements without the UI, since you don’t bring extra concepts, like a frequently changing UI, into the scenarios. This might seem counterintuitive at first, considering that the user interacts with the system through the UI, but it does get easier with practice and time.

Provide enough details

Another issue is the level of detail in scenarios. Don’t hesitate to cover each and every use case in scenarios, especially edge cases. By describing a feature vaguely or without sufficient detail, you leave room for programmers to make assumptions that can be proven wrong later on. Write requirements as acceptance test scenarios in order to prove that the program is working correctly.

Negative Scenarios

The next issue is that you need to write scenarios that describe not only what the program should do, but also what it should not do. For this type of scenario, a QA specialist’s contribution is very useful, since QA are skilled at finding ways the user might interact with the system that we would probably not think of at first. We can write scenarios for cases in which the system should decline certain actions from the user.
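As a small illustration of what the code behind a negative scenario might check, consider the sketch below. The rule that a search cannot last a negative number of minutes is an assumption invented for this example, as are the names.

```scala
object NegativeScenarioSketch {
  // Hypothetical rule: a search cannot last a negative number of minutes.
  def searchInsects(insectsPerMinute: Int, minutes: Int): Either[String, Int] =
    if (minutes < 0) Left("minutes must not be negative")
    else Right(insectsPerMinute * minutes)

  // "Then the search is declined": the negative scenario passes only when the action is rejected.
  def searchIsDeclined(insectsPerMinute: Int, minutes: Int): Boolean =
    searchInsects(insectsPerMinute, minutes).isLeft
}
```

A positive scenario would assert on the Right value, while the negative one asserts that the request was refused.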

Keep steps small

Another tip is to keep scenario steps small; don’t make them do many things at once. Small steps can be reused in other existing scenarios and in future ones, which minimizes step implementation code. By following this tip, you can reach a stage where, for a new feature, you write a scenario entirely from existing steps. That lets you see instantly whether the system works correctly and skip the step implementation stage altogether.

Acceptance Testing complements Unit Testing

Lastly, Acceptance Tests executed by Cucumber don’t replace existing unit or specification tests; they complement each other. First, Features and Scenarios are written together by the developer, QA and business analyst. Then the developer implements the step definitions in order to execute the Acceptance Tests and see them fail. After that, the developer follows the three steps of TDD: write a failing unit test, make it green by implementing some functionality, and refactor. After the refactoring stage, the developer re-runs the Acceptance Test to see if it passes. If it doesn’t, they write another unit test for another part of the functionality and repeat the TDD cycle until the Feature passes. From there, they move on to the next Feature. It takes multiple cycles of TDD to make a single BDD test pass.

Single cycle of Acceptance Test consists of multiple cycles of Unit Tests

When to avoid using Cucumber and Gherkin

So who is responsible for writing these specifications? By the looks of it, it’s the responsibility of the PO or Business Analyst.

What if the business people don’t want to, or don’t have time to, write these scenarios? In that case the scenarios can still be useful if business people read them: programmers write the scenarios and give them to business people to read, and the scenarios still serve as a means of communication.

But what if business people won’t read or write the scenarios? In that case the scenarios are useless, since they don’t serve their intended purpose, which is communication between business people and programmers. There is no point in writing a specification in an inconvenient format if nobody else is going to read it in that format, so programmers should document the specification in the format most convenient for them: program source code, a Specification-based Test or even a Unit Test. This doesn’t mean we won’t need Acceptance Tests in that case; we do, since they are an important part of the application.

Conclusion

There is a misconception that Behaviour Driven Development is a discipline that should be practised only by programmers. It’s actually a tool for business people and programmers to communicate, using a common language to write scenarios. Those scenarios can be used as requirements by the programmer in order to write code, and as system documentation by business people and users in order to understand how the system behaves. The Gherkin language provides a business-friendly vocabulary for writing the specification, and Cucumber can turn those scenarios into an executable specification. When the scenarios are written as Acceptance Criteria of the system, Cucumber turns that specification into Automated Acceptance Tests, verifying that the system behaves and functions as intended.