How to Squeeze Test Driven Development on Legacy Systems

We all love T.D.D. We know its benefits, and we have read a thousand tutorials on how to build a system using this technique. But this does not seem feasible for existing legacy systems.

Photo by Anne Nygård on Unsplash

What is TDD?

We turn requirements into very specific test cases.

We improve software so all tests pass.

This is opposed to incorporating functionality that has not been proven to comply with requirements.

Created in 2003 by Kent Beck (also the author of the xUnit testing framework).

The Cycle

1) Add a test. It must be based solely on behavior, forgetting everything about accidental implementation.

2) Run all tests. The new test must fail. All the rest should pass.

3) Write the simplest possible solution to make the test pass. The programmer must not write code that is beyond the functionality the test checks. (K.I.S.S. and Y.A.G.N.I design principles)

If all tests pass, restart the process or …

4) (Optionally) refactor (when the code smells).

NEVER do 1 and 4 together.

Design Benefits

  • Simpler designs (KISS, YAGNI, gold plating avoidance, fake it till you make it, fail fast).
  • Failure isolation (less need for the debugger or logging).
  • Design by contract.
  • Modularization.
  • Bottom-up building.
  • Separation of normal use cases from exceptions (alternate cases).
  • Full branch coverage (we cannot add code without a test covering it).
  • Instant feedback / psychological rewards.
  • Incremental approach in small steps.
  • Based on Wittgenstein’s ideas of learning through incremental examples, and on Cognitive Behavioral Therapy.
  • Defers implementation issues and avoids premature optimization.

Requirements

No Globals, No Singletons, No Settings, No Database, No Caches, No External API Calls and no side effects at all.

TDD can detect coupling problems.

Solving them leads to cleaner code focused on business logic alone and encapsulating implementation decisions.

We must deal with coupling problems using test doubles: mocks, stubs, fake objects, spies, proxies, dummy objects, etc.
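As an illustration of a test double, here is a minimal Python sketch in which a hand-written stub stands in for a database-backed repository. The names StubArtistRepository, ArtistFinder, names_containing and suggest are hypothetical and not part of the system described in this article.

```python
class StubArtistRepository:
    """Test double: answers from an in-memory list instead of a database."""

    def __init__(self, names):
        self._names = names

    def names_containing(self, term):
        return [name for name in self._names if term.lower() in name.lower()]


class ArtistFinder:
    """Business logic under test; it only knows the repository protocol."""

    def __init__(self, repository):
        self._repository = repository

    def suggest(self, term):
        # Business rule assumed for this example: suggestions come back sorted.
        return sorted(self._repository.names_containing(term))


finder = ArtistFinder(StubArtistRepository(["Sigur Ros", "Radiohead", "Arcade Fire"]))
suggestions = finder.suggest("ad")
```

Because the repository is passed in, the test needs no database, no settings and no globals.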

Working on existing systems

A real-world example

  • Users can search for artists with a type-ahead selector in a React application.
  • The system performs database queries on a heavily concurrent back-end system.
  • We need to remove redundant SQL queries matching parts of artists’ names.
  • The SQL LIKE operator is very expensive on relational systems, and we are not allowed to change the back-end architecture.

The Problem

SELECT * FROM ARTISTS
WHERE ((artist.fullname LIKE '%Arcade Fire%')
OR (artist.fullname LIKE '%Radiohead.%')
OR (artist.fullname LIKE '%Radiohead%')
OR (artist.fullname LIKE '%Sigur Ros%')
OR (artist.fullname LIKE '%Sigur%'))

Instead of the query above, the system will execute just:

SELECT * FROM ARTISTS
WHERE ((artist.fullname LIKE '%Arcade Fire%')
OR (artist.fullname LIKE '%Radiohead%')
OR (artist.fullname LIKE '%Sigur%'))

This part is redundant and expensive for the database:

OR (artist.fullname LIKE '%Radiohead.%') 
OR (artist.fullname LIKE '%Sigur Ros%')

Let’s get to work

Always start with the simplest problem

1) Add a test (empty case)

Notice:

  • Class LikePatternSimplifier is not created yet.
  • No function simplify() is defined.
  • We number tests according to definition order.
  • The first test is the easiest one, and also the Zero case of the ZOMBIES methodology.

Test fails (as expected). Let’s create the class and the function.

3) Write the simplest possible solution to make the test pass.

Notice:

  • First solution is always hard-coded.
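The empty case and its hard-coded solution might look like the following minimal Python sketch. The class and method names come from the text above; the test name, the use of unittest and the list-in, list-out signature of simplify() are assumptions.

```python
import unittest


class LikePatternSimplifier:
    def simplify(self, patterns):
        # Hard-coded on purpose: the empty case is the only requirement so far.
        return []


class LikePatternSimplifierTest(unittest.TestCase):
    def test01_empty_patterns_stay_empty(self):
        self.assertEqual([], LikePatternSimplifier().simplify([]))
```

Saved in a module, it can be run with python -m unittest.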

Continue with another trivial case

1) Add a test (simple expression)

3) And the simplest solution for both cases.

Works like a charm in both cases.
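A sketch of this second step, under the same assumptions as before (Python, unittest, a list-based simplify()). The artist name used in the test is only an example.

```python
import unittest


class LikePatternSimplifier:
    def simplify(self, patterns):
        # Still no real logic: nothing we have seen so far can be simplified.
        return list(patterns)


class LikePatternSimplifierTest(unittest.TestCase):
    def test01_empty_patterns_stay_empty(self):
        self.assertEqual([], LikePatternSimplifier().simplify([]))

    def test02_single_pattern_is_kept(self):
        self.assertEqual(["Radiohead"], LikePatternSimplifier().simplify(["Radiohead"]))
```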

We are taking baby steps, slicing the problem and following the divide-and-conquer principle.

Continue with another (not so simple) case:

1) Add a test (two independent expressions).

The code works correctly without changes. Is this a good test?

We will discuss it in a more advanced article.

Let’s move on.

Continue with a desired and juicy business case.

1) Add a test (one expression containing the other).

Let’s make it work.

3) Add the simplest solution for all the already written cases.
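A deliberately naive Python sketch of such a solution. It assumes that "containing" means "starting with" at this stage, since prefixes are all the tests ask for so far, and that simplify() takes and returns a list of terms.

```python
class LikePatternSimplifier:
    def simplify(self, patterns):
        simplified = []
        for candidate in patterns:
            redundant = False
            for other in patterns:
                if candidate != other:
                    # LIKE '%Radiohead%' adds nothing once LIKE '%Radio%' is there.
                    if candidate.startswith(other):
                        redundant = True
            if not redundant:
                simplified.append(candidate)
        return simplified


simplifier = LikePatternSimplifier()
one_contains_the_other = simplifier.simplify(["Radio", "Radiohead"])
```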

This is an ugly algorithmic solution, but we will improve it with a refactoring once we become more confident.

We cannot fake it anymore. We need to make it.

Ugly, not performant, undeclarative and complex.

We don’t care. We need to gain confidence and learn about the domain.

Luckily, we will soon have time for better solutions.

Continue with another case.

1) Add a test (left expression containing the right one)

… and we are green, so we are covering the business rule stating that the order of the terms is not relevant (commutative property).

We make it explicit so no smart refactor can ever break it!

Move on with another (not so simple) case.

1) Add a test (capitalization is not relevant to the MySQL engine, but our users might not be aware of that).

2) We run the tests and the new one is broken. Let’s fix it!

3) The simplest solution for all the already written cases (with the new case).

… and tests are all green again with the ugly improved solution.

The code smells, and we now have several test cases. We need a better solution.

4) Let’s refactor the solution with a more efficient and readable one.
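One possible shape for the refactored version, as a Python sketch. It is still prefix-based and now ignores case; the helper name _is_redundant is an assumption.

```python
class LikePatternSimplifier:
    def simplify(self, patterns):
        return [p for p in patterns if not self._is_redundant(p, patterns)]

    def _is_redundant(self, pattern, patterns):
        # Redundant: a different pattern is a prefix of this one, ignoring case.
        # Patterns differing only by case are not covered by any test yet.
        return any(
            other != pattern and pattern.lower().startswith(other.lower())
            for other in patterns
        )


simplifier = LikePatternSimplifier()
mixed_case = simplifier.simplify(["Radiohead", "radio"])
```

The nested loops are gone and the rule has a name, while every earlier test case keeps passing.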

Let’s test the first production scenario requested by customers.

1) Add a test (two unrelated redundant prefixes).

And it works!

We inject it into our legacy code:

Before

Let’s inject it.
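The before and after could look roughly like this Python sketch. The two query-building functions, the artist alias and the %s placeholder style are hypothetical, since the real legacy code is not shown here; the simplifier is the prefix-based version.

```python
class LikePatternSimplifier:
    def simplify(self, patterns):
        return [p for p in patterns if not self._is_redundant(p, patterns)]

    def _is_redundant(self, pattern, patterns):
        return any(
            other != pattern and pattern.lower().startswith(other.lower())
            for other in patterns
        )


def artists_query_before(terms):
    """Hypothetical legacy code: one LIKE per term typed by the user."""
    conditions = " OR ".join("(artist.fullname LIKE %s)" for _ in terms)
    sql = f"SELECT * FROM artists artist WHERE ({conditions})"
    return sql, [f"%{term}%" for term in terms]


def artists_query_after(terms):
    """Same query, but redundant terms are dropped before building it."""
    return artists_query_before(LikePatternSimplifier().simplify(terms))


sql, params = artists_query_after(
    ["Arcade Fire", "Radiohead.", "Radiohead", "Sigur Ros", "Sigur"]
)
```

The only change to the legacy path is a single call to simplify(), which keeps the injection small and easy to review.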

What really happened

Software development is a group activity.

The quality assurance engineers found additional possible benefits.

The pattern could be in the middle of the string.

The customer agreed to add this functionality.

Let’s consider those cases.

New cases are broken since they were not represented by a previous one. We keep fixing them.

4) Let’s change the solution to cover all previous cases and the new ones.
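A Python sketch of the changed rule, assuming the only difference is that a term is now redundant when another term appears anywhere inside it, not just at the start. The artist names are examples.

```python
class LikePatternSimplifier:
    def simplify(self, patterns):
        return [p for p in patterns if not self._is_redundant(p, patterns)]

    def _is_redundant(self, pattern, patterns):
        # Substring check instead of prefix check, still ignoring case.
        return any(
            other != pattern and other.lower() in pattern.lower()
            for other in patterns
        )


simplifier = LikePatternSimplifier()
middle_of_string = simplifier.simplify(["Talking Heads", "head", "Radiohead"])
```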

The end is the beginning

  • Using CI/CD, the code fix went into production.
  • Happy ending.

Once we shipped the intelligent SQL simplifier, something bad happened.

This was the actual SQL generated after the terms were handled badly:

SELECT * FROM artists WHERE (())

The SQL generator mistakenly produced an empty condition.

So we will fix it the TDD way.

We isolate the defect and add it as a failing test case.

Of course, it fails, since the previous implementation returned an empty solution (and brought a customer complaint).

We can fix it by adding a case-insensitive duplicate remover as a pre-processor at the beginning of the simplify function:
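A Python sketch of that pre-processor, with a hypothetical private helper name. In this sketch, without the pre-processing step, ‘yes’ and ‘Yes’ would each be judged redundant because of the other, and the result would be empty.

```python
class LikePatternSimplifier:
    def simplify(self, patterns):
        unique = self._without_case_insensitive_duplicates(patterns)
        return [p for p in unique if not self._is_redundant(p, unique)]

    def _without_case_insensitive_duplicates(self, patterns):
        # Keeps the first spelling of each term and drops later case variants.
        seen = set()
        unique = []
        for pattern in patterns:
            key = pattern.lower()
            if key not in seen:
                seen.add(key)
                unique.append(pattern)
        return unique

    def _is_redundant(self, pattern, patterns):
        return any(
            other != pattern and other.lower() in pattern.lower()
            for other in patterns
        )


simplified = LikePatternSimplifier().simplify(["yes", "Yes"])
```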

To see if we must test private methods please visit shoulditestprivatemethods.com

Tests are green again.

With case-insensitive duplicates out of the way, the algorithm works again.

Let’s consider a different order.

Against our intuition, it fails.

This is because the unit returns ‘Yes’ instead of ‘yes’.

The solution depends on the product owner. We can:

1) Normalize all outputs.

2) Change the test, relying on the fact that our SQL engine is case-insensitive on text fields.

We choose 2)
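Option 2 can be sketched in Python as a custom assertion that compares terms ignoring case. The assertion name and the test name are assumptions, and the simplifier is the one with the duplicate-removing pre-processor.

```python
import unittest


class LikePatternSimplifier:
    def simplify(self, patterns):
        unique = []
        for pattern in patterns:
            # Keep the first spelling of each term, ignoring case.
            if pattern.lower() not in [kept.lower() for kept in unique]:
                unique.append(pattern)
        return [p for p in unique if not self._is_redundant(p, unique)]

    def _is_redundant(self, pattern, patterns):
        return any(
            other != pattern and other.lower() in pattern.lower()
            for other in patterns
        )


class LikePatternSimplifierTest(unittest.TestCase):
    def assert_same_terms_ignoring_case(self, expected, actual):
        self.assertEqual(
            [term.lower() for term in expected],
            [term.lower() for term in actual],
        )

    def test_duplicates_in_a_different_order(self):
        # simplify() returns ['Yes'] here, which is equivalent for the engine.
        self.assert_same_terms_ignoring_case(
            ["yes"], LikePatternSimplifier().simplify(["Yes", "yes"])
        )
```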

Tests are green again.

We add more tests considering mixed cases.

Missing Opportunities

The case went to peer review.

One of the reviewers asked about the NOT LIKE comparison and found an improvement opportunity.

We asked our on-site customer for agreement.

If a user chooses not to see ‘head’, they are also choosing not to see ‘Radiohead’ and ‘Talking Heads’.

In SQL: NOT LIKE ‘%head%’ implies NOT LIKE ‘%Radiohead%’, which is therefore redundant in an AND condition.

Our simplifier was already aware of that, so we injected it in a second place, confident that the tests already covered that scenario.
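A Python sketch of that second injection point. The function name excluded_artists_condition, the column reference and the %s placeholder style are hypothetical; the simplifier is reused unchanged.

```python
class LikePatternSimplifier:
    def simplify(self, patterns):
        unique = []
        for pattern in patterns:
            # Keep the first spelling of each term, ignoring case.
            if pattern.lower() not in [kept.lower() for kept in unique]:
                unique.append(pattern)
        return [p for p in unique if not self._is_redundant(p, unique)]

    def _is_redundant(self, pattern, patterns):
        return any(
            other != pattern and other.lower() in pattern.lower()
            for other in patterns
        )


def excluded_artists_condition(terms):
    """Hypothetical helper: NOT LIKE terms joined by AND, redundant ones dropped."""
    kept = LikePatternSimplifier().simplify(terms)
    condition = " AND ".join("(artist.fullname NOT LIKE %s)" for _ in kept)
    return condition, [f"%{term}%" for term in kept]


condition, params = excluded_artists_condition(["Radiohead", "head", "Talking Heads"])
```

The same rule applies in both places: under OR or under AND with NOT LIKE, the shorter contained term is the one that survives.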

Conclusions

  • Implementing TDD on a big system requires special techniques to gradually remove coupling.
  • TDD influenced all the written code. All the new code is covered (17 unit tests and 3 SQL generation tests).
  • We gained confidence with every new case we added, ensuring it didn’t break the previous ones.
  • We should only test private methods using method objects/FunctionAsObject or reflection.
  • We used TDD throughout development, code review, QA fixing and the production defect life cycle.
  • We developed a parallel customer system (the tests). They will always be our first, omnipresent user.
  • Solutions were as simple as possible.
  • It is entirely possible to do TDD on existing projects with lots of code.
  • TDD does not replace or overlap with the QA process and its tasks.
  • Multiple roles were involved and added value: developers, QA engineers, customers and code reviewers.
  • We faked it until we made it.
  • TDD does not guarantee a good design.
  • We should never change or optimize code that is not covered by tests.
  • If you like this article, please let me know so I can write more on TDD on legacy systems.

Legacy Code is all the code without tests.

Michael Feathers

Part of the objective of this series of articles is to generate spaces for debate and discussion on software design.

We look forward to comments and suggestions on this article.

This article is also in Spanish here.

Dev Genius

Coding, Tutorials, News, UX, UI and much more related to development

Written by Maximiliano Contieri

I’m a senior software engineer specialized in declarative designs. S.O.L.I.D. and agile methodologies fan.
