How did we implement smoke tests for our CI/CD process in Trendyol?

Smoke testing in CI/CD process

Seyma Yilmaz
Trendyol Tech
4 min read · Jun 6, 2022


Art by Mindful QA

As the Homepage & Recommendation Team, we wanted safe deployments, so we wrote smoke tests to achieve this.

Let’s walk through our journey and explain why and how we did it.

How did we start?

We wanted to switch to Continuous Deployment as a team, so we started our own Canary Deployment process with customized details.

Our process was like this:

  1. Run smoke tests with no traffic on the candidate deployment.
  2. If tests succeed, gradually increase traffic.
  3. If tests fail, roll back to the previous deployment.
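In a tool like Argo Rollouts (we show the Argo UI later in this story), these steps can be sketched as a canary strategy. This is a minimal sketch only; the weights, pause durations, and template name below are illustrative assumptions, not our exact configuration:

```yaml
strategy:
  canary:
    steps:
      - setWeight: 0            # candidate is deployed, but gets no traffic yet
      - analysis:               # run the smoke tests against the candidate
          templates:
            - templateName: smoke-test
      - setWeight: 25           # tests passed: start shifting traffic gradually
      - pause: { duration: 5m }
      - setWeight: 50
      - pause: { duration: 5m }
      - setWeight: 100
# If the analysis step fails, the rollout is aborted and traffic
# stays on the stable (previous) version.
```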

That’s why we needed smoke tests.
For more details, click here: Blue-Green or Canary? Why not both?

You can think of this story as Part 2 of our journey.

Now let’s talk about smoke testing a little bit.

What is smoke testing and how do we do that?

Smoke tests qualify the build for further formal testing. The main aim of smoke testing is to detect major issues early. Smoke tests are designed to demonstrate system stability and conformance to requirements.

For more details, click here (an article from Guru99).

In our team, both QA Engineers and Developers take an active part in testing and development processes to stay cross-functional; this helps us share know-how and improve our way of thinking.

Our Implementation

We have end-to-end automation tests for the staging environment that cover all the cases (with coverage tracking too; we’ll talk about it in different stories).

Example automation test code:
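The code itself was shown as an image in the original post. As a rough illustration only (the endpoint, field names, and response shape are all hypothetical, and the real tests are TestNG @Test methods), an automation test asserts on the business logic of the response, not just the status code:

```java
public class AutomationTestSketch {

    // In the real project this check lives inside a TestNG @Test method;
    // it is shown here as a pure helper so the sketch stays self-contained.
    // Hypothetical business rule: the response must contain a non-empty
    // "products" array. (A real test would use a JSON library.)
    static boolean hasRecommendations(String jsonBody) {
        int idx = jsonBody.indexOf("\"products\"");
        if (idx < 0) return false;
        int open = jsonBody.indexOf('[', idx);
        int close = jsonBody.indexOf(']', open);
        return open >= 0 && close > open + 1;
    }

    public static void main(String[] args) {
        // Sample payload standing in for a real API response.
        String sample = "{\"products\":[{\"id\":1}]}";
        if (!hasRecommendations(sample)) {
            throw new AssertionError("expected recommendations in payload");
        }
        System.out.println("payload check passed");
    }
}
```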

When automation tests fail, the process automatically stops.

We wanted the same behavior in the production environment.

But in the production environment, since we are gradually increasing traffic, we wanted smoke tests to act as a simple stopping point. So instead of covering all combinations, they simply make API calls and verify that the service returns 2xx response codes.

There are fewer test cases because the tests contain no business-logic checks.

How did we implement it?

Our Canary Deployment Process

For the smoke tests, we use our automation test project but run it with a different profile.

We use TestNG for our automation tests, so we kept that intact.

TestNG is an automation testing framework inspired by JUnit that relies on annotations (@). It overcomes JUnit’s disadvantages and is designed to make end-to-end testing easy.

Using TestNG, you can generate a proper report and easily see how many test cases passed, failed, or were skipped. You can also execute the failed test cases separately.
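A separate profile can be driven by its own TestNG suite XML that includes only the smoke group. This is a hedged sketch; the suite, group, and class names are assumptions, not our actual files:

```xml
<!-- smoke.xml: run only tests tagged with the "smoke" group -->
<suite name="smoke-suite">
  <test name="smoke">
    <groups>
      <run>
        <include name="smoke"/>
      </run>
    </groups>
    <classes>
      <class name="com.trendyol.homepage.SmokeTests"/>
    </classes>
  </test>
</suite>
```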

Example smoke test code:
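The original post showed this as a screenshot, too. Below is a minimal sketch of the idea (the endpoint, class name, and environment-variable name are assumptions; the real project wraps this in a TestNG @Test): call the API and only check for a 2xx status.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SmokeTestSketch {

    // A smoke test only cares that the service answers with a 2xx code.
    static boolean is2xx(int statusCode) {
        return statusCode >= 200 && statusCode < 300;
    }

    // Fire a GET request and return the status code.
    static int get(String url) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    }

    public static void main(String[] args) throws Exception {
        // The candidate deployment's URL is passed in from outside.
        String baseUrl = System.getenv("SMOKE_TEST_URL");
        if (baseUrl == null) {
            System.out.println("SMOKE_TEST_URL not set; nothing to check.");
            return;
        }
        int status = get(baseUrl + "/recommendations"); // hypothetical endpoint
        if (!is2xx(status)) {
            throw new AssertionError("Smoke test failed, status: " + status);
        }
    }
}
```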

As you can see above, smoke test cases are shorter than automation test cases, since we aren’t checking business logic in smoke tests.
We already do that in the automation test process.

We pass the smoke test URL in as an environment variable from outside, since the smoke tests need to run against the candidate deployment.

This deployment has no external address (it’s a candidate deployment, and Istio manages the traffic control), so we run our smoke tests against the service discovery URL.

The cluster can reach lorem-ipsum-api-canary via service discovery, so we can always be sure we are sending requests to the new code.

Smoke test step in Argo UI.

If you want, you can also add this as a template in your root and pass the URL and XML from outside, such as:
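The template was shown as a screenshot in the original post. A hedged sketch of what such a template might look like, assuming an Argo Rollouts AnalysisTemplate that runs the test image as a Kubernetes Job (the names, image, and parameter names are all assumptions):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: smoke-test
spec:
  args:
    - name: smoke-test-url   # target URL, passed from outside
    - name: testng-xml       # TestNG suite XML to run, passed from outside
  metrics:
    - name: smoke-test
      provider:
        job:
          spec:
            backoffLimit: 0
            template:
              spec:
                restartPolicy: Never
                containers:
                  - name: smoke-test
                    image: our-registry/lorem-ipsum-automation:latest  # hypothetical
                    env:
                      - name: SMOKE_TEST_URL
                        value: "{{args.smoke-test-url}}"
                    args: ["{{args.testng-xml}}"]
```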

Then, in the canary analysis, pass it like this:
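Here, too, the original snippet was an image. A sketch of how the arguments could be passed in the canary analysis step (the template name, suite file, and URL are illustrative; the service discovery address comes from the post):

```yaml
strategy:
  canary:
    steps:
      - analysis:
          templates:
            - templateName: smoke-test
          args:
            - name: smoke-test-url
              value: http://lorem-ipsum-api-canary  # service discovery URL
            - name: testng-xml
              value: smoke.xml                      # hypothetical suite file
```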

Then, change the image to our smoke test image as a final step:
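The final snippet was also a screenshot. As a sketch only (the image name and environment variable are assumptions), the job’s container is pointed at the smoke test image:

```yaml
containers:
  - name: smoke-test
    image: our-registry/lorem-ipsum-smoke-test:latest  # hypothetical image
    env:
      - name: SMOKE_TEST_URL
        value: http://lorem-ipsum-api-canary
```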

Thanks to Emre Tanrıverdi for his help.

Hopefully, see you later in another story. 👋

Thank you for reading! ✨

If you want to give me feedback feel free to reach me via LinkedIn. 👩‍💻
