Best Approaches to Testing in Production

Jun 4, 2020

In today’s increasingly complex and risky world of rapid software delivery, testing in production has become a key element in the product validation toolbox.

To maintain quality in production environments, developers must be prepared to handle the bugs users encounter and resolve them quickly. To achieve this, continuous monitoring and testing after the staging process are essential.

This article explains why production testing is needed and describes the best approaches for testing applications after release. By the end, you will have a clear idea of the tools and techniques to use for efficient testing.

Why Test in Production

As a best development practice, developers perform multiple tests during the development stage. However, no matter how exhaustive these tests are, they do not assure stable performance after a system is released.

This is because most of the challenges experienced in a live environment cannot be foreseen during the initial development stages. Additionally, it is hard to simulate the real production environment.

Testing in production is the best way to ensure an application runs smoothly even after introducing changes to the codebase. Here are the main reasons why every development team should be performing tests in the production environment:

Real-time performance monitoring: Unlike tests performed during development, production testing monitors software performance in the real world. Testing the application in a dynamic environment, where user traffic and application data keep changing, is the best way to get a clear picture of its performance and resolve potential issues.
Improved bug detection: Production testing helps uncover bugs and security vulnerabilities that went unnoticed during earlier testing. Code running in a live environment is exposed to inputs and conditions that earlier tests did not cover, so these bugs are more likely to surface.
Higher resilience and quicker recovery: Testing in production makes an application more resilient in the face of disaster. It also improves the application's ability to recover when such events occur. This is especially important for avoiding loss of functionality or user data in the real world.
Reduced risk from frequent deployments: Every change made to the codebase calls for a new deployment. This is risky if issues related to the new deployment go unnoticed. To avoid hurting the user experience in the live environment, development teams need to perform comprehensive production tests.
Maintained application quality: Testing with production data helps detect real-time system failures, network failures, interruptions, poor connections, and other errors. Even if testing in the staging environment was not performed adequately, developers can use beta programs that allow customers to provide feedback on new product features and their experience.

Approaches to Production Testing

As production testing finds its place in the software lifecycle, there is an increased need to test across the three phases of production, i.e., the deploy phase, the release phase, and the post-release phase.

This can be done in several ways. In this article, the testing approaches are grouped into two broad categories according to the tools used. These are continuous monitoring and deployment/release strategies.

Using Performance Monitoring and Testing Strategies

Monitoring tools help development teams identify issues before they affect the user base. They also provide a better understanding of how system resources are used, so it is easier to determine whether your application can meet the scaling demands in a production environment.

Monitoring generally combines tracing, logging, and metrics. In practice, the three overlap and complement each other in production testing and debugging.

Tracing

Tracing involves using tools that can capture and visualize specific workflows in a system. Tracing is especially important for distributed systems where multiple users can request the same service simultaneously.

It is also an ideal strategy in applications where a single user requests multiple back-end services, or submits multiple requests to the same service. Tracing tools help developers identify the load at which a system reaches its threshold and its usability or services start to degrade.

A good tracing example is when you run automated API tests and get a clear picture of how your application works. For instance, you can use Loadmill to create user flows from a real session in your application. The flows are generated based on specific parameters and then run in the test suite.
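To make the idea of capturing a workflow concrete, here is a minimal sketch of a tracer that records one span per back-end call within a single user request. The `Tracer` class, its methods, and the service names are hypothetical and written only for illustration; they are not the API of any tracing tool. A fake clock is injected so the timings are reproducible.

```javascript
// Minimal in-memory tracer. All names are hypothetical, not a real tracing API.
class Tracer {
  // `now` is injectable so the example can run with a fake clock.
  constructor(now = Date.now) {
    this.now = now;
    this.spans = [];
    this.nextSpanId = 1;
  }

  // Open a span for one unit of work inside the request identified by traceId.
  startSpan(traceId, name, parentId = null) {
    const span = {
      traceId,
      spanId: this.nextSpanId++,
      parentId,
      name,
      startMs: this.now(),
      durationMs: null,
    };
    this.spans.push(span);
    return span;
  }

  // Close the span and record how long the work took.
  endSpan(span) {
    span.durationMs = this.now() - span.startMs;
  }

  // The slowest finished child span of a request: a first place to look
  // when latency starts to degrade.
  slowestChild(traceId) {
    const children = this.spans.filter(
      (s) => s.traceId === traceId && s.parentId !== null && s.durationMs !== null
    );
    return children.reduce(
      (worst, s) => (worst === null || s.durationMs > worst.durationMs ? s : worst),
      null
    );
  }
}

// Simulate one user request that calls two back-end services.
let clock = 0;
const tracer = new Tracer(() => clock);

const root = tracer.startSpan("req-1", "GET /checkout");
const cart = tracer.startSpan("req-1", "cart-service", root.spanId);
clock += 30; // cart-service takes 30 ms
tracer.endSpan(cart);
const pay = tracer.startSpan("req-1", "payment-service", root.spanId);
clock += 120; // payment-service takes 120 ms
tracer.endSpan(pay);
tracer.endSpan(root);

console.log(tracer.slowestChild("req-1").name); // payment-service
```

Because every span carries the same trace ID, the spans of one request can be grouped even when many users hit the same service at once.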

Logs

Logs are essential when it comes to testing and debugging applications. Through logs, we can identify events that lead to a failure, thus making it easier to resolve errors and minimizing costly downtimes. However, some developers still rely on unstructured logs.

A good practice is to invest in structured, centralized logs for smoother monitoring, testing, and debugging. You can use a tool like Loggly to capture well-structured logs during production testing.
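As a rough illustration of what structured means here, the sketch below writes each log entry as a single JSON line with consistent fields, which a centralized log service can then index and search. The `formatLog` function, the `logger` object, and the field names are assumptions made for this example, not the format required by any particular tool.

```javascript
// Build one structured log entry as a single JSON line.
// Field names are illustrative; a real setup would follow the log platform's schema.
function formatLog(level, message, context = {}, now = new Date()) {
  return JSON.stringify({
    timestamp: now.toISOString(),
    level,
    message,
    ...context,
  });
}

// Hypothetical logger that writes JSON lines to stdout/stderr,
// where a log shipper could collect them.
const logger = {
  info: (message, context) => console.log(formatLog("info", message, context)),
  error: (message, context) => console.error(formatLog("error", message, context)),
};

logger.error("payment failed", {
  requestId: "req-1",
  userId: 42,
  reason: "card_declined",
});
```

Compared with a free-text message, an entry like this can be filtered by `requestId` or `reason` without fragile string matching.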

Metrics

Just like logs, metrics are essential for error detection. However, they are more intentional: you choose in advance what to measure, so a metric can tell you directly when something is going wrong.

Metrics are also essential when making deployment decisions. For instance, it would be inappropriate to deploy code that shows a significant performance regression.

You can use metrics to track certain types of errors during production testing and identify the impact of changes made to the codebase. Their application goes beyond testing in production in that a cohesive metric collection strategy is key to continuous software delivery.

A practical example of using metric tools is when a developer performs load testing after ensuring an API is free from regression bugs. Using Loadmill, you can replicate production traffic in your test environment to capture real-world system performance.

This data can be used to prevent issues and performance bottlenecks after release.
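One way to turn such measurements into a deployment decision is a simple gate that compares the candidate build against the current baseline. The sketch below is only an assumption about how that could look: the `shouldDeploy` function, the 95th-percentile latency check, and the threshold values are all hypothetical choices for illustration.

```javascript
// Nearest-rank percentile of a list of numbers (p between 0 and 100).
function percentile(values, p) {
  if (values.length === 0) throw new Error("no samples");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Fraction of requests that failed.
function errorRate(errors, total) {
  return total === 0 ? 0 : errors / total;
}

// Hypothetical deployment gate: block the release when the candidate build
// is noticeably slower or more error-prone than the current baseline.
function shouldDeploy(
  baseline,
  candidate,
  { maxLatencyIncrease = 0.1, maxErrorRate = 0.01 } = {}
) {
  const baseP95 = percentile(baseline.latenciesMs, 95);
  const candP95 = percentile(candidate.latenciesMs, 95);
  const latencyOk = candP95 <= baseP95 * (1 + maxLatencyIncrease);
  const errorsOk = errorRate(candidate.errors, candidate.requests) <= maxErrorRate;
  return latencyOk && errorsOk;
}

const baseline = { latenciesMs: [100, 110, 120, 130, 140], errors: 1, requests: 1000 };
const candidate = { latenciesMs: [100, 115, 125, 180, 240], errors: 2, requests: 1000 };

console.log(shouldDeploy(baseline, candidate)); // false: p95 rose from 140 ms to 240 ms
```

The thresholds are the part to tune per system; the point is that the decision is made from collected metrics rather than by inspection.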

Using Deploy and Release Strategies

Blue-Green Deploys

Blue-green deployment, sometimes referred to as red-black deployment, involves setting up two identical production environments. In this technique, a full copy of the system (a monolith or a single service) is kept on standby while traffic is routed to the live deployment.

If the blue one is live and the green is non-live, testing will take place in the green (non-live) environment. Once testing in the green environment is complete, the router is switched such that all incoming traffic is redirected to the green, which becomes the live environment, and the blue becomes non-live.

This strategy allows developers to monitor system behavior and performance, with the ability to switch back in case there are issues.

While the blue-green strategy helps eliminate downtime when testing after release, it is complex to set up, and how well it works depends heavily on the system.
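The switch between the two environments can be sketched as follows. This is a simplified model under stated assumptions: the `BlueGreenRouter` class and the environment objects are hypothetical stand-ins, and in a real system the router would be a load balancer or DNS entry rather than an in-process object.

```javascript
// Minimal sketch of a blue-green switch. The environments are stand-in objects.
class BlueGreenRouter {
  constructor(blue, green) {
    this.environments = { blue, green };
    this.live = "blue";
  }

  // Name of the environment that is currently not serving users.
  get idle() {
    return this.live === "blue" ? "green" : "blue";
  }

  // User traffic always goes to the live environment.
  handle(request) {
    return this.environments[this.live].handle(request);
  }

  // Test the idle environment, then switch traffic only if the check passes.
  promote(healthCheck) {
    const candidate = this.environments[this.idle];
    if (!healthCheck(candidate)) return false;
    this.live = this.idle;
    return true;
  }

  // Switch back to the previous environment if problems appear after the cut-over.
  rollback() {
    this.live = this.idle;
  }
}

const blue = { version: "1.0", handle: (req) => `v1.0 handled ${req}` };
const green = { version: "1.1", handle: (req) => `v1.1 handled ${req}` };
const router = new BlueGreenRouter(blue, green);

router.promote((env) => env.handle("ping").includes("handled"));
console.log(router.handle("GET /")); // v1.1 handled GET /
router.rollback();
console.log(router.handle("GET /")); // v1.0 handled GET /
```

Keeping the previous environment intact after the cut-over is what makes the rollback a single switch rather than a redeploy.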

Canary Testing

The canary testing strategy involves rolling out a product version or new feature to a small subset of end users to validate system functionality.

In this strategy, the traffic routed to the canaries can be selected by several criteria, such as work type, customer group, demographic, geographical location, or time of day.

The testing team will then monitor system behavior to identify any issues before releasing the product to a larger audience. Doing this allows you to test in a live production environment without risking severe bug impact.

At a high level, canary testing helps control the category and number of users affected during production testing.
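A common way to control which users are affected is to hash a stable user identifier into a bucket and compare it with the rollout percentage. The sketch below assumes this approach; the function names, the region rule, and the choice of the FNV-1a hash are illustrative assumptions rather than a prescribed method.

```javascript
// FNV-1a hash of a string, as an unsigned 32-bit integer.
// It works on UTF-16 code units, which matches byte-wise FNV-1a for ASCII input.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Decide whether a user belongs to the canary group.
// Hashing the user ID keeps each user in the same group on every request.
function isCanaryUser(userId, canaryPercent) {
  return fnv1a(String(userId)) % 100 < canaryPercent;
}

// Hypothetical routing rule: only users in an allowed region can be canaries.
function pickVersion(user, { canaryPercent, regions }) {
  const eligible = regions.includes(user.region);
  return eligible && isCanaryUser(user.id, canaryPercent) ? "canary" : "stable";
}

const rollout = { canaryPercent: 5, regions: ["eu-west"] };

console.log(pickVersion({ id: "user-123", region: "eu-west" }, rollout));
console.log(pickVersion({ id: "user-123", region: "us-east" }, rollout)); // stable: region not in the rollout
```

Raising `canaryPercent` gradually widens the audience, and because the assignment is deterministic, users already in the canary group stay in it.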

Feature Flags

Feature flags, also known as feature toggles or feature switches, are code-level wrappers that allow developers to test new system features in production. In this technique, toggles are used to release features to system users under certain logical conditions.

They provide an easy way to control the visibility of a particular feature, thereby allowing immediate rollback if necessary.

In its most basic form, a feature flag is a conditional wrapped around the new code path, with the condition read from configuration at run time.

Feature switches therefore allow you to turn system functionality on and off without redeploying code. If the feature flag is on, the new code is executed.

Similarly, if the feature flag is off, the new code is skipped. This allows you to capture accurate real-world data when testing in production.

Below is a JavaScript example showing a toggled feature.
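This is a minimal sketch, assuming a hypothetical in-memory flag store and made-up checkout functions. A real project would typically read flag values from a configuration service or a feature-flag product so that they can be changed without a redeploy.

```javascript
// Hypothetical in-memory flag store. In production the values would come from
// a config service or database so they can change without a redeploy.
const flags = new Map([["new-checkout", false]]);

function isEnabled(flagName) {
  return flags.get(flagName) === true;
}

function legacyCheckout(cart) {
  return { flow: "legacy", total: cart.total };
}

function newCheckout(cart) {
  return { flow: "new", total: cart.total };
}

// The flag wraps the new code path: on runs it, off skips it.
function checkout(cart) {
  return isEnabled("new-checkout") ? newCheckout(cart) : legacyCheckout(cart);
}

console.log(checkout({ total: 30 }).flow); // legacy
flags.set("new-checkout", true); // turn the feature on at run time
console.log(checkout({ total: 30 }).flow); // new
flags.set("new-checkout", false); // immediate rollback
```

Unknown flags evaluate to off in this sketch, so a missing configuration entry falls back to the existing behavior.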

Conclusion

Software systems can fail in many ways, some of which can have a severe impact on users.

Testing in production, if done efficiently, can help eliminate most system failures and increase confidence in the system’s resiliency. While production testing is not entirely risk-free, it can help development teams continuously deliver top-notch software and improve reliability in even the most complex modern systems.

At its core, testing in production not only helps organizations deal with system anomalies more effectively, but also improves user experience, strengthens brand reputation, and generates more revenue. It should, therefore, be considered an essential part of the testing pipeline and the overall software delivery process.