High-coverage functional testing of microservices applications is critical to prevent regressions
One of my friends, who manages an engineering team of about 100 engineers, went through a migration of a monolithic application to a microservices-based architecture. His team develops a software application that is used by several thousand users across their customers. I wanted to understand more about his experience after they migrated to microservices. One of the comments that stood out for me was “the importance of functional testing has increased many-fold and it is risky to release new versions.” This comment surprised me since I associated microservices with higher development velocity.
The overall application consists of several microservices with a rich variety of interactions between them, creating a complex labyrinth. Each of these interactions must now be protected with functional tests. It is not enough to test only the contracts (the schema of message exchanges); the range of parameter values in requests and responses must also be tested. Many of these values encode business logic and hence must be protected with tests to prevent regressions.
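The difference between testing a contract and testing values can be sketched in a few lines of Python. This is only an illustration: the response fields (`order_id`, `status`, `discount_pct`) and the allowed ranges are hypothetical and not taken from any real service.

```python
# Hypothetical response fields and rules, invented for illustration only.
ALLOWED_STATUSES = {"APPROVED", "DECLINED", "PENDING"}


def satisfies_contract(response: dict) -> bool:
    """Schema-level check: required fields exist and have the expected types."""
    return (
        isinstance(response.get("order_id"), str)
        and isinstance(response.get("status"), str)
        and isinstance(response.get("discount_pct"), int)
    )


def satisfies_business_rules(response: dict) -> bool:
    """Value-level check: fields fall inside the range the consumer relies on."""
    return (
        response["status"] in ALLOWED_STATUSES
        and 0 <= response["discount_pct"] <= 100
    )


good = {"order_id": "A-1", "status": "APPROVED", "discount_pct": 15}
# A regressed producer: same schema, but values the consumer does not expect.
regressed = {"order_id": "A-1", "status": "approved", "discount_pct": 150}

for name, resp in [("good", good), ("regressed", regressed)]:
    print(name, satisfies_contract(resp), satisfies_business_rules(resp))
```

The regressed response still passes the contract check, so a test suite that stops at the schema would let it through. Only the value-level check catches it.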
What are the main reasons behind the need for high-coverage functional testing?
One of the main sources of problems is the independent development and release of services. Such loosely coordinated development allows bugs due to subtle mismatches in assumptions to creep in relatively easily.
Developers tend to make subtle assumptions about the outputs they produce for other services and the outputs they consume from the services they depend on. All such assumptions must be protected with functional tests. Otherwise, the code based on them is just waiting to cause regressions in functionality. We outline a few simple examples below.
Data type mismatches: Data types beyond basic ones such as integer and string are not strongly declared in JSON message exchanges among services. In monolithic applications, a large class of software changes that may cause type mismatches is caught by the compiler. Across microservices, all such checks enforcing type matches must now be done by functional testing. Other message encodings such as Thrift and Protobufs address this issue by declaring field types in a schema.
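As a rough sketch of how such a mismatch can slip through silently, consider a producer that serializes a decimal amount to JSON. The function names and the invoice fields here are hypothetical; only the standard-library `json` and `decimal` modules are used.

```python
import json
from decimal import Decimal


def produce_invoice() -> str:
    """Hypothetical producer: JSON has no decimal type, so it falls back to str."""
    invoice = {"invoice_id": 42, "amount": Decimal("19.99")}
    return json.dumps(invoice, default=str)


def parse_invoice(payload: str) -> dict:
    """Hypothetical consumer-side parser that checks and restores the types."""
    raw = json.loads(payload)
    if not isinstance(raw["invoice_id"], int):
        raise TypeError("invoice_id must be an integer")
    if not isinstance(raw["amount"], str):
        raise TypeError("amount must be a decimal string")
    return {"invoice_id": raw["invoice_id"], "amount": Decimal(raw["amount"])}


payload = produce_invoice()

# A consumer that assumes "amount" is numeric gets a string instead.
naive = json.loads(payload)
assert isinstance(naive["amount"], str)
# Doubling it raises no exception; it just repeats the string.
assert naive["amount"] * 2 == "19.9919.99"

# A functional test on the parsed message pins down both type and value.
invoice = parse_invoice(payload)
assert invoice["amount"] * 2 == Decimal("39.98")
```

Nothing fails loudly in the naive path, which is why this kind of assumption needs an explicit test rather than relying on an error surfacing at runtime.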
Exception handling: Since these services are developed independently, failure handling can differ from one service to another. Consider scenarios where developers have to handle information-starved or erroneous paths: they make assumptions when creating responses. The general reaction, in the interest of getting unblocked quickly, is “let me make this work and I will verify this assumption.” If such assumptions in responses don’t align across services, the overall functional behavior could be incorrect.
Response drops and delays: Each service must now also proactively protect itself against scenarios in which a service it depends on responds late or never responds at all.
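One way to exercise the late-response case in a test is to wrap the dependency call in a timeout with a fallback. The sketch below simulates the dependency with a sleeping function instead of a real network call. The helper `call_with_timeout` and the recommendation functions are hypothetical names; in practice the timeout would usually be set on the HTTP or RPC client itself.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def call_with_timeout(fn, timeout_s: float, fallback):
    """Run fn in a worker thread; return fallback if it does not finish in time."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        return fallback
    finally:
        # Do not make the caller wait for the slow dependency to finish.
        pool.shutdown(wait=False)


def fast_recommendations():
    """Simulated dependency that answers immediately."""
    return ["item-1", "item-2"]


def slow_recommendations():
    """Simulated dependency that answers after the caller has given up."""
    time.sleep(0.5)
    return ["item-1", "item-2"]


# The caller gets real data when the dependency is healthy...
assert call_with_timeout(fast_recommendations, 0.2, []) == ["item-1", "item-2"]
# ...and a well-defined fallback when it responds too late.
assert call_with_timeout(slow_recommendations, 0.05, []) == []
```

The functional test then asserts on the fallback behavior, so a later change that removes or lengthens the timeout shows up as a test failure rather than as a hung request in production.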
High Engineering Effort for Achieving High Coverage Functional Testing
For the reasons discussed above, and possibly others, protecting the functionality of a microservices application from regressions caused by loosely coordinated development requires a comprehensive set of functional tests. The challenge, however, is that achieving comprehensive test coverage significantly increases the engineering cost.
Engineers or product managers have to develop the test scenarios, test cases, assertions, and scripts. All of these have to be maintained as the software evolves. Hence, engineering effort is split between developing new functionality and creating the test infrastructure and tests that prevent regressions in existing functionality. The latter often takes at least 25% of all engineering effort. Each bug that slips through the cracks requires a significant amount of customer support and engineering effort to diagnose, reproduce, and fix. As a result, engineering teams are unable to fully achieve the promise of high development velocity with microservices.
As microservices adoption grows across engineering teams, it becomes critical for us to address this growing problem of having to invest a larger fraction of engineering effort in comprehensive functional testing. We need significant innovation in the development tools needed to address this already critical and ballooning problem. We are developing such a solution at Mesh Dynamics.