By: Niv Lipetz, Software Engineer @ ZOOZ
Requests per second (RPS), request latency, and overall system performance and reliability are fundamental concepts that need to be taken into account when designing a high-capacity API. With CI/CD becoming a common deployment methodology, deployments to production are a constant occurrence. How can we ensure that the core capabilities we initially designed into our system remain intact?
We built Predator.
Predator is an open-source performance framework we created for ourselves. We were fed up with the limitations of out-of-the-box solutions and the complication of writing complex custom tests. We designed Predator to manage the entire lifecycle of stress-testing a server, from creating a test file, to running scheduled and on-demand tests, and finally viewing the test results. Bootstrapped with a user-friendly UI alongside a simple REST API, Predator helps developers simplify the performance testing regime.
Initially, Blazemeter was our framework of choice. Able to run JMeter files and easily create heavy server load, it provided us with a decent solution. Nevertheless, the limited number of runners and the difficulty of writing tests that simulate our merchants’ flows led us to search for new solutions. We were missing a framework that would provide an end-to-end solution that was both economical and rich in features.
This is when we at ZOOZ, a technology company after all, decided to develop Predator as an end-to-end solution that would satisfy our needs. Our aim was to simplify performance testing and provide enough metrics to help us isolate performance problems.
We chose Artillery (https://github.com/artilleryio/artillery) as the engine that generates load over HTTP in Predator tests.
Developing our own framework internally gave us the freedom to design Predator exactly the way we needed. Some of its features include:
1. Domain Specific Language
Writing a performance test that checks specific parts of our API end-to-end used to be a hassle, but now it is effortless. Creating definitions for each DSL generates request templates. These can be reused in the same test and in other tests under the same DSL type, reducing duplication. Also, by using Artillery syntax, response values can be captured and used in later definitions, which makes the DSL definitions dynamic. Overall, this feature allows us to create scenarios that truly mimic our merchants’ flows with little effort and in minimal time.
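To make the idea of reusable request templates concrete, here is a minimal sketch of how named definitions could be expanded into test flows. The definition names, the `/payments` endpoints, and the `expand_flow` helper are hypothetical and chosen only for illustration; they are not Predator's actual DSL schema. The request steps follow Artillery's `capture` and `{{ variable }}` syntax.

```python
import copy

# Hypothetical DSL definitions: each name maps to one reusable request template.
# The endpoints and payloads are made up for this example.
DEFINITIONS = {
    "create_payment": {
        "post": {
            "url": "/payments",
            "json": {"amount": 100},
            # Artillery-style capture: store the response's id for later steps.
            "capture": {"json": "$.id", "as": "paymentId"},
        }
    },
    "get_payment": {"get": {"url": "/payments/{{ paymentId }}"}},
}


def expand_flow(step_names, definitions):
    """Turn a list of definition names into a list of concrete request steps."""
    flow = []
    for name in step_names:
        if name not in definitions:
            raise KeyError(f"unknown DSL definition: {name}")
        # Copy the template so one test cannot modify it for the others.
        flow.append(copy.deepcopy(definitions[name]))
    return flow


# Two different flows built from the same definitions.
checkout_flow = expand_flow(["create_payment", "get_payment"], DEFINITIONS)
status_poll_flow = expand_flow(
    ["create_payment", "get_payment", "get_payment"], DEFINITIONS
)
```

Each request is written once and referenced by name, so a change to an endpoint only has to be made in one place.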
2. API Functional Testing
The DSL is important for testing the product as a whole, but a performance problem can usually be traced to a smaller source, such as a buggy feature or service. For this reason, API Functional Testing is included. Writing a test flow that targets an array of endpoints, with the ability to extract response values for use in later requests, is as straightforward as it can be.
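As an illustration of such a flow, the sketch below builds an Artillery-style test script in which a value from the first response is captured and fed into the second request. The target URL, the `/tokens` and `/payments` endpoints, and the `build_script` helper are assumptions made for this example; only the overall script layout (`config`, `phases`, `scenarios`, `flow`, `capture`) follows Artillery's format.

```python
import json


def build_script(target, arrival_rate, duration_seconds):
    """Return an Artillery-style test script as a plain dict."""
    return {
        "config": {
            "target": target,
            # One load phase: `arrival_rate` new virtual users per second.
            "phases": [{"duration": duration_seconds, "arrivalRate": arrival_rate}],
        },
        "scenarios": [
            {
                "name": "tokenize then pay",
                "flow": [
                    # Step 1 (hypothetical endpoint): create a token and capture it.
                    {
                        "post": {
                            "url": "/tokens",
                            "json": {"card_number": "4111111111111111"},
                            "capture": {"json": "$.token", "as": "token"},
                        }
                    },
                    # Step 2 (hypothetical endpoint): reuse the captured value.
                    {
                        "post": {
                            "url": "/payments",
                            "json": {"token": "{{ token }}", "amount": 100},
                        }
                    },
                ],
            }
        ],
    }


script = build_script("http://localhost:3000", arrival_rate=10, duration_seconds=60)
print(json.dumps(script, indent=2))
```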
3. Live Reporting
Seamless integration with Prometheus, InfluxDB, and Grafana enables the creation of custom dashboards and metrics, thus adding another important dimension to the test results. Moreover, Predator generates a native HTML report, providing a sleek report interface. These are updated constantly throughout the test run with the test’s real-time results.
4. Test History and Data Retention
Test history and data retention are critical when analyzing a system’s performance over time. These, however, can get pricey when using cloud-based frameworks. With support for both relational databases (SQLite, MSSQL, MySQL, Postgres) and NoSQL databases (Cassandra), saving past test results is included out-of-the-box. Data retention is based on the user’s database configuration.
5. Scaling Tests
Before Predator, running two or more tests simultaneously was restricted by third-party limitations. Now, we are able to scale tests using our own resources. With support for both Kubernetes and Metronome, all a user needs to do is provide the platform configuration, sit back, and let Predator deploy runners that load the user’s API from their chosen platform. A local deployment adapter is also included, allowing deployment of runners on the user’s local machine.
6. User Interface
Predator is bootstrapped with a UI that consolidates this framework into one user-friendly interface alongside a simple REST API to create, run, and view tests along with their results.
These features guided us throughout our development, and they shape the core ideology of Predator.
Loading Our System
1. Nightly Tests:
Multiple deployments a day to different parts of a codebase can potentially harm the system’s performance. To protect ourselves, we configured Predator to perform nightly tests via a cron mechanism. These tests ensure that the entire system is hit end-to-end.
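A scheduled test of this kind boils down to a test reference, a load profile, and a cron expression. The sketch below shows one way such a job could be described. The field names (`test_id`, `arrival_rate`, `duration`, `cron_expression`, `run_immediately`) and the `nightly_job` helper are illustrative assumptions rather than a documented schema, and the expression uses the standard five-field cron format (minute, hour, day of month, month, day of week), which may differ from what a given scheduler expects.

```python
def nightly_job(test_id, hour, arrival_rate, duration_seconds):
    """Describe a job that runs one test every night at the given hour."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    return {
        # All field names below are assumptions made for this sketch.
        "test_id": test_id,
        "arrival_rate": arrival_rate,  # new virtual users per second
        "duration": duration_seconds,  # length of the test run
        "cron_expression": f"0 {hour} * * *",  # minute 0 of `hour`, every day
        "run_immediately": False,  # wait for the schedule instead
    }


job = nightly_job("end-to-end-flow", hour=2, arrival_rate=50, duration_seconds=600)
print(job["cron_expression"])
```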
2. Test On-Demand:
Whether a developer needs to test a feature in isolation or an enhancement before deployment, or whether a system administrator needs to simulate a certain stress load in a development environment, Predator provides the ability to run an immediate test.
3. Part of a CI/CD pipeline:
In the effort to minimize performance regressions, the responsibility ultimately lies with each individual service to make sure it does not deploy any bugs. By running performance tests as a step in a CI/CD pipeline, developers can make sure their deployment does not degrade the system’s performance footprint.
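One simple way to turn a performance test into a pipeline step is to compare its results against a budget and fail the build when the budget is exceeded. The following is a minimal sketch of such a gate, independent of any particular tool. The sample latencies, the thresholds, and the function names are made up for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def check_performance(latencies_ms, total_requests, failed_requests,
                      max_p95_ms, max_error_rate):
    """Return a list of budget violations; an empty list means the gate passes."""
    violations = []
    p95 = percentile(latencies_ms, 95)
    if p95 > max_p95_ms:
        violations.append(f"p95 latency {p95} ms exceeds budget of {max_p95_ms} ms")
    error_rate = failed_requests / total_requests
    if error_rate > max_error_rate:
        violations.append(
            f"error rate {error_rate:.2%} exceeds budget of {max_error_rate:.2%}"
        )
    return violations


def gate_exit_code(violations):
    """Map violations to a process exit code: 0 passes the step, 1 fails it."""
    return 1 if violations else 0


# Made-up results from a test run: one slow outlier among ten requests.
latencies = [120, 130, 110, 900, 125, 140, 115, 135, 128, 122]
violations = check_performance(
    latencies, total_requests=10, failed_requests=0,
    max_p95_ms=500, max_error_rate=0.01,
)
for message in violations:
    print(message)
# A CI step would end with: sys.exit(gate_exit_code(violations))
```

Returning the exit code from a function rather than exiting directly keeps the check easy to unit-test; the pipeline script only needs the final `sys.exit` call.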
Ever since we integrated performance testing into our daily routine, our system’s performance has improved drastically. In our next blog post, we will share the fine-tunings that worked for us, as well as how we diagnosed our system’s weak spots and the actions we took to overcome them.
Moreover, we believe what helped us will help others. This led us to open-source Predator, with a scheduled release date at the end of March 2019.