PROD vs UAT: How to make sure a UAT environment is a good replication of Prod and vice-versa?

Marcio Moreno · Published in Proof Reading · Sep 29, 2022 · 12 min read

A very common issue many organizations face is how to make sure a test environment (UAT, User Acceptance Testing) is a good replication of Prod (Production), and conversely, that what was tested in UAT will behave as expected in Prod.

This leads us to a lot of questions like:

  • How should the deployment process work?
  • How should apps, configs and reference data be defined?
  • How can we make sure binaries and configs are consistent across different hosts?

As a computer engineer, I always tend to think that the best way to organize things is to code them up.

In this blog post, I want to present how we solve this problem at Proof Trading and how our deploy-as-code solution works.

Deploy-As-Code

Our approach to deploy-as-code is to have a single YAML file that contains all configuration for a given environment. To compare one environment to another, we can then do a simple comparison between 2 files; everything else is generated automatically from this main YAML configuration. And because the code is generated automatically, we can include any validation the project needs, such as JSON schema validation, a rule that beta versions cannot be deployed to Prod, or any other check you can imagine.

Config Project

Many people think about their project just in terms of the main code itself. But how is it deployed? How can we get the right binary in the right folder on the right server with the right config? And how can all that be replicated in the same way across different environments?

The solution for this is to have a Config Project that will generate all configs, scripts, reference data, and binaries out of a single YAML configuration file.

Every time a new deployment happens, all steps need to be run again, regardless of whether it is a single binary change or a one-character change in a config file. The deployment process needs to be consistent to make sure it is reliable and reproducible.

Ok, now we understand that a Config Project is important, but how should we set it up?

A key feature of a Config Project is that all environment-specific data lives in a dedicated config file for that environment. This config file is used to generate all the configs that matter for your system. But wait Marcio, you say: “This idea sounds good, but how can you generate code out of a single YAML file? How does that work?”

In order to answer this question, I created a GitHub repo to illustrate it: https://github.com/marcioammoreno/project-config.

It is basically the same thing we use at Proof Trading. I would love to open source our main repo, but since our setup involves a lot of client-specific information, open sourcing the original project wouldn’t be possible.

To make a useful example, I created a project that deploys the QuickfixJ example apps. The goal is to deploy an Executor app (a FIX acceptor that fully fills all orders received) on one server and 2 Banzai apps (FIX initiator UIs to send orders) with 2 different FIX version configurations. The same config applies to both Prod and UAT; the only difference is that we are going to set up Prod with QuickfixJ version 2.3.0 and UAT with version 2.3.1.

Before we dive deep into how our code works, it is important to talk about Ansible.

Ansible

Ansible is an open-source suite of software tools that enables infrastructure as code, covering software provisioning, configuration management, and application deployment. More information can be found here: https://docs.ansible.com/ansible/latest/.

Ansible allows us to generate all code that is going to be used for our deployment in a very simple and concise way.

Ansible can perform specific tasks like creating folders, copying files, downloading files, generating files out of templates, and many others.

An important feature of Ansible is Ansible Playbooks. A plain Ansible command runs a single task, while a playbook lets us run many Ansible tasks in sequence.
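To make the distinction concrete, here is a minimal playbook sketch (the paths and template name are illustrative, not from this project): two tasks that run in order on one host.

# A minimal playbook sketch; paths and template name are illustrative
- name: Example playbook
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Create a folder
      ansible.builtin.file:
        path: /tmp/example
        state: directory
    - name: Render a config file from a Jinja2 template
      ansible.builtin.template:
        src: example.cfg.j2
        dest: /tmp/example/example.cfg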

Now, let’s understand how our code works and how our build process happens.

Deep dive into the code

We basically have 3 main folders: Ansible, Environment, and Model.

The Ansible folder contains the ansible playbooks required to build the project. Ok, but what does “build” mean here? Our build process has 4 steps: Download Binaries, Generate App Configs, Generate Templates, and Copy References.

Let’s go one by one:

1. Download Binaries: this step gets all the jars and libs required and stores them in the generatedCode/bin folder.

2. Generate App Configs: in general, a deployment can have many different apps, and this step is responsible for making sure app configs are consistent across the board. There is a separate section solely on this topic below.

3. Generate Templates: this step copies every file and folder inside the model/template folder to the generatedCode folder, applying the Jinja2 templating used by Ansible (https://docs.ansible.com/ansible/latest/user_guide/playbooks_templating.html).

4. Copy References: this step is very similar to step 3, with the only difference being that it copies files as they are to generatedCode, with no templating. This is very useful for reference data or scripts that are the same across all environments.

The Environment folder contains the actual definition of our environments.

Environments.yaml just defines how many environments we are going to have. In our particular case there are just 2, but there could be as many as you wish.

environments:
  prod: prod.yaml
  uat: uat.yaml

Let’s dive into the uat.yaml file:

quickfixJar:
  version: 2.3.1
hosts:
  server-1:
    host: 172.31.17.142
    user: ec2-user
    executor: true
  client-1:
    host: localhost
    user: marciomoreno
    banzai: true
apps:
  server-1:
    executor:
      validOrderTypes: 1,2,F
      defaultMarketPrice: 10.5
      sessions:
        - senderCompId: EXEC
          targetCompId: BANZAI
          fixVersion: FIX.4.2
          port: 9878
        - senderCompId: EXEC
          targetCompId: BANZAI2
          defaultMarketPrice: 12
          fixVersion: FIX.4.4
          port: 9880
  client-1:
    banzai_1:
      sessions:
        - senderCompId: BANZAI
          targetCompId: EXEC
          fixVersion: FIX.4.2
          host: server-1
          port: 9878
    banzai_2:
      sessions:
        - senderCompId: BANZAI2
          targetCompId: EXEC
          fixVersion: FIX.4.4
          host: server-1
          port: 9880

Basically this file defines everything that we need to deploy our code properly.

As we can see, the QuickfixJ version is specified as 2.3.1, there are 2 different hosts, and we can see which apps are going to be deployed on each host.

The Common.yaml file contains things that all environments have in common:

deployment:
  path: /tmp/quickfix
  artifacts: /tmp/quickfix-deployment
jvm:
  args: "-Xms2G"

In our basic example, this just defines where the deployment is stored and the JVM args used by apps.

And finally, you have probably already guessed what you will find inside prod.yaml 🙂.

The Model folder is divided into 3 parts: Reference, Template, and Apps.

1. Reference (as described in the build step) contains all files that are going to be the same in all environments, for example a CSV file containing securities definitions.

2. Template contains files generated using configurations from the environment YAML, for example general configs applied to all apps, scripts, and our ansible deployment definitions.

3. Apps contains the definition for each app in the deployment. In this case we have 2: Executor and Banzai.

How are Apps defined?

In this section we are going to dive into the “Generate App Configs” ansible code. The goal of this step is to transform the app folder definitions (app config + spec.yaml), plus the main environment configuration YAML:

apps:
  server-1:
    executor:
      validOrderTypes: 1,2,F
      defaultMarketPrice: 10.5
      sessions:
        - senderCompId: EXEC
          targetCompId: BANZAI
          fixVersion: FIX.4.2
          port: 9878
        - senderCompId: EXEC
          targetCompId: BANZAI2
          defaultMarketPrice: 12
          fixVersion: FIX.4.4
          port: 9880
  client-1:
    banzai_1:
      sessions:
        - senderCompId: BANZAI
          targetCompId: EXEC
          fixVersion: FIX.4.2
          host: server-1
          port: 9878
    banzai_2:
      sessions:
        - senderCompId: BANZAI2
          targetCompId: EXEC
          fixVersion: FIX.4.4
          host: server-1
          port: 9880

Into the generatedCode for each host.

The generated instances folder contains all apps (grouped by host) that need to be installed. Each app folder contains a run.sh script and a config file specific to that app in that particular instance.

Here is the run.sh as a template:

#!/bin/bash
/usr/bin/java \
{{ config.jvm.args }} \
-cp "{{ config.deployment.path }}/bin/*:{{ config.deployment.path }}/lib/*" \
{{ app_launcher }} \
{{ config.deployment.path }}/apps/{{ app }}/{{ app }}.cfg \
2>&1 | tee {{ config.deployment.path }}/logs/{{ app }}.log

And here it is after being generated as part of the build for Banzai_1 app:

#!/bin/bash
/usr/bin/java \
-Xms2G \
-cp "/tmp/quickfix/bin/*:/tmp/quickfix/lib/*" \
quickfix.examples.banzai.Banzai \
/tmp/quickfix/apps/banzai_1/banzai_1.cfg \
2>&1 | tee /tmp/quickfix/logs/banzai_1.log
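The {{ app_launcher }} value (quickfix.examples.banzai.Banzai above) comes from each app’s definition folder. As a purely hypothetical illustration (the field names here are mine; the repo’s spec.yaml may differ), a Banzai spec could look like:

# Hypothetical spec.yaml; field names are illustrative, not the repo's
launcher: quickfix.examples.banzai.Banzai
configTemplate: banzai.cfg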

Ok, by this point you probably get the idea of how we generate all the code we need for our deployment. But how do we actually run it? What is the command to start it? How is it coded?

Build process

To start the build process, just invoke ansible-playbook as below:

ansible-playbook -i ansible/inventory.ini ansible/build.yaml

The first part of the build process is to download binaries and put them in the bin folder.

This is accomplished by this ansible definition:

- name: Create bin folder
  file:
    path: "{{ baseDir + generatedCode + env }}/bin/"
    state: directory

- name: Get QuickfixJ release
  ansible.builtin.unarchive:
    src: 'https://github.com/quickfix-j/quickfixj/releases/download/QFJ_RELEASE_{{ vars["cfg_" + env].quickfixJar.version | replace(".", "_") }}/org.quickfixj-{{ vars["cfg_" + env].quickfixJar.version }}-bin.zip'
    dest: "{{ baseDir + generatedCode + env }}/bin/"
    remote_src: yes

As you can see, it is getting the release version directly from the environment configuration YAML:

{{ vars["cfg_" + env].quickfixJar.version }}

This is the main idea behind this project’s build: all steps are based on a single environment definition file. When we need to compare Prod vs UAT (or any other environment), we just need to compare 2 files and see the differences. All the rest will be exactly the same.
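For example, assuming the environment folder layout described above, and that the two files differ only in the QuickfixJ version, the comparison could be as simple as (output sketched):

$ diff environment/prod.yaml environment/uat.yaml
2c2
<   version: 2.3.0
---
>   version: 2.3.1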

Also, as an example of validation, we have an extra step to make sure binary versions containing “-beta” cannot be deployed to Prod. This allows us to deploy beta versions to UAT but only fully validated versions to Prod. If we try to deploy a beta version to Prod, the step below fails and consequently our build fails:

- name: Only allow prod master builds for prod
  fail: msg="Beta builds are not allowed in Prod"
  when: 'env == "prod" and vars["cfg_" + env].quickfixJar.version.find("-beta") != -1'

The second step is the app generation process.

Apps are defined per machine. The same app can be registered on different hosts with different configurations, for example a primary and a backup, or for flow distribution.
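As a hypothetical sketch (server-2 and this split are illustrative, not part of the example repo), a primary/backup layout could look like:

# Hypothetical sketch: the same executor app registered on two hosts
apps:
  server-1:
    executor:                    # primary
      validOrderTypes: 1,2,F
      defaultMarketPrice: 10.5
  server-2:
    executor:                    # backup, possibly with different sessions or ports
      validOrderTypes: 1,2,F
      defaultMarketPrice: 10.5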

Let’s use Executor App as an example here.

Environment configuration YAML:

apps:
  server-1:
    executor:
      validOrderTypes: 1,2,F
      defaultMarketPrice: 10.5
      sessions:
        - senderCompId: EXEC
          targetCompId: BANZAI
          fixVersion: FIX.4.2
          port: 9878
        - senderCompId: EXEC
          targetCompId: BANZAI2
          defaultMarketPrice: 12
          fixVersion: FIX.4.4
          port: 9880

App config template:

[default]
FileStorePath={{ config.deployment.path }}/logs/quickfixdata/executor
ConnectionType=acceptor
StartTime=00:00:00
EndTime=00:00:00
HeartBtInt=30
ValidOrderTypes={{ appConfig.validOrderTypes }}
UseDataDictionary=Y
DefaultMarketPrice={{ appConfig.defaultMarketPrice }}
{% for session in appConfig.sessions %}
[session]
SocketAcceptPort={{ session.port }}
SenderCompID={{ session.senderCompId }}
TargetCompID={{ session.targetCompId }}
BeginString={{ session.fixVersion }}
{% endfor %}

App config generated after build:

[default]
FileStorePath=/tmp/quickfix/logs/quickfixdata/executor
ConnectionType=acceptor
StartTime=00:00:00
EndTime=00:00:00
HeartBtInt=30
ValidOrderTypes=1,2,F
UseDataDictionary=Y
DefaultMarketPrice=10.5
[session]
SocketAcceptPort=9878
SenderCompID=EXEC
TargetCompID=BANZAI
BeginString=FIX.4.2
[session]
SocketAcceptPort=9880
SenderCompID=EXEC
TargetCompID=BANZAI2
BeginString=FIX.4.4

The app config is placed in a folder specific to the host it belongs to. This way, our project can have the same app deployed on multiple machines.
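A minimal sketch of how such a per-host config could be rendered with Ansible’s template module (the variable names here are my assumptions, not the repo’s exact code):

# Sketch only: render the executor config into server-1's instance folder
- name: Generate executor config for server-1
  ansible.builtin.template:
    src: ../model/apps/executor/executor.cfg
    dest: "{{ baseDir + generatedCode + env }}/instances/server-1/apps/executor/executor.cfg"
  vars:
    appConfig: "{{ vars['cfg_' + env].apps['server-1'].executor }}"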

Also, if we want to deploy the same app multiple times on the same host, we just need to add a suffix separated by an underscore (_):

apps:
  client-1:
    banzai_1:
      sessions:
        - senderCompId: BANZAI
          targetCompId: EXEC
          fixVersion: FIX.4.2
          host: server-1
          port: 9878
    banzai_2:
      sessions:
        - senderCompId: BANZAI2
          targetCompId: EXEC
          fixVersion: FIX.4.4
          host: server-1
          port: 9880

The third and fourth steps are very similar. One copies files as they are from the reference folder to the generatedCode folder; the other runs the templating process on all files inside the template folder and saves the results inside generatedCode as well.
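As a sketch, the reference copy could be a single task like this (paths are assumptions based on the folder layout above):

# Sketch only: copy reference files unchanged into generatedCode
- name: Copy references
  ansible.builtin.copy:
    src: ../model/reference/
    dest: "{{ baseDir + generatedCode + env }}/"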

Ok, now that you know how all the deployment code is generated, let’s take a look at how it is actually deployed.

Deployment process

The deployment process consists of 2 steps.

1. Copy all artifacts to a versioned folder

2. Set up all hosts to use artifacts from the versioned folder

The advantage of using 2 steps: when there is a need to roll back a deployment, it is just a matter of re-running the second step to point the environment back at the original version.
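For example, assuming builds are versioned by the BUILD_NUMBER environment variable (as in the inventory template shown below), a rollback could be as simple as re-running step 2 against a previous build number (41 is illustrative):

# Hypothetical rollback to a previously deployed build
BUILD_NUMBER=41 ansible-playbook -i generatedCode/uat/ansible/inventory.ini \
  generatedCode/uat/ansible/setupEnvironment.yaml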

In case you have not noticed yet, the ansible folder with the deployment code is inside the Template folder. This means that when we generate our code, we get deployment code specific to each environment.

You can see this by checking the inventory.ini file inside model/template/ansible.

This ansible inventory file is generated for each environment.

[all:vars]
artifactsPath="{{ config.deployment.artifacts }}/{{ '{{' }} lookup('env', 'BUILD_NUMBER') | default('SNAPSHOT', true) {{ '}}' }}"
deployPath="{{ config.deployment.path }}"
[executorHost]
{% for hostName, host in config.hosts.items() %}
{% if host.executor is defined and host.executor is sameas true %}
{{ hostName }} ansible_connection=ssh ansible_user={{ host.user }} ansible_host={{ host.host }}
{% endif %}
{% endfor %}
[banzaiHost]
{% for hostName, host in config.hosts.items() %}
{% if host.banzai is defined and host.banzai is sameas true %}
{{ hostName }} ansible_connection=ssh ansible_user={{ host.user }} ansible_host={{ host.host }}
{% endif %}
{% endfor %}
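For the UAT definition shown earlier, the generated inventory would look roughly like this (derived from uat.yaml and common.yaml above):

[all:vars]
artifactsPath="/tmp/quickfix-deployment/{{ lookup('env', 'BUILD_NUMBER') | default('SNAPSHOT', true) }}"
deployPath="/tmp/quickfix"

[executorHost]
server-1 ansible_connection=ssh ansible_user=ec2-user ansible_host=172.31.17.142

[banzaiHost]
client-1 ansible_connection=ssh ansible_user=marciomoreno ansible_host=localhost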

Now, let’s get back to the deployment steps.

Step 1:

- name: Deploy artifacts in {{ env }}
  hosts: all
  gather_facts: false
  tasks:
    - name: Delete old generated folder if it is a SNAPSHOT deployment, to get a full refresh
      file:
        path: "{{ '{{' }} artifactsPath {{ '}}' }}"
        state: absent
      when: artifactsPath.endswith('SNAPSHOT')
    - name: Create deployment folder
      file:
        path: "{{ '{{' }} artifactsPath {{ '}}' }}"
        state: directory
    - name: Copy artifacts
      synchronize:
        src: "../generatedCode/{{ env }}/"
        dest: "{{ '{{' }} artifactsPath {{ '}}' }}/"
        perms: yes
        use_ssh_args: true

Step 1 is a simple copy of all artifacts to the artifactsPath folder on each host of our deployment.

Step 2:

- name: Setup artifacts in each host
  hosts: all
  gather_facts: false
  vars:
    jarVersion: {{ config.quickfixJar.version }}
{% raw %}
  tasks:
    - name: Create deployment folder in case it does not exist
      file:
        path: "{{ deployPath }}"
        state: directory
    - name: Create log folder in case it does not exist
      file:
        path: "{{ deployPath }}/logs"
        state: directory
    - name: Create symbolic link for apps
      file:
        src: "{{ artifactsPath }}/instances/{{ inventory_hostname }}/apps"
        dest: "{{ deployPath }}/apps"
        state: link
    - name: Create bin folder in case it does not exist
      file:
        path: "{{ deployPath }}/bin"
        state: directory
    - name: Find QuickfixJ jars
      find:
        paths: "{{ artifactsPath }}/bin/org.quickfixj-{{ jarVersion }}/"
        patterns: "*.jar"
      register: jarFiles
    - name: Create symbolic links for jars
      file:
        src: "{{ item.path }}"
        dest: "{{ deployPath }}/bin/{{ item.path | basename }}"
        state: link
      with_items: "{{ jarFiles.files }}"
    - name: Create symbolic link for lib
      file:
        src: "{{ artifactsPath }}/bin/org.quickfixj-{{ jarVersion }}/lib"
        dest: "{{ deployPath }}/lib"
        state: link
    - name: Create symbolic link for scripts
      file:
        src: "{{ artifactsPath }}/scripts"
        dest: "{{ deployPath }}/scripts"
        state: link
    - name: Create symbolic link for configs
      file:
        src: "{{ artifactsPath }}/configs"
        dest: "{{ deployPath }}/configs"
        state: link
    - name: Create symbolic link for data
      file:
        src: "{{ artifactsPath }}/data"
        dest: "{{ deployPath }}/data"
        state: link
{% endraw %}

As you can see, Step 2 is very basic. It just maps the required folders (to keep the system structure the same) regardless of which deployment version is being used.

Just as an example, assuming artifactsPath is /tmp/quickfix-deployment and deployPath is /tmp/quickfix, we would have the following structure:

/tmp/quickfix/apps -> /tmp/quickfix-deployment/VERSION/instances/server-1/apps
/tmp/quickfix/bin
/tmp/quickfix/configs -> /tmp/quickfix-deployment/VERSION/configs
/tmp/quickfix/data -> /tmp/quickfix-deployment/VERSION/data
/tmp/quickfix/lib -> /tmp/quickfix-deployment/VERSION/bin/org.quickfixj-2.3.1/lib
/tmp/quickfix/logs
/tmp/quickfix/scripts -> /tmp/quickfix-deployment/VERSION/scripts

The bin and logs folders are real directories, not symlinks; inside bin, every file is itself a symbolic link. This was done just to illustrate the many different possibilities and the flexibility this type of project gives us.

Steps 1 and 2 can be invoked using the commands below (this example is for UAT):

ansible-playbook -i generatedCode/uat/ansible/inventory.ini generatedCode/uat/ansible/deployArtifacts.yaml
ansible-playbook -i generatedCode/uat/ansible/inventory.ini generatedCode/uat/ansible/setupEnvironment.yaml

Ok, now everything is set up. To start the executor app on the server-1 host, we just run it from the designated folder (in our case /tmp/quickfix):

/tmp/quickfix/apps/executor/run.sh

Please note that running our apps is the same regardless of the environment; all environment definitions were already baked in as part of the deployment process.

World of possibilities

This is a very basic example just to show the main functionality of a deploy-as-code project.

You can extend this to do whatever your deployment needs, such as setting up jobs, network connections, or Python environments, performing validations, etc.
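As one more hypothetical example (not in the example repo), a build step could assert that no two executor sessions share a port:

# Hypothetical extra validation: fail the build on duplicate session ports
- name: Ensure executor session ports are unique
  ansible.builtin.assert:
    that:
      - ports | length == ports | unique | length
    fail_msg: "Duplicate session ports detected in {{ env }}"
  vars:
    ports: "{{ vars['cfg_' + env].apps['server-1'].executor.sessions | map(attribute='port') | list }}"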

This makes the project more robust and easier to maintain, as it removes many of the manual steps that are typically performed during a production rollout.

Conclusion

It is important to think about deployment as a code release process, in the same way we treat the application code base: deployment code should be versioned just as application code is.

Whenever a new version of your code is released, you need a new version of the deployment code as well. This way, everything is code driven and there is no manual intervention in the deployment process. It is the best way to make sure the Prod and UAT environments are always in sync and that whatever was tested in UAT will be deployed to Prod in the exact same way.

I hope this has been helpful to you. If you liked it and want to ask more questions, please feel free to reach out to me at marcio@prooftrading.com
