Providing a 3rd party development environment
At Worldsensing we take our collaborations with external partners and customers very seriously, which is why we must provide tools that make these collaborations easy. Our products demand plugins and interfaces developed both by our internal software team and by 3rd parties, so having the right environment is key to our success.
In this post I am going to briefly explain the transition we made from an insecure, slow and error-prone environment to a more efficient, secure, automated, visible and controlled way of working with external teams. So let’s start!
Our legacy
A while ago, we provided a cloud development environment for our partners. This environment was a Linux instance for each of our products, and we provided them with SSH access so they could start developing.
Unfortunately, this method proved to be:
- Insecure: We gave SSH access to that instance, sometimes with root permissions.
- Prone to human error: Many developers sharing the same instance.
- No visibility: There were no metrics or monitoring to help with development and operations troubleshooting.
- Interruptions: Our ITOps team had to be very reactive, as it was frequently asked to perform root actions on that instance.
Because of this list of negative effects on our performance, we decided to create a model that eases everyone’s development lifecycle.
Our current architecture
Our products are built using a microservices architecture. We use Docker for our applications and Docker Compose for launching the stack, which can range from 1 to n containers. Our instances have these minimum software requirements:
- Docker Engine: Container engine.
- Docker Compose: Container stacks engine.
- Git: Source code versioning.
- Telegraf: Metrics.
We strongly recommend this same software stack to our partners, as it gives them a unified tooling system that is easy to deploy, manage and troubleshoot.
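As a rough illustration, a partner could check that this stack is present on an instance with a few lines of Python. This is only a sketch: the executable names below are assumptions (Docker Compose, for example, may be installed as a `docker compose` plugin rather than as a standalone `docker-compose` binary).

```python
import shutil

# Executable names are assumptions and may differ between installations.
REQUIRED_TOOLS = {
    "Docker Engine": "docker",
    "Docker Compose": "docker-compose",
    "Git": "git",
    "Telegraf": "telegraf",
}


def missing_tools(required=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of required tools whose executable is not on PATH."""
    return [name for name, exe in required.items() if which(exe) is None]


missing = missing_tools()
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All required tools found.")
```

The `which` parameter exists only so the lookup can be replaced in tests; by default the standard library's `shutil.which` is used.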
On top of this software stack we deploy our products as displayed here:
We provision all our instances using Ansible playbooks for automation and repeatability, and to avoid errors. Playbooks are essentially sets of instructions (plays) that you send to run on a single target or on groups of targets (hosts). An Ansible playbook is in charge of:
- Launching a new instance in our service provider.
- Basic provisioning of the instance: git, ntp, …
- Provisioning Docker Engine & Docker Compose.
- Provisioning and configuring Telegraf.
- Deploying our solutions.
- Running our software.
- Running functional tests.
Once this process is finished, we end up with a ready-to-use environment for the different teams.
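The ordering of these steps matters: each one depends on the previous ones, and a failure should stop the run (which is also how Ansible behaves by default when a task fails on a host). The Python sketch below is not our playbook, only a model of that ordered, stop-on-failure behaviour; the step names are paraphrased from the list above and the actions are placeholders.

```python
def run_steps(steps):
    """Run (name, action) pairs in order and stop at the first failing action.

    Returns a tuple: (names of completed steps, name of the failed step or None).
    """
    completed = []
    for name, action in steps:
        if not action():
            return completed, name
        completed.append(name)
    return completed, None


# Placeholder actions: each lambda stands in for real provisioning work.
PROVISIONING_STEPS = [
    ("launch instance", lambda: True),
    ("basic provisioning", lambda: True),
    ("install Docker Engine and Docker Compose", lambda: True),
    ("configure Telegraf", lambda: True),
    ("deploy solutions", lambda: True),
    ("run software", lambda: True),
    ("run functional tests", lambda: True),
]

completed, failed = run_steps(PROVISIONING_STEPS)
print("completed:", len(completed), "failed:", failed)
```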
Phase 1 — Development
During the development phase, we provide a secured instance without SSH access, but with HTTPS access to the following services:
- Product REST API, so that plugins & interfaces can connect locally or remotely.
- Product front-end where the components are mostly displayed.
Developers can then use their own infrastructure without needing our help and without the delays that can come with it. New implementations are registered with our products via the REST API, and developers can see the results in the front-end.
We also provide a metrics system for each instance, for which we use the TICK stack. Telegraf sends system and Docker Engine metrics to our development InfluxDB, where they can be viewed through Chronograf:
Phase 1 gives us a lot of flexibility: our ITOps team no longer spends time on deployment and support tasks, while developers can code against a fully functional and properly monitored environment.
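To give an idea of what registering a new implementation over HTTPS could look like from the developer's side, here is a minimal Python sketch. The `/api/plugins` path, the payload fields and the host name are hypothetical and are not our actual API; authentication is left out because this post does not describe it. The request is only built, not sent.

```python
import json
import urllib.request


def build_registration_request(base_url, plugin):
    """Build (but do not send) a POST request that registers a plugin."""
    if not base_url.lower().startswith("https://"):
        raise ValueError("the environment only exposes HTTPS endpoints")
    body = json.dumps(plugin).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/plugins",  # hypothetical path
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host and payload, for illustration only.
request = build_registration_request(
    "https://dev-instance.example.com",
    {"name": "my-plugin", "version": "0.1.0"},
)
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`, once the real endpoint and credentials are known.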
Phase 2 — Integration
Once the new development is ready, it is then integrated into the same instance as our product. Again, the only allowed access to the instance is HTTPS.
Our core stack is up and running at all times in a new integrated instance called phase 2. We then add 3rd party stacks to the container cloud, as shown in the following diagram:
Tools
To integrate all those pieces we use a few more tools. All our infrastructure and deployment instructions are scripted following Infrastructure as Code standards. These scripts are stored in various Git repositories (our version control system), which gives us conflict detection, change management, accountability and automation, among many other benefits.
We deploy those different stacks to the target instances with Ansible. Ansible is a set of software packages that automates software provisioning, configuration management, and application deployment. This method allows us to download, change, integrate, run and test the different stacks. These stacks are normally composed of a few Docker Compose files that are executed to start the software stack.
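Starting a stack made of several Docker Compose files comes down to passing each file to the Compose CLI with `-f`. The Python sketch below only builds that command line; the file names and project name are made up for illustration, and it assumes the standalone `docker-compose` binary.

```python
def compose_up_command(compose_files, project_name):
    """Build the argument list that starts a stack from several Compose files."""
    if not compose_files:
        raise ValueError("at least one Compose file is required")
    cmd = ["docker-compose", "-p", project_name]
    for path in compose_files:
        cmd += ["-f", path]  # later files extend or override earlier ones
    cmd += ["up", "-d"]  # -d starts the containers in the background
    return cmd


# Hypothetical file and project names.
print(" ".join(compose_up_command(["core.yml", "partner-plugin.yml"], "phase2")))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a host where Docker Compose is installed.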
Jenkins, an open source automation server, triggers the Ansible playbooks. Thanks to its simple, easy-to-use graphical interface, our developers can log in and trigger jobs that have already been defined in the application. An example of its use: a developer adds a feature to the code and pushes the change to Git. The developer can then log in to Jenkins and apply the change by running a Jenkins job, which uses Ansible scripts to deploy and run the change on the target instance. Jobs can be triggered manually or automated for specific purposes (nightly builds, for example).
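Besides the graphical interface, Jenkins jobs can also be triggered through its remote access API, which is handy for automation. The Python sketch below builds the trigger URL for a top-level job; the host, job name and parameters are hypothetical, and in practice the URL is called with an authenticated POST request.

```python
from urllib.parse import quote, urlencode


def jenkins_build_url(base_url, job_name, params=None):
    """Return the remote-trigger URL for a top-level Jenkins job."""
    job_path = "/job/" + quote(job_name, safe="")
    if params:
        # Parameterized jobs use a different endpoint from plain builds.
        return base_url.rstrip("/") + job_path + "/buildWithParameters?" + urlencode(params)
    return base_url.rstrip("/") + job_path + "/build"


# Hypothetical host, job name and parameters.
print(jenkins_build_url("https://jenkins.example.com", "deploy-phase2"))
print(jenkins_build_url("https://jenkins.example.com", "deploy-phase2", {"BRANCH": "feature/x"}))
```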
But… once the applications are running, how can developers debug them without SSH access? Portainer is our chosen tool. It is an open source management UI that allows engineers to easily manage Docker Engine environments. Through a simple UI, developers can see logs, restart containers, display environment variables and more: Docker commands, but through a graphical and secure interface (HTTPS + username/password).
The last of the tools we provide is the TICK stack for monitoring. TICK is named after the four packages included in the stack: Telegraf, InfluxDB, Chronograf & Kapacitor. Our development instances include a Telegraf agent in charge of sending metrics to an InfluxDB time series database, and Chronograf is a web UI that displays these metrics in a simple and easy way.
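Metrics are written to InfluxDB in its line protocol: a measurement name, optional tags, one or more fields and a timestamp. The Python sketch below is a simplified formatter for that format, meant only to show what a data point looks like on the wire; Telegraf does this for us, and the measurement, tag and field names in the example are illustrative.

```python
def _escape(value, chars):
    """Backslash-escape each of the given characters in value."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def _format_field(value):
    """Format a field value: booleans, integers (i suffix), floats, strings."""
    if isinstance(value, bool):  # bool must be checked before int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line(measurement, tags, fields, timestamp_ns):
    """Render one data point as a line-protocol string."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = _escape(measurement, ", ")
    for key in sorted(tags):
        head += "," + _escape(key, ",= ") + "=" + _escape(str(tags[key]), ",= ")
    body = ",".join(
        _escape(key, ",= ") + "=" + _format_field(value)
        for key, value in fields.items()
    )
    return f"{head} {body} {timestamp_ns}"


print(to_line("cpu", {"host": "dev-1"}, {"usage_idle": 97.5}, 1500000000000000000))
# prints: cpu,host=dev-1 usage_idle=97.5 1500000000000000000
```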
Conclusion
At Worldsensing we have now achieved secure but usable development environments for our developers and external partners.
And not only that: by using configuration management tools like Ansible, environments are reproduced seamlessly; development = staging = production. At the operational level, we can now see a clear separation of responsibilities between:
- Worldsensing and 3rd party partners
- Developers and operations
Please feel free to comment on this post. Interesting insights may help us and other readers grow.