DevOps for Small Businesses: How to Implement DevOps Practices on a Budget

Victor Fonné
neoxia
10 min read · Jun 8, 2023


DevOps can seem daunting for small businesses with limited resources, but implementing these practices on a budget is possible. Keep a few principles in mind: start small and scale up gradually, and identify your pain points and prioritize them. On a daily basis, focus on collaboration, automate repetitive tasks, measure and monitor your progress, and take advantage of infrastructure-as-code tools. The list of tools presented here is subjective, but these are the tools that helped me implement DevOps practices on a budget for a small business. After all, the most important thing is that you lean toward the real goal of DevOps: enabling businesses to deliver applications more quickly and efficiently. Yes, it sounds like the goal of Agile as well, because DevOps is a way of enabling agility on the technical side; agile evangelism, though, will be left to other articles.

Collaboration

Have you ever lost code because you overwrote what one of your coworkers had just written? Are you exchanging code through emails and USB keys?

If so, this chapter is for you.

Version control (also known as revision control, source control, or source code management) designates the systems responsible for managing changes to software engineering projects. In a nutshell, they do the following:

  • Store a history of the changes to your code
  • Let you track who did what, and when
  • Attach metadata, labels and tags to your work
  • Help you adopt a sustainable and scalable development workflow for your team

The most common version control systems are Git and SVN. They are almost always backed by a web-based repository management platform that gives users an interface.

For Git, the most common platforms are GitHub and GitLab (either self-hosted or as a service). I've explicitly left Bitbucket out of the equation because the Atlassian ecosystem is not best suited to small businesses. Both platforms let users push code and documents, create issues, consult kanban boards and use a lot of other features that might fit your needs. You can read articles comparing them, but you can also simplify the problem: GitHub can be compared to Apple and GitLab to Android. From my point of view, and even though I own an iPhone, I prefer my code to be hosted on GitLab: it costs less and gives me an experience I can customize for my coworkers. Although Git is widely used, it is often used without good practices; you can adopt either Gitflow or trunk-based development. I won't cover these two practices fully in this article, but I definitely recommend reading about them.

A commit format convention is very simple to adopt and will make other people's commits much easier to read. I personally like www.conventionalcommits.org for its simplicity.
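As a sketch, commit messages following that convention look like this (the scopes and descriptions are hypothetical):

    feat(auth): add password reset endpoint
    fix(billing): handle empty invoice lines
    docs: update deployment instructions

Each message starts with a type (feat, fix, docs, chore…), an optional scope, and a short description, which makes the history easy to scan and even machine-readable for changelog generation.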

Automation

Are you frightened by deployments? Do you deploy your code only once a month because you’re afraid of breaking something? Or maybe you deploy 5 times a week and incidents happen in 3 of them?

Automate whatever takes time and might be a source of errors. Your operating system provides plenty of built-in tools for this (schedulers, services, shell scripting, etc.), and it would be a mistake not to take advantage of them. Master what you already have before installing a bunch of new tools. Bash and PowerShell scripts are a great way to automate whatever you often do on a machine. Investing time in writing a script now might save you a lot of time and errors later.

Over-automation can lead to an overly complex and rigid DevOps environment that is difficult to maintain and troubleshoot. Teams may be too reliant on automation and not have the skills or processes in place to handle situations that fall outside of the automated workflows. Carefully evaluate which processes and tasks can be automated and which should remain manual.

Ansible is probably the tool that changed my life the most as a DevOps engineer on the automation side. It allows you to execute tasks (shell commands, for example) and manage configuration files (templated through Jinja2), all of this on multiple instances at the same time, securely over SSH. Basically, you describe commands, configurations, variables and targets in a YAML file, and Ansible executes these tasks (called a playbook) on a predetermined set of targets (hosts). Ansible provides a good error report and stops deploying on instances affected by an error.
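As a minimal sketch (the host group, package and file names are hypothetical), a playbook that installs nginx and templates its configuration could look like this:

    # deploy.yml -- a minimal playbook sketch (hypothetical names)
    - name: Configure web servers
      hosts: webservers              # group defined in your inventory
      become: true                   # run tasks with elevated privileges
      tasks:
        - name: Install nginx
          ansible.builtin.apt:
            name: nginx
            state: present
        - name: Template the app configuration with Jinja2
          ansible.builtin.template:
            src: app.conf.j2
            dest: /etc/nginx/conf.d/app.conf
          notify: Restart nginx
      handlers:
        - name: Restart nginx
          ansible.builtin.service:
            name: nginx
            state: restarted

Running ansible-playbook -i inventory.yml deploy.yml then applies those tasks to every host in the webservers group.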

You might have a specific and documented deployment/update process, but since it is executed by humans, errors can quickly sneak in. One of the simplest ways to avoid this is to rebuild this process as an Ansible playbook. Here are the reasons why I would recommend doing so:

  • Dependencies: install dependencies through community-approved plugins. I used to install Docker Engine on a lot of my instances. One day, Docker decided to change its setup process; I wasn't aware of it and lost a lot of time figuring it out. I then decided to automate the Docker setup through a community-written Ansible Galaxy role.
  • Multi-target deployment: you can execute the same playbook on multiple targets at the same time (deploying on 3 different production servers at once). You can also tag tasks to group them by target environment (allowing specific tasks for production or dev deployments).
  • Configuration: managing configurations for multiple environments or regions can be really challenging. I like to use environment variables as much as I can, and I feed them with Jinja2 files templated by Ansible. In Ansible, variables can be attached to instances, and instances can be grouped; this is done in the inventory (see the sketch after this list).
  • Versioning: as your Ansible YAML playbook will be stored in your project's Git repository, it will be versioned as well, meaning that if you need to deploy a previous version of your software, you will be able to use the associated playbook.
  • Consistency: automated deployment lets you maintain a consistent environment across many instances. Remember the time you were in a hurry to deploy a new instance and couldn't remember which version of Python you had installed?
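
Here is what such an inventory might look like in YAML (the host names and addresses are hypothetical):

    # inventory.yml -- hosts grouped by environment (hypothetical names)
    all:
      children:
        production:
          hosts:
            prod-web-1:
              ansible_host: 10.0.0.11
            prod-web-2:
              ansible_host: 10.0.0.12
          vars:
            app_env: production
        dev:
          hosts:
            dev-web-1:
              ansible_host: 10.0.1.11
          vars:
            app_env: dev

The app_env variable attached to each group can then be consumed directly in your Jinja2 templates.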

It is perfectly acceptable to stop there, keep an Ansible version of your deployment process, and launch it from your computer. Either way, you've already moved one step closer to continuous deployment.

A focus on Continuous Integration and Continuous Deployment

Continuous integration (CI) and continuous deployment (CD) are two key DevOps practices. CI involves continuously integrating code changes into a central repository and automatically running tests to detect errors. CD involves automatically deploying code changes to production environments once they’ve passed tests.

Jenkins and ArgoCD are worth mentioning, as they are big players in the CI/CD market, but since they are not integrated into your version control platform, I wouldn't recommend them for now. Adopting an integrated solution (included in GitLab/GitHub) is the way to go to limit maintenance costs and skill gaps. CI/CD is just a way to trigger specific tasks on each change to a specific branch of your Git repository. Tasks are organized in stages so that they can accomplish more complex work such as building, testing and deploying your project.

A very simple and typical stage structure is:

Build → Test → Stage → Deploy
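
With GitLab CI, for example, such a pipeline is described in a .gitlab-ci.yml file at the root of the repository. A minimal sketch (the images, commands and branch name are hypothetical, and the staging stage is omitted for brevity):

    # .gitlab-ci.yml -- a minimal pipeline sketch (hypothetical commands)
    stages:
      - build
      - test
      - deploy

    build-job:
      stage: build
      image: node:18
      script:
        - npm ci
        - npm run build

    test-job:
      stage: test
      image: node:18
      script:
        - npm test

    deploy-job:
      stage: deploy
      image: alpine:3.19
      script:
        - apk add --no-cache ansible openssh-client
        - ansible-playbook -i inventory.yml deploy.yml
      rules:
        - if: '$CI_COMMIT_BRANCH == "main"'   # deploy only from main

Note how the deploy job simply reuses the Ansible playbook written earlier.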

Regarding tests: far more often than we would like to admit, there are none at all. In my previous experiences I tried to implement some very basic ones, such as "Can my app even start?". If you don't have any tests and don't have time to implement them: make that time, because not writing tests is a really bad habit. Implementing a dynamic staging environment, so that the developer and/or the ops can test the application manually and decide whether it's good enough to send to production, is also a solution worth considering.

Remember what you automated with Ansible previously? This is what you'll use in your deployment stage to deploy your application without having to rewrite anything. The automation you did before was the first step toward continuous delivery.

Measure and Monitor

Remember the time you kept coding new features while your app was down for an hour?

If you’re running a public-facing website, wouldn’t it be nice to get an email each time it’s not reachable? It takes 2 minutes to set up with tools such as UptimeRobot (or any other uptime checker).

Regarding monitoring, the market is full of fancy dashboards and data visualizations. Before jumping at those eye-catching options, it is very important to determine what should be monitored. Walk through your previous incidents, see which metric could have given you insight into what was going on, and focus on the source of the issue, not its consequences. I remember a RabbitMQ queue that wasn't being consumed: it took up a lot of memory, but the high memory usage was just the consequence; the real cause was the software component that failed to consume the queue correctly. Decide on thresholds for key metrics and monitor them.

Monitor generic system metrics as well (such as memory, disk and CPU usage), as they can alert you to issues you haven't discovered yet.
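
For instance, with Prometheus (part of the stack discussed just below), such a threshold can be expressed as an alerting rule. This sketch assumes node_exporter metrics and uses an example threshold:

    # alerts.yml -- a minimal Prometheus alerting rule sketch
    groups:
      - name: system-alerts
        rules:
          - alert: HighMemoryUsage
            # fire when less than 10% of memory has been available for 5 minutes
            expr: (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) < 0.10
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "Memory is running low on {{ $labels.instance }}"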

I found that there are two main stacks available as self-hosted solutions:

  • ELK: Elasticsearch, Logstash, Kibana
  • Grafana, Prometheus and Loki

Other cloud-based solutions also exist, such as Datadog and Dynatrace.

The ELK stack is a bit hard to jump into if you've never used it before, and it's more oriented toward data people. I think it might be overkill for a small business unless you're already familiar with it.

As for Grafana, Grafana Labs offers a hosted version, and AWS provides a managed one as well.

Grafana combined with Prometheus and Loki can achieve the same functionality as ELK, but in a much simpler way. Grafana acts as the dashboard tool, Prometheus gathers metrics from your instances, and Loki aggregates logs (shipped to it by its Promtail agent). The stack can easily be spun up with a docker-compose file. One thing I really like about Grafana is that it can be used with a variety of data sources (from AWS CloudWatch to a simple SQL database), meaning you can create dashboards that link data from multiple environments (even your business environments if needed).
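
A minimal docker-compose sketch for that stack (image tags left at "latest" for brevity; Promtail and the Prometheus scrape configuration are left out):

    # docker-compose.yml -- a minimal monitoring stack sketch
    services:
      grafana:
        image: grafana/grafana:latest
        ports:
          - "3000:3000"        # Grafana UI
      prometheus:
        image: prom/prometheus:latest
        volumes:
          - ./prometheus.yml:/etc/prometheus/prometheus.yml   # your scrape config
        ports:
          - "9090:9090"
      loki:
        image: grafana/loki:latest
        ports:
          - "3100:3100"

Running docker compose up -d then gives you Grafana on port 3000, with Prometheus and Loki ready to be added as data sources.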

You can also use your cloud provider’s solutions (AWS CloudWatch, Azure Monitor, GCP Operations suite or other tools depending on your provider).

Infrastructure as Code

A new customer in another part of the world needs you to deploy your infrastructure as close to them as possible (due to latency, regulation, etc.). How do you do that?

How do you achieve a cloud-to-cloud migration?
If you need to deploy an old version of your solution, how do you make sure you remember which infrastructure components to deploy?

Infrastructure as code (IaC) is the process of managing and provisioning infrastructure, such as compute power or network components through configuration files (often YAML/JSON), rather than physical hardware configuration or interactive configuration tools. After writing those configuration files, you can deploy them as much and as often as you want. These files need to be versioned in your source control solution so you can review changes and go back in time.

Tools that can help you on this quest are Terraform (provisioning cloud-agnostic resources through its own configuration language, HCL), CloudFormation (Amazon Web Services), Azure Resource Manager (Azure) and Google Cloud Deployment Manager (Google Cloud Platform).
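
As an illustration, here is a minimal CloudFormation sketch (the bucket name is hypothetical) that provisions a single, versioned S3 bucket; the other tools express the same idea in their own syntaxes:

    # template.yml -- a minimal CloudFormation sketch (hypothetical bucket name)
    AWSTemplateFormatVersion: "2010-09-09"
    Description: Storage for application artifacts
    Resources:
      ArtifactBucket:
        Type: AWS::S3::Bucket
        Properties:
          BucketName: my-app-artifact-bucket
          VersioningConfiguration:
            Status: Enabled

Deploying it (for example with aws cloudformation deploy --template-file template.yml --stack-name artifacts) is repeatable: run it ten times and you get the same bucket, not ten of them.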

Before starting to automate your infrastructure provisioning and investing time in it, it is important to make sure the infrastructure itself is reliable and sustainable.

Containers and Virtualization

For companies developing software that runs on servers, containers are an extremely useful solution when it comes to improving an existing piece of software. You simply describe all the steps your program needs to run (dependencies, environment variables, etc.) and then build your image. It can then be run and managed through various orchestration tools such as Docker Compose or Kubernetes, which help launch containers, scale them (in or out), and manage their network rules, ports, etc.
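With Kubernetes, for example, scaling your containerized app in or out comes down to changing one number in a Deployment (the image name below is hypothetical):

    # deployment.yml -- a minimal Kubernetes Deployment sketch
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web
    spec:
      replicas: 3                  # scale out or in by changing this number
      selector:
        matchLabels:
          app: web
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
            - name: web
              image: registry.example.com/web:1.4.2   # hypothetical image
              ports:
                - containerPort: 8080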

Containers make it incredibly easy to move from on-premises to the cloud, or even from one cloud to another. They are also easy to deploy (which can simplify your deployment automation), and with them you can start enjoying the true benefits of public cloud providers, as they offer solutions to run containers on nodes or serverless (I'm thinking of AWS Fargate, Azure Container Instances or GCP Cloud Run).

Do not forget about security

My final words: improving the speed and efficiency of your software development should never come at the expense of security. For a small business, a security incident can cause very serious harm to the company. It is important to prioritize security throughout the entire DevOps process, from development to deployment. That is why we are starting to hear about DevSecOps, which integrates security into the different phases of a project.

I gave you examples of problems, and the solutions I chose to overcome them. In my past experience, this process took almost two years, and it's never-ending. Remember what was said at the beginning of the article: start small and scale up gradually.

Feel free to share your thoughts in the comments section.

If your situation needs to be investigated more seriously, you can definitely reach out to me: victor.fonne@neoxia.com, visit our website https://neoxia.com/ or visit one of our offices in Montréal, Paris, Bordeaux, Grenoble and Pau.
