IoT to AIML infrastructure-as-code. Deployment from 3 weeks to 3 hrs!

SpiralData · Published in CodeX · 5 min read · Aug 5, 2022

In today’s article we will take you through the journey of our infrastructure deployment process, where (as the title suggests) we went from 3 weeks to 3 days to 3 hours per deployment using AWS Service Catalog. Our IoT to AIML platform as a service is battle-hardened across water and defence projects and enshrines best practice, so data scientists can create decision advantage, fast.

The Beginning: 3 weeks (2019)

In the beginning, deploying our IoT to AIML reference architecture was, to say the least, a tedious process. Even though we had modularised the infrastructure, it was still a manual process of clicking through consoles, entering commands and configurations, and testing; if a service or component wasn’t functioning as expected, we would refer back to our internal environment, compare configurations, and so on.

Including end-to-end testing, this process took us 3 weeks on average to complete.

Round #1: 3 days (2021)

Just over a year ago we decided to automate the deployment process to a certain extent. This wasn’t going to be a fully fledged automation project, since we didn’t have the luxury of time or resources to dedicate to one. We were only able to allocate 5 days and a single person, who went about converting the infrastructure deployment into AWS CLI scripts.

The process was simple (at a high level): take each service implemented in the reference architecture and

  • Convert the deployment into an AWS CLI script (a minimal sketch follows this checklist)
  • Parameterise as much as possible; this allows the deployment to be customised
  • Test it, fix any issues, make enhancements
  • Convert the script into a batch file
  • Document
      ◦ Version control
      ◦ Prerequisites
      ◦ Expected inputs and outputs
      ◦ Any gotchas, like making sure the correct AWS profile and region are configured locally before executing the scripts
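
To make the checklist concrete, here is a minimal sketch of what one of those parameterised scripts might look like. The profile and region check reflects the gotcha above; the S3 and IoT resources and every name in it are illustrative assumptions rather than our actual scripts (ours were also wrapped as batch files, whereas a bash version is shown here for brevity).

    #!/usr/bin/env bash
    # Minimal sketch of a parameterised AWS CLI deployment script (illustrative only).
    set -euo pipefail

    # Expected inputs -- parameterised so the deployment can be customised per project
    PROFILE="${1:?usage: deploy_ingestion.sh <aws-profile> <region> <project-prefix>}"
    REGION="${2:?region is required}"
    PREFIX="${3:?project prefix is required}"

    # Gotcha check: confirm which account and region we are about to deploy into
    ACCOUNT=$(aws sts get-caller-identity --profile "$PROFILE" --query Account --output text)
    echo "Deploying to account $ACCOUNT in $REGION with prefix $PREFIX"

    # Example service 1: a landing bucket for raw device data
    # (omit the location constraint when deploying to us-east-1)
    aws s3api create-bucket \
      --bucket "${PREFIX}-iot-landing" \
      --create-bucket-configuration LocationConstraint="$REGION" \
      --profile "$PROFILE" --region "$REGION"

    # Example service 2: a thing type for the project's sensors
    aws iot create-thing-type \
      --thing-type-name "${PREFIX}-sensor" \
      --profile "$PROFILE" --region "$REGION"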

After a couple of services were converted using the above checklist, the process was clear. We achieved significant time savings, cutting the reference architecture deployment time from 3 weeks down to 3 days.

Round #2: 3 hours (2022)

Over the past 12 months our IoT to AIML reference architecture has gone through multiple iterations of updates (major updates are coming soon), and looking at potential risks we identified the following:

  • Along with the increased number of services, the architecture has grown in complexity
  • An increased number of configuration steps
  • Requirements for enhanced security, monitoring and management
  • Multiple projects where each customer requires multiple environments
It was time to take another crack at improving our process. This time it’s not just about cutting the deployment time; it’s about the ability to manage fixes, changes and enhancements to the infrastructure. We looked into infrastructure as code (IaC) and the options available, and identified AWS Service Catalog as a solution. It has been available for a number of years and had matured enough for our requirements.

Here’s what we learnt along the way.

  • The basics — understanding the basics can save you hours or even days of running around in circles trying to get things to work. AWS’s online content, including many YouTube channels, provides great material on AWS Service Catalog.
      ◦ Service Catalog is a service that uses CloudFormation to deploy infrastructure as code (IaC)
      ◦ Product — a CloudFormation stack that can deploy one or more services (a CLI sketch follows this list)
      ◦ Language — YAML or JSON. We recommend YAML: apart from being fully interchangeable with JSON, it supports comments, which can be used to break down the sections and explain what each one does. This is critical for future reference
      ◦ Versions — each product can have multiple versions, used to release fixes, enhancements and updates
      ◦ Support details — used to provide information regarding the product: support email, links, etc.
      ◦ Portfolio — a collection of products
      ◦ Constraints — apply limits, governance and cost controls
      ◦ Tags — can be used to store metadata regarding each service; tags can also be used to break down cost at a detailed level
      ◦ Sharing — control access by sharing the portfolio with other AWS accounts, OUs, groups, roles and users
  • Sandbox environment — it’s key to have an isolated environment to test out what you have learnt, by setting up a separate AWS account. Once you are done with the testing, terminate all the services and, if required, close the account.
      ◦ Get it working — quick and dirty, some might say
      ◦ Worry about cleaning up the code, adding tags, etc. after the primary code is working as expected
      ◦ Try out multiple test scenarios
      ◦ Terminate everything and try again until you are happy with the outcome
      ◦ Don’t forget to update the documentation as you go along
  • Product home — the environment where the product (the reference architecture in our case) is configured. Any updates, fixes and enhancements are tested in the sandbox environment and then implemented in the Product home before being distributed to other internal or customer environments.
  • Deployment test environment — where the final deployment of the product is tested. Once the code is cleaned up and set up in the Product home, it’s shared with the deployment test environment, where the entire architecture is deployed end-to-end. This exposes any issues with the deployment process, documentation, etc. Once the deployment is complete, you carry out end-to-end testing, which finalises the entire process.
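
To ground the terminology above, here is a sketch of how a portfolio and a versioned product might be registered from the AWS CLI. The template URL, product and portfolio names, tag, support email and target account ID are all placeholders, and this is a simplified illustration rather than our actual setup.

    #!/usr/bin/env bash
    # Sketch: register a portfolio and a versioned product in AWS Service Catalog (illustrative only).
    set -euo pipefail

    # Portfolio -- a collection of products
    PORTFOLIO_ID=$(aws servicecatalog create-portfolio \
      --display-name "IoT-to-AIML-Reference-Architecture" \
      --provider-name "SpiralData" \
      --description "Reference architecture portfolio" \
      --query 'PortfolioDetail.Id' --output text)

    # Product -- a CloudFormation stack (a YAML template stored in S3) deploying one or more services
    PRODUCT_ID=$(aws servicecatalog create-product \
      --name "iot-ingestion" \
      --owner "SpiralData" \
      --product-type CLOUD_FORMATION_TEMPLATE \
      --support-email "support@example.com" \
      --provisioning-artifact-parameters 'Name=v1.0,Description=Initial release,Info={LoadTemplateFromURL=https://example-bucket.s3.amazonaws.com/iot-ingestion.yaml},Type=CLOUD_FORMATION_TEMPLATE' \
      --tags Key=project,Value=iot-aiml-reference \
      --query 'ProductViewDetail.ProductViewSummary.ProductId' --output text)

    # Associate the product with the portfolio
    aws servicecatalog associate-product-with-portfolio \
      --product-id "$PRODUCT_ID" --portfolio-id "$PORTFOLIO_ID"

    # Versions -- later fixes and enhancements go out as new provisioning artifacts, e.g.
    # aws servicecatalog create-provisioning-artifact --product-id "$PRODUCT_ID" \
    #   --parameters 'Name=v1.1,Info={LoadTemplateFromURL=...},Type=CLOUD_FORMATION_TEMPLATE'

    # Sharing -- make the portfolio visible to another environment (account ID is a placeholder)
    aws servicecatalog create-portfolio-share \
      --portfolio-id "$PORTFOLIO_ID" --account-id 111122223333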

This entire exercise took us around 10–12 days to complete and we managed to bring our deployment time from 3 days down to 3 hours.
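
On the consuming side (a deployment test or customer environment the portfolio has been shared with), bringing up a piece of the architecture then boils down to launching the relevant product. The names and parameters below are placeholders, and the snippet assumes the portfolio share has already been accepted and the calling principal granted access to the portfolio:

    # Sketch: launching a shared product from a consuming account (all names are placeholders).
    aws servicecatalog provision-product \
      --product-name "iot-ingestion" \
      --provisioning-artifact-name "v1.0" \
      --provisioned-product-name "customer-a-dev-iot-ingestion" \
      --provisioning-parameters Key=ProjectPrefix,Value=customer-a-dev

Repeating that call (or scripting it) for each product in the portfolio is what brings a full environment up in hours rather than days.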

Key benefits and takeaways:

  • Fast deployment (similar to a quick-start pack) of real-world-tested infrastructure
  • Reduced cost in professional services, opening up more time and budget to focus on solving data science problems for our customers
  • Updates, fixes and enhancements are received almost seamlessly
  • Keeping up to date with the latest technologies; yesterday’s methods and tools might not be good enough anymore
  • There is always a better way to do the same task; the key is identifying when to start using it
  • Always compare the effort against the benefits, and not just the technical benefits: the customer must benefit as well

Thanks for reading. Stay tuned for the next major update of our IoT to AIML reference architecture, coming soon!
