UserGuiding AWS DevOps Transformation

Kubra Tahaoglu
Published in bestcloudforme
5 min read · Apr 19, 2022

Every software product needs user onboarding. But knowing where to start can be challenging, and inspiration always helps. This is why UserGuiding created a Customers page where you can see how real users are solving their user onboarding problems. UserGuiding helps you onboard your users with product walkthroughs that don’t require any coding, allowing you to deliver the right in-app experience to the right persona at the right stage of their user journey.

Customer Challenge

UserGuiding wanted to transform its infrastructure to achieve tightened security, high availability, efficient performance, and response times that do not vary with user location. To achieve these goals, the infrastructure was designed following best practices and an AWS-native approach.

The areas of tightened security were identified as user permissions, hosting of sensitive data, communication between separate networks, and so on.

For user permissions, UserGuiding needed to restrict the permissions of users who access the services according to their use cases, and to give no access to outsiders. The reason for this segmentation is to protect customer data against all kinds of attacks, whether they originate externally or internally.

For hosting sensitive data, UserGuiding required that sensitive data not be publicly accessible and that it be hosted in appropriate services and engines. In addition, only permitted users should be able to view or edit the data.

UserGuiding has a wide range of users located in different parts of the world, so they needed two separate production workloads hosted in different regions. For communication between separate networks, UserGuiding needed the networks in different regions to communicate with each other internally and at high speed at runtime.

To achieve high availability and efficient performance, UserGuiding needed a mechanism that could instantly detect performance or accessibility problems that would affect the user experience and send alerts to the teams that would intervene.

UserGuiding’s applications need to communicate with each other with low response times but run in isolated environments for availability. Moreover, the applications need to scale automatically depending on user requests and resource usage.

UserGuiding needed all CI/CD processes to be automated to eliminate the human factor. As important as it was to deploy new versions as quickly and automatically as possible, it was just as crucial to be able to roll back to the previous version in case of a problem. Therefore, a controlled process was expected in the production environment to reduce the margin of error, while a dynamic process was required to speed up testing.

Partner Solution

As Bestcloudforme, we placed the production and testing environments in separate AWS accounts. This way, we aimed to eliminate the effects of test processes on the production workload.

We used AWS IAM to restrict user access as an important layer of security. We created different groups with different service permissions and restrictions depending on the users’ use cases, then added the relevant users to these groups and made sure that no extra access had been given. In addition, we tightened console access with MFA, and tightened command line interface (aws-cli) access as well by attaching MFA-required policies to each group. We restricted root account activity by setting up alerts that inform us every time root account activity is detected. Moreover, we used AWS CloudTrail to track all user activity and stored the logs in an Amazon S3 bucket.
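To make the MFA requirement concrete, here is a minimal sketch of a policy document that denies requests made without MFA. It is an illustration under assumptions, not the policy that was actually deployed: the function name, the Sid and the allow-list of self-service actions are hypothetical. The `aws:MultiFactorAuthPresent` condition key and the `BoolIfExists` operator are standard IAM features.

```python
import json


def build_mfa_required_policy(allowed_without_mfa):
    """Return an IAM policy document that denies every action except the
    listed ones when the request was not authenticated with MFA."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAllExceptListedIfNoMFA",  # hypothetical name
                "Effect": "Deny",
                "NotAction": sorted(allowed_without_mfa),
                "Resource": "*",
                # BoolIfExists also matches requests where the key is absent,
                # such as CLI calls made with long-term access keys.
                "Condition": {
                    "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
                },
            }
        ],
    }


# Assumed allow-list: what a user needs to enroll a device and get an MFA session.
SELF_SERVICE_ACTIONS = [
    "iam:CreateVirtualMFADevice",
    "iam:EnableMFADevice",
    "iam:ListMFADevices",
    "iam:ResyncMFADevice",
    "sts:GetSessionToken",
]

policy = build_mfa_required_policy(SELF_SERVICE_ACTIONS)
print(json.dumps(policy, indent=2))
```

Attaching a document like this to each group means a user must first obtain temporary credentials with an MFA code before aws-cli calls succeed.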

We segmented the Amazon VPCs into public and private subnets according to resource use cases, and we placed the majority of the resources in private subnets to restrict access. We used security groups to secure access to the instances and services. Since we placed the resources in private subnets, we needed a VPN to access them, so we used a third-party VPN solution offered on AWS Marketplace.
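As a rough illustration of this kind of segmentation, the sketch below splits one VPC address range into equally sized subnets and marks most of them as private. The CIDR block, prefix length and public/private split are assumptions chosen for the example. The article does not state the real values.

```python
import ipaddress


def plan_subnets(vpc_cidr, new_prefix, public_count):
    """Split a VPC CIDR into equal subnets; the first `public_count`
    are treated as public and the rest as private."""
    network = ipaddress.ip_network(vpc_cidr)
    subnets = list(network.subnets(new_prefix=new_prefix))
    return {
        "public": subnets[:public_count],
        "private": subnets[public_count:],
    }


# Assumed example range: a /16 VPC split into four /18 subnets.
plan = plan_subnets("10.0.0.0/16", new_prefix=18, public_count=1)
for kind, nets in plan.items():
    print(kind, [str(n) for n in nets])
```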

To host the applications, we used container technology with Amazon EKS. Since we hosted EKS privately, we used internet-facing Application Load Balancers to serve the applications to end users and Amazon Route 53 private DNS zones to serve the applications internally. In addition, we used Amazon ECR as a private Docker registry for the application images served on EKS.

We used AWS CodeBuild projects to automate the CI/CD processes. With CodeBuild, we handled all the processes without provisioning and managing the resources needed for build and deployment. In addition, using CodeBuild for applications served on AWS managed services like EKS is beneficial for infrastructure integrity. Since a dynamically triggered CI/CD process was needed for the test environment, we used CodeBuild’s GitHub webhook triggers to achieve this.
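A webhook trigger of this kind is configured with filter groups. The sketch below builds a request shaped like the input of CodeBuild's CreateWebhook operation, so that a push to one branch starts a build. The project name and branch are hypothetical, and the filters that were actually used are not described in the article.

```python
def build_webhook_request(project_name, branch):
    """Build keyword arguments shaped like CodeBuild's CreateWebhook input.

    All filters inside one group must match for a build to start."""
    return {
        "projectName": project_name,
        "filterGroups": [
            [
                {"type": "EVENT", "pattern": "PUSH"},
                {"type": "HEAD_REF", "pattern": f"^refs/heads/{branch}$"},
            ]
        ],
    }


# Hypothetical project and branch names.
request = build_webhook_request("example-test-build", "develop")
print(request)

# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("codebuild").create_webhook(**request)
```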

To increase efficiency and reduce the margin of error, we preferred AWS managed services for critical tools. We used Amazon OpenSearch Service, Amazon ElastiCache and Amazon RDS to store data. With these services, we achieved a performance-efficient and highly available infrastructure. Since the data stored in OpenSearch and RDS is critical for the workload, we used AWS Lambda to periodically back up the OpenSearch data to an S3 bucket, and we enabled automated backups for RDS.
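One way such a periodic backup could work is through the OpenSearch snapshot API, which is an assumption here since the article does not say how the Lambda function performs the backup. The sketch below only builds the two requests involved (registering an S3 repository once, then taking a timestamped snapshot). The repository name, bucket, region and role ARN are placeholders, and a real function would also have to sign and send the requests.

```python
import json
from datetime import datetime, timezone


def repository_request(repo, bucket, region, role_arn):
    """Request that registers an S3 bucket as a snapshot repository."""
    body = {
        "type": "s3",
        "settings": {"bucket": bucket, "region": region, "role_arn": role_arn},
    }
    return "PUT", f"/_snapshot/{repo}", json.dumps(body)


def snapshot_request(repo, now):
    """Request that takes a snapshot named after the given time."""
    name = now.strftime("backup-%Y-%m-%d-%H-%M")
    return "PUT", f"/_snapshot/{repo}/{name}"


# Placeholder values for illustration only.
print(repository_request(
    "s3-backups", "example-backup-bucket", "eu-west-1",
    "arn:aws:iam::123456789012:role/ExampleSnapshotRole",
))
method, path = snapshot_request(
    "s3-backups", datetime(2022, 4, 19, 3, 0, tzinfo=timezone.utc)
)
print(method, path)  # PUT /_snapshot/s3-backups/backup-2022-04-19-03-00
```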

To improve user experience, efficiency and availability, monitoring and alerting mechanisms were a must for critical workloads. We used Amazon CloudWatch to monitor all resources hosted on AWS and created alerting mechanisms with the help of both Amazon SNS and AWS Lambda. To monitor application resources, accessible endpoints and application performance, we used the third-party tools Grafana, Prometheus and New Relic.
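The CloudWatch, SNS and Lambda chain can be pictured as follows: an alarm publishes to an SNS topic, and a Lambda function subscribed to that topic turns the notification into a readable alert. The handler below is a minimal sketch of that last step. The message format and the sample alarm are made up for the example, and forwarding the text to a chat or paging tool is left out.

```python
import json


def format_alarm(sns_message):
    """Turn the JSON body of a CloudWatch alarm notification into one line."""
    alarm = json.loads(sns_message)
    return (
        f"[{alarm['NewStateValue']}] {alarm['AlarmName']}: "
        f"{alarm['NewStateReason']}"
    )


def handler(event, context):
    """Lambda entry point for SNS events: one formatted line per record."""
    return [format_alarm(record["Sns"]["Message"]) for record in event["Records"]]


# Hypothetical event with the fields this sketch relies on.
sample_event = {
    "Records": [
        {
            "Sns": {
                "Message": json.dumps(
                    {
                        "AlarmName": "example-high-cpu",
                        "NewStateValue": "ALARM",
                        "NewStateReason": "Threshold Crossed",
                    }
                )
            }
        }
    ]
}
print(handler(sample_event, None))
```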

To meet the multi-region production workload requirement, we placed two VPCs in different regions. Some of the resources located in different VPCs need to communicate with each other, so we set up a VPC peering connection that lets these resources communicate internally. With VPC peering, we achieved a secure and faster connection between the networks.
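One practical constraint of VPC peering is that the two VPCs cannot have overlapping IPv4 CIDR blocks. A small pre-check like the one below can catch that during planning. The address ranges are assumed examples, not the ones used in this project.

```python
import ipaddress


def can_peer(cidr_a, cidr_b):
    """Return True when the two CIDR blocks do not overlap."""
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


# Assumed example ranges for two regional VPCs.
print(can_peer("10.0.0.0/16", "10.1.0.0/16"))    # True
print(can_peer("10.0.0.0/16", "10.0.128.0/17"))  # False
```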

Result and Benefits

By using AWS IAM, console and command line interface authentication was tightened. With security group definitions, internal placement of the resources and the VPN, the security requirements were met.

By using Amazon CloudWatch, SNS, Lambda and some third-party tools, we put in place the monitoring and alerting mechanisms that primarily serve the availability and efficiency needs.

By using Amazon EKS, we ensured that the system is scalable, accessible and performance efficient. Both the cluster nodes and the application pods scale based on the traffic received from end users and on resource usage. As a result, we provided UserGuiding with a system that delivers the best performance at the lowest cost.
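For pod scaling, Kubernetes' Horizontal Pod Autoscaler documents a simple rule: the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up. The article does not name the autoscaler that was used, so the sketch below is only an illustration of that rule. The replica bounds and metric values are assumptions, and the real autoscaler also applies a tolerance and stabilization windows that are omitted here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas):
    """Scale proportionally to metric/target, then clamp to the bounds."""
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))


# 4 pods at 90% average CPU with a 60% target: scale out to 6.
print(desired_replicas(4, 90, 60, min_replicas=2, max_replicas=10))  # 6
# 4 pods at 30% average CPU with a 60% target: scale in to 2.
print(desired_replicas(4, 30, 60, min_replicas=2, max_replicas=10))  # 2
```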

With the CI/CD setup, we automated the process to make it easier for UserGuiding to deploy their code or roll back to the previous version without the human factor. The time spent on building the code, uploading it to ECR and serving the application on EKS has decreased by approximately 60%.
