Hacktober at eXp: A Refreshing Experience

An exam, a pandemic, and an idea.

Liam Hanninen
eXpSoftwareEngineering
6 min read · Jul 18, 2022


We complained our way through Advanced Mendix Exam prep. ‘We’ being myself and six fellow software engineers at eXp Realty. Leading up to the exam, we waded through microflow, Team Server, and namespace caveats and edge cases. We wavered between mild confidence and bouts of frustration. But through it all we managed to have some fun and learn a lot.

eXp is a fully remote real estate company, and most of us had never met in person. So during those weeks leading up to the exam we agreed we should ‘get together (in person) sometime’. You know how that goes: “After Covid…”. The sentiment, genuine. The timeline… very flexible.

So we waited and saw COVID cases start to drop in the spring of 2021. We started to plan the where, when, and how. Perhaps we could plan a business-related effort around this. Maybe we could involve other teams. What would this social experiment become? But like many a wedding, cruise, and family reunion, we too had to cancel our gathering as the Delta variant peaked in the US. An in-person meet was not in the cards. So we regrouped and pivoted to a virtual hackathon.

A project worth hacking.

Even before it went virtual, tech leaders were supportive of an in-person event. They were eager to encourage a pseudo-social event that still provided value to the business. We just needed to nail down a project that would align with their goals. Scope and value were the two guiding factors: whatever we picked had to be (mostly) achievable in a week of development and had to be valuable to the business.

Working with leadership, we landed on bolstering Refresh Week tooling. Refresh Week is a Tech Team-wide effort to populate all lower environments with production data. It involves a code freeze, manual snapshot restoring, monitoring, and setting up configurations. It often becomes a week (or two) of ‘hurry up and wait’: time-wasting monitoring, duplicated or missed communication, and tedium. The goal became to build an interface for DevOps to interact with and monitor Refresh Week restores. In addition, we wanted to build a notification suite to streamline post-restore activities. The interface, notifications, and restore-triggering mechanism would be built in Mendix, while an Amazon Web Services (AWS) Step Function would monitor progress and send updates back to the Mendix app.

The Refresh App was the front end and back end. An AWS Step Function monitored and reported on progress asynchronously.

Order snacks and clear your calendars!


To fuel us for the coming week we were given a stipend to order snacks and beverages from a snack site. This was handy since everyone has different snack cravings. We could choose our own combination of salty, savory, caffeinated, carbonated, or sun-dried snacks and drinks.

My snack ‘crate’. Yes, that’s a Golden Girls mug. ‘Stay Golden’ Betty.

Besides the snacks, one of the biggest changes was to our schedules. We are full-time employees working on ongoing projects, and these projects would not wait for us. Given several weeks’ notice and a nudge from leadership, most teams were able to accommodate the brief but total loss of a team member. Of course, the world kept spinning, which meant things still broke outside of our protected, meeting-free hackathon sanctuary. So we did occasionally lose members during the hackathon to metaphorical, software-based fires or important meetings.

Nonetheless, the team was consistent, reliable, smart, creative, and passionately curious. We had the mental space and time to discuss edge cases, potential ‘wow factors’, and alternative or new ideas. There were times that were heads-down development, and there were times that were quintessential pair-programming. There were roadblocks and breakthroughs, there was confusion and clarity. Most importantly there were snacks.

Let’s get technical.

The description below hardly does justice to all of the work the developers and DevOps did: not just the feature work itself but also the exception handling, edge-case consideration, and code rewrites. It also doesn’t detail the timely, persistent, and thorough testing by QA. Testing was especially important since we had spent the entire week trying to develop thoughtfully yet quickly. Here is a brief overview of the key technical aspects of the now-named Refresh App.

The main dashboard has up-to-date progress info of ongoing and completed restores.

The Refresh App provides DevOps with a table, listing all available apps in our Mendix ecosystem. The apps, their available environments, and backup snapshots are populated onto a view within the Refresh App using the Mendix Deploy API. DevOps selects any one of the apps to begin a short workflow, first selecting which environment(s) to restore, and then which snapshot to use.

Finally, they trigger the restore, which makes three API calls:

  1. One to the Mendix Deploy API
  2. One to an AWS API Gateway
  3. One to Jira’s Agile Server REST API

Here are the juicy details for each of those:

(1) Triggers the actual restore using the Mendix Deploy API. When you host your infrastructure in the Mendix cloud, the Deploy API can be used to programmatically interact with your applications. We use it to perform the restores. Once a restore is triggered, the Deploy API provides status updates on it, and tracking those updates is the step function’s job.

(2) Kicks off an AWS Step Function that monitors the restores. A Lambda (serverless code) within the step function polls Mendix’s Deploy API every two minutes. It sends an update back to the Refresh App each time until the restore completes or fails.
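The poll-and-report loop described in (2) can be sketched roughly as follows. This is a minimal illustration in plain Python, not our actual Lambda or step function definition: `monitor_restore`, the `fetch_status` and `report` callables, and the status names are all hypothetical stand-ins for the Deploy API call and the callback into the Refresh App.

```python
import time
from typing import Callable, Dict

# Assumed status names; the real Deploy API values may differ.
TERMINAL_STATUSES = {"completed", "failed"}


def monitor_restore(
    fetch_status: Callable[[], str],
    report: Callable[[Dict[str, object]], None],
    interval_seconds: float = 120.0,  # "every two minutes"
    max_polls: int = 720,             # hypothetical safety cap (about a day)
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a restore until it completes or fails, reporting every poll."""
    for attempt in range(1, max_polls + 1):
        status = fetch_status()  # stand-in for the Deploy API status call
        report({"attempt": attempt, "status": status})  # update the app
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval_seconds)
    return "timed_out"
```

Passing `sleep` in as a parameter keeps the loop easy to exercise without actually waiting two minutes between polls. In a real step function the waiting would more likely live in the state machine itself rather than inside the Lambda.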

(3) Triggers the creation of Jira cards populated with dynamic information about the particular restores. The cards include important timestamps and details denoting which app and environments are being restored. When the restore is complete, the assignee is changed to the primary contact (as designated in the user configuration interface). Any number of additional contacts can be added, which tags them in the Jira card.
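To give a feel for the "dynamic information" in (3), here is a rough sketch of how a card's content might be assembled. Everything in it is illustrative: `build_restore_card`, the `REFRESH` project key, the `Task` issue type, and the sample values are made up, and the nested `fields` layout is only assumed to resemble a Jira issue-creation request body, so check it against the Jira documentation for your version before relying on it.

```python
from datetime import datetime, timezone
from typing import Dict, List


def build_restore_card(
    app_name: str,
    environments: List[str],
    snapshot_id: str,
    triggered_at: datetime,
    project_key: str = "REFRESH",  # hypothetical project key
) -> Dict[str, object]:
    """Assemble the card content for one restore (assumed payload shape)."""
    env_list = ", ".join(environments)
    description = "\n".join([
        f"App: {app_name}",
        f"Environments: {env_list}",
        f"Snapshot: {snapshot_id}",
        f"Restore triggered at: {triggered_at.isoformat()}",
    ])
    return {
        "fields": {
            "project": {"key": project_key},
            "summary": f"Restore {app_name} ({env_list})",
            "description": description,
            "issuetype": {"name": "Task"},  # assumed issue type
        }
    }


# Illustrative values only.
card = build_restore_card(
    "Example App",
    ["test", "acceptance"],
    "snapshot-123",
    datetime(2021, 10, 4, 9, 30, tzinfo=timezone.utc),
)
```

Keeping the payload construction separate from the HTTP call makes it straightforward to check what a card will say before anything is sent to Jira.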

On completion of the restore, an email and/or text is sent to the primary and additional contacts. In the meantime, the main dashboard is updated continuously, so DevOps and QA can see all restores, along with their status, duration, and timestamps, in one place.

Was it worth it?


For us it was a fun, fresh challenge and a break from our normal routine. But what were the benefits for the tech team at large? The actual Refresh Week, which happens quarterly, loomed as we wrapped up the well-timed Refresh App hackathon. In the past, Refresh Weeks (the restores and subsequent work) often lasted more than 7 days. The one after the hackathon lasted 5, with all restores completed after 2. We can’t take all of the credit, since two of our largest apps were being restored in a limited capacity, which reduced the risk of days-long restores. But we can take credit for the ease and speed with which DevOps was able to monitor and handle all 28 restores, many of which ran for several hours, in a single dashboard.

We now have visibility into the history of restores (moving forward) and their durations, so in the future DevOps can tell when a restore is taking longer than expected. The automated paperwork via Jira and the automated notifications via email and text cut down on downtime and busywork. They also streamlined communication with the developer leads who had to coordinate post-restore work. These combined efficiencies contributed to the shortest and smoothest Refresh Week we’ve had in two years.
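As a sketch of how recorded durations could flag a restore that is taking longer than expected, one simple option is to compare the elapsed time against the median of past restores for the same app. This is an illustration of the idea rather than something the Refresh App is stated to do: `is_running_long` and the 1.5x tolerance are hypothetical, and the numbers in the usage note are invented.

```python
from statistics import median
from typing import Sequence


def is_running_long(
    elapsed_minutes: float,
    past_durations_minutes: Sequence[float],
    tolerance: float = 1.5,  # hypothetical: flag at 1.5x the typical duration
) -> bool:
    """Return True when a restore has outlasted its historical norm."""
    if not past_durations_minutes:
        return False  # no history yet, so nothing to compare against
    return elapsed_minutes > tolerance * median(past_durations_minutes)
```

For example, with past restores of 90, 100, and 110 minutes, the threshold works out to 150 minutes, so a restore at 120 minutes would not be flagged while one at 200 minutes would. The median is used here so that a single unusually slow restore in the history does not skew the baseline.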

What’s Next?

This app is part of two future efforts. The first is continued automation of Refresh Week based on user feedback, including adjustments and additional features. Before the next Refresh Week we also hope to make further strides in automating post-refresh activities. Secondly, it has started a discussion around in-house CI/CD tooling. We have the foundation of what could become a broader suite of tools to support the automation of our entire deployment pipeline.
