How we used QA and microservices for a seamless launch

Published in Looka Engineering · 10 min read · Jun 15, 2018

We wanted to accomplish three things with the launch…

Earlier this month, Logojoy launched a major update to our infrastructure. We replaced the old rigid data structure that we used to describe logos with a new flexible data structure. We call these data structures ingredients. They contain everything Logojoy needs to know to draw a logo. To launch the new ingredients, we updated our client app, the server, and some of our microservices. Taken together, a lot of things could have gone wrong, because the ingredients are used throughout Logojoy: updating the ingredients was like doing a blood transfusion on the app. Fortunately, the launch went smoothly and we rolled out the new ingredients unbeknownst to our users.

We went into the launch with three main goals:

Maintain continuous service
Maintain a consistent experience for users
Maintain the ability to roll back smoothly

We were able to achieve these goals (and thankfully did not have to roll back) due to two main factors. First, we performed deep QA with the entire product team on the new ingredients. Second, we leveraged our app’s microservice architecture to gracefully transition users to the new ingredients. Now, with the new ingredients in production, we have the foundation to accelerate product iteration and open up the throttle on growth.

Why we outgrew the old ingredients

Our mission with Logojoy is to make great design accessible to everyone. The old ingredients were great for getting us started, but they were getting in the way of our mission because they severely limited the variety of designs we could generate for users. We started batting around the idea of new ingredients in November 2017 after I had spent a week adding four new logo layouts to the app. It was obvious that adding new logo layouts was not scalable. Even worse, there were many layouts that were impossible to describe using the old ingredients.

When we decided to move forward with the new ingredients, we discussed what they would have to be like to give us the flexibility to achieve our growth goals. In principle, everyone on the product team was on the same page, but when it came to implementation, we had to go back and forth a few times. The most important concept was separating logo elements from their layout in the ingredients. We decided that the features property would keep all the data relating to things like text and symbols, and the layout property would describe how those features should be placed in relation to each other. This compares favorably to the old ingredients, where layouts were described only by their name, which was used to look up logo feature relationships at various points in the app. The layout property also had to be extensible so that we could represent arbitrarily complex designs with it. After weeks of prototyping, we settled on a tree structure for the layout property that satisfied these objectives. We hoped the new ingredients would make it easy to develop new logo layouts, improve training of our AI, and be more maintainable over the long term than the old ingredients.
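To make the split between features and layout concrete, here is a minimal TypeScript sketch of what such a structure could look like. Only the `features` and `layout` property names and the idea of a tree come from the description above; the field names, node types, and the `featureOrder` helper are illustrative assumptions, not the production schema.

```typescript
// Hypothetical feature shapes: the real ingredients surely carry more data.
type Feature =
  | { kind: "text"; content: string; font: string }
  | { kind: "symbol"; symbolId: string };

// The layout is a tree: leaves point at a feature by key, and groups
// place their children relative to each other.
type LayoutNode =
  | { type: "feature"; ref: string }
  | { type: "group"; direction: "row" | "column"; children: LayoutNode[] };

interface Ingredients {
  features: Record<string, Feature>;
  layout: LayoutNode;
}

// Walk the tree depth-first and collect feature keys in the order visited.
function featureOrder(node: LayoutNode): string[] {
  if (node.type === "feature") {
    return [node.ref];
  }
  let keys: string[] = [];
  for (const child of node.children) {
    keys = keys.concat(featureOrder(child));
  }
  return keys;
}

// A symbol on the left, with the name stacked above the slogan on the right.
const example: Ingredients = {
  features: {
    name: { kind: "text", content: "Acme", font: "Lato" },
    slogan: { kind: "text", content: "Built to last", font: "Lato" },
    icon: { kind: "symbol", symbolId: "symbol-123" },
  },
  layout: {
    type: "group",
    direction: "row",
    children: [
      { type: "feature", ref: "icon" },
      {
        type: "group",
        direction: "column",
        children: [
          { type: "feature", ref: "name" },
          { type: "feature", ref: "slogan" },
        ],
      },
    ],
  },
};

console.log(featureOrder(example.layout)); // [ 'icon', 'name', 'slogan' ]
```

Because groups can nest to any depth, a new arrangement is just a different tree over the same features, which is the kind of extensibility a name-based layout lookup cannot offer.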

How we QA’d the launch

The night we deployed the new ingredients, one of our devs and I were laughing at how we were rolling them out to all our users at the same time. With all our previous releases, we had staggered the rollout. In many ways, the new ingredients are more complicated and wind more intricately through the app than previous releases (e.g. when we updated the editor or launched our GoLang API). Further, once a user’s logo has been converted to the new ingredients, we don’t want to convert it back to the old ingredients. We were confident rolling out the new ingredients to everyone at once because we’d spent the previous week preparing for it.

Just over a week before launch, we put the new ingredients on staging for initial testing. This was followed by three days of chipping away at obvious breaking bugs and browser issues. We tested checkout flows to make sure that users could purchase (the most important thing); we tested fulfillment to make sure that after a user makes a purchase, they receive the right assets; and, most of all, we tested the editor.

The editor is where users spend most of their time in Logojoy. It has something like 1 million interactions a day with users, and an average customer spends 80 minutes in the editor before purchasing. Our editor also helps people get what they want quickly: 60% of our customers purchase their logo the same day. All this emphasizes not only how important the editor is to Logojoy, but also how rich it is. While this is a major boon to our users, it makes QA time-consuming, to say the least.

**top**: editor view of logo; **bottom**: symbol, layout, and font variations on logo

While QAing the editor, we wrote a library to migrate old ingredients logos to the new ingredients so existing users could take advantage of the benefits of the new ingredients. This meant that in addition to testing the editor with the new ingredients, we tested everything with old logos migrated to the new ingredients as well. Most of our users convert same day, but about 10% of users spend more than a week working on their logos, so it’s essential we continue to give them a great experience.
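A migration of this kind can be sketched as a pure function from the old shape to the new one. This is only an illustration under stated assumptions: the old format's fields, the layout names `symbol-left` and `symbol-top`, and the `migrate` function are all hypothetical, since the article does not show the real library.

```typescript
// Hypothetical old format: the layout is only a name that the app has to
// interpret elsewhere.
interface OldIngredients {
  layoutName: string;
  name: string;
  slogan?: string;
  symbolId?: string;
}

type LayoutNode =
  | { type: "feature"; ref: string }
  | { type: "group"; direction: "row" | "column"; children: LayoutNode[] };

interface NewIngredients {
  features: Record<string, { kind: "text" | "symbol"; value: string }>;
  layout: LayoutNode;
}

// Assumed mapping from old layout names to how the symbol sits next to the text.
const DIRECTION_BY_LAYOUT_NAME: Record<string, "row" | "column"> = {
  "symbol-left": "row",
  "symbol-top": "column",
};

function migrate(old: OldIngredients): NewIngredients {
  // Fail loudly on unknown layouts rather than guessing a conversion.
  if (!Object.prototype.hasOwnProperty.call(DIRECTION_BY_LAYOUT_NAME, old.layoutName)) {
    throw new Error("No migration for layout: " + old.layoutName);
  }
  const direction = DIRECTION_BY_LAYOUT_NAME[old.layoutName];

  const features: NewIngredients["features"] = {
    name: { kind: "text", value: old.name },
  };
  const textChildren: LayoutNode[] = [{ type: "feature", ref: "name" }];
  if (old.slogan !== undefined) {
    features["slogan"] = { kind: "text", value: old.slogan };
    textChildren.push({ type: "feature", ref: "slogan" });
  }
  // The name and slogan are always stacked in this sketch.
  const textBlock: LayoutNode = { type: "group", direction: "column", children: textChildren };

  if (old.symbolId === undefined) {
    return { features, layout: textBlock };
  }
  features["symbol"] = { kind: "symbol", value: old.symbolId };
  return {
    features,
    layout: {
      type: "group",
      direction,
      children: [{ type: "feature", ref: "symbol" }, textBlock],
    },
  };
}

const migrated = migrate({ layoutName: "symbol-left", name: "Acme", symbolId: "symbol-123" });
console.log(JSON.stringify(migrated.layout));
```

Keeping the migration a pure function makes it easy to run every old logo through it in tests, which matches the approach of QAing migrated logos alongside freshly generated ones.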

After three full days of fixing the obvious bugs, we invited the rest of the product team to run through the app and give us their feedback. We opened up an Asana board for QA and waited for the tickets to come in. Inevitably, new bugs surfaced; as one dev said to me in a Slack message, “such is life.”

One bug that comes to mind is that we had a function in one of our key libraries that caused the site to not load on older versions of some browsers. We hadn’t gone through a deep browser test yet, so we hadn’t picked it up. It was funny how after months of work, as soon as we shared the staging site with the product team, the first person to log on shouted out for everyone to hear, “the page doesn’t load!” Fortunately, it was a quick fix, and we had the staging site working for everyone 15 minutes later.

With the new ingredients up on staging, we worked with our designers to make the logos look amazing. We analyzed things like what symbol alignments work with what logos, how to size different elements, and what color schemes work in certain contexts. Our logo design algorithms learn from what our users like, but we also make sure our algorithms are given a sound education in the fundamentals of design from seasoned designers.

Breaking down a logo’s spacing and sizing

The beauty of launching with microservices

We decided to launch on a Tuesday. For me, the most important thing was that the new ingredients were working well enough and I was confident we wouldn’t crash the app. Our team spent all of Tuesday working on final tweaks and finishing tickets. By this point, we had spent the last week QAing the new ingredients, and it needed to get out the door before something silly happened, like one of us deciding some library needed another refactor before launch. I was excited to get this project in production. The logos that we could create with the new ingredients looked terrific. I wanted to get these logos out into the world.

Later that day the new ingredients went live. With our microservice architecture, we adopted the strategy of keeping old and new services running simultaneously. Existing users continue to use the old services, and new users use the new services. Most users purchase their logo the same day they sign up, and almost all within a week, so we knew that by the end of the week virtually all our users would be on the new ingredients.

The Logojoy app runs on an efficient microservice architecture. We serve more than 100,000 users a week, and our AWS bill is less than $100/month (database aside). The cornerstones are our Kubernetes server cluster, our React/Redux client-side that communicates with the server using GraphQL, and a host of microservices that run on AWS Lambda functions and EC2 instances. To update the app for the new ingredients, we had to update the React/Redux app, the server, and two microservices.

The two microservices we had to update were the microservice that draws logos, and the microservice that prepares assets for purchased logos, known as fulfillments. Both run on AWS Lambda functions. Fulfillments are routed server-side and logo drawing is done client-side. We employed the same strategy in both cases. Prior to the launch, we had three stages for each of the lambdas: dev, stage, and prod. We didn’t want to update the prod lambda directly because that would lead to jarring experiences for users in the app. You could imagine a user with old ingredients sending a request to the new ingredients lambda and getting an error, interrupting their session. Further, if something went wrong, it could lead to errors for all users and it would make rolling back more awkward.

To avoid updating the production lambdas directly, we created a new stage: next. Users wouldn’t interact with the ‘next’ lambdas until we updated the React app and server. This meant we could safely test the lambdas in their production environment before sending users’ requests to them. This is particularly useful because our lambdas require permissions for other AWS services, so we could test requests and make sure all our configurations were correct before putting them live.

Our strategy of running old and new services simultaneously meant that fulfillments and drawing were rolled out in much the same way. For fulfillments, an updated server would route new and old ingredients to their appropriate lambda stage. We uploaded the server Docker image for the new ingredients early on Tuesday with no issues. For the drawing microservice, we updated the React app’s environment file to point to the new lambdas (stage: next) and deployed a new version of the app. We keep both the old and new lambdas running because users who are already on the app will continue to interact with the old lambdas, so their experience is uninterrupted. Even better, if we have to roll back for any reason, all we have to do is redeploy the last version of the React app that points to the old ingredients lambdas. Thus we’re able to achieve all our goals: no interruption of service, a consistent experience for users, and straightforward rollbacks.
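The routing idea can be sketched in a few lines of TypeScript. The stage names `prod` and `next` come from the text above; the `ingredientsVersion` marker, the `ClientEnv` shape, and the function names are assumptions made for illustration, since the article does not say how old and new ingredients are told apart or what the environment file contains.

```typescript
type LambdaStage = "prod" | "next";

interface Logo {
  id: string;
  // Assumption: some marker distinguishes old ingredients from new ones.
  ingredientsVersion: "old" | "new";
}

// Server side (fulfillments): choose the lambda stage per request, so old
// logos keep hitting the old code and new logos hit the new code.
function fulfillmentStage(logo: Logo): LambdaStage {
  return logo.ingredientsVersion === "new" ? "next" : "prod";
}

// Client side (drawing): each deployed build of the app is pinned to one
// stage through its environment config (hypothetical shape).
interface ClientEnv {
  drawingStage: LambdaStage;
}

const previousBuild: ClientEnv = { drawingStage: "prod" };
const launchBuild: ClientEnv = { drawingStage: "next" };

function drawingStage(env: ClientEnv): LambdaStage {
  return env.drawingStage;
}

console.log(fulfillmentStage({ id: "logo-1", ingredientsVersion: "old" })); // prod
console.log(fulfillmentStage({ id: "logo-2", ingredientsVersion: "new" })); // next
console.log(drawingStage(launchBuild)); // next
```

In this sketch, rolling back the drawing side means redeploying `previousBuild`; nothing about the lambdas themselves has to change, because both stages stay deployed.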

What happened when we launched

When it comes down to it, the launch was almost as easy as we hoped it would be. We launched the new ingredients at 9 pm and had customer success managers on hand in case anything went wrong. I opened an error log terminal window to watch the lambda functions and prepared myself to see crash logs. There were no crash logs, just errors. The errors weren’t bad either. For example, in the drawing lambda, some properties were undefined that in testing had never been undefined. Then we found a memory leak, also in the drawing lambda, caused by the way we were storing the users’ requests. After fixing that, we were still getting out-of-memory errors, so we doubled the memory of the lambda and that fixed those problems. By this time, it was just after 11 pm, and we were confident that everything was working well enough to call it a successful launch and get some sleep.
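The article does not describe the leak in detail, so the following is only a sketch of one common way this kind of leak arises in a Lambda function: state declared at module level survives between invocations that reuse the same warm container, so anything appended there is never released. The handler names and the request shape are hypothetical.

```typescript
interface DrawRequest {
  logoId: string;
}

// Leaky pattern: this array lives at module level, so it keeps growing for
// as long as the container stays warm.
const storedRequests: DrawRequest[] = [];

function leakyHandler(request: DrawRequest): string {
  storedRequests.push(request);
  return "drawn:" + request.logoId;
}

// Safer pattern: keep per-request data in local scope so it can be garbage
// collected once the invocation returns.
function handler(request: DrawRequest): string {
  const current = request;
  return "drawn:" + current.logoId;
}

// Simulate three invocations landing on the same warm container.
for (const id of ["a", "b", "c"]) {
  leakyHandler({ logoId: id });
  handler({ logoId: id });
}

console.log(storedRequests.length); // 3
```

Raising the memory limit, as described above, buys headroom but does not remove a pattern like this, which is why fixing the storage came first.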

The next day, after some well-deserved sleeping in, we fixed the bugs that had come up overnight, then worked through the useful but not critical things we had set aside to do after launch: improving the sizing of slogans, fixing a bug with the uppercase and lowercase buttons, and completing bug fixes on migrated logos.

Over the following days, we continued to refactor and refine the code. In that same period, we saw that conversions were up 6% on the week, and 7% on the prior week. Our customers liked the new logos that we were generating. On top of that, customer success received no complaints about bugs related to the launch of the new ingredients, and the performance of the lambdas improved, too. With the new ingredients, we reduced the number of times the lambdas had to be called, and we reduced the number of invocation errors on our drawing lambda by over 50%.

The Takeaway

Taken together, it’s easy to feel good about the launch of the new ingredients. We deployed a data structure that’s crucial to the continued growth of Logojoy, and we’re helping users make better logos than before. During the launch, we provided our users with an uninterrupted experience and received minimal complaints about launch-related bugs. We increased the conversion rate and the general performance of the app, too.

Most importantly, we’re laying the foundation for things to come so that we can tackle even bigger problems. Now to get on to making those new logo layouts.

Interested in combining AI and design? We’re looking for Software Engineers to join our product team. https://logojoy.com/careers/