How Up Hail Used AWS to Evolve from a Side Project to a Business
By Avi Wilensky, Founder, Up Hail
Our Up Hail web app lets you quickly compare the costs and options of ground transportation services, including Uber, Lyft, taxis, buses, and bikeshares. You simply enter your origin and destination and click a button, and Up Hail shows you the best deals.
Up Hail started two years ago as a small Flask app on a simple virtual machine. We set out to answer two simple questions: Where do the major ride-hailing and ride-sharing services operate, and which one is the least expensive? Fast-forward two years, and millions of people have used the app to get fare quotes and to discover new, related services. Tens of thousands of new users sign up and create accounts each month.
During our transition from a scrappy side project to a revenue-producing online property, we realized that we needed a cloud infrastructure that could handle our growth while still giving us the flexibility to experiment and improve. In this post, I discuss how our infrastructure needs changed as we evolved from the lean prototype phase to serving millions of visitors in production, and how we used AWS to help us get there.
Our Early Challenges
In true “lean-startup” fashion, we began by writing and shipping a semi-operable, bug-ridden prototype: a simple Flask app written in Python, MongoDB, and a very Bootstrap-looking front end. Our only investment besides time was the cost of a domain name and a $10/month hosting plan. Life began on a single virtual machine at a low-cost hosting provider. It was so simple that when the machine crashed, we restarted it manually. Outages were frequent, and most were the result of us doing stupid things and breaking machines in production. 99% uptime was good enough, or so we thought at the time, and infrastructure was a complete afterthought.
The beauty of launching lean is that the cost of failing is minimized. A web app is a quick and inexpensive way to test an idea in the wild. If it flops, or no one finds the product useful, you can move on to something else. And that’s precisely what we did. We shipped, let it sit in the wild, and forgot about it. We went back to our normal day jobs servicing clients and looking for new problems to solve. For six months, our virtual machine sat on the Internet neglected and collecting dust (except for the manual restarts when it crashed). Any new users who happened across the site came through Google searches or word of mouth, and we didn’t spend a cent on acquisition.
The downside of launching lean is that you need a bit of luck for your test app to become a successful product. It turned out that one of the users who stumbled across Up Hail liked the product and found it useful. He blogged about it, and we accidentally found our way onto Mashable’s The 11 most useful web tools of 2014 roundup. Other major media outlets followed suit, along with social chatter on Twitter, Facebook, and Hacker News. Product validation! Back to work.
A review of our Google Analytics data showed that people were coming and traffic was growing. Over 1,000 users a day were running searches on our prototype. Display ads and affiliate links were peppered in, and a revenue stream was born. The single condition of a business is a paying customer. We had our first paying customer, Google, paying us for clicks and impressions in the Google Display Network.
Our transition from side project to income producer changed the stakes of the game. Downtime from a crashed web server now meant loss of revenue. We needed to maintain better uptime and reinvest revenue in a more resilient cloud infrastructure. We needed an infrastructure that would keep us online when we broke things. We needed better insight as to why our virtual machines broke, and a better monitoring and alerting system.
Using AWS to Expand Our Business
We stumbled across a free workshop by AWS called “Architecting Highly Available Applications on AWS” at the nearby AWS Pop-up loft in NYC. We spent a full day learning about and building resilient and elastic systems, using free demo accounts and open-source tools. We learned about Chaos Monkey and Bees with Machine Guns. We soon decided that AWS was our best option for our cloud infrastructure.
The first step was decoupling our monolithic application and separating responsibilities across several virtual machines (also known as instances), thereby creating a multi-tier (or n-tier) architecture. Our app and database servers were deployed onto independent hosts, with security groups in place that defined communication rules between them. Our app servers now live in an Auto Scaling group, with a minimum of two production machines running simultaneously at all times. In case of a traffic spike, new virtual machines spin up in real time in response to demand, which we recently witnessed after Up Hail was featured on the homepage of Gizmodo.
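As a rough sketch of what that looks like in code (the group name, launch configuration, zones, and sizes here are illustrative, not our actual settings), an Auto Scaling group with a minimum of two running app servers can be defined with boto3, the AWS SDK for Python:

```python
def asg_request(name, launch_config, zones, min_size=2, max_size=8):
    """Build the create_auto_scaling_group request for a group that
    always keeps at least min_size app servers running."""
    return {
        "AutoScalingGroupName": name,
        "LaunchConfigurationName": launch_config,
        "AvailabilityZones": zones,
        "MinSize": min_size,       # never fewer than two production machines
        "MaxSize": max_size,       # ceiling when a traffic spike hits
        "HealthCheckType": "ELB",  # replace instances the load balancer marks unhealthy
        "HealthCheckGracePeriod": 300,
    }

params = asg_request("app-servers", "app-launch-config",
                     ["us-east-1a", "us-east-1b"])
# With credentials configured, this one call creates the group:
# import boto3
# boto3.client("autoscaling").create_auto_scaling_group(**params)
```

Spreading the group across two Availability Zones means a zone-level outage still leaves at least one app server taking traffic.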
We use an Elastic Load Balancing (ELB) load balancer to distribute the load across the cluster. We moved our static assets to Amazon S3, and we use Amazon CloudFront to serve our content for faster delivery to our users. Additionally, we configured Amazon CloudWatch alarms to send SMS and email notifications to our team should anything become unreachable. We created Lambda functions for backups, snapshots, and automated testing. We created security groups to ensure that our databases are not accessible via the public Internet and only whitelisted IP addresses can access our internal infrastructure.
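One such alarm can be sketched as follows with boto3’s `put_metric_alarm` parameters (the alarm name, load balancer name, and SNS topic ARN are placeholders, not our real resources): it fires whenever the load balancer reports any unhealthy app server for two consecutive minutes, and the SNS topic fans the notification out to SMS and email subscribers.

```python
def unhealthy_host_alarm(load_balancer_name, sns_topic_arn):
    """Parameters for a CloudWatch alarm on the ELB's
    UnHealthyHostCount metric."""
    return {
        "AlarmName": "app-unhealthy-hosts",
        "Namespace": "AWS/ELB",
        "MetricName": "UnHealthyHostCount",
        "Dimensions": [{"Name": "LoadBalancerName",
                        "Value": load_balancer_name}],
        "Statistic": "Maximum",
        "Period": 60,            # evaluate the metric every minute
        "EvaluationPeriods": 2,  # two consecutive breaches before alarming
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # SNS topic delivers SMS and email
    }

alarm = unhealthy_host_alarm(
    "app-elb", "arn:aws:sns:us-east-1:123456789012:ops-alerts")
# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**alarm)
```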
In hindsight, if we could do it all over again we would architect our solution from the beginning on a cloud platform like AWS. For example, as a startup we could have used the AWS Free Tier to take advantage of the cloud infrastructure while keeping our costs down. That would have saved us the pain of having to decouple our monolithic architecture later.
In the beginning, we had a lot of downtime. It would have been useful to have a monitoring service like Amazon CloudWatch to send notifications when things went wrong. We also had a very manual deployment process, which we learned could have been simplified by using a service such as AWS Elastic Beanstalk.
Despite all this new, heavy-duty infrastructure, we still break machines and our application from time to time. We have learned from Werner Vogels that “everything fails, all the time.” However, with AWS we no longer have to worry, because our issues stay behind the scenes and don’t impact our users. Even if a natural disaster causes one of our Availability Zones to fail, replication in another zone keeps our application online. Our ELB load balancer runs continuous health checks and will route users to a working machine, decommissioning a broken one. Auto Scaling will automatically boot up a new healthy instance.
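The health-check settings behind that routing can be sketched like this for a classic ELB (the `/health` endpoint and the exact thresholds are illustrative assumptions): the load balancer probes each instance’s HTTP endpoint on a fixed interval and pulls an instance out of rotation after repeated failures.

```python
def elb_health_check(path="/health"):
    """Health-check settings for a classic ELB."""
    return {
        "Target": f"HTTP:80{path}",  # expect an HTTP 200 from this URL
        "Interval": 30,              # seconds between probes
        "Timeout": 5,                # seconds before a probe counts as failed
        "UnhealthyThreshold": 2,     # failures before the instance is pulled
        "HealthyThreshold": 3,       # successes before it gets traffic again
    }

check = elb_health_check()
# import boto3
# boto3.client("elb").configure_health_check(
#     LoadBalancerName="app-elb", HealthCheck=check)
```

With these numbers, a broken instance stops receiving traffic within about a minute, while Auto Scaling works on replacing it in the background.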
We now treat our instances as cattle instead of pets, as they should be. And we still have a long way to go to reach the scale that our infrastructure is designed to handle. But now we’re ready. Bring it on.