Autoscaling servers like a BO$$ 😎 · tl;dr 😉

So this post is more about sharing how we can achieve X with an amalgamation of Y, Z…, where X => auto-scaling aws machines and Y, Z… => tools like git and ansible.

I work at CloudFactory, on a team trying to create some meaningful work/opportunities. We are big fans of some good stuff like:

  • ansible: runs tasks/playbooks on target host(s); it's like having a simple tool that reads some tasks (simple steps describing the state you want your machines/stuff to be in). It's also like your mom cooking all the food in the kitchen with ease (simple YAML definitions, minimal headache of overdoing things and less hassle from dependency hell)
  • aws-ec2 (Elastic Compute Cloud): it's like having some simple API calls with which you spin up fully bootstrapped computers to your need in a couple of minutes; yes of course, the assumption is you have some 💰 to burn. 🤑
  • git: the stupid content tracker. It's a simple tool that helps you change anything in your code with full control/freedom in a much more collaborative way. Learn it, if you have not. Or magit it if you do Emacs.
  • sidekiq (a.k.a Simple, efficient background processing for Ruby)
  • the list is long, can't write it all now, must stop.
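To give a taste of those simple YAML definitions, here is a minimal ansible playbook sketch; the host group and package are hypothetical, just to show the shape of a task file:

```yaml
# Hypothetical playbook: bring target hosts to a desired state.
- hosts: webservers
  become: yes
  tasks:
    - name: Ensure nginx is installed
      apt:
        name: nginx
        state: present

    - name: Ensure nginx is running
      service:
        name: nginx
        state: started
```

Each task describes the state you want, and ansible figures out whether anything actually needs doing.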


At the time of the massive ddos (the 2016 Dyn cyberattack), our auto-scaling strategy was failing too. Reason: github.com was unable to resolve, since the bootstrap process involved cloning the latest deployed codebase from GitHub.

We used to bake AMIs (an AMI is like a pre-configured virtual disk on aws, ready to turn on and operate with). That means we had many things hard-coded in those frozen drives. Although the boot-up process used to configure new machines with ansible, hooked up from rc.local scripts, the whole process was itself statically trapped inside the image. So as our auto-scaling metrics started triggering the launch of new machines, they were all useless, choking on the clone failure, since resolving github.com to its IPs was not working; and yes, our machines were mostly stuck there.
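A sketch of what that rc.local-hooked bootstrap looks like, assuming a placeholder repo URL and playbook path (not our actual ones); the git clone line is exactly where the process choked when github.com stopped resolving:

```sh
#!/bin/sh
# /etc/rc.local (sketch): runs at boot on a machine launched from the baked AMI.
# The repo URL and paths below are placeholders.

# Clone the latest deployed codebase -- this is the step that chokes
# when DNS cannot resolve github.com.
git clone https://github.com/example/app.git /opt/app || exit 1

# Configure this machine with ansible against localhost.
ansible-playbook -i "localhost," -c local /opt/app/provision/site.yml

exit 0
```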

I was thinking of hardcoding the IPs into a poor man's DNS 😅 entry.

...
;; ANSWER SECTION:
github.com.    126    IN    A    ...
github.com.    126    IN    A    ...
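That poor man's DNS amounts to pinning a resolved address in /etc/hosts so lookups survive an outage. A safe-to-run sketch (the IP is a documentation-range placeholder, and it writes to a scratch copy instead of the real /etc/hosts):

```shell
#!/bin/sh
# Poor man's DNS sketch: pin a hostname to an IP in the hosts file.
# Work on a scratch copy so this is safe to run anywhere.
HOSTS_FILE=/tmp/hosts.example
cp /etc/hosts "$HOSTS_FILE" 2>/dev/null || touch "$HOSTS_FILE"

# Placeholder IP (RFC 5737 documentation range); in reality you would
# pin the address github.com currently resolves to.
echo "192.0.2.10 github.com" >> "$HOSTS_FILE"

grep "github.com" "$HOSTS_FILE"
```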

But the scale-up and scale-down were dynamic, and I would need to ansible it every time a new machine woke up. (At the time, I was not aware of the user-data section of auto-scaling 🤔; we could have sed-ed in the DNS changes, but /etc/hosts gets dynamically updated by yet another script running for other necessary reasons…). We knew a bunch of automation stuff, but still we could not sleep well that night, and the next workday at the office was like 😴, hehe.
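For the record, the user-data route I did not know about then would look roughly like this: a script the auto-scaling group injects into every new instance and runs on first boot (a sketch; the pinned IP is a placeholder standing in for the sed/echo idea above):

```sh
#!/bin/bash
# EC2 user-data sketch: executed once on first boot of each auto-scaled instance.
# Pin github.com before anything tries to clone, so a DNS outage does not
# choke the bootstrap. The IP below is a placeholder.
echo "192.0.2.10 github.com" >> /etc/hosts

# ...then continue with the normal bootstrap (clone, ansible, etc.).
```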

Feeling stuck and googling the possibilities, I got routed to an awesome blog and heard of ansible-pull for the first time. Blazed through the stuff/steps by the Lazy Geek in that post, shared it with my team and pinned it: this is what we need. Period.
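ansible-pull inverts the usual push model: instead of a control machine pushing configuration out to targets, each new machine pulls the playbook repo from git and runs it against itself, which is exactly what an auto-scaled instance needs. A sketch of the invocation (the repo URL and branch are placeholders; local.yml is ansible-pull's conventional default playbook name):

```sh
# ansible-pull sketch: a freshly booted instance provisions itself.
# -U: git repo holding the playbooks (placeholder URL)
# -C: branch to check out
# -i: inventory, here just localhost
ansible-pull \
  -U https://github.com/example/bootstrap-playbooks.git \
  -C production \
  -i "localhost," \
  local.yml
```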

And yesterday… this came true, and I am writing this post on my rooftop 😄

Auto Scaling Basic Flow Diagram

I might just be randomly beating around the bush alone, but whatever/however you think 🤔 😲 😧 😀 🙊

The context of this post is that I am puddling with:

  • aws and ec2 auto-scaling stuff
  • ansible to do my stuff
  • bootstrap scripts kept in some place like GitHub
  • an s3 bucket holding the necessary credentials, accessible only from a bunch of machines via an ~IAM role~ (an abstraction of best security practices I like about aws)
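The IAM-role piece is what keeps secrets out of the AMI: an instance launched with an instance profile gets temporary credentials from instance metadata automatically, so the AWS CLI on it can read the bucket without any baked-in keys. A sketch (bucket and key names are placeholders):

```sh
# Sketch: runs on an instance whose IAM role grants read access to the bucket.
# The AWS CLI picks up temporary credentials from instance metadata,
# so no access keys are stored in the AMI. Names below are placeholders.
aws s3 cp s3://example-bootstrap-bucket/deploy_key /root/.ssh/id_rsa
chmod 600 /root/.ssh/id_rsa
```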
Feels like you are in sync? (At least 30% or near: continue; else bail out, goodbye 👋, though some are stubborn enough to bear with me 😆)

I am thankful to the handful of people who guided/inspired/motivated me to get it done. {@kajisaap @sameergautam @arbabnazar}

This post is heavily based on the post by the Lazy Geek; if you get confused 😕 or seek the original stuff, do visit his awesome blog (URL below). And do learn and try it; it is worth striving for.

The whole set of steps is reproducible as: …

Since the import is broken, I am leaving the rest of the story to be viewed on the original blog post, URL below ;)

Originally published at