Capsule & AWS | Java on the cloud

Auto-Scale your Java Capsule over multiple regions in 30 minutes or less.

# AWS, Elastic Beanstalk, Route 53, Java, Capsule

Amid all the Java deployment malarkey, Capsule finally brought us the glory of a simple mechanism for packaging our JVM apps. Check out the introduction in a previous post, or in short:

Capsule defines a way to package an entire application into a single runnable jar and run it as easily as `java -jar app.jar`. Dependencies, system properties and JVM arguments are all taken care of. Gone are the days of platform-specific startup scripts, JVM flags and classpath nightmares.

With the convenient Maven plugin (or Gradle equivalent) taking care of the build process, we end up with a little runnable jar, all prepared and pre-configured (i.e. a capsule), ready to be deployed to a server.
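As a sketch, a minimal capsule-maven-plugin configuration might look like the following (the `appClass` value is illustrative, and exact options can differ between plugin versions):

```xml
<plugin>
  <groupId>com.github.chrisdchristo</groupId>
  <artifactId>capsule-maven-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>build</goal>
      </goals>
      <configuration>
        <!-- the main class of your app; Capsule adds its own bootstrap around it -->
        <appClass>com.example.PingServer</appClass>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Running the usual `mvn package` then produces the capsule jar alongside the normal artifact.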

Fire up a cheap $5 DigitalOcean server, execute the capsule jar, and you’re good to go.

Utilising small libraries such as Undertow for an embedded HTTP server, we can create a lightweight backend REST API app, all in a simple executable jar. A breath of fresh air compared to the ageing complexities of Java EE, Spring, Tomcat, WARs et al.
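As a sketch of such an app (the class name and response are illustrative), a minimal Undertow server fits in a handful of lines, listening on localhost port 5000, which conveniently matches the Elastic Beanstalk setup described later:

```java
import io.undertow.Undertow;
import io.undertow.util.Headers;

// Minimal embedded HTTP server: listens on localhost:5000 and
// answers every request with "pong".
public class PingServer {
    public static void main(String[] args) {
        Undertow server = Undertow.builder()
                .addHttpListener(5000, "localhost")
                .setHandler(exchange -> {
                    exchange.getResponseHeaders()
                            .put(Headers.CONTENT_TYPE, "text/plain");
                    exchange.getResponseSender().send("pong");
                })
                .build();
        server.start();
    }
}
```

Build this into a capsule and the whole backend is one self-contained jar.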


Interestingly, the little backend capsule can chug away nicely, handling a considerable number of requests on just a single server.

However, when things become more serious, we need to talk about latency, backup instances and so on. This is where Amazon AWS comes in.

The AWS Elastic Beanstalk service provides a convenience layer on top of EC2 and ELB in the form of pre-setup environments specific to the app and preconfigured auto-scaling features. Relatively recently, Java environments were introduced, which make a perfect combination for our Java capsule.

Think of it like a server cluster that grows and shrinks all automatically (or elastically) running instances of our java capsule.


  1. The first step is to select a region such as U.S West (North California).
  2. Select Elastic Beanstalk and action to create a new environment.
  3. Select ‘Web Server Environment’ if you want HTTP services. Otherwise select ‘Worker Environment’ for things like background jobs etc.
  4. The predefined configuration should be ‘Java’ and the environment type should be ‘Load balancing, auto scaling’.
  5. In the next steps upload your app and configure any additional settings.
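The console steps above can also be scripted with the EB CLI. This is a hypothetical session (app and environment names are illustrative, and the exact platform identifier varies by CLI version):

```shell
# initialise the project against the Java platform in North California
eb init --platform java --region us-west-1 mycapsule-app

# create a load-balanced, auto-scaling environment and deploy the app
eb create mycapsule
```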

And that’s it! We now have an automatically scalable server cluster with our app deployed. This Elastic Beanstalk environment is essentially made up of a number of EC2 instances behind an ELB, all configured automatically. AWS takes care of deploying the app to all the instances, and firing up additional instances on high load (or winding down on low load).

Each EC2 instance in this Java SE environment actually runs an nginx server set up to forward requests to port 5000 locally. So our Java capsule jar with its embedded HTTP server only needs to listen for requests on ‘localhost’ at port 5000. You can see more info in the AWS guide, which also details how to configure the nginx server. When the app is deployed to the environment, AWS will replicate it to all instances and execute it as a normal Java jar. If we’ve preconfigured our capsule jar during the build process, there’s nothing more to do.
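One detail worth sketching: a single jar uploaded on its own is launched automatically, but if the source bundle contains anything more than that one jar, the Java SE platform expects a `Procfile` at the root of the bundle telling it what to run (the jar name here is illustrative):

```
web: java -jar app-capsule.jar
```

The `web:` process is the one nginx forwards traffic to.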

The environment can be reached at its CNAME, which takes the form `<name>.elasticbeanstalk.com`. So for example, if your environment name is ‘mycapsule’ and the app handles an HTTP GET endpoint at /ping, we can query it by hitting:

http://mycapsule.elasticbeanstalk.com/ping

Select our new environment in the AWS console, then select ‘Configuration’ in the side menu, and finally the ‘Scaling’ box. Here we can define how scaling is handled: the scaling triggers and how many instances to fire up.


Taking our scaling setup to the next level, we can replicate our environment to other regions to reduce latency. So for example, we could have one in U.S West (North California) and one in E.U (Ireland) to spread the load depending on where the client is.

By changing the name of our U.S environment to mycapsule-us and naming our E.U environment mycapsule-eu, we can hit our two clusters:

http://mycapsule-us.elasticbeanstalk.com/ping
http://mycapsule-eu.elasticbeanstalk.com/ping


Finally, putting all the pieces together, we need to set up our DNS to route each client to a specific environment based on its location (latency).

And again, AWS comes up trumps, with Route 53!

So the idea here is for the DNS to resolve requests to our main domain to one of our clusters.

We first create a ‘hosted zone’ in Route 53 for our domain.

Then we simply create two CNAME records for the domain, pointing to the two clusters, each with a ‘Routing Policy’ based on ‘Latency’.

  1. Create a new Record Set
  2. Name: the domain (or subdomain) to route
  3. Alias: No
  4. TTL: 30 seconds
  5. Value: the ‘dualstack’ domain of the environment’s ELB
  6. Routing Policy: Latency
  7. Region: us-west-1
  8. Set ID: us

Note that the ‘Value’ must contain the ‘dualstack’ domain of the environment’s ELB (and not the normal environment CNAME). A dropdown will pop up with options; select the domain listed under ‘ — ELB load balancers — ’.

Do the exact same for the E.U region (this one will have a different dualstack domain, the region will be eu-west-1, and the Set ID will be ‘eu’).
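For the scripting-inclined, the same latency record can be created with the AWS CLI; the hosted zone ID, domain and ELB hostname below are placeholders:

```shell
aws route53 change-resource-record-sets \
  --hosted-zone-id ZXXXXXXXXXXXXX \
  --change-batch '{
    "Changes": [{
      "Action": "CREATE",
      "ResourceRecordSet": {
        "Name": "api.example.com",
        "Type": "CNAME",
        "TTL": 30,
        "SetIdentifier": "us",
        "Region": "us-west-1",
        "ResourceRecords": [
          { "Value": "dualstack.my-elb.us-west-1.elb.amazonaws.com" }
        ]
      }
    }]
  }'
```

Repeat with `"Region": "eu-west-1"` and `"SetIdentifier": "eu"` for the second cluster.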

After it’s all processed, we now have two auto-scalable clusters awaiting requests in the US and EU, all accessible by hitting our main domain.

Use a VPN to change your location and try hitting the domain. Check the logs of each environment to see which server cluster the request hit depending on your IP.


Replicate the concept over more regions, and increase the limit of the instances in each environment as your users grow.

Boom! We’ve just auto-scaled our Java Capsule over multiple regions in 30 minutes or less.

Now get building…
