Infrastructure Service Discovery

Nelson Oliveira
Published in reachnow-tech
Aug 30, 2019

In our previous post, I went over the infrastructure moovel came up with when the initial carve-out from car2go’s systems happened. Consul was mentioned as our choice for service discovery. Over the years, however, we realised that this very service discovery was complicating our setup rather than simplifying it. This post, the second in the Infrastructure evolution series, details exactly how we simplified our infrastructure’s service discovery.

Before we begin, let’s define what our goal was. To put it in simple terms:

A simple, fast and reliable service discovery within the scope of the infrastructure we built.

Sounds simple enough, right? But that’s just the high-level view. Concretely, we decided to utilise what Amazon Web Services (AWS) already offers, so that everything a given service needs can be expressed in a single, easy-to-follow CloudFormation (CFN) stack. The template for said CFN stack would then include:

  • A way to handle heavy traffic using AWS’s Elastic Load Balancing (ELB);
  • A Route 53 definition for the service’s DNS;
  • Environment variable management for said service (covered in the next blog post);

Ready? You’d better be!

First things first: ELB. Load balancing isn’t a foreign concept to most engineers these days, and Amazon’s solution for it is the ELB service. In terms of our infrastructure, it wasn’t too hard to create a few load balancers and adapt them to our needs. To create an ELB, a couple of things need to be in place:

  • Availability Zones, defined through a Virtual Private Cloud (VPC). An ELB lives inside and serves traffic within a given VPC;
  • Security Groups for said VPC;
  • Applicable targets to route traffic to;

Fortunately, we didn’t need to create any of the above, since we already had them in place. Let’s take an example: we have a Development VPC in our setup, which includes three subnets and an associated Security Group, as well as a service (for example, our trips service) to route traffic to. I’ll go over how the service was then set up with Route 53 later on.
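
Those prerequisites translate into plain stack parameters. As a rough sketch (the parameter names mirror the !Ref usages in the template that follows, but the types and descriptions here are illustrative, not our exact template):

```yaml
Parameters:
  VPC:
    Type: AWS::EC2::VPC::Id
    Description: The VPC (e.g. the Development VPC) the ELB lives in
  Subnets:
    Type: List<AWS::EC2::Subnet::Id>
    Description: The subnets of that VPC (three, in our Development setup)
  SecurityGroup:
    Type: AWS::EC2::SecurityGroup::Id
    Description: The Security Group associated with the VPC
  DnsName:
    Type: String
    Default: ""
    Description: Optional DNS name for the load balancer
  HostedZoneId:
    Type: String
    Description: The Route 53 hosted zone to create the record in
```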

With all of the above, we are then essentially creating an Application Load Balancer (ALB). The following CFN template describes the process thoroughly:

# The ELB itself, referencing the pertaining Subnets and
# security groups.
LoadBalancer:
  Type: AWS::ElasticLoadBalancingV2::LoadBalancer
  Properties:
    Name: !Ref AWS::StackName
    Scheme: internal
    Subnets: !Ref Subnets
    SecurityGroups:
      - !Ref SecurityGroup
    Tags:
      # ...
# The listener for the load balancer, which in essence forwards the
# traffic to the default target group.
LoadBalancerListener:
  Type: AWS::ElasticLoadBalancingV2::Listener
  Properties:
    LoadBalancerArn: !Ref LoadBalancer
    Port: 80
    Protocol: HTTP
    DefaultActions:
      - Type: forward
        TargetGroupArn: !Ref DefaultTargetGroup
# The mandatory default target group that the traffic gets routed to
DefaultTargetGroup:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  Properties:
    Name: !Ref AWS::StackName
    VpcId: !Ref VPC
    Port: 80
    Protocol: HTTP
    TargetGroupAttributes:
      - Key: deregistration_delay.timeout_seconds
        Value: 20
    Tags:
      # ...
# The DNS entry for this ELB
LoadBalancerDNS:
  Type: AWS::Route53::RecordSetGroup
  # "Condition" must name an entry in the Conditions section
  # (it cannot take a !Ref directly)
  Condition: HasDnsName
  Properties:
    HostedZoneId: !Ref HostedZoneId
    Comment: LoadBalancer DNS Entry
    RecordSets:
      - Name: !Ref DnsName
        Type: A
        AliasTarget:
          HostedZoneId: !GetAtt LoadBalancer.CanonicalHostedZoneID
          DNSName: !GetAtt LoadBalancer.DNSName

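One note on the LoadBalancerDNS resource: a resource’s Condition attribute must reference a condition declared in the template’s Conditions section, rather than a parameter. A minimal sketch of such a condition, assuming an optional DnsName string parameter that defaults to an empty string:

```yaml
Conditions:
  # Only create the DNS record when a DnsName was actually supplied
  HasDnsName: !Not [!Equals [!Ref DnsName, ""]]
```
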
There’s one particularity in this template. You will most definitely notice that we created a Target Group with no rules whatsoever (no health check definition and no registered targets). This is because each service stack defines its own Target Group rules and its own host name using Route 53:

# The service's target group
TargetGroup:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  Properties:
    Name: !Ref AWS::StackName
    VpcId: !Ref VPC
    Port: 80
    Protocol: HTTP
    Matcher:
      HttpCode: 200-299
    HealthCheckIntervalSeconds: !Ref HealthIntervalSeconds
    HealthCheckPath: !Ref HealthEndpoint
    HealthCheckProtocol: HTTP
    HealthCheckTimeoutSeconds: !Ref HealthTimeoutSeconds
    HealthyThresholdCount: !Ref HealthThreshold
    TargetGroupAttributes:
      - Key: deregistration_delay.timeout_seconds
        Value: 20
# Create a listener rule for said service within
# an existing ELB
HostListenerRule:
  Type: AWS::ElasticLoadBalancingV2::ListenerRule
  Properties:
    ListenerArn: !ImportValue
      Fn::Sub: ${LoadBalancerStackName}-Listener
    Priority: !Ref ServicePriority # Arbitrary, but must be unique per listener
    Conditions:
      - Field: host-header
        Values:
          - !Ref Hostname
    Actions:
      - TargetGroupArn: !Ref TargetGroup
        Type: forward
# The DNS record that represents the service in our internal VPC
ServiceDNSRecord:
  Type: AWS::Route53::RecordSetGroup
  Properties:
    HostedZoneId: !ImportValue
      Fn::Sub: ${LoadBalancerStackName}-AliasHostedZoneID
    Comment: Service DNS Entry
    RecordSets:
      - Name: !Ref Hostname
        Type: A
        AliasTarget:
          HostedZoneId: !ImportValue
            Fn::Sub: ${LoadBalancerStackName}-CanonicalHostedZoneID
          DNSName: !ImportValue
            Fn::Sub: ${LoadBalancerStackName}-DNSName

The above essentially means that each service registers itself with a given ALB and defines its own hostname and listener rules. Each service can then configure itself independently, according to its own needs.
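
For the cross-stack !ImportValue references to resolve, the ELB stack has to export the listener and DNS attributes under the names the service stack imports. A sketch of what the corresponding Outputs section could look like (the export names match the imports above; the exact structure is illustrative):

```yaml
Outputs:
  Listener:
    Value: !Ref LoadBalancerListener
    Export:
      Name: !Sub ${AWS::StackName}-Listener
  AliasHostedZoneID:
    Value: !Ref HostedZoneId
    Export:
      Name: !Sub ${AWS::StackName}-AliasHostedZoneID
  CanonicalHostedZoneID:
    Value: !GetAtt LoadBalancer.CanonicalHostedZoneID
    Export:
      Name: !Sub ${AWS::StackName}-CanonicalHostedZoneID
  DNSName:
    Value: !GetAtt LoadBalancer.DNSName
    Export:
      Name: !Sub ${AWS::StackName}-DNSName
```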

How does this whole thing work after all?

At moovel, we rely heavily on our VPCs in our daily work. All of the services set up with the above configuration are accessible from inside the VPC itself. Once inside the VPC, you can reach a service directly by its DNS name, for example http://abc.xyz.moovel.com.

What were the gains?

We gained a lot by setting up the infrastructure and our services this way:

  • We could set our infrastructure up as code;
  • Using built-in AWS features allowed us to keep the CFN templates simple and succinct, meaning no additional setup efforts were needed;
  • Each service can individually make its own choices with regard to DNS, health checks and listener rules;

The next post in this series will expand on how we organised our services into Elastic Container Service (ECS) clusters to fit the newly created ELBs.
