Running an InfluxDB Time-Series database on an AWS ECS cluster
--
Every self-respecting business wants to measure and monitor its KPIs (Key Performance Indicators) and other critical parameters properly.
Often, such data presents itself as a time series: a series of data points indexed (or listed or graphed) in time order, for example server CPU usage or wind speed at a certain location.
Time-series data can be stored in a wide variety of data stores, including relational and non-relational databases, each with their respective trade-offs. In the last decade a new category of databases has been developed that is optimized for such datasets: TSDBs, or time-series databases.
Typically, a time-series database sustains high ingest rates (but rarely overwrites a record) and offers a number of functions (such as aggregation) that perform much better than traditional databases on large sets of time-series data.
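To make "aggregation" concrete, here is a minimal sketch in plain Python of the kind of downsampling a TSDB offers as a built-in function: averaging raw samples into fixed time windows. It only illustrates the idea and is not how InfluxDB implements it; the sample values and the function name downsample_mean are made up for the example.

```python
from collections import defaultdict
from statistics import mean


def downsample_mean(points, window_seconds):
    """Average (timestamp, value) points into fixed-size time windows.

    Returns a list of (window_start, mean_value) tuples in time order.
    """
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Align each timestamp to the start of the window it falls in.
        window_start = timestamp - (timestamp % window_seconds)
        buckets[window_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical CPU usage (%) sampled every 20 seconds; timestamps in seconds.
cpu_usage = [(0, 10.0), (20, 20.0), (40, 30.0), (60, 50.0), (80, 70.0)]
per_minute = downsample_mean(cpu_usage, 60)
print(per_minute)  # [(0, 20.0), (60, 60.0)]
```

A TSDB runs this sort of query over millions of points without the client having to pull the raw data first, which is where the performance difference comes from.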
Picking a TSDB
Some of the names and vendors you will run into include: Graphite, OpenTSDB, TimescaleDB, InfluxDB and (unreleased at time of writing) AWS Timestream.
For a personal IoT project I evaluated a number of TSDBs and settled on InfluxDB, mainly because it was easy to start with and worked out of the box.
One of the challenges was that there doesn’t seem to be a non-commercial version that has good redundancy and backup built in. The only way to get some sort of high availability seemed to be running two identical instances of InfluxDB, each in a different Availability Zone, and having our client application write to both at the same time. This seemed a bit awkward at first, but ended up working fine.
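The dual-write idea could look roughly like the sketch below. It assumes the InfluxDB 1.x HTTP API (a POST to /write with a line-protocol body, answered with 204 on success); the host names, the database name "iot" and the helper names are hypothetical and not taken from the actual client application.

```python
import urllib.parse
import urllib.request


def to_line_protocol(measurement, tags, fields, timestamp_s):
    """Format one point as InfluxDB line protocol (float fields only).

    Assumes names and tag values contain no spaces, commas or equals
    signs, so no escaping is done here.
    """
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={float(v)}" for k, v in sorted(fields.items()))
    return f"{measurement}{tag_part} {field_part} {timestamp_s}"


def http_post(url, body):
    """Send a line-protocol body and return the HTTP status code."""
    request = urllib.request.Request(url, data=body.encode("utf-8"), method="POST")
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


def write_to_all(base_urls, database, line, post=http_post):
    """Write the same point to every instance; report success per instance."""
    query = urllib.parse.urlencode({"db": database, "precision": "s"})
    results = {}
    for base_url in base_urls:
        try:
            results[base_url] = post(f"{base_url}/write?{query}", line) == 204
        except OSError:
            # An unreachable instance must not block the write to the other one.
            results[base_url] = False
    return results


line = to_line_protocol("wind", {"site": "pier"}, {"speed": 7.5}, 1700000000)
print(line)  # wind,site=pier speed=7.5 1700000000

# Hypothetical endpoints, one per Availability Zone:
# write_to_all(["http://influxdb1.example:8086", "http://influxdb2.example:8086"], "iot", line)
```

Note that this sketch does not replay writes that failed on one instance, so the two databases can drift apart after an outage. Whether that matters depends on the project.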
Running InfluxDB on AWS ECS
I’m a big Infrastructure as Code fan, so my goal was to bring this solution up through a set of AWS CloudFormation templates.
Persistent Storage
This brings us to the problem of persistent storage. Starting out, I hoped to use the AWS Fargate managed container service, but at the time of writing it was not possible to attach persistent storage to a Fargate container.
The easiest solution seemed to be to attach an AWS Elastic File System (EFS) to both EC2 container instances and mount a data storage volume into the containers. As we are running relatively small datasets, I haven’t been too worried about performance issues.
The CloudFormation template below can be used as a starting point to bring up an EFS file system that is accessible from two subnets, each in a separate AZ. It is a little hard to read, but notice how the EFS DNS name and mount command are exported for use in the template that will bring up both database containers.
# Excerpt: the Parameters, Mappings and Conditions sections are not shown.
Resources:
  # Allow NFS traffic (TCP 2049) from both private subnets.
  myPrivateEfsSg:
    Type: AWS::EC2::SecurityGroup
    Properties:
      VpcId: !ImportValue myVpcID
      GroupName: EfsPrivateSecurityGroup
      GroupDescription: Security group for EFS mount
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 2049
          ToPort: 2049
          CidrIp: !ImportValue myPrivateSubnet1CidrBlock
        - IpProtocol: tcp
          FromPort: 2049
          ToPort: 2049
          CidrIp: !ImportValue myPrivateSubnet2CidrBlock
      SecurityGroupEgress:
        - IpProtocol: "-1"
          CidrIp: "0.0.0.0/0"

  # Only one of the two file systems is created, depending on the
  # Retain/Delete condition.
  ElasticFileSystemRetain:
    Type: AWS::EFS::FileSystem
    Condition: Retain
    DeletionPolicy: Retain
    Properties:
      Encrypted: !FindInMap [ EncrpytionBoolean, !Ref EncryptionState, Boolean ]
      KmsKeyId:
        !If [ UseAWS-ManagedCMK, !Ref 'AWS::NoValue', !Ref Cmk ]
      FileSystemTags:
        - Key: Name
          Value: !Ref 'AWS::StackName'
      PerformanceMode: !Ref PerformanceMode

  ElasticFileSystemDelete:
    Type: AWS::EFS::FileSystem
    Condition: Delete
    DeletionPolicy: Delete
    Properties:
      Encrypted: !FindInMap [ EncrpytionBoolean, !Ref EncryptionState, Boolean ]
      KmsKeyId:
        !If [ UseAWS-ManagedCMK, !Ref 'AWS::NoValue', !Ref Cmk ]
      FileSystemTags:
        - Key: Name
          Value: !Ref 'AWS::StackName'
      PerformanceMode: !Ref PerformanceMode

  # One mount target per subnet (and therefore per AZ).
  ElasticFileSystemMountTarget0Retain:
    Condition: Retain
    DeletionPolicy: Retain
    Type: AWS::EFS::MountTarget
    Properties:
      FileSystemId: !Ref ElasticFileSystemRetain
      SecurityGroups:
        - !Ref myPrivateEfsSg
      SubnetId: !Ref Subnet1

  ElasticFileSystemMountTarget0Delete:
    Condition: Delete
    DeletionPolicy: Delete
    Type: AWS::EFS::MountTarget
    Properties:
      FileSystemId: !Ref ElasticFileSystemDelete
      SecurityGroups:
        - !Ref myPrivateEfsSg
      SubnetId: !Ref Subnet1

  ElasticFileSystemMountTarget1Retain:
    Condition: Retain
    DeletionPolicy: Retain
    Type: AWS::EFS::MountTarget
    Properties:
      FileSystemId: !Ref ElasticFileSystemRetain
      SecurityGroups:
        - !Ref myPrivateEfsSg
      SubnetId: !Ref Subnet2

  ElasticFileSystemMountTarget1Delete:
    Condition: Delete
    DeletionPolicy: Delete
    Type: AWS::EFS::MountTarget
    Properties:
      FileSystemId: !Ref ElasticFileSystemDelete
      SecurityGroups:
        - !Ref myPrivateEfsSg
      SubnetId: !Ref Subnet2

Outputs:
  ElasticFileSystem:
    Value: !If [ Delete, !Ref ElasticFileSystemDelete, !Ref ElasticFileSystemRetain ]
    Export:
      Name: !Join ['', [ !Ref ExportName, 'ElasticFileSystem' ]]

  ElasticFileSystemDnsName:
    Description: DNS name for the Amazon EFS file system.
    Value: !Join [ '.', [ !If [ Delete, !Ref ElasticFileSystemDelete, !Ref ElasticFileSystemRetain ], 'efs', !Ref 'AWS::Region', 'amazonaws', 'com' ] ]
    Export:
      Name: !Join ['', [ !Ref ExportName, 'ElasticFileSystemDnsName' ]]

  # The command ends in ':/' so that the consumer can append the mount point.
  ElasticFileSystemMountCommand:
    Description: Mount command for mounting the Amazon EFS file system.
    Value: !Join [ '', [ 'sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 ', !Join [ '.', [ !If [ Delete, !Ref ElasticFileSystemDelete, !Ref ElasticFileSystemRetain ], 'efs', !Ref 'AWS::Region', 'amazonaws', 'com:/' ] ] ] ]
    Export:
      Name: !Join ['', [ !Ref ExportName, 'ElasticFileSystemMountCommand' ]]
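The nested !Join expressions in the two outputs are easier to follow when written out as ordinary string handling. The sketch below mirrors them in Python; the file system ID and region are hypothetical, and appending /efs as the mount point is an assumption based on the host path used by the container template.

```python
MOUNT_OPTIONS = "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2"


def efs_dns_name(file_system_id, region):
    """Mirror of the ElasticFileSystemDnsName output."""
    return ".".join([file_system_id, "efs", region, "amazonaws", "com"])


def efs_mount_command(file_system_id, region):
    """Mirror of the ElasticFileSystemMountCommand output.

    Like the template output, the command ends in ':/' and still needs
    a mount point appended by whoever uses it.
    """
    dns_name = efs_dns_name(file_system_id, region)
    return f"sudo mount -t nfs4 -o {MOUNT_OPTIONS} {dns_name}:/"


# Hypothetical file system ID and region; /efs is the assumed mount point.
command = efs_mount_command("fs-12345678", "eu-west-1") + " /efs"
print(command)
```

On the container instances the same concatenation would typically happen in the instance start-up script, with the exported mount command imported from this stack.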
Bringing up two InfluxDB containers
I created a nested stack with the following base template resources. Make sure that the subnet IDs and security groups have been exported by their respective templates.
Resources:
  myInfluxDB1:
    Type: AWS::CloudFormation::Stack
    Properties:
      Parameters:
        mySubnet: !ImportValue myPrivateSubnet1
        myLogGroup: "/ecs/influxdb1-ecs"
        myServiceName: "influxdb1"
        myContainerName: influxdb1-container
        mySecurityGroup: !ImportValue PrivateClusterSG
      TemplateURL: !Sub 'https://s3-${AWS::Region}.amazonaws.com/influxdb-ecs-nested-stack-${AWS::Region}-${AWS::AccountId}/stack-templates/influxdb-ecs-cf-template.yml'
      TimeoutInMinutes: 20

  myInfluxDB2:
    Type: AWS::CloudFormation::Stack
    Properties:
      Parameters:
        mySubnet: !ImportValue myPrivateSubnet2
        myLogGroup: "/ecs/influxdb2-ecs"
        myServiceName: "influxdb2"
        myContainerName: influxdb2-container
        mySecurityGroup: !ImportValue PrivateClusterSG
      TemplateURL: !Sub 'https://s3-${AWS::Region}.amazonaws.com/influxdb-ecs-nested-stack-${AWS::Region}-${AWS::AccountId}/stack-templates/influxdb-ecs-cf-template.yml'
      TimeoutInMinutes: 20
Bringing up the container services can be a little bit tricky. The Volumes section is where the magic happens: a directory on the host’s EFS mount (/efs/<service name>) is mounted into the container as the InfluxDB data directory.
Also, have a look at how service discovery has been done: AWS Route 53 DNS is updated with the container’s IP address. Again, make sure you have an AWS Cloud Map namespace available that can be imported by the resource.
The template outputs the endpoint where the database can be reached.
Resources:
  LogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: !Ref myLogGroup

  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: influxdb-ecs
      RequiresCompatibilities:
        - EC2
      NetworkMode: awsvpc
      ExecutionRoleArn: !Sub arn:aws:iam::${AWS::AccountId}:role/ecsTaskExecutionRole
      Memory: 512
      Cpu: 256
      ContainerDefinitions:
        - Name: !Ref myContainerName
          # Note: "latest" may resolve to a newer major version;
          # /var/lib/influxdb below is the InfluxDB 1.x data directory.
          Image: influxdb:latest
          Memory: 512
          Cpu: 256
          MountPoints:
            - ContainerPath: /var/lib/influxdb
              SourceVolume: influxdb_data_volume
          PortMappings:
            # 8086 is the InfluxDB HTTP API port.
            - ContainerPort: 8086
              HostPort: 8086
            - ContainerPort: 8083
              HostPort: 8083
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref myLogGroup
              awslogs-region: !Ref AWS::Region
              awslogs-stream-prefix: !Ref myServiceName
      Volumes:
        # Each service gets its own directory on the EFS mount of the host.
        - Host:
            SourcePath: !Sub
              - /efs/${ServiceName}
              - {ServiceName: !Ref myServiceName}
          Name: "influxdb_data_volume"

  ServiceDefinition:
    Type: AWS::ECS::Service
    Properties:
      LaunchType: EC2
      TaskDefinition: !Ref TaskDefinition
      Cluster: Hydro-Cluster-Private
      ServiceName: !Ref myServiceName
      ServiceRegistries:
        - RegistryArn: !GetAtt InfluxDbServiceDiscovery.Arn
      DesiredCount: 1
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: DISABLED
          SecurityGroups: [!Ref mySecurityGroup]
          Subnets: [!Ref mySubnet]

  # Registers the task's IP address as an A record in the private namespace.
  InfluxDbServiceDiscovery:
    Type: AWS::ServiceDiscovery::Service
    Properties:
      Name: !Ref myServiceName
      DnsConfig:
        DnsRecords: [{Type: A, TTL: "10"}]
        NamespaceId: !ImportValue myPrivateServiceDiscoveryNamespace
      HealthCheckCustomConfig:
        FailureThreshold: 1

Outputs:
  MyInfluxDbServiceName:
    Description: The Service discovery name
    Value: !Join
      - '.'
      - - !GetAtt InfluxDbServiceDiscovery.Name
        - !ImportValue myPrivateNamespace
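From the client side, the service discovery name is just a host name that resolves to the current IP address of the container. A minimal sketch of how a client might build the two endpoints and check them is shown below. The namespace "hydro.local" is a hypothetical stand-in for the imported myPrivateNamespace value, and the health check assumes the InfluxDB /ping endpoint, which answers with 204 when the server is up.

```python
import socket
import urllib.request


def influx_endpoint(service_name, namespace, port=8086):
    """Mirror the template's !Join ('<service>.<namespace>') plus the HTTP port."""
    return f"http://{service_name}.{namespace}:{port}"


def resolve(hostname):
    """Return the IPv4 addresses currently registered for the name."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def is_healthy(base_url):
    """Return True if GET /ping answers 204 No Content."""
    try:
        with urllib.request.urlopen(f"{base_url}/ping", timeout=3) as response:
            return response.status == 204
    except OSError:
        # DNS failures, refused connections and timeouts all count as unhealthy.
        return False


# "hydro.local" is a hypothetical namespace name used only for illustration.
endpoints = [influx_endpoint(name, "hydro.local") for name in ("influxdb1", "influxdb2")]
print(endpoints)
```

Because the A record has a TTL of 10 seconds, a client that caches DNS results for longer than that may keep talking to a stale IP address after a task is replaced.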
Conclusion
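With two InfluxDB instances in separate Availability Zones, EFS-backed storage and a client that writes to both, it’s possible to bring up a time-series database in a redundant configuration.
In the next article we will look into using Grafana as a dashboard for our database.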