Scaling out with an AWS Kinesis stream and .NET Core dockerized microservices

Nirjhar Choudhury
Trimble Maps Engineering Blog
4 min read · Oct 21, 2019

The problem statement

At Trimble Maps we have the Trip Management application, which is used by the trucking industry to create and manage a user’s trip from start to finish. The most common use case for this app is a truck making multiple pickups and deliveries, and also stopping for fuel and rest breaks along the way. These vehicles send data to our application through IoT devices every minute, and we use this data to provide insights to our partners in the transportation industry.

We started with a traditional monolithic web application that receives data from these devices every minute and stores it in a database. A continuously-running service then retrieves these messages from the database and processes them on a fixed interval.

While this was a good solution to start with, it does not scale as traffic volume increases: the system cannot manage the load or process requests efficiently.

Legacy Monolithic Application

The Solution

To handle volume at scale, we had to re-engineer the entire application.

First, we had to re-architect the IoT input requests to publish those into a stream (we chose AWS Kinesis managed stream), so that we can do real-time processing on the data.
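As an illustration, publishing a device message to Kinesis might look like the following sketch using boto3 (the stream name, message fields, and helper are hypothetical, not taken from the article). Using the device id as the partition key keeps every message from one vehicle on the same shard, which preserves per-vehicle ordering:

```python
import json

# Hypothetical stream name for the IoT position reports.
STREAM_NAME = "iot-position-reports"

def build_kinesis_record(device_id: str, payload: dict) -> dict:
    """Build the kwargs for kinesis.put_record().

    The device id as PartitionKey routes all of one vehicle's
    messages to the same shard, preserving their order.
    """
    return {
        "StreamName": STREAM_NAME,
        "Data": json.dumps(payload).encode("utf-8"),
        "PartitionKey": device_id,
    }

# In the Web API handler (requires AWS credentials and the boto3 package):
# import boto3
# kinesis = boto3.client("kinesis")
# kinesis.put_record(**build_kinesis_record("truck-42", {"lat": 40.35, "lon": -74.66}))
```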

The next thing we had to do was introduce a queuing mechanism (we chose RabbitMQ for its simplicity and robustness) so that we could scale out the business logic processor.
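The queuing step follows the competing-consumers pattern: the producer publishes each message to a durable queue, and any number of identical processor instances share the work. A minimal sketch, with a hypothetical queue name and message shape (the pika calls are shown in comments since they need a running broker):

```python
import json

# Hypothetical queue shared by all MQ-processor instances.
QUEUE_NAME = "trip-events"

def serialize_event(event: dict) -> bytes:
    """RabbitMQ message bodies are bytes; use compact JSON."""
    return json.dumps(event, separators=(",", ":")).encode("utf-8")

# With pika (requires a running RabbitMQ broker):
# import pika
# connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
# channel = connection.channel()
# channel.queue_declare(queue=QUEUE_NAME, durable=True)
# channel.basic_publish(
#     exchange="",
#     routing_key=QUEUE_NAME,
#     body=serialize_event({"device_id": "truck-42", "event": "arrived"}),
#     properties=pika.BasicProperties(delivery_mode=2),  # persist to disk
# )
```

Scaling out the business logic then just means starting more consumer processes on the same queue; the broker delivers each message to exactly one of them.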

After that, we had to move the IoT data storage out of SQL Server and into a big data solution (we chose MongoDB for its excellent support for geospatial data).
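MongoDB's geospatial support works on GeoJSON documents indexed with a 2dsphere index, which enables queries like $near and $geoWithin. A sketch of how a position report might be shaped for that (collection and field names are hypothetical; note GeoJSON stores coordinates as [longitude, latitude]):

```python
import datetime

def to_position_document(device_id: str, lat: float, lon: float,
                         ts: datetime.datetime) -> dict:
    """Shape one position report as a MongoDB document with a
    GeoJSON Point, ready for a 2dsphere index."""
    return {
        "deviceId": device_id,
        "recordedAt": ts,
        # GeoJSON coordinate order is [longitude, latitude]
        "location": {"type": "Point", "coordinates": [lon, lat]},
    }

# With pymongo (requires a MongoDB instance):
# from pymongo import MongoClient, GEOSPHERE
# coll = MongoClient()["trips"]["positions"]
# coll.create_index([("location", GEOSPHERE)])
# coll.insert_one(to_position_document(
#     "truck-42", 40.35, -74.66, datetime.datetime.utcnow()))
```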

Finally, we had to use an in-memory distributed caching solution (we chose Amazon ElastiCache for Memcached) to improve the performance of our service responses. SQL Server cannot scale horizontally, and there is always a limit to how much vertical scaling you can achieve (and it becomes very expensive). A horizontally scalable, distributed cache made much more sense for our use case.
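The usual way to put such a cache in front of a slower store is the cache-aside pattern: check the cache first, fall back to the backing store on a miss, then populate the cache. A minimal sketch, assuming a hypothetical key scheme and TTL (the cache and store objects are passed in, so any client with `get`/`set` works):

```python
# Hypothetical expiry for cached trip state.
TTL_SECONDS = 60

def get_trip_state(cache, store, trip_id: str):
    """Cache-aside read: try Memcached, fall back to the database,
    then populate the cache for subsequent reads."""
    key = f"trip:{trip_id}"
    state = cache.get(key)
    if state is None:                        # cache miss
        state = store.load_trip(trip_id)     # hit the backing store
        if state is not None:
            cache.set(key, state, expire=TTL_SECONDS)
    return state

# With pymemcache against an ElastiCache endpoint (hypothetical):
# from pymemcache.client.base import Client
# cache = Client(("my-cluster.cfg.use1.cache.amazonaws.com", 11211))
```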

We are still using SQL Server, but only for some static data which does not change often after the first insert.

A highly scalable distributed solution

With this new architecture, every component of the system can scale horizontally, allowing it, in theory, to scale without limit. Of course, there is a cost associated with that. But the point is that we no longer have to scramble and make significant changes to handle an increase in volume.

Explaining the components:

Web API: A .NET Web API application running in IIS, responsible for handling client requests and responding appropriately. This is also the API the IoT devices use to post data to our system.

Kinesis Consumer: .NET Core services processing Kinesis stream messages. There are two consumer applications, which scale out to match the number of shards in the Kinesis stream. One application's responsibility is to save the messages to MongoDB for historical analysis, and the other publishes them to RabbitMQ for real-time processing.

Kinesis Consumer Manager: Another .NET Core service, responsible for assigning shards to the consumers. It assigns shards in a round-robin fashion to each consumer. From our findings, it works best to have at most one consumer per application per shard, and no more than five consumer applications reading the same shard, since Kinesis limits each shard to five read transactions per second. The Kinesis Consumer Manager uses DynamoDB to store the shard-to-consumer mapping.
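The round-robin assignment described above can be sketched in a few lines (shard and consumer ids here are hypothetical; the real manager persists the resulting mapping in DynamoDB rather than holding it in memory):

```python
def assign_shards(shard_ids, consumer_ids):
    """Map each shard to exactly one consumer, cycling through
    the consumers in round-robin order."""
    assignment = {}
    for i, shard in enumerate(shard_ids):
        assignment[shard] = consumer_ids[i % len(consumer_ids)]
    return assignment

shards = ["shardId-000", "shardId-001", "shardId-002", "shardId-003"]
consumers = ["consumer-a", "consumer-b"]
mapping = assign_shards(shards, consumers)
# → {'shardId-000': 'consumer-a', 'shardId-001': 'consumer-b',
#    'shardId-002': 'consumer-a', 'shardId-003': 'consumer-b'}
```

When the stream is resharded, rerunning the assignment over the new shard list rebalances the consumers automatically.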

MQ Processors: .NET Core applications that read from RabbitMQ and process messages. They keep the application state in Memcached; the Web API tier reads this data when responding to client requests.
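A sketch of one processor's core loop under this design: fold each event into the trip's cached state, and acknowledge the message only after the state is safely written, so a crashed processor just causes redelivery to another instance. The business logic stand-in and field names are hypothetical:

```python
def apply_event(state: dict, event: dict) -> dict:
    """Tiny stand-in for the real business logic: fold one
    event into the trip's current state."""
    updated = dict(state)
    updated["lastEvent"] = event.get("event")
    updated["stopsCompleted"] = state.get("stopsCompleted", 0) + (
        1 if event.get("event") == "departed" else 0
    )
    return updated

# With pika, acking only after the new state is cached:
# import json
# def on_message(channel, method, properties, body):
#     event = json.loads(body)
#     key = f"trip:{event['tripId']}"
#     state = cache.get(key) or {}
#     cache.set(key, apply_event(state, event))
#     channel.basic_ack(delivery_tag=method.delivery_tag)
# channel.basic_consume(queue="trip-events", on_message_callback=on_message)
# channel.start_consuming()
```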

Conclusion:

This re-architecture allows us to scale elastically based on load. We can increase the number of shards on the Kinesis stream, which triggers an increase in Kinesis consumers and, in turn, the downstream MQ processors. With the bulk of the data in MongoDB and in-flight data in Memcached, SQL Server bottlenecks are no longer a cause for concern.

At present, we are prepared to handle over 100x more requests to our application, which in turn will process millions of IoT requests per day.

Interested in joining our team at Trimble Maps? Click here!
