Events are messages that are sent by a system to notify operators or other systems about a change in its domain. With event-driven architectures powered by systems like Apache Kafka becoming more prominent, there are now many applications in the modern software stack that make use of events and messages to operate effectively. In this blog, we will examine the use of three different data backends for event data — Apache Druid, Elasticsearch and Rockset.

Using Event Data

Events are commonly used by systems in the following ways:

  1. For reacting to changes in other systems: e.g. …

Amazon DynamoDB is a managed NoSQL database in the AWS cloud that delivers a key piece of infrastructure for use cases ranging from mobile application back-ends to ad tech. DynamoDB is optimized for transactional applications that need to read and write individual keys but do not need joins or other RDBMS features. For this subset of requirements, DynamoDB offers a way to have a virtually infinitely scalable datastore that requires minimal maintenance.

While DynamoDB is quite popular, one complaint we often hear from developers is that DynamoDB is expensive. In particular, costs can scale sharply as usage grows in…


In this blog I compare three options for real-time analytics on DynamoDB (Athena, Spark and Elastic) in terms of ease of setup, maintenance, query capability and latency. Some of these options offer only limited support for SQL analytics. I also evaluate which use cases each of them is best suited for.

Developers often need to serve fast analytical queries over data in Amazon DynamoDB, to enable live views of the business and application features such as personalization and real-time user feedback. However, as an operational database optimized for transaction processing, DynamoDB is not well-suited to delivering real-time…


In this post I explore how to support analytical queries without encountering prohibitive scan costs, by leveraging secondary indexes in DynamoDB. I also evaluate the pros and cons of this approach in contrast to extracting data to another system like Athena, Spark or Elastic.

Rockset recently added support for DynamoDB, which means you can run fast SQL on DynamoDB tables without any ETL. As I spoke to our users, I came across different ways in which secondary indexing is used for analytical queries.

DynamoDB stores data under the hood by partitioning it over a large number of nodes…


This post outlines how to use SQL for querying and joining raw data sets like nested JSON and CSV — for enabling fast, interactive data science.

Data scientists and analysts deal with complex data. Much of what they analyze could be third-party data, over which there is little control. In order to make use of this data, significant effort is spent in data engineering. Data engineering transforms and normalizes high-cardinality, nested data into relational databases or into an output format that can then be loaded into data science notebooks to derive insights. …


When we surveyed the market, we saw the need for a solution that could perform fast SQL queries on fluid JSON data, including arrays and nested objects:

The Challenge of SQL on JSON

Some form of ETL to transform JSON to tables in SQL databases may be workable for basic JSON data with fixed fields that are known up front. …


In the last three years, I’ve had many conversations with professionals in Silicon Valley. One question that comes up often is “What do you love about your job?” In response, I hear a lot of different answers. There are some who love the money and see their jobs as a means to making a living. There are some who love the money and can admit it to themselves and others. Then there’s the (not necessarily mutually exclusive with the above) set of people who love what they do. These people tend to find their jobs fulfilling because they’re experiencing growth…


This post is inspired by a book I read, Naked Statistics, and a documentary that I watched, “The Human Face of Big Data”. I recommend the documentary to anyone who wants to know what the big fuss about data is and why we care. The book focuses on just how powerful statistical analysis can be, and how easy it is to slip into a false sense of security and draw incorrect conclusions from data. It cautions the reader to be wary of interpreted statistics, which can often be framed to support a variety of different hypotheses. …


Often I have found that I am measuring myself against borrowed yardsticks: adopting someone else’s notion of success. In February 2017, I decided that I would change that. I wanted to study happiness, success and human emotion from a scientific standpoint. I enrolled in an online course, “Success”. I have nothing but praise for this course. It made me think and introspect. It asked thought-provoking questions that helped me truly understand myself, from my innate motivations to my goals when I was a child. Based on many hours of thinking, I can define a single coherent theory of…

Anirudh Ramanathan

Cloud Native
