Cassandra Meetup — Batch loading for Data Products & Building API layer for C*

By Daniel Chia

Cassandra, a scalable NoSQL database, powers the majority of Coursera’s online request processing. It gives us horizontal scalability, fast and stable latencies, and maintenance without downtime. Together, these let us offer our globally distributed learners a feature-rich online education platform that stays responsive and is rarely unavailable.

Coursera hosted the July Cassandra South Bay Users meetup, where Daniel Chia and Sourabh Bajaj were joined by Christos Kalantzis, Director of Engineering (Cloud Database Engineering) at Netflix, to talk about two exciting topics.

The first topic, covered by Sourabh, was how we use batch loading with Cassandra to bring insights from our offline data warehouse, powered by Amazon Redshift, back to our learners. Because Cassandra serves these derived insights with low latency, we can use them directly in our online platform.

The second topic, covered by Daniel and Christos, was a trend that we’re perhaps starting to see: building an API abstraction layer on top of the underlying storage database. Such a storage service lets developers focus on building awesome products by encapsulating best practices for using the database, while leaving the storage team free to iterate on performance and features (such as secondary indexing).
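
To make the idea of an API layer concrete, here is a minimal sketch in Python. It is not the design presented in either talk. Every name in it (`KeyValueBackend`, `InMemoryBackend`, `StorageService`, `MAX_VALUE_BYTES`) is hypothetical, and an in-memory dictionary stands in for the database so that the example runs on its own. The point is only the shape: product code talks to the service, and the service is the one place where storage rules live.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Tuple


class KeyValueBackend(ABC):
    """Hypothetical low-level interface owned by the storage team."""

    @abstractmethod
    def put(self, table: str, key: str, value: bytes) -> None:
        ...

    @abstractmethod
    def get(self, table: str, key: str) -> Optional[bytes]:
        ...


class InMemoryBackend(KeyValueBackend):
    """Stand-in for a real database driver; a dict keeps the sketch runnable."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], bytes] = {}

    def put(self, table: str, key: str, value: bytes) -> None:
        self._rows[(table, key)] = value

    def get(self, table: str, key: str) -> Optional[bytes]:
        return self._rows.get((table, key))


class StorageService:
    """Hypothetical API layer that product code calls instead of the database."""

    # Assumed limit, standing in for a "best practice" enforced in one place.
    MAX_VALUE_BYTES = 64 * 1024

    def __init__(self, backend: KeyValueBackend) -> None:
        self._backend = backend

    def save(self, table: str, key: str, value: str) -> None:
        encoded = value.encode("utf-8")
        if len(encoded) > self.MAX_VALUE_BYTES:
            raise ValueError("value too large for a single row")
        self._backend.put(table, key, encoded)

    def load(self, table: str, key: str) -> Optional[str]:
        raw = self._backend.get(table, key)
        return None if raw is None else raw.decode("utf-8")


# Product code depends only on StorageService, so the backend can be swapped.
service = StorageService(InMemoryBackend())
service.save("progress", "learner-1", "week 3")
print(service.load("progress", "learner-1"))
```

Because callers never see the backend, the storage team could replace `InMemoryBackend` with a real driver, or change how rows are laid out, without touching product code.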


Batch Loading Cassandra for Data Products

Building an API layer: Coursera

Building an API layer: Netflix


Originally published on July 19, 2016.