Building a Streaming Database for Fun (Not Profit)

Screenshots of our visualizations of tweets from inside the USA containing keywords about polling places at 8pm on 11/09/2017. RIP.
A class diagram of the four objects that make up the TrickleDB system.
Tweet #1 for our example
This is what the buffer table looks like after the StreamManager executes the INSERT statement in the Source’s queue.
The StreamView’s table.
Tweet #2 for our example
The buffer table after the new INSERT statement has been executed by the database server.
I am not very popular on Twitter
Left: an example of the TrickleDB architecture, where red arrows represent data flow and blue arrows represent ownership. Right: the forms that data takes as it flows through the system, with one colored box per processing step. The blue box is the data received over the connection to the Source’s URL. The pink box is the INSERT statement parsed from the raw JSON and stored in the Source’s circular FIFO queue (the buffer). The orange box is (1) the record in the Stream’s buffer table and (2) the INSERT statement used to populate the Stream’s temp table. The green box is (1) the record of aggregates in the Stream’s temp table and (2) the coalesced record in the StreamView’s view table.
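The first two steps in the diagram (raw JSON in, queued INSERT statement out) could be sketched roughly as below. This is only an illustration, not TrickleDB's actual code: the class name `SourceBuffer`, the table name `buffer`, and the columns `id`, `text`, and `number_of_likes` are all assumptions, and dropping the oldest entry when the queue is full is just one possible overflow policy.

```python
import json
from collections import deque


class SourceBuffer:
    """Hypothetical stand-in for a Source's circular FIFO queue."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest entry once full.
        # That overflow policy is an assumption made for this sketch.
        self._queue = deque(maxlen=capacity)

    def receive(self, raw_json):
        """Parse one raw JSON record and enqueue a parameterized INSERT."""
        record = json.loads(raw_json)
        statement = (
            # Table and column names are illustrative only.
            "INSERT INTO buffer (id, text, number_of_likes) VALUES (?, ?, ?)",
            (record["id"], record["text"], record["number_of_likes"]),
        )
        self._queue.append(statement)

    def drain(self):
        """Yield queued statements oldest first, emptying the queue."""
        while self._queue:
            yield self._queue.popleft()
```

In this sketch, something playing the StreamManager's role would call `drain()` and pass each `(sql, params)` pair to the database server.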
The schema of the StreamView needed to track a streaming average.
This is the Stream’s temp table schema to track AVG(number_of_likes).
This is the Stream’s temp table updated with the number of likes on the records currently in the buffer table.
This is the StreamView’s table after coalescing with the Stream’s temp table.
Finally, this is the StreamView’s table after updating its complex aggregate function (the average) with the newly coalesced simple aggregate functions.
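The streaming-average steps in the last few captions can be sketched with an in-memory SQLite database. This is a minimal sketch under stated assumptions, not the TrickleDB implementation: the table names (`buffer`, `temp_agg`, `view_agg`) and column names are made up for illustration, and clearing the buffer after each pass is an assumption. The idea it shows is that the average is a complex aggregate rebuilt from two simple ones (a sum and a count) that can be coalesced by addition.

```python
import sqlite3


def make_db():
    """Create hypothetical buffer, temp, and view tables in memory."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE buffer (id INTEGER, number_of_likes INTEGER);
        CREATE TABLE temp_agg (likes_sum INTEGER, likes_count INTEGER);
        CREATE TABLE view_agg (likes_sum INTEGER, likes_count INTEGER,
                               likes_avg REAL);
        INSERT INTO view_agg VALUES (0, 0, NULL);
    """)
    return db


def process_batch(db):
    """Fold the records currently in the buffer into the running average."""
    # 1. Compute the simple aggregates over the current buffer contents.
    db.execute("DELETE FROM temp_agg")
    db.execute(
        "INSERT INTO temp_agg "
        "SELECT COALESCE(SUM(number_of_likes), 0), COUNT(number_of_likes) "
        "FROM buffer"
    )
    # 2. Coalesce: add the batch sum and count to the running totals.
    db.execute(
        "UPDATE view_agg SET "
        "likes_sum = likes_sum + (SELECT likes_sum FROM temp_agg), "
        "likes_count = likes_count + (SELECT likes_count FROM temp_agg)"
    )
    # 3. Recompute the complex aggregate from the coalesced simple ones.
    db.execute(
        "UPDATE view_agg SET likes_avg = CASE WHEN likes_count > 0 "
        "THEN CAST(likes_sum AS REAL) / likes_count END"
    )
    # 4. Empty the buffer so records are not counted twice (an assumption).
    db.execute("DELETE FROM buffer")
    return db.execute(
        "SELECT likes_sum, likes_count, likes_avg FROM view_agg"
    ).fetchone()
```

Storing the sum and count rather than the average itself is what makes the update incremental: two averages cannot be merged without knowing how many records each one covers.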

Emily Mazo
