TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.


On Demand Materialized Views: A Scalable Solution for Graphs, Analysis or Machine Learning


Image by Author using Chart.js

Aggregating data for graphs, analysis, portfolios, or even machine learning can be an arduous task that is difficult to scale. In this article, I will go over MongoDB’s new(ish) $merge pipeline stage, which I feel resolves a lot of these scaling issues and automates certain design practices that previously took a lot of custom development to accomplish. However, Mongo’s documentation fails to provide in-depth examples or multiple use cases. This article dives heavily into MongoDB’s aggregation operations. It assumes you already know how to aggregate data and focuses primarily on the $merge stage as it relates to scalability, caching, and data growth.

Table of Contents
Basic Usage
Incrementing New Subsets of Data
Incrementing or Replacing a Field based off a conditional
Aggregating Data from Multiple Collections
Creating a Basic Graph or Machine Learning Data Set

Let’s create a simple example with some mock data. In this example, we will aggregate generic posts to determine how many posts each profile has, and then we will aggregate comments. If you are using the code snippets to follow along with this article, you will want to create a few data points following the style below. However, this solution would easily scale for a database with a large amount of…
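As a rough illustration of the idea described above, here is a minimal sketch of counting posts per profile and writing the result out with $merge. The collection names (`posts`, `profile_post_counts`) and the field names (`profileId`, `postCount`) are assumptions made for this sketch, not taken from the article, and the in-memory function is only a pure-Python stand-in so the example runs without a database.

```python
from collections import Counter

# Hypothetical mock posts; the field names are assumptions for illustration.
MOCK_POSTS = [
    {"_id": 1, "profileId": "alice", "text": "first post"},
    {"_id": 2, "profileId": "alice", "text": "second post"},
    {"_id": 3, "profileId": "bob", "text": "hello"},
]


def build_post_count_pipeline(target="profile_post_counts"):
    """Return a pipeline that counts posts per profile and writes the
    result into `target` using $merge (available since MongoDB 4.2)."""
    return [
        # Add one per post, grouped by the (assumed) profileId field.
        {"$group": {"_id": "$profileId", "postCount": {"$sum": 1}}},
        # $merge must be the last stage; it writes into the target collection.
        {
            "$merge": {
                "into": target,
                "on": "_id",
                "whenMatched": "replace",
                "whenNotMatched": "insert",
            }
        },
    ]


def emulate_in_memory(posts, view):
    """Pure-Python stand-in for the pipeline above, for illustration only."""
    counts = Counter(post["profileId"] for post in posts)
    for profile_id, post_count in counts.items():
        # Replace the existing entry or insert a new one, keyed by _id.
        view[profile_id] = {"_id": profile_id, "postCount": post_count}
    return view


if __name__ == "__main__":
    print(build_post_count_pipeline())
    print(emulate_in_memory(MOCK_POSTS, {}))
    # With pymongo and a running server, the pipeline could be run as:
    #   db.posts.aggregate(build_post_count_pipeline())
```

The target collection then acts as the on-demand materialized view: graphs or analysis code can read the precomputed counts from it instead of re-aggregating every post on each request.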


Published in TDS Archive



Written by Quest Henkart

Director of Software Architecture at Glidr.io
