On Demand Materialized Views: A Scalable Solution for Graphs, Analysis or Machine Learning
Aggregating data for graphs, analysis, portfolios, or even machine learning can be an arduous task that is difficult to scale. In this article, I will go over MongoDB's new(ish) $merge pipeline stage, which I feel resolves a lot of these scaling issues and automates certain design practices that previously took a lot of custom development to accomplish. However, MongoDB's documentation falls short on extended examples and multiple use cases. This article dives heavily into MongoDB's aggregation framework. It assumes you already know how to aggregate data and focuses primarily on the $merge stage, covering scalability, caching and data growth.
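As a rough preview of where this is headed, here is a minimal sketch of an aggregation that ends in a $merge stage. The collection names `posts` and `postCounts` and the field `profileId` are placeholders I am assuming for illustration, not names taken from a real schema. The idea is that the grouped results are written into a second collection instead of being returned to the caller, which is what makes the output behave like a materialized view.

```javascript
// Hypothetical names: `posts`, `profileId`, and `postCounts` are placeholders.
const pipeline = [
  // Count how many posts each profile has.
  { $group: { _id: "$profileId", totalPostCount: { $sum: 1 } } },
  // Write the grouped results into another collection.
  // $merge must be the last stage of the pipeline.
  {
    $merge: {
      into: "postCounts",        // target collection (created if missing)
      on: "_id",                 // field used to match existing documents
      whenMatched: "replace",    // overwrite a document that already exists
      whenNotMatched: "insert",  // add a document for a new profile
    },
  },
];

// `db` only exists inside mongosh; the guard keeps this file loadable elsewhere.
if (typeof db !== "undefined") {
  db.posts.aggregate(pipeline);
}
```

Run inside mongosh against a database that has a `posts` collection, this should leave one document per profile in `postCounts`. Re-running it refreshes those documents rather than duplicating them, because matching happens on `_id`.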
Table of Contents
Basic Usage
Incrementing New Subsets of Data
Incrementing or Replacing a Field based off a conditional
Aggregating Data from Multiple Collections
Creating a Basic Graph or Machine Learning Data Set
Let’s create a simple example with some mock data. In this example we will aggregate generic posts to determine how many posts each profile has, then we will aggregate comments. If you are using the code snippets to follow along, you will want to create a few data points following the style below. However, this solution would easily scale for a database with a large amount of…