Transform Notion into RSS reader. Part 2 — Multiple RSS endpoints

Enhancing our Notion-native RSS reader

Dmitry Kankalovich
Geek Culture
6 min read · Jun 2, 2021


TL;DR: We will expand our solution for reading RSS in Notion with the ability to configure multiple RSS endpoints as well as the ability to highlight new content. As usual, you can find all the sources here.

High-level architecture — v2

I’ve been using the solution we built in Part 1 for a while and soon realized that it needs improvements:

  1. Supply multiple RSS endpoints — one endpoint is simply not enough. The entire point of RSS is to collect multiple sources into one database, forming your personal daily feed.
  2. Highlight new RSS items — since a feed may be updated on a weekly or even monthly basis rather than daily, I need some visual guidance to quickly distinguish new items.
  3. Preview RSS items — the text-based links are good enough, but not awesome. It would be really nice to see a preview of the content, e.g. for video items or web pages.

Now, let’s start from the tail: previewing RSS items. Notion itself has a perfect tool for that: the Embed object type, which renders a content-specific widget for a given URL. For a web page it’s going to be a nice page content preview, and for, say, a video it can render an actual player. Very handy for our case.

However, sad news: at the time of writing, the Notion API does not support many object types, including Embed 😣. You can follow their Twitter and API spec for updates.

No doubt it will be added later, but for now there is nothing we can do about it.

Fortunately, the other improvements ARE something we can address, so without further ado, let’s jump in.

Supply multiple RSS endpoints

Previously we had stored a single RSS endpoint in our .env file and passed it to the Lambda as an env var. Now, we need to transition to something more sophisticated and extendable.

Given that we’re now going to have multiple RSS endpoints and their content must somehow nicely co-exist inside a Notion page, for each endpoint it makes sense to supply the following parameter group:

  • RSS URL — the actual URL to pull the data from
  • Title — a custom title to visually separate different RSS feeds upon import to Notion
  • Limit — the maximum number of items displayed per RSS feed
  • Updates Only — a true/false flag indicating whether we want to display only new RSS items

Since we’re using NodeJS-based Lambda, the natural structure for storing such a config would be a JSON object:
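As a rough sketch of what such a config could look like: the field names (url, title, limit, updatesOnly) and the feed URLs below are my assumptions for illustration, not necessarily the ones used in the repository. It is written as a JavaScript literal with a small sanity check; the same data would sit in a JSON file as a plain array of objects.

```javascript
// Hypothetical config shape: one entry per RSS endpoint.
// Field names and URLs are illustrative assumptions.
const rssConfig = [
  {
    url: 'https://example.com/blog/feed.xml',
    title: 'Example Blog',
    limit: 5,           // show at most 5 items from this feed
    updatesOnly: false, // keep older items, only highlight the new ones
  },
  {
    url: 'https://example.org/news/rss',
    title: 'Example News',
    limit: 10,
    updatesOnly: true,  // show only items that are new since the last run
  },
];

// Returns a list of human-readable problems; an empty list means the config looks sane.
function validateRssConfig(config) {
  if (!Array.isArray(config)) return ['config must be an array of feed entries'];
  const problems = [];
  config.forEach((feed, i) => {
    if (feed === null || typeof feed !== 'object') {
      problems.push(`feed ${i}: must be an object`);
      return;
    }
    if (typeof feed.url !== 'string' || !/^https?:\/\//.test(feed.url)) {
      problems.push(`feed ${i}: url must be an http(s) URL`);
    }
    if (typeof feed.title !== 'string' || feed.title.trim() === '') {
      problems.push(`feed ${i}: title must be a non-empty string`);
    }
    if (!Number.isInteger(feed.limit) || feed.limit < 1) {
      problems.push(`feed ${i}: limit must be a positive integer`);
    }
    if (typeof feed.updatesOnly !== 'boolean') {
      problems.push(`feed ${i}: updatesOnly must be true or false`);
    }
  });
  return problems;
}
```

Validating the config once at startup is optional, but it turns a typo in the file into a readable error instead of a confusing failure halfway through the import.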

As to where to actually store this configuration file, I’ve explored several options:

  • SSM Parameter Store — not good enough for storing complex objects, has a 4 KB size limit per standard parameter.
  • Secrets Manager — it can store JSON, but we’re not dealing with secrets here.
  • S3 — fits the case, but feels like too much. I do not want to introduce an entire object storage dependency just for this small file.
  • DynamoDB — the same kind of overkill as S3; however, this is the option I’d go with if I wanted to change the config without redeploying the stack.
  • Lambda env vars — we could Base64-encode the JSON payload, but env vars have a 4 KB total size limit.
  • Lambda handler hardcode — actually good enough, but it bothers me to mix configuration and code.
  • Lambda embedded file — we can place a config file in the Lambda package and read it in the handler. It is the best option at this point.

Following my favourite YAGNI and KISS principles, let’s go with embedding the config file in the Lambda package.

So go ahead and create resources/lambda/rss-config.json and fill it with the aforementioned configuration.

Rework our Lambda handler at resources/lambda/index.js to leverage the NodeJS-native way of reading a JSON configuration file. Prepend the file contents with this line:
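The NodeJS-native way here is presumably a plain require(), which parses .json files out of the box. A minimal sketch, where the loadRssConfig helper is a hypothetical wrapper I added only so the snippet can be tried with any path:

```javascript
// Node's require() parses .json files natively, so no fs/JSON.parse boilerplate is needed.
// In resources/lambda/index.js the line would presumably be something like:
//   const rssConfig = require('./rss-config.json');
const path = require('path');

// Hypothetical helper: same mechanism, but with an explicit, overridable path.
function loadRssConfig(configPath = path.join(__dirname, 'rss-config.json')) {
  return require(path.resolve(configPath));
}
```

Keep in mind that require() caches the parsed file, which is fine for a config that only changes on deployment.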

At this point, we need to asynchronously pull the contents of each RSS endpoint and transform it in accordance with the Notion API contract.

For the content pull operation, we could leverage something like Promise.all, but I actually think it is a good use case for a generator. Create a new file resources/lambda/rss_importer.js:
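To illustrate the idea, here is a minimal sketch of an async generator that pulls every configured feed. It is not the repository's implementation: importFeeds, the url field, and the injected fetchFeed function (any function that takes a URL and resolves to an array of items, e.g. a thin wrapper around an RSS parsing library) are assumptions of mine.

```javascript
// Async generator: starts all downloads up front, then yields the results
// one feed at a time, in the same order as the config.
// `fetchFeed(url)` is a hypothetical function resolving to an array of feed items.
async function* importFeeds(rssConfig, fetchFeed) {
  const pending = rssConfig.map((feedConfig) =>
    Promise.resolve()
      .then(() => fetchFeed(feedConfig.url))
      .then(
        (items) => ({ feedConfig, items, error: null }),
        // One broken endpoint should not sink the whole daily feed.
        (error) => ({ feedConfig, items: [], error })
      )
  );

  for (const result of pending) {
    yield await result;
  }
}
```

The consumer can then iterate with for await (const { feedConfig, items } of importFeeds(config, fetchFeed)) and handle each feed as soon as it is available, while the remaining downloads keep running in the background.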

And use this generator to pull and transform the RSS feeds in a new file resources/lambda/rss_transformer.js:
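The transformation step could look roughly like the sketch below, which turns one feed into a heading followed by a bulleted list of links. The function names and the item fields (title, link) are assumptions. The block layout follows the Notion API block format; note that API versions before 2022-02-22, including the one current when this article was written, name the rich_text field text instead.

```javascript
// Builds a Notion rich-text array with one text fragment, linked when a URL is given.
function richText(content, url) {
  const text = url ? { content, link: { url } } : { content };
  return [{ type: 'text', text }];
}

// Converts one imported feed into Notion blocks: a heading with the custom title,
// then one bulleted link per item, capped by the configured limit.
// `feedConfig.title`, `feedConfig.limit`, `item.title`, `item.link` are assumed field names.
function feedToBlocks({ feedConfig, items }) {
  const heading = {
    object: 'block',
    type: 'heading_2',
    heading_2: { rich_text: richText(feedConfig.title) },
  };

  const limited = items.slice(0, feedConfig.limit ?? items.length);
  const bullets = limited.map((item) => ({
    object: 'block',
    type: 'bulleted_list_item',
    bulleted_list_item: { rich_text: richText(item.title, item.link) },
  }));

  return [heading, ...bullets];
}
```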

Finally, put it all together in our handler at resources/lambda/index.js:
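A handler that ties the pieces together might be sketched like this. Everything passed into makeHandler is a hypothetical collaborator: importFeeds is assumed to return an async iterable of imported feeds, feedToBlocks to return an array of Notion blocks for one feed, and createPage to wrap whatever Notion client call creates the daily page.

```javascript
// Builds the Lambda handler from its collaborators, which keeps the handler itself
// free of any direct RSS or Notion details. All four dependencies are assumptions.
function makeHandler({ rssConfig, importFeeds, feedToBlocks, createPage }) {
  return async function handler() {
    const children = [];

    // Consume feeds one by one as the importer yields them.
    for await (const feed of importFeeds(rssConfig)) {
      children.push(...feedToBlocks(feed));
    }

    // One page per day, titled with the current date (YYYY-MM-DD).
    const title = `RSS feed ${new Date().toISOString().slice(0, 10)}`;
    await createPage({ title, children });

    return { blocks: children.length };
  };
}
```

In index.js this would end with something like exports.handler = makeHandler({ ... }), with the real importer, transformer, and Notion call plugged in.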

And that’s it for the Lambda for now!

Before we actually deploy the changes, clean up any usages of the now-obsolete environment variable RSS_FEED_URL in the .env and lib/notion-rss-feed-stack.ts files.

Once you’ve done that, deploy the changes:

You can manually invoke Lambda to test our changes. The result should look similar to this:

Multiple RSS endpoints

Convenient, isn’t it? Let’s move forward and add a highlight feature for new items.

Highlight new RSS items

We want to quickly identify new items in our daily feed. For that, we need to compare it with the previous day’s feed, which means that data must be stored somewhere.

The simplest and cheapest solution here is to leverage AWS S3 and store the RSS feed of the previous day, then use it to compare with the current day feed to highlight new elements.

S3 is object storage, which for our case is equivalent to file storage behind an HTTP API. So we need to come up with a unique file name per RSS feed. Looking at rss-config.json, the only truly unique identifier, with no possibility of collisions, is the URL of the RSS endpoint.

However, the URL itself might contain characters that are awkward in a file name, i.e. an S3 object key. The simplest solution here is to use a deterministic hash of the URL as the S3 object key. We only need a stable, unique name rather than cryptographic security, so the MD5 hashing function is good enough here and we’re going to use it.
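Deriving the key takes a couple of lines with Node's built-in crypto module. In this sketch the function name and the .json suffix are my own assumptions:

```javascript
const crypto = require('crypto');

// Maps an RSS endpoint URL to a stable, filename-safe S3 object key.
// The same URL always yields the same 32-character hex digest.
function feedObjectKey(feedUrl) {
  const digest = crypto.createHash('md5').update(feedUrl, 'utf8').digest('hex');
  return `${digest}.json`; // the ".json" suffix is an illustrative choice
}
```

The resulting key contains only hex characters and a dot, so there is nothing in it that needs escaping.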

Alright, so the plan is: store the previous day’s RSS feed in S3, pull it and compare it with today’s feed, and, depending on the configuration, either simply 🔥 highlight 🔥 new items or also remove the obsolete ones. Rinse and repeat the next day.
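The comparison itself boils down to a set lookup. A minimal sketch, assuming each item carries a link that identifies it and that the flag is called updatesOnly (both are assumptions, as is the exact way of marking a title):

```javascript
// Compares today's items with the previous day's and marks the new ones.
// Items are matched by `link`, which is assumed to be a stable identifier.
function diffFeed(todayItems, previousItems, { updatesOnly = false } = {}) {
  const seen = new Set(previousItems.map((item) => item.link));

  const marked = todayItems.map((item) => {
    const isNew = !seen.has(item.link);
    return { ...item, isNew, title: isNew ? `🔥 ${item.title}` : item.title };
  });

  // With updatesOnly, items that were already present yesterday are dropped.
  return updatesOnly ? marked.filter((item) => item.isNew) : marked;
}
```

With an empty previousItems array, which is the situation on the very first run, every item comes out as new.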

Pretty simple.

Given that we’re introducing a new AWS resource dependency for our Lambda, the scope of changes is this:

  • Infrastructure: provision an S3 bucket and grant the Lambda permissions to read from and write to it
  • Lambda: pull in the AWS SDK for S3 as a dependency and refactor the code to support the new feature

Let’s deal with it.

Infrastructure changes

AWS CDK makes it easy to provision a new bucket and link it with the Lambda, so simply edit our resource stack at lib/notion-rss-feed-stack.ts to look the following way:

Notice how we passed RSS_BUCKET_NAME to the Lambda and also explicitly assigned permissions via bucket.grantReadWrite(handler).

Now, moving on to the application logic.

Lambda changes

We’ll need a couple more dependencies added to the Lambda module, so navigate to the resources/lambda folder and run the following:

Once it’s done, let’s think about how best to refactor our application logic.

All combined, we’re starting to have a lot going on in that handler, so it’s better to isolate areas of functionality in their own files and link it all nicely together via the Facade pattern. Edit resources/lambda/index.js to look this way:
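To give a feel for the shape, here is a sketch of a facade in which the handler only talks to one simple entry point and each concern lives behind its own small interface. The four collaborators and their method names (importer.pull, store.load, store.save, transformer.toBlocks, publisher.publish) are hypothetical; in the real code they would correspond to the RSS, S3, and Notion modules.

```javascript
// Facade: one `run` method hides the importer (RSS), the store (yesterday's feed in S3),
// the transformer (diff + Notion blocks) and the publisher (Notion page creation).
function createRssFacade({ importer, store, transformer, publisher }) {
  return {
    async run(rssConfig) {
      const blocks = [];
      const snapshots = [];

      for (const feedConfig of rssConfig) {
        const today = await importer.pull(feedConfig);
        const previous = await store.load(feedConfig.url); // expected to be [] on the first run
        blocks.push(...transformer.toBlocks(feedConfig, today, previous));
        snapshots.push({ url: feedConfig.url, items: today });
      }

      await publisher.publish(blocks);

      // Persist today's feeds only after publishing succeeded, so a failed run
      // does not swallow tomorrow's highlights.
      for (const { url, items } of snapshots) {
        await store.save(url, items);
      }
      return blocks.length;
    },
  };
}
```

The handler then shrinks to a single call along the lines of exports.handler = () => facade.run(rssConfig).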

Looks nice and simple, doesn’t it?

Now, I am going to link the rest of the files via GitHub so I can stop bloating this article with code listings:

Once you’ve copied or edited the files, you’re set to deploy your changes.

As usual:

And then manually trigger your Lambda to verify the new feature. If everything is OK, you should notice two things:

  • The Notion daily RSS page will show all feed items highlighted with 🔥. That’s expected: since we’re running the new feature for the first time, there is simply no previous day to compare with.
  • The newly created S3 bucket will contain a bunch of files, which are really just today’s RSS items per endpoint and which will be compared against tomorrow’s daily RSS.

After a day or two, your daily RSS page will look similar to this:

Highlighted RSS feed

Notice how the new elements in the AWS Feed are highlighted and easily discoverable, and also how we chose to keep all other items there, even obsolete ones, in contrast to the other RSS feeds.

Find the complete source code here

That’s it for today, thanks for reading! I will be doing Part 3 when Notion implements support for the Embed type of page object, so stay tuned.
