Building a Personal RSS Email Digest Service in Go

Dan Bridges
Published in Parallel Thinking
3 min read · May 21, 2019
[Figure: An example email from our RSS digest service]

In this article I’ll go through how to build a small Go app that checks a list of RSS feeds and sends an email digest of any articles published in the previous 24 hours. This task involves concurrently fetching the feed data, parsing the XML, rendering HTML templates, and emailing the final result. Thanks to Go’s batteries-included approach, all of these tasks can be accomplished with just the standard library.

Structure

What we need to do:

  • Fetch the feed XML.
  • Unmarshal the feed into a collection of structs.
  • Check whether the feed has been updated in the past 24 hours; if not, proceed to the next feed.
  • Select the entries in each feed that were updated within the past 24 hours.
  • Use an HTML template to render those entries into a digest.
  • Email that digest to a recipient.

Fetching the Feeds

The very first step is to fetch the feeds, which we can do concurrently using goroutines. We define the input feed URLs, then the functions that fetch them.

fetchFeeds uses a channel to receive results from the goroutines processing each individual feed. Each goroutine runs fetchFeed, which issues a GET request to the URL, then reads and parses the response. We’ll get into the implementation of parseFeed in the next section. After starting the goroutines, fetchFeeds waits for them to complete, gathering the results as they arrive and ignoring any feed that returned an error.
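As a rough sketch of this pattern (not the exact code from the repository), the version below assumes a pared-down Feed struct and a stand-in parseFeed. The result type and newSampleServer are hypothetical helpers; the latter serves a tiny feed from a local test server so the example runs without touching the network.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// Feed is trimmed to a title here; fuller structs come in the next section.
type Feed struct {
	Title string `xml:"title"`
}

// parseFeed is a stand-in for the real parser described later.
func parseFeed(body []byte) (Feed, error) {
	var f Feed
	err := xml.Unmarshal(body, &f)
	return f, err
}

// fetchFeed issues a GET request, reads the body, and parses it.
func fetchFeed(url string) (Feed, error) {
	resp, err := http.Get(url)
	if err != nil {
		return Feed{}, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return Feed{}, fmt.Errorf("GET %s: %s", url, resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return Feed{}, err
	}
	return parseFeed(body)
}

// result (hypothetical) carries one goroutine's outcome over the channel.
type result struct {
	feed Feed
	err  error
}

// fetchFeeds fetches every URL concurrently and keeps the feeds that succeeded.
func fetchFeeds(urls []string) []Feed {
	results := make(chan result)
	for _, u := range urls {
		go func(u string) {
			f, err := fetchFeed(u)
			results <- result{f, err}
		}(u)
	}
	var feeds []Feed
	for range urls { // exactly one result arrives per URL
		if r := <-results; r.err == nil {
			feeds = append(feeds, r.feed)
		}
	}
	return feeds
}

// newSampleServer (hypothetical) serves a tiny Atom document locally.
func newSampleServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, `<feed xmlns="http://www.w3.org/2005/Atom"><title>Example Feed</title></feed>`)
	}))
}

func main() {
	srv := newSampleServer()
	defer srv.Close()
	// The second URL is malformed, so its error is silently dropped.
	feeds := fetchFeeds([]string{srv.URL, "://not-a-url"})
	fmt.Println(len(feeds), feeds[0].Title)
}
```

Because every goroutine sends exactly one result, counting receives against len(urls) is enough to know when all of them have finished, so no sync.WaitGroup is needed in this sketch.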

Unmarshaling ATOM XML

We will now take a look at parseFeed, which takes the XML body and parses it into a hierarchy of structs for further processing. When initially developing this portion of the app, I used static sample feed data instead of fetching the feeds each time. For simplicity we will focus on handling only ATOM feeds. Let’s start by grabbing the example feed from Wikipedia’s ATOM article.

Looking at the example feed, we can immediately see the two core datatypes we will need: a feed and its entries. Let’s create structs that map to those entities:

These are our most fundamental structs, but they also contain more basic structs such as Content, Link and Author:
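Here is a minimal sketch of what these structs might look like. The field names and XML tags are inferred from the Wikipedia sample rather than taken from the original source, Updated is kept as a plain string to keep the example short, and parseFeed is reduced to a thin wrapper around xml.Unmarshal.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Feed mirrors the top-level <feed> element.
type Feed struct {
	Title   string  `xml:"title"`
	Links   []Link  `xml:"link"`
	Updated string  `xml:"updated"` // plain string in this sketch only
	Author  Author  `xml:"author"`
	Entries []Entry `xml:"entry"`
}

// Entry mirrors a single <entry> element.
type Entry struct {
	Title   string  `xml:"title"`
	Links   []Link  `xml:"link"`
	ID      string  `xml:"id"`
	Updated string  `xml:"updated"`
	Summary string  `xml:"summary"`
	Content Content `xml:"content"`
}

// Content keeps the raw inner markup so it can be rendered later.
type Content struct {
	Type string `xml:"type,attr"`
	Body string `xml:",innerxml"`
}

// Link holds the attributes of a <link> element.
type Link struct {
	Rel  string `xml:"rel,attr"`
	Href string `xml:"href,attr"`
}

// Author mirrors the <author> element.
type Author struct {
	Name  string `xml:"name"`
	Email string `xml:"email"`
}

// sample is a shortened feed in the style of the Wikipedia example.
const sample = `<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <link href="http://example.org/"/>
  <updated>2003-12-13T18:30:02Z</updated>
  <author><name>John Doe</name></author>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link rel="alternate" href="http://example.org/2003/12/13/atom03.html"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <content type="xhtml"><p>This is the entry content.</p></content>
  </entry>
</feed>`

// parseFeed unmarshals an Atom document into the structs above.
func parseFeed(body []byte) (Feed, error) {
	var feed Feed
	err := xml.Unmarshal(body, &feed)
	return feed, err
}

func main() {
	feed, err := parseFeed([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println(feed.Title, "by", feed.Author.Name)
	for _, e := range feed.Entries {
		fmt.Println("-", e.Title, e.Links[0].Href, e.Content.Body)
	}
}
```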

One thing to note is the custom atomTime type used for the updated field. atomTime’s underlying type is time.Time. We need a custom type so we can implement an unmarshaler that converts the updated date string into something we can use in Go:
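One way such an unmarshaler could be written is sketched below. It assumes the timestamps are in RFC 3339 format, which is what Atom specifies; the entry struct and parseEntry helper are hypothetical and exist only to exercise the type.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
	"time"
)

// atomTime wraps time.Time so it can carry its own XML decoding logic.
type atomTime time.Time

// UnmarshalXML reads the element text and parses it as an RFC 3339 timestamp.
func (t *atomTime) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	var s string
	if err := d.DecodeElement(&s, &start); err != nil {
		return err
	}
	parsed, err := time.Parse(time.RFC3339, strings.TrimSpace(s))
	if err != nil {
		return err
	}
	*t = atomTime(parsed)
	return nil
}

// entry is a hypothetical, trimmed-down struct used only for this demo.
type entry struct {
	Title   string   `xml:"title"`
	Updated atomTime `xml:"updated"`
}

// parseEntry (hypothetical) decodes a single <entry> element.
func parseEntry(doc string) (entry, error) {
	var e entry
	err := xml.Unmarshal([]byte(doc), &e)
	return e, err
}

func main() {
	e, err := parseEntry(`<entry><title>Hello</title><updated>2019-05-21T08:15:00Z</updated></entry>`)
	if err != nil {
		panic(err)
	}
	fmt.Println(e.Title, time.Time(e.Updated).Format(time.RFC1123))

	// A value that is not a valid timestamp surfaces as a decode error.
	_, err = parseEntry(`<entry><updated>yesterday</updated></entry>`)
	fmt.Println("error:", err != nil)
}
```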

We also add some helper functions to atomTime to get the underlying time.Time and to provide a human-readable localized string:
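These helpers might look roughly like the following. The method names Time and String and the display layout are assumptions chosen for illustration, and published is a hypothetical sample value.

```go
package main

import (
	"fmt"
	"time"
)

// atomTime is the custom timestamp type from the previous step.
type atomTime time.Time

// Time returns the underlying time.Time value.
func (t atomTime) Time() time.Time {
	return time.Time(t)
}

// String formats the timestamp in the machine's local time zone.
func (t atomTime) String() string {
	return t.Time().Local().Format("Mon Jan 2, 2006 3:04 PM")
}

// published is a hypothetical sample value used by the demo below.
var published = atomTime(time.Date(2019, time.May, 21, 8, 15, 0, 0, time.UTC))

func main() {
	fmt.Println(published.Time().Year())
	// fmt picks up the String method, so the value prints in readable form.
	fmt.Println(published)
}
```

Defining String means the type satisfies fmt.Stringer, so templates and fmt calls print the readable form without any extra work.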

Parsing a feed and unmarshaling the data is now straightforward:

Finally, we need to filter our feeds, keeping only those updated in the last 24 hours:
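A possible shape for this filter is sketched below. To stay short it uses plain time.Time fields instead of atomTime, and the recentFeeds name, the fixed now value, and the sample data are all hypothetical. Dropping a feed whose entries are all stale is also a choice made for this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// Entry and Feed are reduced to the fields the filter needs.
type Entry struct {
	Title   string
	Updated time.Time
}

type Feed struct {
	Title   string
	Updated time.Time
	Entries []Entry
}

// A fixed "now" keeps the demo deterministic; a real run would use time.Now().
var (
	now    = time.Date(2019, time.May, 21, 12, 0, 0, 0, time.UTC)
	cutoff = now.Add(-24 * time.Hour)
)

// recentFeeds keeps feeds updated after cutoff and trims each one
// down to the entries that were also updated after cutoff.
func recentFeeds(feeds []Feed, cutoff time.Time) []Feed {
	var out []Feed
	for _, f := range feeds {
		if !f.Updated.After(cutoff) {
			continue // feed is stale, so skip its entries entirely
		}
		var entries []Entry
		for _, e := range f.Entries {
			if e.Updated.After(cutoff) {
				entries = append(entries, e)
			}
		}
		if len(entries) > 0 {
			f.Entries = entries // f is a copy, so the input is untouched
			out = append(out, f)
		}
	}
	return out
}

// sampleFeeds returns one active feed and one dormant feed.
func sampleFeeds() []Feed {
	return []Feed{
		{Title: "Active", Updated: now.Add(-2 * time.Hour), Entries: []Entry{
			{Title: "Fresh post", Updated: now.Add(-2 * time.Hour)},
			{Title: "Old post", Updated: now.Add(-72 * time.Hour)},
		}},
		{Title: "Dormant", Updated: now.Add(-240 * time.Hour), Entries: []Entry{
			{Title: "Ancient post", Updated: now.Add(-240 * time.Hour)},
		}},
	}
}

func main() {
	for _, f := range recentFeeds(sampleFeeds(), cutoff) {
		fmt.Println(f.Title)
		for _, e := range f.Entries {
			fmt.Println(" -", e.Title)
		}
	}
}
```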

HTML Templates

We want to display our feeds in a nice HTML email. We’ll use Go’s html/template package to render the data. We have two HTML templates: the main layout and a template for a single feed. We’ll also add a stylesheet template to render our styles in the head tag.

Our Content struct has an HTML method that wraps the content in template.HTML so it is rendered as markup instead of being escaped. We also do some crude regex replacements to remove any CDATA tags from the content and to prefix image URLs with the base URL so they display correctly.
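The sketch below shows one way such a method could work. The BaseURL field and both regular expressions are assumptions made for illustration; the patterns are deliberately crude and only rewrite img sources that begin with a single slash.

```go
package main

import (
	"fmt"
	"html/template"
	"regexp"
)

// Content holds raw entry markup. BaseURL is a hypothetical field that
// stores the site root used to resolve relative image paths.
type Content struct {
	Body    string
	BaseURL string
}

var (
	// cdataRe matches the opening and closing CDATA markers.
	cdataRe = regexp.MustCompile(`<!\[CDATA\[|\]\]>`)
	// imgSrcRe captures an img tag up to a root-relative src value.
	imgSrcRe = regexp.MustCompile(`(<img[^>]*\ssrc=")(/[^/"][^"]*)`)
)

// HTML strips CDATA markers, prefixes root-relative image paths with
// BaseURL, and marks the result as safe so it is not escaped on render.
func (c Content) HTML() template.HTML {
	s := cdataRe.ReplaceAllString(c.Body, "")
	s = imgSrcRe.ReplaceAllStringFunc(s, func(m string) string {
		parts := imgSrcRe.FindStringSubmatch(m)
		return parts[1] + c.BaseURL + parts[2]
	})
	return template.HTML(s)
}

func main() {
	c := Content{
		Body:    `<![CDATA[<p>Hi</p><img alt="x" src="/img/a.png">]]>`,
		BaseURL: "https://example.com",
	}
	fmt.Println(c.HTML())
}
```

Note that template.HTML turns off escaping for this value, so it should only wrap content from feeds you trust.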

We will store all of our templates in their own directory for easy loading and rendering:
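To illustrate how the three templates fit together, here is a self-contained sketch in which an inline string stands in for the files in the templates directory. The template names (layout, feed, styles), the markup, and the renderDigest function are all hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

// Entry and Feed are reduced to what the templates display.
type Entry struct{ Title, Link string }

type Feed struct {
	Title   string
	Entries []Entry
}

// templates stands in for the separate files in the templates directory.
const templates = `
{{define "styles"}}<style>h2 { font-family: sans-serif; }</style>{{end}}
{{define "feed"}}<h2>{{.Title}}</h2><ul>{{range .Entries}}<li><a href="{{.Link}}">{{.Title}}</a></li>{{end}}</ul>{{end}}
{{define "layout"}}<html><head>{{template "styles"}}</head><body>{{range .}}{{template "feed" .}}{{end}}</body></html>{{end}}
`

// renderDigest executes the layout template and returns the HTML as a string.
func renderDigest(feeds []Feed) (string, error) {
	t, err := template.New("digest").Parse(templates)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.ExecuteTemplate(&buf, "layout", feeds); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renderDigest([]Feed{
		{Title: "Go News", Entries: []Entry{{Title: "Hello", Link: "https://example.com/a"}}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

With real files on disk, a call such as template.ParseGlob("templates/*.html") could take the place of the inline string, assuming that directory layout.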

Emailing the Digest

Once we have rendered our HTML to a string, emailing is accomplished using the net/smtp package. SendMail does the heavy lifting for us; all we need to do is supply appropriate headers to the message:
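A minimal sketch of this step might look like the following. It assumes the settings come from MAIL_* environment variables, and buildMessage and sendDigest are hypothetical names. When MAIL_HOST is unset, the program only prints the message it would have sent.

```go
package main

import (
	"fmt"
	"net/smtp"
	"os"
)

// buildMessage prepends the headers a mail client needs to render HTML.
// Header lines end in CRLF and a blank line separates them from the body.
func buildMessage(from, to, subject, html string) []byte {
	headers := "From: " + from + "\r\n" +
		"To: " + to + "\r\n" +
		"Subject: " + subject + "\r\n" +
		"MIME-Version: 1.0\r\n" +
		"Content-Type: text/html; charset=\"UTF-8\"\r\n" +
		"\r\n"
	return []byte(headers + html)
}

// sendDigest reads the MAIL_* variables and hands the message to smtp.SendMail.
func sendDigest(subject, html string) error {
	from, to := os.Getenv("MAIL_FROM"), os.Getenv("MAIL_TO")
	host, port := os.Getenv("MAIL_HOST"), os.Getenv("MAIL_PORT")
	auth := smtp.PlainAuth("", from, os.Getenv("MAIL_PASSWORD"), host)
	msg := buildMessage(from, to, subject, html)
	return smtp.SendMail(host+":"+port, auth, from, []string{to}, msg)
}

func main() {
	const body = "<h1>Digest</h1>"
	if os.Getenv("MAIL_HOST") == "" {
		// Nothing configured: show the message instead of sending it.
		fmt.Printf("%s\n", buildMessage("sender@example.com", "recipient@example.com", "RSS Digest", body))
		return
	}
	if err := sendDigest("RSS Digest", body); err != nil {
		fmt.Fprintln(os.Stderr, "send failed:", err)
		os.Exit(1)
	}
}
```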

If using Gmail, the environment variables will look something like this:

MAIL_FROM=sender@gmail.com
MAIL_TO=recipient@gmail.com
MAIL_PASSWORD=<sender password goes here>
MAIL_HOST=smtp.gmail.com
MAIL_PORT=587

Additionally, you will have to enable “Less secure app access” in the Security pane of your Gmail account. I’d recommend not sending mail from your primary account.

Wrapping Up

We now have all of the pieces put together. Every time we run the program, it should fetch the given feeds, filter out entries not updated in the last 24 hours, and email the results. I have a Raspberry Pi server that runs this as a cron job once a day.

The full source code is available on GitHub: https://github.com/dbridges/rss-digest

About the author: Dan Bridges (Parallel Thinking) is a software developer at Beezwax Datatools and a former researcher in Physics & Neuroscience.