Building an RSS feed using Claudia.js and AWS Lambda

Bear with me…

I wanted to create an RSS feed for the Digital Outcomes and Specialists (DOS) opportunities area of the UK Government Digital Marketplace. I’ll come to why in a moment. First, I’ll talk about how.

As far as I can tell, you can search for DOS opportunities in the Digital Marketplace portal and set up email alerts for new ones, but there isn’t an RSS feed for them. I don’t know why. I may just have missed it.

To create an RSS feed, I needed a way of grabbing the list of opportunities as HTML, parsing it and converting it to RSS. I also needed to surface the resulting RSS at a URL so that other services could consume it.

Enter AWS Lambda:

AWS Lambda is a compute service that lets you run code without provisioning or managing servers. AWS Lambda executes your code only when needed and scales automatically, from a few requests per day to thousands per second. You pay only for the compute time you consume — there is no charge when your code is not running. With AWS Lambda, you can run code for virtually any type of application or backend service — all with zero administration. AWS Lambda runs your code on a high-availability compute infrastructure and performs all of the administration of the compute resources, including server and operating system maintenance, capacity provisioning and automatic scaling, code monitoring and logging. All you need to do is supply your code in one of the languages that AWS Lambda supports (currently Node.js, Java, C# and Python).

and Claudia.js:

Claudia makes it easy to deploy Node.js projects to AWS Lambda and API Gateway. It automates all the error-prone deployment and configuration tasks, and sets everything up the way JavaScript developers expect out of the box.

Claudia.js is a framework for creating Lambda functions. I use it for most of the Lambda functions I create. The good news is that it takes care of the heavy lifting around configuring the AWS API Gateway (which is what allows you to interact with the Lambda function over HTTP). The bad news is, you have to code in Node.js — which is not one of my favorite programming languages. Oh well, needs must.

This isn’t intended to be a tutorial for Claudia.js but, for the record, this is pretty much what I did:

mkdir dos2rss
cd dos2rss
npm init
npm install claudia-api-builder -S
npm install cheerio -S
npm install request -S
npm install request-promise -S

I then created a file called app.js containing the following code:

var ApiBuilder = require('claudia-api-builder'),
    api = new ApiBuilder(),
    cheerio = require('cheerio'),
    rp = require('request-promise'),
    urlRoot = 'https://www.digitalmarketplace.service.gov.uk',
    urlSearchPath = '/digital-outcomes-and-specialists/opportunities?status=live';

module.exports = api;

api.get('/rss', function () {
    // Build the feed afresh on every request. Lambda can reuse a container
    // between invocations, so a module-level string would keep growing.
    var rss = '<?xml version="1.0" encoding="UTF-8" ?>\n';
    rss += '<rss version="2.0">\n';
    rss += '<channel>\n';
    rss += '<title>Digital Outcomes and Specialists opportunities</title>\n';
    rss += '<link>' + urlRoot + urlSearchPath + '</link>\n';
    rss += '<description>Buyer requirements for digital outcomes, digital specialists and user research participants</description>\n';

    var options = {
        url: urlRoot + urlSearchPath,
        // Hand the response body to cheerio so the promise resolves to a parsed document
        transform: function (body) {
            return cheerio.load(body);
        }
    };

    return rp(options)
        .then(function ($) {
            // One <item> per search result on the page
            $('.search-result').each(function (i, element) {
                var anchor = $(this).children('.search-result-title').children('a');
                rss += '<item>\n';
                rss += '<title>' + anchor.text() + '</title>\n';
                rss += '<link>' + urlRoot + anchor.attr('href') + '</link>\n';
                rss += '<description>' + $(this).children('.search-result-excerpt').text() + '</description>\n';
                rss += '</item>\n';
            });
            rss += '</channel>\n';
            rss += '</rss>\n';
            return rss;
        })
        .catch(function (err) {
            // If the fetch or parse fails, return an empty feed rather than an error
            return '<?xml version="1.0" encoding="UTF-8" ?><rss version="2.0"><channel></channel></rss>';
        });
}, { success: { contentType: 'application/rss+xml' } });

Don’t worry too much if you don’t understand the details of this. Basically, it grabs the HTML list of current opportunities (using a Node library called ‘request-promise’), parses it (using a Node library called ‘cheerio’), selects the bits we are interested in and then converts them to RSS XML format.
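One caveat worth flagging: the scraped text is concatenated straight into the XML, so an opportunity title containing an ampersand or an angle bracket could produce a feed that stricter readers reject. Here is a minimal sketch of how the item-building step might escape its inputs. The names escapeXml and buildItem are my own hypothetical helpers, not part of cheerio or Claudia.js, and the function assumes the title, link and description have already been pulled out of the page as strings.

```javascript
// Hypothetical helper: replace the five characters that are special in XML.
function escapeXml(text) {
  var entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&apos;' };
  return String(text).replace(/[&<>"']/g, function (ch) {
    return entities[ch];
  });
}

// Hypothetical helper: build one <item> from strings already extracted from the page.
function buildItem(title, link, description) {
  return '<item>\n' +
    '<title>' + escapeXml(title) + '</title>\n' +
    '<link>' + escapeXml(link) + '</link>\n' +
    '<description>' + escapeXml(description) + '</description>\n' +
    '</item>\n';
}
```

Inside the .each() loop, something like rss += buildItem(title, link, description) would then take the place of the individual string concatenations.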

Note to self: this code will break if/when the Digital Marketplace HTML is updated!
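One way to at least notice when that happens would be to sanity-check what the selectors returned before building the feed. This is only a sketch under assumptions: checkScrape is a hypothetical function of my own, and it expects each search result to have been collected into a plain object with title and link properties first, which the code above doesn't do. A page with genuinely no live opportunities would also trip the first check, so treat the result as a hint rather than proof.

```javascript
// Hypothetical check: given the items scraped from the page, list anything
// that suggests the selectors no longer match the Digital Marketplace HTML.
function checkScrape(items) {
  var problems = [];
  if (items.length === 0) {
    problems.push('no .search-result elements matched');
  }
  items.forEach(function (item, i) {
    if (!item.title) {
      problems.push('item ' + i + ' has no title');
    }
    if (!item.link) {
      problems.push('item ' + i + ' has no link');
    }
  });
  return problems;
}
```

An empty array means nothing looked wrong. Anything else could be written out with console.log so that it appears in the function's logs.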

Finally, I ran:

claudia create --region eu-west-1 --api-module app

to deploy the Lambda function to the AWS region in Dublin.

Claudia.js reported the URL at which my new Lambda function was available and, voilà, an RSS feed was born.

So, why did I want this?

Good question.

Partly to learn a bit more about Lambda, Node.js and HTML parsing. That’s enough of a reason, right? Partly because I want to start taking copies of all the lists of ‘Essential skills and experience’ that are required for opportunities coming up in UK Government work.

I hope to start storing these away in DynamoDB (the AWS NoSQL database service) so that they can be analysed over time. I figure it’ll be interesting to see how the required technologies change. I’m guessing that demand for data-related skills will grow, for example. Ditto Blockchain (rightly or wrongly). Ditto low-code and integration platforms (Mulesoft, for example). We’ll see.

I haven’t done this second bit yet but it’s on my to-do list. Creating the RSS feed was just the first step. It allows me to feed opportunity alerts into services like IFTTT, which gives me more flexibility in how they are handled from there.

I’ll report back when I’ve done more…
