How I update dynamic content on GitHub Pages with AWS Lambda

I’m running a site called nexttexteditor.com. It’s hosted on GitHub Pages and generated with Jekyll. (GitHub Pages has built-in support for Jekyll: as soon as you push new code to GitHub, it regenerates your site, often in a matter of seconds.)

Anyhow, the site is rather simple and consists of a list of JavaScript frameworks for building rich text editors.

The information about the frameworks is stored in a YAML file, which works kind of like a database.

The information in this YAML file is then presented on the website. So basically, this YAML-file …

YAML configuration file for all editors

… Becomes this website.

The YAML file is generated to this list by Jekyll

This is nothing new, it’s just the way Jekyll works. But if you pay attention to the screenshots you’ll see that I’m presenting information like the number of GitHub stars, the number of open issues and when the repository was last updated. How can you automatically update this kind of data on a statically generated site? The answer is an AWS Lambda function, scheduled via CloudWatch, that talks to the GitHub API.

The Lambda function is written in JavaScript and runs in a Node.js environment.

The flow for updating data is as follows:

  1. A CloudWatch rule is scheduled to run a Lambda function every 12 hours.
  2. The function downloads the file _data/editors.yml via the GitHub API.
  3. It converts the YAML to JSON to make handling more comfortable.
  4. It iterates over all “editors” and fetches information about each of the repositories.
  5. It updates the editors.yml content with fresh values.
  6. It commits and pushes editors.yml to GitHub.
  7. GitHub Pages notices the new commit and starts to generate the new site using Jekyll. Done!
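
Steps 2 and 6 both depend on the fact that the GitHub contents API transfers the file body as base64. Here is a minimal sketch of that round trip using only Node's built-in Buffer. The response object is a hand-made stand-in with placeholder values, not real API output, and the helper names are made up for illustration:

```javascript
// Hand-made stand-in for a contents API response (hypothetical values).
const fakeResponse = {
    sha: 'abc123', // placeholder, not a real blob SHA
    encoding: 'base64',
    content: Buffer.from('- name: Example\n  github: https://github.com/owner/repo\n').toString('base64')
}

// Step 2: decode the file body that GitHub returns as base64.
const decodeContent = (response) => Buffer.from(response.content, 'base64').toString('utf8')

// Step 6: build the JSON body for the PUT request that commits the new file.
const buildUpdateBody = (yamlString, sha, message) => JSON.stringify({
    message: message,
    content: Buffer.from(yamlString, 'utf8').toString('base64'),
    sha: sha // SHA of the file version that is being replaced
})

const yamlText = decodeContent(fakeResponse)
const body = JSON.parse(buildUpdateBody(yamlText, fakeResponse.sha, 'Updated editors.yml with Lambda'))
console.info(body.sha, Buffer.from(body.content, 'base64').toString('utf8') === yamlText)
```

The sha has to travel from the download in step 2 to the commit in step 6, which is why the function passes repoData.sha along to updateRepo.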

The Lambda function

You can find the source for this function on GitHub.

const request = require('request')
const YAML = require('yamljs')

/*
Required environment vars
*/
const userAgent = process.env['USER_AGENT']
const username = process.env['USERNAME']
const email = process.env['EMAIL']
const key = process.env['KEY']

const urlToUpdate = `https://${username}:${key}@api.github.com/repos/andene/texteditors/contents/_data/editors.yml`


/**
 * Takes a URL for a GitHub repository and removes the
 * first part containing github.com
 *
 * @param url
 * @returns {string}
 */
const parseUrlForRepoName = (url) => {
    const githubString = 'https://github.com'
    // The expression must stay on the same line as `return`; otherwise
    // automatic semicolon insertion makes the function return undefined
    return url.substring(githubString.length, url.length)
}


/**
 * Helper method to get option object for a request
 *
 * @param method
 * @param url
 * @returns {{url: *, method: string, headers: {User-Agent: *}}}
 */
const getOptionsForRequest = (method, url) => {
    return {
        url: url,
        method: method.toUpperCase(),
        headers: {
            'User-Agent': userAgent
        }
    }
}


/**
 * Returns information for a GitHub repository.
 * Resolves with the response body parsed as JSON
 *
 * @param url
 * @returns {Promise}
 */
const getRepoInformation = (url) => {

    return new Promise((resolve, reject) => {

        const options = getOptionsForRequest('get', url)
        request(options, function (error, response, body) {
            if (error) {
                reject(error)
                return
            }

            try {
                const repoData = JSON.parse(body)
                console.info(`Fetched:: ${repoData.name} (${repoData.url})`)
                resolve(repoData)
            } catch (parseError) {
                // A body that is not valid JSON rejects the promise instead of throwing
                reject(parseError)
            }
        })
    })
}


/**
 * Updates selected properties on editor object
 *
 * @param githubInformation
 * @param editor
 */
const updateFile = (githubInformation, editor) => {
    editor.stargazers_count = githubInformation.stargazers_count
    editor.open_issues = githubInformation.open_issues
    editor.watchers = githubInformation.watchers
    editor.updated_at = githubInformation.updated_at
    editor.github_description = githubInformation.description
}


/**
 * Commits the updated list of editors back to GitHub as editors.yml
 *
 * @param editors
 * @param sha SHA of the file version that is being replaced
 * @returns {Promise}
 */
const updateRepo = (editors, sha) => {

    return new Promise((resolve, reject) => {
        const yamlString = YAML.stringify(editors, 2)
        // Buffer.from replaces the deprecated `new Buffer(...)` constructor
        const base64Yaml = Buffer.from(yamlString)

        const updateOptions = getOptionsForRequest('put', urlToUpdate)
        updateOptions.body = JSON.stringify({
            "message": "Updated editors.yml with Lambda",
            "committer": {
                "name": username,
                "email": email
            },
            "content": base64Yaml.toString('base64'),
            "sha": sha
        })

        request(updateOptions, function (error, response, body) {
            if (error) {
                reject(error)
                return
            }

            const parsedBody = JSON.parse(body)
            const result = `Done:: File is pushed with SHA: ${parsedBody.commit.sha} (${parsedBody.commit.html_url})`
            resolve(result)
        })
    })

}

// Learn more about these Lambda params at http://docs.aws.amazon.com/lambda/latest/dg/welcome.html
const start = (event, context, callback) => {

    const options = getOptionsForRequest('get', urlToUpdate)
    getRepoInformation(options.url)
        .then(repoData => {

            const fileContent = Buffer.from(repoData.content, 'base64')
            const editors = YAML.parse(String(fileContent))

            const repoFetches = editors.map(editor => {
                const repoPath = parseUrlForRepoName(editor.github)
                const repoUrl = `https://${username}:${key}@api.github.com/repos${repoPath}`
                // `return` and the call must share a line, or map() yields undefined
                return getRepoInformation(repoUrl)
                    .then(repoInformation => {
                        updateFile(repoInformation, editor)
                    })
            })

            return Promise.all(repoFetches).then(() => {
                console.info(`All editors updated`)
                return updateRepo(editors, repoData.sha)
            })
        })
        .then(result => {
            callback(null, result)
        })
        .catch(error => {
            // Report any failure (fetching, parsing or committing) to Lambda
            callback(error, null)
        })
}


if (process.env['LAMBDA_TASK_ROOT']) { // Running in AWS Lambda
    exports.handler = (event, context, callback) => {
        process.env['PATH'] = process.env['PATH'] + ':' + process.env['LAMBDA_TASK_ROOT']
        start(event, context, callback)
    }

} else { // Running in local environment
    start({}, {}, (error, result) => {
        if (error) {
            console.error(error)
            return
        }
        console.info(result)
    })
}
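
One thing the function above does not do is look at the HTTP status code. The request library only reports an error for network-level failures, so an API error such as a rate limit or a bad token arrives as a normal response whose JSON body has a message field. Here is a small sketch of a check that could sit in front of the JSON.parse calls. parseGithubResponse is a hypothetical helper name, not part of the original source, and the bodies are hand-written examples:

```javascript
// Hypothetical helper: turn a GitHub API response into parsed JSON,
// or throw if the status code signals a failure.
const parseGithubResponse = (statusCode, body) => {
    const parsed = JSON.parse(body)
    if (statusCode < 200 || statusCode >= 300) {
        // GitHub error bodies normally carry a human-readable "message"
        throw new Error(`GitHub API responded with ${statusCode}: ${parsed.message || 'no message'}`)
    }
    return parsed
}

// Example with hand-written bodies (not real API output)
const ok = parseGithubResponse(200, '{"name": "texteditors"}')
console.info(ok.name)

try {
    parseGithubResponse(403, '{"message": "API rate limit exceeded"}')
} catch (error) {
    console.error(error.message)
}
```

Inside the request callback this would be called as parseGithubResponse(response.statusCode, body), with any thrown error passed on to reject.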

Create the Lambda

If you have never created a Lambda before, I recommend the official getting started guide. I usually start out with a ‘Blank function’.

Remember to include the dependencies in the zip file. In my case I’m using a YAML parser, and request to simplify handling of HTTP requests.

The way I do it is just to zip index.js together with node_modules.

Then log in to the AWS Console, create your function and upload your code. I’m also using environment variables to pass my GitHub username and personal access token to the function.

When you are ready, just press ‘Save and test’.
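
Since the function depends on four environment variables (USER_AGENT, USERNAME, EMAIL and KEY), it can help to fail early when one of them is missing rather than send a half-configured request. Here is a minimal sketch of such a check. readConfig is a hypothetical helper that is not in the original function, and the values are placeholders:

```javascript
// Hypothetical helper: read required settings from an environment object
// and fail early with a clear message if any are missing.
const readConfig = (env, names) => {
    const missing = names.filter(name => !env[name])
    if (missing.length > 0) {
        throw new Error(`Missing environment variables: ${missing.join(', ')}`)
    }
    return names.reduce((config, name) => {
        config[name] = env[name]
        return config
    }, {})
}

// In the Lambda this would be called with process.env;
// a plain object is used here so the example runs anywhere.
const config = readConfig(
    { USER_AGENT: 'my-agent', USERNAME: 'someone', EMAIL: 'someone@example.com', KEY: 'not-a-real-token' },
    ['USER_AGENT', 'USERNAME', 'EMAIL', 'KEY']
)
console.info(Object.keys(config).join(', '))
```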

The console output from Lambda (also available in CloudWatch)

Run Lambda function every 12 hours

By using CloudWatch you can set up a rule that will execute the Lambda function every twelve hours, or at whatever interval you want.

Go to CloudWatch, click Rules and create a new rule. The source of the event should be of type “Scheduled”. After selecting that option you can either choose a fixed rate or specify a cron expression.
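
For reference, here is what the two kinds of schedule expressions look like when written out. The rate form is rate(value unit). The cron form used by CloudWatch has six fields (minutes, hours, day of month, month, day of week, year) and is evaluated in UTC. The helper name below is made up for illustration:

```javascript
// Hypothetical helper that formats a CloudWatch rate expression.
// The unit is singular when the value is 1 ("1 hour"), plural otherwise.
const rateExpression = (value, unit) => {
    const units = ['minute', 'hour', 'day']
    if (!Number.isInteger(value) || value < 1 || !units.includes(unit)) {
        throw new Error(`Unsupported rate: ${value} ${unit}`)
    }
    return `rate(${value} ${value === 1 ? unit : unit + 's'})`
}

// Fixed rate: every 12 hours
const everyTwelveHours = rateExpression(12, 'hour')

// Cron alternative: at 00:00 and 12:00 UTC every day
const twiceADay = 'cron(0 0/12 * * ? *)'

console.info(everyTwelveHours, twiceADay)
```

The fixed rate is the simpler choice here, since it does not matter at which time of day the data is refreshed.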

On the right side you should be able to find your Lambda function. My lambda function is called github-downloader in the example below.

Create a CloudWatch rule to trigger the Lambda function (Either in CloudWatch or when creating the Lambda)

Get a Github personal access token

Hopefully you have enabled 2FA for your GitHub account. If you have, you will need to generate a personal access token for your account in order to use the GitHub API.

Go to your settings page > Personal access tokens (lower left corner) > click “Generate new token”.

You will then see a newly generated key. Save it!
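
The function above puts the username and token directly into the request URL. As an alternative sketch, the GitHub API also accepts the token in an Authorization header, which keeps it out of URLs that might end up in logs. authHeaders is a hypothetical helper, and the token value is a placeholder:

```javascript
// Hypothetical helper: request headers that authenticate with a
// personal access token instead of embedding it in the URL.
const authHeaders = (userAgent, token) => ({
    'User-Agent': userAgent, // GitHub rejects API requests without a User-Agent
    'Authorization': `token ${token}`
})

// Options object in the same shape the request library expects
const options = {
    url: 'https://api.github.com/repos/andene/texteditors/contents/_data/editors.yml',
    method: 'GET',
    headers: authHeaders('my-agent', 'not-a-real-token')
}
console.info(options.headers.Authorization.startsWith('token '))
```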

Summary

I’m very pleased with this solution. I don’t have to worry about servers being up and configured, and running AWS Lambda is very cheap: this function takes about 3 seconds to run.

I really like running Jekyll on GitHub Pages. As long as you have Internet access and a computer or phone, you can update your site no matter where you are, or you can have a Lambda update it for you.

Updating a site with information about GitHub repositories is just one use case. I’m sure you can find other areas and APIs where this might come in handy.

Just a tip: there is a lot of information about repositories available through the GitHub API. For example, check out the aws-cli repo.
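
As a sketch of what else could be copied over, here is a variant of updateFile that picks a few more fields from the repository response. The field names are ones the GitHub repositories API returns. The sample object is hand-written, and pickRepoFields is a hypothetical name:

```javascript
// Hypothetical variant of updateFile: select a few extra fields
// from a repository response object.
const pickRepoFields = (repo) => ({
    stargazers_count: repo.stargazers_count,
    forks_count: repo.forks_count,
    open_issues: repo.open_issues,
    language: repo.language,
    // pushed_at is the time of the most recent push to the repository
    pushed_at: repo.pushed_at,
    // license is null for repositories without a detected license
    license: repo.license ? repo.license.spdx_id : null
})

// Hand-written sample, not real API output
const sampleRepo = {
    stargazers_count: 10,
    forks_count: 2,
    open_issues: 1,
    language: 'JavaScript',
    pushed_at: '2017-01-01T00:00:00Z',
    license: { spdx_id: 'MIT' }
}
const fields = pickRepoFields(sampleRepo)
console.info(fields.license, fields.forks_count)
```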

Links