From online CMS to static site generator: motivation and process

Dmitry Medvinsky
e-Legion
Apr 23, 2019

A short while ago we managed to sell to our management the thing many developers dream of — a total rewrite. Of our company’s website engine, that is. Below we describe our motivation and the process of migration from an old online CMS-based website to a new one based on a static site generator.

What is the site about, though?

e-Legion is a mobile app development company. Our website consists of several informational pages like “about us”, “contacts” and such, a list of vacancies and a portfolio — a large number of showcases of our projects which have similar appearance but unique features. There isn’t really any user interaction. The site is fully bilingual, though, with the language determined by the domain name.

There are different kinds of people working with the website content: marketing managers, HR managers, and developers. Each person uses different tools for various tasks: whilst a frontend developer working on a new portfolio case very much prefers working via git pull; vim; git push kind of way, an HR manager fixing a typo in a vacancy description would rather change a single letter via a web interface, press a single “save” button and keep on doing HR.

Before

Historically, the previous incarnation of the website was built upon a custom engine using the Ruby on Rails framework. The CMS was based on rails_admin. Some parts used a WYSIWYG editor (CKEditor); in other places an HTML editor (CodeMirror) was used.

Motivation

Why though, if it works? Here goes:

  1. The problem of data synchronization between the repo and the database.
    A simple RoR-based CMS uses a simple RDBMS, SQLite (by the way, this is a good example of SQLite being a good fit for a production website). The content managed via the online CMS is stored in the database, not in the repository. This tremendously complicates things for developers who need to make major changes to a portfolio case, for example. The catch is that the initial implementation of a case is stored in the repo as a HAML template, which is compiled and written into the database during deployment so that it can be edited via the CMS. The reverse conversion is difficult and, although possible, will almost certainly not produce an identical source, causing problems like broken diffs and painful merges.
  2. Operational cost.
    Even though the site is not user-interactive, it still requires the full range of backend infrastructure: at least one VM with a reverse-proxy and an app server. It’s no secret that Rails apps require quite a bit of RAM, which costs money.
  3. Performance.
    Performance optimization is more challenging. Caching pages/fragments requires code. Cache invalidation is one of the toughest problems in software. Well, not in this case, but still… CDN integration requires some consideration, too.
  4. Ease of use.
    WYSIWYG sucks. Everybody knows that. More often than not one needs to press the “view source” button and work directly with HTML. The HTML produced from an ERB/HAML template is rarely pretty, and working with it in a browser textarea is painful even for the most thick-skinned of us.

The first three points are fully solved by a static site generator: there is nothing to synchronize because all data lives in a single store; you don’t need much power to serve a couple of static files in the 21st century; and caching and CDN integration become trivial.

Other notable things about this approach:

  • We ended up hosting the site on GitHub Pages. They automatically manage Let’s Encrypt SSL certificates and integrate with Akamai CDN, all for free.
  • All changes are stored in the Git log. Before, some of them were in the Git log, the others were in the rails_admin changelog, which isn’t so great.
  • Testing changes before rolling them out to production is way easier. Synchronizing all the content between staging and production used to be hard, and manually making changes on stage before applying them to prod was so tedious that, being optional, it was usually skipped. At some point, staging was shut down completely for being unused while consuming resources for nothing. Now the process is largely automated and effectively mandatory: pushing to master deploys to stage automatically; a successful build can be deployed to prod by a press of a button in the GitLab UI.
GitHub Pages Settings UI

Hello? It’s 2019

Well, the advantages are obvious and have been for quite a while. No, our developers didn’t just wake up in their caves with a prophecy of static sites inscribed on the walls; in fact, they had been trying to sell this idea to the management for several years. But besides the unplanned budget, the main obstacle was teaching the primary content editors (marketing managers) to use the whole new system. They were already fairly comfortable working with HTML code, but learning the entirely new ways of Git is challenging even for more tech-savvy users.

Back then, we had an idea to create a custom frontend with an intuitive interface, but this takes time and effort which is something to consider when the developers’ resources are scarce.

Time went by, and things like GitLab with its Web IDE appeared. We created a small prototype of the future CMS based on GitLab and presented it to the marketing folks along with a short lesson on how to use it. It took less than an hour of explanation, and they liked it better than the old CMS! So when the time for a website redesign came, the backend rewrite was launched as well.

GitLab Web IDE UI

The simple tech behind it

Here are a few details on what we’ve got:

Project root

Obviously:

  • it’s backed by Node.js and uses Gulp task runner;
  • on the top level it’s got: app (application files: templates, js, css), images (well, images), pages (the content which is to be edited), public (the directory served over HTTP).
pages subdirectory

The content is structured in an intuitive way so that one can easily find the necessary page by knowing its URL. Pug templates are used, as their syntax is cleaner and less error-prone than that of HTML.

Building JS and CSS is nothing interesting, so we skip that. Instead, here is a non-verbatim piece of a task that builds the pages:

const path = require('path');
const gulp = require('gulp');
const data = require('gulp-data');           // per-file data for template rendering
const pug = require('gulp-pug');
const rename = require('gulp-rename');
const frontmatter = require('front-matter'); // parses page metadata

// loadSharedData, loadPageData and pugOptions are defined elsewhere.
gulp.task('pug', () => {
  // Preload shared data only once.
  let sharedData = {};
  for (let lang of ["en", "ru"]) {
    sharedData[lang] = loadSharedData(lang);
  }
  // Load necessary data from other files for each file render.
  let dataGetter = (file) => {
    let content = frontmatter(String(file.contents));
    file.contents = Buffer.from(content.body);
    let data = content.attributes;
    data.lang = file.relative.split(path.sep)[0];
    data.env = process.env.NODE_ENV;
    loadPageData(data, sharedData);
    return data;
  };
  // Remove html file extensions in URLs.
  let renamer = (filepath) => {
    if (filepath.basename === '404') {
      return; // special case for GitHub Pages
    }
    if (filepath.basename !== 'index') {
      filepath.dirname = path.join(filepath.dirname, filepath.basename);
      filepath.basename = 'index';
    }
  };
  return gulp
    .src('./pages/**/*.pug')
    .pipe(data(dataGetter))
    .pipe(pug(pugOptions))
    .pipe(rename(renamer))
    .pipe(gulp.dest('./public'));
});

To load the data used in template rendering we use gulp-data. Each page’s metadata is kept right inside its template using front-matter. Shared data, like the projects and vacancies lists, is loaded by a pretty simple custom data getter that provides every page with the data it requires.
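To see what the data getter relies on, the front-matter split can be sketched with plain string handling (a simplified stand-in for the front-matter package, which additionally parses the YAML block into an object; splitFrontmatter is a name invented here for illustration):

```javascript
// Simplified sketch of what front-matter does for the data getter:
// split a page source into raw metadata (between the --- fences) and the body.
// The real front-matter package also parses the YAML into an `attributes` object.
function splitFrontmatter(source) {
  const match = /^---\n([\s\S]*?)\n---\n?/.exec(source);
  if (!match) return { raw: '', body: source };
  return { raw: match[1], body: source.slice(match[0].length) };
}

const page = ['---', 'title: FAQ', 'layout: base', '---', 'h1 FAQ'].join('\n');
console.log(splitFrontmatter(page).raw);   // title: FAQ\nlayout: base
console.log(splitFrontmatter(page).body);  // h1 FAQ
```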

Besides that, gulp-rename is used in order to make URLs pretty. Files like faq.html are renamed into faq/index.html so that they are accessible via /faq/ URL.

CI Build Job

The second important thing that we’d like to take a look at is GitLab CI/CD configuration:

stages:
  - build
  - stage
  - release

build_sites:
  stage: build
  tags:
    - npm
  before_script:
    - make deps
  script:
    - make build
  variables:
    NODE_ENV: production
  artifacts:
    when: on_success
    expire_in: 7 days
    paths:
      - public

deploy_staging:
  stage: stage
  tags:
    - npm
  only:
    - master
  environment: staging
  dependencies:
    - build_sites
  script:
    - make deploy_server
  variables:
    SSH_USER: elegion

deploy_production:
  stage: release
  when: manual
  tags:
    - npm
  only:
    - master
  environment: production
  dependencies:
    - build_sites
  script:
    - make -j2 deploy_ghpages

Things to note:

  • The build is triggered by a push to any branch, so developers get quick feedback if the build fails in a feature branch.
  • Any push to master branch triggers the staging deploy. For staging we just use rsync --archive --compress --delete --copy-links ./public ${SSH_USER}@${SSH_HOST}:. Yes, it’s that simple.
  • Production deploy job becomes available upon successful build and is as simple as a single button click.
Deploy to prod is just one click away after the green lights are on.

Conclusion

We spent about 2 days on the prototype, plus 3 days to get the full “engine” running. Configuring CI/CD took a couple of hours. The rest (content and web design) would have been spent anyway due to the planned redesign. In the end, everyone is happier: developers (simple is better than complex), devops (they are pretty much out of the process altogether) and content managers (to quote one of them: “What I can tell you for sure is that now I don’t get the urge to close my eyes and run away when I need to change something on the site”).

If your IT company is still using a dynamic CMS for informational and marketing websites that have little to no backend interaction because you or your management is afraid the switch will cause trouble for your marketing team, look at our success story and get inspired. Modern tools like GitLab make the transition seamless and the experience more pleasant than an ugly and/or uncomfortable CMS. Even if there are some questions at first, any developer at hand can help, since everything is simple and intuitive to anyone who knows Git.
