Creating a lean stack from first principles

Keeping things simple to iterate quickly

Carlos Roman
Published in Mission Beyond
7 min read · May 13, 2021


For the Talent Compass team, time is of the essence. To validate our assumptions quickly without spending too much time building throw-away components, we aimed to use as many off-the-shelf products as possible (free would be a bonus).

Finding skills

The core of one of our smoke tests was to find a user’s skills from a conversation they were having with the app. We decided the ideal approach would be a skills database/service into which we could feed the user’s input and get back a list of Knowledge, Skills, Abilities, and Other characteristics (KSAOs). So we set out to find an existing one.

After some investigation, we found the European Skills, Competences, Qualifications and Occupations (ESCO) classification. This is a comprehensive, multilingual dataset covering all the EU labour market-relevant KSAOs (we’ll call them skills for simplicity) as well as a taxonomy of occupations. The best part: it’s published as a Terse RDF Triple Language (Turtle) file, which we could easily build into a graph representation of the data.

Why is that important? The format captures not only the relationships between skills and other skills, but also the relationships between skills and occupations. These are “semantic triples” that look like:

  • Subject: orchestrate music
  • Predicate: is broader skill of
  • Object: compose music

The graph’s power is in these relationships that allow us to infer new facts about the data.

We have the data, what now?

Data is useless if we can’t query it, so we needed to load it into a graph database. I settled on Neo4j as it has a plugin called Neosemantics which would load the data quickly, and Neo4j’s query language (Cypher) is easier to use than SPARQL.
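Loading the Turtle file boils down to a couple of Neosemantics procedure calls. Here is a minimal sketch of how that load might look from Kotlin using the Neo4j Java driver; the connection details and the path to the ESCO export are placeholders rather than our real setup:

```kotlin
import org.neo4j.driver.AuthTokens
import org.neo4j.driver.GraphDatabase

fun main() {
    // Placeholder connection details for this sketch.
    val driver = GraphDatabase.driver("bolt://localhost:7687", AuthTokens.basic("neo4j", "secret"))

    driver.session().use { session ->
        // Neosemantics needs a uniqueness constraint on Resource.uri before any RDF import.
        session.run("CREATE CONSTRAINT n10s_unique_uri ON (r:Resource) ASSERT r.uri IS UNIQUE")
        // Initialise the graph config with the defaults.
        session.run("CALL n10s.graphconfig.init()")
        // Pull the ESCO Turtle export straight into the graph (placeholder path).
        session.run(
            "CALL n10s.rdf.import.fetch(\$url, 'Turtle')",
            mapOf<String, Any>("url" to "file:///imports/esco.ttl")
        )
    }
    driver.close()
}
```

In practice the same calls can simply be run in the Neo4j browser; the point is that a handful of lines turn the published RDF into a queryable graph.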

The relationships between “Compose Music” and “Orchestrate music” in Neo4j

Using Neo4j’s full-text search, we created search indexes on the labels and descriptions so we could start searching for skills from a few keywords. This also meant there was no need to set up something complex like Elasticsearch just to get full-text search.
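The index itself is only a couple of procedure calls as well. Here is a rough sketch; the label and property names (Skill, prefLabel, description) are assumptions and depend entirely on how the RDF was mapped during import:

```kotlin
import org.neo4j.driver.Session

// Create a full-text index over skill labels and descriptions.
// Label and property names here are illustrative; they depend on the import mapping.
fun createSkillIndex(session: Session) {
    session.run(
        """CALL db.index.fulltext.createNodeIndex("skillSearch", ["Skill"], ["prefLabel", "description"])"""
    )
}

// Query the index with some keywords and return the best-matching skill labels first.
fun searchSkills(session: Session, keywords: String): List<String> =
    session.run(
        """CALL db.index.fulltext.queryNodes("skillSearch", ${'$'}keywords)
           YIELD node, score
           RETURN node.prefLabel AS label
           ORDER BY score DESC LIMIT 10""",
        mapOf<String, Any>("keywords" to keywords)
    ).list { record -> record.get("label").asString() }
```

The search string accepts Lucene query syntax, so things like fuzzy matching come along for free.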

But how to expose the database in a consumable way?

I initially thought about overlaying a RESTful interface and returning a domain model as JSON, which I could then pipe into something like jq to manipulate. However, I wanted to concentrate on the queries rather than on writing code for the service.

I kept it simple with Spring Boot and Spring Data Neo4j. This is probably not the stack many would have chosen for rapid application development.

Why such a heavy framework when this could have been written in Golang/Rust/Node.js/Elixir? Because of how quick it was to get it up and running.

Using Spring Initializr, I generated a basic project with all the dependencies needed. I could connect to our Neo4j database with only two lines of configuration. With another couple of lines of Kotlin, my domain model was mapped to the graph in Neo4j. A handful more and I had a Spring Data repository set up. Thanks to Spring Data REST, this was automatically exposed as a REST endpoint.
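To give a flavour of how little code that is, here is a sketch along those lines. The node, property and relationship names below are illustrative rather than our actual model, and the connection really is just the spring.neo4j.uri and credential properties in application.properties:

```kotlin
import org.springframework.data.neo4j.core.schema.Id
import org.springframework.data.neo4j.core.schema.Node
import org.springframework.data.neo4j.core.schema.Property
import org.springframework.data.neo4j.core.schema.Relationship
import org.springframework.data.neo4j.repository.Neo4jRepository
import org.springframework.data.rest.core.annotation.RepositoryRestResource

// Illustrative domain model: a skill node plus its narrower neighbours in the taxonomy.
@Node("Skill")
data class Skill(
    @Id val uri: String,
    @Property("prefLabel") val label: String,
    val description: String? = null,
    // The relationship type is an assumption; it depends on how the RDF predicates were mapped.
    @Relationship(type = "BROADER_THAN", direction = Relationship.Direction.OUTGOING)
    val narrowerSkills: List<Skill> = emptyList()
)

// Spring Data REST exposes this automatically at /skills; no controller code needed.
@RepositoryRestResource(path = "skills")
interface SkillRepository : Neo4jRepository<Skill, String>
```

That interface on its own gives us a paged /skills endpoint over HTTP without writing a single controller.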

There was no need to parse responses from Neo4j into JSON objects or to set up API endpoints and paths. It was all taken care of, which meant I could concentrate on the Cypher queries.
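When a Cypher query did need hand-writing, it could live on the repository as an annotated method, and Spring Data REST picks those up under the /skills/search resource. A simplified sketch, reusing the full-text index from the earlier example (names again illustrative):

```kotlin
import org.springframework.data.neo4j.repository.Neo4jRepository
import org.springframework.data.neo4j.repository.query.Query
import org.springframework.data.repository.query.Param
import org.springframework.data.rest.core.annotation.RepositoryRestResource

// Same repository as before, now with a hand-written Cypher search method.
@RepositoryRestResource(path = "skills")
interface SkillRepository : Neo4jRepository<Skill, String> {

    // Exposed by Spring Data REST as GET /skills/search/findByKeywords?keywords=...
    @Query(
        "CALL db.index.fulltext.queryNodes('skillSearch', \$keywords) " +
        "YIELD node, score RETURN node ORDER BY score DESC LIMIT 10"
    )
    fun findByKeywords(@Param("keywords") keywords: String): List<Skill>
}
```

From there, iterating meant editing a Cypher string and redeploying, rather than touching any serialisation or routing code.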

Querying the API endpoint made things faster and easier, but it was not ideal when running user testing. We wanted to explore the data more visually so we could feed it back to our user interface, a “bot” called Kai.

The visual part of the stack

We wanted a frontend that was quick and easy to build so we could visualise the data we were searching for. With the help of Ed Compton, we decided to create it using Next.js. Rather than talking directly to Neo4j, the frontend reused the REST endpoint, as it already returned a simplified view of the data in Neo4j.

This meant we could work in parallel on querying the data and the front end without stepping on each other's toes. So while Ed worked on displaying the data, I carried on mining the data and improving the searches.

So we now have the DNA of our stack 🥳

But we wanted others to be able to play with it. As the services were being created, we made sure we could run them all in containers using Docker Compose. It was a bit of an investment to begin with, but it paid off: we had a stack that could be deployed anywhere. This achieved our goal of creating an easy and repeatable deploy process.

You might think we deployed the containers to a Kubernetes (K8s) cluster; however, I felt this would be the wrong use of our time. Starting a cluster is easy with tools like kOps on AWS, but it would be another component to manage. Not to mention we would have to write a whole bunch of YAML to deploy the services and expose them to the world. Oh, and we would also have to write some sort of CI/CD pipeline to push any changes to the cluster.

So having ruled out K8s, I thought “this is just a Jamstack with one service and a DB”: pretty much an S3 bucket with CloudFront in front and an EC2 instance (or two) to hold the “services”. Something that could be spun up manually with a few clicks in the AWS console, but I wanted something we could easily tear down and rebuild. It could all be set up with CloudFormation or Terraform in a day, yet that felt a bit “heavy” for our startup to invest in straight away, and it would still mean writing some sort of pipeline to push changes.

The idea of setting up EC2 instances in autoscaling groups and testing that it all worked felt like a bad use of our time and the opposite of having a lean stack.

Then I remembered I’d summarised our stack as “just a Jamstack with one service and a DB”, and everything was just “containers”. If only Heroku had an easy API and were not so expensive for deploying our stack.

It turns out that in the last couple of years this space has seen a lot of competition. One of the contenders is Render, who describe themselves as the “Zero DevOps cloud for developers and teams”. With a tagline like that, and with my DevOps engineer hat on, I felt the need to challenge them.

Challenge accepted!

So after the morning stand-up, fuelled by some fresh coffee, I began reading the documentation Render had published. It all seemed too good to be true. Their service spec looked just like a Docker Compose file with some extra syntactic sugar. We deployed our frontend first as it was the simplest component of our stack; Render also has native support for Next.js to make things easier. According to the Render example, deploying a Node server meant all I needed was ten lines of YAML. It turns out that was all I needed (after tracking down the person who could give Render access to our private repo).

With the frontend up, I looked at our Spring Boot application and how to deploy the Docker container we had for it. Again, the documentation made it seem too good to be true. Following their Docker examples, I had our container deployed with no trouble at all. The final step was to copy and paste the same lines of YAML and point them at the Dockerfile for our DB. Another commit and push and we had our stack fully deployed. The only problems I ran into were the typos I had made in the code referencing environment variables.

In the space of an hour or so, I had fully deployed our stack and made it accessible to the team.

There was no need to worry about EC2 autoscaling groups, K8s pod replica sets or configuring our CI/CD to watch for changes in our repo. Out of the box, Render lived up to its tagline and helped us deploy our lean stack in a “zero DevOps” way. For our rapid deployment project, Render was perfect.

Our services all deployed and running

Our lean stack

With Render as our platform, we had a quick and efficient way to deploy our project. Using the ESCO skills taxonomy, we were able to quickly build a skills service with Spring Boot to power our Next.js frontend. Together, these components have allowed us to iterate quickly and efficiently without having to worry about things like pods or EC2 instances.

Without all those overheads we were able to concentrate on improving our queries and how we accessed the data, ultimately leading to a richer, more relevant user experience for Talent Compass.
