Using Real-life Experiences to Teach an Engineering Course

Roman Kuchin
Pipedrive R&D Blog
9 min read · Jan 13, 2021


Most people know that to keep your mind in good shape and stay up to date, you need to keep studying new things; otherwise, you’re likely to get left behind in your field. With that in mind, how would studying through teaching work?

This article is about how another Pipedrive infrastructure engineer and I began teaching a course about IT infrastructure services at Tallinn University of Technology, and what we learned from it.

What are we teaching?

The course we were asked to teach is “IT infrastructure services”. The course’s study plan covers a list of services: DNS, AD, email, FTP, etc. Despite working as infrastructure engineers, we hadn’t had any significant experience with some of these services and didn’t feel we could teach others, for example, the internals of AD or how email protocols work. Luckily, the university gave us full freedom over what we teach during the course, as long as it still relates to the course name to some degree.

With this in mind, we decided to design a new course but keep the existing name. Instead of teaching how email/FTP/AD work, we mainly focus on how to operate infrastructure in general and how to provide infrastructure for a company as a service. And of course, it should be Infrastructure as Code.

What services should be covered?

We decided to cover only services that we use heavily in our daily work. There’s no point in teaching something that we’ve never deployed for production use. This decision also saved us a lot of time: we can give lectures with relatively little preparation and with several real-life examples and use cases.

This is the service list that we used (a shortlist for almost every startup today):
- Nginx
- Bind9
- MySQL
- Prometheus
- InfluxDB+Telegraf
- Grafana
- Docker
- HAProxy
- Keepalived


How the course was built

The course we are teaching is mostly practical. We try to keep it this way so that it’s more interesting for us and more engaging for the students. In the beginning we would spend a few hours (sometimes an entire lecture) just explaining the theory: what configuration management is, how web servers work, or why we need different SSH keys. After only a short time, usually 3-4 lectures, the theory explanations would give way to technical demos, practical exercises and detailed discussions.

We provide a technical task (lab) after every lecture, and students have a few virtual machines (VMs) provisioned for them that they need to configure to get a particular service running.

The first task is usually pretty trivial, something to get familiar with the environment. Last year, for example, the task was to get access to the virtual machine using an SSH key.

Each subsequent task builds on a few previous ones, with the overall goal of improving the setup, provisioning a new service, or reconfiguring existing ones.

The final tasks are usually quite sophisticated, such as setting up a highly available web app in Docker behind HAProxy, supported by a replicated MySQL cluster and the students’ own DNS (all backed up and monitored, of course). This is a huge leap for those who couldn’t tell the difference between public and private SSH keys only a couple of months prior.

Managing the study environment

The university created a tenant for us in the university cloud (based on OpenStack). Through this we had access to the cloud web interface, but not the API. We could easily create the first and second VM and could also “click through” a few more, but creating 3 VMs for each of almost 100 students through the web UI was clearly not the right way to go.

In the very early preparation stage we realized that we absolutely needed API access to the cloud. While teaching a course about automation and configuration management, you can’t afford to set it all up manually!

Luckily, the cloud web interface was operating via API itself, and getting the needed API requests from the browser was easy. We then wrapped them into a Python script and got ourselves a nice tool to manage student VMs — we called it “vm-admin”. This allowed us to create and delete VMs automatically, for any student or for all of them at once.
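The core of a tool like this is a reconciliation step: compare the VMs that should exist with the ones that already do, and create only what is missing. Below is a minimal sketch of that idea. The article does not show the actual API requests, so the `FakeCloud` class is a hypothetical in-memory stand-in, and the function names and the `<student>-<n>` naming scheme are assumptions made for illustration.

```python
from dataclasses import dataclass, field

VMS_PER_STUDENT = 3  # the article mentions 3 VMs per student


@dataclass
class FakeCloud:
    """Hypothetical in-memory stand-in for the cloud API (not the real one)."""
    vms: set = field(default_factory=set)

    def create(self, name: str) -> None:
        self.vms.add(name)

    def delete(self, name: str) -> None:
        self.vms.discard(name)


def vm_names(student: str, count: int = VMS_PER_STUDENT) -> list:
    # Assumed naming scheme: "<student>-1", "<student>-2", ...
    return [f"{student}-{i}" for i in range(1, count + 1)]


def ensure_vms(cloud: FakeCloud, students: list) -> list:
    """Create only the VMs that are missing; return the names created."""
    created = []
    for student in students:
        for name in vm_names(student):
            if name not in cloud.vms:
                cloud.create(name)
                created.append(name)
    return created


def delete_vms(cloud: FakeCloud, student: str) -> None:
    """Remove every VM that belongs to one student."""
    for name in vm_names(student):
        cloud.delete(name)
```

Because `ensure_vms` skips VMs that already exist, running it twice in a row is harmless, which is what makes "for any student or for all of them at once" safe to automate.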

Unfortunately, we immediately faced several new problems:

  • How do we manage the list of students?
  • How do we know who owns a VM already, and who needs another one?
  • How do we provide access to the VMs?

So we thought — what if we delegate this to the students themselves? Instead of asking them to email us the SSH keys, or even worse, generating a password for any new VM and sharing it, we utilized GitHub. We wrote another script that would integrate the GitHub API and our own API to manage the VMs, and created a GitHub account for it. We named this script “github-bot” and asked our students to do two things:
1. add their SSH key to the GitHub profile
2. add our GitHub bot user as a collaborator to their repository

After that, all we needed to do was sit back, relax, and watch our GitHub bot fetch the new invites from GitHub, enroll the students, download their SSH keys, start the needed VMs, and add the keys there!
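The enrollment flow described above can be sketched roughly as follows. This is not the actual "github-bot" code, which the article does not show. GitHub serves a user's public SSH keys as plain text at `https://github.com/<username>.keys`; everything else here (the function names, the `enrolled` dictionary, skipping students who have no key yet) is an assumption. The key download is passed in as a function so that the sketch runs without network access.

```python
from typing import Callable, Dict, Iterable, List


def keys_url(username: str) -> str:
    # GitHub serves a user's public SSH keys as plain text at this URL.
    return f"https://github.com/{username}.keys"


def parse_keys(text: str) -> List[str]:
    """Split the plain-text response into individual non-empty key lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def enroll(
    inviters: Iterable[str],
    enrolled: Dict[str, List[str]],
    fetch_keys: Callable[[str], str],
) -> List[str]:
    """Enroll students who invited the bot and are not yet known.

    `inviters` holds the usernames that sent a collaborator invite.
    `fetch_keys` is injected so the sketch runs without network access.
    Students with no public key are skipped until they add one
    (an assumption of this sketch).
    """
    newly_enrolled = []
    for username in inviters:
        if username in enrolled:
            continue
        keys = parse_keys(fetch_keys(username))
        if keys:
            enrolled[username] = keys
            newly_enrolled.append(username)
    return newly_enrolled


def authorized_keys(keys: List[str]) -> str:
    """Render the content of an authorized_keys file for a student's VM."""
    return "\n".join(keys) + "\n"
```

In a real bot, `fetch_keys` would download the text from `keys_url(username)`, and the list of inviters would come from the pending repository invitations of the bot account.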


We also added student activity monitoring: the script searched for students with no commits in the last 15 days and killed their VMs. Once new activity was detected, new VMs were created, and all this happened without a single click from us.
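The inactivity rule is simple enough to sketch in a few lines. The 15-day threshold comes from the text; the function name, the input format (a mapping from student to the time of their latest commit), and the choice to treat "no commits at all" as inactive are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

INACTIVITY_LIMIT = timedelta(days=15)  # threshold mentioned in the article


def inactive_students(
    last_commit: Dict[str, Optional[datetime]],
    now: datetime,
    limit: timedelta = INACTIVITY_LIMIT,
) -> List[str]:
    """Return students whose latest commit is older than `limit`.

    A value of None means no commit was ever seen; treating that as
    inactive is an assumption of this sketch.
    """
    return sorted(
        student
        for student, ts in last_commit.items()
        if ts is None or now - ts > limit
    )


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = {"alice": now - timedelta(days=2), "bob": now - timedelta(days=20)}
    print(inactive_students(sample, now))
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the rule easy to test.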

Best practices from our daily work

As we wanted to give practical and up-to-date knowledge to students, we established limitations similar to what we have at our own workplace.

First rule:
No manual changes are allowed; configuration management procedures must be used. Ansible helped us a lot to achieve this. To enforce the IaC way and to push the message that we don’t expect manual configuration, we destroy and recreate the students’ VMs every night. Every morning the VMs have new IPs, so hardcoding something in the code just won’t work in the long run.

If the Ansible code is written correctly, provisioning web services, databases, load balancers and monitoring should take only a few minutes. Quite quickly, students realize that the only way to save their work is to make all the changes in Ansible playbooks rather than directly on the VMs.
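The article does not say how the changing IPs were fed to Ansible. One common way to avoid hardcoding them is a dynamic inventory: a script that prints the current hosts as JSON in the structure Ansible expects (groups with a `hosts` list, plus `_meta.hostvars`). The sketch below builds that structure from VM records; the record format and the sample data are hypothetical.

```python
import json
from typing import Dict, List


def build_inventory(vms: List[Dict[str, str]]) -> Dict:
    """Build Ansible's dynamic-inventory JSON structure from VM records.

    Each record is assumed to look like
    {"name": "alice-1", "group": "web", "ip": "10.0.0.5"}.
    """
    inventory: Dict = {"_meta": {"hostvars": {}}}
    for vm in vms:
        group = inventory.setdefault(vm["group"], {"hosts": []})
        group["hosts"].append(vm["name"])
        # The current IP is looked up at run time, never written into playbooks.
        inventory["_meta"]["hostvars"][vm["name"]] = {"ansible_host": vm["ip"]}
    return inventory


if __name__ == "__main__":
    # Hypothetical records; a real script would query the cloud for these.
    sample = [
        {"name": "alice-1", "group": "web", "ip": "10.0.0.5"},
        {"name": "alice-2", "group": "db", "ip": "10.0.0.6"},
    ]
    print(json.dumps(build_inventory(sample), indent=2))
```

Playbooks then refer only to group names such as `web` or `db`, so a nightly rebuild with new addresses does not require any change to the code.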

Second rule:
If the code isn’t on GitHub, it doesn’t exist for us. We don’t check how the VMs are configured; we check the code instead.

Third rule:
Fix teacher mistakes; we’re human too! All the course-related code (lectures, lab tasks, demos, even some scripts we use) is also on GitHub. If students see mistakes, they can create pull requests with their fixes. The quantity and quality of PRs to the teachers’ repos earn bonus points in the exam.

For technical issues and questions we used GitHub Issues.

Checking labs

Since we had quite a few students, checking each and every lab solution didn’t seem feasible. Instead, we tried to write the lab tasks in a way that lets students check the result themselves. We also wrote (yet another) script, “Judge Dredd”, that did some basic lab checks (a required file is present, a service is listening on the required port, and so on) and generated an individual results page for every student. Exam solutions were still checked manually.
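The two kinds of checks mentioned above can be sketched with the Python standard library alone. This is not the actual "Judge Dredd" code; the function names and the plain-text results format are assumptions, and the file check here looks at the local filesystem, whereas a real checker would have to run on the student's VM or over SSH.

```python
import os
import socket
from typing import Dict


def file_present(path: str) -> bool:
    """Check that a required file exists."""
    return os.path.isfile(path)


def port_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def render_results(student: str, checks: Dict[str, bool]) -> str:
    """Render a plain-text results page for one student."""
    lines = [f"Results for {student}"]
    for name, passed in checks.items():
        lines.append(f"{'PASS' if passed else 'FAIL'}  {name}")
    return "\n".join(lines)
```

A lab definition then reduces to a dictionary of named checks per student, for example `{"nginx listens on port 80": port_listening(ip, 80)}`, which `render_results` turns into that student's page.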

Storytelling

As this experience turned out to be mostly fun for us, we also tried to make sure it was fun for the students. With this in mind, we made the course feel like the story of a small startup that each of our students joined as an infrastructure engineer. Starting off, just having the first VM with Nginx is fine, but then we need a database, then our own domain, and it grows bigger and bigger like a snowball. Once we started to get complaints about slowness, it was the perfect time to add some monitoring and alerting for our infrastructure. One day, we noticed that we could potentially lose all customer data, so it was a good day to talk about backups and then set them up. At the very end of it all comes the exam.
Here is the exam’s intro:

Our application is ready to be released!

Today is the release day, a link to the web app was published on the Internet and you start to serve your first happy customers. You monitor all your infrastructure components and are ready to react to any problems.

Of course everything that could go wrong — goes wrong! Suddenly one of your DNS instances just died without any reason. While you were checking the logs one HAProxy crashed! And Docker containers with web app just start to disappear!

Luckily, you did a great job in the last few months to build fault-tolerant infrastructure and happy customers didn’t notice any problems with web app during that time.


After evaluating it, we agreed that this is the exam that we ourselves would enjoy taking.

Why?

Even though we manage our study cloud with Ansible and a lot of work is done by scripts, it’s still very time-consuming. There should be strong reasons and motivation to do this; here are ours:

Fun
Just that. It’s just for fun; call it a hobby if you like. It’s something new in our daily routine.

Self-improvement
If you really understand a technology, you should be able to (for example) explain it to your grandmother. Sadly, we found quite a few gaps in our own experience with infrastructure services while preparing for this course, and now that it’s finished we feel much better about our knowledge. This course also helped to refresh the knowledge about well known technologies. During the lectures students sometimes ask “What if …?” or “How to …?” about things that we often never even think of!

Mad troubleshooting skills
In our daily work we usually see code that works; with students it’s the opposite. We rarely get a chance to see as many broken things in our daily work as we see in every lab with our great students. Most of the lab time is spent helping to troubleshoot issues, which can sometimes be extremely tricky, like “Oh, you have a password length of 9500 characters (literally, 9 kilobytes), probably that’s why it doesn’t work” or “Your username starts with 0 and Ansible interprets it as an octal number”.

Often these problems end up as simply fun memories afterwards, but quite a few times we’ve later seen very similar problems in our production environments, and then it’s like “Ah, I’ve already seen this error!”

New teammates
Our course lasts 16 weeks, more than enough time to get to know the students and understand who will suit our team at Pipedrive best and bring the most value to the company. The course is taught in the second year of the bachelor’s program, so the best students are not yet taken by other companies. Also, new interns have already taken a crash course in IT infrastructure services, so it takes them less time to get started as infrastructure engineer interns at Pipedrive.

By the way, we’re hiring…

Interested in working in Pipedrive?

We’re currently hiring for several different positions in several different countries/cities.

Take a look and see if something suits you

Positions include:

— Back-End Developer
— Front-End Developer
— Full Stack Developer
— Lead Engineer
— Database Engineer
— iOS Developer
— Junior Infrastructure Engineer
— And several more
