Distributed Systems 101

Alan Byrne
Published in Version 1
2 min read · Mar 28, 2024

This short article provides useful resources to get software engineers started with distributed systems.

AI generated image of “distributed nodes”

A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.

— Leslie Lamport

MIT 6.824 Distributed Systems Course

MIT provide an excellent course. It’s free and you don’t have to sign up for anything. The labs are written in Go, so you will need to brush up on the language; the Go website has enough to get you up to speed.

The course covers topics such as MapReduce, fault tolerance, replication, and distributed transactions, and it will get you into the mindset required for distributed programming. You will be sorting out race conditions in your sleep, and hopefully find yourself wondering “what happens if some system crashes at exactly this line of code”, which is a good thing.

I particularly enjoyed learning about the Raft consensus algorithm. You’ll see Raft used widely, mainly because it’s considered easier to understand (and therefore troubleshoot) than Paxos. For example, recent versions of Kafka use a self-managed Raft implementation (KRaft) instead of an additional ZooKeeper dependency. I highly recommend watching the Raft lecture (parts 1 and 2), reading the Raft paper, and trying to write the algorithm from scratch yourself. Each lab comes with unit tests that give feedback on your implementation.

Note: there are newer versions of the course available. I opted for the 2020 version because the lecturer and videos are great. To use a different version of the course, change the year in the URL provided.

Microservices

If you’re interested in microservices or service-oriented architecture in general, Sam Newman provides a great book on building microservices. It covers topics such as service boundaries, information hiding, and the different types of coupling. Sam also offers guidance on when microservices are appropriate and to what extent!

Event Streaming

For event streaming, I recommend the Kafka crash course provided by Tim Berglund, followed by some hands-on experience in the Kafka 101 course. Learn more about use cases and design with the free Confluent book (you will need to provide your email address in exchange for the PDF). Alternatively, if you have an O’Reilly Learning subscription, you can access the book there.

If you found the above resources useful, or would like to suggest more, please share in the comments.

Good luck!

About the author

Alan Byrne is a software engineer and technical lead at Version 1. If you’re interested in joining us, please take a look at our careers page for open roles!
