How to start your career as a Database Developer

This article maps out a path from the starting line to being good enough to land a job as a Database Developer. It assumes the reader has already studied at least one programming language or has some computer science (CS) background.

Sanjin Halilović: Dribble & Behance

Imagine the following scenario: a baron in the medieval era asks one of his engineers to construct living quarters. The engineer is given only some basic tools (an axe, a hammer, a saw, a hoe) and some raw materials. From here the engineer has many options, ranging from simple huts made of very basic materials to a full house on a proper foundation that can shelter multiple people.

Source: Pixo Sprout’s Blog

Databases are much like this scenario: the user is given mostly basic tools, similar to C++ in a sense (although not on the same level). A database is basically a big puzzle wrapped in a deceptively simple interface called SQL. Databases offer a lot of interesting challenges to anyone willing to dig more than skin deep.

Where to start

As with most things in this world, one must first know the rules before breaking them (and there’s quite a lot of breaking to be done). This means starting all the way from the beginning.

  1. Relational Algebra
  2. SQL (DML, DDL, DCL)
  3. Subqueries and CTEs
  4. Indexes
  5. Triggers
  6. Entity Relationship Model (ER Model)
  7. Normal Forms (NF)
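
To make topics 2 through 5 a little more concrete, here is a minimal sketch in Python using the built-in sqlite3 module. SQLite is used only because it needs no setup; the employee and salary_audit tables, the index name and the trigger name are all made up for illustration. SQLite has no GRANT or REVOKE, so DCL is not shown.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: tables, an index, and a trigger (all names are hypothetical).
cur.executescript("""
CREATE TABLE employee (
    id     INTEGER PRIMARY KEY,
    name   TEXT    NOT NULL,
    dept   TEXT    NOT NULL,
    salary INTEGER NOT NULL
);
CREATE TABLE salary_audit (
    employee_id INTEGER NOT NULL,
    old_salary  INTEGER NOT NULL,
    new_salary  INTEGER NOT NULL
);
CREATE INDEX idx_employee_dept ON employee (dept);
CREATE TRIGGER trg_salary_audit
AFTER UPDATE OF salary ON employee
BEGIN
    INSERT INTO salary_audit VALUES (OLD.id, OLD.salary, NEW.salary);
END;
""")

# DML: insert three rows, then update one (which fires the trigger).
cur.executemany(
    "INSERT INTO employee (name, dept, salary) VALUES (?, ?, ?)",
    [("Ana", "eng", 5000), ("Ben", "eng", 4000), ("Cy", "ops", 3000)],
)
cur.execute("UPDATE employee SET salary = 4500 WHERE name = 'Ben'")

# CTE: employees earning more than their department's average.
above_avg = cur.execute("""
WITH dept_avg AS (
    SELECT dept, AVG(salary) AS avg_salary
    FROM employee
    GROUP BY dept
)
SELECT e.name
FROM employee AS e
JOIN dept_avg AS d ON d.dept = e.dept
WHERE e.salary > d.avg_salary
ORDER BY e.name
""").fetchall()

# Rows written by the trigger.
audit_rows = cur.execute(
    "SELECT employee_id, old_salary, new_salary FROM salary_audit"
).fetchall()

print(above_avg)
print(audit_rows)
```

Other systems use different syntax for triggers and offer more index types, so treat this as a starting point for experiments rather than a reference.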

A great book that covers all of these topics is Database System Concepts (at the time of writing, the 7th edition is the latest). The relevant chapters are:

  • Chapter 1 — Some history (bonus reading)
  • Chapter 2 — Relational algebra, basics of database schemas
  • Chapter 3 — Introduction to SQL
  • Chapters 4 and 5 — SQL
  • Chapters 6 and 7 — Database design

The chapters follow the same order as the topics listed above. They also cover more than just those topics, and the extra material is worth learning as well.

This book is only the start; researching each topic further on its own is also recommended after reading the chapters mentioned above.

There is one more habit that helps you improve daily without taking more than 30 minutes: browsing dba.stackexchange. First, mark a database system tag (e.g. PostgreSQL); all questions with this tag will be highlighted, making it easier to skim through them. The method is quite simple. An unfamiliar topic? Great! Research it and learn it. Know the answer to the question? Even better! Answer it and become an active member of the community. Doing this once in the morning and once before bed takes little time, yet the knowledge accumulates quickly; it's about consistency. Do expect, however, that some topics will take more than 30 minutes of research.

Sample DBs to help on this adventure:

There are also many SQL exercises out there, like this one and this one.

This stage should last ~1 month (or more, depending on the number of hours spent and how deep the research goes).

What’s next? Books!

Source: Pixo Sprout’s Blog

This is where the digging goes more than skin deep. Start with normal forms: every DB specialist should know C.J. Date's view of the first normal form (1NF), namely that the requirement for atomic values is ambiguous. Here is my answer to a dba.stackexchange question on this specific topic. The Third Manifesto, written by C.J. Date and Hugh Darwen, offers plenty of reading material for those interested in the theory behind these topics.
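
As a rough illustration of why "atomic" is hard to pin down, here is a small sketch in Python with the built-in sqlite3 module. It shows the common reading of the rule, not Date's own formulation, and the person_flat and person_phone tables and their data are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Common "violation": several phone numbers packed into one value.
CREATE TABLE person_flat (name TEXT PRIMARY KEY, phones TEXT);
INSERT INTO person_flat VALUES ('Ana', '555-0100,555-0101');

-- Usual fix: one phone number per row.
CREATE TABLE person_phone (
    name  TEXT,
    phone TEXT,
    PRIMARY KEY (name, phone)
);
INSERT INTO person_phone VALUES ('Ana', '555-0100'), ('Ana', '555-0101');
""")

# Equality search fails on the packed column but works on the split table.
flat_hits = conn.execute(
    "SELECT name FROM person_flat WHERE phones = '555-0101'"
).fetchall()
split_hits = conn.execute(
    "SELECT name FROM person_phone WHERE phone = '555-0101'"
).fetchall()

# Yet a "single" phone number can still be decomposed further,
# so whether it counts as atomic depends on how it is used.
prefixes = conn.execute(
    "SELECT substr(phone, 1, 3) FROM person_phone ORDER BY phone"
).fetchall()

print(flat_hits, split_hits, prefixes)
```

The split table is easier to query, but the last query shows that even its values have internal structure, which is the kind of ambiguity the atomic-value rule runs into.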

Finishing the previously recommended book (Database System Concepts) is also a good idea.

From this point on, however, it is important to learn more advanced topics and approaches to designing and programming at the database layer. There are basically two books here, one for Microsoft SQL Server and one for PostgreSQL. Although they are technology-specific, the approaches and concepts they explain apply to other systems as well, albeit with some differences and nuances.

Understanding the internal implementation of a DBMS becomes the name of the game. Unfortunately, this is also where material on the topic becomes hard to find. Microsoft SQL Server does have a book, however: SQL Server 2012 Internals. At first glance “2012” might seem old, but the majority of it hasn’t changed, since it covers the internal implementation of the system and not the SQL standard (fun fact: at the time of writing, the current SQL standard is SQL:2016, so even that doesn’t change often).

As for PostgreSQL, the only real documentation of its architecture is available online for free at interdb, written by Hironobu Suzuki.

MySQL has official documentation available online; unfortunately, not all sections have been documented.

During the learning process, it is also a good idea to create your own GitHub account and build some small projects. Use this to learn about Git (thanks to my colleague Haris Mašović for providing me with this amazing resource so I could share it).

Here are also some famous people to follow:

This stage will most likely last multiple months (even years). It is entirely possible to land a job before finishing all of the content above, as it takes quite a while to truly understand and master these concepts.

Further readings

Want more? Consider diving into the internal implementations of indexes, from data structures (e.g. B-trees) to how each DBMS handles pages on disk. To start this adventure, consider the following default page sizes:

  • Oracle — 8 KB pages (the default block size, configurable)
  • SQL Server — 8 KB pages
  • PostgreSQL — 8 KB pages
  • MySQL — 16 KB pages

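Page size does influence the behavior of indexes in a system: a larger page holds more keys per index node, so the tree needs fewer levels to cover the same number of rows.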
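
A back-of-the-envelope sketch in Python shows the effect. The model is deliberately simplified and its numbers are assumptions: every index entry is taken to be 16 bytes (key plus pointer), pages are completely full, and page headers are ignored. The function name btree_levels is made up for this example.

```python
def btree_levels(row_count, page_size_bytes, entry_size_bytes=16):
    """Estimate how many B-tree levels are needed to index row_count keys.

    Simplified model: fixed-size entries, full pages, no page headers.
    """
    fanout = page_size_bytes // entry_size_bytes  # entries per page
    levels = 1
    capacity = fanout  # keys reachable with a single level
    while capacity < row_count:
        capacity *= fanout  # each extra level multiplies reach by the fanout
        levels += 1
    return levels


rows = 1_000_000_000
levels_8kb = btree_levels(rows, 8 * 1024)
levels_16kb = btree_levels(rows, 16 * 1024)
print(levels_8kb, levels_16kb)  # prints "4 3"
```

Under these assumptions a 16 KB page saves one level, which means one fewer page read per lookup, for a billion rows. Real systems differ because of page headers, fill factors and variable-length keys.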

If this is still not enough, consider going even deeper into how a DBMS is written (usually in C or C++). CMU (Carnegie Mellon University) has publicly available courses on database systems, from introductory to advanced.

Last note

Never be afraid to apply for a job when you are missing a few of the requirements; it is only natural not to know everything at the start. This article is simply the tip of the iceberg; there is a lot to be learned out there.

Best of luck on the Database Adventure!

