If you like the series, check out my upcoming book on Database Internals!

This is a series of articles introducing Distributed Systems concepts used in databases. The first article of the cycle was about Links, Two Generals and the Impossibility Problem. If you’re interested in the subject, you can also check out the series about Disk IO.

The main goal here is to build up knowledge that will help you understand how databases work and what decisions database implementers make, so that you can better operate databases or get up to speed and start working on one.

Today we’ll continue building up…

Distributed Systems

A distributed system can be thought…

The series consists of 5 pieces:

A new series on Distributed System concepts in Databases can be found here.

In the first and second parts, we’ve discussed underlying Operating System mechanisms that help to perform writes on disk. In the third part, we’ve started talking about an immutable on-disk data structure, LSM Trees. …

This week’s reading was a paper by the awesome HASLab research group. The paper addresses the problem of anti-entropy overhead generated by Merkle Trees and introduces a framework for per-object causal consistency. The paper itself can be found here, another paper describing the clock implementations mentioned in the paper in more detail can be found here, the database implementation is published here and the clocks library can be found here.

This post is not meant as a replacement for reading the papers, but it might help you quickly glance through the concepts before diving into them. …

In previous posts, we’ve discussed different flavours of IO and an immutable on-disk data structure called the LSM Tree. …

In the first and second parts, we’ve discussed underlying Operating System mechanisms that help to perform writes on disk. Now, it’s time to start moving towards higher level concepts.

Today we’re going to explore one of the types of storage often used in databases. Each type has its own advantages and disadvantages, so building a database system is always about trade-offs, and we’ll try to address some of those as well.

Mutable vs Immutable data structures

One of the…

Memory Mapping

Memory mapping (mmap) allows you to access a file as if it were loaded in memory entirely. It simplifies file access and is frequently used by database and application developers.

Memory mapping maps the process’s virtual pages directly to the Kernel Page Cache, avoiding the additional copy to and from a user-space buffer that is made with Standard IO.
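
To make this a bit more concrete, here is a minimal sketch using Python's standard `mmap` module. The helper names (`read_via_mmap`, `write_via_mmap`, `demo`) and the file name `data.bin` are made up for illustration and are not from the original article. The mapped file is read and modified with ordinary slicing, without explicit read or write calls on the file object.

```python
import mmap
import os
import tempfile


def read_via_mmap(path, offset, length):
    """Read `length` bytes at `offset` through a read-only mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file (the file must not be empty).
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[offset:offset + length]


def write_via_mmap(path, offset, data):
    """Overwrite bytes in place through a writable mapping."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as m:
            # Slice assignment must keep the size of the mapping unchanged.
            m[offset:offset + len(data)] = data
            # Ask the OS to write the modified pages back to the file.
            m.flush()


def demo():
    """Hypothetical usage: create a small file, read and patch it via mmap."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        with open(path, "wb") as f:
            f.write(b"hello world")
        before = read_via_mmap(path, 6, 5)
        write_via_mmap(path, 0, b"HELLO")
        with open(path, "rb") as f:
            after = f.read()
        return before, after
```

Note that a mapping cannot grow the file: writes through the mapping only change bytes that already exist, so the file has to be sized up front.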

With mmap a file can be mapped to a memory segment privately or…

Knowing how IO works and understanding the use-cases and trade-offs of algorithms and storage systems can make the lives of developers and operators much better: they will be able to make better choices upfront (based on what’s under the hood of the database they’re evaluating), troubleshoot…

The three books I’ll be reviewing today look at the ways statistics and math can be badly misused. In an age when data is “the new oil” and the Internet is full of unchecked facts, it is very important to know your brain’s blind spots and be equipped with the machinery to analyse the presented information and tell the truth from trickery.

Weapons of Math Destruction by Cathy O’Neil

Weapons of Math Destruction, just as the witty name suggests, describes the mathematical machinery used in the new world of “big data” by signifying the human biases, coming up with the new ones (some we would never…

These are the notes I took while reading The Three-Body Problem by Cixin Liu. If you’d like to read an article on note taking, you can check out Future-Proof Reading.

Ideas

The book discusses the problem of unnecessary destruction caused by the progress of society and the moral side of it. Mike Evans’s father is one example of a greed-driven individual who believes that, for the sake of progress, any sacrifice is tolerable. Another example of primal, primitive greed is the people who destroy the forests that Mike planted. …
