Can you explain a little more on how you change the algorithm to remove the need for data movement…
Mark Grover

Sure. It’s fairly detailed, but I’ll attempt to do it justice.

All of our keys include a 63-bit ID that starts with a 40-bit timestamp. By default, our shard index algorithm looks a little bit like this:

Hash = MurmurHash3(key)
Shard = Hash % 20

Through dynamic configuration updates (another long story) we can change that algorithm to something more like:

Hash = MurmurHash3(key)
if Timestamp(key) >= "2016-05-01" then
    Shard = 20 + (Hash % 20)
else
    Shard = Hash % 20
end

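That change has no immediate effect, but from 00:00 UTC on the first of May, any newly generated IDs will be written to shards 20 + Hash % 20 (20–39), while the old IDs will keep mapping to Hash % 20 (0–19). Because an ID’s timestamp never changes, every existing key still resolves to the shard it was originally written to, so nothing has to be moved.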
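To make the routing rule concrete, here is a minimal Python sketch. It is an illustration under several assumptions, none of which come from the description above: the timestamp is taken to be seconds since the Unix epoch, the remaining 23 bits of the ID are treated as an opaque counter, the key is simply the ID itself, and `zlib.crc32` stands in for MurmurHash3 so the example needs only the standard library. The names `make_id`, `timestamp_of`, `stable_hash`, and `shard_for` are hypothetical.

```python
import zlib
from datetime import datetime, timezone

ID_BITS = 63                          # from the post: IDs are 63 bits wide
TIMESTAMP_BITS = 40                   # from the post: the ID starts with a 40-bit timestamp
LOW_BITS = ID_BITS - TIMESTAMP_BITS   # the remaining 23 bits (contents assumed)

# Assumption: timestamps are seconds since the Unix epoch.
# The cutover instant is 2016-05-01 00:00 UTC and is treated as inclusive.
CUTOVER = int(datetime(2016, 5, 1, tzinfo=timezone.utc).timestamp())


def make_id(ts_seconds: int, low: int) -> int:
    """Build a hypothetical 63-bit ID: 40-bit timestamp followed by 23 low bits."""
    if not 0 <= ts_seconds < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 40 bits")
    if not 0 <= low < (1 << LOW_BITS):
        raise ValueError("low part does not fit in 23 bits")
    return (ts_seconds << LOW_BITS) | low


def timestamp_of(id_: int) -> int:
    """Recover the timestamp from the top 40 bits of the ID."""
    return id_ >> LOW_BITS


def stable_hash(id_: int) -> int:
    """Stand-in for MurmurHash3: a CRC-32 of the ID's 8-byte big-endian form."""
    return zlib.crc32(id_.to_bytes(8, "big"))


def shard_for(id_: int) -> int:
    """Route IDs created at or after the cutover to shards 20-39, older ones to 0-19."""
    h = stable_hash(id_)
    if timestamp_of(id_) >= CUTOVER:
        return 20 + (h % 20)
    return h % 20
```

For example, `shard_for(make_id(CUTOVER - 1, 7))` always lands in 0–19 and `shard_for(make_id(CUTOVER, 7))` always lands in 20–39. The shard is a pure function of the ID, so changing the rule for future timestamps cannot change where an existing ID resolves.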