Updating a social network users stream — timings

Fast is never fast enough

Mark Papadakis
Distributed Systems
Sep 5, 2013


I have been testing CloudDS’s multiple-keys-per-RowMutation feature, and I thought I’d time how long it would take a fictional social network service to append a new event id, generated by a user, to all of that user’s followers’ streams (fanout).

This is how Twitter implements tweet streams. I figured it could be useful down the road for one thing or another.

In this experiment, there are 5 nodes involved, across 2 different datacenters. The replication strategy is set to network-aware (copies are distributed across DCs to maximize availability), and the replication factor is set to 3 (for every object we keep two additional copies around).
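A minimal sketch of what network-aware placement means here, assuming a simple round-robin across datacenters (node and DC names are made up; CloudDS’s actual placement algorithm is not described in this post):

```python
# Hypothetical sketch of "network aware" replica placement: for each object,
# pick RF replicas, spreading copies across datacenters first so that losing
# one DC never loses all copies. Cluster layout below is an example only.
from itertools import cycle

def place_replicas(nodes_by_dc, rf=3):
    """nodes_by_dc: dict mapping DC name -> list of node names.
    Returns rf replica nodes, alternating across DCs."""
    rf = min(rf, sum(len(nodes) for nodes in nodes_by_dc.values()))
    replicas = []
    dc_iters = {dc: iter(nodes) for dc, nodes in nodes_by_dc.items()}
    for dc in cycle(nodes_by_dc):          # dc1, dc2, dc1, dc2, ...
        if len(replicas) == rf:
            break
        node = next(dc_iters[dc], None)    # next unused node in this DC
        if node is not None:
            replicas.append(node)
    return replicas

cluster = {"dc1": ["n1", "n2", "n3"], "dc2": ["n4", "n5"]}
print(place_replicas(cluster))  # ['n1', 'n4', 'n2'] — both DCs hold a copy
```

With RF=3 and two DCs, one DC necessarily holds two of the three copies, but every DC holds at least one, which is what keeps the data available if a whole DC goes down.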

For this update request the consistency level is set to QUORUM, which means CloudDS needs to receive acknowledgements from at least floor(3/2) + 1 = 2 nodes for each key update before responding to the client.
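The quorum arithmetic is just a majority of the replicas:

```python
# QUORUM for replication factor rf is floor(rf/2) + 1: the smallest number
# of acknowledgements that still forms a majority of the replicas.
def quorum(rf: int) -> int:
    return rf // 2 + 1

print(quorum(3))  # 2 — matches the floor(3/2) + 1 = 2 above
print(quorum(5))  # 3
```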

This is the code:
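The embedded snippet hasn’t survived here, but the idea boils down to one batched multi-key mutation: append the new event id to every follower’s stream row in a single request. A hypothetical Python sketch — CloudDS’s real client API isn’t shown in this post, so every name below (`RowMutation`, the key and column formats) is an assumption:

```python
# Hypothetical fanout sketch: append event_id to each follower's stream row.
# A RowMutation batching many keys per request is the CloudDS feature being
# tested; this stand-in just collects (row_key, column, value) writes.
class RowMutation:
    def __init__(self, column_family):
        self.column_family = column_family
        self.writes = []  # (row_key, column_name, value)

    def append(self, row_key, column_name, value):
        self.writes.append((row_key, column_name, value))

def fanout(followers, event_id):
    m = RowMutation("stream")
    for follower_id in followers:
        # One column per event, in the follower's stream row.
        m.append(f"user:{follower_id}", f"event:{event_id}", b"")
    return m  # in the real system this batch is sent at QUORUM

mutation = fanout(range(100_000), 42)
print(len(mutation.writes))  # 100000
```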

And this is how long it takes for a user with 100,000 followers:

0.253 seconds

UPDATE:

0.246 seconds

A few of those ideas did work, resulting in a close to 3% boost. Not much, but it proved that there is more to gain there. When the digests cache is disabled for the ‘tweets’ column family, the time drops to 0.224 seconds.

UPDATE:
This has now been reduced to 0.214 seconds (a 13% improvement). If the digests cache is disabled, it should drop to 0.2 seconds. 0.2 seconds for 100,000 updates is close to what we were aiming for.

UPDATE:
Dropped to 0.194 seconds. With the digests cache disabled, this drops below 0.15 seconds. Goal accomplished.

The nodes are configured to use up to 256MB of RAM for the memory tables (when the sum of all active memory tables’ sizes exceeds that threshold, CloudDS will flush one or more of them to SSTables to respect the memory thresholds), and the underlying storage is HDD (not SSD), using XFS.
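The flush policy described above can be sketched as follows. This is a simplification under assumed rules (flush the largest memtables first until back under the limit); the post doesn’t specify CloudDS’s actual selection heuristic:

```python
# Sketch of the memory-table flush policy: when the sum of all active
# memtable sizes exceeds a threshold (256 MB here), flush memtables to
# SSTables — largest first — until the total is back under the limit.
THRESHOLD = 256 * 1024 * 1024

def memtables_to_flush(memtables):
    """memtables: dict of column family -> current memtable size in bytes.
    Returns the column families to flush, largest first."""
    total = sum(memtables.values())
    to_flush = []
    for cf in sorted(memtables, key=memtables.get, reverse=True):
        if total <= THRESHOLD:
            break
        to_flush.append(cf)
        total -= memtables[cf]
    return to_flush

sizes = {"stream": 200 * 2**20, "tweets": 80 * 2**20, "users": 10 * 2**20}
print(memtables_to_flush(sizes))  # ['stream'] — 290 MB total drops to 90 MB
```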

The updates are committed to disk (fdatasync() right after the disk operations), so they persist fully. CloudDS also supports in-memory-only storage, which would probably drive timings even lower if it were configured for the ‘stream’ ColumnFamily.
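The durability step is the standard write-then-fdatasync sequence. A minimal sketch (the file layout is an example; the real commit-log format is CloudDS internals):

```python
# Sketch of the durability step: append the record, then fdatasync() so the
# data reaches the storage device before the client is acknowledged.
# fdatasync flushes file data but may skip metadata-only updates, which is
# why it can be cheaper than a full fsync. (Linux; macOS lacks fdatasync.)
import os
import tempfile

def durable_append(path, payload: bytes):
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, payload)
        os.fdatasync(fd)  # block until the data is on stable storage
    finally:
        os.close(fd)

with tempfile.TemporaryDirectory() as d:
    log = os.path.join(d, "commitlog")
    durable_append(log, b"event:42\n")
    print(open(log, "rb").read())  # b'event:42\n'
```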

I am fairly confident this can be reduced even further by raising the memory thresholds, adjusting other default configuration tunables, and/or switching to SSDs for the underlying storage medium, but I am pleased with the results anyway. We have a few ideas that should reduce that time by at least 20% if they work in practice (work in progress). If they do, this post will be updated to reflect that.

One should never be satisfied with ‘fast enough’, especially when faster means everything else that depends on it will go faster too, and resources are put to better use.


Seeking Knowledge 24x7. Head of R&D at Phaistos Networks | Simple is Beautiful — https://markpapadakis.com/