Druid vs OpenTSDB
Disclaimers: (a) I haven’t actually benchmarked these systems; (b) I’m not an expert on either, so there could be inaccuracies.
I’ve read that Druid outperforms OpenTSDB on queries, especially as the number of dimensions grows. Why is this so?
TL;DR: HBase is optimized for low-latency writes, point reads and small scans, not for aggregating billions of rows. Druid is optimized for filtering and aggregation.
OpenTSDB stores data points sorted by row key: metric name, base timestamp, then tag key/value pairs. To filter and aggregate, it has to scan every row key in the time interval, parse the tags to see whether they match the filter criteria, and perform the aggregation on the column data. If we’re lucky, the data is already resident in the block cache. Some filtering can be pushed down to the RegionServer, but every row passing the filters must still be sent to the query client for aggregation (I don’t think coprocessors are flexible enough to be applied per query). Also, because HBase has a log-structured merge-tree (LSM) architecture, queries for newer data may be slower, since the RegionServer may have to scan and merge data from the MemStore and multiple HFiles.
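To make that access pattern concrete, here is a minimal Java sketch of what such a query boils down to at the HBase client level. The key layout, the metric/tag UIDs, and the tagsMatch helper are all hypothetical simplifications (real OpenTSDB uses UID-encoded, optionally salted keys); the point is only that the time range becomes a key-range scan, and tag matching plus aggregation happen on the client.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class TsdbStyleScan {

  // Simplified stand-in: real OpenTSDB parses UID-encoded tag pairs out
  // of the row key; here we just compare a fixed suffix.
  static boolean tagsMatch(byte[] rowKey, byte[] wantedTagSuffix) {
    if (rowKey.length < wantedTagSuffix.length) return false;
    int off = rowKey.length - wantedTagSuffix.length;
    for (int i = 0; i < wantedTagSuffix.length; i++) {
      if (rowKey[off + i] != wantedTagSuffix[i]) return false;
    }
    return true;
  }

  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("tsdb"))) {

      // Row keys sort as metric|baseTime|tags, so a time-range query is
      // a key-range scan bounded by [metric|start, metric|stop).
      byte[] metricUid = new byte[] {0, 0, 1};               // hypothetical metric UID
      byte[] startKey = Bytes.add(metricUid, Bytes.toBytes(1700000000));
      byte[] stopKey  = Bytes.add(metricUid, Bytes.toBytes(1700003600));
      byte[] wantedTags = new byte[] {0, 0, 2, 0, 0, 7};     // hypothetical tag UID pair

      Scan scan = new Scan().withStartRow(startKey).withStopRow(stopKey);
      double sum = 0;
      long n = 0;
      try (ResultScanner scanner = table.getScanner(scan)) {
        // Every row key in the interval is read; tag filtering and the
        // aggregation itself happen here, on the query client.
        for (Result row : scanner) {
          if (!tagsMatch(row.getRow(), wantedTags)) continue;
          for (byte[] value : row.getFamilyMap(Bytes.toBytes("t")).values()) {
            sum += Bytes.toDouble(value);   // assumes 8-byte float values
            n++;
          }
        }
      }
      System.out.printf("avg = %f over %d points%n", sum / n, n);
    }
  }
}
```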
Druid stores metrics in a column format, applying compression such as LZF, and ideally keeps all of the data in memory. It also keeps bitmap indexes for dimensions, which it can use to efficiently pick out only the rows it needs to read. Filtering is fast because bitwise boolean logic can be applied to an entire bitmap at once. Aggregation is fast because Druid reads only the data it needs and divides the aggregation work among all the nodes serving the query; only the partial aggregates have to be combined at the broker node.
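A toy illustration of why bitmap filtering is cheap, using java.util.BitSet as a stand-in for the compressed bitmaps Druid actually uses (the dimension and metric names are made up): one bitmap per dimension value, and a multi-dimension filter is just a bitwise AND over whole machine words, with no per-row tag parsing.

```java
import java.util.BitSet;

public class BitmapFilterSketch {
  public static void main(String[] args) {
    int rows = 8;
    double[] latency = {12, 7, 31, 5, 22, 9, 14, 3};   // a metric column

    // One bitmap per dimension value: bit i is set iff row i has that value.
    BitSet dcUsEast = new BitSet(rows);                 // datacenter = "us-east"
    dcUsEast.set(0); dcUsEast.set(2); dcUsEast.set(4); dcUsEast.set(6);
    BitSet svcApi = new BitSet(rows);                   // service = "api"
    svcApi.set(0); svcApi.set(1); svcApi.set(4); svcApi.set(5);

    // The filter "datacenter = us-east AND service = api" is a single
    // AND over the two bitmaps, applied to the whole vector at once.
    BitSet match = (BitSet) dcUsEast.clone();
    match.and(svcApi);

    // Aggregate only the matching rows of the metric column.
    double sum = match.stream().mapToDouble(i -> latency[i]).sum();
    System.out.println("sum(latency) where dc=us-east and service=api: " + sum);
  }
}
```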
Because HBase is strongly consistent and optimized for writes, any given piece of data is served by a single RegionServer, so you cannot scale query capacity for a shard beyond dedicating an entire RegionServer to it. Druid, on the other hand, treats its shards (called segments) as immutable and can replicate them to as many historical nodes as needed to achieve the desired query concurrency and throughput.
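Here is a hedged sketch of why immutable segments make reads scalable: the broker can route each segment's work to any replica, collect partial aggregates, and merge them. The cluster layout, record types, and replica-selection rule below are invented for illustration, not Druid's actual internals.

```java
import java.util.List;

public class SegmentScatterGather {
  // Hypothetical in-memory stand-ins for immutable segments and the
  // historical nodes that serve them.
  record Segment(String id, double[] values) {}
  record Historical(String host, List<Segment> segments) {}

  public static void main(String[] args) {
    Segment s1 = new Segment("2024-01-01/s1", new double[] {1, 2, 3});
    Segment s2 = new Segment("2024-01-01/s2", new double[] {4, 5});

    // The same immutable segment can live on several historicals, so
    // the broker may send each segment's work to whichever replica has
    // capacity. With a single writer per shard, there is no such choice.
    List<Historical> cluster = List.of(
        new Historical("hist-a", List.of(s1, s2)),
        new Historical("hist-b", List.of(s1)),      // replica of s1
        new Historical("hist-c", List.of(s2)));     // replica of s2

    double total = 0;
    for (Segment seg : List.of(s1, s2)) {
      // Pick any replica (a real broker would balance load here).
      Historical replica = cluster.stream()
          .filter(h -> h.segments().contains(seg))
          .findAny().orElseThrow();
      double partial = 0;
      for (double v : seg.values()) partial += v;   // computed on that node
      System.out.printf("%s -> partial %.1f from %s%n",
          seg.id(), partial, replica.host());
      total += partial;                             // merged at the broker
    }
    System.out.println("merged at broker: " + total);
  }
}
```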
In Druid, metrics are grouped into datasources, so a query that aggregates multiple metrics can share the filtering work across all of them. OpenTSDB would need two separate queries, each doing its own scanning and filtering.
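A small sketch of that shared-work point (dimension and metric names are made up): with two metric columns in one datasource, a single filtered pass feeds both aggregations, whereas OpenTSDB's one-metric-per-row-key layout forces two scans that each repeat the row-key walk and tag matching shown earlier.

```java
public class SharedFilterSketch {
  public static void main(String[] args) {
    // One datasource: a dimension column plus two metric columns over
    // the same rows.
    String[] dc       = {"us-east", "eu-west", "us-east", "us-east"};
    double[] requests = {100, 80, 120, 90};
    double[] errors   = {3, 1, 7, 2};

    // Single pass: evaluate the filter once per row, update both
    // aggregates from the row index it yields.
    double reqSum = 0, errSum = 0;
    for (int i = 0; i < dc.length; i++) {
      if (!dc[i].equals("us-east")) continue;   // shared filter
      reqSum += requests[i];
      errSum += errors[i];
    }
    System.out.printf("requests=%.0f errors=%.0f%n", reqSum, errSum);
  }
}
```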