Grouping By Time in SQL
Here’s a trick for making SQL a little better with time-series data.
SQL isn’t known for being great at telemetry (time-series) data. The tendency for it to be write intensive (little and often), key-duplicate heavy (separate sensor-readings reported against the same time), and requiring of continuum aggregations for queries; all seemingly conspire to make make SQL and time-series data unwilling bedfellows.
I’d heard of some of the tricks (like using staging tables to manage the access patterns); however today I learned about the use of ‘bin time’ to help ease the troubles that down-sampling and joining pose to relational storage.
The ‘bin time’ trick is to augment your telemetry data with a column that specifies which sample-bin the row belongs to. For example, if you intend to sample every five minutes, then your bin identifier will round down the measurement time to the preceding five minutes interval from epoch.
Using the bin time column allows you to easily use the
GROUP BYaggregations to downsample your data. It also gives you a common column in which toJOINmultiple, temporally desperate, telemetry streams on.
This means that, should you have a mixture of fast and slow telemetry, you can even maintain a table with slow moving state that can injected back into the fast telemetry stream.
The binning period itself is, however, up to you; and should balance the minimum sampling time with the overhead created by the count of samples likely to be recorded within it. Although don’t worry too much- you needn’t only have one bin time column, so you can still maintain a lot of flexibility.
Of course, whether or not you should twist all your database concerns into SQL shaped ones is an argument for later- but the insight shows that just because your problem domain is a time based one, doesn’t mean it can’t be a SQL one too!


