On solving the tzdb changes problem

This is a follow-up to our previous article, About time and computers.

Aldrin Martoq Ahumada
Servicios A0 SpA
Jun 5, 2018

All around the world, local changes to time occur many times a year, sometimes unexpectedly. This is a big headache for everyone, and in particular for us, the people who write software and keep those systems running.

I think we can semi-automate what happens when a new tzdb is released, so we can do The Right Thing® whenever a timezone changes, given the following:

  1. Your system stores all dates as unix timestamps (seconds since the epoch, ignoring leap seconds). A lot of current software works this way.
  2. Your system/users choose a local representation of their dates using an official timezone, like America/Santiago. UTC offsets like -04:00 are not enough.
  3. Your system stores somewhere what version of tzdb is currently being used, like 2018e.
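As a minimal sketch of these three prerequisites, a stored record could look like the following. The `Appointment` struct and its `needs_review?` method are hypothetical names used only for illustration, and keeping the tzdb version on each record (rather than once for the whole system) is an assumption of this sketch.

```ruby
# Hypothetical record: a unix timestamp, an official timezone name, and the
# tzdb version that was installed when the record was written.
Appointment = Struct.new(:utc_timestamp, :timezone, :tzdb_version) do
  # True when the record was written under a different tzdb release than the
  # one currently installed, so its local time may deserve a second look.
  def needs_review?(installed_version)
    tzdb_version != installed_version
  end
end

# The timestamp is an arbitrary value chosen for the example.
appointment = Appointment.new(1_528_214_400, 'America/Santiago', '2018e')
puts appointment.needs_review?('2018f')
```

With this shape, an upgrade script only has to look at records whose stored version differs from the installed one.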

A typical example: a new time change is announced, and suddenly my doctor's appointment for next week is incorrect.

So, when a new version of tzdb is released (say 2018f), the sysadmin updates their tzdb packages (say apt dist-upgrade), and then runs a script that could automatically update the data affected by the change. Or the script could ask the user to do The Right Thing®, maybe with a notification like:

There have been changes in your current time zone Santiago (UTC-04), would you like to review these 3 appointments and check if they are still correct?

But today, this is not possible: when a sysadmin runs apt dist-upgrade, the old 2018e version is overwritten with a new 2018f version, and the old 2018e info is lost. There is no automated way to ask "what changes occurred between 2018e and 2018f".

Having the changes between consecutive versions like 2018e to 2018f is not enough:

  • in 2017, a new timezone America/Punta_Arenas was created because part of my country decided to stop changing clocks for daylight saving time, so we needed to apply the changes from America/Santiago 2016j to America/Punta_Arenas 2017a.
  • some old systems are not touched for years, so we need to apply the changes between America/Santiago 2013c and America/Santiago 2018e.
  • or maybe I decide to live in Punta Arenas, and when I choose the new timezone in my user profile, I would like to see which appointments I probably want to change. So we need to apply the changes from America/Santiago 2018e to America/Punta_Arenas 2018e, and probably ask the user if those appointments are still correct.

If we had all versions of tzdb, we could easily calculate any changes between (timezone_a, version_a) → (timezone_b, version_b), and do something about those changes. I didn't find a tool that does this, so I wrote one.
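To make the idea concrete, here is a minimal sketch of such a calculation. It is not necessarily how the tool does it. It assumes each (timezone, version) pair is a non-empty list of transitions sorted by time, each carrying the offset before and after the transition; the `Transition` struct, `offset_at`, and `changes` are names made up for this example, and the two one-transition lists are illustrative data, not real tzdb contents.

```ruby
# Hypothetical transition record; offsets are in seconds from UTC.
Transition = Struct.new(:utc_timestamp, :utc_offset, :utc_prevoffset)

# Offset in effect at a given instant: the offset of the latest transition
# at or before it, or the first transition's previous offset otherwise.
def offset_at(transitions, timestamp)
  last = transitions.select { |t| t.utc_timestamp <= timestamp }.last
  last ? last.utc_offset : transitions.first.utc_prevoffset
end

# Ranges [ini, fin) where the two lists disagree, with off = offset_a - offset_b.
# The open-ended ranges before the first and after the last transition are
# ignored in this sketch.
def changes(a, b)
  cuts = (a + b).map(&:utc_timestamp).uniq.sort
  ranges = []
  cuts.each_cons(2) do |ini, fin|
    off = offset_at(a, ini) - offset_at(b, ini)
    next if off.zero?

    previous = ranges.last
    if previous && previous[:fin] == ini && previous[:off] == off
      previous[:fin] = fin # extend a contiguous range with the same difference
    else
      ranges << { ini: ini, fin: fin, off: off }
    end
  end
  ranges
end

# Illustrative data: the same switch from -03 to -04, one week apart.
version_a = [Transition.new(637_124_400, -14_400, -10_800)]
version_b = [Transition.new(637_729_200, -14_400, -10_800)]
changes(version_a, version_b).each do |c|
  puts "#{c[:ini]}..#{c[:fin]} off=#{c[:off]}" # 637124400..637729200 off=-3600
end
```

Because the comparison only needs two transition lists, the same function covers a version upgrade, a move to a different timezone, or both at once.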

A web app that demos the concept is here: https://a0.github.io/a0-tzmigration/demo/. It is usable on mobile, but it works best on big screens.

There are 52 changes between America/Santiago at 2016j and America/Punta_Arenas at 2017a

Just choose the starting timezone/version, the final timezone/version, and it will display a list of time ranges where dates could be affected by that change. If you press the data button, a table with all the changes is displayed. You can download each table in CSV format.

The same changes, in a table.

It's not perfect. I'm pretty sure it has errors. But I hope to improve it with your help, so we can finally automate the mess that happens whenever a new timezone database is released.

Adding A0 TZMigration tool to your own software

There are currently two software packages to use this:

Full documentation is being added to the official webpage https://a0.github.io/a0-tzmigration/. The basic usage is shown in the following ruby code; it works almost the same in other languages:

require 'a0-tzmigration-ruby'

a = A0::TZMigration::TZVersion.new('America/Santiago', '2015a')
b = A0::TZMigration::TZVersion.new('America/Santiago', '2015c')
puts a.changes(b)
# =>
[{:ini_str=>"1910-01-01 04:42:46 UTC",
:fin_str=>"1910-01-10 04:42:46 UTC",
:off_str=>"+00:17:14",
:ini=>-1893439034,
:fin=>-1892661434,
:off=>1034},
{:ini_str=>"1918-09-01 04:42:46 UTC",
:fin_str=>"1918-09-10 04:42:46 UTC",
:off_str=>"-00:42:46",
:ini=>-1619983034,
:fin=>-1619205434,
:off=>-2566},
{:ini_str=>"1946-07-15 04:00:00 UTC",
:fin_str=>"1946-09-01 03:00:00 UTC",
:off_str=>"+01:00:00",
:ini=>-740520000,
:fin=>-718056000,
:off=>3600},
{:ini_str=>"1947-05-22 04:00:00 UTC",
:fin_str=>"1947-05-22 05:00:00 UTC",
:off_str=>"+01:00:00",
:ini=>-713649600,
:fin=>-713646000,
:off=>3600},
{:ini_str=>"1988-10-02 04:00:00 UTC",
:fin_str=>"1988-10-09 04:00:00 UTC",
:off_str=>"-01:00:00",
:ini=>591768000,
:fin=>592372800,
:off=>-3600},
{:ini_str=>"1990-03-11 03:00:00 UTC",
:fin_str=>"1990-03-18 03:00:00 UTC",
:off_str=>"-01:00:00",
:ini=>637124400,
:fin=>637729200,
:off=>-3600}]

The above means that, for example, if your system has dates between 1990-03-11 03:00:00 UTC and 1990-03-18 03:00:00 UTC, you must subtract 1 hour from them so that the local time in that timezone stays the same when you upgrade from version 2015a to 2015c.
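A minimal sketch of applying such a list to a stored timestamp follows. The `CHANGES` constant copies the last two ranges from the output above; `migrate_timestamp` is a hypothetical helper, and treating `ini` as inclusive and `fin` as exclusive is an assumption of this sketch.

```ruby
# The last two ranges from the sample output above (timestamps in seconds).
CHANGES = [
  { ini: 591_768_000, fin: 592_372_800, off: -3600 },
  { ini: 637_124_400, fin: 637_729_200, off: -3600 }
].freeze

# Shift a timestamp by the offset of the range it falls in, if any, so the
# local time stays the same after the upgrade. Other timestamps are untouched.
def migrate_timestamp(timestamp, changes)
  change = changes.find { |c| timestamp >= c[:ini] && timestamp < c[:fin] }
  change ? timestamp + change[:off] : timestamp
end

stored = Time.utc(1990, 3, 12, 15, 0, 0).to_i
migrated = migrate_timestamp(stored, CHANGES)
puts Time.at(migrated).utc # 1990-03-12 14:00:00 UTC
```

Whether a given date should be shifted at all is still a decision for your system or your users; the helper only does the arithmetic.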

The code above always fetches its data from our repository, a bunch of JSON files. You can query the versions and timezones currently available in our repository with:

A0::TZMigration::TZVersion.versions
# returns known versions and timezones =>
{"2013c"=>
{"released_at"=>"2013-04-19 16:17:40 -0700",
"timezones"=>
["Africa/Abidjan", "Africa/Accra", "Africa/Addis_Ababa", "Africa/Algiers", "Africa/Asmara", …

A0::TZMigration::TZVersion.timezones
# returns known timezones and their versions =>
{"Africa/Abidjan"=>
{"versions"=>
["2013c", "2013d", "2013e", "2013f", "2013g", …

If you find a bug or need a feature, report it on the GitHub issues page of the corresponding project. I will happily maintain these packages and add other languages; send your new language request to the a0-tzmigration GitHub issues page. I can program from COBOL to Vue.js, so you can trust me ;-)

Once you understand my proposal and think it fits your system, you will meet the really hard part: integrating it into your software. I highly recommend that you don't do this all at once. I'm pretty sure some nasty bugs will occur, caused either by us or by you, so the first changes should be done manually, with automation increased gradually. Also, there is no single way of doing this migration: your system will have dates that have to be modified, some that definitely must not be, and some that you can't be sure about. You may have to ask your users whether those changes are correct.

Time in software is really complicated. If you are new to the subject, I suggest reading our previous post: About Time and Computers. I have probably made mistakes here and there, so please post any corrections in the comments.

Zach Holman recently posted an interesting article, too: UTC is enough for everyone, …right?, with some history and examples of how hard time is.

The tzdb version repository

Both packages read a set of JSON files, which together contain every version of the tzdb. These files are generated by our ruby gem and are available here: https://a0.github.io/a0-tzmigration-ruby/. The ruby gem gathers its data from the ruby tzinfo-data gem, which currently has versions from 2013c onwards and data up to the year ≈ 2068 (50 years from now).

There are two kinds of JSON files:

Each JSON file is self-explanatory, and full documentation is being added to the official webpage https://a0.github.io/a0-tzmigration/, too. In the meantime, some caveats:

Within each timezone, the most important things are the transitions. Each transition has:

  • utc_timestamp: the UNIX timestamp at which the transition occurs, that is, seconds since the epoch ignoring leap seconds.
  • utc_offset: the offset from UTC in effect after the transition, in seconds. UTC-04 is -14400.
  • utc_prevoffset: the offset from UTC that was in effect in this timezone before the transition, in seconds. UTC-03 is -10800.

We use the above to make calculations. The following data is for informational purposes only, and isn't used in any calculation:

  • local_ini_str and local_fin_str: they represent how the change was seen locally. For example, the government said "on October 14, 2007 at 00:00:00 hours, add 1 hour to your clocks", so local_ini_str is 2007-10-14 00:00:00 CLT and local_fin_str is 2007-10-14 01:00:00 CLST.
    Please note that CLT and CLST are no longer used in tzdb; in version 2018e that same change is displayed as 2007-10-14 00:00:00 -04 and 2007-10-14 01:00:00 -03.
  • utc_time: the same instant as utc_timestamp, as an ISO 8601 string.
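As a worked example of how the calculation fields relate to the informational ones, the local strings of the 2007-10-14 change can be derived from the three numeric fields alone. This is a sketch under assumptions: the `Transition` struct and `local_string` helper are hypothetical, the timestamp is derived from the example above (00:00 local at UTC-04 is 04:00 UTC), and the output format (offset written as -04:00) is my own choice, not the format used in the JSON files.

```ruby
# Hypothetical transition record; offsets are in seconds from UTC.
Transition = Struct.new(:utc_timestamp, :utc_offset, :utc_prevoffset)

# Render an instant as it would be read on a clock running at the given offset.
def local_string(utc_timestamp, offset)
  Time.at(utc_timestamp).getlocal(offset).strftime('%Y-%m-%d %H:%M:%S %:z')
end

# The change from UTC-04 to UTC-03 at local midnight on 2007-10-14.
transition = Transition.new(Time.utc(2007, 10, 14, 4, 0, 0).to_i, -10_800, -14_400)

local_ini = local_string(transition.utc_timestamp, transition.utc_prevoffset)
local_fin = local_string(transition.utc_timestamp, transition.utc_offset)
puts local_ini # 2007-10-14 00:00:00 -04:00
puts local_fin # 2007-10-14 01:00:00 -03:00
```

Both strings describe the same instant; only the offset used to read the clock differs, which is why the numeric fields are enough for calculations.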

Some timezones are aliases of (links to) another one: Chile/Continental is actually America/Santiago. Never assume that a link is permanent. Africa/Brazzaville used to have its own transitions, but was converted to an alias of Africa/Lagos since version 2014g. Similarly, Africa/Asmera has switched between Africa/Asmara and Africa/Nairobi.

Some final words

Time is really complicated and it's easy to get it wrong. I feel that most of our current software is not really prepared to handle it correctly, and lets the users suffer the consequences. But I think we can do better.

The first thing is to increase awareness that this is not an easy problem to tame. I recommend again that you read our post About Time and Computers, talk about time with your team, learn more about time formats and the libraries you are using, and figure out what assumptions they make.

The second is to know that there is not a single approach that fits all systems. For example, some people may prefer to just store the dates in local time, and forget all of this mess. I personally think that option is not really valid anymore: we are in a global world, and we, our users and our software must be prepared for that. Even in my own country, a new timezone America/Punta_Arenas was created and obviously we have to collaborate with people from other parts of the country.

But if keeping local time is enough for you, that is great!

Other people have told me that you should attach some kind of location to your dates (or users), for example that I live in Valparaíso, so that whenever a timezone changes (or a new one is created) you can automatically update the user's dates. Whether that suits you depends on your particular system, but again, in a global world, assuming I stay put in one city doesn't sound too appealing to me. What if I have a remote meeting with people in Chile, England, and Argentina, and one of those countries changes its timezone?

That's why I think we should not try to hide timezones from users, so they are aware of the issues, too. For example, I saw this interface for configuring a Google Analytics id:

A nice and simple timezone chooser interface

So everything is a big "it depends": on what your software does and on what your users need. Again, if you solve your issues and your users don’t have to deal with this mess, that is great, too! And please share how you are doing it, so others can benefit from that.

And the third and last thing is that we could and should do better, much much better. There are more and more systems running software every day, so if we automate these things, we reduce the manual labor needed to keep them in sync with real life, rather than simply ignoring the changes.

These projects are a start, and I really hope that in the future these JSON data files, or something like them, could be added to the official tzdb releases. But I hope to go even further, maybe simplifying the current update process of most software, so there is no need to restart our systems when a new tzdb is released. Or we could create an authoritative server like NTP or DNS, but for timezones. Only Time will tell.
