Date formats for Apache CouchDB and Cloudant
Picking the best format when storing date & time in JSON
Apache CouchDB and IBM Cloudant are JSON document stores and as such don’t have a native date type — only the data primitives allowed in the JSON specification.
Before discussing date storage formats, we should first tackle the issue of timezones. Timezones are utterly baffling so best practice is to store dates in the UTC timezone in the database even if your data originated from many places across the globe. Storing data in the same univeral timezone in the database means all the dates and times in our date store are in the same “units”. There’s nothing stopping your front end app converting these dates into a format according to the locale of each of your users. Using UTC also neatly sidesteps the issue of Daylight Savings Time!
As to storage formats, there are three options in common use.
1. Store date as a single ISO-8601 string
There’s an international standard for storing date and time as a human and machine readable string:
It consists of:
- year, month and day separated by hyphens.
- a “T” character to separate the date and time elements.
- hour, minute and second (to microsecond precision) separated by colons.
- UTC timezone indicated by
Z, which refers to the military timezone Zulu.
which when packaged into a Design Document with a built-in reducer can be queried with:
This produces time-ordered, hierarchical, grouped aggregations of your data.
What is less obvious is that if you had chosen to store the data in different format, such as
2018-05-02 15:02:40, then the call to
2. Store date as a timestamp
Instead of storing your date and time as a string, you could opt to store an integer timestamp instead — specifically, the number of milliseconds since 1 January 1970 00:00:00 UTC.
Although this is not as human-readable as the ISO-8601 string, it is still machine-readable, sorts in date & time order and can be easily converted back into a Date object in a map function:
This format is not really suitable for storing dates before 1970 (although negative timestamps do work!) but it may be useful for simple date arithmetic as one timestamp can be subtracted from another to calculate the time difference.
3. Store date and time components in separate fields
The third option is to store each date and time component separately:
This is more verbose than the previous two solutions but has the advantage that the data is ready for querying and indexing without any pre-processing. This is particularly important when using Cloudant Query — MapReduce views are commonly used to pre-process data before it is emitted into the index but there is no such facility when using Cloudant Query. For Cloudant Query, the data has to be in the document and in the correct format.
Create an index with the
/db/_index endpoint or through the dashboard:
and query the index using the
/db/_find endpoint or again, using the dashboard:
If the data were stored in ISO-8601 format, it would not be possible to index or query a single component (e.g. the year) on its own using Cloudant Query.
Date & time gotchas
Date.setMonth()functions use number 0-11 to represent months January to December.
getFullYear(), not the function names you might expect!
- If you use string manipulation to split up a date you may be run into this conundrum in your map function
In the above code we’ve elected to store only the date as a string. We’re splitting the string by the
- character, turning the year/month/day pieces into integers and using them to create a hierarchical index. What's the problem with this approach?
Take the date
2018-08-09 (9th of August). In this case the data calculated would be:
- year — 2018
- month — 0
- day — 1
09 to indicate that you wish the date to be parsed as Octal numbers! It can be remedied with:
This indicates that your strings are in decimal.