Effective Unit Tests for Time Travel

Aubrey Goodman
Published in Motiv Engineering
7 min read · Mar 23, 2018

At Motiv, we strive to do things others have deemed impossible. We’ve done some amazing work to cram a lot of sophisticated sensor equipment and processing capability into a ring. Motiv Ring does imbue the wearer with special powers, but not the time travelling kind. While we can’t make any claims of time travel, we have seen some strange things in our system logs related to time. I’d like to share some of those things with you. We do the hard work so you don’t have to.

First, a word about Daylight Saving Time

Most of the world has a sensible approach to time. Here in the USA, we suffer from a long-failed experiment to win back more daylight. This manifests in confusing three-character time zone codes (PST, PDT), encoding not only where you are on the planet, but also any silly local offset to which you might be subjected. Unsuspecting users may simply overlook the bizarre behavior of their mobile app when, twice a year, we abruptly redefine “now” as “an hour ago” or “an hour from what was otherwise now.” This nuance of US federal law requires everyone to introduce a different offset suddenly on pre-coordinated dates. We all “spring forward” or “fall back”, leaving our computers wondering what is going on.

Sure, but it’s still just a time zone offset, right?

So far, we’ve gotten lucky. It’s currently cost-prohibitive to travel faster along the surface of the planet than one time zone per hour. For the collective sanity of all my fellow engineers, let’s hope this continues long enough for compound interest in our pensions to distract us. Until then, we’re comfortable knowing there is very little chance of a user’s going for a run in San Francisco then jumping on a plane before their wearable has had a chance to sync, only to sync in New York, making it look like they ran three hours ago. Worse still, if there’s any sort of bug related to time, it quickly becomes unclear if values in the future are actually in the past.

Pitfall #0: Assuming Local Time / Ignoring Offset

Problem

You board a transcontinental flight in San Francisco at 12pm, flying east. When you land in New York, the local time is probably around 9pm. Without crystal-clear agreement on the time zone offset, your phone syncs with your connected device, and data is recorded with a time stamp from the originating time zone. Then, when your phone has a chance to sync with the cell tower network, it adjusts the clock accordingly. This wreaks havoc on your app’s ability to align itself in space-time.

If your mobile device is logging the current time without a clear and distinct policy on the time zone offset, all bets are off. Put another way, it’s critically important that all the components of your ecosystem agree about the current time. The problem is magnified when you have users spanning multiple time zones.

Symptoms / Early Warnings

  • Odd time shifts — Data recorded apparently from the future, but contextually pinned to the past. This means your ecosystem is potentially out of sync with itself. If you find yourself filtering out future timestamps in an SQL query, you may suffer this problem.
  • Unique index constraint violations — If you use a database system where the time stamp is used as a primary key, as is common with time-series data, you may experience bizarre constraint violations when a device changes its system clock. Some records may have conflicting timestamps due to overlaps introduced by the clock change. Records received before the clock change may overlap with those received afterward.

Solution: Standardize on UTC (Zulu) time

If every device participating in your ecosystem records time in UTC, the only time offsets your systems will experience are relativistic in nature.
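As a rough illustration, here is a minimal Swift sketch of what standardizing on UTC might look like at the point where time stamps are serialized. The helper names utcTimestamp and parseUTCTimestamp are hypothetical, and the sketch assumes Foundation’s ISO8601DateFormatter is available on your platform.

```swift
import Foundation

// Hypothetical helper: serialize an instant as an ISO 8601 string in UTC.
func utcTimestamp(_ date: Date) -> String {
    let formatter = ISO8601DateFormatter()
    formatter.timeZone = TimeZone(secondsFromGMT: 0)!
    formatter.formatOptions = [.withInternetDateTime]
    return formatter.string(from: date)
}

// Hypothetical helper: parse an ISO 8601 string, whatever offset it carries,
// back into an absolute instant.
func parseUTCTimestamp(_ text: String) -> Date? {
    let formatter = ISO8601DateFormatter()
    formatter.formatOptions = [.withInternetDateTime]
    return formatter.date(from: text)
}

// Noon UTC on March 23, 2018, expressed as seconds since the Unix epoch.
let instant = Date(timeIntervalSince1970: 1_521_806_400)
print(utcTimestamp(instant))
```

A time stamp written in San Francisco and one written in New York for the same instant serialize to the same string, so the local offset only matters when rendering for the user.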

Reasoning

Let’s be honest — time zone confusion is a 20th century problem. If we’re looking forward to 21st century problems, relativistic time distortion seems par for the course.

Pitfall #1: Assumptions About Reference Times

Problem

As a general rule, software engineers love functions with inputs and outputs. It’s easy to think of things in these terms. We can provide a controlled set of input values and measure the output, comparing it against expectations. If the values match, the test passes. You could spend your whole career building a rich and complex understanding of the systems you curate and never encounter the scenario I’m about to share. Hopefully, this helps you avoid what I suffered.

I once designed and built a dynamic rules-based event processor system. This system reacted to simple events with complex reactions, based on conditions defined by a downloaded file. One of the components of this system was a time filter. If an event occurred within some threshold of a reference time, the function would return true. Naturally, I created a test for this. Tests passed, and code review was positive, so I merged the changes. A few days later, one of the tests failed.

The triage team scratched their heads and thought “well, if it happens again, let’s re-evaluate.” The next day, the tests passed, so the triage team disregarded the issue as an isolated anomaly. A week later, the same test failed again. This time, the triage team escalated the issue directly to me, saying “we don’t know why, but it happened again.”

At this point, I had the benefit of perspective. When I saw the report, I knew there are very few reasons for this sort of problem. It could be a concurrency failure. We had previously had concurrency issues with some C++ standard libraries. (There’s a special place in hell for std::localtime.) So, I checked for use of undesirable libraries, and I found we were doing everything right. Then, I saw the culprit. Deep inside a logic flow, I found this:

NSDate* now = [NSDate date];

On its face, this is innocuous. Surely, it’s normal to know when things happen. Yes. Normally, yes. However, in the world of unit tests, where we have prescriptive expectations, the consideration of “now” is weird. From the perspective of the test, I had written something like this:

assert(!isTuesday, "encountered Tuesday");

Of course no one would write this, and I certainly didn’t write this intentionally. It ended up looking more like this:

assert(expectedDayOfWeek == actualDayOfWeek, "unexpected day of week");

I’m sure we can all agree the nuance is easily lost.

Symptoms / Early Warnings

  • Repeating failures on specific days of the week.
  • Assumptions about “now” — Any time you see some code attempting to measure the current time, take extra care in assessing the risks. This goes double for delayed invocation, dispatching blocks to asynchronous queues. It’s easy to inadvertently introduce an unexpected delay.

Solution: Injecting Time Values (Don’t assume “now”)

The simple solution is to move the “now” evaluation up one level to the calling component. In this case, it’s a simple matter of refactoring, like this:

// TimeCondition.swift
// determines if delta between date and reference is greater than
// a specified threshold

// old
func isSatisfied() -> Bool {
    let date = Date() // ← assumption about “now”
    return date.timeIntervalSince(self.referenceDate) > self.thresholdValue
}

// new
func isSatisfied(date: Date) -> Bool {
    return date.timeIntervalSince(self.referenceDate) > self.thresholdValue
}
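With the date passed in, a test no longer depends on the day it happens to run. Here is a minimal sketch of how such a test might look. The surrounding TimeCondition struct and its stored properties are assumptions reconstructed from the snippet above, and the specific dates are arbitrary.

```swift
import Foundation

// Assumed shape of the type that owns isSatisfied(date:).
struct TimeCondition {
    let referenceDate: Date
    let thresholdValue: TimeInterval

    // True when `date` is more than `thresholdValue` seconds after the reference.
    func isSatisfied(date: Date) -> Bool {
        return date.timeIntervalSince(referenceDate) > thresholdValue
    }
}

// A fixed reference instant, so the result is the same on any day of the week.
let reference = Date(timeIntervalSince1970: 1_521_806_400)
let condition = TimeCondition(referenceDate: reference, thresholdValue: 60)

// Edge cases chosen deliberately rather than inherited from the wall clock.
let exactlyAtThreshold = reference.addingTimeInterval(60)
let justPastThreshold = reference.addingTimeInterval(61)
let beforeReference = reference.addingTimeInterval(-3600)

print(condition.isSatisfied(date: exactlyAtThreshold))
print(condition.isSatisfied(date: justPastThreshold))
print(condition.isSatisfied(date: beforeReference))
```

Because the comparison is strictly greater-than, a date exactly at the threshold does not satisfy the condition, which is the kind of boundary a test can now pin down explicitly.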

Reasoning

  • Inversion of Control — Requiring an input to the method enables a wider variety of potential uses. This also allows you to craft your tests with specific values for known edge cases, rather than relying on hidden assumptions.
  • Self-describing Design — When a developer encounters your method descriptor with a date value in its signature, they know immediately they will be required to provide a date.
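Moving “now” up one level means the calling component owns the clock. One way to keep that caller testable as well is to hand it a closure that produces the current date. The sketch below is only an illustration of that idea: EventFilter and its now property are hypothetical names, and TimeCondition is the assumed shape of the type from the refactoring above.

```swift
import Foundation

// Assumed shape of the condition from the refactoring above.
struct TimeCondition {
    let referenceDate: Date
    let thresholdValue: TimeInterval

    func isSatisfied(date: Date) -> Bool {
        return date.timeIntervalSince(referenceDate) > thresholdValue
    }
}

// Hypothetical calling component. It owns the decision about what "now" means.
struct EventFilter {
    let condition: TimeCondition
    // Defaults to the system clock; tests substitute a fixed value.
    var now: () -> Date = { Date() }

    func shouldFire() -> Bool {
        return condition.isSatisfied(date: now())
    }
}

let reference = Date(timeIntervalSince1970: 1_521_806_400)
let condition = TimeCondition(referenceDate: reference, thresholdValue: 300)

// In a test, freeze the clock at known instants on either side of the threshold.
let tooEarly = EventFilter(condition: condition, now: { reference.addingTimeInterval(299) })
let lateEnough = EventFilter(condition: condition, now: { reference.addingTimeInterval(301) })

print(tooEarly.shouldFire())
print(lateEnough.shouldFire())
```

Production code can construct EventFilter(condition:) and get the system clock by default, while tests never touch it.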

Pitfall #2: Device System Clock

Problem

Mobile phones offer many luxuries to users. One such convenience is the ability to synchronize the system clock to the universally adopted definition of “now”. The vast majority of users have enabled this useful feature. However, there are some users who have adjusted their device system clocks, and their device does not agree with the internet about what time it is. This manifests in strange data, either from the future or the past.

When working with a distributed hardware ecosystem, where some computation happens on a Motiv Ring and some happens on a mobile device, time synchronization is even more critical to overall system health. When we’re measuring time-series data, it’s essential to have nearly perfect agreement between devices about the current time.

Everything in the universe experiences time slightly differently, due to motion relative to other things in the universe. For example, clocks at substantially different altitudes experience subtle drift in their values, when measured over long periods of time. To account for this, the clock in a Motiv Ring is synchronized with the mobile device each time we sync. If these values diverge, it can manifest in very unusual ways.

Symptoms / Early Warnings

  • Negative or Large Time Offsets — We use a cloud-based analytics platform to receive application log data from devices in the wild. Our system records both the time stamp when the event occurred and the time stamp when the log was received by the web service. Large discrepancies between the two can indicate a non-standard system clock.
  • Unique index constraint violations — As noted above, using the system time as a primary key for records created at a specific time can lead to unique constraint violations.
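The first symptom lends itself to an automated check. The following is a minimal sketch, not our production logic: LogRecord, ClockHealth, assessClock, and both tolerance values are hypothetical and chosen only for illustration. A long delay can also come from a device that was offline for a while, so treat the result as a hint rather than a verdict.

```swift
import Foundation

// Hypothetical log record carrying both time stamps.
struct LogRecord {
    let occurredAt: Date  // stamped by the device clock
    let receivedAt: Date  // stamped by the web service
}

enum ClockHealth: Equatable {
    case plausible
    case fromTheFuture(TimeInterval)  // seconds the event is ahead of receipt
    case stale(TimeInterval)          // seconds between event and receipt
}

// Classifies a record by the gap between when it occurred and when it arrived.
// The default tolerances are assumptions for illustration only.
func assessClock(_ record: LogRecord,
                 futureTolerance: TimeInterval = 60,
                 maxDelay: TimeInterval = 7 * 24 * 3600) -> ClockHealth {
    let delay = record.receivedAt.timeIntervalSince(record.occurredAt)
    if delay < -futureTolerance {
        return .fromTheFuture(-delay)
    }
    if delay > maxDelay {
        return .stale(delay)
    }
    return .plausible
}

let received = Date(timeIntervalSince1970: 1_521_806_400)
let ahead = LogRecord(occurredAt: received.addingTimeInterval(3600), receivedAt: received)
let normal = LogRecord(occurredAt: received.addingTimeInterval(-5), receivedAt: received)

print(assessClock(ahead))
print(assessClock(normal))
```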

Solution: Use Network Time Protocol (NTP)

Rather than relying on the user to specify the correct time, query an NTP server cluster on app launch. This way, you avoid any unexpected behavior due to user customization of their device.

Note: this only works for network-connected devices.
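To make the idea a little more concrete, here is a small sketch of the clock-offset arithmetic an NTP client performs with the four time stamps of a single exchange. The type and function names are hypothetical, and the sample values are invented. A real client library also handles the networking, retries, and sampling across several servers.

```swift
import Foundation

// Hypothetical container for the four time stamps of one NTP exchange,
// all expressed in seconds.
struct NTPSample {
    let clientSend: TimeInterval     // t0: request leaves the device (device clock)
    let serverReceive: TimeInterval  // t1: request arrives (server clock)
    let serverSend: TimeInterval     // t2: response leaves (server clock)
    let clientReceive: TimeInterval  // t3: response arrives (device clock)

    // Estimated correction to add to the device clock.
    // Positive means the device clock is behind the server.
    var offset: TimeInterval {
        return ((serverReceive - clientSend) + (serverSend - clientReceive)) / 2
    }

    // Time spent on the network, excluding server processing time.
    var roundTripDelay: TimeInterval {
        return (clientReceive - clientSend) - (serverSend - serverReceive)
    }
}

// Applies a previously measured offset to the device's idea of "now".
func correctedNow(deviceNow: Date, offset: TimeInterval) -> Date {
    return deviceNow.addingTimeInterval(offset)
}

// Invented values: the device clock is roughly a minute behind the server.
let sample = NTPSample(clientSend: 100.0,
                       serverReceive: 160.25,
                       serverSend: 160.5,
                       clientReceive: 100.75)

print(sample.offset)
print(sample.roundTripDelay)
```

Once the offset is known, the app can add it to the device clock whenever it stamps data, regardless of what the user has set in their settings.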

Lyft offers a convenient Swift library, Kronos, for integrating this functionality into your iOS app. (https://eng.lyft.com/freezing-time-6ebe8ffe3321)

Instacart also developed a similar library, called TrueTime. (https://tech.instacart.com/offline-first-introducing-truetime-for-swift-and-android-15e5d968df96)
