An Analysis of Causality

John Mastro
Apr 4 · 4 min read

N.B., this is just for fun. We don’t really believe that Mercury being in retrograde caused these events, even though that is exactly what happened.

true causality

TL;DR

Mercury was in retrograde recently and it really made our lives difficult here on the Data team at Ro.

Background

When a planet is “in retrograde,” it means that it appears to be moving backward through the sky. Of course, it’s not really moving backwards — it’s just a trick of the planet’s speed and position relative to us. Mercury goes retrograde three or four times a year, for about three and a half weeks each time.

Mercury is named for the messenger of the Roman gods. In astrology, Mercury rules over communications and technology, so when Mercury is in retrograde you’re likely to experience some mayhem in those areas. Its chaos is most acute right as it turns retrograde and again when it eventually goes direct, and it extends into pre- and post-retrograde “shadow” periods of a couple of weeks before and after retrograde.

What’s past is, in retrospect, ominous foreshadowing

Going into Mercury’s last period in retrograde, we were using a budget syncing tool to sync data into the previous incarnation of our data warehouse. For no apparent reason, this stopped working entirely one day. Neither we nor the support team were able to satisfactorily determine what was going wrong.

At the time, we didn’t understand Mercury’s significance and attributed this experience to complex interactions between complex systems and perhaps the tool just not being that great for the job. We were in the process of migrating to a different system anyway, so we finished that up and never looked back, until it was too late.

The events

Now we use a different vendor to sync into our new data warehouse. It worked quite well until Mercury went retrograde again, and then suddenly it… didn’t. We started seeing discrepancies, each spanning a few hours, in which some row insertions and updates never made it to the warehouse. This kicked off a long odyssey over nearly a month, including many early mornings and late nights, with no regard for weekends.

Some of the earthly explanations for the events of this period seem plausible enough, while others frankly strain credulity. But what they all have in common is that they ignore the real issue: Mercury.

Our response

When the very heavens in the sky are against you, sometimes there’s really not much you can do besides hold tight and wait for it to be over. Luckily, your author, who had the distinct pleasure of spending a great deal of time on this general issue, is a Taurus. In other words, he is stubborn enough to work on insoluble tasks almost indefinitely. He just kept that up until the galaxy stopped being so inhospitable and things went back to normal.

I’m just joking (and I want to be especially clear about that if you are one of my superiors). Some of the things we did include:

  • Coordinated the investigation and debugging efforts between our team, other teams at Ro, and the vendors involved
  • Built an (initially ad-hoc) system to make sure we always had verified, discrepancy-free data through midnight in our warehouse and available to business users
  • Refined our data monitoring to pinpoint discrepancies between the source database and the warehouse
  • Set up Metabase to serve the business’s most important real-time data needs directly from the source until real-time data was restored to the warehouse
  • Audited all of the data being replicated to our warehouse to minimize the amount of data that needs to be synced
  • Dug into the operational details of managing deployments involving schema changes and data migrations in the presence of a real-time data requirement
  • Started discovery on a fallback system for replication that avoids third parties altogether

Some of these things are temporary bandaids, but others leave us significantly better prepared going forward. Yet none of these things can truly solve the underlying issue (still Mercury).
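For the curious, the monitoring item in the list above (pinpointing discrepancies between the source database and the warehouse) can be sketched roughly as follows. This is a minimal illustration, not our actual tooling: it assumes you have already pulled per-hour row counts from each side into plain dictionaries, and the function name `find_discrepancies` and the sample counts are made up for the example.

```python
def find_discrepancies(source_counts, warehouse_counts):
    """Return (hour, source_count, warehouse_count) for every hour bucket
    where the two sides disagree. A bucket missing on one side counts as 0."""
    hours = sorted(set(source_counts) | set(warehouse_counts))
    return [
        (hour, source_counts.get(hour, 0), warehouse_counts.get(hour, 0))
        for hour in hours
        if source_counts.get(hour, 0) != warehouse_counts.get(hour, 0)
    ]


# Hypothetical hourly row counts, invented purely for illustration.
source = {"00:00": 120, "01:00": 95, "02:00": 110}
warehouse = {"00:00": 120, "01:00": 90}  # 01:00 is short, 02:00 never arrived

for hour, src, wh in find_discrepancies(source, warehouse):
    print(f"{hour}: source={src} warehouse={wh} missing={src - wh}")
```

Comparing counts per time bucket, rather than one grand total, is what narrows a problem down to a window of a few hours. Note that counts alone would only catch missing insertions; catching missed updates would need something more, such as comparing a checksum or a last-modified timestamp per bucket.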

Then, on March 28th, Mercury went direct, and the next day the sync started working again, without errors or discrepancies. On the same day, your author’s kitchen sink started leaking, which proves the cosmos has a sense of humor.

Of course, we can hardly return to the status quo ante. I described above some of the steps we’ve taken to make ourselves more resilient to these types of problems, and we have more planned for the near future. After all, we’re still in the post-retrograde shadow period, and Mercury will go retrograde once again in July.

Causal analysis

While it’s awfully suggestive that Mercury rules over technology and communications, and these problems lined up so well with Mercury’s period in retrograde, I recognize that this isn’t really proof.

I therefore asked our team’s Data Scientist if she could build a Machine Learning model that would really put this to the test. All you have to do is one-hot encode some things, normalize and/or regularize some other things, build a neural network, and you get your answer. I mean, it’s science, and tensors can be multi-dimensional, just like the universe.

Anyway, she listened to the problem and said “p = .99”, so there you have it: a near-perfect p-value. It must be true!

Dedicated to a certain Gemini ;-)

Ro Data Team Blog

Ro Data Team Blog: data analytics, data engineering, data science
