How I stay positive
Recently, a few people at work have asked me how I am able to be positive. Honestly, my first reaction was, “I seem positive? Really!?” I know that data migration is not a sexy project. What’s so fascinating about moving data from point A to B? That really seems mundane. However, I started to remember the times I told myself that things were a LOT more interesting than they seemed. It is these experiences I’d like to share with you.
I’ll admit, I’m a bit of a workaholic. Give me a choice between taking a vacation and coming to the office, and I will choose the office. Three-day weekends are hell for me. Guess I must be crazy. Seriously though, when Kris O stated that he’s happy that there’s work to do, I felt myself agreeing with him. It’s one reason why I’m able to be positive. I’d rather be busy and producing something that I believe will help Zappos get to the next stage.
What about other reasons? What makes data migration fun and interesting? Turns out there is quite a bit. I think the main reason most of us may think it is boring is the abstraction: moving data from A to B. But what if I said there were a lot of interesting challenges along the way that make it a very interesting project? You probably wouldn’t believe me, right? So I’ll list a few things that have happened and then I’ll let you decide for yourself.
Early in the project, it was apparent that the data being handled was critical. All those emails, passwords, and credit cards made the project so radioactive that we’d have to wear radiation suits just to avoid getting burned. I’m exaggerating a bit, but this made for an interesting problem to solve.
First of all, encryption is key. Choosing what to use was the easy part. Choosing how to make it work took a bit more effort. Learning to use BouncyCastle’s PGP implementation with a simple InputStream or OutputStream interface was just awesome! How many can say that they’ve done something like this?
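To give a feel for what “encryption behind a simple InputStream or OutputStream” looks like, here is a minimal sketch. It is not the project’s code and it is not PGP: I’m swapping in the JDK’s own javax.crypto stream classes with AES-GCM as a stand-in, because the point is the shape, not the BouncyCastle API. The class and method names (EncryptedStreams, encrypting, decrypting, roundTrip) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper: AES-GCM stands in for PGP purely for illustration.
class EncryptedStreams {
    static final int IV_BYTES = 12;   // commonly used IV length for GCM
    static final int TAG_BITS = 128;  // authentication tag length

    // Wraps `out` so callers write plaintext and ciphertext reaches `out`.
    // A fresh random IV is written first, unencrypted.
    static OutputStream encrypting(OutputStream out, SecretKey key)
            throws IOException, GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        out.write(iv);
        return new CipherOutputStream(out, cipher);
    }

    // Wraps `in` so callers read plaintext; expects the IV at the front.
    static InputStream decrypting(InputStream in, SecretKey key)
            throws IOException, GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        new DataInputStream(in).readFully(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        return new CipherInputStream(in, cipher);
    }

    // Encrypts and then decrypts `plaintext` in memory with a throwaway key.
    static String roundTrip(String plaintext) {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            SecretKey key = generator.generateKey();

            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (OutputStream out = encrypting(sink, key)) {
                out.write(plaintext.getBytes(StandardCharsets.UTF_8));
            }

            ByteArrayOutputStream plain = new ByteArrayOutputStream();
            try (InputStream in = decrypting(new ByteArrayInputStream(sink.toByteArray()), key)) {
                byte[] buffer = new byte[4096];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    plain.write(buffer, 0, n);
                }
            }
            return new String(plain.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException | GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("customer@example.com"));
    }
}
```

The nice part of this shape is that the rest of the code only ever sees an InputStream or an OutputStream, so the reading and writing logic does not need to know that encryption is happening at all.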
That was just one of the things we ran into early on in the project. Later, we reached a point where optimization was necessary. There were enough features implemented to see the slow points in the code. There were quite a few things we found, some of which were easy to fix. One simple fix was to use a buffering SQS client, which effectively provides asynchronous messaging over the non-async API. If you’re familiar with the Async versions of most of Amazon’s AWS clients, you’ll definitely understand how helpful this becomes. Not only do you get non-blocking calls, you can easily achieve it via a Spring configuration change! Yay for DI!
Of course, there were more optimizations to be made. There were areas where some refactoring was required so that classes no longer depended on singleton instances. The fun part came from trying to find some way to parallelize handling of a customer’s data. Remember, we are dealing with different data sets such as identity, addresses, credit cards, and favorites. Normally, just creating a Runnable or Callable out of handling the separate data sets would be enough. Luckily, we have handlers for each data set, so doing something like this is actually quite easy. But things are never as they seem: it turns out that there are dependencies between data sets.
Customer data basically requires a customer’s identity to exist first. Importing addresses and favorites requires the customerId. Credit cards actually depend on both addresses and the identity. What we have is a situation where identity must complete before starting on addresses and favorites. Credit cards can’t start until addresses are done.
So how can we solve this issue? We are trying to create Future objects but need to control when certain Futures can start processing. How do you chain asynchronous operations? I’ll give you two hints. First, look at ListenableFuture found in Google’s Guava library. Second, we are dealing with a tree structure.
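For the curious, here is one way the two hints might fit together. This is a minimal sketch and not our actual code: to keep it free of extra libraries I’m using the JDK’s CompletableFuture in place of Guava’s ListenableFuture (which lets you attach follow-up work through addListener), and every name in it (ImportNode, CustomerImportTree, handle, schedule, runImport) is invented for illustration. Each node in the tree is a data set, and a recursive walk tells every child to start only after its parent has finished.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical tree node: one data set plus the data sets that depend on it.
class ImportNode {
    final String name;
    final List<ImportNode> children = new ArrayList<>();

    ImportNode(String name) {
        this.name = name;
    }

    ImportNode add(ImportNode child) {
        children.add(child);
        return this;
    }
}

class CustomerImportTree {
    // Records the order in which the stand-in handlers ran.
    static final List<String> completed =
            Collections.synchronizedList(new ArrayList<>());

    // Stand-in for a real data set handler (assumption: real ones do I/O).
    static void handle(ImportNode node) {
        completed.add(node.name);
    }

    // Runs `node` once `parent` has completed, then recurses into its children.
    // The returned future completes when the whole subtree is done.
    static CompletableFuture<Void> schedule(ImportNode node,
                                            CompletableFuture<Void> parent,
                                            ExecutorService pool) {
        CompletableFuture<Void> self = parent.thenRunAsync(() -> handle(node), pool);
        List<CompletableFuture<Void>> subtree = new ArrayList<>();
        subtree.add(self);
        for (ImportNode child : node.children) {
            subtree.add(schedule(child, self, pool));
        }
        return CompletableFuture.allOf(subtree.toArray(new CompletableFuture[0]));
    }

    // Builds the dependency tree described above and runs it to completion.
    static List<String> runImport() {
        completed.clear();
        ImportNode addresses = new ImportNode("addresses")
                .add(new ImportNode("creditCards"));
        ImportNode identity = new ImportNode("identity")
                .add(addresses)
                .add(new ImportNode("favorites"));

        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            CompletableFuture<Void> start = CompletableFuture.completedFuture(null);
            schedule(identity, start, pool).join();
        } finally {
            pool.shutdown();
        }
        return new ArrayList<>(completed);
    }

    public static void main(String[] args) {
        // identity is always first and addresses always precedes creditCards;
        // where favorites lands relative to the other two can vary per run.
        System.out.println(runImport());
    }
}
```

A side effect of chaining this way is that if a handler throws, the futures that depend on it complete exceptionally without running their own handlers, so credit cards are never attempted for a customer whose addresses failed.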
How many times in your programming career so far have you needed to use a tree data structure? What about recursion? I will bet the number is extremely low, probably fewer than the number of fingers on one hand. But it is stuff like this that makes things interesting and fun, even with something like data migration.
Anyway, I know this was long-winded, and I apologize if you feel that I’ve wasted your time. For those who feel otherwise, I hope my experience has helped you look at things a bit differently. Remember, things are never as they seem. If you look carefully enough, you’ll find those hidden gems.