It Worked in Testing

Launch day is just the beginning of owning a piece of software or a new warehouse.

Christopher Hazlett
Code and Conveyors
8 min read · 4 days ago


The lights were the kind of lights that drain you. Yellowing with that background hum of sorrow only fluorescent lights can muster with efficiency. The coffee and pastries were no better, but a conference room at the Knight’s Inn near our soon-to-be Las Vegas cross-dock facility provided only the barest of essentials. We really didn’t care. There were a dozen of us looking over project plans and facility designs, and we were all so excited that the oddly stained maroon carpet barely registered as a concern at all. In retrospect, the setting was burned into my memory so deeply that I can even recall the smell of the place (a mix of cleaning products and a danish), such was the importance of opening a new facility. This was also the first facility Patrick and I launched together.

This logistical expansion was part of Gilt’s impressive and aggressive growth in the early 2010s. With so much product being manufactured or sourced from the West Coast, it made a lot of sense at the time to open a facility that could process products closer to their origin and get them on their way to waiting customers within 24 hours of initial receipt (a.k.a. cross-docking). This would save the company money and get orders to customers faster. A win-win.

As facilities go, it was not large, and it wasn’t supposed to be. It was not built for storage, so its 25,000 square feet were organized around a receiving area that fed into fast-track storage, which fed directly to the pickers on the opposite side via a single winding conveyor. By the time inventory was put away, orders were already being picked and shipped. It was extremely simple by design. But to someone who hadn’t yet had the pleasure of opening a brand new facility, it was magnificent and intoxicating.

The kick-off was a pretty standard affair. A round robin of introductions. A little speech by the Chief Supply Chain Officer. A walk through the tight timeline of 6 weeks. An assurance that the material handling equipment was already ordered and would be installed in time for the target date. A comprehensive integration design and testing plan was already underway. You get the picture. Everything, from the conveyors to the technical integration, was just going to work.

I’ll spare you a lot of the nitty-gritty technical details, but the partner managing the facility was operating on a midrange AS400 from the late ’90s that required SFTP for communicating orders. It also had a hard limit of 10 characters per filename, so you could not name a file of orders something like november_15_2014_orders_1.csv, a completely reasonable and legible name. I know what you’re thinking. Why work with a partner running such old tech? I had the same thought back then, but it’s important to have some perspective. When this project was getting off the ground (late 2013, early 2014), this was still the early part of the tech boom, and it was not very common for venture capital firms to invest in the development of modern fulfillment software. Not to say it wasn’t happening, but it was rare. ShipBob started the same year we started this project, which gives you a sense of where the industry was. Even to this day, a lot of the software that runs the logistics of our daily lives was made decades ago, and a lot of companies are still catching up. All that is to say that we had to work with what we had and solve the impedance mismatch between our real-time, highly available, extremely low-latency ecommerce system and the batch processing and lower reliability of our partner’s FTP servers and cron jobs (cron jobs are small programs that run at predetermined intervals).
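To make the filename constraint concrete, here is a minimal sketch of one way a date and a batch number could be squeezed into 10 characters. The scheme (one-letter prefix, two-digit year, month, day, two-digit sequence) and the function name `short_order_filename` are hypothetical illustrations, not the naming convention actually used on this project, and the sketch assumes the limit applies to the whole name with no extension.

```python
from datetime import date

MAX_NAME_LENGTH = 10  # the hard limit described above


def short_order_filename(day: date, sequence: int, prefix: str = "O") -> str:
    """Build a compact batch filename: prefix + YYMMDD + two-digit sequence.

    Hypothetical scheme for illustration only.
    """
    if len(prefix) != 1:
        raise ValueError("prefix must be a single character")
    if not 0 <= sequence <= 99:
        raise ValueError("sequence must fit in two digits")
    name = f"{prefix}{day:%y%m%d}{sequence:02d}"
    # Guard so that a later change to the scheme cannot silently break the limit.
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"{name!r} exceeds {MAX_NAME_LENGTH} characters")
    return name


# november_15_2014_orders_1.csv becomes a 9-character name.
print(short_order_filename(date(2014, 11, 15), 1))  # O14111501
```

The trade-off is the one the paragraph hints at: the short name fits, but it is no longer legible without knowing the scheme.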

Before I go into more detail about the integration, I think it’s important to recognize one incontrovertible fact of retail operations: a retail company is only good if it can fulfill its promise to its customers. That promise, at its most fundamental, is to get customers the item(s) they ordered when the company said it would get them there. Everything else is window dressing. When you start from that position, the design of your software and facility follows. It means you have to be sure that inventory is accurate at the facility and, in turn, accurate on the website. It means that when you take an order from a customer, it must be sent to the facility on the same day with a guarantee that it has been received and can be picked. For people new to the industry, I like to reframe the software problem as a bank. After all, inventory represents the biggest investment (outside of labor) that retail companies make. So what goes into the bank must be accurately tracked, and what goes out must also be accurately tracked. If inventory goes missing, we just lost money.

With that in mind, we designed our technical integrations with redundant checks and balances. The way it worked was actually quite simple. Within our systems, we kept track of every file we dropped onto our partner’s FTP server and every order, inventory move, receipt, etc. via transaction IDs that could be played back from the beginning of time. At intervals throughout the day, and most importantly at the end of the day, we’d send a list of every file that had moved between the systems over to our partner’s servers. Their systems would then confirm that they had received every file we sent and that our systems weren’t missing any files they had sent. If there were any misses, we’d follow up with a list of transaction IDs to confirm that all transactions were captured on both sides. If anything was missing, we could slot that transaction into our event log and the systems would catch up to each other. It was this design that controlled for the impedance mismatch between the two systems and ensured we were in lockstep. It worked like a champ, too. Our bank accounts were accurate.
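The two-stage check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code that ran at Gilt: the `Ledger` class, the `reconcile` function, the result keys, and the sample filenames and transaction IDs are all hypothetical, and the real exchange happened over file transfer rather than in a single process.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """One side's record of files and the transaction IDs each file carried."""

    files: dict[str, set[str]] = field(default_factory=dict)

    def record(self, filename: str, transaction_ids: list[str]) -> None:
        self.files.setdefault(filename, set()).update(transaction_ids)

    def manifest(self) -> set[str]:
        """Names of every file this side knows about."""
        return set(self.files)

    def transactions(self) -> set[str]:
        """Every transaction ID this side has captured."""
        return set().union(*self.files.values())


def reconcile(ours: Ledger, theirs: Ledger) -> dict[str, set[str]]:
    """Compare file manifests first; compare transaction IDs only on a miss."""
    result = {
        "files_missing_at_partner": ours.manifest() - theirs.manifest(),
        "files_missing_at_us": theirs.manifest() - ours.manifest(),
        "transactions_missing_at_partner": set(),
        "transactions_missing_at_us": set(),
    }
    if result["files_missing_at_partner"] or result["files_missing_at_us"]:
        # Second stage: find exactly which transactions need to be replayed.
        result["transactions_missing_at_partner"] = (
            ours.transactions() - theirs.transactions()
        )
        result["transactions_missing_at_us"] = (
            theirs.transactions() - ours.transactions()
        )
    return result


ours, theirs = Ledger(), Ledger()
ours.record("O14111501", ["t1", "t2"])
ours.record("O14111502", ["t3"])
theirs.record("O14111501", ["t1", "t2"])  # the second file never arrived

report = reconcile(ours, theirs)
print(report["files_missing_at_partner"])         # {'O14111502'}
print(report["transactions_missing_at_partner"])  # {'t3'}
```

Anything that shows up in a "missing" set is what would be replayed into the other side's event log so the two systems converge.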

On launch day, we converged on the facility in Vegas to open the doors and start shipping. It was then that I realized the impedance mismatch between Gilt’s systems and the third party’s was not just in batch processing vs. real-time processing; it was also in software management. The AS400 from the late ’90s did not have any deployment or testing automation in place. I watched, terrified, as our (well-meaning) partner began typing the RPG commands into the production AS400 as he translated them from the test server. Because we were automated, our servers were waiting to do the final day of tests, and I had assumed the same of our partner (oh, my naiveté). It took 4–5 hours on launch day to get the production server coded, and then we ran through our battery of production tests to ensure the systems were working as we had tested them just the week before. Spoiler alert: they weren’t, and there was a lot of patient (sometimes very impatient) waiting for the fixes to be coded live into production and then retested. By 5 pm (after an 8 am start), we were ready to drop our first real order, which was good because the UPS driver had just shown up.

The order dropped into the facility, and we picked it, boxed it, and put the shipping label on it. The first order went to Christine in Pennsylvania. The UPS driver had waited for over an hour, and I remember it vividly: we walked the box over to Patrick, who was standing at the dock door with the driver, and he handed him the box as the hot Las Vegas air and bright early evening sun cascaded through the opening. We were elated. We had done it. Orders were flowing. We bid the UPS driver adieu until tomorrow, when we’d give him the remaining 149 orders yet to be picked, but we had a working system. Our order management system had a tracking number, and the customer had already gotten the email that her order was on its way. Once that was confirmed, we started up the engine. Let’s pack all the boxes. More orders and more receipts were coming the next day.

It was a really great moment as an engineer who worked primarily in bits and bytes. When you do something and it becomes physical, it means more (to me at least). I was helping load the pallets as the orders were coming off the conveyor, and I felt proud of the team and proud to be part of some really excellent software, when I saw the name Christine again. I thought it was funny. “What are the chances?” I thought, and then I put the box down on the pallet. Then I got a John, and a John, and a Ben, and another Ben, and another Ben…and then Christine again. So…the chances were super high. I talked to the head of the facility (by now Patrick had left to get a nice Italian dinner and was waiting for us to come celebrate), and we stopped processing orders. We pulled every packed order off the pallets and started lining them up by destination. There were 2 or 3 of every order. If we had started faster, we might have given all of that inventory to UPS and been out thousands of dollars.

It turned out that the WMS was repeating multi-line orders. It was an easy fix, but we needed to start from scratch with all the orders to ensure they were clean. We wiped all the orders from the WMS and then resynced the orders with the facility the next morning. Every packed order had to be opened and re-received to be properly processed (minus Christine’s, of course).

I look back on this story fondly now, but I remember being indignant sometimes, angry at our partner other times, and disappointed in myself for not seeking to understand how our partner’s processes worked. Now I see it as one of the more important lessons in my career. It taught me what it means to really own a thing. In software, business, and even life, we talk a lot about the start of something, or time to market, or launch dates, but the reality is that a launch date is just the start of ownership. From then on, I was determined to approach software and launches as just step 1 of building something great, and as anyone who’s ever worked with me will tell you, I’m obsessed with getting the most important things right at launch. Once you get those important things right, you can get started on the really challenging part of the work: owning what comes next.

For that project, it meant our designs had to be resilient in the face of imperfections. It meant that we needed a facility that could receive quickly and ship even faster. We got those pieces right, and we proved that our designs actually worked as we hoped. The error on the day of launch was just the beginning. From then on, I was never surprised when something didn’t work exactly right on Day 0, and I factored in the actual cost of ownership when it came time to decide what to build and what not to.



Chris is the CTO at OptechGroup, a boutique consultancy that helps clients achieve their operational and technical goals.