Torturing Data

This morning on my Metro ride to work, I heard a great quote while listening to a podcast called Boagworld. The guest, Jared Spool, said:

“If you torture data enough, you can get it to confess to anything you want.”

A company I once worked for took that approach to data. That company went out of business in 1999, leaving thousands of us out of work.

Preston Trucking started in 1932 as a small regional carrier and grew into a dominant business, primarily in the northeast region of the United States. They specialized in LTL (Less Than Truckload) shipments. Their motto, “The 151 Line,” was an attempt in the 1930s to boost marketing by highlighting how big they were. The owner, A.T. Blades, counted all the pieces of equipment they had and came up with a total of 151.

After I graduated from college in 1985, my first job was sorting punched cards in the computer room at Preston Trucking. Even then, the equipment was considered old and outdated (the 1890 United States Census was done using punched cards). I feel very lucky to have gotten a chance to work on the old equipment for a few months before I found another job. The experience was invaluable because I was working with data physically rather than in the abstract. Plus, watching and hearing a large stack of cards mechanically sort into different slots is mesmerizing.

HUMOROUS SIDE NOTE: When Preston Trucking finally got rid of their old equipment, they had boxes and boxes of punch cards to dispose of. One manager thought a bonfire might be the best way to get rid of them and piled them all in a heap and poured gasoline over them. Gasoline creates fumes when you pour it, which is part of the reason it is so dangerous. When the manager lit a match, the whole pile exploded and punch cards were raining down from the sky.

The company was smart enough to know that technology could give us a competitive edge, and they invested millions in setting up a proper data center. When I returned to Preston Trucking in 1990, they were creating the best technology in the trucking industry. They also hired good people and built some outstanding systems for the era. Some examples:

  1. A.R.R.C. (Automated Rating Routing and Coding). Preston Trucking shipped all types of materials, hazardous and nonhazardous. Because we would transport almost anything, a huge amount of effort was needed to (1) properly price the shipment, (2) ensure that it did not conflict with other items, (3) comply with interstate regulations, and (4) get it to the destination in the most efficient way. We were able to automate this for over 80% of the shipments, which eliminated a room full of manual coders and provided better service at a lower cost.
  2. Before there were online maps, we were very interested in tracking miles and routes for shipments. The truckers kept track of their miles for individual legs of a trip, but we needed a national perspective to pick the best routes to get a shipment to its destination at the lowest cost. We had a table of distances between zip codes, but the data was based on measurements made on a flat map. We had to apply a curvature-of-the-earth calculation to arrive at a more accurate result. We were also working on adding miles in areas where there was a barrier between zip codes, like a river or a mountain. It was crude by today’s standards but very helpful in our pricing calculations.
  3. QYM Costing (Quality Yield Management Costing). This was an invaluable tool created to determine the actual cost of shipping a product from point A to point B. We tried to get exact measurements on every step between pickup and delivery. A group of us went to one of the terminal hubs at midnight one night. We ran around with stopwatches, timing the forklift drivers as they moved freight from one truck to another. (NOTE: We also learned that our presence could corrupt the results; the forklift drivers ripped open boxes of candy to give to one of the pretty women in our group.) We took this data and created a system that clearly showed which freight was profitable and which freight lost us money.
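
The curvature-of-the-earth correction described in item 2 can be illustrated with a short Python sketch. It uses the haversine great-circle formula, a standard technique that is assumed here purely for illustration; nothing above says which calculation Preston's system actually applied. The function names, the sample coordinates, and the barrier multiplier are all hypothetical.

```python
import math

# Mean Earth radius in miles (an approximation; the Earth is not a perfect sphere).
EARTH_RADIUS_MILES = 3958.8


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine distance in miles between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def adjusted_miles(lat1, lon1, lat2, lon2, barrier_factor=1.0):
    """Great-circle miles scaled by a hypothetical barrier factor.

    barrier_factor is an illustrative multiplier (>= 1.0) standing in for
    extra miles caused by a river or mountain between two zip codes.
    """
    if barrier_factor < 1.0:
        raise ValueError("barrier_factor must be at least 1.0")
    return great_circle_miles(lat1, lon1, lat2, lon2) * barrier_factor


if __name__ == "__main__":
    # Approximate city coordinates, used only as sample inputs.
    baltimore = (39.29, -76.61)
    boston = (42.36, -71.06)
    print(round(great_circle_miles(*baltimore, *boston), 1))
    print(round(adjusted_miles(*baltimore, *boston, barrier_factor=1.3), 1))
```

A single multiplier is the simplest way to model a barrier; a real pricing system would more likely store per-pair adjustments or use actual road mileage.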

The problem was that the company was losing money. We were pushing tons of good information to management, but they didn’t have enough time to process it. Instead of using our reports to identify and address the issues, some managers would ask us to manipulate the data to make their numbers look better than they were. I think some had the naive notion that if the reports looked good, then the company would succeed. They reasoned that the company had made a huge investment in technology, so we only needed to tell the computers that we were successful. Again:

“If you torture data enough, you can get it to confess to anything you want.”

There were other reasons why Preston Trucking failed, but the failure has always bothered me because I felt our computer systems were very robust for the era. Computers could identify the problems, but that didn’t mean they could fix them.

