Time-in-Process and Entropy

Alexei Zheglov
Oct 20, 2017

Entropy in information theory measures the quantity of information we can expect to receive from observing some probabilistic phenomenon.

If we happen to observe a very infrequent event, our observation carries a great quantity of information. If we happen to observe a more ordinary, frequent event, the information quantity is much less. If we’re observing something that is certain to happen (100% probability), the information quantity is zero.
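To make this concrete, here is a minimal Python sketch. It assumes the standard definition of information content as the logarithm of one over the probability, measured in bits. The function name `self_information` is chosen for illustration only.

```python
import math

def self_information(p: float) -> float:
    """Information, in bits, gained by observing an event of probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    # log2(1/p): rare events (small p) give large values, p = 1 gives zero.
    return math.log2(1.0 / p)

print(self_information(0.01))  # a rare event: several bits
print(self_information(0.5))   # a coin flip: exactly one bit
print(self_information(1.0))   # a certain event: zero bits
```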

We can now calculate the entropy by adding up the quantities of information resulting from observation of all possible outcomes, weighted by the probabilities of those outcomes. If the outcomes are not a discrete set, but a continuous range, integration is the mathematical apparatus we can use for this task.
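For the discrete case, the weighted sum described above can be sketched in a few lines of Python. This is an illustration under the usual convention that outcomes with zero probability contribute nothing; `entropy_bits` is a name made up for this example.

```python
import math

def entropy_bits(probabilities):
    """Expected information, in bits, over a discrete set of outcomes."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Each outcome contributes its information, log2(1/p), weighted by p.
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

print(entropy_bits([1.0]))         # one certain outcome
print(entropy_bits([0.5, 0.5]))    # a fair coin
print(entropy_bits([0.9, 0.1]))    # a biased coin: less than the fair one
```

The more evenly the probability is spread over the outcomes, the higher the entropy.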

It turns out that, of all continuous probability distributions defined on the domain from 0 to plus-infinity (that is, all non-negative real numbers) and having the same mean, the Exponential distribution has the maximum entropy.
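For readers who want to see where this comes from, here is a short worked derivation. It assumes the continuous (differential) form of entropy measured in natural units, and it holds the mean fixed at some value, written here as μ. The symbols f, g, h and μ are notation introduced for this sketch.

```latex
% Exponential density with mean \mu on [0, \infty)
f(x) = \frac{1}{\mu} e^{-x/\mu}, \qquad \ln f(x) = -\ln\mu - \frac{x}{\mu}

% Its differential entropy
h(f) = -\int_0^\infty f(x) \ln f(x)\, dx
     = \ln\mu \int_0^\infty f(x)\, dx + \frac{1}{\mu} \int_0^\infty x f(x)\, dx
     = \ln\mu + 1

% Any other density g on [0, \infty) with the same mean \mu (Gibbs' inequality)
h(g) = -\int_0^\infty g(x) \ln g(x)\, dx
     \le -\int_0^\infty g(x) \ln f(x)\, dx
     = \ln\mu + \frac{1}{\mu} \int_0^\infty x g(x)\, dx
     = \ln\mu + 1 = h(f)
```

So no distribution on the non-negative numbers with mean μ can exceed the entropy of the Exponential distribution with that mean.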

What’s interesting is that the time-in-process we observe in many creative, intellectual enterprises often has a probability distribution close to Exponential.

This is good news.

Each time we complete a work item and deliver something to a customer, we record its time-in-process. That’s one observation. As we operate in the high-entropy domain, we can expect our observations to produce a lot of information. Therefore we don’t need many of them to produce a robust model.

The manager in the above example soon discovered that they hadn’t recorded the second-from-the-right data point (84 days) correctly. However, they realized that correcting this error in any way, or even removing this data point from the set altogether, wouldn’t change any of the important facts this model tells the manager:

  1. The average time in process you can use for your planning purposes is still pretty much the same.
  2. The percentile of time in process you would use to offer a service-level agreement (SLA) to customers is still the same.
  3. The worst observed case is still the same. (The “tail” of the distribution is still all the way up to there.) If you had an improvement idea to eliminate the root cause of this delay, it is probably still valid.
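You can run this kind of check on your own records. The Python sketch below uses invented times-in-process, not the data behind the example above. The 85th percentile as the SLA level, the 74-day "corrected" value and the helper names are all assumptions made for illustration.

```python
import math
from statistics import mean

def nearest_rank_percentile(values, pct):
    """Smallest observation with at least pct percent of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def summarize(values, sla_pct=85):
    """The three facts from the list above: average, SLA percentile, worst case."""
    return {
        "mean": round(mean(values), 1),
        "sla": nearest_rank_percentile(values, sla_pct),
        "worst": max(values),
    }

# Invented times-in-process in days; NOT the data from the article's example.
observed = [2, 3, 5, 6, 8, 9, 12, 14, 17, 21, 25, 31, 38, 47, 60, 84, 110]

# Hypothetical correction: suppose the 84-day item really took 74 days.
corrected = [74 if days == 84 else days for days in observed]
# Alternative: drop the doubtful data point altogether.
removed = [days for days in observed if days != 84]

for label, sample in [("as recorded", observed),
                      ("corrected", corrected),
                      ("removed", removed)]:
    print(label, summarize(sample))
```

In this invented sample, the correction leaves the SLA percentile and the worst case untouched and moves the average by about half a day. Removing the point moves the average and the percentile rank somewhat more. How much depends on the sample at hand.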
