Queue Wait Time computation

Nisha Manickam
4 min read · Jun 13, 2021


Thanks to the digital age, we no longer have to wait in real-life queues or even step out of the house. Still, the queue, though invisible, is a reality. Automatic Call Distribution (ACD) systems sit at the heart of modern call centers and assign agents to different types of interactions, including voice, chats, emails, messages, and callbacks.

Estimating the wait time, or delay, can help a call center reduce peak congestion and improve the overall efficiency of the service system. In a call center, callers commonly spend a long time in the queue until the next available agent can serve them. If the delay forecast is announced, a customer may choose to hang up, stay in the queue, or call back later. Even though some customers will drop out, announcing the forecast makes the system transparent and lets customers make their own decisions.

How is the wait time computed?

First things first: Average Speed of Answer (ASA) is different from Estimated Wait Time (EWT). EWT predicts the transient state of the system, whereas ASA is a fixed measure of the system over a period.

Earlier call center systems relied on simple analytical formulas for first-come, first-served (FCFS) queues, which led to two classes of predictors:

1. Queue-length (QL) predictors: These are derived from queuing theory and depend only on the length of the queue and system parameters.

2. Delay-history (DH) predictors: These are derived from parameter-free heuristics and depend on the history of past delays.
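To make the two classes concrete, here is a minimal sketch of one predictor from each. The function names are hypothetical. The QL form assumes a textbook setting not stated in this article: all agents are busy, service times are exponential, and nobody abandons, so a new arrival waits for (queue length + 1) service completions. The DH form simply reuses the delay of the customer who most recently entered service.

```python
from typing import Sequence


def queue_length_predictor(queue_length: int,
                           num_agents: int,
                           mean_service_time: float) -> float:
    """QL-style estimate (assumption: all agents busy, no abandonment).

    The new arrival must wait for (queue_length + 1) service completions,
    and completions occur at a rate of num_agents / mean_service_time.
    """
    if num_agents <= 0 or mean_service_time <= 0:
        raise ValueError("num_agents and mean_service_time must be positive")
    return (queue_length + 1) * mean_service_time / num_agents


def last_to_enter_service_predictor(recent_delays: Sequence[float]) -> float:
    """DH-style estimate: the delay of the most recent customer served.

    recent_delays is ordered oldest to newest; no system parameters needed.
    """
    if not recent_delays:
        raise ValueError("at least one observed delay is required")
    return recent_delays[-1]
```

The QL version needs the number of agents and the mean service time, while the DH version needs only observed delays, which is why the latter is easier to deploy.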

In real call centers, service is affected by many factors such as:

a. queue abandonment,

b. agent efficiency (agent activated in multiple queues),

c. concurrent interaction handling by agents,

d. agent availability (fluctuation after predictions)

According to Ibrahim and Whitt (2009a, 2011a, 2011b), QL-based predictors are generally more accurate than DH-based ones and are preferable if all the information that goes into the formula is available. DH-based predictors perform about as well as QL-based predictors when the incoming request rate and agent availability are constant over time. DH-based predictors are attractive in practice because they require little information, all of it observable, and they perform better when dealing with unexpected events and unknown parameters. Looking at historical delay times can therefore provide crucial pointers toward solving the problem of estimating wait time.

Let’s look at the algorithm used by Genesys’ PureCloud to calculate estimated wait time in their customer service queues.

Here, the last N completed conversations, which are cached, are considered. For each cached conversation, an Adjusted Handle Time is computed using the formula below.

Adjusted Handle Time (AHT) = (Actual_wait_time * Agents_active_on_queue) / Position_In_Queue

Position_In_Queue: The position in the queue when the EWT inference was made

Agents_active_on_queue: The number of agents activated and online on the target queue

Actual_wait_time: The actual time from when the EWT inference was made until an agent answered the interaction

The median of the sampled Adjusted Handle Times (AHTs) across the cached conversations is taken as the Predicted Handle Time (PredictedAHT).
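The AHT formula and the median step can be sketched as follows. The `CachedConversation` class and function names are hypothetical and used only for illustration; the arithmetic follows the formula as stated in this article.

```python
from dataclasses import dataclass
from statistics import median
from typing import Sequence


@dataclass
class CachedConversation:
    """Hypothetical record of one completed, cached conversation."""
    actual_wait_time: float        # seconds waited after the EWT inference
    agents_active_on_queue: int    # agents activated and online at that time
    position_in_queue: int         # queue position at inference time (>= 1)


def adjusted_handle_time(conv: CachedConversation) -> float:
    """AHT = (Actual_wait_time * Agents_active_on_queue) / Position_In_Queue."""
    return (conv.actual_wait_time * conv.agents_active_on_queue
            / conv.position_in_queue)


def predicted_aht(conversations: Sequence[CachedConversation]) -> float:
    """Median of the per-conversation AHTs (raises if the cache is empty)."""
    return median(adjusted_handle_time(c) for c in conversations)


cache = [
    CachedConversation(120.0, 4, 2),   # AHT = 240.0
    CachedConversation(90.0, 3, 3),    # AHT = 90.0
    CachedConversation(200.0, 5, 4),   # AHT = 250.0
]
print(predicted_aht(cache))  # 240.0
```

Using the median rather than the mean keeps a single unusually long or short conversation from dominating the prediction.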

EWT = PredictedAHT * Position_In_Queue / Number_Of_Agents

PredictedAHT: A representative service time for the call to be handled

Position_In_Queue: The position of the current interaction in the queue

Number_Of_Agents: The total number of activated and online agents on the target queue
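As a quick worked example of the EWT formula, the sketch below plugs in illustrative numbers. The function name and the input values are assumptions, not taken from any product API.

```python
def estimated_wait_time(predicted_aht: float,
                        position_in_queue: int,
                        number_of_agents: int) -> float:
    """EWT = PredictedAHT * Position_In_Queue / Number_Of_Agents."""
    if number_of_agents <= 0:
        raise ValueError("number_of_agents must be positive")
    return predicted_aht * position_in_queue / number_of_agents


# A caller at position 5, with 4 agents online and a predicted AHT of 240 s
print(estimated_wait_time(240.0, 5, 4))  # 300.0
```

With no agents online the formula is undefined, so the sketch raises an error instead of returning a number; how a real system should react in that case is outside what this article describes.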

Sometimes, actual interaction-handling behavior can be far off the numbers computed with the above formula. Two guardrails are employed to safeguard the computation:

1. 1.5x Inter-Quartile Range (IQR): When the calculated EWT is above the upper bound, it is capped at the upper bound. Conversely, if it is below the lower bound, it is raised to the lower bound.

UpperEWT = [Median AWT] + 1.5 * IQR

LowerEWT = [Median AWT] - 1.5 * IQR

IQR = AWT Quartile3 - AWT Quartile1

Here, AWT is the actual wait time, and the IQR is measured from the AWTs of the recently handled interactions.

2. max(LowerEWT, min(historical AWTs evaluated)): This prevents LowerEWT from getting unreasonably low, say < 1.
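Both guardrails can be combined into a single clamping step, sketched below. The function name is hypothetical, and the quartile method is an assumption: the article does not say how quartiles are computed, so this uses the default ("exclusive") method of Python's `statistics.quantiles`, which needs at least two samples.

```python
from statistics import median, quantiles
from typing import Sequence


def clamp_ewt(ewt: float, recent_awts: Sequence[float]) -> float:
    """Clamp a raw EWT to [LowerEWT, UpperEWT] derived from recent AWTs."""
    # Quartile method is an assumption; other methods give slightly
    # different Q1 and Q3 values on small samples.
    q1, _, q3 = quantiles(recent_awts, n=4)
    iqr = q3 - q1
    mid = median(recent_awts)

    upper = mid + 1.5 * iqr
    # Guardrail 2: never let the lower bound fall below the smallest
    # wait time actually observed.
    lower = max(mid - 1.5 * iqr, min(recent_awts))

    return max(lower, min(ewt, upper))


awts = [10.0, 20.0, 30.0, 40.0, 50.0]   # median 30, Q1 15, Q3 45, IQR 30
print(clamp_ewt(500.0, awts))  # 75.0  (capped at the upper bound)
print(clamp_ewt(2.0, awts))    # 10.0  (raised to the smallest observed AWT)
print(clamp_ewt(40.0, awts))   # 40.0  (already within bounds)
```

In this example the plain IQR lower bound would be negative (30 - 45), which is exactly the situation the second guardrail is meant to catch.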

Combining the best of queuing theory and delay history can help solve the problem of estimating wait time. However, a multi-skill call center would require more complex algorithms.

Certain best practices help the EWT computation:

1. Consider only interactions that were completed, not abandoned.

2. Exclude interactions with a very low Actual Wait Time, say < 1 sec.

3. In a cold-start case, when no cached interactions are available, a simple queuing formula can be used instead.
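The first two practices amount to a filter over the cached samples, sketched below. The `Interaction` class, its field names, and the 1-second default threshold are illustrative assumptions. The cold-start fallback is left out because the specific formula is not given here.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Interaction:
    """Hypothetical record of one past interaction."""
    actual_wait_time: float   # seconds
    abandoned: bool           # True if the customer left before an answer


def usable_samples(interactions: Sequence[Interaction],
                   min_wait_seconds: float = 1.0) -> List[Interaction]:
    """Keep completed interactions whose wait was not trivially short."""
    return [
        i for i in interactions
        if not i.abandoned and i.actual_wait_time >= min_wait_seconds
    ]


history = [
    Interaction(45.0, abandoned=False),   # kept
    Interaction(30.0, abandoned=True),    # dropped: abandoned
    Interaction(0.4, abandoned=False),    # dropped: waited under 1 second
]
print(len(usable_samples(history)))  # 1
```

If the filter returns an empty list, the caller is in the cold-start case described in point 3.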

Wait Time! source: https://www.flickr.com/photos/milst1/14843708285


Nisha Manickam

ML Engineer, Data Enthusiast and forever Chai Lover :)