Google Cloud Contact Center Artificial Intelligence (CCAI): A Managerial View

Rubens Zimbres
12 min read · Feb 3, 2023

Contact Centers have become one of the most important ways to provide customer support in recent years, whether for telecommunications, retail, finance or any other service or product provider. In a world where online presence is multiplying and facilitating transactions and interactions, support for these transactions must be fast and efficient. From 2020 to December 2022, customer support calls via contact centers increased by 48%. It is estimated that by 2024, up to 50% of all calls will be made without human intervention.

However, this industry has some challenges: employee turnover is high, and intense competition between companies, combined with customers’ reduced purchasing power, has significantly raised customer expectations. Thus, there is a trade-off between costs, profitability and the final quality offered to the customer. There is no room for operational inefficiencies.

CPI. Source: US Bureau of Labor Statistics

Thus, it’s increasingly necessary to analyze customer feedback, behavior, feelings and expectations, so that a scalable 24/7 service can be offered at a competitive cost while maintaining the quality of the individual attention. On the other hand, it is also important to analyze contact center KPIs at the manager, supervisor, attendant and phone call levels.

This apparent contradiction between quality strategy (scope) and solution scaling (scale) (CHANDLER, 1994) can be solved with Artificial Intelligence and Machine Learning, without overloading employees and without losing the human touch, which is often the key to a company’s sustainable competitive advantage (BARNEY, 2001). The idea is to free up employee time and make each employee more efficient, eliminating tasks that can be automated so that attendants can perform their duties with greater quality (RISCHMEYER, 2021). This factor, per se, can decrease agent turnover by 5 to 7% and increase operational efficiency by 10 to 25%.

One of the ways to remove tedious tasks is through automation. Simple tasks can be automated, so that virtual agents (chatbots) solve them more easily and quickly. You can connect Dialogflow, Google’s chatbot/voicebot service, to a company’s FAQ (Frequently Asked Questions) knowledge base, so that contact center attendants don’t waste time searching the knowledge base for an answer. This by itself reduces silence time in calls. The chatbot used by the contact center attendant can consult a database, suggest the best date for a technical visit, or provide the best way to configure a smartphone, complementing the attendant’s service and reducing idle time in the phone call. The use of chatbots in this basic initial service can reduce the total volume of calls handled by the Contact Center by up to 50%.
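As an illustration of how the chatbot might consult a back-end system, here is a minimal sketch of a fulfillment webhook handler in Python. It assumes the Dialogflow ES webhook JSON format (`queryResult.intent.displayName`, `queryResult.parameters` and a `fulfillmentText` reply). The intent name `schedule_visit`, the `city` parameter and the in-memory `NEXT_VISIT_SLOTS` table are hypothetical stand-ins for a real scheduling database.

```python
# Hypothetical stand-in for a scheduling database (an assumption, not a real API).
NEXT_VISIT_SLOTS = {
    "sao paulo": "2023-02-10 09:00",
    "campinas": "2023-02-11 14:00",
}


def handle_webhook(request_json: dict) -> dict:
    """Build a Dialogflow ES style webhook response for one request body."""
    query_result = request_json.get("queryResult", {})
    intent = query_result.get("intent", {}).get("displayName", "")
    params = query_result.get("parameters", {})

    if intent == "schedule_visit":  # hypothetical intent name
        city = str(params.get("city", "")).strip().lower()
        slot = NEXT_VISIT_SLOTS.get(city)
        if slot:
            text = f"The next available technical visit in {city.title()} is {slot}."
        else:
            # No slot found: hand the conversation over to a human attendant.
            text = "I could not find an available slot. Let me transfer you to an agent."
    else:
        text = "Sorry, I did not understand. Could you rephrase?"

    return {"fulfillmentText": text}
```

In a real deployment this function would sit behind an HTTPS endpoint (for example on Cloud Run) and query the company's own scheduling system instead of a dictionary.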

In addition, as Dialogflow ES and CX have pay-as-you-go billing, the contact center company is prepared for peaks and troughs in demand, without the critical need to hire or fire employees as demand fluctuates with market seasonality.

Chatbots in Dialogflow can serve as the customer’s contact gateway, answering questions and queries and even collecting data on a possible complaint that will be transferred to a human attendant. This application has three main advantages: it considerably reduces the volume of calls, and with it the cost of the Contact Center; it greatly increases the efficiency of this initial service (provided that the best development practices and architecture of the solution are followed); and it allows human agents to dedicate themselves to higher value-added tasks in contact with the customer, generating more satisfaction for both.

Another favorable point of Google Cloud’s Contact Center Artificial Intelligence (CCAI) is that it is a cross-platform solution that also supports hybrid and multicloud deployments. The chatbot in Dialogflow can be embedded in a web page, accessed from a cell phone, WhatsApp or Facebook Messenger, or even integrated with Avaya, Genesys or Cisco switches. All these platforms support the voicebot, where customers interact with Dialogflow by voice through a regular telephone.

Google Cloud CCAI Partners

All these interactions can be recorded in a database and fed to dashboards in real time. The analytics of this solution has a reasonably broad scope if we consider that the chatbot will be integrated with a database of customers and suppliers. You can view the questions consumers ask most frequently and generate intelligence from the metadata of interactions. Given that we have a database of end customers (consumers) and Contact Center clients (service/product providers), we can also generate different KPIs, such as the rate of callbacks from the same customer, silence time, TMA (average service time) per human attendant, call center, supervisor and manager, and which products/services generate the most problems. You can also create a word cloud to see the most frequent topics (also for voice, with transcription), all in real time.
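To make two of these KPIs concrete, the sketch below computes the average service time (TMA) per attendant and the callback rate from a list of call records. The record layout `(customer_id, agent_id, handle_time_seconds)` and the sample data are hypothetical; in the architecture described here the same aggregation would more likely be a SQL query over BigQuery.

```python
from collections import defaultdict

# Hypothetical call records: (customer_id, agent_id, handle_time_seconds)
calls = [
    ("c1", "agent_a", 300),
    ("c2", "agent_a", 420),
    ("c1", "agent_b", 180),
    ("c3", "agent_b", 240),
]


def average_handle_time(records):
    """Average service time (TMA) in seconds per agent."""
    totals = defaultdict(lambda: [0, 0])  # agent -> [sum_seconds, n_calls]
    for _customer, agent, seconds in records:
        totals[agent][0] += seconds
        totals[agent][1] += 1
    return {agent: total / n for agent, (total, n) in totals.items()}


def callback_rate(records):
    """Share of customers who called more than once."""
    counts = defaultdict(int)
    for customer, _agent, _seconds in records:
        counts[customer] += 1
    repeat = sum(1 for n in counts.values() if n > 1)
    return repeat / len(counts) if counts else 0.0
```

With the sample data, `agent_a` averages 360 seconds per call, `agent_b` averages 210 seconds, and one customer in three called back.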

The fact that the solution can be implemented in real time is often critical. If agents are making mistakes in the service, this cannot be discovered a week later. The speed and accuracy of the generated insights are essential for the manager’s confidence in an Artificial Intelligence solution, as his/her expectation is that the solution will be better than what he/she currently has. The benefit gained from implementing CCAI must outweigh the switching cost of automating some tasks. Usually, a Contact Center Artificial Intelligence (CCAI) solution offers an average ROI (Return on Investment) of 130% with a payback time of 9 months, considering a Contact Center with 2,000 service positions.
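For readers who want to run their own numbers, the two metrics can be computed as in the sketch below. It uses the common definitions of ROI (net benefit divided by investment) and simple payback (investment divided by monthly net benefit). The amounts in the usage note are purely illustrative and are not the figures behind the estimate above.

```python
def roi(total_benefit: float, investment: float) -> float:
    """Return on investment as a fraction (1.0 means 100%)."""
    return (total_benefit - investment) / investment


def payback_months(investment: float, monthly_net_benefit: float) -> float:
    """Months needed for accumulated net benefit to cover the investment."""
    return investment / monthly_net_benefit
```

For example, a hypothetical investment of 900,000 that yields a net benefit of 100,000 per month pays back in `payback_months(900_000, 100_000)` = 9 months, and over 24 months the same project gives `roi(2_400_000, 900_000)`, or roughly 167%.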

The use of Artificial Intelligence in the form of automated attendants does not compromise the humanization of service. In fact, a good chatbot makes it difficult to distinguish whether we are talking to a human or a machine. There are several ways to humanize a chatbot, but the crucial point is attention, speed and an efficient service.

This increase in efficiency decreases the transaction costs of the customer-server interaction (WILLIAMSON, 1979) and reduces the TMA of calls by up to 40%. Well-executed analytics of customer service data can reduce recruitment costs by an average of 25%, as it reveals the best profile for each service activity. In addition, insights from analytics can guide new training efforts in the Contact Center, qualifying the workforce to follow the practices recommended by the company and increasing the final quality offered to consumers.

All this increase in efficiency has its counterpart in the architecture of the solution. In order to develop an appropriate CCAI, best practices of data engineering must be followed, as well as infrastructure modernization with the adoption of cloud computing or even a hybrid approach (supported by Anthos, Google’s container-based platform for hybrid and multicloud environments). Best practices must also be followed in application development to ensure the availability and scaling of the solution, with a final focus on productivity, assertiveness and analytics leveraged by Artificial Intelligence.

Implementing a CCAI solution takes less than 6 months and in addition to implementing the chatbot/voicebot and analytics, you can also work on predictive monitoring/auditing models using Machine Learning.

Every Contact Center has a history of the quality monitoring that was carried out. The monitors have the voice calls and the corresponding scores of each one of the aspects analyzed. In other words, the Contact Center already has the database labeled as Compliant (1) and Non-compliant (0) for each feature, so that a Classification model can be trained with voice calls and monitoring results.

A Speech-to-Text algorithm can be used to transcribe these calls into text, and Natural Language Processing (NLP) algorithms associated with Neural Networks can then be used to train a supervised (classification) model, so that all new monitoring is done by the algorithm. This strategy considerably increases the share of calls that are monitored, as it overcomes manpower limitations.
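The idea of training a classifier on transcripts labeled as Compliant (1) or Non-compliant (0) can be sketched in a few lines. To keep the example self-contained, it swaps the neural network for a much simpler multinomial Naive Bayes model written in plain Python. The four training transcripts are hypothetical, and a production model would need far more data and a stronger architecture.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus sum of smoothed log likelihoods of each token.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                count = self.word_counts[label][token] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical transcripts with monitoring labels: 1 = compliant, 0 = non-compliant.
train_texts = [
    "good morning thank you for calling how can i help",
    "thank you for your patience is there anything else",
    "i do not know call back later",
    "that is not my problem goodbye",
]
train_labels = [1, 1, 0, 0]
model = NaiveBayes().fit(train_texts, train_labels)
```

Calling `model.predict(...)` on a new transcript returns the label whose training vocabulary it most resembles, which is the same role the supervised model plays in the monitoring workflow described above.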

Dialogflow’s technology allows voice detection according to context, supports more than 120 languages and has an excellent Natural Language Understanding (NLU) algorithm. Dialogflow also allows periodic retraining on phrases that were detected with low confidence. Thus, through human feedback, the assertiveness of the chatbot is increased. The solution goes from an interface with buttons that guide interactions and is based on decision trees, to a conversational agent with a fluid conversation, similar to a human.

A chatbot based on Dialogflow can incorporate all existing technology in Artificial Intelligence. An SSN can be validated by uploading the document and submitting it to a Computer Vision API on Google Cloud itself, and PDFs can be generated from user inputs. In addition, it has built-in sentiment analysis, knowledge connectors to access customers’ purchase history, and integration with Google Workspace.

Dialogflow offers multiple SDKs, i.e. client libraries for development (C++, C#, Go, Java, Node.js, PHP, Python and Ruby), and is platform agnostic.

In a traditional Contact Center, the agents’ speech scripts are followed most of the time, but this automation of human speeches can generate an interaction with a feeling of artificiality. A chatbot, for example, also follows scripts, but adapts them according to customer demand, which ends up generating a more fluid and efficient interaction.

Another application of Artificial Intelligence is the analysis of the attendants’ adherence to the scripts suggested. It is possible, through Natural Language Processing, to determine on a scale of 0 to 100, which attendants follow the scripts strictly, and which do not. Voice call data is collected, transcribed by Google Cloud’s Speech-to-Text API, and the NLP algorithm provides a score for each operator. Among other things, this score helps the supervisor/manager to identify team members who need training.
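One simple way to obtain a 0 to 100 adherence score is to compare the words of the transcript with the words of the suggested script. The sketch below uses cosine similarity between bag-of-words counts. This is an assumed, deliberately basic technique chosen for illustration, not necessarily the NLP method used in a real deployment, where sentence embeddings would likely capture paraphrases better.

```python
import math
from collections import Counter


def adherence_score(transcript: str, script: str) -> float:
    """Cosine similarity between word counts, scaled to the 0-100 range."""
    a = Counter(transcript.lower().split())
    b = Counter(script.lower().split())
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 100.0 * dot / norm if norm else 0.0
```

A transcript identical to the script scores 100, a transcript that shares no words with it scores 0, and partial overlaps fall in between, which gives the supervisor a ranking of operators to review.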

Regarding the Speech-to-Text API, Google recommends that the audio be stereo (two channels) at 16 kHz to obtain the best possible transcription quality and the best speaker separation. However, in Brazil, due to cost and storage space issues, the vast majority of Contact Centers have 8 kHz mono (single-channel) audio. This requires additional effort from the Machine Learning team in tuning the hyperparameters of the transcription, so that the final result has good quality.
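As a rough illustration of the parameters involved, the sketch below builds a recognition configuration as a plain dictionary using the field names of the Speech-to-Text v1 REST API. The helper `build_recognition_config` is hypothetical, the `LINEAR16` encoding assumes uncompressed WAV audio, and the availability of the `phone_call` model for a given language is an assumption that should be checked against the current documentation.

```python
def build_recognition_config(sample_rate_hz: int, channels: int, language_code: str) -> dict:
    """Return a RecognitionConfig-like dict with Speech-to-Text v1 REST field names."""
    config = {
        "encoding": "LINEAR16",  # assumes uncompressed 16-bit PCM (WAV) audio
        "sampleRateHertz": sample_rate_hz,
        "audioChannelCount": channels,
        "languageCode": language_code,
        "model": "phone_call",  # telephony-oriented model; availability varies by language
        "useEnhanced": True,
    }
    if channels >= 2:
        # Stereo: agent and customer are on separate channels.
        config["enableSeparateRecognitionPerChannel"] = True
    else:
        # Mono: rely on speaker diarization to tell agent from customer.
        config["diarizationConfig"] = {
            "enableSpeakerDiarization": True,
            "minSpeakerCount": 2,
            "maxSpeakerCount": 2,
        }
    return config


# Typical Brazilian contact center recording: 8 kHz, one channel.
mono_config = build_recognition_config(8000, 1, "pt-BR")
```

The design choice here is that mono recordings cannot use per-channel recognition, so the configuration falls back to diarization with exactly two expected speakers.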

A mono audio, with one channel

The chatbot, combined with a vector similarity search algorithm such as ScaNN in Vertex AI Matching Engine, can fetch data from a knowledge base in less than 100 milliseconds and scales to millions of queries per second (qps), which gives the solution an advantage in efficiency and scalability. Once the chatbot has access to the CRM, the attendant knows the customers’ past purchases, demographics and preferences. Thus, it can also recommend products or services that complement the solution to the problem reported by the consumer, helping to overcome the cognitive limitation arising from an excess of possible choices, as it works side by side with structured databases containing the company’s entire catalog of products or services. ScaNN can therefore function as a recommendation or search algorithm.
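The principle behind vector similarity search can be shown with an exact, brute-force version in plain Python. ScaNN itself is an approximate nearest-neighbor method built for very large indexes, so this sketch only illustrates the idea of ranking knowledge-base entries by similarity to a query vector. The three-dimensional embeddings and article names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(query, index, k=2):
    """Return the k (doc_id, similarity) pairs most similar to the query."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Hypothetical 3-dimensional embeddings of knowledge-base articles.
kb_index = {
    "reset_password": [0.9, 0.1, 0.0],
    "configure_apn": [0.1, 0.9, 0.2],
    "billing_dispute": [0.0, 0.2, 0.9],
}
```

The same ranking can serve as a search (query is the embedded customer question) or as a recommendation (query is the embedded purchase history), which is why one index can back both use cases.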

Google Cloud solutions easily scale to astronomical demands, but also support low demand. Given that in the cloud we work with modules of APIs and services, building blocks, we can easily adjust the cloud architecture to, for example, only 20,000 users per month. For this, we use containers, such as the Cloud Run service, which allows scaling to zero: when the customer service/chatbot system is idle, there are no instances running in the cloud, and consequently there is no cost. Cloud Run can host ScaNN or any other Machine Learning algorithm, and can even be used as a backend for a web application, as an alternative to Compute Engine (virtual machines).

Another container option is GKE (Google Kubernetes Engine), which offers more advanced management capabilities than Cloud Run and is further simplified in its Autopilot mode, in which Google manages the cluster infrastructure. In this case, although the system does not scale to zero in terms of cost, it offers advantages such as billing per pod (the units in which containers run on the nodes of each cluster), advanced security management, autoscaling and support for migrating workloads.

This complementation of operations with the aid of Artificial Intelligence changes some aspects of the solution offered by Contact Centers without, however, disrupting operations. The robotic voice of traditional automated attendants gives way to a voice almost identical to that of humans, and you can even choose the personality, gender and one of 180 different voice tones for the chatbot. This technology is powered by the deep neural network WaveNet (OORD et al., 2016), from Google subsidiary DeepMind, which offers the best voice quality for agents nowadays.

The rationale of bots based on rules and decision trees is then replaced by virtual agents that use Artificial Intelligence, with powerful Machine Learning algorithms embedded in natural conversational flows. Dialogflow offers self-training of interaction models, which makes its assertiveness even greater as the volume of data grows.

Dialogflow training

Another critical factor for the success of any solution is cybersecurity. A human attendant, for example, cannot have digital access to a system for checking balances in bank accounts. A virtual agent, on the other hand, has fixed connection rules, secure connections and a limited scope of authorizations. Through IAM (Identity and Access Management), the role of the chatbot and its interfaces can be restricted, and each component of the solution is protected by different rules and services, so that each user has a different role or scope of access. Cloud Armor protects the external-IP Load Balancer against the OWASP Top 10 threats, firewall rules are applied at each layer of the solution, and services such as Authorized Views in BigQuery, Security Command Center, Web Security Scanner, Firewall Insights and Data Loss Prevention provide additional protection. The chatbot in Dialogflow is protected by reCAPTCHA Enterprise, which protects your business from credential stuffing.

The architecture is very similar to the one I presented in my other article, about the Virtual Career Center (VCC), available here on Medium at this link. In this specific case of CCAI, we have the following architecture:

CCAI Architecture

Full size image available at this link.

Consumers interact with the chatbot in Dialogflow to resolve trivial queries, and Contact Center agents turn to Dialogflow for knowledge base support. Audio files are sent via sFTP to an instance on Compute Engine which transfers them to Google Cloud Storage.

A data scientist, using Visual Studio Code or Jupyter, accesses Compute Engine or Vertex AI via internal IP with Identity-Aware Proxy and TCP Forwarding to train Machine Learning models, saving structured output data in BigQuery. The trained model is then saved in Storage and, through GKE (Google Kubernetes Engine), an endpoint is made available for inference.

As soon as a new audio file arrives in Storage via sFTP, it triggers a Cloud Function that transcribes the audio and submits the transcript to the GKE endpoint. Results are saved in BigQuery and made available to Looker Studio. Different Authorized Views in BigQuery define the level of access for each of the Contact Center supervisors and managers.
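The flow of this step can be sketched as a single handler. The example assumes the event carries `bucket` and `name` keys, as in a Cloud Storage triggered background function. The callables `transcribe`, `score` and `save` are hypothetical stand-ins, passed in as arguments, for the Speech-to-Text call, the request to the GKE inference endpoint and the BigQuery insert, so the sketch makes no claim about the actual client code.

```python
def handle_new_audio(event: dict, transcribe, score, save) -> dict:
    """Process one Cloud Storage style event for a newly uploaded audio file.

    transcribe, score and save are injected callables (hypothetical stand-ins
    for Speech-to-Text, the GKE inference endpoint and a BigQuery insert).
    """
    uri = f"gs://{event['bucket']}/{event['name']}"
    transcript = transcribe(uri)       # audio file -> text
    result = score(transcript)         # text -> compliance label
    row = {"audio_uri": uri, "transcript": transcript, "compliant": result}
    save(row)                          # persist for dashboards
    return row
```

Injecting the three services as parameters keeps the orchestration logic easy to test locally with simple stubs before wiring in the real cloud clients.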

The users’ web application interface is then protected by Cloud Armor and its protection policies, an external layer to the HTTPS Load Balancer that, together with firewall rules and a service perimeter inside Google Cloud, protects the entire infrastructure of the solution.

All of this solution is monitored by Cloud Monitoring and Logging, generating email alerts and real-time API traffic dashboards. The service perimeter separates the infrastructure from the external environment and the Access Context Manager manages internal access, along with user roles defined in IAM (Identity and Access Management).

Nowadays, with advances in technology, algorithms, cloud services, and with a team of good developers, engineers and data scientists, absolutely any imaginable solution can be developed. Existing solutions can be customized with new and innovative features, generating greater customer and employee satisfaction and increasing the company’s profit.


Barney, Jay B. “Resource-based theories of competitive advantage: A ten-year retrospective on the resource-based view.” Journal of management 27.6 (2001): 643–650.

Chandler, Alfred D. “Scale and Scope: The Dynamics of Industrial Capitalism.” Harvard University Press (1994).

Oord, A. V. D., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., … & Kavukcuoglu, K. (2016). “Wavenet: A generative model for raw audio.” arXiv preprint arXiv:1609.03499.

Rischmeyer, Nadine. “Machine Learning as Key Technology of AI: Automated Workforce Planning.” Digitalization in Healthcare: Implementing Innovation and Artificial Intelligence (2021): 235–244.

Rolim, Gerson, Zimbres, Rubens. “Como melhorar a Produtividade e o Atendimento ao Cliente por meio da Transformação Digital.” Economia Digital (2019). Available at:

Williamson, Oliver E. “Transaction-cost economics: the governance of contractual relations.” The journal of Law and Economics 22.2 (1979): 233–261.



Rubens Zimbres

I’m a Senior Data Scientist and Google Developer Expert in ML and GCP. I love studying NLP algos and Cloud Infra. CompTIA Security +. PhD.