This article is part of GMI Cloud’s technical demo series.
With the recent release of GPT-4o, AI voice agents have moved into the public eye. For many businesses, however, this form of AI has long been on the radar as a tool to drive growth and profitability by automating and enhancing customer interactions and streamlining internal operations. In this article, we go over how to create an AI voice agent using GMI Cloud, with all the tools you need in one place.
Creating AI Voice Agents with GMI Cloud
At their core, AI voice agents are built around LLMs but need additional layers to handle speech. A voice agent takes voice as input, converts it to text with automatic speech recognition (ASR), processes that text with an LLM, and returns the response as speech using text-to-speech (TTS). Additional engines can customize responses and add features such as emotion and interruption management. GMI Cloud has assembled all the software layers needed to build an AI voice agent from existing open source models.
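The three stages described above can be sketched as a small pipeline. This is a minimal illustration, not GMI Cloud’s implementation: the `VoiceAgent` class and the `fake_*` stub functions are hypothetical names, and in a real deployment each callable would wrap the speech recognition, LLM, and speech synthesis models that the template provides.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceAgent:
    """Chains the three stages of a voice agent: speech in, text reasoning, speech out."""
    transcribe: Callable[[bytes], str]   # speech recognition: audio -> text
    generate: Callable[[str], str]       # LLM: user text -> reply text
    synthesize: Callable[[str], bytes]   # speech synthesis: reply text -> audio

    def handle_turn(self, audio_in: bytes) -> bytes:
        """Run one conversational turn through all three stages."""
        user_text = self.transcribe(audio_in)
        reply_text = self.generate(user_text)
        return self.synthesize(reply_text)


# Stub stages so the sketch runs without any models installed (placeholders only).
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


agent = VoiceAgent(transcribe=fake_asr, generate=fake_llm, synthesize=fake_tts)
reply_audio = agent.handle_turn(b"hello")
print(reply_audio)
```

Keeping each stage behind a plain callable means the LLM (or any other stage) can be swapped without touching the rest of the pipeline.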
Demo Video
Step-by-Step Guide:
1. Log in to the GMI Cloud platform
- Create an account or log in using a previously created account
2. Launch a container
- Navigate to the ‘Containers’ page using the navigation bar on the left side of the page
- Click the ‘Launch a Container’ button located in the upper right-hand corner
3. Choose your model template and parameters
- In the first dropdown menu, select the GMI Cloud voice agent template that includes ASR and TTS. (In the demo, we use ChatGLM-6B as the LLM for the agent, but it can be replaced with another model such as Llama 3.)
- Under the ‘Select Hardware Resources’ section, select the type of hardware you’d like to deploy such as the NVIDIA H100. This will auto-populate certain parameters
- Enter details for storage, authentication, and container name
4. Launch the container
- At the bottom of the page click ‘Launch Container’
- Returning to the ‘Containers’ page, you will be able to see the container you just created with the container name you provided
- Click the Jupyter Notebook icon to connect to your container
- Here, you can import common libraries and enter your Hugging Face tokens
5. Add functions and test
- Within the Jupyter Notebook workspace, add a transcribe function and a voice response function
- Run functions using a Gradio UI
- Run on a public UI for testing
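Steps 4 and 5 could look roughly like the following inside the notebook. This is a sketch under stated assumptions, not the code from the demo: `login_to_hugging_face`, `make_respond`, and `build_demo` are hypothetical helper names, the three stage functions passed to `make_respond` stand in for whatever models the template provides, and the `HF_TOKEN` environment variable is an assumed place to keep the token. It also assumes `gradio` and `huggingface_hub` are installed in the container; both are imported lazily so the core logic can be tried without them.

```python
import os


def login_to_hugging_face(env_var: str = "HF_TOKEN") -> bool:
    """Log in with a token read from the environment; return False if none is set."""
    token = os.environ.get(env_var)
    if not token:
        return False
    from huggingface_hub import login  # assumes huggingface_hub is installed
    login(token=token)
    return True


def make_respond(transcribe, generate_reply, synthesize):
    """Combine the transcribe and voice response functions into one UI callback.

    transcribe(audio_path) -> str, generate_reply(text) -> str, and
    synthesize(text) -> path of an audio file are all placeholders.
    """
    def respond(audio_path):
        if audio_path is None:  # nothing recorded or uploaded yet
            return "", None
        user_text = transcribe(audio_path)
        reply_text = generate_reply(user_text)
        return reply_text, synthesize(reply_text)

    return respond


def build_demo(respond):
    """Wrap the callback in a Gradio interface with audio in, text and audio out."""
    import gradio as gr  # assumes gradio is installed
    return gr.Interface(
        fn=respond,
        inputs=gr.Audio(type="filepath", label="Speak or upload audio"),
        outputs=[gr.Textbox(label="Agent reply"), gr.Audio(label="Spoken reply")],
    )
```

In the notebook, calling `build_demo(make_respond(my_asr, my_llm, my_tts)).launch(share=True)` would start the UI, where `my_asr`, `my_llm`, and `my_tts` are your own functions. Passing `share=True` asks Gradio to create a temporary public link, which matches the public testing step above.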
The New Advent of AI Voice Agents: Transforming Interactions and Operations
The use cases for AI voice agents are extremely broad. In short, any service or function that is based on dialogue can now, in theory, be handled by an AI voice agent.
Here are just a few examples of what AI voice agents can do to benefit businesses:
- Eliminate the need for extensive call centers and multi-lingual staffing, enabling businesses to expand their global reach and provide 24/7 high-quality service without proportional increases in costs. It’s estimated that AI at scale can increase customer service productivity by 30–50%.
- Streamline sales processes such as lead qualification, follow-up scheduling, and data entry into CRM systems, improving sales efficiency and data accuracy by up to 10%.
- Serve as a super-powered personal assistant for executives and other employees alike.
- Free up human staff for more complex tasks and reduce operational costs, for example by using voice agents to handle frequent HR requests or IT troubleshooting.
Why GMI Cloud
Accessibility:
GMI Cloud ensures broad access to the latest NVIDIA GPUs, including the H100 and H200 models. Leveraging our Asia-based data centers and deep relationships with NVIDIA as a Certified Partner, we provide unparalleled GPU access to meet your AI and machine learning needs.
Ease of Use:
Our platform simplifies AI deployment through a rich software stack designed for orchestration, virtualization, and containerization. GMI Cloud solutions are compatible with NVIDIA tools like TensorRT and come with pre-built images, making it easy to get started and manage your AI workflows efficiently.
Performance:
GMI Cloud delivers the high-performance computing essential for training, inference, and fine-tuning of AI models. Our infrastructure is optimized for cost-effective and efficient operations, allowing you to maximize the potential of models like Llama 3.
GMI Cloud provides a full-stack AI platform for all your AI needs, making it the ideal choice for building features such as a voice agent, which requires several layers of functionality. With our integrated solutions, you can streamline your AI processes, improve performance, and ensure the security and compliance of your operations.