Introducing LangChain toolkits for Azure Cognitive Services
Building a multimodal application powered by GPT
In my previous article, I introduced LangChain agents as applications powered by LLMs and integrated with a set of tools like search engines, databases, websites, and so on. Within an agent, the LLM is the reasoning engine that, based on the user input, is able to plan and execute a set of actions that are needed to fulfill the request.
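The plan-and-execute loop described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `fake_llm_plan` is a hypothetical stand-in for the LLM, and the `search` and `calculator` tools are stubs invented for the example, not LangChain APIs.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tools: plain functions that take a string and return a string.
def search(query: str) -> str:
    return f"(stub) top result for '{query}'"

def calculator(expression: str) -> str:
    # Stub arithmetic: only handles "a + b" style input.
    left, right = expression.split("+")
    return str(int(left) + int(right))

TOOLS: Dict[str, Callable[[str], str]] = {"search": search, "calculator": calculator}

def fake_llm_plan(user_input: str, history: List[Tuple[str, str, str]]) -> Tuple[str, str]:
    """Stand-in for the LLM: decides the next (action, action_input)."""
    if history:
        # An observation is already available, so answer with it.
        return "final_answer", history[-1][2]
    if "+" in user_input:
        return "calculator", user_input
    return "search", user_input

def run_agent(user_input: str, max_steps: int = 5) -> str:
    """Ask the 'LLM' for an action, run the tool, feed the observation back."""
    history: List[Tuple[str, str, str]] = []
    for _ in range(max_steps):
        action, action_input = fake_llm_plan(user_input, history)
        if action == "final_answer":
            return action_input
        observation = TOOLS[action](action_input)
        history.append((action, action_input, observation))
    return "Stopped: step limit reached."

print(run_agent("2 + 3"))  # prints 5
```

In a real agent, the planning function is a prompt to the model that includes the tool descriptions and the history of observations; the loop around it stays essentially the same.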
LangChain offers a set of tools to integrate your agents with external services like Bing Search, local file systems, YouTube Search, and so on. Some integrations, however, like CSV, Pandas DataFrame, and Gmail, need a particular set of tools to function properly. Those sets of tools are called toolkits, and today we are going to explore the Azure Cognitive Services toolkit to extend LLMs with multimodal capabilities.
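To make the idea of a toolkit concrete, here is a minimal sketch in plain Python of a bundle of related tools that share one credential and are handed to an agent as a list. The `Tool` and `MultimodalToolkit` classes and the tool names are hypothetical, written for this illustration; the only detail borrowed from LangChain is the convention that a toolkit exposes its tools through a `get_tools()` method.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Tool:
    name: str
    description: str  # the text the LLM reads to decide when to use the tool
    func: Callable[[str], str]

class MultimodalToolkit:
    """Hypothetical toolkit: groups tools that rely on the same service key."""

    def __init__(self, api_key: str):
        self.api_key = api_key

    def get_tools(self) -> List[Tool]:
        # Each tool is a stub here; a real one would call the service
        # using self.api_key.
        return [
            Tool("image_analysis", "Describe the image at a given URL.",
                 lambda url: f"(stub) description of {url}"),
            Tool("speech_to_text", "Transcribe the audio file at a given path.",
                 lambda path: f"(stub) transcript of {path}"),
            Tool("text_to_speech", "Turn text into an audio file.",
                 lambda text: f"(stub) audio for '{text}'"),
        ]

toolkit = MultimodalToolkit(api_key="<your-key>")
for tool in toolkit.get_tools():
    print(f"{tool.name}: {tool.description}")
```

Grouping the tools this way means the agent is configured once with a single credential, and the LLM picks among the tools based on their descriptions.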
Azure Cognitive Services
Azure Cognitive Services are cloud-based AI services that help developers build cognitive intelligence into applications without requiring direct AI or data science expertise. They are available through REST APIs and client library SDKs in popular development languages.
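As an illustration of the REST option, the sketch below builds (but does not send) a request to the Computer Vision image analysis endpoint using only the Python standard library. The resource name, key, and image URL are placeholders, and the `v3.2` path is an assumption about the API version you target, so check it against your own resource before relying on it.

```python
import json
import urllib.request

def build_analyze_request(endpoint: str, key: str, image_url: str) -> urllib.request.Request:
    """Build a POST request asking the service to describe an image."""
    # Assumed API version and feature; adjust to match your resource.
    url = f"{endpoint.rstrip('/')}/vision/v3.2/analyze?visualFeatures=Description"
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # key from your Azure resource
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

request = build_analyze_request(
    "https://my-resource.cognitiveservices.azure.com",  # placeholder endpoint
    "<your-key>",                                       # placeholder key
    "https://example.com/cat.jpg",                      # placeholder image
)
print(request.get_method(), request.full_url)
# To actually call the service, pass the request to urllib.request.urlopen().
```

The client library SDKs wrap this same kind of call, handling authentication and response parsing for you.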
Azure Cognitive Services can be categorized into five main areas:
- Vision. These services provide access to advanced cognitive algorithms for processing images and returning information, such as face detection and recognition, computer vision, custom vision, and video indexer.
- Speech. These services add speech-enabled features to applications, such as speech-to-text, text-to-speech, speech translation, and speaker recognition.
- Language. These services provide natural language processing (NLP) features to understand and analyze text, such as language understanding, QnA Maker, translation, text analytics, and web language model.
- Decision. These services help monitor and detect abnormalities in data, moderate content, and personalize user experiences, such as anomaly detector, content moderator, and personalizer.
- Azure OpenAI Service. This service provides access to OpenAI models such as GPT-3, ChatGPT, and DALL-E.