Large Language Models (LLMs) in Google Cloud with Vertex AI

Mamata Panigrahi
KPMG UK Engineering
7 min read · Jan 12, 2024

From concept to code: Everything you need to know to start building an application with GenAI’s LLMs.

Photo by Andy Kelly on Unsplash

Adopting new technology is usually a good idea, so why not the newest buzzword: GenAI?
Large Language Models offer advantages and opportunities across many fields, and no one wants to be left out.
In this blog, we introduce GenAI’s LLMs and their potential, then delve into what Vertex AI and Vector Search contribute, to help you decide whether they fit your organization.

Embracing AI

Generative AI: The Persistent NEXT

Artificial Intelligence (AI) is the theory and development of systems that let machines mimic human thinking and problem-solving. Machine Learning (ML) is a subset of AI that gives machines the ability to learn without explicit programming.

A generative model uses the knowledge it acquires from the examples it is exposed to and produces entirely new content based on that information. Having seen only parts of what it is asked for, it can still synthesize new things such as text, video, images, code, and audio.

This characteristic of creating something new is why it’s referred to as “generative.”

Generative AI falls under the broad category of ML; ChatGPT and Bard are two such systems.

Language Models: Probability Engines

Language Models, or LMs, are the engines of generative AI that help predict and create coherent text. They predict the next word in a sequence of words.

These models are trained on large volumes of text so that they get better at predicting which word comes next, much like how we learn from the many materials we study.

An LM estimates the probability of a token (a word, character, or subword) occurring next in a sequence of tokens. Consider the following:

"When I see a rainbow in the sky, I _______."

Assuming the token is a word, the LM assigns a probability to the different continuations that can complete the sentence:

take a photo - 8.7%
make a wish - 6.3%
call a friend - 4.1%
dance with joy - 3.5%
sing a song - 2.9%

With such probabilities, the model can predict the next token in a sequence, which is the basis for generating text, translating languages, answering questions, etc.
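To make the idea of "probability of the next word" concrete, here is a minimal sketch that counts which word follows which in a tiny corpus and turns the counts into probabilities. The corpus and the function names (`train_bigram_model`, `next_word_probabilities`) are invented for this illustration; real LLMs learn these probabilities with neural networks over far more context than a single preceding word.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for every word, which words follow it in the corpus."""
    follow_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follow_counts[current][nxt] += 1
    return follow_counts


def next_word_probabilities(follow_counts, word):
    """Turn the raw follower counts for `word` into probabilities."""
    counts = follow_counts[word.lower()]
    total = sum(counts.values())
    if total == 0:
        return {}  # the word never appeared with a follower
    return {candidate: count / total for candidate, count in counts.items()}


# Hypothetical toy corpus, made up for this example.
corpus = [
    "i take a photo",
    "i take a walk",
    "i make a wish",
    "i take a photo",
]

model = train_bigram_model(corpus)
probs = next_word_probabilities(model, "a")
for candidate, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{candidate}: {p:.0%}")
```

Picking the highest-probability candidate again and again is the simplest way to generate text from such a model.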

Large Language Models: The foundation of GenAI

Deep Learning: It is like giving a robot a brain to understand things without us spelling out every detail (like teaching a kid the difference between a cat and a dog). Years ago we coded every rule by hand and passed it to the machine; with deep learning we instead give the machine loads of relevant data and let it figure out the difference on its own. It is a subset of ML that uses artificial neural networks, inspired by the human brain.

Large Language Models (LLMs) are a subset of Deep Learning trained on massive amounts of data that help generate human-like text and solve language-related problems like text classification, Q&A, document summarization, text generation, etc.

The relationship between them can be summarized as follows: LLMs are a subset of Deep Learning, which is a subset of ML, which in turn is a subset of AI.

Subsets

LLMs are the engines that power generative AI.

  • These models are pre-trained on large amounts of data, up to petabytes, and typically have billions of parameters: the values learned during training, whose count is a rough indicator of the model’s capacity.
  • They need minimal task-specific training data thanks to zero-shot and few-shot learning, which means they can produce output for tasks they were not expressly trained on.
  • They can be fine-tuned as per business requirements, meaning you can develop your generative AI apps by further training a model with your own data and tweaking it to your needs. The base models themselves are trained on a massive dataset of text and code.
  • Developers and enthusiasts need no expert-level ML or training knowledge; what is required is careful prompt design. It is a process of trial and error in which we refine the input to the GenAI model until we achieve the desired result, just like giving a proper query to Google Search (you won’t get details of a cat if you search for a dog). Or try asking Bard for a diagram of an ML pipeline: it won’t produce one.
  • Several frameworks can be used alongside prompt design to create LLM-powered applications. One widely preferred option is LangChain, which provides an interface to LLMs that takes a string as input and returns a string as output. Pydantic is widely used with LangChain to validate data, helping shape the input and output of our models.
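As a small illustration of few-shot prompt design, the sketch below assembles an instruction, a couple of worked examples, and a new query into a single prompt string. The helper `build_few_shot_prompt`, the reviews, and the labels are all hypothetical; the resulting string is what you would pass to whichever LLM you use.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples and a new query into one prompt."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # End on an open "Sentiment:" slot for the model to complete.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)


# Hypothetical examples, invented for this sketch.
examples = [
    ("The delivery was quick and the packaging was neat.", "positive"),
    ("The app crashes every time I open it.", "negative"),
]

prompt = build_few_shot_prompt(
    "Classify the sentiment of each review as positive or negative.",
    examples,
    "Support resolved my issue within minutes.",
)
print(prompt)
```

Dropping the `examples` list turns the same prompt into a zero-shot one, which is a quick way to compare the two approaches during trial and error.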

Vertex AI: No-Code option for building models

GenAI models require several GPUs and substantial compute resources for training, and even more to scale up to the enterprise level.

To tackle all the technical hassle, including scaling and security, Google, as an AI-first company, offers Vertex AI, which lets developers focus on project development and experimentation without worrying about infrastructure.

Google Cloud provides several solutions on the Vertex AI platform to build and use AI with an open, responsible, and secure approach. By using the Vertex AI PaLM API and Codey APIs, we can customize LLMs for our AI-powered app. Below is sample code that uses the PaLM 2 text model:

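from google.cloud import aiplatform
from vertexai.language_models import TextGenerationModel
import langchain

# Load the PaLM 2 text model (assumes the Google Cloud project and
# credentials are already configured in the environment).
llm = TextGenerationModel.from_pretrained("text-bison@002")

# Send the prompt together with the decoding parameters.
response = llm.predict(
    "What is Bengaluru famous for?",
    max_output_tokens=256,  # upper bound on the length of the reply
    temperature=0.1,        # low value keeps the answer focused
    top_p=0.8,
    top_k=40,
)
print(response)

print(f"LangChain version: {langchain.__version__}")
print(f"Vertex AI SDK version: {aiplatform.__version__}")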

The end of the output will be similar to the following:

...

4. **Aerospace Industry:** Bengaluru is a significant center for the aerospace industry in India. It is home to several aerospace companies, including Hindustan Aeronautics Limited (HAL), the National Aerospace Laboratories (NAL), and the Indian Space Research Organisation (ISRO).

5. **Cultural Diversity:** Bengaluru is a cosmopolitan])
LangChain version: 0.0.323
Vertex AI SDK version: 1.39.0

We have used TextGenerationModel, a class from Vertex AI’s language models; text-bison@002 is the specific model version loaded through it.

There are several GenAI APIs and models, categorized by content type: text, chat, image, video, embeddings, code, and multimodal data.

Google provides Gemini and 130+ foundation models to choose from in Model Garden, where one can customize a model for the use case using various tuning options.

And not to be missed is Vertex AI Studio, where one can prototype and test models.
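The `temperature`, `top_k` and `top_p` arguments in the sample above control how the next token is picked from the model’s probability distribution. The sketch below shows the general idea on a handful of made-up scores; `filter_distribution` and the `logits` values are hypothetical, and the exact order and details of these steps inside Vertex AI may differ.

```python
import math


def filter_distribution(logits, temperature=1.0, top_k=40, top_p=0.8):
    """Return renormalised probabilities after temperature, top-k and top-p."""
    # Temperature (must be > 0): lower values sharpen the distribution.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    max_score = max(scaled.values())
    exps = {tok: math.exp(s - max_score) for tok, s in scaled.items()}
    total = sum(exps.values())
    probs = sorted(
        ((tok, e / total) for tok, e in exps.items()),
        key=lambda kv: kv[1],
        reverse=True,
    )
    # Top-k: keep only the k most likely tokens.
    probs = probs[:top_k]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in probs:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break
    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}


# Hypothetical raw scores for five candidate next words.
logits = {"take": 2.0, "make": 1.7, "call": 1.2, "dance": 1.0, "sing": 0.8}

print(filter_distribution(logits, temperature=0.1, top_k=40, top_p=0.8))
print(filter_distribution(logits, temperature=1.0, top_k=2, top_p=0.8))
```

With a low temperature almost all probability mass lands on the top candidate, so the output is nearly deterministic; a higher temperature leaves several candidates in play, and `top_k` and `top_p` then decide how many of them remain eligible for sampling.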

Vector Search

Vector Search stores and efficiently retrieves the vectors generated by a text embedding model, typically for similarity search or recommendation systems.

To understand it better let’s start with the basics:

Embeddings: An embedding is a way of representing data as points in space, created using ML techniques. When we provide text input to a text embedding model, what we get back is its vector representation, which is nothing but an array of floating-point numbers.

Then, by using similarity search, we can measure how similar texts or objects are by comparing the numerical distance between their vectors. That’s how we get movie recommendations that match our viewing history.

Consider the example below:
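One common way to compare two vectors is cosine similarity. The sketch below ranks a few titles against a viewer’s history; the three-dimensional vectors and titles are invented for illustration, whereas real embedding models return hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, made up for this example.
catalogue = {
    "space documentary": [0.9, 0.1, 0.0],
    "romantic comedy": [0.1, 0.9, 0.2],
    "sci-fi thriller": [0.8, 0.2, 0.3],
}
watched = [1.0, 0.0, 0.1]  # stand-in for a viewer's history

# Rank titles from most to least similar to the viewing history.
ranked = sorted(
    catalogue,
    key=lambda title: cosine_similarity(watched, catalogue[title]),
    reverse=True,
)
print(ranked)
```

A vector database performs the same kind of comparison, but uses an index so that it does not have to scan every stored vector on each query.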

from vertexai.language_models import TextEmbeddingModel


def text_embedding() -> list:
    """Text embedding with a Large Language Model."""
    model = TextEmbeddingModel.from_pretrained("textembedding-gecko@001")
    # One embedding is returned per input string.
    embeddings = model.get_embeddings(["LLM"])
    for embedding in embeddings:
        vector = embedding.values
        print(f"Embedding Vectors: {vector}")
        print(f"Length of Embedding Vector: {len(vector)}")
    return vector


if __name__ == "__main__":
    text_embedding()

Here we are using the embedding model to get the vector representation of the text “LLM”. The end of the output will be similar to the following:

...
0.03451257199048996, 0.02128589153289795, 0.003418264677748084, -0.046265438199043274]
Length of Embedding Vector: 768

These output embeddings can now be indexed and stored in a vector database, in this case Vector Search, for efficient low-latency retrieval.

Opportunities for Business

In finance, where data is king, preventing financial crime is an ever-evolving challenge. Imagine a forensics team wading through mountains of irrelevant files in pursuit of crucial evidence for an investigation. One infamous example is the Enron scandal of the early 2000s. With LLMs we could tackle such challenges.

Healthcare is evolving too: by adopting GenAI, DaVita has gone from being just a dialysis company to a kidney care company that uses data-driven, personalized, and preventive medicine. Patient data insights helped in early detection, which enhanced treatment plans.

In the manufacturing sector, engineers do not have to build AI/ML algorithms from scratch. By leveraging LLMs, factory employees can now get the mechanical health of any part that needs adjustment or replacement in minutes instead of scrolling through weeks of data manually. LLMs can also analyze historical data, market trends, and external factors to provide accurate demand forecasts, helping optimize inventory levels.

These are just a few examples, and the potential applications of LLMs continue to expand as technology evolves. Their versatility makes them valuable across various sectors including education, HR, customer support, etc., for automating tasks, enhancing productivity, and providing intelligent insights. Learn from more use cases where many businesses have benefited by adopting GenAI.

After all, the era of GenAI is here :)

https://youtu.be/LsGWEeqWH6c

Conclusion

While we’ve witnessed machines successfully tackle straightforward tasks such as document generation, the potential of generative AI extends to even more ambitious endeavors, including solving problems that need deep thinking.

We bear the responsibility of carefully considering the practical applications of LLMs and ensuring that their impact is evolutionary rather than disruptive. Generative AI is not just a fancy tool; it’s also super practical. We can use it to come up with fresh product designs and make business processes even better.

Use your data sets and fine-tune your models to be a poet or developer or maybe both! Don’t hesitate — give generative AI a try and see the awesome things you can create.

Not to be ignored: next on my list is multimodal Gemini, and I will soon share my findings on it.

Mamata Panigrahi
KPMG UK Engineering

Cloud Devops Engineer and GCP enthusiast with 9 yrs in tech. Proud mom, avid reader, and perpetual learner. Sharing experience to foster growth.