Intro to Gen AI

Published in Google for Developers EMEA

9 min read · May 14, 2024

It’s been a while since my last article. I’m glad to be back! 😀

As the GDG Cloud Thessaloniki Lead and WTM Thessaloniki Ambassador, I hosted a series of on-site events to prepare people for the Google Cloud Professional Machine Learning Engineer (PMLE) certification. During those events, I found that a lot of people want to learn more about AI, ML, and Gen AI, so I will try to write a series of articles to help.

In this article, we will:
1. Define what Gen AI is,
2. Explain how Gen AI works,
3. Describe Gen AI model types,
4. Describe Google’s Gen AI applications and
5. Understand the Benefits for the Developers.

1. Define what Gen AI is

AI vs ML vs DL vs Gen AI vs 😳

Artificial Intelligence (AI) is the broadest term, encompassing any attempt to imbue machines with intelligent behaviour. This can include everything from simple rule-based systems to complex algorithms that can learn and adapt.
💡 Suggested course to learn more: Crash Course in Artificial Intelligence, by Udacity. This beginner-friendly course offers a broad overview of AI concepts, applications, and their impact on society. 👨‍💻

Machine Learning (ML) is a subfield of AI that focuses on algorithms that can learn from data. These algorithms don’t need to be explicitly programmed for every situation but instead can improve their performance on a task by being exposed to more data. We have:

Supervised (Labeled data) Learning
Unsupervised (Unlabeled data) Learning
Semi-supervised Learning
Reinforcement Learning

💡 Suggested course to learn more: Machine Learning Crash Course by Google: This crash course by Google provides a solid foundation in machine learning fundamentals, algorithms, and tools. 👩‍💻
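To make "learning from data" a bit more concrete, here is a minimal sketch of supervised learning in plain Python. The toy numbers, the function names (`fit_centroids`, `predict`) and the nearest-centroid rule are my own illustrative assumptions, not something taken from the course above. Nobody writes an "if weight < 10 then cat" rule here. The behaviour comes from the labeled examples.

```python
from statistics import mean

def fit_centroids(samples, labels):
    """Learn one average point (centroid) per label from labeled data."""
    grouped = {}
    for point, label in zip(samples, labels):
        grouped.setdefault(label, []).append(point)
    return {
        label: tuple(mean(axis) for axis in zip(*points))
        for label, points in grouped.items()
    }

def predict(centroids, point):
    """Return the label whose centroid is closest to the new point."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, point))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Toy labeled data: (weight in kg, height in cm) -> animal (made-up numbers).
samples = [(4, 25), (5, 23), (30, 60), (28, 55)]
labels = ["cat", "cat", "dog", "dog"]

centroids = fit_centroids(samples, labels)
print(predict(centroids, (6, 24)))   # closest to the "cat" centroid
print(predict(centroids, (27, 58)))  # closest to the "dog" centroid
```

Adding more labeled examples shifts the centroids, which is the "improves with more data" idea in miniature.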

Deep Learning (DL) is a subfield of machine learning that uses artificial neural networks with many layers to process information. These neural networks are loosely inspired by the structure of the human brain and can be very effective at tasks like image recognition and natural language processing.
💡 Suggested course to learn more: Deep Learning Specialization by DeepLearning.AI. This specialization offers a comprehensive introduction to deep learning concepts, with coding exercises using TensorFlow. While it has paid options, there’s a significant amount of free content to get you started. 👨‍💻
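If "neural networks with many layers" sounds abstract, the sketch below shows what a layer actually computes. It is a tiny forward pass in plain Python with two layers only. The weights are hand-picked by me for illustration (not trained), and the helper names `relu`, `dense` and `forward` are assumptions of this example. Real deep learning models stack many more layers and learn their weights from data.

```python
def relu(values):
    """Keep positive values, replace negative ones with zero."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum of inputs plus a bias per neuron."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Pass the inputs through every layer, applying ReLU between layers."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:  # no activation after the last layer
            activations = relu(activations)
    return activations

# Hypothetical, hand-picked weights: 2 inputs -> 2 hidden neurons -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 2.0]], [0.5]),
]
print(forward([2.0, 1.0], layers))  # [4.5]
```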

Generative AI (Gen AI), which isn’t quite as widely used a term as the others, is a type of machine learning that focuses on creating new data, like images, text or even code. It uses techniques to learn the underlying patterns in existing data and then generate new content that follows those patterns.
💡 Suggested course to learn more: Generative AI for Everyone by DeepLearning.AI. This course delves into Generative AI concepts, techniques, and applications. While it might require some prior knowledge of machine learning, it offers valuable insights into the world of Gen AI. 👩‍💻

In a few words:
Gen AI is when the output is:

Natural Language
Image
Audio

Gen AI is not when the output is:

Number
Discrete
Class
Probability

2. Explain how Gen AI works

y = f(x)
It’s as simple as that. It says it all, right? 😁

Let’s explain it

y = Model Output
The culmination of this process is the Gen AI’s masterpiece — fresh, original data that adheres to the prompt’s specifications. This could be a realistic image generated from a text description or a piece of music composed based on a specific genre.

f = Model
The Masterpiece in the Making (Model Training): Just like an artist studies the masters, Gen AI has been trained on massive datasets. This training allows it to recognize underlying patterns and relationships within the data. Equipped with this knowledge, it can then begin to generate new pieces that cohere with those patterns.

The Brushstrokes (Algorithms): The magic lies in the algorithms. Generative models, often employing deep learning techniques, act as the artist’s tools. They process the prompt using the patterns learned from the training data, iteratively building the new creation one element at a time.

x = Input Data
The Inspiration (Prompt): It all starts with a prompt — a specific instruction or question that ignites Gen AI’s creative spark. This could be a simple sentence describing a desired image or a complex set of parameters for generating novel code.
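Here is y = f(x) as a toy you can run. This is only a sketch under my own assumptions: the "model" is a simple word-pair counter (a bigram model), the corpus is one made-up sentence pair, and the names `train` and `generate` are hypothetical. Real Gen AI models are deep neural networks trained on massive datasets, but the shape is the same: training builds f, the prompt is x, and the output y is built one element at a time.

```python
from collections import Counter, defaultdict

def train(corpus):
    """Build f: count which word follows which in the training text."""
    follows = defaultdict(Counter)
    words = corpus.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def generate(model, prompt, max_new_words=5):
    """Compute y = f(x): extend a non-empty prompt one word at a time."""
    output = prompt.lower().split()
    for _ in range(max_new_words):
        candidates = model.get(output[-1])
        if not candidates:  # the model never saw this word during training
            break
        # Greedy choice: always take the most frequent follower.
        output.append(candidates.most_common(1)[0][0])
    return " ".join(output)

corpus = "the cat sat on the mat . the cat sat on the sofa ."
model = train(corpus)                           # f = Model
print(generate(model, "the", max_new_words=3))  # x = "the" -> y = "the cat sat on"
```

Notice that a prompt the model has never seen, such as "dog", comes back unchanged: the model can only follow patterns that exist in its training data.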

So the difference in how Gen AI and supervised learning work is as follows.
Supervised Learning works like this:
Training Code + Labeled Data = Model Building (Predict or Classify), and
Gen AI works like this:
Training Code + Data (Labeled and Unlabeled) of ALL Data Types = Build Foundation Model. The Foundation Model then generates new content (Text, Code, Images, Audio, Video…)

A Gen AI model could be trained on a dataset of images of dogs and then used to generate new images of dogs. A discriminative AI model could be trained on a dataset of images of dogs and cats and then used to classify new images as either dogs or cats.
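The dogs and cats contrast can be sketched with numbers instead of images. Everything here is an assumption I made for illustration: the body weights are invented, a single Gaussian stands in for the generative model, and a midpoint threshold stands in for the discriminative one. The point is only the difference in output: the generative side returns new data, the discriminative side returns a label.

```python
import random
from statistics import mean, stdev

def fit_generator(values):
    """Generative side: learn the pattern (mean and spread) of one class."""
    return mean(values), stdev(values)

def generate_samples(params, count, seed=0):
    """Create new values that follow the learned pattern."""
    mu, sigma = params
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [rng.gauss(mu, sigma) for _ in range(count)]

def fit_threshold(dog_values, cat_values):
    """Discriminative side: learn only the boundary between the two classes."""
    return (mean(dog_values) + mean(cat_values)) / 2

def classify(value, threshold):
    """Label an existing value (assumes dogs are the heavier class)."""
    return "dog" if value > threshold else "cat"

# Made-up body weights in kg; real models learn from images, not one number.
dog_weights = [20.0, 25.0, 30.0, 35.0]
cat_weights = [3.0, 4.0, 5.0, 6.0]

dog_params = fit_generator(dog_weights)
threshold = fit_threshold(dog_weights, cat_weights)

print(generate_samples(dog_params, 3))  # three new "dog-like" weights
print(classify(7.0, threshold))         # a label, not new data
```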

Our model output depends on our model input. On our Prompt. But what is a Prompt?

Prompt: a short piece of text that is given to the Large Language Model (LLM) as input. It can be used to control the output of the model in many ways.

The prompt in Gen AI is like the conductor’s baton in an orchestra. It sets the tone, establishes the direction, and guides the AI towards a masterpiece. It can be a concise phrase describing a desired image, a set of parameters for novel code generation, or even a question that sparks a creative response.

A well-crafted prompt acts as a bridge between human imagination and AI’s ability to translate that vision into reality.
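One way to see what "well-crafted" can mean in practice is to assemble the prompt from explicit parts. The sketch below is my own convention, not an official format: the function name `build_prompt`, the section labels ("Context", "Example input", "Task") and the sample texts are all assumptions for illustration.

```python
def build_prompt(instruction, context="", examples=()):
    """Assemble a prompt from an instruction, optional context, and examples."""
    parts = []
    if context:
        parts.append(f"Context:\n{context}")
    for question, answer in examples:
        parts.append(f"Example input: {question}\nExample output: {answer}")
    parts.append(f"Task: {instruction}")
    return "\n\n".join(parts)

# A vague prompt leaves the model guessing about audience, length, and tone.
vague = build_prompt("Write about dogs.")

# A specific prompt states the task, the context, and one example of the style.
specific = build_prompt(
    "Write a two-sentence description of a golden retriever for a pet adoption page.",
    context="Audience: families with children. Tone: warm and factual.",
    examples=[("Describe a tabby cat.", "A calm tabby who loves sunny windowsills.")],
)
print(specific)
```

The second prompt gives the model far less room to wander, which is the "sturdy bridge" from the analogy.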

However, if the bridge is shaky or the instructions unclear, the AI might struggle to follow. This can lead to Hallucinations: output that is nonsensical, factually incorrect, or not grounded in the prompt or the training data. Picture a traveller emerging on the other side of the bridge in a fantastical, nonsensical land.

Just like a talented musician can hit a wrong note, Gen AI can sometimes hallucinate. This occurs when the AI misinterprets the prompt or the training data, leading it to create outputs that deviate from the intended outcome.

These hallucinations can range from slightly nonsensical to entirely outlandish. For instance, an image prompt for a “happy cat” might result in a cat with eight legs or a request for a “relaxing beach scene” might end up with palm trees growing underwater.

While some hallucinations can be amusing, they highlight the importance of clear prompts and ongoing research to refine Gen AI’s ability to discern the signal from the noise.

Some factors that can cause hallucinations are:
- The model is not trained on enough data,
- The model is not given enough context,
- The model is trained on noisy or dirty data.
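The "not enough context" factor suggests a simple habit: give the model the source text and compare its answer against it. The sketch below is a deliberately naive heuristic of my own, not a real hallucination detector. It only lists longer words in an answer that never appear in the supplied context, and the name `unsupported_words` and the example sentences are assumptions for illustration.

```python
import re

def unsupported_words(answer, context, min_length=4):
    """Return longer words in the answer that never appear in the context."""
    def tokenize(text):
        return re.findall(r"[a-z0-9]+", text.lower())

    known = set(tokenize(context))
    flagged = []
    for word in tokenize(answer):
        if len(word) >= min_length and word not in known and word not in flagged:
            flagged.append(word)
    return flagged

context = "Thessaloniki is a port city in northern Greece."
grounded = "Thessaloniki is a port city in Greece."
invented = "Thessaloniki is the capital of Portugal."

print(unsupported_words(grounded, context))  # []
print(unsupported_words(invented, context))  # ['capital', 'portugal']
```

Word overlap misses paraphrases and subtle factual errors, so treat this as a first filter at best.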

3. Describe Gen AI model types
Generative AI (Gen AI) boasts a diverse toolbox, with each technique specializing in a unique form of creation. Here’s a closer look at some prominent models:

Text to Text: These models act as the wordsmiths of Gen AI, transforming one textual format into another. They can be adept at tasks like summarization, translation, and even creative writing.
Text to Image: Imagine a paintbrush dipped in language! T2I models bridge the gap between words and visuals. By understanding the nuances of text descriptions, they can generate realistic or artistic images based on the user’s prompt.
Text to Video: This cutting-edge technology takes storytelling to a new level. T2V models weave narratives from textual descriptions, generating short video clips that cohere with the provided story.
Text to 3D: Gen AI expands into the third dimension with T23D models! These models can sculpt 3D objects based on textual descriptions, opening doors for product design and virtual reality applications.
Text to Task: This approach focuses on automating real-world tasks based on textual instructions. Imagine a robot following a step-by-step recipe or completing a chore based on written directions! Text to Task models have the potential to revolutionize human-machine interaction.
Foundation Models: These are large AI models, including large language models (LLMs), that form the bedrock of many Gen AI applications. They are pre-trained on a vast quantity of data, such as text and code, and are “designed to be adapted” (or fine-tuned) to a wide range of downstream tasks, such as sentiment analysis, image captioning, and object recognition.
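The text-to-X model types above can be summarized as a small lookup table. This is just a sketch for orientation: the `ModelType` class, the modality labels and the `types_producing` helper are names I made up, and the example uses are shortened from the descriptions above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelType:
    name: str
    input_modality: str
    output_modality: str
    example_use: str

MODEL_TYPES = [
    ModelType("Text to Text", "text", "text", "summarization, translation"),
    ModelType("Text to Image", "text", "image", "images from a description"),
    ModelType("Text to Video", "text", "video", "short clips from a story"),
    ModelType("Text to 3D", "text", "3D object", "product design, virtual reality"),
    ModelType("Text to Task", "text", "action", "carrying out written instructions"),
]

def types_producing(output_modality):
    """List the model types whose output matches the requested modality."""
    return [m.name for m in MODEL_TYPES if m.output_modality == output_modality]

print(types_producing("image"))  # ['Text to Image']
```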

4. Describe Google’s Gen AI applications
For me, Vertex AI Studio, Vertex AI, PaLM API, and Gemini are must-try applications.

Vertex AI Studio
Think of it as your Gen AI playground. It provides a user-friendly, no-code interface for exploring Gen AI capabilities. Developers can experiment with prompts, test different models, and visualize outputs without needing to write complex code. It helps developers to:

Quickly explore and customize Gen AI models,
Create and deploy Gen AI applications with tools and resources that make it easy to get started.

Vertex AI
This is the unified platform that orchestrates the entire Gen AI workflow. Vertex AI integrates Vertex AI Studio with other essential tools for building and deploying Gen AI applications. Developers can leverage features like:

Model training and deployment: Train custom models or use pre-trained models from Google, open-source projects, or third-party providers available through Model Garden.
MLOps tools: Streamline the machine learning lifecycle with features for version control, monitoring, and scaling of Gen AI models.
Data management: Manage and organize the data used to train and refine Gen AI models.

Anyone with little or no coding experience and no ML experience can create:

Chatbots
Digital Assistants
Custom Search Engines
Knowledge Bases
Training Applications

PaLM API
This API grants access to Google’s powerful PaLM language model, a foundation model renowned for its capabilities in various NLP tasks. While PaLM access might evolve, developers can still leverage the capabilities of other foundation models offered through Vertex AI.

Integrate it with MakerSuite to access the API through a graphical user interface (GUI). MakerSuite includes:

Model Training Tool: Train ML models on your data using different algorithms.
Model Deployment Tool: Deploy ML models to production with a number of deployment options.
Model Monitoring Tool: Monitor the performance of the ML models in production using a dashboard and a number of different metrics.

Gemini
This is Google’s suite of cutting-edge multimodal Gen AI models. Unlike traditional text-based models, Gemini can understand and respond to a combination of text, images, and even code. This opens doors for innovative applications like:

Generating images based on detailed textual descriptions.
Answering questions about uploaded images.
Creating new code snippets based on prompts or existing code.

Gemini is a multimodal AI model that you can discover in the Vertex AI Model Garden. It is not limited to understanding text alone: it can understand images, pick up the nuances of audio, and interpret programming code.
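If you prefer code to the console, a Gemini model can also be called from Python through the Vertex AI SDK. Treat the snippet below as a minimal sketch, with several assumptions: you have installed the `google-cloud-aiplatform` package, you have a Google Cloud project with Vertex AI enabled and credentials set up, and the project ID, region and model name are placeholders of mine. Model names and the SDK itself change over time, so check the current documentation before relying on it. The wrapper name `ask_gemini` is hypothetical.

```python
def ask_gemini(prompt, project_id, location, model_name):
    """Send a text prompt to a Gemini model on Vertex AI and return the reply text."""
    # Imported lazily so this file can be loaded without the SDK installed.
    import vertexai
    from vertexai.generative_models import GenerativeModel

    vertexai.init(project=project_id, location=location)
    model = GenerativeModel(model_name)
    response = model.generate_content(prompt)
    return response.text

# Example call (placeholders, replace with your own values before running):
# reply = ask_gemini(
#     prompt="Explain what a foundation model is in two sentences.",
#     project_id="your-project-id",
#     location="us-central1",
#     model_name="gemini-1.5-flash",  # assumption: check Model Garden for current names
# )
# print(reply)
```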

5. Understand the Benefits for the Developers
By combining these tools, developers can unlock the full potential of Gen AI and bring groundbreaking ideas to life. Some of the benefits I’ve found are:

Faster prototyping: Experiment with Gen AI concepts quickly and easily using Vertex AI Studio’s no-code interface.
Simplified development: Leverage pre-built models and tools within Vertex AI to streamline the development process.
Focus on creativity: Spend less time on infrastructure management and more time on crafting innovative Gen AI applications.
Access to cutting-edge models: Utilize powerful models like Gemini to create unique and impactful applications.

I am glad that you made it to the end. Follow me and Clap if you like my article.

Written by Ilias Papachristos