Do I need to watch the Google Cloud Next ’24 Keynote or can Google Gemini summarize it for me?

Allan Alfonso
Google Cloud - Community
5 min read · Apr 22, 2024
Google Cloud Next Photo from the Google Cloud Blog

The Google Cloud Next ’24 keynote is about 2 hours long.

Summarizing YouTube videos with Google Gemini was the subject of a previous blog post of mine, so I thought I’d try the same technique on the recent Google Cloud Next ’24 keynote.

Could I save myself 2 hours?

Google Cloud Next 2024 Opening Keynote

Summarize the YouTube Video

Prompt:

Summarize the main points of this video in bullet form. https://www.youtube.com/watch?v=V6DJYGn2SFk

Gemini AI Output:

Here are the main points of the video in bullet form:

  • Google AI is being used today to solve tough problems including shipping goods more efficiently, reducing methane emissions, and translating languages.
  • Google Cloud is a leader in generative AI and offers a variety of AI products including Vertex AI, Generative AI Training, and AI Platform.
  • A new product called Google Vids was announced. Google Vids is an AI powered video creation app that uses Gemini and allows users to create videos with a script, music and stock footage.
  • Google Cloud announced partnerships with hundreds of leading companies including Anthropic AI, 21 Labs, Contextual AI, Essential AI, and Mistral AI.
  • Google Cloud offers a number of advantages over other cloud providers including its infrastructure, leading hyperscaler to market with Intel fifth generation Xeon processors, and its ability to connect any service.
  • Google Cloud announced a new capability called grounding with Google Search. Grounding with Google Search allows users to ground their models with Google search, improving response quality and reducing hallucination.
  • Another new capability announced is supervised adapter-based fine tuning. Supervised adapter-based fine tuning allows users to customize Gemini in an efficient lower cost way to augment models.

Verdict?

Not bad.

Not great either. In fairness, the keynote video is 2 hours long, and even the latest Gemini 1.5 Pro model with its 1M-token context window can only process about 1 hour of video, so I can’t expect excellence from the regular Gemini model, which is less capable. This is a great start though.
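For comparison, here is roughly what the video experiment looks like through the Gemini API instead of the Gemini web app. This is a minimal sketch using the google-generativeai Python SDK’s File API, which takes uploaded files rather than YouTube links, so it assumes you have a locally saved copy of the video (keynote.mp4 is a hypothetical filename):

```python
import time
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Upload a locally saved copy of the keynote (hypothetical filename).
video_file = genai.upload_file(path="keynote.mp4")

# The File API processes video asynchronously; poll until it is ready.
while video_file.state.name == "PROCESSING":
    time.sleep(10)
    video_file = genai.get_file(video_file.name)

model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content(
    [video_file, "Summarize the main points of this video in bullet form."]
)
print(response.text)
```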

Let’s try again using Gemini to summarize the auto-generated text transcript from YouTube.

Gemini 1.5 Pro supports 1M tokens
Gemini 1.5 Pro can process 1 hour of video with 1M tokens
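Copying the auto-generated transcript by hand from YouTube’s UI works, but if you want to script that step, the third-party youtube-transcript-api package is one option. A minimal sketch, assuming that package (V6DJYGn2SFk is the keynote’s video ID from the URL above):

```python
from youtube_transcript_api import YouTubeTranscriptApi

# Fetch the auto-generated captions for the keynote video.
snippets = YouTubeTranscriptApi.get_transcript("V6DJYGn2SFk")

# Join the timed caption snippets into one plain-text transcript.
transcript_text = " ".join(snippet["text"] for snippet in snippets)

prompt = "Summarize this YouTube transcript:\n\n" + transcript_text
```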

Summarize the Video Transcript

Prompt:

Summarize this YouTube transcript:

<YouTube transcript omitted since it’s too long for this blog post>

Gemini AI Output:

This is a transcript of the Google Cloud Next ’24 Opening Keynote: The New Way to Cloud.

Here are the key takeaways:

  • Google Cloud is a leading cloud provider with a strong focus on AI.
  • Sundar Pichai, CEO of Google Cloud, highlighted the importance of AI and how Google Cloud is at the forefront of the AI platform shift.
  • Generative AI is a new type of AI that can create new things, like text, code, images or videos.
  • Google Cloud offers a variety of Generative AI tools and services, including Gemini, Vertex AI, and AI Hypercomputer.
  • Vertex AI is a unified platform for accessing, tuning, deploying, and managing custom models and building agents.
  • Many companies are using Google Cloud’s Generative AI tools and services to build new applications and improve their businesses.
  • Google Cloud announced several new features and products, including:
      • Gemini 1.5 Pro: A new version of Gemini with a larger context window and the ability to process audio.
      • Google Axion Processor: Google’s first custom ARM-based CPU designed for the datacenter.
      • Vertex AI Agent Builder: A new tool that makes it easier to build customer agents.
      • The ability to ground Gemini with Google Search for improved response quality.

Overall, the keynote focused on how Google Cloud is helping businesses transform their operations and become digital and AI leaders using the new way to cloud.

Verdict?

Better.

More announcements are captured, but Gemini is still not a substitute for watching the entire video. In fairness, 218 things were announced at Google Cloud Next ’24, so any model would be challenged to summarize them all. I do like combining the Gemini outputs from the video and the transcript, however.

Google also released a 14-minute highlight reel of the keynote, and combining that summary video with the Gemini outputs provides a great high-level overview of all the important announcements at Google Cloud Next ’24.

Google Cloud Next 2024 Opening Keynote in Under 14 Minutes

What announcements are the summaries missing?

  • New AI-optimized infrastructure, including new NVIDIA GPUs, Google TPUs, and Google Arm-based processors.
  • Gemini 1.5 Pro and access to a variety of third-party AI models.
  • Gemini Code Assist, an enterprise-grade AI code-assistance solution.
  • Data Agents that find insights grounded in your data.
  • Security Agents that use AI in prevention, detection, and response.
  • Improvements to Gemini for Workspace.
  • Google Vids, an AI video creation application.
  • Improved AI Agents that serve customers and employees.

Summary

Gemini is not a substitute for watching the keynote… yet.

Maybe I could get a better response with prompt engineering, but I kept my prompts simple to see what I would get. This was a fun experiment that produced decent results, and I’m excited to see how this use case of summarizing long YouTube videos will improve. Less than a year ago, Google’s PaLM model supported 32K tokens; now Google’s Gemini model supports 1M tokens. This pace of innovation exceeds Moore’s law, so the challenge of summarizing long keynote videos may be solved sooner than we think.
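As a practical aside, you can check whether a transcript actually fits in a model’s context window before sending it. A minimal sketch using the google-generativeai SDK’s count_tokens call (transcript.txt is a hypothetical file holding the transcript saved earlier):

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro")

# Load the transcript text saved earlier (hypothetical filename).
with open("transcript.txt", encoding="utf-8") as f:
    transcript_text = f.read()

# count_tokens reports how much of the 1M-token window the text uses.
print(model.count_tokens(transcript_text).total_tokens)
```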

I might not save 2 hours but I can still save time by using Gemini as an assistant to summarize many key points.
