Google Cloud Platform Technology Nuggets — July 16–31, 2024
Welcome to the July 16–31, 2024 edition of Google Cloud Technology Nuggets.
Please feel free to give feedback on this issue and share the subscription form with your peers.
Infrastructure
VMware Cloud Foundation on Google Cloud VMware Engine (GCVE) is now Generally Available (GA). To help bring more VMware workloads to Google Cloud, VMware Cloud Foundation comes with key price discounts, lower commitment pricing, license portability entitlements, new GCVE node types and more. Check out the blog post for more details.
Google Distributed Cloud air-gapped appliance, a new configuration of Google Distributed Cloud that brings Google’s cloud and AI capabilities to tactical edge environments, has gone GA. This unlocks various use cases where near real-time AI capabilities are required in local and edge operating environments. Check out the blog post for more information on this physical device.
Google Cloud Private Marketplace, a curated marketplace within the existing Google Cloud Marketplace, has gone GA. As the blog post states, “It allows Cloud Administrators to curate a collection of vetted products that’s specific to their organization. With Private Marketplace, organizations maintain governance and control costs, helping to ensure that only approved Google Cloud Marketplace solutions can be procured and deployed by end users.” Check out the blog post for more details.
Containers and Kubernetes
Committed Use Discounts (CUDs) have been a key component of cost savings while running compute on Google Cloud. But if you ran workloads across Compute Engine, GKE and Cloud Run, there were separate CUDs to consider, and bringing them under a single umbrella was not easy. That has changed now with the Compute Engine flexible CUD, which covers Cloud Run on-demand resources, most GKE Autopilot Pods, and the premiums for the Autopilot Performance and Accelerator compute classes.
The following section from the blog post explains it well: “With one CUD purchase, you can cover eligible spend on all three products: Compute Engine, GKE, and Cloud Run. You can save 46% for a three-year commitment, and 28% for one-year commitments. With this single unified CUD, you can now make a single commitment and spend it across all these products, maximizing its flexibility. Furthermore, these commitments are not region-specific, so you can use them on resources in any region across these products.” Check out the blog post for more details and how to get started.
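To make the quoted discount rates concrete, here is a back-of-the-envelope sketch. The spend figure is an assumption for illustration; only the 46% (three-year) and 28% (one-year) rates come from the announcement.

```python
def committed_cost(on_demand_spend: float, discount: float) -> float:
    """Effective cost after a CUD discount, rounded to cents."""
    return round(on_demand_spend * (1 - discount), 2)

# Suppose $1,000/month of eligible spend spread across
# Compute Engine, GKE Autopilot, and Cloud Run (assumed figure):
monthly_spend = 1000.0
print(committed_cost(monthly_spend, 0.46))  # 540.0 (three-year, 46% off)
print(committed_cost(monthly_spend, 0.28))  # 720.0 (one-year, 28% off)
```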
Want to delay upgrading your GKE clusters and remain on an older version for longer, due to valid business reasons? Now, with GKE extended support, starting with GKE version 1.27, clusters can remain officially supported for up to 24 months on a specific GKE minor version. After GKE standard support’s 14 months are over, GKE extended support takes over, adding another ~10 months during which clusters continue to receive security patches. Check out the post for more details.
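The timeline above (roughly 14 months of standard support plus ~10 months of extended support, for up to 24 months total) can be sketched with a small date calculation. The release date below is an example, not an official GKE release date.

```python
from datetime import date

def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months (day-of-month preserved)."""
    month_index = d.month - 1 + months
    return date(d.year + month_index // 12, month_index % 12 + 1, d.day)

def support_windows(minor_release: date) -> tuple[date, date]:
    """Approximate end of standard (14 mo) and extended (24 mo) support."""
    return add_months(minor_release, 14), add_months(minor_release, 24)

std_end, ext_end = support_windows(date(2023, 5, 1))  # example date only
print(std_end)  # 2024-07-01
print(ext_end)  # 2025-05-01
```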
Identity and Security
A few editions back, we covered how Gemini 1.5 Pro could help with malware analysis. The specific task examined was automating the reverse engineering and code analysis of malware binaries. This time, the team has evaluated Gemini 1.5 Flash, a much smaller model that comes at a much reduced cost, to see how it performs on the same task. The post uncovers a few aspects to consider if you are using Gemini 1.5 Flash. Check out the blog post.
One of the prominent phishing (or should we say ‘vishing’) techniques now is to mimic someone else’s voice. Given the state of AI models, this comes as no surprise. Check out this Mandiant report that highlights the need for building out security measures against AI voice-spoofing attacks.
VPC Service Controls (VPC-SC) has introduced support for private IPs. This could help accelerate specific scenarios, like expanding your on-premises environment into a secure cloud perimeter. Check out the blog post for more details.
The first CISO Perspectives of July 2024 is out. In addition to regular news around security, the key topic discussed is budgets for security teams.
Networking
Network architects and engineers who want to design and build distributed applications on Cross-Cloud Network can make use of a recently published guide titled “Cross-Cloud Network for distributed applications”. This guide, with contributions from several Googlers, provides guidance on connecting, securing, and delivering applications across on-premises, Google Cloud, and third-party cloud environments. Check out the post.
Machine Learning
Meta’s Llama 3.1 family of models is available on Google Cloud via Vertex AI Model Garden. Currently, the Llama 3.1 405B is available in preview via the Model as a Service, which is a fully managed service. The 8B and 70B models will be available in a few weeks. Check out the blog for more details.
While Meta’s Llama got significant attention in recent weeks, Mistral AI’s Codestral, a model explicitly designed for code generation tasks, is also available as a service in Vertex AI Model Garden. In addition to Codestral, Mistral Large 2 and Mistral Nemo are also available. Check out the blog for more details.
If you have been developing Gen AI agents, one of the important areas to address is how the agent accesses user data. At a high level, you absolutely do not want to give the agent full read/write access to the database; instead, you want to grant just enough permission, limited to that specific user’s data. One approach to bake this security into your Gen AI application is Agent Tooling, where the tool takes in specific authentication information, validates it, and then accesses only that user’s information from the database. How would you implement something like that in your application? Take a look at this blog post.
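The pattern described above can be sketched in a few lines. This is a hypothetical stand-in, not a real Google Cloud API: the token store and "database" are plain dictionaries, and the names are invented for illustration.

```python
# Hypothetical token store and database (assumptions for the example).
TOKENS = {"tok-123": "alice"}                       # token -> user id
ORDERS = {"alice": ["order-1"], "bob": ["order-9"]}  # user id -> rows

def get_user_orders(auth_token: str) -> list[str]:
    """Tool entry point: authenticate first, then scope the query."""
    user = TOKENS.get(auth_token)
    if user is None:
        raise PermissionError("invalid or expired token")
    # The query is parameterized by the authenticated user only --
    # the agent never gets to choose which rows it reads.
    return ORDERS.get(user, [])

print(get_user_orders("tok-123"))  # ['order-1']
```

The key design point is that the agent hands the tool an opaque token rather than a user id, so it cannot ask for another user's rows even if the model hallucinates a query.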
Hex-LLM (High-Efficiency LLM Serving with XLA), an LLM serving framework that is designed and optimized for Google’s Cloud TPU hardware, is available in Vertex AI Model Garden, via playground, notebook, and one-click deployment. Check out the blog post that dives into various aspects of this serving framework, including a step-by-step guide to getting started.
Designing Generative AI solutions? How about Google teams sharing their experience in designing some key Generative AI solutions and what they have learnt? The best practices in the blog post address three key categories: reducing friction, prioritizing goals and building trust through transparency. Some interesting observations and design guidelines are in there. Check it out.
You’ve deployed a Gen AI application and are well aware that these models tend to hallucinate and produce varied responses. How do you put in place a process that evaluates the LLM responses, and then use the Vertex AI Gen AI Evaluation Service to automate selection of the best response, along with associated quality metrics and explanations? Check out this blog post that highlights how you can implement this workflow.
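As a minimal illustration of the evaluate-then-select idea (this is a toy sketch, not the actual Vertex AI Gen AI Evaluation Service API), you can score each candidate response against a reference and keep the best one along with its score:

```python
def score(response: str, reference: str) -> float:
    """Toy quality metric: fraction of reference words found in the response."""
    resp_words = set(response.lower().split())
    ref_words = set(reference.lower().split())
    return len(resp_words & ref_words) / max(len(ref_words), 1)

def select_best(candidates: list[str], reference: str) -> tuple[str, float]:
    """Score every candidate and return (best_response, best_score)."""
    best_score, best = max((score(c, reference), c) for c in candidates)
    return best, best_score

best, quality = select_best(
    ["Paris is the capital of France.", "I think it might be Lyon."],
    "The capital of France is Paris.",
)
print(best)  # Paris is the capital of France.
```

A real evaluation service would replace the word-overlap metric with model-based metrics and return an explanation alongside the score; the selection loop stays the same shape.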
If you are a Google Cloud Partner, check out the Partner Companion, an AI-powered collaborator that has been trained on product documentation, service knowledge repositories, and delivery training and enablement resources available in the Delivery Enablement Portfolio (DEP). Access the Partner Companion over here.
Storage
Google Cloud Storage is often the glue between various Google Cloud services. Typically, you integrate the Cloud Storage client library into your applications, and with the increase in data-intensive applications, any optimization within the library for large data transfers benefits most applications. A new transfer module uses multiple workers, in threads or processes, to maximize throughput. This feature is now available in the Storage client libraries, whereas previously, multi-worker transfers were only available via the gcloud storage utility. Check out the blog post on this feature, how it performs, and links to the client library across various programming languages.
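Conceptually, the multi-worker idea looks like the sketch below: split a payload into chunks and move them in parallel. This is plain Python for illustration only; the real feature lives in the Cloud Storage client libraries, and `transfer_chunk` here is an invented stand-in for moving one slice of a large object.

```python
from concurrent.futures import ThreadPoolExecutor

def transfer_chunk(chunk: bytes) -> int:
    # Stand-in for uploading/downloading one slice of a large object.
    return len(chunk)

def parallel_transfer(data: bytes, chunk_size: int, workers: int = 4) -> int:
    """Move `data` in chunk_size slices across a pool of workers."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Total bytes transferred across all workers.
        return sum(pool.map(transfer_chunk, chunks))

print(parallel_transfer(b"x" * 10_000, chunk_size=1024))  # 10000
```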
Databases
Cloud Spanner had some significant updates in this period. First up is the new Spanner geo-partitioning feature, which as the blog post states, “allows you to partition your table data at the row-level, across the globe, to serve data closer to your users. Even though the data is split into different data partitions, Spanner still maintains all your distributed data as a single cohesive table for queries and mutations.” Dive into the blog post on how this feature is likely to have a solid impact on latency and cost optimization, along with how it works and more.
Hot on the heels of the geo-partitioning feature is Spanner dual-region configurations. This feature addresses a limitation in countries that have just two regions, where you could only get 99.99% availability, because Spanner multi-region configurations require three regions, one of which would be located outside the country. You can now achieve 99.999% availability while complying with data residency requirements using the new Spanner dual-region configurations, which are available in Australia, Germany, India, and Japan. Check out the blog post for more details.
Looking to understand how Bigtable addresses HTAP (hybrid transactional and analytical processing), or in simple words, a best-of-breed implementation that addresses both OLTP and OLAP requirements? Check out the blog post.
Data Analytics
A brand new experience in Dataplex Catalog is now available in preview. Dataplex Catalog lets you search and discover your data across the organization, enables data governance, and much more. Check out the blog post for details.
Looking to easily and reliably replicate data from your SQL Server databases to BigQuery, Cloud Storage, and other Google Cloud destinations? Datastream, a serverless change data capture (CDC) and replication service could help you do that. Check out the blog.
If you are a user of Apache Airflow and/or Cloud Composer, its equivalent managed service offering on Google Cloud, here is a good guide that helps you understand and implement concurrency strategies that optimize resource utilization, improve scalability, and improve fault tolerance in your data pipelines. It addresses the topic at four levels: Composer environment, Airflow installation, DAG, and task.
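The core mechanism behind pool- and task-level limits can be illustrated in plain Python (this is not actual Airflow code; it just shows how a fixed number of slots caps how many tasks run at once, the way an Airflow pool or a max_active_tasks setting would):

```python
import threading
import time

POOL_SLOTS = 2                          # like an Airflow pool with 2 slots
pool = threading.BoundedSemaphore(POOL_SLOTS)
lock = threading.Lock()
running = 0
peak = 0                                # highest observed concurrency

def task(task_id: int) -> None:
    global running, peak
    with pool:                          # a task must hold a slot to run
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)                # stand-in for real work
        with lock:
            running -= 1

threads = [threading.Thread(target=task, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(peak <= POOL_SLOTS)  # True: never more than 2 tasks ran at once
```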
Developers and Practitioners
Here is a nice, short tutorial on how you can build a low-code Generative AI application using several Google Cloud services. The application regularly ingests RSS feeds, along with their content, into a BigQuery dataset. Vertex AI Agent Builder is then used to create an AI search agent that can use this grounded data source in BigQuery to help you query that information. Check out the blog post.
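The ingestion step described above boils down to turning feed entries into table rows. Here is a minimal sketch; the feed XML and the two-column schema are made up for illustration, and a real pipeline would fetch the feed over HTTP and stream the rows into BigQuery.

```python
import xml.etree.ElementTree as ET

# A tiny example feed (assumption, not a real one).
RSS = """<rss><channel>
  <item><title>Post A</title><link>https://example.com/a</link></item>
  <item><title>Post B</title><link>https://example.com/b</link></item>
</channel></rss>"""

def feed_to_rows(xml_text: str) -> list[dict]:
    """Parse RSS items into dicts shaped like BigQuery table rows."""
    root = ET.fromstring(xml_text)
    return [
        {"title": item.findtext("title"), "link": item.findtext("link")}
        for item in root.iter("item")
    ]

rows = feed_to_rows(RSS)
print(rows[0]["title"])  # Post A
```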
Learn Google Cloud
Have you joined the Cloud Innovators program? The program provides you with 35 credits every month to use towards courses and hands-on labs.
Join the Innovators program at no cost today!
If you are a Security Professional, the Modern SecOps (MSO) course is something that could interest you. It is a six-week, platform-agnostic education program designed to equip security professionals with the latest skills and knowledge to help modernize their security operations, based on our Autonomic Security Operations framework and Continuous Detection, Continuous Response (CD/CR) methodology. The course is available on Coursera.
Stay in Touch
Have questions, comments, or other feedback on this newsletter? Please send Feedback.
If any of your peers are interested in receiving this newsletter, send them the Subscribe link.
Want to keep tabs on new Google Cloud product announcements? We have a handy page that you should bookmark → What’s new with Google Cloud.