Self-hosting Gemma2 on GKE. A practical look at hosting Gemma2 on GKE in a production-grade cluster. (Dec 17, 2024)
Microbenchmarking: process for evaluating AI serving changes. The field of Generative AI is filled with hype. How can you tell if a change is “worth it”? This is how I think about evaluating changes. (Dec 10, 2024)
Which Gemini AI is right for you? Google’s Gemini AI brand contains 2 separate products. Here’s how to tell them apart and pick the one that’s right for your needs. (Jul 23, 2024)
Building blocks of a Developer Platform. An overview of some important capabilities for Platform Engineering. Reusable components, Service catalog, and Infrastructure config… (Jun 17, 2024)
Three flavors of Terraform iteration. I was writing a Terraform module to create a Google Cloud Load Balancer with an arbitrary set of GKE services as backends. To achieve this… (May 21, 2024)
Architecting for Traffic Drains. Building resilient services by routing around failing components, as illustrated with GKE. (Apr 5, 2024)
Day 2 Observability — calls to other services. This post assumes you’re already familiar with OpenTelemetry and are already collecting some observability data. (Mar 31, 2023)