Optimize Your Athletic Performance with Gemini and Vertex AI

Alok Pattani
Google Cloud - Community
6 min read · May 13, 2024

Co-Author: Zack Akil

Many of us like to participate in sports, and if you’re competitive at all, you likely want to get better in some way: run faster, be more accurate, have more power, vanquish your opponents — and maybe even look good while doing it.

One challenge in improving is that some aspects of our performance are hard to measure, and getting access to high quality data and personalized coaching can be difficult — we’re not all on track to be professional athletes, after all. But what if we can get some of that insightful data and analysis with the help of AI?

Enter one of Google Cloud’s most popular in-person demos that’s gone from London to Berlin to Las Vegas, and is coming to Google I/O this week: AI Penalty Challenge! It’s a unique end-to-end interactive soccer coaching experience that uses the latest in Generative AI on Google Cloud, combined with other Cloud and Android technologies. The demo has stolen the show at various events over the last several months and continues to inspire participants to build amazing things with Google.

(Sidenote: For those outside the US, please feel free to read the word “soccer” as “football” and forgive us for the “misspelling” throughout this article 😆.)

The AI Penalty Challenge setup is pretty straightforward, as shown in the video above and explained in this one: each participant takes three penalty kicks at a soccer goal, aiming for one of the yellow boxes in the top corners.

The magical part is that each kick is evaluated on power, accuracy, and style within seconds, and afterwards, participants receive personalized feedback from an AI soccer coach and a unique player card as takeaways.

Let’s dive into the different aspects of this experience in more detail to understand how it works.

Recording the Penalty Kick

The first step is to actually record the kicks, which is done using the cameras on six Pixel phones placed in different spots around the pitch: one focused on the goal, one focused on the ball path, one on each side focused on the kicker, and one in each top corner box.

The camera timing is synchronized using a “referee” tablet that has an app that starts and stops recording between each session. The raw videos are uploaded to Cloud Storage almost immediately after each kick, and are used in replays on screens around the pitch.

Triggering Video Analysis in the Cloud

When a video lands in the designated Cloud Storage bucket, a Cloud Storage trigger for Eventarc routes the event to a Cloud Function, which runs the Python analysis scripts in a serverless way. Results are then written to Cloud Firestore (stats) and Cloud Storage (images).
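
As a rough sketch of the routing step, the function below takes the payload of a Cloud Storage upload event and decides whether the new object is a kick video worth analyzing. The `bucket` and `name` fields come from the event payload; the function name, the `kicks/<kick_id>/<camera>.mp4` path layout, and the shape of the returned job are assumptions made for illustration, not the demo's actual code.

```python
from pathlib import PurePosixPath
from typing import Optional

VIDEO_SUFFIXES = {".mp4", ".mov"}  # assumed upload formats


def build_analysis_job(event_data: dict) -> Optional[dict]:
    """Turn a Cloud Storage event payload into an analysis job description.

    Returns None for objects that are not kick videos, so the handler
    can exit early instead of analyzing stray uploads.
    """
    bucket = event_data["bucket"]
    path = PurePosixPath(event_data["name"])

    if path.suffix.lower() not in VIDEO_SUFFIXES:
        return None
    # Hypothetical layout: kicks/<kick_id>/<camera>.mp4
    if len(path.parts) != 3 or path.parts[0] != "kicks":
        return None

    return {
        "video_uri": f"gs://{bucket}/{path}",
        "kick_id": path.parts[1],
        "camera": path.stem,
    }
```

In a deployed function, a handler registered for the trigger would call something like this first and hand the resulting job to the analysis scripts.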

Measuring Speed and Accuracy

Diving more into the actual analysis, custom object detection models trained with Vertex AI Vision are used to track the ball and the goal. Task-specific models shine in cases like these where there is enough labeled training data (images with the types of soccer balls and goals used in the demo) and the desired results are precise position estimates of objects in frame.

Tracking the ball across frames enables the speed calculation that makes up the power score, and the distance from the ball to the nearest target box determines the accuracy score. All of this logic runs within a Cloud Function, with the calculated scores going to Cloud Firestore and some visualizations of the kick going to Cloud Storage.
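
To make the two calculations concrete, here is a minimal sketch. It assumes the detected ball centers have already been converted from pixels to meters, and the linear falloff and the `max_miss_m` cutoff in the accuracy score are illustrative choices; the demo's real scoring formulas are not described in this article.

```python
import math


def ball_speed_kmh(centers_m, fps):
    """Average ball speed from per-frame ball centers given in meters."""
    if len(centers_m) < 2:
        raise ValueError("need at least two tracked positions")
    # Sum the straight-line distance between consecutive detections.
    distance_m = sum(math.dist(a, b) for a, b in zip(centers_m, centers_m[1:]))
    elapsed_s = (len(centers_m) - 1) / fps
    return distance_m / elapsed_s * 3.6  # m/s to km/h


def accuracy_score(ball_xy, target_centers, max_miss_m=4.0):
    """100 at the center of the nearest target, falling linearly to 0."""
    miss_m = min(math.dist(ball_xy, target) for target in target_centers)
    return max(0.0, 100.0 * (1.0 - miss_m / max_miss_m))
```

For example, a ball that moves 0.5 m between each of three consecutive frames at 30 frames per second is traveling at about 54 km/h.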

Analyzing Style

The most subjective of the three attributes is style. Drawing on the multimodal and reasoning abilities of Gemini in Vertex AI, a prompt asks the model to look at frames of the kick, analyze how good or creative the kicker’s technique is, and return a style score with the reasoning behind it. With no additional soccer-specific training data (unlike what might be required for a more traditional ML model), Gemini still returns insightful results and explanations about style.

The style score and reasoning are sent to Cloud Firestore, where the three scores can be averaged together to get a total score for each kick.
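
The scoring step could look something like the sketch below. It assumes the model is instructed to reply with a JSON object containing `score` and `reasoning` keys and that all scores live on a 0 to 100 scale; both are assumptions for illustration, as are the function names.

```python
import json
from statistics import mean


def parse_style_reply(reply_text):
    """Extract (score, reasoning) from an assumed JSON model reply."""
    data = json.loads(reply_text)
    # Clamp in case the model returns a value outside the assumed scale.
    score = min(100.0, max(0.0, float(data["score"])))
    return score, str(data["reasoning"])


def total_score(power, accuracy, style):
    """Plain average of the three attribute scores, rounded to one decimal."""
    return round(mean([power, accuracy, style]), 1)
```

With power 70, accuracy 90, and style 82, the total comes out to 80.7.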

Coaching Feedback from a Pro

Once all three kicks for a given participant have scores, a Firestore trigger for Eventarc initiates another Cloud Function, this time prompting Gemini 1.5 Pro (with extended reasoning ability) with the player’s scores and specific instructions to generate some encouraging, constructive feedback in the voice of a soccer coach. The coaching feedback is converted to audio using the Text-to-Speech API, and, with the help of Custom Voice, comes out sounding like it came from a professional soccer coach!
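
A simplified version of the "wait for three scored kicks, then prompt" logic might look like this. The field names and the prompt wording are hypothetical, and the actual call to Gemini is left out; the function only prepares the text that would be sent.

```python
def coaching_prompt(kicks):
    """Return a coaching prompt once all three kicks are scored, else None."""
    required = ("power", "accuracy", "style")
    all_scored = all(k.get(field) is not None for k in kicks for field in required)
    if len(kicks) != 3 or not all_scored:
        return None  # still waiting on analysis results

    lines = [
        f"Kick {i}: power {k['power']}, accuracy {k['accuracy']}, style {k['style']}"
        for i, k in enumerate(kicks, start=1)
    ]
    # Hypothetical instructions; the demo's real prompt is not published here.
    return (
        "You are an encouraging soccer coach. Based on these penalty kick "
        "scores (0 to 100), give short, constructive feedback.\n" + "\n".join(lines)
    )
```

Returning None until every score is present keeps the function safe to invoke on each Firestore update for the participant.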

Below are the images and the coaching feedback I received for the three kicks I took at Google Cloud Next ’24 in Las Vegas.

He’s right: I should focus on driving through the ball more to get more power!

Generating a Player Card

In addition to the practical coaching feedback, participants get another cool takeaway from this experience: a player card featuring their image and stats from their best kick, with a cool AI-generated background. Participants pick their background theme from a variety of types — forest, cosmic, heroic, etc. — and then Imagen on Vertex AI takes the kicker image and specific prompt to do mask-based editing, generating a new image that gets inserted into the player card.

The result is often something spectacular, like what I got back in April.

What a cool bonus to help memorialize my penalty kick experience!

Highlighting Top Performers on a Leaderboard

No interactive demo experience where people get scores would be complete without a leaderboard, so of course all AI Penalty Challenge events have one. In this case, there’s a web client that renders the top performers along with some stats like total number of kicks for the day or overall event, backed by tables in BigQuery with all the data for every kick.

After each kick is analyzed, a Firestore trigger invokes a Cloud Function that inserts the complete kick stats as a row in BigQuery, so that the leaderboard updates in near real time. My 80 in Las Vegas wasn’t enough to make the screen, but I’ll console myself knowing that it’s hard to crack the top 20 when there were more than 1,000 kicks a day at that particular event!
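
The ranking behind a leaderboard like this can be sketched in a few lines. In practice it would more likely be a SQL query over the BigQuery table; the plain-Python version below, with its hypothetical `player` and `total` row fields, just illustrates the idea of keeping each player's best kick and showing the top 20.

```python
def top_performers(rows, n=20):
    """Rank players by their best total score and keep the top n."""
    best = {}
    for row in rows:
        current = best.get(row["player"])
        # Keep only each player's highest-scoring kick.
        if current is None or row["total"] > current["total"]:
            best[row["player"]] = row
    return sorted(best.values(), key=lambda r: r["total"], reverse=True)[:n]
```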

Putting It All Together

So that’s it: 15+ Google products including some of Cloud’s most exciting AI tools like Gemini, Imagen, and custom text-to-speech — all part of the Vertex AI platform — come together organically to power this one-of-a-kind experience.

If this gets you excited, look out for the AI Penalty Challenge at an event near you — including Google I/O in Mountain View later this week. And even if you can’t make it to the actual demo, get started with Vertex AI and stay tuned for more detailed architecture breakdowns and code coming soon — hopefully to serve as inspiration to build the next great AI-powered experience with Google Cloud!
