Orchestrate Generative AI Applications

with Cloud Workflows

Sascha Heyer
Google Cloud - Community
6 min read · Jul 20, 2024


Let’s start with a simple use case: a workflow that generates recipes and an image for each recipe, and stores both in Google Cloud Storage. We also implement a rudimentary web app to show the recipes.

Recipes generated by the stream audience

In reality, generative AI applications can be even more complex. With Cloud Workflows, we can orchestrate business processes much more flexibly.

Livestream, YouTube and Code

The full code for this article is available on GitHub.

This article was part of a livestream series. You can watch the recordings; there is a playlist containing both episodes on this topic.

Join me every Friday from 10–11:30 AM CET / 8–10:30 UTC for the Coding GenAI Applications Live Stream! 📺 Get Ready to Code and Laugh Live.

📅 Mark your calendars, grab your favorite beverage, and let’s turn coding chaos into creative solutions together. Who knows, you might even learn something new or at least get a good laugh at my expense!

Never miss a session. Click here to add the live stream to your Google Calendar.

What is Google Cloud Workflows?

Cloud Workflows is an orchestration service that lets us combine Google Cloud services and APIs into automated workflows. Unlike Cloud Composer, it is serverless and very cost-effective.

You can think of it as combining steps. Each step can have an input and an output. Steps can call Google services (via Workflow Connectors) and any API with standard HTTP requests.

Workflows are defined in YAML or JSON and follow a specific syntax. Google has great documentation. Essentially, it boils down to a structure like this:

main:
  steps:
  - STEP1:
  - STEP2:
There are also features such as parallel steps, iterations, conditions, retries, jumps, variables, parameters, and more.
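
For illustration, here is a minimal sketch (the endpoint URL is a hypothetical placeholder) that combines a few of these features: a variable, an iteration, a condition with an early jump out of the loop, and an HTTP call using the standard library's default retry policy.

main:
  params: [input]
  steps:
  - init:
      assign:
      - results: []                          # variable
  - iterate:
      for:                                   # iteration
        value: attempt
        range: [1, 5]
        steps:
        - call_api:
            call: http.get                   # standard library HTTP request
            args:
              url: https://example.com/api   # hypothetical endpoint
            retry: ${http.default_retry}     # retry transient errors with backoff
            result: api_response
        - collect:
            assign:
            - results: ${list.concat(results, api_response.body)}
        - check_enough:
            switch:                          # condition
            - condition: ${len(results) >= 3}
              next: break                    # jump: leave the loop early
  - done:
      return: ${results}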

Cloud Workflow Libraries and Connectors

Connectors

Workflow Connectors are available for a variety of Google Cloud products like BigQuery, Cloud SQL, GKE, Vertex AI, and more. In our example, we use connectors to call the Vertex AI Gemini model and to upload the recipe to Google Cloud Storage.

steps:
- call_gemini:
    call: googleapis.aiplatform.v1.projects.locations.endpoints.generateContent
    ...
- upload_recipe_to_gcs:
    call: googleapis.storage.v1.objects.insert
    ...

Libraries

The Workflows standard library offers support for functions we need regularly. We use it to parse the JSON response from the Gemini model and to generate a unique ID for each recipe.

steps:
- init:
    assign:
    - recipe_id: ${uuid.generate()}

The most important standard library module is HTTP. It gives us the flexibility to call any API as part of a workflow.
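
As a minimal sketch (the URL is a hypothetical placeholder), calling an external API, for example an authenticated Cloud Run service, looks like this:

steps:
- call_external_api:
    call: http.get
    args:
      url: https://my-service.example.com/status  # hypothetical endpoint
      auth:
        type: OIDC                                # attach an identity token, e.g. for Cloud Run
    result: api_response
- log_response:
    call: sys.log
    args:
      text: ${"API returned HTTP " + string(api_response.code)}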

Common use cases for Generative AI orchestration

  1. Multiple calls to LLMs or other generative AI APIs whose outputs are combined.
  2. In the past weeks, one specific use case has stood out. With models like Gemini, we are seeing a huge increase in context sizes, up to 2 million tokens. Still, the number of output tokens lags behind at 8,192 tokens. Looping applications that call the same LLM multiple times are a great use case for workflows, for example generating a book that would otherwise be limited by the output token limit (see the sketch after this list).
  3. Combining an LLM with other business processes.
  4. Robust retry mechanisms.
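
As an illustration of the looping use case from point 2, here is a minimal sketch. The input field chapter_outlines, the model parameter, and the prompt wording are assumptions; the Gemini connector call mirrors the one we use later in this article.

main:
  params: [input]
  steps:
  - init:
      assign:
      - book: ""
  - generate_chapters:
      for:
        value: chapter_outline
        in: ${input.chapter_outlines}        # assumed: a list of chapter prompts
        steps:
        - call_gemini:
            call: googleapis.aiplatform.v1.projects.locations.endpoints.generateContent
            args:
              model: ${input.model}          # a Gemini model resource name
              region: us-central1
              body:
                contents:
                  role: user
                  parts:
                  - text: ${"Write the following book chapter: " + chapter_outline}
            result: chapter_response
        - append_chapter:
            assign:
            - book: ${book + chapter_response.candidates[0].content.parts[0].text + "\n\n"}
  - return_book:
      return: ${book}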

In this article, we focus on use cases related to Generative AI, but Cloud Workflows can be used for any kind of orchestration.

Our Recipe Generation Workflow

Step 1: Initialize

We start by generating a unique ID for each recipe to ensure that every generated recipe and its corresponding image are uniquely identifiable.

The assign keyword is used to set or initialize variables within the workflow. In this case, it generates a unique recipe_id using the uuid.generate() function.

steps:
- init:
    assign:
    - recipe_id: ${uuid.generate()}

Step 2: Generate Recipe using Gemini

Next, we call the Vertex AI Gemini model to generate a detailed recipe based on a given prompt. The model provides the recipe title, description, image prompt, and list of ingredients.

  • call is used to invoke a connector or standard library function. Here, it calls the generateContent endpoint of the Gemini model.
  • args provides the necessary arguments for the API call, including the model details, region, request body, and optional settings.
  • result stores the API response in the variable gemini_response.

steps:
- call_gemini:
    call: googleapis.aiplatform.v1.projects.locations.endpoints.generateContent
    args:
      model: "projects/sascha-playground-doit/locations/us-central1/publishers/google/models/gemini-1.5-pro-001"
      region: "us-central1"
      body:
        contents:
          role: user
          parts:
          - text: ${input.recipePrompt}
        safety_settings: # optional
          category: HARM_CATEGORY_DANGEROUS_CONTENT
          threshold: BLOCK_ONLY_HIGH
        generation_config: # optional
          temperature: 0.2
          maxOutputTokens: 2000
          topK: 10
          topP: 0.9
          responseMimeType: application/json
          responseSchema:
            type: object
            properties:
              recipe_title:
                type: string
                description: The recipe title.
              recipe_description:
                type: string
                description: The recipe description.
              recipe_image_prompt:
                type: string
                description: The prompt to generate the recipe image.
              ingredients:
                type: array
                items:
                  type: object
                  properties:
                    name:
                      type: string
                      description: The name of the ingredient.
                    quantity:
                      type: string
                      description: The quantity of the ingredient.
                    unit:
                      type: string
                      description: The unit of measurement for the ingredient.
                  required:
                  - name
                  - quantity # Adjust required fields as needed
            required:
            - recipe_title
            - recipe_description
            - recipe_image_prompt
    result: gemini_response

Step 3: Parse and Validate JSON Response

To ensure the data generated is in the correct format, we parse the JSON response from the Gemini model.

  • json.decode is used to parse the JSON response from the Gemini model and assign it to the parsed_recipe variable. We also extract the recipe_image_prompt from the parsed recipe; we use this prompt in a later step to generate the recipe image.

steps:
- parse_json_recipe:
    assign:
    - parsed_recipe: '${json.decode(gemini_response.candidates[0].content.parts[0].text)}'
    - recipe_image_prompt: '${parsed_recipe.recipe_image_prompt}'

Step 4: Store Recipe on Google Cloud Storage

The recipe is stored on Google Cloud Storage. This ensures that all generated content is securely saved and easily accessible, for example for a web app.

We have the same syntax structure as before: call with an existing connector for Cloud Storage, and the necessary parameters passed as args, such as the bucket, the file name, and the file content (our recipe JSON). You can find the args for all services in the Google documentation.

steps:
- upload_recipe_to_gcs:
    call: googleapis.storage.v1.objects.insert
    args:
      bucket: "doit-llm"
      uploadType: "media"
      name: ${"recipes/" + recipe_id + ".json"}
      body: '${json.encode(parsed_recipe)}'

Step 5: Generate Recipe Image using Cloud Run Service

We couldn’t call the Vertex AI Imagen API directly as part of our Cloud Workflows step due to the maximum response size limit of 2 MB. Luckily, we can call any API using HTTP. As a solution, we use the Vertex AI SDK to generate the image and deploy it as a service on Cloud Run. With http.post, we can call our Cloud Run service that generates the image directly.

steps:
- generate_image:
    call: http.post
    args:
      url: "https://image-generation-xgdxnb6fdq-uc.a.run.app"
      body:
        uuid: ${recipe_id}
        prompt: ${recipe_image_prompt}
    result: image_generation_response

Step 6: Return the Results

Finally, we return the generated recipe and the image generation response, providing a complete workflow output.

steps:
- returnStep:
    return:
      recipe: ${parsed_recipe}
      image_generation_response: ${image_generation_response}

Limitations

Cloud Workflows’ maximum HTTP response size is 2 MB. We ran into this limitation during the live stream; it is a hard limit and cannot be increased. Therefore, we used a slightly different approach: instead of using only Cloud Workflows service connectors, we implemented the image generation in Python and deployed it as an API on Cloud Run. This way, we can still use Cloud Workflows and simply call the Cloud Run service.

Overall, Cloud Workflows is a great product that is actively used by many companies I work with. Delivery Hero wrote a great article covering the most common challenges you might encounter; I recommend checking it out.

Pricing

With $1, we can generate approximately 49 recipes; a single recipe costs us $0.0204251.

Let’s break down the full costs of this generative AI application. Since we are generating recipes, it makes sense to break the costs down to a single recipe: the four items below add up to $0.0003615 + $0.020 + $0.00006 + $0.0000036 = $0.0204251. Some of the services have a free tier; for this calculation, let’s assume we don’t use it.

  • Generate the recipe content with Gemini 1.5 Pro
    $0.0003615
    You could generate around 2,765 recipe JSONs for a single dollar.
  • Generate the recipe image with Imagen
    $0.020
    This is the majority of our costs for a recipe.
  • Cloud Workflows
    $0.00006
  • Cloud Run
    $0.0000036

Thanks for reading and watching

I appreciate your feedback and questions. You can find me on LinkedIn. Even better, subscribe to my YouTube channel ❤️.
