Going beyond standard HTTP timeouts in GCP Workflows — the practice

Yurii Serhiichuk
Google Cloud - Community
6 min read · Apr 21, 2024
DALL-E view on the Tasks Runner service

As previously mentioned, we will deploy a fully functional AppEngine Tasks Runner service and all the required infrastructure components.

This article aims to provide you with a simple way of verifying the mentioned approach works and give you guidelines on how to include such a solution in your project.

You can jump into the appengine-tasks-runner repository and try it out yourself, go through the code on your own, or walk through it together with this article.

The application itself is deliberately minimal and exposes a single FastAPI-based endpoint that accepts HTTP request information with the following structure:

from typing import Any

import pydantic


class HttpServiceRequest(pydantic.BaseModel):
    """A request to call a service."""

    url: pydantic.AnyHttpUrl = pydantic.Field(
        title="URL", description="The URL of a service to be called."
    )
    """The URL of a service to be called."""

    body: pydantic.BaseModel | dict[str, Any] | str | bytes = pydantic.Field(
        default_factory=dict, title="Body", description="The HTTP request body payload."
    )
    """The request body payload."""

    content_type: str = pydantic.Field(
        default="application/json",
        title="Content Type",
        description="The HTTP request body content type. Defaults to JSON.",
    )
    """The HTTP request body content type. Defaults to JSON."""

    method: str = pydantic.Field(
        default="POST", title="Method", description="The HTTP request method."
    )
    """The HTTP request method."""

    headers: dict[str, str] = pydantic.Field(
        default_factory=dict, title="Headers", description="The HTTP request headers."
    )
    """The HTTP request headers."""

    timeout: float = pydantic.Field(
        default=43200, title="Timeout", description="The request timeout in seconds."
    )
    """The request timeout in seconds."""

As you can see, you are able to provide all the required HTTP call configuration when asking the Tasks Runner to perform a request on your behalf.

The response is pretty simple as well:

class HttpServiceResponse(pydantic.BaseModel):
    """A response from a called service."""

    body: pydantic.BaseModel | dict[str, Any] | str | bytes | None = pydantic.Field(
        default_factory=dict, title="Body", description="The HTTP response body payload."
    )
    """The response body payload."""

    headers: dict[str, str] = pydantic.Field(
        default_factory=dict, title="Headers", description="The HTTP response headers."
    )
    """The HTTP response headers."""

    status_code: int = pydantic.Field(
        title="HTTP Status", description="The response HTTP status code."
    )
    """The response HTTP status code."""

The main part of the code is HttpServiceCaller and its source code is available here. Upon instantiation, it fetches a secret with a service account and then uses that service account to perform authenticated calls to your services.

And main.py provides a simple POST request handler that consumes HttpServiceRequest objects, passes them to HttpServiceCaller, and returns the results as HttpServiceResponse.

@app.post("/")
async def handle_task(http_service_request: HttpServiceRequest) -> HttpServiceResponse:
    """Handles AppEngine HTTP Cloud Task requests.

    Sends the request content to the service specified by the request URL and
    returns the response.
    """
    response: HttpServiceResponse = service_caller.call_service(request=http_service_request)
    return response

So that’s basically it. The implementation is pretty straightforward and can be improved in multiple ways, but the magic happens when you combine this simplicity with the unique AppEngine capability of serving long-running requests. So let’s proceed and set up a new Tasks Runner service in your own GCP project.
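To give a feel for how a task for this runner would be enqueued, here is a sketch of the JSON body you could POST to the Cloud Tasks REST API (tasks.create) for the queue we create later in this article. The target URL and job ID are placeholders; in the REST representation the inner request must be base64-encoded and the dispatch deadline is expressed as a duration string:

```python
import base64
import json

# Hypothetical inner request for the Tasks Runner (placeholder values).
runner_request = {
    "url": "https://my-service-xyz-uc.a.run.app/long-job",
    "body": {"job_id": 42},
}

# Sketch of a Cloud Tasks REST "task" targeting the App Engine runner.
task = {
    "appEngineHttpRequest": {
        "httpMethod": "POST",
        "relativeUri": "/",  # the single endpoint exposed by main.py
        "headers": {"Content-Type": "application/json"},
        # The REST API requires the body to be base64-encoded.
        "body": base64.b64encode(json.dumps(runner_request).encode()).decode(),
    },
    # Allow up to 24 hours before Cloud Tasks considers the dispatch failed.
    "dispatchDeadline": "86400s",
}
```

No explicit appEngineRouting is needed here because the queue we create below carries a routing override that sends every task to the tasks-runner service.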

First, we will create a new GCP project for demo purposes and you can do that at https://console.cloud.google.com/projectcreate.

Creating new GCP project

Note: Please do not share your project ID, as it is widely used across GCP to perform various tasks and may pose a security risk. (The one on the screenshot is fake, no worries.)

In the new project, we jump into the Cloud Shell Editor, where we clone the appengine-tasks-runner repository. If you are not familiar with Cloud Shell, it has pretty thorough documentation and a Cloud Skills Boost lab that may help you out.

gh repo clone xSAVIKx/appengine-tasks-runner
cd appengine-tasks-runner
Cloning the repo in Cloud Shell

With the repository cloned, we’re first going to set up the environment required for the Tasks Runner service itself and then proceed with the deployment of the demo workflow and demo job service.

The setup-env.sh script is parameterized with the GOOGLE_CLOUD_PROJECT, APPENGINE_GCP_REGION, and TASKS_GCP_REGION environment variables. The regions default to us-central and us-central1 respectively, and that is where the AppEngine application and the Cloud Tasks queue are going to be created.

You can just run the setup-env.sh script and jump over to the service deployment.

But if you’re interested, here’s what is going on in the script itself.

So first we set the environment variables:

GCP_PROJECT="${GOOGLE_CLOUD_PROJECT}"
APPENGINE_GCP_REGION="${APPENGINE_GCP_REGION:-us-central}"
TASKS_GCP_REGION="${TASKS_GCP_REGION:-us-central1}"

Then we enable all required GCP services:

gcloud services enable serviceusage.googleapis.com --project="${GCP_PROJECT}"
gcloud services enable servicemanagement.googleapis.com --project="${GCP_PROJECT}"
gcloud services enable secretmanager.googleapis.com --project="${GCP_PROJECT}"
gcloud services enable cloudapis.googleapis.com --project="${GCP_PROJECT}"
gcloud services enable cloudtasks.googleapis.com --project="${GCP_PROJECT}"
gcloud services enable storage-component.googleapis.com --project="${GCP_PROJECT}"
gcloud services enable monitoring.googleapis.com --project="${GCP_PROJECT}"
gcloud services enable cloudbuild.googleapis.com --project="${GCP_PROJECT}"
gcloud services enable logging.googleapis.com --project="${GCP_PROJECT}"
gcloud services enable appengine.googleapis.com --project="${GCP_PROJECT}"
gcloud services enable iamcredentials.googleapis.com --project="${GCP_PROJECT}"
gcloud services enable iam.googleapis.com --project="${GCP_PROJECT}"

Now we can create the App Engine application:

gcloud app create \
--region="${APPENGINE_GCP_REGION}" \
--project="${GCP_PROJECT}"

It will take some time, and meanwhile we can proceed to creating all the required service accounts and permissions.

SERVICE_CALLER_SA="service-caller"

gcloud iam service-accounts create "${SERVICE_CALLER_SA}" \
--display-name="Service Caller" \
--description="Performs authorized service and API HTTP calls." \
--project="${GCP_PROJECT}"

SERVICE_CALLER_SA_EMAIL="${SERVICE_CALLER_SA}@${GCP_PROJECT}.iam.gserviceaccount.com"

gcloud projects add-iam-policy-binding ${GCP_PROJECT} \
--member="serviceAccount:${SERVICE_CALLER_SA_EMAIL}" \
--role="roles/cloudfunctions.invoker" \
--condition=None
gcloud projects add-iam-policy-binding ${GCP_PROJECT} \
--member="serviceAccount:${SERVICE_CALLER_SA_EMAIL}" \
--role="roles/run.invoker" \
--condition=None

The Tasks Runner service is going to use the Service Caller secret key to perform the actual calls to the services. You may want to add, e.g., the Workflows Invoker role to the service account if you decide to send callbacks from the Tasks Runner.

So now we need to export the service account key and create a secret out of it. We can do that all from the CLI as well:

# Create a service account JSON key
gcloud iam service-accounts keys create "service-caller.key.json" \
--iam-account="${SERVICE_CALLER_SA_EMAIL}" \
--project="${GCP_PROJECT}"

# Store the key in a Secret Manager secret
gcloud secrets create "service-caller-sa-key" \
--data-file="service-caller.key.json" \
--labels="service=appengine-tasks-runner" \
--project="${GCP_PROJECT}"

The final step is to create a Cloud Tasks push queue that will be used to propagate tasks to the Tasks Runner. Here’s how to do that:

gcloud tasks queues create "scheduled-tasks" \
--location="${TASKS_GCP_REGION}" \
--max-attempts=3 \
--max-backoff="10s" \
--max-dispatches-per-second=1 \
--max-concurrent-dispatches=500 \
--routing-override="service:tasks-runner" \
--project="${GCP_PROJECT}"

The task queue configuration is opinionated, and you may want to tweak it to match your own needs and resources. For example, you may want to disable retries on the Cloud Tasks side completely and rely on your internal retry logic, or increase the maximum dispatches per second and concurrent dispatches to improve throughput.

At this moment we have all the required pieces in place, and the only thing left is to deploy the App Engine service itself. Since the deployment step will likely be run more than once, there is a separate deploy.sh script for it:

GCP_PROJECT="${GOOGLE_CLOUD_PROJECT}"

poetry export --no-interaction --without-hashes --format requirements.txt --output requirements.txt

echo "gunicorn" >> requirements.txt

gcloud app deploy app.yaml --project="${GCP_PROJECT}"

You may need to install Poetry in order to export the defined dependencies, but that’s as easy as running:

curl -sSL https://install.python-poetry.org | POETRY_VERSION=1.8.2 python3 -

When successfully deployed you should see the following response when opening the app URL in the browser:

{"detail":"Method Not Allowed"}

That’s OK. The endpoint only handles POST requests, so a plain GET from the browser is rejected.

Now we have it: the App Engine Tasks Runner service is up and ready to forward your requests to your services.

So if your goal was to deploy the service and you’re ready to plug in your own services and workflows, you’re now good to go. For those interested in the e2e example, we have to set up and configure a couple more things. Stay tuned for the testing part of the setup 😉

This service is still in active use in a serverless data processing platform at Travelshift where we are building next-gen travel experience solutions. You can check it out at Guide to Europe and Guide to Iceland.



GCP Champion Innovator, 6x GCP Certified, tech-savvy Cloud Engineer. Troubleshooter and problem solver.