Building a gateway for ML media services using Go

Kostya Amelichev
Published in neiro.ai, Mar 15, 2023

At Neiro.ai, we build Generative AI tools for personality cloning. We offer Text-to-Speech, Voice Conversion, LipSync, and other AI technologies in one user-friendly web interface. We also provide an API for business customers.

Our machine-learning services are maintained by different engineering teams and have different APIs. We developed the gateway to monitor these services and wrap them into one convenient service.

We need the gateway to meet the following requirements:

  1. We need to process gRPC requests coming from our mobile apps.
  2. We need to process REST requests coming from our Web Studio.
  3. All infrastructure logic should be implemented exactly once: rate limits, token verification, tracing, monitoring, and so on.
  4. In most cases, requests include media files: audio and video. Depending on the case, we want to send either a URL or a binary file to the gateway.

As a startup, we want to move fast, so we decided to build the gateway on Go infrastructure.

This article comes with a sample echo application. The source code can be found here: https://github.com/mynalabsai/grpc_gateway_media_example

Service interface described in protobuf format

Use both REST and gRPC

To fulfill the first two requirements, we decided to use gRPC-gateway. It generates a reverse proxy that translates JSON into protobuf messages and forwards REST requests to the gRPC server.

gRPC-gateway generates a reverse proxy from REST to gRPC

Here are three things we need to do:

  • Add a handler from gRPC-gateway that translates requests from REST to gRPC. It gets attached to http.ServeMux. Here's how it's done:
Mux will handle /v1/echo by proxying the request to the localhost:9090 gRPC endpoint
  • Start the gRPC server
  • For the HTTP server, detect requests with Content-Type: application/grpc and send them directly to the gRPC handler. We do this by using a method from this article. All other requests are processed by the existing mux and passed to the gRPC-gateway proxy.
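
The routing step from the last bullet can be sketched with the standard library alone. This is a minimal illustration, not the code from the repository: the names `isGRPC`, `dispatch`, and `route` are hypothetical, and two stub handlers stand in for the gRPC server and the gRPC-gateway mux.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// isGRPC reports whether r looks like a native gRPC call:
// HTTP/2 with a Content-Type that starts with "application/grpc".
func isGRPC(r *http.Request) bool {
	return r.ProtoMajor == 2 &&
		strings.HasPrefix(r.Header.Get("Content-Type"), "application/grpc")
}

// dispatch sends gRPC traffic to grpcHandler and everything else to restHandler.
func dispatch(grpcHandler, restHandler http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if isGRPC(r) {
			grpcHandler.ServeHTTP(w, r)
			return
		}
		restHandler.ServeHTTP(w, r)
	})
}

// route runs one request through dispatch and returns the name of the stub
// handler that answered it. The stubs are placeholders for the real servers.
func route(contentType string, protoMajor int) string {
	stub := func(name string) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
			io.WriteString(w, name)
		})
	}
	req := httptest.NewRequest(http.MethodPost, "/v1/echo", nil)
	req.Header.Set("Content-Type", contentType)
	req.ProtoMajor = protoMajor
	rec := httptest.NewRecorder()
	dispatch(stub("grpc"), stub("rest")).ServeHTTP(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println(route("application/grpc", 2)) // grpc
	fmt.Println(route("application/json", 1)) // rest
}
```

In a real gateway, the first argument to `dispatch` would be the gRPC server and the second the mux that holds the gRPC-gateway handler. Serving both on one port without TLS also requires HTTP/2 cleartext support, which this sketch leaves out.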

Common logic

Forwarding a request to the target ML service is implemented as a callback, which is called inside a shared wrapper. The wrapper handles the infrastructure logic and does not depend on the request type.

In our example, the echo service callback waits for a second and then returns its input as a response. The wrapper measures the latency of the callback and can be reused for other requests.

By using the wrapper, we can measure the duration of every type of request

Handle media files

If you send gRPC-gateway a request that contains raw data in a bytes field, an error occurs.

curl for HTTP endpoint
grpcurl for gRPC endpoint

This is because the data field in our data model has the type bytes. JSON is a text format with no native binary type, so bytes fields are represented as base64 strings. For the request to work, we pass base64(abacaba), which is YWJhY2FiYQ==, instead of abacaba in the data field.
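
The same convention can be reproduced with Go's standard encoding/json package, which also writes []byte values as base64 strings. This is only an illustration: `EchoRequest` is a hypothetical stand-in for the generated protobuf message, and the gateway itself relies on protobuf's JSON mapping rather than encoding/json.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// EchoRequest is a hypothetical stand-in for the generated protobuf message
// with a single bytes field named data.
type EchoRequest struct {
	Data []byte `json:"data"`
}

// encode marshals the payload; encoding/json writes []byte as a base64 string.
func encode(payload []byte) (string, error) {
	b, err := json.Marshal(EchoRequest{Data: payload})
	return string(b), err
}

// decode parses a JSON request and returns the raw bytes of the data field.
func decode(body string) ([]byte, error) {
	var req EchoRequest
	err := json.Unmarshal([]byte(body), &req)
	return req.Data, err
}

func main() {
	body, _ := encode([]byte("abacaba"))
	fmt.Println(body) // {"data":"YWJhY2FiYQ=="}

	raw, _ := decode(body)
	fmt.Println(string(raw)) // abacaba

	// A plain string that is not valid base64 is rejected.
	_, err := decode(`{"data": "abacaba"}`)
	fmt.Println(err != nil) // true
}
```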

This is an expected echo response for HTTP
This is an expected echo response for gRPC

Basically, if the goal is to transmit a media file in the request body while preserving the structure, we create a string with a base64 representation of the file.

However, this string is too long for an audio clip or even a short video, which makes the JSON inconvenient to work with. Besides, the user must be able to manually check that the request will return a specific file. In the first iteration, we wrote a Python script that encodes a file and prints a line that has to be copied by hand into the right place in the JSON. Besides being inconvenient, this means that anyone who wants to make a request needs to have the script.

Another option is to support files via URLs in the API. We allow this option, but manual testing can be tricky: you first need to upload the audio to object storage. We wanted an option to send a local file.

The ideal way to send files would be to support multipart/form-data requests. However, gRPC-gateway does not support this, and its authors suggest handling file uploads separately. We wanted to keep the same request format as before, but send files via multipart/form-data instead of embedding them in the request structure.

That’s why we settled on macros that mark which string fields should be expanded into the base64 representation of a specified file.

Here is what our request looks like
Now we can send a request and specify a local file
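
For readers who prefer to build such a request from Go, here is a hedged sketch. The part names `request` and `audio`, the `{{file:audio}}` macro syntax, the file name, and port 8080 are all assumptions made for illustration. The repository defines the actual conventions.

```go
package main

import (
	"bytes"
	"fmt"
	"mime/multipart"
	"net/http"
)

// newEchoRequest builds a multipart/form-data request with a JSON part and a file part.
// The part names ("request", "audio") and the {{file:audio}} macro are hypothetical.
func newEchoRequest(url string, file []byte) (*http.Request, error) {
	var body bytes.Buffer
	mw := multipart.NewWriter(&body)

	// The JSON keeps its usual structure; the macro marks where the file goes.
	if err := mw.WriteField("request", `{"data": "{{file:audio}}"}`); err != nil {
		return nil, err
	}

	// The file travels as raw bytes in its own part, not as base64 text.
	part, err := mw.CreateFormFile("audio", "audio.wav")
	if err != nil {
		return nil, err
	}
	if _, err := part.Write(file); err != nil {
		return nil, err
	}
	if err := mw.Close(); err != nil {
		return nil, err
	}

	req, err := http.NewRequest(http.MethodPost, url, &body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", mw.FormDataContentType())
	return req, nil
}

func main() {
	req, err := newEchoRequest("http://localhost:8080/v1/echo", []byte("abacaba"))
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
}
```

The request can then be sent with `http.DefaultClient.Do(req)` once the gateway is running.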

Here’s how to do such processing. First, we install a middleware. When it receives a multipart/form-data request, the middleware recursively walks the JSON part and expands the macros.

The new request follows the same path as before. This is how it looks in Go:

This is how we process multipart/form-data and expand macros
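
The following is a self-contained sketch of such a middleware, not the repository's implementation. It assumes that the JSON arrives in a form field named `request` and that a macro is written as `{{file:NAME}}`, where NAME is the name of an uploaded part. Both conventions, and the function names, are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"net/http/httptest"
	"strings"
)

// expand walks decoded JSON and replaces every string of the (hypothetical)
// form "{{file:NAME}}" with the base64 contents of the uploaded part NAME.
func expand(v any, files map[string][]byte) any {
	switch t := v.(type) {
	case map[string]any:
		for k, e := range t {
			t[k] = expand(e, files)
		}
	case []any:
		for i, e := range t {
			t[i] = expand(e, files)
		}
	case string:
		name := strings.TrimSuffix(strings.TrimPrefix(t, "{{file:"), "}}")
		if data, ok := files[name]; ok && t == "{{file:"+name+"}}" {
			return base64.StdEncoding.EncodeToString(data)
		}
	}
	return v
}

// toJSON parses the multipart form and returns the JSON part with macros expanded.
func toJSON(r *http.Request) ([]byte, error) {
	if err := r.ParseMultipartForm(32 << 20); err != nil {
		return nil, err
	}
	files := map[string][]byte{}
	for name, headers := range r.MultipartForm.File {
		f, err := headers[0].Open()
		if err != nil {
			return nil, err
		}
		files[name], err = io.ReadAll(f)
		f.Close()
		if err != nil {
			return nil, err
		}
	}
	var doc any
	if err := json.Unmarshal([]byte(r.FormValue("request")), &doc); err != nil {
		return nil, err
	}
	return json.Marshal(expand(doc, files))
}

// expandMacros rewrites multipart/form-data requests into plain JSON requests
// before passing them to next; other requests pass through untouched.
func expandMacros(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if strings.HasPrefix(r.Header.Get("Content-Type"), "multipart/form-data") {
			body, err := toJSON(r)
			if err != nil {
				http.Error(w, err.Error(), http.StatusBadRequest)
				return
			}
			r.Body = io.NopCloser(bytes.NewReader(body))
			r.ContentLength = int64(len(body))
			r.Header.Set("Content-Type", "application/json")
		}
		next.ServeHTTP(w, r)
	})
}

// run sends one multipart request through the middleware and returns the body seen downstream.
func run(jsonPart, partName string, file []byte) string {
	var buf bytes.Buffer
	mw := multipart.NewWriter(&buf)
	mw.WriteField("request", jsonPart)
	part, _ := mw.CreateFormFile(partName, "upload.bin")
	part.Write(file)
	mw.Close()
	req := httptest.NewRequest(http.MethodPost, "/v1/echo", &buf)
	req.Header.Set("Content-Type", mw.FormDataContentType())
	rec := httptest.NewRecorder()
	echo := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { io.Copy(w, r.Body) })
	expandMacros(echo).ServeHTTP(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println(run(`{"data": "{{file:audio}}"}`, "audio", []byte("abacaba")))
	// {"data":"YWJhY2FiYQ=="}
}
```

In the gateway, `expandMacros` would wrap the gRPC-gateway mux, so the proxy only ever sees plain JSON with base64 fields. This sketch reads each file fully into memory, which may not suit large uploads.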

And the handler is slightly updated:

Filter application/grpc for gRPC server, expand macros and use reverse proxy for the rest

Conclusion

This is an overview of how we used gRPC-gateway to handle media files in our ML services.

We hope you enjoyed the ride and learned something useful for your projects. Once again, you can find the full example on our GitHub: https://github.com/mynalabsai/grpc_gateway_media_example

At Neiro.ai, we’re all about moving fast, and the gateway we built on Go infrastructure has allowed us to do just that. By using gRPC-gateway, we were able to process both gRPC and REST requests and to share the infrastructure logic for rate limits, token verification, tracing, monitoring, and more. We even found a neat workaround for handling media files: macro expansion. If you’re in a similar boat, we highly recommend checking out gRPC-gateway as a solution. Thanks for reading, and stay tuned for more AI adventures.
