Image Processing as a Microservice

Pete Saia
Published in Stackahoy
3 min read · Apr 3, 2018
Go + GCP = 🚀

Distributed State of Mind

When creating a distributed system, photo handling becomes less trivial. At the first signs of scale it can become a real pain point in the application. Why?

  • Mounting multiple running nodes to the same persistent storage is usually impossible, for good reason. Where a provider or store does allow it, you will have to worry about data corruption during concurrent writes.
  • Storing (bill|mill|trill)ions of images on a disk and serving them yourself isn’t fun and is error prone. You also probably won’t be able to serve them as quickly as Google.
  • Image manipulation (resizing, cropping, auto-orientation) is resource intensive. In a distributed system, you want your nodes to have maximum capacity for throughput.

Potential Solution 1: Google Cloud Functions / Serverless

Kubernetes cluster interfacing with a DB service and Google Function

I want to discuss this solution because image processing is a common use case for going serverless. Google Cloud Functions (AWS Lambda works on the same idea, but we’re focusing on Google here) are a great way to handle this in some cases.

Here’s the general idea:

  1. Create an endpoint to receive the full-size image and send it to the storage bucket.
  2. Create a background function to handle image processing once it’s uploaded to the bucket.
  3. Update persistent storage accordingly and send any other smoke signal that processing is complete.

This is nice because it takes the heavy lifting off the core application and is quite scalable. However, there are some downsides, which may not affect you:

  • The application has to handle uploading the file to the storage bucket, which requires interfacing with Google’s GCP libraries and authorizing. When you’re dealing with many uploads, this can be a bottleneck. Alternatively, you can create a new public function endpoint to accept the image, but you’ll also have to handle application-level authentication there (if needed).
  • If you want to know when your processed images are ready, you’ll have to create endpoints to consume the various update messages from the process.

As you can see, this has the potential to become complicated due to all of the moving parts, especially if your application requires any level of user auth. This was not ideal for us.

Potential Solution 2: Imageup Microservice

Imageup is a replicable HTTP-based microservice which can run privately within your network. It’s written in Go and makes use of the blazingly fast Imaging library. This allows for a synchronous way to send images, process them, and retrieve unique remote URLs for the results.

Imageup within a Kubernetes cluster

The general idea for this is:

  1. Create an endpoint that handles authentication (if needed) and accepts the file to upload.
  2. Synchronously stream the file along with dimensions to Imageup and receive a corresponding array of remote images ready to serve.
request -> payload

Kubernetes makes it easy to set this up, and you can deploy as many replicas as needed to handle concurrent processing. The key aspect for connectivity here is that the Imageup service uses a private load balancer:

metadata:
  name: imageup-service
  annotations:
    cloud.google.com/load-balancer-type: "Internal"

From here, you’ll be able to interface with Imageup from your application. For us, it was from a Node.js application using the interface module linked below. For more information, please check out the Imageup API and related resources below.
