Doried Abdallah
Published in The Startup · 9 min read · May 23, 2020

[Cover photo by Andy Art on Unsplash]

Setting up Symfony with Imagine library and cloud storage (Amazon S3) efficiently, on top of Kubernetes

TL;DR

What are we trying to achieve?

  • Storing & serving user-uploaded images for a Symfony application using cloud storage (like Amazon S3 or DigitalOcean Spaces)
  • Manipulating images to generate thumbnails of different sizes, and storing the results back on S3 while still being able to serve them to clients instantly
  • Images and the results of image manipulation should be available via CDN
  • Storage to cloud should happen asynchronously without blocking web servers

Challenge: Performance

Technologies & tools used: Symfony, LiipImagineBundle, Redis, Symfony Messenger with RabbitMQ, NodeJS and Cloud Storage.

Let’s start with a brief introduction; skip to the solution part if you are already familiar with Symfony and LiipImagineBundle.

Introduction: Symfony & LiipImagine

Symfony is a great PHP framework that lets you build large, organised, and scalable web applications out of many reusable components (or, as Symfony calls them, bundles) that you can simply plug into your project.

LiipImagineBundle is one of the greatest additions to Symfony: it wraps PHP’s Imagine library for image manipulation, and provides a simple way to optimize pictures before serving them to the front-end.

Use cases of LiipImagine include generating image thumbnails of different sizes to reduce response size, adding watermarks, and more.

To see how powerful and easy-to-use LiipImagineBundle is, take a look at this example (taken from the library’s GitHub page)

<img src="{{ asset('/relative/path/to/image.jpg') | imagine_filter('my_thumb') }}" />

As you can see, we just applied the twig filter imagine_filter('my_thumb') (assuming we have already defined my_thumb in the LiipImagine configuration as a filter that generates, say, an 80x80 thumbnail of the picture).

This will replace the image path with a URL that is later requested by the browser and handled by a controller provided by the LiipImagine bundle, which will:

  • Generate the thumbnail out of the provided image
  • Store the result (on local filesystem, by default)
  • Return a redirect response to the path of the generated image

From then on, any time imagine_filter('my_thumb') is requested again for the same image, LiipImagine will realize that this filter has already been applied to that image and the result is saved on disk, so it will return the path of the generated image as a static file directly, without going through a Symfony controller.
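For context, a filter like my_thumb is defined in the bundle’s configuration. Here is a minimal sketch of what such a filter set could look like (the quality value and file path are illustrative, not taken from my project):

```yaml
# app/config/config.yml (sketch)
liip_imagine:
    filter_sets:
        my_thumb:
            quality: 85
            filters:
                # Scale and crop the source image down to 80x80
                thumbnail: { size: [80, 80], mode: outbound }
```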

What did we want to achieve?

We have a website, running on top of Kubernetes, where users can upload their own content, which might include pictures of any size or format.

We’re using DigitalOcean Spaces (a cloud storage service, similar to AWS S3) to store user-uploaded images. An external storage service is necessary in containerized applications, but we’re also making use of other features provided by the cloud storage, like an out-of-the-box CDN to distribute static files very efficiently. To learn more about the architecture we’re using, read my other article on that topic.

To be able to serve user-uploaded images to other users, we first want to do some manipulations, like resizing and compressing them, depending on the page/platform requesting them. We also want to store the results back to the cloud storage service, to make use of its features (like the CDN) and to avoid generating the same images again and again.

LiipImagine is powerful: it can connect to cloud storage to retrieve pictures, generate thumbnails, and even store the results back to cloud storage. However, used naively this approach has a huge impact on the website’s performance, and hence on the end user, as it increases response time.

Every time a request involves applying a filter to an image, LiipImagine will check if that filter was already applied to that image, and return the result if that’s the case. Otherwise, it will apply the filter, store the resulting binary, then return the path to it.

Imagine a request that involves generating filters (thumbnails) for a list of 20 articles. That would mean at least 2×20 requests to cloud storage: 20 to check whether each requested filter has already been generated and stored (the default behavior), then 20 more asking it to store the resulting images, each of which must complete before any response is returned to the user. That won’t work in a real-world application.

How did we achieve the requirements?

LiipImagine is powerful and well-engineered; it allows you to override many of its parts and gives you a lot of flexibility.

Based on my understanding of how the bundle works, here is what happens when a request to apply the filter thumb_80x80 to the image /avatar.png is handled:

  • The bundle first calls its image cache resolver, let’s call it $resolver, asking $resolver->isStored($path, $filter). This should return true if this filter has already been applied to the image at $path and we know where the result is stored.
  • If the previous call returns true, the bundle will just return $resolver->resolve($path, $filter) as the path to the cached result (of applying that filter to this image)
  • If $resolver->isStored returned false, the bundle will first apply the filter to the image, then hand the generated binary to $resolver->store($binary, $path, $filter). $resolver should now store this binary and make it accessible using the ($path, $filter) combination. Note that it should usually store it at a deterministic path: for the same $path/$filter combination, we always resolve to the same path.

There are different implementations of the cache resolver; many of them live under the Liip\ImagineBundle\Imagine\Cache\Resolver namespace.

For a local filesystem resolver, implementing isStored is just a matter of checking whether a file exists locally for the requested path/filter combination, whereas a cloud storage implementation might need a network call to figure that out. The same applies to $resolver->store.

Now, in our implementation, we use Redis to cache a lot of information. If you don’t know Redis, it’s an efficient in-memory data store, commonly used as a distributed cache, and you should definitely take a look at it!

Take a look at the following diagram, then read the steps below to understand what we exactly do:

[Diagram: image filter generation flow — a block diagram of the systems involved in image handling]

Step by step walk-through

  • Whenever a request to generate $filter for image $path comes in, we query Redis to see if that filter has been generated before.
  • If the filter has not been generated before, we generate it and add an entry to Redis with two fields (stored => false, content => the binary content of the result). We also publish a message to a queue, to be consumed offline, which says: filter $filter was requested for image $path, and the result is now available in Redis.
  • A Symfony message consumer runs in a separate deployment in the k8s cluster to handle messages of this form. It simply moves the binary content of the result from Redis to cloud storage and sets (stored => true) to mark this image as permanently stored. It also asks Redis to expire the cached binary content after some time, to avoid filling Redis with binary data.
  • From then on, any request to generate $filter for image $path will be handled correctly: the $resolver will find a Redis entry for this combination that says stored => true. Hence, it can safely assume the result is in cloud storage and return the path where it should be stored.
  • Notice that some requests for ($image/$filter) might arrive before the Symfony message consumer has stored the result in cloud storage, most notably the very request in which the image was uploaded. In this case, we serve the image from Redis itself (remember, we stored the binary content there). I’ve created a tiny nodeJS service to handle that, as it’s much faster than a Symfony controller, but you can also create a Symfony controller to read the image from Redis and return it.
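The resolver side of this walk-through can be sketched in a few lines of PHP. This is a minimal illustration, not the bundle’s actual ResolverInterface: the Redis client is assumed to expose phpredis-style hGet/hSet/set calls, and the key format and CDN URL are my own inventions. The binary is parked under a separate, independently expirable key, so the "stored" flag can outlive the cached content.

```php
<?php
// Sketch of a Redis-backed cache resolver (illustrative, not the exact
// LiipImagine ResolverInterface).
class RedisCacheResolver
{
    private $redis;
    private $cdnBaseUrl;

    public function __construct($redis, string $cdnBaseUrl)
    {
        $this->redis = $redis;
        $this->cdnBaseUrl = rtrim($cdnBaseUrl, '/');
    }

    // Deterministic key: the same ($path, $filter) pair always maps here.
    private function key(string $path, string $filter): string
    {
        return sprintf('liip_cache:%s:%s', $filter, $path);
    }

    public function isStored(string $path, string $filter): bool
    {
        // One Redis lookup instead of a round-trip to cloud storage.
        return (bool) $this->redis->hGet($this->key($path, $filter), 'stored');
    }

    public function resolve(string $path, string $filter): string
    {
        // Deterministic public URL on the CDN for this path/filter pair.
        return sprintf('%s/media/cache/%s/%s', $this->cdnBaseUrl, $filter, ltrim($path, '/'));
    }

    public function store($binaryContent, string $path, string $filter): void
    {
        $key = $this->key($path, $filter);
        // Not on cloud storage yet; the async consumer will flip this to true.
        $this->redis->hSet($key, 'stored', 0);
        // Park the binary in Redis so it can be served immediately.
        $this->redis->set($key . ':content', $binaryContent);
        // In the real setup, a queue message would also be dispatched here.
    }
}
```

The deterministic key and URL are what let resolve return instantly, without any call to cloud storage.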

As you can see, the mentioned approach helps us in two different ways:

  • No need to connect to Cloud Storage to check if image/filter already exists or needs to be generated
  • No need to make web-requests block, waiting for the web-server to write resulting images to Cloud Storage.

Of course, Redis is not permanent storage, so it might be worth investing some time in using MySQL or some other persistent system for checking whether a filter was already applied to an image (the $resolver->isStored implementation), but you get the idea :)

Now, we’re done with the theoretical part. If you want to continue to implementation, take a look at the next section.

Some pieces needed to implement this

Looks like there are many components, right? But the implementation is straightforward. If you are building a scalable Symfony website, most likely you have all of those components available already!

  • A Redis cluster (deployment to k8s is straightforward)
  • Overriding the LiipImagine cache resolver with a custom one (more on this soon)
  • Setting up Symfony Messenger, defining a custom message with a custom asynchronous handler, and running a consumer instance. I’m using RabbitMQ as the message broker.
  • Implementing a simple nodeJS image server, to serve images stored in Redis before they are moved to cloud storage.

I don’t want to go too deep into the different implementation steps, but I’m going to mention the most essential parts; here I’m talking about a local setup.

1- Install Redis. The fastest way might be to use a Docker image, and use port-forwarding to make it available to the PHP application

docker run -p6379:6379 --name my-redis -d redis redis-server --appendonly yes

2- Install a Redis client bundle in Symfony. I think you can use this bundle.

3- Custom cache resolver: Assuming you’ve already set up and used the LiipImagine bundle before, let’s override the bundle’s cache resolver. Please follow the instructions on this page. Your cache resolver should look similar to this
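For reference, wiring a custom resolver into the bundle looks roughly like the following services configuration. This is a sketch: the class name, the snc_redis service id, and the %cdn_base_url% parameter are my assumptions, not necessarily what your project uses:

```yaml
# app/config/services.yml (sketch; class and service names are illustrative)
services:
    app.imagine.redis_cache_resolver:
        class: AppBundle\Imagine\RedisCacheResolver
        arguments: ['@snc_redis.default', '%cdn_base_url%']
        tags:
            # Registers the service with LiipImagine under the name "redis_cache"
            - { name: 'liip_imagine.cache.resolver', resolver: 'redis_cache' }

liip_imagine:
    # Use our resolver as the default cache resolver
    cache: redis_cache
```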

4- NodeJS image service: to get something up and running quickly, let’s set up the nodeJS image server and start serving images from it for now.

At this stage, after installing any missing nodeJS modules, running the image server and connecting things together, you should be able to start serving images from this newly created Redis image service!

However, to make the solution work end to end, the most important part is uploading the generated images to persistent (cloud) storage, which requires configuring Symfony Messenger for async message handling.

5- Now, time for Symfony Messenger!

Honestly, I’m not sure I remember all the steps I took to get this component working, especially since I’m using Symfony 3.4, while this component was released for Symfony 4 and later.

Anyway, I think the documentation is good enough to help you understand and configure the Messenger component for whatever Symfony version you have, so I’ll let you try it a bit. However, I’ve provided my main configuration file below; you might find it helpful.

Just keep in mind, we want asynchronous message handling, and for that, we’ll use RabbitMQ as a transport.

Make sure you have the php-amqp extension installed, then install RabbitMQ (as usual, I’d advise running it from a Docker image!)

docker run -d -p 5672:5672 -p 15672:15672 rabbitmq:3-management

Now, you’ll need to do some research on how to configure Symfony Messenger with RabbitMQ as a transport; check my configuration below. Alternatively (and for testing purposes only), you can use synchronous message handling to try things out.

Now, create a message type
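Since the original snippet isn’t reproduced here, a message in this setup can be a plain PHP value object carrying the ($path, $filter) pair; the class name below is my choice, not necessarily the original one:

```php
<?php
// Sketch of the message dispatched when a new filtered image lands in Redis.
// A plain value object: Messenger only needs it to be serializable.
class StoreFilteredImageMessage
{
    private $path;
    private $filter;

    public function __construct(string $path, string $filter)
    {
        $this->path = $path;
        $this->filter = $filter;
    }

    public function getPath(): string
    {
        return $this->path;
    }

    public function getFilter(): string
    {
        return $this->filter;
    }
}
```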

And finally, create a message handler
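A handler along these lines does the move from Redis to cloud storage. Again, this is illustrative: the storage client is abstracted behind an assumed upload() method, the Redis API is phpredis-style, and the TTL is arbitrary. In a real setup the class would also implement Messenger’s handler interface or be tagged as a handler for the message class:

```php
<?php
// Sketch: moves the generated binary from Redis to cloud storage,
// marks it as stored, and lets the cached binary expire.
class StoreFilteredImageHandler
{
    private $redis;
    private $storage; // any client exposing upload($remotePath, $content) (assumption)

    public function __construct($redis, $storage)
    {
        $this->redis = $redis;
        $this->storage = $storage;
    }

    public function __invoke($message)
    {
        $key = sprintf('liip_cache:%s:%s', $message->getFilter(), $message->getPath());

        // 1. Read the binary that the resolver parked in Redis.
        $content = $this->redis->get($key . ':content');
        if ($content === null) {
            return; // already processed, or expired: nothing to do
        }

        // 2. Upload it to cloud storage under the same deterministic path.
        $remotePath = sprintf('media/cache/%s%s', $message->getFilter(), $message->getPath());
        $this->storage->upload($remotePath, $content);

        // 3. Mark as permanently stored, and expire the heavy binary soon.
        $this->redis->hSet($key, 'stored', 1);
        $this->redis->expire($key . ':content', 3600); // assumption: 1h covers in-flight requests
    }
}
```

Keeping the binary under its own key is what allows it to expire independently while the lightweight stored flag lives on.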

As promised above, this is my configuration for Symfony Messenger. It might contain extra pieces that you won’t need, but it’s hard for me to isolate them ;)
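The original file isn’t reproduced here, but a Messenger configuration in this spirit looks roughly like the following. The DSN, transport name, and message class are illustrative, and the exact keys vary between Messenger versions:

```yaml
# app/config/config.yml (sketch; adjust to your Messenger version)
framework:
    messenger:
        transports:
            # RabbitMQ over the php-amqp extension
            amqp_message_receiver: 'amqp://guest:guest@localhost:5672/%2f/messages'
        routing:
            # Route our message to RabbitMQ so it is handled asynchronously
            'AppBundle\Message\StoreFilteredImageMessage': amqp_message_receiver
```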

Now, assuming you’ve configured Messenger and linked it to RabbitMQ, you should be ready to run your message consumer.

From your Symfony root dir, execute the following command

bin/console messenger:consume-messages amqp_message_receiver -vv

This will run a worker to handle all messages coming in on the amqp_message_receiver transport. For the production environment, I have another Kubernetes deployment to run those consumers, and I run them under supervisor.
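For reference, a supervisor program entry for such a consumer might look like this (paths, names, and process count are illustrative):

```ini
; /etc/supervisor/conf.d/messenger-consumer.conf (sketch)
[program:messenger-consume]
command=php /var/www/app/bin/console messenger:consume-messages amqp_message_receiver
process_name=%(program_name)s_%(process_num)02d
numprocs=2
autostart=true
autorestart=true
```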

To conclude: it has been hard to remember every detail, especially some time after implementing this architecture. So if you find that I’ve missed anything, please let me know.

And feel free to suggest any alternatives to what I did or share your own experiences. At the same time, if you have any questions, please don’t hesitate to ask.

Doried Abdallah
Senior Software Engineer and Co-Founder of Oufok. Passionate about web development, architecture, and cutting-edge technologies.