How can you deliver a variety of images in different formats on a high-traffic site in real time?

Jean-Michel Bouvier
Peaksys Engineering
9 min read · May 12, 2022

Images are essential to an e-commerce website: they represent 80% of requests and 60% of the data transferred. They are at the core of the browsing experience: if an image platform is unavailable, slow or of poor quality, it has immediate consequences on the site’s sales activity. Conversion drops, orders are impacted, and turnover falls.

Images also play a primary role in SEO. Google prioritises faster websites in its results: they offer a good user experience, so they are promoted to a higher ranking in the results pages. The appearance of Core Web Vitals in May 2020, especially the weight of the Largest Contentful Paint (LCP), demonstrated the importance of quickly delivering images, which in most cases make up the biggest piece of content on a webpage. Delivering images quickly helps optimise SEO!

For Cdiscount, this means applying these key principles of availability and speed to more than 1.4 billion images!

Images come in different “flavours”

Different sizes and formats to be delivered quickly

Images must be distributed quickly to the end user, so reducing the bandwidth needed to carry them is a priority. Image sizes must therefore be optimised for the device that will display them, so that they are no larger than necessary. A happy medium must be found to offer the most comfortable browsing experience: the image must not be pixelated, but its quality must not be higher than the display needs, either. Several resolutions are available depending on the type of device (mobile or desktop) and on the display context, such as small thumbnails, the standard display on a product page, or a zoomed-in image. Each image at Cdiscount can be converted into seven possible resolutions.

Then, the image compression format needs to be optimised to further reduce its size and speed up transport over the network. Browser diversity and the differences in supported formats contribute to increasing the number of formats. Cdiscount has chosen two formats: JPEG for broad compatibility and WebP for the best compression performance on modern browsers.

Each image is therefore available in 2 formats × 7 sizes, or 14 flavours.

Example of the same image in different sizes on the site

A large volume of images

The website www.cdiscount.com offers a wide variety of images, up to 10 visuals per product.

Cdiscount offers 100 million active products; our stock of adapted images amounts to nearly 1.4 billion!

14 flavours multiplied by 1.4 billion images brings us to several terabytes of storage, with significant hardware costs. Add to that several million new image files per day, and streamlining storage is more essential than ever.

A Media Delivery Platform for real time delivery

We chose to build a dedicated Media Delivery Platform (MDP) to meet the challenges of faster distribution, a growing catalogue and SEO performance.

There are several ways to look at managing a variety of image formats:

  1. The most basic way is to systematically convert the images to the various formats and sizes as they are imported.
    This model can be a good choice if the permutations are limited, and response time is a big constraint. Storage costs are high.
  2. It is possible to carry out an adaptation “on demand” and store the result for future use. This solution allows you to process only the various formats actually used for each image.
    This solution is relevant when there is a great number of images and formats, especially if only a small number of permutations is actually used. Processing time is greater, however, and some clients receive images that are not optimised to limit LCP on client side.
  3. The last solution is a real-time conversion of the original image in order to adapt it to the requested format, and to only keep the result in memory or in cache, not in long-term storage for cost efficiency. While real-time image conversion offers flexibility, it also involves processing for each new query. When the same image is requested multiple times, it is costly in processing resources and penalises response time.
    This solution is relevant when there is a great number of images and formats, and issues of speed and data freshness are significant. It also allows a new image format or resolution to be introduced with more flexibility than the two previous solutions by processing images in high demand very quickly at a certain moment, and finishing the conversion of the rest of the catalogue as it is used. Processing demand is high with this solution if the cache is not managed with precision.

We chose the last solution. We should note that this solution comes with several constraints:

  • Conversion time must be compatible with response-time constraints: at most 200 ms per conversion at the 95th percentile, to keep a fast LCP on the client side.
  • Using the cache is highly recommended to reduce the number of CPUs needed to compute images, for cost efficiency.

So, we have combined real-time conversion with a cache whose Time-to-Live (TTL) is set to several days.
This allows us to act faster when changing formats and to have better control over our storage.
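
As a rough illustration of this combination, the Python sketch below puts a small TTL cache in front of an on-demand conversion. The `TTLCache` class, the `compute` callback and the injectable clock are hypothetical names chosen for the example; they are not the MDP's actual components, and the exact TTL value is an assumption.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the sketch can be exercised without sleeping
        self._entries: Dict[str, Tuple[float, bytes]] = {}  # key -> (expiry, value)

    def get_or_compute(self, key: str, compute: Callable[[], bytes]) -> bytes:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no conversion needed
        value = compute()  # miss or expired entry: convert in real time
        self._entries[key] = (now + self.ttl, value)
        return value


# "Several days" expressed in seconds; three days is an arbitrary example value.
image_cache = TTLCache(ttl_seconds=3 * 24 * 3600)
```

With such a cache, only the first request for a given flavour (or the first one after expiry) pays the conversion cost; every other request within the TTL is served from memory.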

Implementation of our Media Delivery Platform

The website Cdiscount.com has been developed to have a global availability rate of over 99.99%, i.e., less than one hour of downtime per year. Consequently, the MDP supplying the site’s images must have an even higher availability rate, nearly 99.9999%, to keep up with the global SLO.
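
To make these availability figures concrete, here is a small worked calculation in Python. It simply converts an availability percentage into a yearly downtime budget; the function name is chosen for this example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def yearly_downtime_minutes(availability_percent: float) -> float:
    """Return the downtime budget, in minutes per year, for a given availability."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


for target in (99.99, 99.9999):
    print(f"{target}% -> {yearly_downtime_minutes(target):.2f} minutes per year")
```

An availability of 99.99% leaves about 52.6 minutes of downtime per year, which matches the "less than one hour" above, while 99.9999% leaves only about 32 seconds.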

Real time with open-source Thumbor

Thumbor is an open-source real-time image conversion API used by many of the biggest names on the web. Thumbor is a cornerstone of image conversion, but the MDP handles other media in addition to images and provides a variety of features, such as PDF storage.

This tool, developed in Python, can be containerised and is easy to deploy. Thumbor has many image-processing filters and many extension points, which makes it very flexible to use.

To use Thumbor, you simply build a request URL that contains all the information needed to locate the source image and convert it to the desired result.

For example: /unsafe/fit-in/300x300/filters:format(webp)/pictures/pic8321925629414.jpg

Meaning:

  • fit-in/300x300: the image must be adjusted to fit into a 300x300 square
  • filters:format(webp): the image must be in webp format
  • pictures/pic8321925629414.jpg: the last part of the path identifies the image
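
The URL structure described above can be assembled with a small helper. This is only a sketch: the function name and parameters are hypothetical, and it only covers the `fit-in` and `format` options shown in the example. The `/unsafe/` prefix means the URL is not signed; Thumbor also supports signed URLs, which are not shown here.

```python
def thumbor_unsafe_path(image_path: str, width: int, height: int,
                        image_format: str) -> str:
    """Build an unsigned ("unsafe") Thumbor path that fits the image in a box."""
    return (
        f"/unsafe/fit-in/{width}x{height}"
        f"/filters:format({image_format})/{image_path}"
    )


print(thumbor_unsafe_path("pictures/pic8321925629414.jpg", 300, 300, "webp"))
# /unsafe/fit-in/300x300/filters:format(webp)/pictures/pic8321925629414.jpg
```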

Integration into the IT system

Our MDP orchestrates the following tasks when it receives a request for an image that is not in the cache:

  • Analyse the path of the request to:
    * Identify the source media
    * Identify the conversion actions that need to be performed (if it is an image, what size is being requested, for example)
  • Collect the resource:
    * Images: build the request and call Thumbor
    * Other media: access the resource
  • Set up a mediation strategy if the resource is not found:
    * Fallback: search in various image banks
    * NoVisuel: the image returned when the requested image is not found
  • Formalise the response:
    * Manage response codes
    * Manage caches
Media Rendering Module architecture overview
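
The orchestration steps listed above can be sketched in Python as follows. Everything here is an assumption made for illustration: the URL grammar (`/image/<size>/<resource>`), the `MediaRequest` type, the fetcher callbacks, the status codes and the cache durations are hypothetical and do not describe the MDP's real interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

NO_VISUAL = b"no-visual-placeholder"  # stand-in for the NoVisuel image


@dataclass
class MediaRequest:
    kind: str            # "image" or another media type
    resource: str        # path identifying the source media
    size: Optional[str]  # e.g. "300x300"; only meaningful for images


def parse_path(path: str) -> MediaRequest:
    """Parse a hypothetical grammar: /image/<size>/<resource> or /<kind>/<resource>."""
    parts = path.strip("/").split("/")
    if parts[0] == "image":
        return MediaRequest("image", "/".join(parts[2:]), parts[1])
    return MediaRequest(parts[0], "/".join(parts[1:]), None)


# A fetcher returns the media bytes, or None when the resource is not found.
Fetcher = Callable[[MediaRequest], Optional[bytes]]


def handle(path: str, primary: Fetcher,
           fallbacks: List[Fetcher]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status code, headers, body) for a request that missed the cache."""
    request = parse_path(path)
    # Try the main source first (e.g. a call to Thumbor), then the fallback banks.
    for fetch in [primary, *fallbacks]:
        body = fetch(request)
        if body is not None:
            return 200, {"Cache-Control": "max-age=864000"}, body  # 10 days
    if request.kind == "image":
        # Nothing found anywhere: answer with the placeholder, cached briefly.
        return 404, {"Cache-Control": "max-age=60"}, NO_VISUAL
    return 404, {"Cache-Control": "no-store"}, b""
```

Keeping the fetchers as plain callbacks makes the fallback chain easy to reorder or extend without touching the response handling.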

Remember: How URLs are defined is important for a website because it can have repercussions for its visibility. Image URLs are not exempt from this rule. They are therefore specific to the site and can incorporate various elements in order to add meaning and context:

  • SEO aspects
  • Grammar related to conversion — size, for example
  • Information about the resource’s localisation (language)
  • Etc.

Performance with cache

To reduce the processing impact of multiple requests for the same image, we have implemented a cache in the Baleen CDN (https://baleen.cloud/). This helps to drastically reduce the number of requests the MDP actually receives. In our case, we have a cache hit ratio of around 90%.

A lesser-known aspect of variability is related to the image formats supported by browsers and by the bots that consume the images. This variability is not expressed in the URL but in the request metadata (the Accept header). Thus, the resources have the same path, which makes cache management more delicate.

While it is preferable to respond with the most compact image possible to enjoy the performance gains related to the download time, above all it is necessary to respond in a format that the client supports.

To handle this problem, we used the HTTP protocol’s Vary header. The idea is as follows:

  • In its response, a content production server specifies the client request headers that can cause its response to vary.
  • This allows any cache systems between the server and client to correctly manage the variants in future queries.
  • The cache key is therefore not an absolute resource ID, but it points to a list of objects.
  • The resource is selected according to the Vary instructions given by the server in its previous responses.
  • If no cached variant matches the request, the content production server is queried to obtain the right resource variant (return to the first point), and that variant is then cached.
How Vary header management works
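
The mechanism described above can be sketched as a toy cache in Python. This is a simplified model, not how Baleen or any real CDN is implemented: the `VaryCache` class and the `origin` callback are hypothetical, and header names are assumed to be lowercase in the request dictionary.

```python
from typing import Callable, Dict, List, Tuple

Headers = Dict[str, str]
Origin = Callable[[str, Headers], Tuple[Headers, bytes]]


class VaryCache:
    """Minimal cache that stores several variants of a resource under one URL."""

    def __init__(self, origin: Origin) -> None:
        self.origin = origin
        self.vary_names: Dict[str, List[str]] = {}  # URL -> varying header names
        self.variants: Dict[str, List[Tuple[Headers, bytes]]] = {}

    def get(self, url: str, request_headers: Headers) -> bytes:
        names = self.vary_names.get(url)
        if names is not None:
            wanted = {name: request_headers.get(name, "") for name in names}
            for stored, body in self.variants[url]:
                if stored == wanted:
                    return body  # a cached variant matches this request
        # No matching variant: ask the origin, then remember what it varies on.
        response_headers, body = self.origin(url, request_headers)
        vary = response_headers.get("vary", "")
        names = [n.strip().lower() for n in vary.split(",") if n.strip()]
        self.vary_names[url] = names
        wanted = {name: request_headers.get(name, "") for name in names}
        self.variants.setdefault(url, []).append((wanted, body))
        return body

    def purge(self, url: str) -> None:
        """Drop every variant of a URL at once, since they share one cache key."""
        self.vary_names.pop(url, None)
        self.variants.pop(url, None)
```

Because all variants hang off the same URL key, a single `purge` call removes them together.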

Vary must be used carefully, because this solution is not appropriate if there is a large number of variants:

  • The number of variants that are actually possible must be limited
  • It is essential to standardise the headers used in the Vary clauses in order to maintain control over the number of variants

If these conditions are not respected, the cache will not be as efficient.
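
One possible way to standardise the header, given the two formats mentioned earlier (JPEG and WebP), is to collapse every incoming Accept value into one of two canonical values before it reaches the cache. The function below is a hypothetical sketch of that idea; it deliberately ignores quality values (`q=`) and is not a full Accept header parser.

```python
def normalise_accept(accept_header: str) -> str:
    """Collapse a raw Accept header into one of two values: image/webp or image/jpeg."""
    media_types = [
        part.split(";")[0].strip().lower()  # drop parameters such as ";q=0.8"
        for part in accept_header.split(",")
    ]
    return "image/webp" if "image/webp" in media_types else "image/jpeg"


print(normalise_accept("image/avif,image/webp,image/apng,*/*;q=0.8"))  # image/webp
print(normalise_accept("image/png,*/*;q=0.5"))                         # image/jpeg
```

With this normalisation, the cache holds at most two variants per URL regardless of how many distinct Accept strings browsers and bots send.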

Using Vary has the advantage of grouping all the variant resources under the same cache key, which makes them simpler to invalidate.

The refresh problem

In the world of the web, using a cache typically comes with the invalidation problem. The longer the cache duration, the more significant the problem becomes.

To reduce image conversion processing, we tend to set up cache durations of around 10 days. This avoids querying the MDP for things like conversions or storage access for these ten days, but it also prevents a resource modification from being visible for this period.

This latency is not compatible with the constraints of e-commerce. To solve this problem, we have implemented a flush management module whose role is to invalidate URLs whose source image has changed.

Flush management overview
Flush management overview

The module uses MongoDB’s data expiry capabilities.

  • The image URLs that the MDP returns are stored in the database for the same duration as the cache instruction.
  • When the image is reloaded (different PoPs of Baleen’s CDN, a cache server restart, eviction of an object due to memory pressure, etc.), the URL expiry date is pushed back accordingly.
  • Soon after the image has expired in Baleen’s cache, the data is erased from the database.

If the source image is modified, all the related URLs that are still alive in the cache are retrieved from the database. The module then queries the Baleen CDN API to invalidate them.

This system makes the MDP’s resource changes visible quickly while taking advantage of a long-term cache.
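
The flush logic can be modelled with an in-memory stand-in. In the sketch below, a dictionary of expiry timestamps plays the role of the database (in MongoDB, expiry of this kind is typically handled by a TTL index), and the `purge` callback stands in for the call to the CDN API. All names are hypothetical and chosen for this example.

```python
import time
from typing import Callable, Dict, List


class FlushRegistry:
    """In-memory stand-in for the database of URLs still alive in the CDN cache."""

    def __init__(self, ttl_seconds: float, purge: Callable[[str], None],
                 clock: Callable[[], float] = time.time) -> None:
        self.ttl = ttl_seconds
        self.purge = purge  # hypothetical callback standing in for the CDN purge API
        self.clock = clock
        self._expiry: Dict[str, Dict[str, float]] = {}  # source -> {url: expiry}

    def record_served(self, source: str, url: str) -> None:
        """Called whenever the MDP serves a URL; pushes its expiry date back."""
        self._expiry.setdefault(source, {})[url] = self.clock() + self.ttl

    def source_changed(self, source: str) -> List[str]:
        """Purge every URL derived from `source` that may still be cached."""
        now = self.clock()
        urls = self._expiry.pop(source, {})
        alive = sorted(url for url, expiry in urls.items() if expiry > now)
        for url in alive:
            self.purge(url)
        return alive
```

A URL whose entry has already expired is skipped, because the CDN no longer holds it and there is nothing to invalidate.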

Traffic vs performance over the component stack

Encouraging results

The MDP overhaul is a long-term project that has shown good results in various aspects.

  • Performance:
    * Gain of 60% on images’ TTFB (100 ms -> 40 ms)
  • Green IT:
    * A gain of 33% on size with WebP
    * Gain of 180 CPUs on the Media Delivery component (200 -> 20)
  • Load:
    * Peak at 4,000 requests/sec
  • Availability:
    * 100% in 2021
  • Improved consistency thanks to the flush module

Users and SEO are the big winners

Managing real-time image conversions combined with a cache helped to optimise image rendering performance and also contributed to making Cdiscount.com a Fast site in Google’s CrUX (Chrome User Experience Report) performance indicators. Users and our SEO are the big winners from the creation of our MDP.

These successes allow us to look to the future with confidence and open up new possibilities. They allow us to work on new needs, such as new sizes and formats, and to ready ourselves to tackle new challenges for Cdiscount and Octopia: new clients, new geographical areas, new URL formats, and new growth.


Jean-Michel Bouvier
Peaksys Engineering

High availability and high performance systems enthusiast. Platform Technical Leader @ Cdiscount