NGINX reverse proxy image resizing + AWS S3

How to set up a dynamic image resizer using NGINX with Amazon S3 as an origin.

Introduction

Many solutions for on-demand image resizing can be found on the internet. A very popular one is a combination of NGINX with the image_filter module, where the requested image is downloaded from its source, resized on the fly and returned to the client.

For our project, we chose an Amazon S3 bucket as the storage location for our images. S3 can hold a very large (practically unlimited) number of images, is quite cheap and keeps our NGINX server stateless.

As our project only serves a limited set of images, in a limited number of resolutions, the image processing overhead can be reduced by adding a cache after the resizer. Using an NGINX proxy_cache, we have set the caching duration of the resized images to 30 days.

Setup from image-source to client

Initial solution

Using NGINX, the conf file looks as follows:

# NGINX will create a cache capable of storing 100MB of keys and 1000MB of data.
proxy_cache_path /tmp/nginx_cache levels=1:2 keys_zone=nginx_cache:100M max_size=1G inactive=40d;
log_subrequest on;

# front facing virtual host for caching
server {
    listen 80;
    server_name localhost;

    location /img/ {
        proxy_pass http://127.0.0.1:10177;
        proxy_cache nginx_cache;
        proxy_cache_key "$proxy_host$uri$is_args$args";
        # proxy_cache_lock on; # Allow only one request
        proxy_cache_valid 30d; # Cache valid images for 30 days.
        expires 30d;
    }
}

# resizing server
server {
    listen 10177;
    server_name localhost2;

    resolver 8.8.8.8; # Use Google for DNS.
    resolver_timeout 60s;

    set $backend 's3.eu-central-1.amazonaws.com/image-bucket';

    proxy_buffering off;
    proxy_http_version 1.1;
    proxy_pass_request_body off; # Not needed by AWS.
    proxy_pass_request_headers off;

    # Clean up the headers going to and from S3.
    proxy_hide_header "x-amz-id-2";
    proxy_hide_header "x-amz-request-id";
    proxy_hide_header "x-amz-storage-class";
    proxy_hide_header "Set-Cookie";
    proxy_ignore_headers "Set-Cookie";

    location ~ ^/img/([0-9]+)x([0-9]+)/(.+) {
        error_page 415 =404 /empty.gif;
        image_filter_buffer 20M;
        image_filter_jpeg_quality 75; # Desired JPG quality
        image_filter_interlace on; # For progressive JPG
        image_filter resize $1 $2;
        proxy_pass http://$backend/$3;
    }
}

Requesting an image like /img/200x400/picture_1.png would:

  1. Download the picture from the S3 bucket on Amazon.
     URL: s3.eu-central-1.amazonaws.com/image-bucket/picture_1.png
  2. Resize the image to 200 x 400.
  3. Cache the resized image for 30 days.
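
Filled in for this example request, the capture groups of the location regex above become $1 = 200, $2 = 400 and $3 = picture_1.png, so the two key directives effectively read as follows (a substituted trace of the configuration above, not extra configuration):

image_filter resize 200 400;
proxy_pass http://s3.eu-central-1.amazonaws.com/image-bucket/picture_1.png;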

Random HTTP 415 errors

During testing, clients occasionally received unexplained HTTP 415 status responses. The error occurred even more often with the caching server disabled, which is logical: without the cache, nothing can be served from it and every request has to be resized again.

This might also explain why few people seem to run into this problem: if a request fails, a retry will most likely succeed, and the 30-day cache then makes sure that that particular image keeps working for a while. Nevertheless, relying on retries is not an ideal solution.

According to the image_filter module documentation, status code 415 (Unsupported Media Type) means that the downloaded data was not recognised as a supported image type.

To investigate this issue, we dug into the NGINX source code (the image filter module in particular). This is the relevant snippet:

static ngx_uint_t
ngx_http_image_test(ngx_http_request_t *r, ngx_chain_t *in)
{
    u_char  *p;

    p = in->buf->pos;

    if (in->buf->last - p < 16) {
        return NGX_HTTP_IMAGE_NONE;
    }
    ...

This function tests whether the data is a supported image type and, to do so, it needs at least 16 bytes in the buffer. The code also shows that if the buffer contains fewer than 16 bytes, the response is simply treated as "not an image".

As the communication with AWS runs over TCP, the response body arrives in chunks, and there is no guarantee that the very first chunk handed to the filter contains at least 16 bytes.

Now that we knew the cause of our problem, a quick search revealed an open issue on the NGINX tracker: https://trac.nginx.org/nginx/ticket/756

The issue has been open for two years now and there is quite some discussion about how to resolve it for all cases, with no real consensus: solving it for every scenario without breaking something else is quite challenging.

Solution

In the meantime, we found a simple workaround that works very nicely for our particular scenario: create a 'caching sandwich'. Placing another NGINX cache between the image resizer and S3 makes sure that the image is completely downloaded before it is resized.

New setup from image-source to client, including second caching server

The conf file then becomes:

# NGINX will create a cache capable of storing 100MB of keys and 1000MB of data.
proxy_cache_path /tmp/nginx_cache levels=1:2 keys_zone=nginx_cache:100M max_size=1G inactive=40d;
log_subrequest on;

# front facing virtual host for caching
server {
    listen 80;
    server_name proxy_server;

    location /img/ {
        proxy_pass http://127.0.0.1:10177;
        proxy_cache nginx_cache;
        proxy_cache_key "$proxy_host$uri$is_args$args";
        # proxy_cache_lock on; # Allow only one request
        proxy_cache_valid 30d; # Cache valid images for 30 days.
        expires 30d;
    }
}

# resizing server
server {
    listen 10177;
    server_name image_resize_server;

    location ~ ^/img/([0-9]+)x([0-9]+)/(.+) {
        error_page 415 =404 /empty.gif;
        image_filter_buffer 20M;
        image_filter_jpeg_quality 75; # Desired JPG quality
        image_filter_interlace on; # For progressive JPG
        image_filter resize $1 $2;
        rewrite ^ $request_uri;
        rewrite ^/img/([0-9]+)x([0-9]+)/(.*) $3 break;
        return 400;
        proxy_pass http://127.0.0.1:10178/img/$uri;
    }
}

# back-end virtual host for retrieving file from AWS
server {
    listen 10178;
    server_name second_proxy_server;

    resolver 8.8.8.8; # Use Google for DNS.
    resolver_timeout 60s;

    set $backend 's3.eu-central-1.amazonaws.com/image-bucket';

    proxy_buffering off;
    proxy_http_version 1.1;
    proxy_pass_request_body off; # Not needed by AWS.
    proxy_pass_request_headers off;

    # Clean up the headers going to and from S3.
    proxy_hide_header "x-amz-id-2";
    proxy_hide_header "x-amz-request-id";
    proxy_hide_header "x-amz-storage-class";
    proxy_hide_header "Set-Cookie";
    proxy_ignore_headers "Set-Cookie";

    proxy_connect_timeout 60;
    proxy_send_timeout 60;
    proxy_read_timeout 60;

    location ~ ^/img/(.+) {
        rewrite ^ $request_uri;
        rewrite ^/img/(.*) $1 break;
        return 400;

        proxy_pass http://$backend/$uri;
        error_page 415 =404 /empty.gif;
        proxy_cache nginx_cache;
        proxy_cache_key "temp_$proxy_host$uri$is_args$args";
        # proxy_cache_lock on; # Allow only one request
        proxy_cache_valid 60s;
        expires 60s;
    }
}
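
A short note on the rewrite pairs in both proxying locations, since proxy_pass is used with a variable there. This is our reading of the back-end location (an annotated repeat of the lines above, not additional configuration; the resizing server uses the same trick with its own pattern):

rewrite ^ $request_uri;          # start over from the raw request URI as sent by the client
rewrite ^/img/(.*) $1 break;     # strip the /img/ prefix; 'break' stops rewrite processing here
return 400;                      # only reached if the URI did not match the pattern above
proxy_pass http://$backend/$uri; # with a variable in proxy_pass, the URI to send upstream is built explicitly from $uri

Working from $request_uri instead of the location captures presumably keeps the original URL escaping intact, since $uri (and captures taken from it) have already been decoded by NGINX.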

Summary

When using NGINX's image resizing capabilities in combination with an AWS S3 bucket (or possibly other storage back-ends), you will likely run into the occasional unexplained 415 error.

Our suggestion is to add a caching server between the resizer and the origin: it has no negative side-effects, can be implemented in minutes and could even boost performance.