The Pichasso image service

OpenHPI
6 min read, Nov 15, 2017

Introduction.

Pichasso is our open source image web service. Given the URL of an image, it resizes and crops the image, compresses it and converts it to a different format. You keep a single version of each image, and the service dynamically generates further versions in all the sizes and formats you need. Webmasters can therefore always embed images in the right size in their HTML while storing only one sufficiently large version on their resource server.

This microservice can easily be integrated into existing projects: just update the existing image URLs so that they go through pichasso like a proxy. Because it is implemented independently of other services, it can be used anywhere. All image settings are passed as GET parameters in the new image URL, which means almost zero configuration inside the service (apart from your preferred defaults).

Motivation.

Webpages have to be optimized for many reasons. An important part is the handling of static resources and assets: on the one hand this reduces the workload in the backend systems, on the other hand it reduces traffic on client devices and lets pages render quickly even on low-bandwidth connections.

Users may switch to a different site just because loading all the resources needed to display the desired content takes too long. Huge images and other block content without a predefined size also load slowly, and content that is already displayed shifts around during rendering once these resources have finished loading.

What motivated us to build this service is simple: it is easier to keep only one version of an image, in good enough quality, on a resource server while pichasso dynamically generates smaller versions in the formats we need to display on our webpages.

Pichasso.

The service is based on the express.js framework, the sharp image library and others. To be processed, resources have to be available at a publicly reachable URI.

current sample original image
screenshot of the currently displayed wide image on mobile devices
cropped and resized image area based on face detection

Pichasso not only helps us compress an image, it is also able to select a region of interest. This is done either by face detection or by an algorithm that looks for the image parts with the highest entropy, which basically means a lot of contrast and edges.

As you can see, it is very simple to get different versions of one image. Besides the defaults you can set in the configuration, there are many options you can set per request. So instead of always using one image and cropping the visible area in the front end, as is done on the openHPI learning platform, we can now deliver the right image in the correct size for every client and device. If the platform used our service, at least the wide version could be improved, as you can see in the screenshots above.

Currently, images are placed as a covering background-image of a div container. The image is simply centered and fills the container. In the sample, the upper and lower regions are cut off by the size of the div container only after the full image has been transferred to the client. Pichasso would instead crop the images more intelligently.

The basic configuration is done by setting specific GET parameters. This image was created using the URI https://pichasso.azureedge.net/image?file=https%3A%2F%2Fopenwho.org%2Ffiles%2Ffea5f7a0-0520-4f1b-83e5-53388417bf97&width=1000&height=300&gravity=faces.
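To make the URL scheme concrete, here is a minimal Python sketch that assembles such a URI. The endpoint and the parameters `file`, `width`, `height` and `gravity` are taken from the example above; the helper name `build_image_url` is our own and not part of pichasso.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URI in this article.
PICHASSO_IMAGE_ENDPOINT = "https://pichasso.azureedge.net/image"


def build_image_url(source_url, **options):
    """Return a pichasso image URL for source_url (hypothetical helper)."""
    # The source image goes into the `file` parameter. urlencode
    # percent-encodes ":" and "/" so the nested URL survives in the query.
    params = {"file": source_url}
    # Leave out options that are not set so the service defaults apply.
    params.update({k: v for k, v in options.items() if v is not None})
    return PICHASSO_IMAGE_ENDPOINT + "?" + urlencode(params)


url = build_image_url(
    "https://openwho.org/files/fea5f7a0-0520-4f1b-83e5-53388417bf97",
    width=1000,
    height=300,
    gravity="faces",
)
print(url)
```

Called with these values, the helper reproduces the URI shown above.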

Further parameters and defaults available on the ‘/image’ route are listed on GitHub. But let’s go through them.

Height, width & region of interest.

You can provide both height and width or just one of them. If you provide only one value, the other is calculated so that the aspect ratio of the image is kept. If you provide both values, you can choose between different gravities and crop methods to define exactly how pichasso should process your image.
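The aspect ratio rule can be illustrated with a few lines of Python. This is only a sketch of the arithmetic: the function name is hypothetical, and the rounding to whole pixels is an assumption, not necessarily what pichasso or sharp does internally.

```python
def target_size(orig_width, orig_height, width=None, height=None):
    """Return (width, height) of the output image (hypothetical helper)."""
    if width is None and height is None:
        # Nothing requested: keep the original dimensions.
        return orig_width, orig_height
    if height is None:
        # Only the width is given: scale the height by the same factor.
        return width, max(1, round(orig_height * width / orig_width))
    if width is None:
        # Only the height is given: scale the width by the same factor.
        return max(1, round(orig_width * height / orig_height)), height
    # Both are given: the image has to be cropped or fitted to this box,
    # which is where gravity and crop method come into play.
    return width, height


print(target_size(2000, 1000, width=500))   # (500, 250)
print(target_size(2000, 1000, height=300))  # (600, 300)
```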

Image format & quality.

By default, the image format is selected based on the formats the client accepts and on whether the image contains transparency. If the client accepts webp, pichasso will generate it. Otherwise it looks for an alpha channel and converts to png if one is present, or to jpeg if not. So png is only used for transparent images, and only if the client does not support webp. Why are we using webp? It usually compresses better than jpeg or png. But if you plan to compress drawings with few colors or sharp edges, png may be the right choice because it supports lossless compression too. For all formats the quality can be adjusted within a predefined range, so you can find the right compression level for your needs.
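The default format decision described here can be written down as a small function. This is a sketch under assumptions: the function name is ours, and the `Accept` header is parsed in a simplified way that ignores quality values, which may differ from the real implementation.

```python
def choose_format(accept_header, has_alpha):
    """Pick the default output format (hypothetical helper)."""
    # Collect the media types from the Accept header, dropping parameters
    # such as ";q=0.8". This is a simplification of real content negotiation.
    accepted = {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    if "image/webp" in accepted:
        return "webp"
    # Without webp support, transparency decides between png and jpeg.
    return "png" if has_alpha else "jpeg"


print(choose_format("image/webp,image/apng,*/*;q=0.8", has_alpha=True))  # webp
print(choose_format("image/png,*/*;q=0.5", has_alpha=True))              # png
print(choose_format("image/png,*/*;q=0.5", has_alpha=False))             # jpeg
```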

Media queries and the picture tag.

But how do you place images for different devices into your HTML code? The method we recommend is the picture tag. With it, you can offer a heavily compressed image for mobile devices, a default image, and a high-resolution image for retina displays. The browser will load only the one version it displays from your server.

<picture>
  <source srcset="img_small_very_compressed.jpg"
          media="(max-width: 400px)">
  <!-- small image for mobile devices -->
  <source srcset="img_default.jpg,
                  img_high_resolution.jpg 2x">
  <!-- default image -->
  <img src="img_default.jpg" alt="Description">
  <!-- Fallback for old browsers -->
</picture>

In the code sample above, the selected image changes depending on the width of your browser window, and only this one image is loaded.

Something you may not know: images hidden with display:none are still loaded.

Caching.

As mentioned above, caching static data on the client reduces the load on the server side. Because images normally do not change, they can be held in the browser cache and there is no need to reload them on every page load.

To reduce server load and deliver assets faster for web applications that are requested worldwide, a content delivery network can help a lot. Read our article on why page speed matters.

Cache-Control: public, max-age=2629000000
Expires: Wed, 08 Nov 2017 18:08:55 GMT

The `expires` and `cache-control` headers of the response define after what time the browser should reload the resource. The max-age value in the cache-control header is a lifetime in seconds, counted from the time of the response, while the expires header contains a fixed date as a string.

But keep in mind that the size of a browser cache is limited. If you browse other webpages that cache their own content, older files may be deleted locally. For revisits of pages on the same server, however, browser caching helps to reduce traffic.

There are other mechanisms like the `etag` header. But we do not recommend it for images, because the decision whether to reload a file is made through a server request. Even though the response only tells the browser to keep using the cached file, a request to the server is always required. This is normally unnecessary for static images that will not change.

If you request the same image from pichasso multiple times, pichasso will not compute it again. Instead it caches the generated files internally and can deliver them faster from the second request onwards.
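The relation between the two headers can be shown with a short Python sketch that derives both from one lifetime in seconds. The function name and the 30-day lifetime are assumptions chosen for illustration; they are not taken from pichasso's configuration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def caching_headers(max_age_seconds, now=None):
    """Build matching Cache-Control and Expires headers (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    # Expires is the absolute point in time at which max-age runs out.
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # HTTP dates are always expressed in GMT.
        "Expires": format_datetime(expires, usegmt=True),
    }


# 30 days, expressed in seconds (an assumed lifetime for this example).
THIRTY_DAYS = 30 * 24 * 60 * 60
headers = caching_headers(
    THIRTY_DAYS,
    now=datetime(2017, 10, 9, 18, 8, 55, tzinfo=timezone.utc),
)
for name, value in headers.items():
    print(f"{name}: {value}")
```

Deriving both values from the same number keeps the two headers consistent with each other.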
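The idea of such an internal cache can be sketched as follows. This is not pichasso's actual implementation, which is not described here: the in-memory dictionary, the key derivation and all names are assumptions for illustration only.

```python
import hashlib

# Hypothetical in-memory store: cache key -> generated image bytes.
_generated = {}


def cache_key(params):
    """Derive a stable key from the request parameters."""
    # Sorting makes the key independent of the parameter order in the URL.
    canonical = "&".join(f"{name}={params[name]}" for name in sorted(params))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def get_image(params, render):
    """Return the cached result, calling render(params) only on a miss."""
    key = cache_key(params)
    if key not in _generated:
        _generated[key] = render(params)
    return _generated[key]


calls = []


def fake_render(params):
    # Stand-in for the expensive download and conversion step.
    calls.append(params)
    return b"image-bytes"


first = get_image({"file": "https://example.org/a.png", "width": 300}, fake_render)
second = get_image({"width": 300, "file": "https://example.org/a.png"}, fake_render)
print(len(calls))  # 1, the second request is served from the cache
```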

PDF conversion.

Pichasso does not only handle images. If the system running pichasso has ghostscript installed, it is able to compress pdf files too. If pichasso runs in our prepared Docker environment, this works by default. We have defined two presets. Both optimize the embedded images using jpeg compression, but one uses 72dpi and the other 300dpi. The compressed pdfs still look good on screen or can be used for printing, while the file size decreases a lot.

To compress pdf files, use the `/pdf` route and, as with images, define the source file using the file parameter. The quality currently comes in two steps, screen and printer, and you can choose between viewing and downloading the file by setting the download parameter to a value that is interpreted as true.

We plan to support pdf to image conversion so that pichasso can also be used as a thumbnail service for pdf pages. The current API is also listed on GitHub.
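A request to the pdf route could be assembled like this. Treat it as a sketch: the route `/pdf`, the `file` and `download` parameters and the values screen and printer come from the description above, but the parameter name `quality`, the value `1` for download, the helper name and the base URL `https://pichasso.example.org` are assumptions. Check the API listing on GitHub for the exact names.

```python
from urllib.parse import urlencode


def build_pdf_url(base_url, source_url, quality=None, download=False):
    """Return a pichasso pdf URL (hypothetical helper, assumed parameter names)."""
    params = {"file": source_url}
    if quality is not None:
        if quality not in ("screen", "printer"):
            raise ValueError("quality must be 'screen' or 'printer'")
        # Assumption: the preset is passed in a parameter called `quality`.
        params["quality"] = quality
    if download:
        # Assumption: "1" is one of the values interpreted as true.
        params["download"] = "1"
    return base_url.rstrip("/") + "/pdf?" + urlencode(params)


pdf_url = build_pdf_url(
    "https://pichasso.example.org",  # placeholder for your own deployment
    "https://example.org/doc.pdf",
    quality="screen",
    download=True,
)
print(pdf_url)
```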
