Comparison of Image Encoders

Scope
Feb 14, 2020

MozJPEG vs. OpenJPEG vs. the JPEG encoder (JPEG XT?) used in the Netflix article:
https://netflixtechblog.com/avif-for-next-generation-image-coding-b1d75675fe4

MozJPEG @ 40,129 bytes (original encoded JPEG)
MozJPEG (-sample 1x1 -tune-ms-ssim) @ 39,610 bytes
Jpeg2000 (OpenJPEG -s 1,1 -q x) @ 40,282 bytes
Jpeg2000 (OpenJPEG -r 29) @ 40,640 bytes
MozJPEG @ 19,694 bytes
MozJPEG (-sample 1x1 -tune-ms-ssim) @ 19,858 bytes
Jpeg2000 (OpenJPEG -s 1,1 -q x) @ 20,128 bytes
Jpeg2000 (OpenJPEG -r 58.75) @ 20,074 bytes
MozJPEG @ 68,847 bytes
MozJPEG (-sample 1x1 -tune-ms-ssim) @ 80,520 bytes
Jpeg2000 (OpenJPEG -s 1,1 -q x) @ 80,711 bytes
Jpeg2000 (OpenJPEG -r 68) @ 80,607 bytes
MozJPEG @ 81,006 bytes
MozJPEG (-sample 1x1 -tune-ms-ssim) @ 81,538 bytes
Jpeg2000 (OpenJPEG -s 1,1 -q x) @ 81,611 bytes
Jpeg2000 (OpenJPEG -r 67) @ 81,751 bytes
MozJPEG @ 80,945 bytes
MozJPEG (-sample 1x1 -tune-ms-ssim) @ 79,953 bytes
Jpeg2000 (OpenJPEG -s 1,1 -q x) @ 80,583 bytes
Jpeg2000 (OpenJPEG -r 68) @ 80,594 bytes

JPEG XL

JPEG XL is not optimized for very low bpp (bits per pixel). Because it tries to preserve most details and textures rather than filter them out and reconstruct something similar, its artifacts become more noticeable in that range, where AVIF may be visually more acceptable.
But in my own visual tests, starting from about 0.4–0.5 bpp it usually beats most other formats.
There are also tests from other people, for example: https://forum.doom9.org/showpost.php?p=1894341&postcount=167
In particular, the DSSIM, SSIMULACRA, and Butteraugli metrics are much more consistent with human perception of image quality.
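
To relate the bpp figures above to the file sizes quoted in the comparisons, bits per pixel can be computed from the encoded size and the pixel dimensions. This is a minimal sketch, assuming the CLIC images are 2048x1320 pixels as their file names suggest and using the approximate per-image sizes given for the comparison sliders; the function name `bits_per_pixel` is only illustrative.

```python
def bits_per_pixel(file_size_bytes: int, width: int, height: int) -> float:
    """Return the average number of encoded bits spent on each pixel."""
    return file_size_bytes * 8 / (width * height)


# Assumption: dimensions taken from the "2048x1320" prefix of the CLIC file names.
CLIC_WIDTH, CLIC_HEIGHT = 2048, 1320

# Approximate per-image sizes quoted for the comparison sliders in this article.
approx_sizes = {
    "alex-siale-95113": 506_000,
    "fineas-anton-143501": 170_200,
    "tony-webster-97532": 43_250,
    "nitish-kadam-34748": 10_650,
}

for name, size in approx_sizes.items():
    bpp = bits_per_pixel(size, CLIC_WIDTH, CLIC_HEIGHT)
    print(f"{name}: {bpp:.3f} bpp")
```

Under these assumptions, only the largest example sits clearly above the 0.4–0.5 bpp range; the ~170,200-byte example is right at its upper edge and the two smallest are far below it.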

JPEG XL ( -s 8 -q x) @ 80,133 bytes

Image comparison slider

Netflix (internal) boxshots-1 dataset
JPEG 444 (Netflix)
AVIF 444 (Netflix)
Jpeg2000(OpenJPEG -s 1,1)
MozJPEG 444 (-optimize -sample 1x1 -tune-ms-ssim)
MozJPEG (-optimize)
WebP (-m 6 -pass 10 -sharp_yuv)
JPEG XL (cjpegxl --speed=8)
JPEG XL ( -s 8 -q x) @ 39,851 bytes
JPEG XL (-s 8 -q x) @ 85,173 bytes
JPEG XL (-s 8 -q x) @ 20,170 bytes
JPEG XL (-s 8 -q x) @ 76,091 bytes

Additional comparisons

Kodak and CLIC images dataset

AVIF 1.0.0-errata1-avif-252-gb8752448c (aomenc --cpu-used=0 --sharpness=7)
JPEG XL [b3a65719](cjpegxl --speed=8)
kodim19 (original source image from the Kodak dataset)

Image comparison slider (~44740 bytes each image)

kodim01 (original source image from the Kodak dataset)

Image comparison slider (~87777 bytes each image)

https://www.artstation.com/artwork/N5DeOJ
AVIF (aomenc --cpu-used=0 --sharpness=7)
JPEG XL (cjpegxl --speed=8)
WebP (-m 6 -pass 10 -sharp_yuv)

Image comparison slider (~308000 bytes each image)

Woman Face Photo
AVIF (aomenc --cpu-used=0 --sharpness=7)
JPEG XL (cjpegxl --speed=8)
MozJPEG (-optimize)
WebP (-m 6 -pass 10 -sharp_yuv)

Image comparison slider (~200000 bytes each image)

Comparison with HEIC and extreme compression

AVIF 1.0.0-errata1-avif-252-gb8752448c (aomenc --cpu-used=0 --sharpness=7) + (MP4Box -ab avif)
JPEG XL [b3a65719] (cjpegxl --speed=8)
MozJPEG 4.0.0 (-optimize)
HEIC 3.2.1+36-g36fcfc308 (x265 --preset veryslow --deblock -2:-3) + (MP4Box -ab heic)
WebP (-m 6 -pass 10 -sharp_yuv)
HTJ2k GIT (OpenJPH)
CLIC dataset 2048x1320_alex-siale-95113

Image comparison slider (~506000 bytes each image)

With sufficient bitrate, MozJPEG performs quite well, but AVIF loses and blurs some of the image details. HEIC does a little better, and the winner is JPEG XL.

HTJ2K @ 506,676 bytes
WebP @ 506,100 bytes

CLIC dataset 2048x1320_andrew-coelho-46449
CLIC dataset 2048x1320_stefan-kunze-26928

Image comparison slider (~273500 bytes each image)

AVIF and HEIC noticeably lose detail on the water; MozJPEG also has artifacts in the sky.

CLIC dataset 2048x1320_nick-scheerbart-15636

Image comparison slider (~221900 bytes each image)

An example with insufficient bitrate: compared to the other encoders, HEIC distorts and blurs the image much less.

MozJPEG @ 221,651 bytes
MozJPEG (-quant-table 2) @ 221,575 bytes
HTJ2K @ 220,218 bytes

CLIC dataset 2048x1320_fineas-anton-143501

Image comparison slider (~170200 bytes each image)

Although all encoders are far from the original, AVIF has the least unpleasant artifacts, even though it has removed many details.
MozJPEG is already unsuitable at such bitrates, and JPEG XL artifacts also become very noticeable.

MozJpeg @ 172,266 bytes
MozJPEG (-quant-table 2) @ 171,255 bytes
AVIF @ 169,632 bytes
HTJ2K @ 169,798 bytes

CLIC dataset 2048x1320_picseli-6726
MozJpeg @ 47,166 bytes
MozJPEG (-quant-table 2) @ 47,148 bytes
HEIC @ 47,006 bytes
HTJ2K @ 47,122 bytes

CLIC dataset 2048x1320_ray-hennessy-118048
MozJpeg @ 51,655 bytes
MozJPEG (-quant-table 2) @ 51,803 bytes
HEIC @ 50,629 bytes
HTJ2K @ 50,472 bytes

CLIC 2048x1320_tony-webster-97532

Image comparison slider (~43250 bytes each image)

JPEG XL has very noticeable artifacts. AVIF and HEIC produce a visually acceptable image, but with the loss of many details. A MozJPEG vs. HEIC example is below.

MozJpeg @ 43,253 bytes
MozJPEG (-quant-table 2) @ 43,238 bytes
HEIC @ 43,159 bytes
HTJ2K @ 43,201 bytes

CLIC 2048x1320_nitish-kadam-34748

Image comparison slider (~10650 bytes each image)

AVIF looks better than the rest. JPEG XL shows blockiness and reduced resolution. Even the more advanced MozJPEG encoder does not help JPEG here; it looks worse than Netflix's JPEG examples.

AVIF @ 10,393 bytes
MozJpeg @ 10,621 bytes
HTJ2K @ 10,406 bytes

Color banding, 8 and 10-bit AVIF/HEIC, HTJ2K

AVIF (aomenc --cpu-used=0 --sharpness=7)
AVIF 10-bit (--cpu-used=0 --sharpness=7 --bit-depth=10)
HEIC (x265 --preset veryslow --deblock -2:-3)
HEIC 10-bit (--preset veryslow --deblock -2:-3 --output-depth 10)
JPEG XL (cjpegxl --speed=8)
MozJPEG (-optimize)
HTJ2k (OpenJPH)

Color banding is a problem that may appear in images after encoding. One solution is to add noise, but noise requires more bits per pixel for the same quality.
Another solution is to increase the color depth, which usually helps improve quality even at the same encoded file size.
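
As a rough illustration of why a higher color depth reduces banding, the sketch below counts how many distinct code values a smooth, dark gradient occupies at 8 and 10 bits. The gradient range (0.10 to 0.15 of full scale) and the helper name `distinct_levels` are hypothetical, chosen only for illustration; the sketch models quantization alone and ignores anything an actual encoder does.

```python
def distinct_levels(lo: float, hi: float, width: int, bits: int) -> int:
    """Count the distinct integer codes that a linear ramp from lo to hi
    (both in the 0..1 range) occupies after quantization to `bits` bits."""
    max_code = (1 << bits) - 1
    codes = {
        round((lo + (hi - lo) * x / (width - 1)) * max_code)
        for x in range(width)
    }
    return len(codes)


WIDTH = 2048          # same width as the CLIC images used in this article
LO, HI = 0.10, 0.15   # hypothetical narrow brightness range (e.g. a dark sky)

for bits in (8, 10):
    levels = distinct_levels(LO, HI, WIDTH, bits)
    print(f"{bits}-bit: {levels} levels, about {WIDTH / levels:.0f} px per band")
```

With 10 bits the same gradient is split into roughly four times as many steps, so each band is narrower and the jump between neighbouring bands is smaller, which makes the banding harder to see.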

I also added HTJ2K encoded examples to compare the differences in formats based on DWT (Discrete Wavelet Transform) and DCT (Discrete Cosine Transform).

High-throughput JPEG 2000 (HTJ2K) is also known as JPH, JPEG 2000 Part 15, ISO/IEC 15444-15, and ITU-T T.814. Part 15 is intended to be royalty-free.

CLIC 2048x1320_john-cobb-14128
AVIF @ 85,283 bytes
AVIF 10-bit @ 83,619 bytes
HEIC @ 85,436 bytes
HEIC 10-bit @ 83,259 bytes
HTJ2K @ 85,296 bytes
Jpeg XL @ 85,204 bytes
MozJpeg @ 85,667 bytes

CLIC 2048x1320_yulia-vambold-20364
AVIF @ 187,654 bytes
AVIF 10-bit @ 186,993 bytes
HEIC @ 187,236 bytes
HEIC 10-bit @ 187,509 bytes
HTJ2K @ 187,808 bytes
Jpeg XL @ 187,202 bytes
MozJpeg @ 187,652 bytes

AV1 (AVIF) Film Grain Synthesis

AVIF (aomenc --cpu-used=0 --sharpness=7)
AVIF FG (--cpu-used=0 --sharpness=7 --denoise-noise-level=10)
HEIC (x265 --preset veryslow --deblock -2:-3)
JPEG XL (cjpegxl --speed=8)
MozJPEG (-optimize)
HTJ2k (OpenJPH)
CLIC 2048x1320_andrew-neel-178721

Image comparison slider (~230300 bytes each image)

AVIF (AV1) removes small details and noise very aggressively. To get more acceptable quality without increasing the image size, film grain can be synthesized.

AVIF @ 230,253 bytes
AVIF Film Grain Synthesis @ 230,309 bytes
HEIC @ 230,232 bytes
HTJ2K @ 230,353 bytes
Jpeg XL @ 230,319 bytes
MozJpeg @ 229,937 bytes
