X-Trans: The Promise and the Problem
FujiFilm’s X-Trans III sensor has been out since the X-Pro2 hit the scene in March 2016. It was joined recently by the X-T2, and the X-T20 and X100F are about to be unleashed upon the world in February 2017. I’ve spent some time with both the X-Pro2 and X-T2, as well as every generation of X-Trans sensor package going back to the X-Pro1. In the process of using these cameras, I’ve become intimate with the X-Trans CFA and the problems it presents.
With this new wave of cameras on the verge of release, I’ve decided to share some of what I’ve discovered.
X-Trans is, for the purpose of this topic, FujiFilm’s alternative color filter array for CMOS sensors.
FujiFilm claims that:
“Moiré is tackled at its root cause by the revolutionary X-Trans CMOS sensor’s colour filter array. By enhancing aperiodicity (randomness) in the array arrangement, the colour filter minimizes generation of both moiré and false colours, eliminating the need for an optical low-pass filter in the lens[sic] and enabling [the] X-Trans CMOS sensor to capture full “unfiltered” lens performance.”
FujiFilm has also claimed that X-Trans provides a more “film like” image than the much more common Bayer CFA and can produce as good or better IQ than a full frame Bayer sensor (with AA filter). All of these statements are… shall we say redolent of a certain barnyard aroma? What X-Trans really does is trade a bit of chrominance resolution for a bit of luminance resolution (which you may love if you shoot Black & White) and make the demosaicking process (the interpolation of the RAW sensor output into an RGB image) far more complicated. There’s nothing random about the X-Trans CFA either; it’s an ordered pattern that repeats on a slightly larger scale than Bayer (6x6 vs 2x2). And most manufacturers have recently ditched the AA filters on their Bayer sensors as resolution has increased anyway, removing the original point of distinction. All of this applies whether you shoot RAW or JPEG, but Fuji’s JPEG engine has some particular shortcomings. Hey, that’s marketing for you. Now let’s try to ignore that smell on our boots and move on.
X-Trans JPEGs are “Waxy” at High ISOs
FujiFilm has been praised by many for their “color science,” that is, the reproduction of colors and tones in the straight-out-of-camera (SOOC) JPEGs through what FujiFilm calls Film Simulations. For this reason, and for the generally high quality of the in-camera demosaicking vs. some infamously half-hearted attempts by Adobe et al., many photographers have announced that they prefer to eschew RAW processing and just deliver the SOOC JPEG images instead.
However, many who have attempted this have run into a significant obstacle, commonly known as the Waxy Skin-Tone problem. Much has been said about this issue elsewhere online, as the problem has existed at least since the introduction of the X-Pro1 in 2012.
I’m going to attempt to offer some evidence and insight into the issue as it applies to the current generation of cameras, and, in conclusion, ask Fuji what they plan to do about it.
Breaking it down
I mentioned earlier that the X-Trans CFA strikes a different compromise between luma and chroma detail (and noise) than Bayer. As you can see from the above figure, the X-Trans array has only 88.89% of the chroma resolution (red and blue photosites) of the Bayer array: 16 of every 36 photosites are red or blue, where Bayer would have 18. The difference is traded for green photosites for increased luma resolution, and the red and blue photosites that remain are separated by larger gaps. This is certainly part of the problem, but it’s not the whole story. The demosaicking process is an interpolation process, whereby the colors missing from the sensor data are arrived at by an algorithm’s educated guess. With Bayer, and especially in the presence of an optical low-pass (AA) filter, the uncertainty of the red or blue value of a green pixel in the CFA has a specific limit. With X-Trans, this uncertainty is higher in part because the distance between same-colored pixels in certain directions is greater, up to 6 pixels (this is important for interpolating across gradients)! In other words, at small scales the color of individual pixels in in-focus areas of the final image is more guess and less fact. This kind of uncertainty (with both Bayer and X-Trans) results in a type of artifact known as “false color” at the output of the demosaicking algorithm with certain subject matter. False colors are the symptom of incorrect guesses based on the limited information contained in the raw image samples. There are various techniques for mitigating this artifact, and new methods are being researched all the time. False color suppression and chroma noise reduction can, in some implementations, be treated with the same processing step, as part of the demosaicking.
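If you’d like to check the 88.89% and 6-pixel figures yourself, here is a minimal Python sketch. The 6x6 tile below is one commonly published orientation of the X-Trans layout and is my assumption here (the starting corner varies by camera, which does not change the counts); the function names are made up for illustration.

```python
from collections import Counter

# One repeating tile of each colour filter array, written row by row.
BAYER = ["RG",
         "GB"]
# Assumed orientation of the 6x6 X-Trans tile (8 red, 8 blue, 20 green).
XTRANS = ["GBGGRG",
          "RGRBGB",
          "GBGGRG",
          "GRGGBG",
          "BGBRGR",
          "GRGGBG"]


def chroma_fraction(tile):
    """Fraction of photosites in the tile that are red or blue."""
    counts = Counter("".join(tile))
    return (counts["R"] + counts["B"]) / sum(counts.values())


def max_gap(tile, colour):
    """Largest step between consecutive same-coloured sites along any row
    or column that contains that colour, treating the tile as periodic."""
    size = len(tile)
    columns = ["".join(row[c] for row in tile) for c in range(size)]
    worst = 0
    for line in list(tile) + columns:
        positions = [i for i, ch in enumerate(line) if ch == colour]
        if not positions:
            continue  # e.g. Bayer rows that hold no red at all
        # The tile repeats, so the first site reappears one period later.
        wrapped = positions + [positions[0] + size]
        worst = max(worst, max(b - a for a, b in zip(wrapped, wrapped[1:])))
    return worst


ratio = chroma_fraction(XTRANS) / chroma_fraction(BAYER)
print(f"X-Trans chroma sites relative to Bayer: {ratio:.2%}")
print("Largest red-to-red step, X-Trans:", max_gap(XTRANS, "R"))
print("Largest red-to-red step, Bayer:", max_gap(BAYER, "R"))
```

Note that this only counts photosites and spacing along rows and columns. It says nothing about how well any particular demosaicking algorithm copes with those gaps.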
It’s anyone’s guess what algorithms FujiFilm’s cameras use to demosaic and denoise images, but FujiFilm’s implementation performs similarly enough to known and documented algorithms that it’s not necessary to know all of the details to understand where the problem lies.
Shooting at higher ISO reduces the signal to noise ratio of the image and exacerbates the problem of false colors (you may have noticed this before as colored blotches in high ISO images). In order to mitigate this, FujiFilm cameras ramp up the chroma denoising along with ISO, as do cameras from other brands. But, as mentioned earlier, with the X-Trans CFA, false colors are more of a problem than they are with Bayer, and stronger filtering is required in order to smooth them away. The problem of waxiness arises because Fuji decided to use much stronger chroma NR than is strictly necessary to suppress false colors in the general case. The result is that colors bleed together (especially red/blue hues). The effective color resolution is lower than it should be, even taking that 88.89% figure into account. Teeth and eyes become the color of the surrounding skin. Rosy cheeks appear wan and corpse-like, and, generally speaking, people are rendered cartoonishly.
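To make the bleeding concrete, here is a toy one-dimensional sketch in Python. It is emphatically not Fuji’s algorithm (which is unknown); a plain box blur stands in for “chroma NR”, and the sample values are invented: skin with a chroma of 40 surrounding a 3-pixel-wide neutral feature, such as the white of an eye seen from a distance.

```python
def box_blur_1d(values, radius):
    """Mean filter with edge clamping; a crude stand-in for chroma NR."""
    n = len(values)
    out = []
    for i in range(n):
        window = [values[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


# Hypothetical chroma samples along one row of the image:
# skin = 40, a 3-pixel-wide neutral (white) feature = 0.
row = [40] * 8 + [0] * 3 + [40] * 8
centre = len(row) // 2  # middle pixel of the neutral feature

mild = box_blur_1d(row, 1)[centre]    # 3-pixel window fits inside the feature
strong = box_blur_1d(row, 4)[centre]  # 9-pixel window is mostly skin

print("chroma at feature centre, mild filter:", mild)
print("chroma at feature centre, strong filter:", round(strong, 1))
```

With the narrow window the feature stays neutral. With the wide one it ends up carrying two thirds of the skin’s chroma, which is exactly the skin-colored sclera and teeth described above. Larger features, as in a tight headshot, survive because the window never extends far enough beyond them.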
Contrary to popular belief, the “NR” setting in the camera’s menus does not significantly affect this chroma smoothing; it only impacts luminance noise reduction.
Some viewers are reported to actually prefer this effect, but technically it is an objective and measurable flaw — the rendered image is no longer representative of the scene (known as the “ground truth” in academia).
The issue becomes a significant obstacle because FujiFilm cameras give the user no control in the matter. There is no “High ISO NR” menu setting like cameras from other manufacturers have. There is no separate chroma NR setting. You either live with the reduced quality of the JPEGs or you don’t.
The alternative is to shoot RAW. Which is fine, but demosaicking X-Trans files is less efficient than demosaicking Bayer files, and, as anyone who has tried it knows, this translates to a much slower workflow. You also lose all of that “color science,” because, sadly (shamefully, in my opinion), FujiFilm does not publish color profiles for their sensors nor embed the color matrices in the RAW files as some other manufacturers do. And if you want to use the camera’s WiFi feature to transfer the images to your smartphone and process/post them on the go, you can only do that with the JPEGs.
Because the problem is one of color resolution, you are unlikely to notice the Waxy Skin-Tones problem in headshots or the like without involving high ISOs. In those cases there are hundreds of pixels whose common values overwhelm the noise and allow the color of the ground truth to show in the final image.
The problem appears in fine detail, and is particularly noticeable in human faces at a distance, but also in headshots in the capillaries in the eye or the color of fine hair (where it differs from skin tone).
I have come across several real-world instances of this problem and have decided to present these rather than some kind of laboratory setup to illustrate that it is indeed a problem encountered in practice and to make it perfectly clear what information — what objective quality — Fuji’s JPEG engine tosses out as if it were noise. Whether you care about this discarded information or not is up to you.
The following examples are heavy crops. The images are cropped to illustrate the differences for viewers with all sizes of displays. Keep in mind that FujiFilm sold us a new 24 megapixel sensor on the promise of being able to crop more. Once you’ve noticed the effect, I think you’ll be able to spot it in uncropped images too.
All examples were shot on Fuji’s new flagship camera, the X-T2 with the Fujinon 35mm F2 WR lens.
A note about FujiFilm’s JPEG output: “FINE” quality in camera translates, in more specific terms, to 99% quality level JPEG with sampling factors 2x1,1x1,1x1. (This means that there is less color resolution in the horizontal plane than the vertical, but don’t get hung up on that because much of the color information in these images is interpolated by the demosaicking process anyway and it can be shown that this level of chroma subsampling of the JPEG image cannot account for the waxy skin tone effect [this will be left as an exercise for the reader.])
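For anyone unsure how to read those sampling factors, here is a small Python sketch of the arithmetic. It applies the standard JPEG rule that each component’s dimensions scale by its own factor divided by the largest factor; the 6000x4000 frame size is my assumption for a 24 megapixel image, and the function name is made up.

```python
def component_sizes(width, height, factors):
    """Return (width, height) of each JPEG component plane.

    factors: list of (horizontal, vertical) sampling factors, Y first.
    Sizes are rounded up, as in the JPEG standard.
    """
    h_max = max(h for h, _ in factors)
    v_max = max(v for _, v in factors)
    return [(-(-width * h // h_max), -(-height * v // v_max))  # integer ceil
            for h, v in factors]


# Sampling factors 2x1,1x1,1x1 for Y, Cb, Cr (often written as 4:2:2).
fine = [(2, 1), (1, 1), (1, 1)]
for name, size in zip(("Y", "Cb", "Cr"), component_sizes(6000, 4000, fine)):
    print(name, size)
```

So the two chroma planes keep full vertical resolution and half the horizontal resolution, a fixed 2:1 reduction in one direction only.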
The image was shot at ISO 1600.
Note that the skin tone is very uniform and its color has bled into the sclera of the eyes, giving the face a waxen, wooden aspect.
The RAW file was processed using Darktable’s Markesteijn demosaicking algorithm (3-pass mode) with a single iteration 9x9 chroma median filter followed by application of a bilateral filter on the chroma channel and light sharpening. The color profile is my own, generated from shots of a Wolf Faust IT8 chart and should accurately represent the colors in front of the camera. No lens distortion correction was applied although the lens used (Fujinon 35mm F2) is heavily corrected in camera (which makes a more direct comparison difficult).
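For readers unfamiliar with chroma median filtering, here is a rough Python sketch of the idea. This is not Darktable’s code, and I use a 3x3 window on invented numbers purely to keep the example small (the processing above used 9x9). It shows why a median is attractive for false-color cleanup: an isolated speck is discarded while a genuine color edge stays put.

```python
from statistics import median


def median_filter(plane, size=3):
    """Apply a size x size median to a 2-D list, clamping at the borders."""
    r = size // 2
    h, w = len(plane), len(plane[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [plane[min(max(yy, 0), h - 1)][min(max(xx, 0), w - 1)]
                      for yy in range(y - r, y + r + 1)
                      for xx in range(x - r, x + r + 1)]
            row.append(median(window))
        out.append(row)
    return out


# Hypothetical chroma plane: left half 10, right half 50 (a real colour
# edge), plus one isolated false-colour speck of 200.
plane = [[10, 10, 10, 50, 50, 50] for _ in range(5)]
plane[2][1] = 200

filtered = median_filter(plane)
for row in filtered:
    print(row)
```

The flip side is that a median with a large window will also erase real detail that is thinner than about half the window, so the window size is still a compromise.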
Note that the sclera are white. Also note the greater contrast and tonal variation present in the face, with a red nose and rosy cheeks. This is closer to the ground truth than the camera JPEGs, but not perfect (more on that later).
The Effect of the In-Camera NR Setting
Here is the same image as above, processed in camera at the extremes of the NR setting.
As you can see, the difference is extremely subtle and does not appear to affect the color/waxy skin tone situation at all. Just to drive this point a little further home, since there is much misinformation about this on the web, here is a difference image of the NR -4 and NR +4 images, blurred (3x3) to reduce the influence of JPEG artifacts, and stretched into a visible lightness (moving the whitepoint from 255 to 10):
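If you want to reproduce this kind of difference image, the recipe can be sketched in a few lines of Python. The sketch works on a single 8-bit channel stored as nested lists and assumes the blur is applied to the difference rather than to the inputs; the function name and the sample values are invented for illustration.

```python
def difference_image(a, b, white=10):
    """Absolute difference of two equal-sized 8-bit planes, 3x3 box blurred,
    then stretched so that `white` maps to 255 (values above it clip)."""
    h, w = len(a), len(a[0])
    diff = [[abs(a[y][x] - b[y][x]) for x in range(w)] for y in range(h)]
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # 3x3 neighbourhood with edge clamping.
            window = [diff[min(max(yy, 0), h - 1)][min(max(xx, 0), w - 1)]
                      for yy in (y - 1, y, y + 1)
                      for xx in (x - 1, x, x + 1)]
            mean = sum(window) / 9
            row.append(min(255, round(mean * 255 / white)))
        out.append(row)
    return out


# Two hypothetical 4x4 planes that differ by 2 levels everywhere.
nr_minus = [[100] * 4 for _ in range(4)]
nr_plus = [[102] * 4 for _ in range(4)]
print(difference_image(nr_minus, nr_plus)[0])
```

Moving the whitepoint from 255 to 10 multiplies every difference by 25.5, so even a two-level discrepancy becomes clearly visible.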
I know, it’s not much to look at. You can see a little bit of difference in luminance information around edges and some subtle color variations (likely artifacts).
Now here is the same process applied to the processed RAW above, and the same image with a bilateral filter applied with a strength tuned to match the appearance of the camera JPEG. In other words, this is more or less what we could expect to see in the above difference image if the camera’s NR setting significantly affected chroma denoising:
Unfortunately, the inability to disable software lens correction in the in-camera RAW developer (not to mention the secret color profiles and LUTs of the Film Simulations) prevents a more direct comparison between the camera JPEG and Darktable’s output, but this should give us an approximate picture of the level of color detail being discarded by the FujiFilm JPEG engine. Note the differences in the eyes and tip of the nose, this is where the effect was most noticeable in this example.
The image was shot at ISO 1600.
Here we see once again that the sclera are skin-toned and the teeth have taken on the color of the surrounding skin. This is the epitome of ‘waxen’ appearance.
In the processed RAW file, by contrast, we see the true tooth color, the whites of the eyes, and even the red of the water line area.
Remember how earlier in the article I implied that the Waxy Skin-Tone problem only cropped up at high ISO? Well, that’s only if you don’t look too closely. With X-Trans II/III, chroma NR is too high at all ISOs, from 200 up. It’s just more noticeable with the higher ISOs. But I don’t expect you to take my word for it. Here’s another example.
The image was shot at ISO 200.
In the camera JPEG, note that the flesh visible on the head is an even hue.
In the processed RAW, note the variation in the hue and saturation of the flesh on the head, which was bright red in reality.
Conclusion: A Devil’s Bargain
The X-Trans sensors and the X series of cameras have been plagued by image quality problems from their inception and it’s clear from these examples that the problem persists in the current generation (X-Trans III), whose members include the X-Pro2, X-T2, X-T20 and X100F. With every new camera generation, people claim that the problem is solved once and for all, but the reality is that it has never been solved, just tweaked here and there. The underlying problem remains.
On the bright side, it has been shown here that much more color information can be extracted from the raw sensor data than FujiFilm’s in-camera processing is currently capable of. Enough, in my opinion, to reasonably live up to the resolution claim of 24 effective megapixels.
So, why, you ask, does FujiFilm use such strong chroma NR if it causes this ugly rendering? If you’ve been paying attention you might have already guessed. The answer is simple: To suppress moiré! Ah ha, you say, but, according to FujiFilm’s marketing department (there’s that smell again), X-Trans is immune to moiré! Well, I’m here to tell you that it’s no more immune to moiré than Bayer and in fact probably a little less. It all comes down to the signal processing, and in this case, that processing is not without its side-effects. I can tell you with certainty that the level of chroma NR used when processing the example RAW images, which was mild enough to preserve the skin coloration, would barely make a dent in the waves of green and magenta false colors that come along with moiré. Don’t believe me? Go back and look at the man’s tie in Example 2 again. In reality the tie was gray — not covered in green and magenta glitter, just plain old shades of gray (and that’s not even a severe example). But recording that is a challenge for a sensor without an AA filter, and is even more challenging when that sensor uses the X-Trans CFA with its broader spacing of same-colored pixels.
X-Trans I had more susceptibility to moiré (still skeptical? go back and look closely at those camera settings images above; they were shot with an X-Pro1 and you can see that they exhibit some moiré and maze artifacts), X-Trans II swung in the opposite direction and introduced people to the full side-show horror of the Waxy Skin-Tones effect, and X-Trans III just tones it down a little, striking a balance still in favor of protecting against moiré.
I say X-Trans N throughout this article, but what’s really relevant is not the sensor itself, but the image processing pipeline in each camera generation where the demosaicking and noise reduction happens.
The reason that FujiFilm hasn’t fixed the Waxy Skin-Tone problem after all these years and three camera generations is simply that they can’t — not without a significant, breakthrough advancement of their algorithms. They made a proverbial deal with the devil with their immune-to-moiré claims for the X-Trans CFA. If they get rid of the moiré, people will complain about waxy skin tones, and if they get rid of the waxy skin tones, people will complain about moiré (but, hey, they never made marketing claims about not making people look like wax figures or wooden dolls, so it’s no surprise which problem they’ve prioritized). Certainly, more sophisticated moiré suppression algorithms could do better at sparing skin and faces where the color smoothing is most objectionable (but could they ever recognize chicken skin, I wonder?) Will Fuji ever invest any time or money in that kind of optimization? Especially with reviewers constantly praising their JPEG engine (from a comfortable distance and without wearing their spectacles)?
In the future, I sincerely hope that FujiFilm stops producing sensors with the X-Trans CFA. I am convinced that the X-Trans CFA causes more problems (many) than it solves (none) and FujiFilm could well have known this from computer simulations before ever manifesting it physically. If the GFX 50S is any indication, then the next generation of X-series cameras may indeed utilize a Bayer CFA.
In the meantime, a firmware update to give users the option to customize the chroma NR/moiré removal strength ramp ourselves in camera (as some other manufacturers do with their High ISO NR customization) could alleviate the problem for users who consider the lifelike rendition of human faces more important than complete freedom from moiré. I challenge FujiFilm to offer a solution. Any solution. Because until they do so, the JPEG images from your new FujiFilm camera are going to be compromised in color resolution. For this reason I would strongly recommend recording RAW files in addition to JPEG if you care about this sort of thing.
All of that being said, allow me to put things in perspective by sharing the uncropped image from Example 1, both the camera JPEG and the more mildly denoised RAW (via Darktable). Whether you notice this effect depends on how much you crop/how big you print and how high you push the ISO. It has annoyed me enough that I’ve become adept at spotting it and wasted a couple of perfectly good evenings writing this article. When I look at an image and the teeth are skin-toned or the veins in the eyes are gray, I can confidently say to myself, “That’s a FujiFilm!” And perhaps, for better or worse, now you will be able to do the same.
Some readers were curious why the Camera JPEG examples used Color +4 and Sharpness +4 settings. This was done in the interests of fairness and clarity. Fairness because the Provia Film Simulation (the Fuji standard value) is quite desaturated compared to the calibrated RAW data, and this desaturation makes the lifeless appearance even more dramatic. Clarity because it’s a crop, crops are less sharp, and it’s uncomfortable to stare at unsharp images. Here’s the image from Example 1 with full default settings (all zeros):
Some readers were unsure what I was referring to when I mentioned the X-Pro1 and maze artifacts. Here’s a crop of one of the images referred to. Yes, I know, it’s a crop, but now you see what I’m talking about. Deal with it.
A reader asked to see the uncropped images from Example 2 and 3. Not exactly Pulitzer Prize material here, but, hey, you asked for it.
A reader has complained about my opening paragraph, saying that FujiFilm has never claimed that the X-Trans CFA is random, only that it is “more random” or has “increased randomness.” My position is that this wording alone constitutes a very misleading statement, but they have indeed used stronger wording.
Direct from the FujiFilm website regarding the X-Pro2’s X-Trans III sensor:
“The unique random color filter array reduces moiré and false colors without an optical low-pass filter. These color filters also have the effect of increasing the resolution so, when shooting with a high-resolution Fujinon lens, the camera delivers images with a perceived resolution far greater than the actual number of pixels used.”
I’m not going to present any more examples of this language. Interested readers can peruse the web and satisfy themselves that this is the way FujiFilm presents their technology. There are many other instances, both written and spoken.