Recently I’ve been working on simulating parts of the imaging pipeline in digital cameras. A source of confusion for me is that there appear to be several distinct concepts that are all commonly referred to as both “white balance” and “color constancy”. Unfortunately, very few imaging science authors acknowledge the ambiguity around these terms before diving into the topic.
Let’s start with a quote from a paper on white balance:
The problem of adjusting the color such that the output image from a digital camera, viewed under a standard condition, matches the scene observed by the photographer’s eye is called white-balance. — Chromatic adaptation and white-balance problem 2005
The background here is that human vision attempts to make objects retain their color under illuminants of varying color (an illuminant is a fancy word for a light source). For example, a red apple still looks red under both yellow indoor light and bluish sunlight. This process is called chromatic adaptation.
The quote is saying that the goal of white balance is to model human color vision so that color reproductions (like digital images) appear to the viewer similar to how an adapted human would have experienced the scene.
Case closed! But wait...
We can now state our AWB [Automatic White Balance] goal as follows: we seek to minimize the effect of [the illuminant] I(λ) and ensure that R sensor, G sensor, and B sensor correlate with the object reflectance R(λ) only
This is a very different goal. This has nothing to do with modeling the human adaptation to light or human perception at all. This is about the physics of the camera and scene. The goal is to remove the contributions of the scene illuminant from the recorded camera RGB values. If you succeed in this you will have the same camera response to the same scene under different illuminants.
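To make the camera centric goal concrete, here is a minimal sketch of what "removing the illuminant" could look like. It rests on a strong simplifying assumption that is mine, not the paper's: each channel's response is just the surface reflectance multiplied by the illuminant strength in that channel, and the illuminant is already known. The function names and all numeric values are hypothetical. Real sensors have broad, overlapping spectral sensitivities and the illuminant has to be estimated, so in practice this holds only approximately.

```python
# Toy image formation model (an assumption for illustration): per channel,
# response = reflectance * illuminant. Real cameras integrate over wavelength.
def camera_response(reflectance, illuminant):
    return tuple(r * i for r, i in zip(reflectance, illuminant))


def remove_illuminant(camera_rgb, illuminant_rgb):
    """Divide out the illuminant per channel (a diagonal, von Kries-style correction)."""
    return tuple(c / i for c, i in zip(camera_rgb, illuminant_rgb))


# Hypothetical values: one surface, two light sources.
apple = (0.60, 0.10, 0.08)      # reddish surface reflectance
indoor = (1.00, 0.80, 0.45)     # warm, yellowish light
daylight = (0.85, 0.95, 1.00)   # cooler, bluish light

under_indoor = camera_response(apple, indoor)
under_daylight = camera_response(apple, daylight)

# After correction, both recordings collapse to the same values.
corrected_indoor = remove_illuminant(under_indoor, indoor)
corrected_daylight = remove_illuminant(under_daylight, daylight)
```

In this toy model the two corrected triplets are identical and equal to the reflectance itself, which is exactly the "same camera response under different illuminants" outcome described above. Nothing in the computation refers to human perception.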
If you set out to build a system using this goal you may very well design something that outperforms the human eye at chromatic adaptation because your goal is no longer about modeling human perception.
Due to the ambiguity around the terms, from here on out I will refer to the two approaches above as the “human centric” approach and the “camera centric” approach, respectively.
So which is correct?
I emailed color appearance expert Mark Fairchild at RIT and he graciously provided some guidance. Here is a selection of his response (emphasis mine):
I think there is no disagreement among those that understand there are different goals. E.g., so-called color constancy and actual color appearance are two different things. I think anyone doing a good job at white balance probably understands that neither of the two can be achieved and something in between (accurately or nicely reproducing neutrals) is the actual objective of white balance. You might be right that people taking either of the approaches are getting to results that are indistinguishable because you simply cannot do either of the above objectives with 3 non-colorimetric signals that are available.
— Mark Fairchild
This acknowledgement helped my sanity a lot. It turns out both of the approaches are reasonable goals for different use cases. He also makes it clear that both of these very specific approaches can serve a higher level goal: to nicely reproduce neutral colors.
I work with satellite imagery at Planet and our goal is not to produce images that look the way a human floating in space would have seen them on any specific day. We don’t care about reproducing daily fluctuations in atmospheric conditions that affect lighting. We want to produce a consistent view of the world that overcomes changes in illumination, so we would not be well served by the human centric approach to white balance.
I later discovered that there is in fact an entire textbook dedicated to the topic of white balance and color constancy. I was thrilled to find an acknowledgement of the differing white balance goals.
Thus, there are two main roads to follow in developing color constancy algorithms. One goal is to determine the reflectance of objects. The second goal is to perform a color correction that closely mimics the performance of the visual system. The first goal is important from a machine vision point of view, whereas the second goal is very important for consumer photography
In retrospect the use cases for these white balance approaches seem obvious:
- Consumer photographers are likely to value the accurate reproduction of a human experience.
- A self-driving car is going to value the most accurate color constancy possible so it can identify traffic signs regardless of lighting conditions.
And while this is definitely not universally agreed upon, here is my best understanding of the different terms:
- Color Constancy: The attempt to make color correlate only with object reflectance. The attempt to remove the contributions of the real scene illuminant so an object can be rendered under some standard illuminant (e.g. a standard white, like D65).
- Color Appearance: The attempt to make colors appear the same in a reproduction as they would have to a human observer.
- White Balance: A general term for the goal of nicely reproducing neutral colors. A good White Balance strategy will probably incorporate Color Constancy or Color Appearance models or both.
Here’s a wholly unscientific white balancing I made by hand in Lightroom. It’s not likely that this represents the color sensation I experienced in the room when I took this photo, but this is definitely a more pleasing reproduction than the ones on the cover.
Bonus: effect on the imaging pipeline
I want to quickly note that these differing goals affect which layer of the image processing pipeline is appropriate for your white balancing algorithm.
The human centric approach must be done after your color values have been converted into a perceptual color space like CIE XYZ/CIE Lab. A camera centric white balance algorithm does not carry this requirement. You could apply the algorithm directly to the RGB values measured from the camera sensor without any need for conversion into a color space (whether or not native camera RGB should count as a “color space” seems debatable).
There actually was a study about which color space is the most effective white balance target for the camera centric approach:
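As a rough illustration of the human centric side, here is a sketch of a chromatic adaptation transform applied to CIE XYZ values. The choice of the Bradford transform is my assumption; nothing above names a specific method. The helper names are hypothetical, and the white points are the standard 2° observer values for illuminant A and D65.

```python
# Bradford matrix: maps CIE XYZ into a "sharpened" cone-like response space.
BRADFORD = [
    [0.8951, 0.2664, -0.1614],
    [-0.7502, 1.7135, 0.0367],
    [0.0389, -0.0685, 1.0296],
]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def invert_3x3(m):
    """Invert a 3x3 matrix using the adjugate and determinant."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def adapt_xyz(xyz, source_white, target_white):
    """Adapt an XYZ color from the source white point to the target white point."""
    src = mat_vec(BRADFORD, source_white)
    dst = mat_vec(BRADFORD, target_white)
    cone = mat_vec(BRADFORD, xyz)
    # Scale each cone-like channel by the ratio of target to source white.
    scaled = [c * d / s for c, d, s in zip(cone, dst, src)]
    return mat_vec(invert_3x3(BRADFORD), scaled)


ILLUMINANT_A = [1.09850, 1.00000, 0.35585]  # incandescent-like white point
D65 = [0.95047, 1.00000, 1.08883]           # daylight white point

# Sanity check: the source white itself should map onto the target white.
adapted_white = adapt_xyz(ILLUMINANT_A, ILLUMINANT_A, D65)
```

The scaling step has the same diagonal form as a per-channel correction on camera RGB. The difference is the space it happens in: this version needs colorimetric XYZ input, so it can only run after the camera's native RGB has been converted.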
Six different methods for white-balancing digital images were compared in terms of their ability to produce white-balanced colors close to those viewed under a specific viewing illuminant.
….We found that illuminant-dependent characterization produced the best results, sharpened camera RGB and native camera RGB were next best, XYZ and CAM02 were often not far behind, and balancing in the -709 primaries was significantly worse. — Comparison of the accuracy of different white balancing options as quantified by their color constancy