Channeling Alpha

DS McCoy
Jan 14, 2024

People who work in computer graphics often deal with something called an alpha channel. A standard way to store an image digitally is to store three numbers at each pixel representing the amount of red, green, and blue light that the pixel should show. But sometimes you need a way to encode how opaque or transparent the pixel should be, in the case where it might be put over another image. For example, if you have a character in one image and a background in another and you want to put the character over the background, you need a way of telling the compositing process which pixels contain the character and which don’t. The pixels that don’t have the character remain the background, and the pixels that do take on the color of the character.

If you had enough pixels and they were too small to see individually, then a simple yes/no might suffice, but if the pixels can be seen, what if the character covers only half the pixel? Then a better guess would be to mix the color of the half of the character in the pixel with the background color in the pixel. This isn’t perfect, since you haven’t specified which half of the pixel the character covers, but it’s pretty good and only gets better as the pixels get smaller. If you store another value at each pixel similar to the color channels, you can specify the fraction of the pixel that should be covered to the same precision you’re storing color.
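As a minimal sketch of that idea, here is the per-pixel mix described above, with illustrative names (not from any particular library):

```python
# Hypothetical sketch: blending one pixel when the foreground covers
# only a fraction of it. Colors are (r, g, b) floats in [0, 1].

def blend_pixel(fg, bg, coverage):
    """Mix foreground and background colors by the fraction of the
    pixel the foreground covers (0.0 = none, 1.0 = all)."""
    return tuple(coverage * f + (1.0 - coverage) * b
                 for f, b in zip(fg, bg))

# A red character covering half a pixel over a blue background:
print(blend_pixel((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), 0.5))
# -> (0.5, 0.0, 0.5)
```

The fraction here is exactly the value the extra per-pixel channel stores.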

Back in the late 1970s, at the New York Institute of Technology, Alvy Ray Smith and Ed Catmull, the pair who would later found Pixar, started working with this idea and called the extra information the alpha channel. The name comes from the mathematical convention of using the first letter of the Greek alphabet, α, in the formula for the kind of interpolation that needs to be done between the foreground and background colors:
α × Fg + (1 − α) × Bg
Later, after Smith and Catmull had founded the precursor to Pixar at the Lucasfilm Computer Division, Thomas Porter and Tom Duff published a 1984 paper on the mathematics of image compositing in which they used premultiplied alpha. In the interpolation formula above, the alpha (α) value is the coverage value of the foreground image. Premultiplied alpha merely means that one does the first multiplication, α × Fg, into each of the color channels before saving the image, so that each pixel only contains the amount of color that would be added to the pixel. The math done at actual compositing time is then Fg + (1 − α) × Bg. In subsequent work at Pixar, premultiplied alpha became the normal way of storing images.
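The Porter-Duff “over” operation with premultiplied alpha can be sketched like this (illustrative function name, not an API from the paper):

```python
def over_premultiplied(fg_rgb, fg_alpha, bg_rgb):
    """Composite a premultiplied foreground pixel over a background.
    The foreground channels already contain alpha * color, so only
    the background needs scaling: Fg + (1 - alpha) * Bg."""
    return tuple(f + (1.0 - fg_alpha) * b
                 for f, b in zip(fg_rgb, bg_rgb))

# White at 50% coverage, stored premultiplied as (0.5, 0.5, 0.5),
# composited over a solid green background:
print(over_premultiplied((0.5, 0.5, 0.5), 0.5, (0.0, 1.0, 0.0)))
# -> (0.5, 1.0, 0.5)
```

Note that only one multiply per channel remains at compositing time, which is part of why this form was attractive.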

Image compositing as defined in the Porter-Duff paper. (source: Wikimedia)

Usage of the alpha channel spread through the computer graphics industry, but for many people outside of Pixar the name “alpha channel” came to mean just a channel of extra information stored at each pixel. You would sometimes hear people talk about “an alpha channel” rather than “the alpha channel”. This came up when Sam Leffler (then of SGI) and I (representing Pixar) went to a meeting of the committee responsible for the TIFF image file format specification, at the time still owned by Aldus before it was purchased by Adobe. Many of the people on the committee were more involved with graphics for print than for film/video, and they would ask questions such as “Why would you ever need more than one bit for alpha?” and, about premultiplied alpha, “Why would you want all that black mixed in there?” But they were receptive, and alpha channels got added to the TIFF specification.

To capture the difference between “the alpha channel” and “an alpha channel”, the terminology used in the TIFF specification is associated alpha for premultiplied alpha the way Pixar uses it, and unassociated alpha for an extra channel of information at each pixel. For unassociated alpha, it’s up to the application to decide what information is stored in the channel. Some applications like to store the alpha value unpremultiplied, and it’s somewhat unfortunate that there is no explicit way to store a channel that is tagged as alpha (i.e. pixel transparency/opacity) without being premultiplied; that is just unassociated alpha, and the application has to know to treat it as unpremultiplied alpha.

Later, when the PNG image file format was created, I only found out about the specification as it was published and was unable to talk to its authors while it was being drafted. They ended up specifying that the PNG alpha channel is unpremultiplied. I don’t know if I could have convinced them, but I would have tried to talk them into premultiplied alpha. Applications that display PNG images do fine now, but in the early days it caused problems for less sophisticated viewing applications. When displaying an image that contains alpha without compositing it over another image, one is essentially putting it over a plain black image (all zeroes). That means the second multiply in the alpha compositing formula above, the one involving Bg, produces only zeroes, so it can simply be skipped. But if the alpha is unpremultiplied, the first multiply in the formula hasn’t been done yet, and the image viewer needs to do it before displaying the image. Early PNG viewers often forgot this step, and pixels with partial alpha values often looked strange around the edges of objects.
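The forgotten step can be sketched in a few lines; the function name is illustrative:

```python
def prepare_for_display(rgb, alpha):
    """For an unpremultiplied pixel shown over black, the first multiply
    of the compositing formula must be done before display; the
    (1 - alpha) * Bg term is all zeroes and can be skipped."""
    return tuple(alpha * c for c in rgb)

# A white edge pixel at 25% coverage should display as dark gray,
# not full white:
print(prepare_for_display((1.0, 1.0, 1.0), 0.25))
# -> (0.25, 0.25, 0.25)
```

A viewer that skips this multiply shows the raw (1.0, 1.0, 1.0) instead, which is exactly the fringe artifact described above.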

The alpha value can come from different sources. In processing green-screen live-action footage for digital compositing, the alpha is calculated from the amount of the green-screen color detected. Working with Pixar’s RenderMan rendering software, the renderer brings together a number of things into the alpha value. The main one is coverage, calculated by sampling the pixel at a number of sub-pixel locations; if half are covered, then the coverage is one half. But coverage can also come from motion blur: if each of the samples is taken at a different time during the frame time, the amount of time the object spends in the pixel is factored into the coverage. Similarly, depth-of-field blur factors the object’s relationship to the lens into the amount of coverage.
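A minimal sketch of coverage by sub-pixel sampling, assuming a simple regular grid of sample points (real renderers use more sophisticated sample patterns and add time and lens dimensions):

```python
def coverage(inside, n=4):
    """Estimate pixel coverage by testing an n x n grid of sub-pixel
    points; `inside` is a predicate on (x, y) in [0, 1) pixel space."""
    hits = sum(inside((i + 0.5) / n, (j + 0.5) / n)
               for j in range(n) for i in range(n))
    return hits / (n * n)

# An object covering the left half of the pixel:
print(coverage(lambda x, y: x < 0.5))  # -> 0.5
```

For motion blur, each sample would also carry a time value and the predicate would test the object's position at that time, folding duration into the same hit count.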

It would be simpler if alpha were only calculated from coverage as above; then going back and forth between premultiplied and unpremultiplied alpha wouldn’t be too much of a problem. But the renderer can also factor transparency into the alpha. Say the object in question is a wall with a glass window. One could use a ray tracer to cast rays through the glass to bring the color behind the glass into the pixel value, but if the wall is rendered as a separate element, it might be given an opacity/transparency value which affects the alpha value so that it can be composited over the scene behind the window. If the glass passes 90% of the light coming through it, the pixel could be given an opacity value of 10% in the alpha channel. This is then used in the same math as the coverage, letting 90% of the background through. If at the edge of the glass there is a pixel where the glass covers only half of the pixel and there is no window frame, so the rest of the pixel is empty, then the 10% would be combined with the coverage to leave 5%, passing through 95% of the background. Similarly, if there were an opaque window frame covering the other half of the pixel, it would leave alpha at 55% and let 45% of the background through.

The major problem with this usage comes into play with the specular reflections on the glass. If we were ray tracing through the glass, this would all be handled, but a reflection on the glass does not block the colors coming from behind the glass; it only adds to them. So the reflection colors should never be multiplied by the alpha value. In fact, if the glass is very transparent and blocks no light coming through, the alpha value might be zero, but the glass might still have a reflection on it, and multiplying by alpha would totally destroy the reflection information. In the case where what is stored is unpremultiplied alpha, the information about how much of the alpha is opacity/transparency versus coverage, and what is reflection versus object color, is all lost before the multiply of the foreground color by the alpha happens, so multiplying the color by alpha might not be appropriate.
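The window arithmetic above can be checked directly (these are just the hypothetical numbers from the example, not renderer code):

```python
# Hypothetical numbers from the glass-window example above.
glass_opacity  = 0.10   # glass passes 90% of the light coming through
glass_coverage = 0.50   # glass covers half the pixel

# Glass over an otherwise empty half of the pixel:
glass_alpha = glass_opacity * glass_coverage
print(glass_alpha)      # -> 0.05 (95% of the background shows through)

# An opaque frame covering the other half contributes 1.0 * 0.5:
total_alpha = glass_alpha + 1.0 * glass_coverage
print(total_alpha)      # -> 0.55 (45% of the background shows through)
```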

One reason people like unpremultiplied alpha is that they may want to alter the color space of the image before compositing, sending the color values through some curves that alter the balance. This is fine for completely covered pixels, but it’s a bit problematic for partially covered pixels.
If you have premultiplied (aka associated) alpha and you want to change the color of the object covering half the pixel, it’s common to unpremultiply (that is, divide out the alpha) before changing the color, then multiply the alpha back into the color. But if you are altering the color space of an image, there is really no absolutely correct way of doing it with partially covered pixels. Partially covered pixel colors are essentially values in the middle of an equation that assumes all the values involved are linear. The compositing formula:
α × Fg + (1 − α) × Bg
is a linear equation. A common image operation is to pass an image through a gamma curve; a common curve, for a gamma of 2.0, has the same shape as the square root of the image values. Doing this to the final image would mathematically be:
gamma(α × Fg + (1 − α) × Bg)
If you pass the separate color values through a curve before compositing them, the result may be “good enough” for your purposes, but it’s good to keep in mind that it is not mathematically “correct”, i.e. it is not the same as compositing the Fg and Bg images and then running the result through the curve. It doesn’t matter whether the Fg is premultiplied or not; both of these will produce different results than the above:
gamma(α × Fg) + (1 − α) × gamma(Bg)
α × gamma(Fg) + (1 − α) × gamma(Bg)
So even if the results work for you, it’s good to remember that there is really no mathematically “correct” way to re-curve the colors of partially covered pixels; the operation is only well-defined once compositing is complete and every pixel is fully covered.
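The disagreement between the three orderings is easy to demonstrate with made-up values (a white object at 50% coverage over a dark gray background):

```python
def gamma(c, g=2.0):
    # A gamma-2.0 encoding curve has the same shape as a square root.
    return c ** (1.0 / g)

# Illustrative values, not from the article: alpha, foreground, background.
a, fg, bg = 0.5, 1.0, 0.25

after = gamma(a * fg + (1 - a) * bg)          # curve applied after compositing
pre   = gamma(a * fg) + (1 - a) * gamma(bg)   # curving premultiplied Fg first
unpre = a * gamma(fg) + (1 - a) * gamma(bg)   # curving unpremultiplied Fg first

print(after, pre, unpre)   # three different values: ~0.791, ~0.957, 0.75
```

Only the first ordering matches the linear math of the compositing formula; the other two bake the non-linear curve into values that still need to be mixed.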

(I originally typed this article fairly quickly in response to a question. Though I published it quickly, I have continued updating/editing it as I think of improvements to the language and the content.)
