Color

Posted on 2025-05-15 in SNU, 4-1, Computer Graphics. Reading time ≈ 8 mins.

Physical properties of color

Lights: Things that emit energy to surroundings
Material: Things that absorb or reflect a portion of incident energy

Light

The wavelengths of the energy being emitted determine the color of a light source.
Visible spectrum: 400nm ~ 700nm

White is defined as the color of sunlight.
It is actually a mixture of all visible frequencies.

Emission Spectrum

Different light sources have different emission spectra.

Absorption Spectrum

Materials have an absorption spectrum.
We can equivalently use the reflectance spectrum. (1 - absorption)

When light hits a material, the energy at each wavelength is reflected in proportion to the material's reflectance spectrum. $\Phi(\lambda) = I(\lambda)R(\lambda)$

Color mixing

Additive Color Mixing: Add emission spectra; e.g. multiple lights are combined
Subtractive (Multiplicative) Color Mixing: Multiply reflectance spectra; e.g. multiple paints (materials) are combined
Since any number multiplied by 0 is 0, only wavelengths reflected by both materials remain visible.
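As a rough sketch of $\Phi(\lambda) = I(\lambda)R(\lambda)$ and the two mixing rules, assume each spectrum is sampled at a handful of wavelengths and stored as a plain list. The sample values and function names below are made up for illustration, not measured data.

```python
# Toy spectra sampled at five wavelengths (illustrative values, not measurements).
wavelengths_nm = [400, 475, 550, 625, 700]

def reflect(illuminant, reflectance):
    """Phi(lambda) = I(lambda) * R(lambda), evaluated per wavelength sample."""
    return [i * r for i, r in zip(illuminant, reflectance)]

def mix_additive(spectrum_a, spectrum_b):
    """Additive mixing: the emission spectra of two lights are summed."""
    return [a + b for a, b in zip(spectrum_a, spectrum_b)]

def mix_subtractive(reflectance_a, reflectance_b):
    """Subtractive (multiplicative) mixing: reflectance spectra are multiplied."""
    return [a * b for a, b in zip(reflectance_a, reflectance_b)]

white_light = [1.0, 1.0, 1.0, 1.0, 1.0]
yellow_paint = [0.0, 0.0, 1.0, 1.0, 1.0]  # reflects mid and long wavelengths
cyan_paint = [1.0, 1.0, 1.0, 0.0, 0.0]    # reflects short and mid wavelengths

mixed_paint = mix_subtractive(yellow_paint, cyan_paint)
reflected = reflect(white_light, mixed_paint)
print(reflected)  # only the sample that both paints reflect is non-zero
```

With these toy paints, only the middle (greenish) sample survives, which is the "multiplied by 0" argument in action.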

The Human Visual/Perception System

The human eye is a camera! ~~in fact it is opposite~~

Aperture: Pupil(동공/눈동자)
Lens: Lens(수정체)
Film/Sensor: Retina(망막)
Focal point: Fovea centralis(중심와)

The mean of the spectrum determines the hue (color).
The variance of the spectrum determines the saturation.
The area of the spectrum determines the brightness.

Retina

Rods are the primary receptors under dark viewing conditions. They can perceive brightness over a range of about $10^{10}$.
Cones are the primary receptors under high-light viewing conditions. They can perceive color.

Under dark viewing conditions, only rods work (rod vision), so we can't perceive color very well.
As the environment becomes brighter, cones start to work (cone vision), allowing us to perceive color.

Distribution of Rods and Cones

There are about 120 million rods and 6 to 7 million cones in the human eye. Rods are much more common than cones!
However, at the fovea, cones are much more common than rods.

e.g. We see more stars in the night sky slightly off-center, because there are fewer rods at the fovea.
e.g. If an object is far from the fovea, we can't perceive its color well.

Spectral Response of Cones

Type of cones

There are three types of cones: S, M, L.

We have more L-type cones than other types because longer wavelengths have lower energy.
Despite the large number of L-type cones, the most sensitive cones are M-type.

Tristimulus values

We can define tristimulus values (S,M,L) with three types of cones!

$\begin{align*} S &= \int_\lambda I(\lambda)\,S(\lambda)\,d\lambda\\ M &= \int_\lambda I(\lambda)\,M(\lambda)\,d\lambda\\ L &= \int_\lambda I(\lambda)\,L(\lambda)\,d\lambda \end{align*}$

The brain receives light as electrical signals that encode the tristimulus values.
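The three integrals can be approximated numerically. The sketch below uses hypothetical bell-shaped cone responses (the peak positions and widths are assumptions for illustration; real cone sensitivities are measured curves) and a simple Riemann sum over 400 to 700 nm.

```python
import math

def gaussian(peak_nm, width_nm):
    """Hypothetical bell-shaped cone response; real curves are tabulated data."""
    def response(lam):
        return math.exp(-0.5 * ((lam - peak_nm) / width_nm) ** 2)
    return response

# Assumed peaks and widths, for illustration only.
S_resp = gaussian(440.0, 25.0)
M_resp = gaussian(540.0, 40.0)
L_resp = gaussian(570.0, 45.0)

def tristimulus(spectrum, lo=400.0, hi=700.0, step=1.0):
    """Approximate S, M, L = integral of I(lambda) * response(lambda) d(lambda)."""
    s = m = l = 0.0
    lam = lo
    while lam <= hi:
        power = spectrum(lam)
        s += power * S_resp(lam) * step
        m += power * M_resp(lam) * step
        l += power * L_resp(lam) * step
        lam += step
    return s, m, l

def flat(lam):
    """Equal energy at every wavelength."""
    return 1.0

def reddish(lam):
    """Energy only above 600 nm."""
    return 1.0 if lam > 600 else 0.0

print(tristimulus(flat))
print(tristimulus(reddish))
```

Any two spectra for which `tristimulus` returns the same triple would be indistinguishable under this toy model.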

Metamers

If two different spectra have the same tristimulus values, we perceive them as the same color. Such spectra are called metamers.
e.g. We can detect counterfeit currency by using special inks that appear to be the same color under normal light but look different under UV light.

Maybe we can use metamers to reduce the difference between a display and a printed image?

Color space

RGB color space

Motivation: color matching experiment!
Can we match any target light by mixing three primary lights?
In fact, we can't!!! Sometimes, we need to add one of the primary lights to the target light to make the two sides match.
This can be viewed as mixing a negative amount of that primary???

e.g. To match 500nm light, we need to mix blue, green, and a negative amount of red light.

Grassmann's Law

For color matches,

Symmetry: $U = V \Leftrightarrow V = U$
Transitivity: $U = V \land V = W \Rightarrow U = W$
Proportionality: $U = V \Leftrightarrow tU = tV$
Additivity: $U = V \land W = X \Rightarrow U + W = V + X$

i.e. color matches behave linearly, so we can treat colors as vectors in a vector space!

XYZ color space

Using Grassmann's law, we define three imaginary primary colors X, Y, Z satisfying:

All visible light can be made by mixing non-negative amounts of X, Y, Z.
Y corresponds to the luminance (perceived brightness).
X, Z correspond to the chromaticity (color).

$\begin{gather*} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} 0.49000 & 0.31000 & 0.20000\\ 0.17697 & 0.81240 & 0.01063\\ 0.00000 & 0.01000 & 0.99000 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} \\ \begin{bmatrix} R \\ G \\ B \end{bmatrix} = \frac{1}{3400850} \begin{bmatrix} 8041697 & -3049000 & -1591847 \\ -1752003 & 4851000 & 301853 \\ 17697 & -49000 & 3432153 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \end{gather*}$
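The two matrices above can be applied directly. This is a small sketch in plain Python; the helper names (`mat_vec`, `rgb_to_xyz`, `xyz_to_rgb`) are just illustrative.

```python
# RGB -> XYZ matrix, copied from the formula above.
RGB_TO_XYZ = [
    [0.49000, 0.31000, 0.20000],
    [0.17697, 0.81240, 0.01063],
    [0.00000, 0.01000, 0.99000],
]

# XYZ -> RGB matrix, also copied from above (integer matrix divided by 3400850).
XYZ_TO_RGB = [
    [value / 3400850 for value in row]
    for row in [
        [8041697, -3049000, -1591847],
        [-1752003, 4851000, 301853],
        [17697, -49000, 3432153],
    ]
]

def mat_vec(matrix, vec):
    """Multiply a 3x3 matrix (list of rows) by a length-3 vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def rgb_to_xyz(rgb):
    return mat_vec(RGB_TO_XYZ, rgb)

def xyz_to_rgb(xyz):
    return mat_vec(XYZ_TO_RGB, xyz)

white_xyz = rgb_to_xyz([1.0, 1.0, 1.0])
print(white_xyz)              # each row of the matrix sums to 1
print(xyz_to_rgb(white_xyz))  # round trip back to RGB
```

Note that an XYZ triple outside the RGB primaries' gamut comes back with a negative component, which is the "negative amount of light" from the color matching experiment.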

xyY color space

$x = \frac{X}{X+Y+Z}, y = \frac{Y}{X+Y+Z}$

Instead of X, Y, Z, we use the normalized values x, y and the luminance Y.
z is not used because it can easily be obtained from x, y. ( $z = 1-x-y$ )
We can restore the original X, Y, Z values from x, y, and Y.

$X = \frac{Y}{y}x, Z = \frac{Y}{y}z = \frac{Y}{y}(1-x-y)$
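A minimal sketch of the forward and inverse conversions, following the two formulas above. The function names are illustrative, and the zero checks are an added assumption about how to treat degenerate inputs.

```python
def xyz_to_xyy(X, Y, Z):
    """Return (x, y, Y): normalized chromaticity plus luminance."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("chromaticity is undefined when X + Y + Z = 0")
    return X / total, Y / total, Y

def xyy_to_xyz(x, y, Y):
    """Recover (X, Y, Z) using X = (Y / y) x and Z = (Y / y)(1 - x - y)."""
    if y == 0:
        raise ValueError("cannot recover X and Z when y = 0")
    scale = Y / y
    return scale * x, Y, scale * (1.0 - x - y)

x, y, lum = xyz_to_xyy(0.3, 0.4, 0.3)
print(x, y, lum)
print(xyy_to_xyz(x, y, lum))
```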

xy chromaticity diagram

Because x and y are normalized, only Y carries luminance information, and only x and y carry chromaticity information.
Therefore, we can use the xy plane to illustrate the chromaticity of every visible color.

Perceptually Uniform Space

Does the L2 norm (i.e. Euclidean distance) in a color space reflect perceived differences between colors?
The xyY color space is not perceptually uniform.

MacAdam ellipses mark regions of the chromaticity diagram within which color differences are imperceptible.
In some regions people are poor at distinguishing hue, and in others at distinguishing saturation.

If we can map the MacAdam ellipses into circles, we get a perceptually uniform space.
However, this requires a non-linear mapping.

e.g. The CIE LAB color space uses the reference white $(X_0, Y_0, Z_0)$ to calculate the L, a, b values.
L represents lightness, and a, b represent the opponent color axes (green-red and blue-yellow).

$\begin{align*} L &= 25 \left( 100\frac{Y}{Y_0} \right)^\frac{1}{3} - 16 \\ a &= 500 \left[ \left( \frac{X}{X_0} \right)^\frac{1}{3} - \left( \frac{Y}{Y_0} \right)^\frac{1}{3} \right] \\ b &= 200 \left[ \left( \frac{Y}{Y_0} \right)^\frac{1}{3} - \left( \frac{Z}{Z_0} \right)^\frac{1}{3} \right] \end{align*}$
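The formulas above translate almost line by line into code. This sketch follows them exactly as written here, with no special handling for very dark colors, and assumes a D65-like reference white normalized to $Y_0 = 1$ (the specific white values are an assumption for the example).

```python
def xyz_to_lab(X, Y, Z, white):
    """CIE LAB from XYZ, using the cube-root formulas as written above."""
    X0, Y0, Z0 = white
    fx = (X / X0) ** (1 / 3)
    fy = (Y / Y0) ** (1 / 3)
    fz = (Z / Z0) ** (1 / 3)
    L = 25 * (100 * Y / Y0) ** (1 / 3) - 16
    a = 500 * (fx - fy)
    b = 200 * (fy - fz)
    return L, a, b

# Assumed reference white (approximately D65, normalized so that Y0 = 1).
WHITE = (0.95047, 1.0, 1.08883)

print(xyz_to_lab(*WHITE, WHITE))           # the white itself: L near 100, a = b = 0
print(xyz_to_lab(0.2, 0.18, 0.15, WHITE))  # an arbitrary darker color
```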

Color Gamut

Real devices can't use imaginary colors like X, Y, Z. We can only mix existing colors!

If we mix non-negative amounts of n colors, we can represent every color inside the convex hull of their positions on the chromaticity diagram.
This area - the range of colors that can be accurately represented - is called the color gamut.

sRGB is the most popular color gamut, using red, green and blue. ~~But you can't represent pure red color in sRGB.~~
CMYK is used in printers, which use more than 3 colors because the color gamut is too small if only 3 colors are used.
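For three primaries the convex hull is a triangle, so "is this chromaticity inside the gamut?" becomes a point-in-triangle test. The sketch below assumes the commonly quoted sRGB primary chromaticities and the D65 white point in xy coordinates; the function names are illustrative.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); its sign says which side of o->a b lies on."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_gamut(point, primaries):
    """True if an xy chromaticity lies inside (or on) the triangle of primaries."""
    r, g, b = primaries
    d1 = cross(r, g, point)
    d2 = cross(g, b, point)
    d3 = cross(b, r, point)
    has_negative = d1 < 0 or d2 < 0 or d3 < 0
    has_positive = d1 > 0 or d2 > 0 or d3 > 0
    # Inside means the point is on the same side of all three edges.
    return not (has_negative and has_positive)

# Assumed xy chromaticities of the sRGB primaries and the D65 white point.
SRGB_PRIMARIES = [(0.64, 0.33), (0.30, 0.60), (0.15, 0.06)]
D65_XY = (0.3127, 0.3290)

print(in_gamut(D65_XY, SRGB_PRIMARIES))        # white point
print(in_gamut((0.08, 0.83), SRGB_PRIMARIES))  # a very saturated green
```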

Color gamuts that extend outside the visible color space cannot be reproduced by devices.
They are used to define image formats that can be converted to any other color gamut.

Reference white

Which color is "white"? We choose a specific color as pure white.
e.g. D65: White similar to sunlight, with a color temperature of 6504K

Complementary Colors

Draw a straight line connecting the color and the reference white, and extend it past the white. The color on the opposite side of the reference white is the complementary color.

Dominant Wavelength

Draw a straight line from the reference white through the color. The point where the line intersects the spectral locus (the curved part) is the dominant wavelength of the color.
i.e. The dominant wavelength is the hue of the color. (Pure color)
i.e. This color can be made by mixing the dominant wavelength and the reference white.

If the line intersects the line of purples (the straight part), the color is non-spectral. (i.e. no monochromatic (single frequency) light source can generate it)
Instead, we extend the line in the opposite direction and define its intersection with the spectral locus as the subtractive dominant wavelength (complementary wavelength) of the color. i.e. This color can be made by subtracting the complementary wavelength from the reference white.

Color models

Color space is the full range of colors we can choose from.
Color model is the way a particular color in a color space is specified, which is relevant to color gamut.

Convenience: Users can easily choose the color they want.
Color compositing/processing: The color space matters when we interpolate/blend colors.
Efficiency of encoding: We can spend more of the numerical range on perceptually significant colors.

Additive/Subtractive Color Models

Additive: Combines colored lights, e.g. RGB
Subtractive: Combines paint colors, e.g. CMYK ~~Actually multiplicative because we multiply colors~~

RGB Color Model

Additive color model
Easy for display devices
VERY widely used
Not perceptual (e.g. How do I make indigo?)

RGB Color Encoding

Usually an RGB color is encoded as 8bpc (bits per color channel) hexadecimal values. e.g. #1B1F8A
But the sRGB gamut is already small. Are 256 possible values per channel enough???

High dynamic range (HDR) images use floating point numbers instead of 8-bit integers.
Devices with high dynamic range are also available, but they are very expensive.
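As a quick illustration of the 8bpc hexadecimal encoding, here is a small sketch that splits a code like #1B1F8A into its three channel values and back. The helper names are illustrative.

```python
def hex_to_rgb(code):
    """Parse '#RRGGBB' into three integers in 0..255."""
    digits = code.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hexadecimal digits")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

def rgb_to_hex(r, g, b):
    """Format three 0..255 integers as '#RRGGBB'."""
    return "#{:02X}{:02X}{:02X}".format(r, g, b)

print(hex_to_rgb("#1B1F8A"))    # (27, 31, 138)
print(rgb_to_hex(27, 31, 138))  # #1B1F8A
```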

CMYK Color Model

Subtractive color model
Easy for printing devices
Still, not perceptual (e.g. When do we use black?)

HSV Color Model

Additive color model
Intuitive (Use Hue, Saturation (purity), Value (brightness) directly)

Actually made with RGB Color Model?
We can rotate the RGB cube to make it look like a hexagon, then turn it into a circle to create a HSV model.

The color around the perimeter of the circle is called the pure color.

HSV Color Encoding

HSV is also encoded in 8bpc.
But actually, RGB and HSV values do not correspond 1:1 when each channel is quantized into 256 equally spaced values!
In practice, we use a specific algorithm to convert between RGB and HSV values.
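Python's standard `colorsys` module provides one such conversion on floats in [0, 1]. The sketch below wraps it with 8-bit quantization; scaling hue to 0..255 is an assumption made for this example (other encodings use different hue ranges). It also shows why the mapping is not 1:1: every HSV triple with V = 0 lands on the same black RGB value.

```python
import colorsys

def rgb8_to_hsv8(r, g, b):
    """8-bit RGB -> 8-bit HSV (hue scaled to 0..255, an assumed convention)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 255), round(s * 255), round(v * 255)

def hsv8_to_rgb8(h, s, v):
    """8-bit HSV -> 8-bit RGB, the inverse direction."""
    r, g, b = colorsys.hsv_to_rgb(h / 255, s / 255, v / 255)
    return round(r * 255), round(g * 255), round(b * 255)

print(rgb8_to_hsv8(255, 0, 0))     # pure red
print(hsv8_to_rgb8(0, 0, 0))       # black
print(hsv8_to_rgb8(200, 255, 0))   # also black: different HSV, same RGB
```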

Intensity-based Color Models

Use intensity (brightness) as Y, and use the other two channels for color.
Why? Humans are more sensitive to brightness than to color.
If we use full resolution for intensity and low resolution for color, we can actually get an image similar to the original!

YIQ model and YCbCr model

YIQ: Color model used by analog NTSC color TV systems. I lies roughly along the orange-blue axis and Q along the magenta-green axis.
YCbCr: Color model used by digital video and photography systems. Cb is the blue-difference component and Cr is the red-difference component.
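To make the intensity/color split concrete, here is a sketch of an RGB to YCbCr conversion. It assumes the BT.601 luma weights (0.299, 0.587, 0.114), which are one common choice and not necessarily the variant the lecture had in mind, and it keeps Cb and Cr centered on 0 rather than offset into an 8-bit range.

```python
def rgb_to_ycbcr(r, g, b):
    """RGB in [0, 1] -> (Y, Cb, Cr), with Cb and Cr centered on 0."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # intensity (luma), assumed BT.601 weights
    cb = (b - y) / 1.772                    # blue-difference component
    cr = (r - y) / 1.402                    # red-difference component
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr."""
    r = y + 1.402 * cr
    b = y + 1.772 * cb
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b

print(rgb_to_ycbcr(0.5, 0.5, 0.5))  # gray: all the information is in Y
print(rgb_to_ycbcr(1.0, 0.0, 0.0))  # red: positive Cr
```

Because gray pixels put everything into Y, storing Cb and Cr at lower resolution mostly costs color detail, which we are less sensitive to.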

Color sensing device

Camera

Color Sensing Camera

As light reflects, refracts, and passes through a prism, it splits into three branches.
Each branch heads toward its own color sensor.

Each color sensor has a cellophane-like color filter in front of it, so the sensors measure the intensity of red, green, and blue respectively.

Professional cameras use this design, but it makes the camera large, and it needs a separate sensor for each of the 3 colors.

Bayer filter

Instead, we normally use a Bayer filter.
It divides a single pixel into four subpixels, and each subpixel measures the intensity of a single color.
We use 1 red, 1 blue, and 2 green sensors because humans are more sensitive to green.

The color of a pixel is estimated from the values of the 8 neighboring subpixels.
This algorithm... is actually a trade secret!
By using different algorithms with the Bayer filter, we can make various filters. e.g. food, portrait, night
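Since real demosaicing algorithms are proprietary, the following is only a naive stand-in: it assumes an RGGB layout and estimates each channel at a site by averaging the same-colored samples in the surrounding 3x3 window. The layout, function names, and averaging rule are all assumptions for illustration.

```python
def bayer_color(row, col):
    """Channel measured at a sensor site, assuming an RGGB layout."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"

def demosaic(raw):
    """Estimate (R, G, B) per site by averaging same-colored samples in a 3x3 window."""
    height, width = len(raw), len(raw[0])
    if height < 2 or width < 2:
        raise ValueError("need at least a 2x2 mosaic")
    image = []
    for row in range(height):
        out_row = []
        for col in range(width):
            sums = {"R": 0.0, "G": 0.0, "B": 0.0}
            counts = {"R": 0, "G": 0, "B": 0}
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = row + dr, col + dc
                    if 0 <= rr < height and 0 <= cc < width:
                        channel = bayer_color(rr, cc)
                        sums[channel] += raw[rr][cc]
                        counts[channel] += 1
            out_row.append(tuple(sums[ch] / counts[ch] for ch in "RGB"))
        image.append(out_row)
    return image

# A uniformly colored scene seen through the mosaic: each site records one channel.
scene = {"R": 0.8, "G": 0.5, "B": 0.2}
raw = [[scene[bayer_color(r, c)] for c in range(4)] for r in range(4)]
estimated = demosaic(raw)
print(estimated[1][2])  # close to (0.8, 0.5, 0.2)
```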

White balance

Chromatic adaptation: Most white lights are not actually white!
e.g. sunlight is bluish, a fluorescent lamp is yellowish
But the human eye adapts to the color of the light, so we still perceive white paper as white.

Problem: Devices don't do that... In a raw picture taken by a camera, the white paper is not white!
We have to adjust the colors based on the light color too!
First, we find a pixel that should be white, then we adjust all colors based on that pixel.
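One simple way to do this adjustment is to scale each channel so that the chosen pixel comes out with equal R, G, B. The sketch below assumes linear RGB values in [0, 1] and a reference pixel with no zero channel; it is a naive per-channel gain, not what any particular camera does.

```python
def white_balance(pixels, white_pixel):
    """Scale channels so that white_pixel maps to a neutral (equal R, G, B) color."""
    target = max(white_pixel)
    gains = [target / channel for channel in white_pixel]
    # Clamp to 1.0 so boosted channels stay in range.
    return [
        tuple(min(1.0, value * gain) for value, gain in zip(pixel, gains))
        for pixel in pixels
    ]

# A yellowish cast: the paper that should be white was recorded as (0.9, 0.8, 0.6).
photo = [(0.9, 0.8, 0.6), (0.45, 0.40, 0.30), (0.2, 0.5, 0.1)]
balanced = white_balance(photo, white_pixel=(0.9, 0.8, 0.6))
print(balanced[0])  # the paper is now neutral
print(balanced[1])  # a gray card under the same light is neutral too
```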

Color Temperature

We describe the color of a light source by its temperature!
e.g. The color temperature of sunrise/sunset is about 2000K, and that of daytime sunlight is 5500~6500K.

To make a precise camera, we should measure the color temperature of the light and use it for white balance.
~~But most people just believe auto white balance~~

Human Perception

Humans are more sensitive to differences in a stimulus than to its absolute value.

Weber-Fechner law: The subjective sensation is proportional to the logarithm of the stimulus intensity.

$p = k \ln\left( \frac{S}{S_0} \right)$

Color Quantization Problem

We want to encode intensity levels. Because of the Weber-Fechner law, linearly spaced levels don't feel linear...

Humans perceive intensity steps as uniform if the ratio between consecutive levels is constant.

$\frac{I_1}{I_0} = \cdots = \frac{I_n}{I_{n-1}} = \gamma, I_k = \gamma^k I_0$
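As a small sketch of $I_k = \gamma^k I_0$, the function below builds levels between a chosen darkest and brightest intensity so that every step has the same ratio. The endpoints and level count are arbitrary example values.

```python
def intensity_levels(i_min, i_max, n):
    """Return n + 1 levels from i_min to i_max with a constant ratio between neighbors."""
    ratio = (i_max / i_min) ** (1.0 / n)   # this is gamma in I_k = gamma^k * I_0
    return [i_min * ratio ** k for k in range(n + 1)]

levels = intensity_levels(0.01, 1.0, 4)
print(levels)
print([levels[k + 1] / levels[k] for k in range(4)])  # all ratios are equal
```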

Gamma Correction

Linear encoding can't differentiate darker intensities, and wastes too many bits on lighter intensities.
Solution: we use a gamma function! $I_{encoded} = I^\frac{1}{\gamma}$
We don't use log because log blows up around 0.

Most images (especially sRGB images) are encoded using gamma correction!
To get the actual intensity of the image, we should apply the inverse gamma function. $I = I_{encoded}^\gamma$

A gamma of 2.2 is normally used for historical reasons such as CRTs.
But gamma can be any value higher than 1, or even any value lower than 1.
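A minimal sketch of 8-bit gamma encoding and decoding with a pure power function and $\gamma = 2.2$. The function names are illustrative, and real formats may use a slightly different curve.

```python
GAMMA = 2.2

def encode(intensity, gamma=GAMMA):
    """Linear intensity in [0, 1] -> 8-bit code value using I^(1/gamma)."""
    return round(255 * intensity ** (1.0 / gamma))

def decode(code, gamma=GAMMA):
    """8-bit code value -> linear intensity using I_encoded^gamma."""
    return (code / 255) ** gamma

dark_a, dark_b = 0.0005, 0.0015
# Linear 8-bit quantization maps both dark intensities to the same code...
print(round(255 * dark_a), round(255 * dark_b))
# ...while gamma encoding keeps them apart.
print(encode(dark_a), encode(dark_b))
print(decode(encode(0.5)))  # close to 0.5 after a round trip
```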