
The color space in a video clip


1. Preamble

To view the color description of a video file, use the ffprobe command: ffprobe -i <video file> -show_streams, which prints the following color-related information:

[Figure: ffprobe output showing the color-related stream fields]

The parameters above indicate: the video's color model is YUV with 4:2:0 (yuv420p) sampling; the color range is TV mode, also known as limited range; the YUV-to-RGB conversion matrix is BT.601 (smpte170m); the transfer function is BT.601 (smpte170m); and the color primaries are BT.601 (bt470bg).
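For reference, the color-related fields in the -show_streams output for such a file would look roughly like this excerpt (field names as ffprobe prints them per stream):

pix_fmt=yuv420p
color_range=tv
color_space=smpte170m
color_transfer=smpte170m
color_primaries=bt470bg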

In editing scenarios you often need to process multiple video files, and their color information may differ. How do you use these parameters correctly so that the video displays correctly on screen during playback and the exported video has no color cast? With these questions in mind, let's first go over some color space concepts: color gamut, transfer function, conversion matrix, and the YUV/YCbCr model.

2. Color space

2.1. Horseshoe Diagram

Experiments have shown that other colors can be produced by mixing just three basic colors. This is the trichromatic theory, which corresponds to the three types of cones, L, M, and S, with which the human eye perceives color.

Color matching experiment: split a screen into two halves and project a pure test color onto the left half. On the right half, mix the R, G, and B primaries; the observer adjusts the intensity of each primary until the two halves can no longer be distinguished, and the intensities of the three primaries are recorded.

In 1931, the International Commission on Illumination (CIE) used 700 nm red, 546.1 nm green, and 435.8 nm blue as the three primaries and established the CIE 1931 RGB chromaticity system through color matching experiments. From this system the plane coordinate diagram with only the r and g axes, the CIE 1931 RGB chromaticity diagram, was derived, as shown in Figure 1. After eliminating the negative values and the brightness from the RGB chromaticity diagram, the CIE 1931 xy chromaticity diagram, also known as the horseshoe diagram, was obtained, as shown in Figure 2.

Figure 1: CIE 1931 RGB chromaticity diagram
Figure 2: CIE 1931 XY chromaticity diagram

2.2. Color gamut: color_primaries

A color gamut is defined by three primary colors and a white point.

Take R(1,0,0), G(0,1,0), B(0,0,1) and map them into CIE 1931 xy space to construct a triangle. The area of this triangle is the color gamut, and its three vertices represent the red, green, and blue primaries of that gamut. The xy coordinates of the three primaries differ between gamuts. For example, the BT.709 (sRGB) primaries are R(0.640, 0.330), G(0.300, 0.600), B(0.150, 0.060), while the BT.2020 primaries are R(0.708, 0.292), G(0.170, 0.797), B(0.131, 0.046). The larger the triangle, the more colors can be expressed: BT.709 (sRGB) covers 35.9% of all colors, while BT.2020 covers 75.8%.

Figure 3: Representation range of different color gamuts in the horseshoe diagram

The white point inside the gamut triangle represents the reference white of the gamut; moving it shifts the whole gamut toward blue or red. The reference white of BT.601/BT.709/BT.2020 is D65, with coordinates (x=0.3127, y=0.3290).

Color gamut conversion. The two-dimensional gamut diagram is a triangle; the three-dimensional one is a cube. Figure 4 shows the three-dimensional mapping of an RGB space into CIE XYZ space: the cube spanned by the vectors SR, SG, SB is the RGB color space of that gamut, and different gamuts correspond to different RGB cubes in the XYZ coordinate system. Gamut conversion therefore first transforms the RGB values into the XYZ coordinate system to get XYZ values, and then transforms those XYZ values into the target gamut's RGB coordinate system to get the new RGB values.

Figure 4: 3D mapping of the RGB color space in the CIE XYZ color space

Gamut conversion flow: BT.2020 gamut => XYZ coordinates => BT.709 gamut, as follows:

BT.2020 RGB to XYZ matrix:

[Matrix: BT.2020 RGB to XYZ]

XYZ to BT.709 RGB matrix:

[Matrix: XYZ to BT.709 RGB]

Multiplying the two gives the BT.2020-to-BT.709 matrix:

[Matrix: BT.2020 RGB to BT.709 RGB]
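These matrices can be derived directly from the primary and white-point coordinates given in Section 2.2. The NumPy sketch below (function names are illustrative, not from the article) builds each gamut's linear RGB -> XYZ matrix and composes the BT.2020 -> BT.709 conversion for linear-light RGB:

import numpy as np

def xy_to_xyz(x, y):
    # Chromaticity (x, y) to an XYZ vector with Y normalized to 1.
    return np.array([x / y, 1.0, (1.0 - x - y) / y])

def rgb_to_xyz_matrix(primaries, white):
    # Build the linear RGB -> XYZ matrix from primary and white-point chromaticities.
    cols = np.column_stack([xy_to_xyz(*primaries[c]) for c in ("r", "g", "b")])
    scale = np.linalg.solve(cols, xy_to_xyz(*white))  # make R=G=B=1 map to the white point
    return cols * scale                               # scale each primary column

D65    = (0.3127, 0.3290)
BT709  = {"r": (0.640, 0.330), "g": (0.300, 0.600), "b": (0.150, 0.060)}
BT2020 = {"r": (0.708, 0.292), "g": (0.170, 0.797), "b": (0.131, 0.046)}

bt2020_to_xyz   = rgb_to_xyz_matrix(BT2020, D65)
xyz_to_bt709    = np.linalg.inv(rgb_to_xyz_matrix(BT709, D65))
bt2020_to_bt709 = xyz_to_bt709 @ bt2020_to_xyz       # gamut conversion for linear-light RGB
print(np.round(bt2020_to_bt709, 4))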

2.3. Transfer Functions: color_transfer

The transfer function is also known as gamma correction. A transfer function converts linear natural light signals into nonlinear electrical signals for storage; this process is gamma encoding, called the opto-electronic transfer function (OETF). The display converts the nonlinear electrical signal back into light on the screen; this process is gamma decoding, called the electro-optical transfer function (EOTF). The color_transfer value in a video file shows as BT.601 (smpte170m), BT.709, etc. The gamma value of BT.601 is 2.4 and that of BT.709 is 2.2. Figure 5 below shows the transfer function curves.

Figure 5: Gamma encoding/decoding curve
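As a concrete example, one common formulation of the BT.709 OETF is a 0.45 power curve with a short linear segment near black, and the decode step is its inverse. A minimal NumPy sketch (function names are illustrative):

import numpy as np

def oetf_bt709(L):
    # Opto-electronic transfer function: linear light (0..1) -> nonlinear signal.
    L = np.asarray(L, dtype=np.float64)
    return np.where(L < 0.018, 4.5 * L, 1.099 * np.power(L, 0.45) - 0.099)

def eotf_bt709(V):
    # Inverse of the OETF above: nonlinear signal -> linear light.
    V = np.asarray(V, dtype=np.float64)
    return np.where(V < 4.5 * 0.018, V / 4.5, np.power((V + 0.099) / 1.099, 1.0 / 0.45))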

There are 3 main reasons why gamma correction is needed:

1. The human eye's perception of brightness is nonlinear: it is more sensitive to dark details and relatively insensitive to bright details. This means that representing an image directly with linear luminance values makes it look visually unnatural. Gamma correction adjusts brightness with a nonlinear transformation so that it better matches the perceptual characteristics of the human eye. In Figure 6, the change of brightness in nature is linear, while the eye's visual response is roughly logarithmic.

Figure 6: Physical grayscale vs. visual grayscale

2. Display characteristics

Display devices such as CRT monitors, LCD screens, and projectors also have a nonlinear relationship between input voltage and output brightness. The light output of a CRT display roughly follows a power function of the input voltage: the output brightness L and the input signal V can be expressed as L = V^γ, where γ is the gamma value, usually about 2.2 to 2.5. To display images correctly on these devices, gamma correction is necessary.

3. Retain more dark details

Gamma correction of images can improve encoding and storage efficiency. Without gamma correction, high brightness values take up a large share of the encoding space, while details at low brightness values can be lost. By transforming the luminance values nonlinearly, gamma correction allocates the coding space better and preserves more shadow detail.

2.4. YUV conversion matrix: color_space

color_space indicates the conversion matrix between YUV and RGB; the coefficients form a 3x3 matrix.

YUV color model

Before discussing the conversion matrix, let's define YUV: Y carries the luminance information and UV carries the chrominance information. Advantages of YUV: 1. Luminance and chrominance are separated, which suits image processing. 2. Compared with RGB, YUV needs less storage space.

The YCbCr model is derived from the YUV model. YUV is used in analog television systems, such as the PAL and NTSC standards, to keep color TV signals compatible with black-and-white TV signals. The YCbCr color model is mainly used in digital video standards, such as JPEG image compression, MPEG video compression, and digital TV standards (such as ITU-R BT.601 and BT.709). People often say YUV when they mean YCbCr; in this article YUV always means YCbCr. YUV has many sampling formats, such as 4:4:4, 4:2:2, 4:1:1, and 4:2:0. The most common is 4:2:0, in which four Y samples share one pair of UV samples. YUV420 has four layouts: i420, YV12, NV12, and NV21; they differ mainly in how the U and V data are stored, as shown in Figure 7 below:

Figure 7: Differences between the four YUV420 formats
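To make the layouts concrete, here is a small sketch under the usual assumption of an 8-bit, tightly packed buffer (the helper name is made up):

def yuv420_plane_layout(width, height, fmt):
    # Return (name, offset, size) for each plane of an 8-bit YUV420 buffer.
    # i420 and YV12 store three separate planes; NV12/NV21 interleave the chroma.
    y_size = width * height
    c_size = (width // 2) * (height // 2)          # each chroma plane is quarter size
    if fmt == "i420":                              # Y, then U, then V
        return [("Y", 0, y_size), ("U", y_size, c_size), ("V", y_size + c_size, c_size)]
    if fmt == "yv12":                              # Y, then V, then U
        return [("Y", 0, y_size), ("V", y_size, c_size), ("U", y_size + c_size, c_size)]
    if fmt in ("nv12", "nv21"):                    # Y, then one interleaved UV (or VU) plane
        return [("Y", 0, y_size), ("UV" if fmt == "nv12" else "VU", y_size, 2 * c_size)]
    raise ValueError(fmt)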

Transformation matrix

The color_space field in a video file indicates which conversion matrix to use when converting YUV to RGB, such as BT.601/BT.709/BT.2020. Take BT.709 as an example:

[Matrix: BT.709 YUV <=> RGB conversion matrices]
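For reference, the full-range (normalized) BT.709 matrix can be written down from the luma coefficients Kr = 0.2126, Kg = 0.7152, Kb = 0.0722. A NumPy sketch that derives both directions (values rounded for illustration):

import numpy as np

KR, KG, KB = 0.2126, 0.7152, 0.0722   # BT.709 luma coefficients

# Normalized (full-range) RGB -> YCbCr: Cb = 0.5*(B - Y)/(1 - KB), Cr = 0.5*(R - Y)/(1 - KR).
RGB_TO_YUV_BT709 = np.array([
    [ KR,                     KG,                     KB                    ],  # Y
    [-0.5 * KR / (1 - KB),   -0.5 * KG / (1 - KB),    0.5                   ],  # Cb
    [ 0.5,                   -0.5 * KG / (1 - KR),   -0.5 * KB / (1 - KR)   ],  # Cr
])
YUV_TO_RGB_BT709 = np.linalg.inv(RGB_TO_YUV_BT709)
print(np.round(YUV_TO_RGB_BT709, 4))   # approx. [[1, 0, 1.5748], [1, -0.1873, -0.4681], [1, 1.8556, 0]]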

YUV color range

YUV has two color ranges: tv range and pc range. tv range is also called limited range, and pc range is also called full range. In ffmpeg, AVCOL_RANGE_MPEG and AVCOL_RANGE_JPEG represent tv range and pc range respectively. The table below compares the value ranges of tv range and pc range:

[Table: tv range vs. pc range value ranges]
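For 8-bit data, the commonly used values are:

Range          Y          Cb/Cr
tv (limited)   16-235     16-240
pc (full)      0-255      0-255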

Why tv range is needed

1. Historical reasons: the limited range of YUV was designed to be compatible with analog TV signals. In an analog signal the values cannot reach full black (0) or full white (255), so a limited range is used to ensure signal stability and compatibility.

2. Preventing clipping and over-quantization: using a limited range helps prevent image loss caused by clipping or over-quantization during signal transmission and processing. For example, Y values in 0-15 and 236-255 are reserved for synchronization signals and error detection. The 16 unused codes in the lower segment are called footroom and those in the upper segment headroom; these unusable values carry the sync signal.

3. Color space conversion: When converting from RGB to YUV color space, the limited range can prevent extreme color values from causing distortion or clipping during the conversion process.

tv range YUV -> RGB conversion matrix

First move the origin from (16, 128, 128) back to (0, 0, 0), then scale the YUV values up to the pc (full) range, as follows:

[Matrix: tv range YUV offset removal and scaling]

Take BT.709 YUV as an example.

BT.709 normalized (full-range) YUV to RGB:

[Matrix: BT.709 normalized YUV to RGB]

BT.709 8-bit tv-range YUV to RGB:

[Matrix: BT.709 8-bit tv-range YUV to RGB]
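Putting the two steps together, a minimal NumPy sketch of the 8-bit tv-range BT.709 YUV-to-RGB conversion might look like this (coefficients rounded; the function name is illustrative):

import numpy as np

# Full-range BT.709 YCbCr -> RGB matrix (see Section 2.4).
YUV_TO_RGB_BT709 = np.array([
    [1.0,  0.0,     1.5748],
    [1.0, -0.1873, -0.4681],
    [1.0,  1.8556,  0.0],
])

def bt709_tv_yuv_to_rgb(y, cb, cr):
    # Convert one 8-bit tv-range (limited) BT.709 YCbCr pixel to 8-bit RGB.
    yn  = (y  - 16.0) / 219.0          # Y:  16..235 -> 0..1
    cbn = (cb - 128.0) / 224.0         # Cb: 16..240 -> -0.5..0.5
    crn = (cr - 128.0) / 224.0         # Cr: 16..240 -> -0.5..0.5
    rgb = YUV_TO_RGB_BT709 @ np.array([yn, cbn, crn])
    return np.clip(np.round(rgb * 255.0), 0, 255).astype(np.uint8)

print(bt709_tv_yuv_to_rgb(235, 128, 128))   # tv-range white -> [255 255 255]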

3. Processing in editing

Take BT.709 as an example: convert the input video (here a BT.601, tv-range file) into the BT.709 color gamut. The steps are listed below and summarized in a sketch after the list.

[Figure: color processing pipeline in the editor]
  • First, based on the color_space and color_range information, obtain the YUV-to-RGB matrix: RGB = YUV2RGB_BT470BG_TV_MAT * YUV.
  • Using the transfer function indicated by color_transfer, convert the nonlinear RGB to linear RGB (gamma decoding): RGB = BT470BG_EOTF_gammaMethod(RGB).
  • Gamut conversion: go from the BT.601 gamut coordinate system to XYZ coordinates and then to the BT.709 coordinate system: rgb = BT601_TO_BT709_MAT * rgb.
  • Send the linear RGB data to the rendering engine and apply effects, such as Gaussian blur.
  • The screen expects nonlinear RGB, so gamma encoding is required before the image can be displayed: RGB = BT709_OETF_gammaMethod(RGB).
  • Video export: multiply the nonlinear RGB used for display by the inverse of the BT.709 color_space / tv color_range matrix to obtain YUV data, and tag the video stream with color range: tv, color_space: BT.709, transfer function: BT.709, color primaries: BT.709.
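A minimal sketch of this per-pixel chain, assuming the matrices and transfer functions are built as in Section 2 (all names are illustrative):

def process_pixel(yuv, effects,
                  yuv2rgb_601_tv, eotf_601, m_601_to_709, oetf_709, rgb2yuv_709_tv):
    # yuv: NumPy vector; matrices: 3x3 NumPy arrays; eotf/oetf: callables.
    rgb = yuv2rgb_601_tv @ yuv      # 1. BT.601 tv-range YUV -> nonlinear RGB
    rgb = eotf_601(rgb)             # 2. gamma decode: nonlinear -> linear RGB
    rgb = m_601_to_709 @ rgb        # 3. gamut conversion BT.601 -> BT.709 (linear light)
    for effect in effects:          # 4. apply effects (e.g. Gaussian blur) in linear RGB
        rgb = effect(rgb)
    rgb = oetf_709(rgb)             # 5. gamma encode for display
    return rgb2yuv_709_tv @ rgb     # 6. export: back to BT.709 tv-range YUV

In practice these steps typically run in the GPU rendering pipeline rather than per pixel on the CPU, but the order is the same.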

4. Summary

In video editing, color processing is a crucial step in ensuring video quality and visual effect. Understanding the concepts of color gamut and color space, and performing color correction and processing correctly, can significantly improve a video's professionalism and appeal; it is also important groundwork for later HDR support and HDR/SDR conversion.

Author: Yi Zen

Source-WeChat public account: Bilibili Technology

Source: https://mp.weixin.qq.com/s/4n70cc9_R2KJh4GF2fexJg
