In the field of intelligent transportation, pedestrian detection has made considerable progress.

Built on computer vision techniques such as image segmentation and deep learning, pedestrian detection is a key enabling technology for the autonomous driving industry. It lets autonomous vehicles perceive and understand their surroundings more accurately and make safer, more intelligent decisions.

This article continues the introduction to pedestrian detection technologies commonly used in autonomous driving.

1. Pedestrian detection method based on feature extraction

Feature extraction is a basic concept in computer vision and image processing. It refers to selecting, from the features of the target image, the most representative ones with strong classification ability.

Each image has many kinds of features, such as color, grayscale, and texture, which describe the image from different angles. In pedestrian detection, pedestrians are non-rigid objects whose position and pose vary widely, so detection needs features with strong discriminative ability and good robustness, which makes feature-extraction-based methods a good fit.

Low-level feature detection method

Among the many low-level features applied to pedestrian detection, the mainstream choice is the Histogram of Oriented Gradients (HOG) feature proposed in 2005. Most later pedestrian detection methods build on the combination of HOG features and an SVM classifier.

HOG detects human shapes by describing the local gradient structure of the image. For each pixel it computes the horizontal and vertical gradients and, from them, the gradient magnitude and gradient angle. Within each local cell, every pixel then votes into a histogram over gradient angles, with the vote weighted by its gradient magnitude, so the histogram summarizes the local gradient strength and its direction. Finally, the resulting feature vector is passed to an SVM classifier for classification and prediction.
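
As a rough illustration of the voting step described above, the following minimal sketch computes the orientation histogram of a single cell in pure Python. The function name `cell_histogram`, the 9 unsigned bins over 0 to 180 degrees, and the clamped central differences are assumptions chosen for readability; full HOG implementations also interpolate votes between neighboring bins, which is omitted here.

```python
import math


def cell_histogram(cell, bins=9):
    """Magnitude-weighted histogram of gradient angles for one grayscale cell.

    `cell` is a list of rows of pixel intensities (hypothetical input format).
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins  # unsigned gradients: 0..180 degrees
    for y in range(height):
        for x in range(width):
            # Horizontal and vertical central differences, clamped at the border.
            gx = cell[y][min(x + 1, width - 1)] - cell[y][max(x - 1, 0)]
            gy = cell[min(y + 1, height - 1)][x] - cell[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Each pixel votes for its angle bin, weighted by gradient magnitude.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# A vertical edge has purely horizontal gradients, so all weight lands in bin 0.
vertical_edge = [[0, 0, 10, 10] for _ in range(4)]
print(cell_histogram(vertical_edge))
```

In a complete detector, the histograms of all cells in a window would be concatenated and passed to the SVM classifier.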

The HOG feature detection method allows neighboring blocks to overlap and normalizes each block locally, so it tolerates considerable changes in illumination and small shifts in position. It is therefore highly robust and represents the characteristics of the human body well.
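
The tolerance to lighting changes can be illustrated with a small sketch of overlapping block normalization. The 2x2 block size, the stride of one cell, and the L2 norm are assumptions in the style of common HOG setups, and `block_descriptor` is a hypothetical helper name, not an API from any library.

```python
import math


def block_descriptor(cell_hists, block=2, eps=1e-6):
    """Concatenate L2-normalized, overlapping blocks of cell histograms.

    `cell_hists[r][c]` is the histogram (list of floats) of the cell at row r, column c.
    """
    rows, cols = len(cell_hists), len(cell_hists[0])
    descriptor = []
    # A stride of one cell means neighboring blocks share cells (they overlap).
    for r in range(rows - block + 1):
        for c in range(cols - block + 1):
            vec = [v
                   for dr in range(block)
                   for dc in range(block)
                   for v in cell_hists[r + dr][c + dc]]
            norm = math.sqrt(sum(v * v for v in vec) + eps ** 2)
            descriptor.extend(v / norm for v in vec)
    return descriptor


# Toy 3x3 grid of cells with 2 bins each; doubling every value mimics a brighter image.
grid = [[[float(r + c + 1), float(r * c + 2)] for c in range(3)] for r in range(3)]
brighter = [[[2.0 * v for v in cell] for cell in row] for row in grid]
difference = max(abs(a - b) for a, b in zip(block_descriptor(grid), block_descriptor(brighter)))
print(difference)  # very close to zero: normalization cancels the global contrast change
```

Because every block is divided by its own norm, a uniform scaling of gradient magnitudes leaves the descriptor almost unchanged.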

Hybrid feature detection

Hybrid feature detection builds multiple features on top of single features, because detection based on a single feature often suffers from low accuracy and weak generalization.

For example, in a low-light environment the captured visible-light image is of poor quality, and a road pedestrian detector that relies on the visible-light modality alone struggles to obtain enough pedestrian features, which makes subsequent detection difficult.

Hybrid feature detection can use multispectral images for more comprehensive recognition. It combines multispectral images with other types of image features, including texture, shape, and color features, to find the most representative features in the whole image feature space, thereby weakening the influence of illumination on the detection results.

Common multimodal detection methods are:

● Fusing HOG features with LBP features, which describe texture: the algorithm adds weighting components that assign a weight to each target in the image under test, effectively reducing the impact of occluded or overlapping pedestrian targets.

● The pedestrian detection method based on co-occurrence histograms of oriented gradients (CoHOG): it describes pedestrian targets through features over multiple gradient directions and achieves higher accuracy and recall than the single-feature HOG-based method.

● The edge-feature-based detection method: edge features are combined with pedestrian color and texture information, and partial least squares (PLS) is used to reduce the feature dimension and improve computational efficiency.
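
To make the first item in the list above more concrete, here is a minimal sketch of a basic 8-neighbor LBP texture histogram and a plain weighted concatenation with a HOG vector. The helper names (`lbp_code`, `lbp_histogram`, `fuse_features`) and the per-feature weights are illustrative assumptions; this is not the occlusion-aware weighting scheme mentioned in the bullet, only the simplest form of feature fusion.

```python
def lbp_code(img, y, x):
    """8-bit Local Binary Pattern code of the pixel at (y, x)."""
    center = img[y][x]
    # Neighbors visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if img[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Normalized 256-bin histogram of LBP codes over the interior pixels."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    total = sum(hist) or 1
    return [count / total for count in hist]


def fuse_features(hog_vec, lbp_vec, hog_weight=1.0, lbp_weight=1.0):
    """Weighted concatenation of shape (HOG) and texture (LBP) descriptors."""
    return [hog_weight * v for v in hog_vec] + [lbp_weight * v for v in lbp_vec]


patch = [[1, 1, 1, 1], [1, 9, 9, 1], [1, 9, 9, 1], [1, 1, 1, 1]]
fused = fuse_features([0.6, 0.8], lbp_histogram(patch))
print(len(fused))  # 2 HOG values followed by 256 LBP bins
```

The fused vector would then be fed to a single classifier, so that texture cues can compensate when gradient cues are weak.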

Multimodal fusion methods differ in detection speed, accuracy, and robustness, so the pedestrian feature description should be chosen to suit the application scenario.

2. Pedestrian detection method based on classification and positioning

Classification and positioning methods directly determine whether the image under test contains pedestrians and locate the window that contains them. Commonly used approaches at present are the sliding window method and the beyond sliding window method.

Sliding window method

The sliding window method selects a window of fixed width and height in the image under test, slides it across the image step by step, and uses a classifier model to identify pedestrian features in each window. It comes in two variants: the holistic approach and the partial (part-based) approach.
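
The scanning procedure can be sketched as follows. The 64x128 window, the stride of 8 pixels, and the `score_fn` callback standing in for a trained classifier are all assumptions for illustration.

```python
def sliding_windows(img_w, img_h, win_w=64, win_h=128, stride=8):
    """Yield (x, y, width, height) for every window position inside the image."""
    for y in range(0, img_h - win_h + 1, stride):
        for x in range(0, img_w - win_w + 1, stride):
            yield (x, y, win_w, win_h)


def detect(img_w, img_h, score_fn, threshold=0.5, **window_args):
    """Keep the windows whose classifier score exceeds the threshold.

    `score_fn` is a hypothetical stand-in for a trained classifier that
    maps a window to a pedestrian score.
    """
    return [win for win in sliding_windows(img_w, img_h, **window_args)
            if score_fn(win) > threshold]


# Toy classifier that only "fires" on the window whose top-left corner is (8, 0).
hits = detect(80, 144, lambda win: 1.0 if win[:2] == (8, 0) else 0.0)
print(hits)
```

In practice the scan is usually repeated over several image scales so that pedestrians of different sizes fit the fixed window.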

丨Holistic approach

The holistic approach extracts the global features of the pedestrian target in the window directly; commonly used classifiers include SVM and Boosting. When annotating the training data, the holistic approach only requires a rectangular box around each pedestrian, which is efficient. However, it tends to miss detailed features, and its detection accuracy drops sharply in crowded scenes and under occlusion, so it is not suitable for detecting small pedestrian targets.

丨Partial (part-based) approach

The partial approach divides the pedestrian into several parts, trains a separate classifier to detect each part, and models the geometric relationship between the parts. Compared with the holistic approach, it is usually more robust in scenes with heavy occlusion and overlapping pedestrians, but matching each part against the whole image is time-consuming, so it is difficult to meet real-time requirements.
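
One simple way to combine part classifiers with a geometric relationship is to add up the part scores and subtract a penalty when a part is found far from its expected position. The sketch below assumes a quadratic penalty and hypothetical part names and anchor offsets; it illustrates the idea rather than any specific published model.

```python
def part_based_score(part_detections, anchors, root, deform_weight=0.01):
    """Combine part scores with a penalty for deviating from the expected layout.

    part_detections: {part_name: (score, x, y)} from the per-part classifiers.
    anchors: {part_name: (dx, dy)} expected offset of each part from the root.
    root: (x, y) position of the whole-pedestrian hypothesis.
    """
    total = 0.0
    for name, (score, x, y) in part_detections.items():
        anchor_dx, anchor_dy = anchors[name]
        dx = x - (root[0] + anchor_dx)
        dy = y - (root[1] + anchor_dy)
        # Quadratic deformation cost: parts may move, but not for free.
        total += score - deform_weight * (dx * dx + dy * dy)
    return total


anchors = {"head": (32, 10), "legs": (32, 100)}  # hypothetical layout
well_placed = {"head": (1.0, 32, 10), "legs": (0.5, 32, 100)}
shifted_head = {"head": (1.0, 42, 10), "legs": (0.5, 32, 100)}
print(part_based_score(well_placed, anchors, root=(0, 0)))
print(part_based_score(shifted_head, anchors, root=(0, 0)))
```

If one part is occluded and scores poorly, the remaining parts can still carry the hypothesis, which is where the robustness to occlusion comes from. The cost is that every part must be searched for around every root position.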

Beyond sliding window method

Compared with the sliding window method, the beyond sliding window method is used less often in pedestrian detection. Representative methods include:

● Efficient sub-window search: as the name suggests, this method can locate the target quickly, but its computation is relatively complicated.

● Approximation algorithm of alternating search: an improvement on efficient sub-window search that can be about 900 times faster, but its performance under occlusion is mediocre.

● Implicit shape model: it combines the two methods above and does not rely on image preprocessing to handle complex scenes. It can locate targets quickly and handles occlusion effectively, but its detection accuracy on local features is poor, real-time detection is hard to achieve, and its false detection rate is high.

3. Implementation details

Building a pedestrian detection system depends not only on techniques such as feature extraction, classification and positioning, image segmentation, and deep learning, but also on the quality of the training samples.

Training a deep learning model requires a large amount of high-quality training data to improve model performance. For pedestrian detection systems, the training samples should contain pedestrian images captured in different scenes, weather, times of day, and lighting conditions, so that the model generalizes well across situations.

The labeling quality of the training samples also matters, because labeling accuracy directly affects the performance of the pedestrian detection model. If the bounding boxes or class labels are inaccurate, the model learns the wrong features and the detection error rate increases.

Therefore, data should be labeled with high-quality annotation tools by professional annotators. Annotators draw a bounding box around each pedestrian and assign its category according to predefined labeling rules. During labeling, each image must be reviewed carefully and every pedestrian labeled carefully to ensure that the annotations are accurate and consistent.
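
One common way to check annotation consistency during review is to compare the boxes that two annotators drew for the same pedestrian using intersection over union (IoU). The sketch below is an illustrative assumption about how such a check could look; the `(x1, y1, x2, y2)` box format, the `needs_review` helper, and the 0.7 threshold are not taken from any particular labeling tool.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


def needs_review(box_a, box_b, min_iou=0.7):
    """Flag a pedestrian whose two annotations disagree too much."""
    return iou(box_a, box_b) < min_iou


annotator_1 = (100, 50, 160, 200)
annotator_2 = (104, 52, 162, 205)
print(iou(annotator_1, annotator_2), needs_review(annotator_1, annotator_2))
```

Pairs that fall below the threshold can be sent back for a second look instead of going straight into the training set.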

Commonly used annotation types for pedestrians include semantic segmentation, instance segmentation, rectangular boxes, and polygons. They mark the position and shape of each pedestrian more intuitively and improve the efficiency of the labeling work.

"For example, when drawing bounding boxes, the label for a pedestrian training sample is usually the smallest rectangular window that contains the pedestrian. In fact, the box can be extended further downward: pedestrians generally stand on the ground, and the appearance of the ground is usually relatively stable, so the extended window can improve detection accuracy."

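The downward extension described in the quoted tip can be sketched as a small post-processing step on the annotated box. The 10% extension ratio and the helper name `extend_box_down` are assumptions for illustration; the source does not say how far the box should be extended.

```python
def extend_box_down(box, img_h, ratio=0.1):
    """Extend a (x1, y1, x2, y2) box downward by a fraction of its height.

    The bottom edge is clamped so the box never leaves the image.
    """
    x1, y1, x2, y2 = box
    extra = round((y2 - y1) * ratio)
    return (x1, y1, x2, min(img_h, y2 + extra))


tight_box = (10, 20, 50, 120)  # smallest rectangle around the pedestrian
print(extend_box_down(tight_box, img_h=200))
print(extend_box_down(tight_box, img_h=125))  # clamped at the image border
```

Applying the same ratio to every sample keeps the extended labels consistent across the dataset.
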
In summary, the quality of the annotated data is a key factor in whether a pedestrian detection system achieves its expected performance. If the labeling is inaccurate and the training data is of poor quality, model performance suffers badly, leading to problems such as lower pedestrian detection accuracy and missed detections.
