Research status of deep learning technology in animal object detection and recognition

2024-07-29 18:32:00

Citation format

Guo Yangyang, Du Shuzeng, Qiao Yongliang, Liang Dong. Research and Application Progress of Deep Learning in Smart Livestock Breeding[J]. Smart Agriculture, 2023, 5(1): 52-65. doi:10.12133/j.smartag.SA202205009

GUO Yangyang, DU Shuzeng, QIAO Yongliang, LIANG Dong. Advances in the Applications of Deep Learning Technology for Livestock Smart Farming[J]. Smart Agriculture, 2023, 5(1): 52-65. doi:10.12133/j.smartag.SA202205009


Animal target detection and recognition has become an integral part of animal husbandry and is an essential step toward modern, refined and scientific livestock farming. In smart animal husbandry, it makes it possible to detect animals promptly, determine the identity of each individual, obtain the associated information and build individual animal records, which in turn supports digital management and the traceability of livestock products.

Currently, individual animals are commonly detected and identified by giving each one a unique identifier or mark. The most common approach is to use plastic ear tags or radio frequency identification (RFID) devices, but these are vulnerable to external interference such as equipment damage or loss and collisions. In recent years, with the development of information technology, object detection and recognition methods based on computer vision have been widely used in livestock recognition research because they are non-contact and practical. These methods usually rely on the visual characteristics of the target samples (such as shape, texture and color), combined with intelligent algorithms, to detect and recognize objects. The sample images commonly used for individual animal detection and recognition cover areas such as the muzzle (mouth and nose), face and trunk, and individuals are detected and recognized from the feature information of these regions (Fig. 1).

Fig. 1 Commonly used target detection areas for livestock

1 Face detection and recognition

Non-contact animal detection and identification based on deep learning can effectively ease the burden on livestock farms and promote refined, scientific breeding. Recently, researchers have used deep learning algorithms to detect and recognize the faces of pigs, sheep, cattle and other animals efficiently and without contact.

For pigs, Xiangyu Li and Huiying Li trained a Deformable Convolutional Network (DCN) on a high-similarity pig face matching dataset to obtain a deformed pig face dataset, and then used the deformed dataset to fine-tune a Tweaked Convolutional Neural Network (TCNN), yielding a pig face feature point detection model. The error rate of pig face feature point detection with this method was only 5.60%. He Yutong et al. introduced the DenseBlock and SPP (Spatial Pyramid Pooling) modules into the YOLOv3 model and proposed the YOLOv3DBSPP (YOLOv3-DenseBlock-SPP) model to detect and identify pigs. The model achieved an average accuracy of 90.18%, and when the region-of-interest threshold was 0.5 and the classification probability threshold was 0.1, its average accuracy was 9.87% higher than that of the original YOLOv3 model.

For sheep, Wei Bin et al. used sheep faces detected by the YOLOv3 algorithm as the data for individual recognition and achieved a recognition accuracy of about 64% after training the VGGFace model. When only frontal sheep faces were selected as input data for training VGGFace, the recognition accuracy rose to more than 91%. Xue et al. proposed a sheep face detection and recognition method based on a Euclidean space metric (SheepFaceNet), which is trained on sheep face images captured in natural environments to achieve non-contact identification of individual sheep. In addition, to address the large amount of irrelevant information in sheep face images and the poor poses and angles, they proposed a sheep face detection and correction method (SheepFaceRepair) that aligns the sheep face area before SheepFaceNet performs the recognition. Li et al. combined MobileNetV2 with the Vision Transformer to propose a sheep face detection and recognition model named MobileViTFace. Through the Transformer, the model improves its ability to extract fine-grained features and suppresses interference from background information, so that it distinguishes different sheep faces more effectively.

For cattle, Kumar et al. developed a deep learning network model for individual recognition based on muzzle (mouth and nose) image samples. A CNN and a Deep Belief Network (DBN) extract a set of texture features that represent the muzzle images, and the extracted features are encoded by a Stacked Denoising Autoencoder (SDAE). The approach outperformed the most advanced existing methods for identifying cattle on a muzzle image database.
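The idea behind recognition with a Euclidean space metric, as mentioned above for sheep faces, can be illustrated with a minimal sketch. It assumes that a trained network has already mapped each face image to a feature vector; the toy three-dimensional vectors, the `identify` helper and the `max_distance` value are hypothetical and are not taken from SheepFaceNet or any of the cited studies.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(query, gallery, max_distance=0.8):
    """Return (animal_id, distance) for the closest enrolled embedding.

    animal_id is None when even the closest embedding is farther away
    than max_distance, i.e. the animal is treated as not enrolled.
    max_distance is an illustrative value, not a published setting.
    """
    best_id, best_dist = None, float("inf")
    for animal_id, embedding in gallery.items():
        dist = euclidean(query, embedding)
        if dist < best_dist:
            best_id, best_dist = animal_id, dist
    if best_dist > max_distance:
        return None, best_dist
    return best_id, best_dist

# Toy 3-D embeddings; a real network would output many more dimensions.
gallery = {
    "sheep_01": [0.9, 0.1, 0.0],
    "sheep_02": [0.0, 1.0, 0.1],
}
match_id, distance = identify([0.8, 0.2, 0.0], gallery)
unknown_id, _ = identify([5.0, 5.0, 5.0], gallery)
print(match_id, unknown_id)
```

In such a scheme, enrolling a new animal only requires adding its embedding to the gallery, with no retraining of the network, which is one reason metric-based recognition suits herds whose membership changes.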

2 Detection and identification of the whole body and key areas

In smart animal husbandry, detecting and identifying the whole animal together with its key areas helps to mine animal information more deeply. For example, the positional relationship between the legs and the torso, or between the legs themselves, can reflect an animal's health status. Deep-learning-based information mining of this kind still needs further exploration. Qiao et al. proposed a deep learning model for individual recognition of beef cattle based on image sequences. Visual features are extracted from the image sequence by a CNN and then used to train an LSTM that captures spatiotemporal information and identifies individual cattle, achieving accuracies of 88% and 91% with sequences of 15 and 20 frames, respectively. He Dongjian et al. proposed an improved YOLOv3 model for individual dairy cow recognition. On a dataset of dairy cow back images, the improved model reached an individual recognition accuracy of 95.91% at an average frame rate of 32 frames/s, fast enough to identify individual cows quickly. Hu et al. used the YOLO model to detect and segment the cow area from the original image and divided each detected cow into three parts: head, trunk and legs. They extracted deep features from the three parts with three independently trained CNNs, designed a feature fusion model to obtain the final features, and identified each cow with a Support Vector Machine (SVM) classifier, achieving a recognition accuracy of 98.36%. Jiang et al. proposed the FLYOLOv3 (FilterLayer YOLOv3) deep learning framework to detect key areas of dairy cows (such as the torso, legs and head) in complex scenes. Validated on daytime and nighttime datasets, it achieved good detection results.
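The part-based pipeline described above (separate features for head, trunk and legs, fused and then classified) can be sketched in a few lines. This is only an illustration under stated assumptions: fusion is done by simple concatenation, the two-dimensional part features are made-up numbers rather than CNN outputs, and a nearest-centroid classifier stands in for the SVM so that the example needs nothing beyond the standard library. The function names are hypothetical.

```python
import math

def fuse(head, trunk, legs):
    """Fuse per-part feature vectors by concatenation (one simple option)."""
    return list(head) + list(trunk) + list(legs)

def centroids(samples):
    """Average the fused descriptors of each individual.

    samples: list of (cow_id, descriptor) pairs.
    """
    sums, counts = {}, {}
    for cow_id, vec in samples:
        if cow_id not in sums:
            sums[cow_id] = [0.0] * len(vec)
            counts[cow_id] = 0
        sums[cow_id] = [s + v for s, v in zip(sums[cow_id], vec)]
        counts[cow_id] += 1
    return {cid: [s / counts[cid] for s in total] for cid, total in sums.items()}

def classify(descriptor, class_centroids):
    """Nearest-centroid classifier, used here as a stand-in for an SVM."""
    return min(class_centroids,
               key=lambda cid: math.dist(descriptor, class_centroids[cid]))

# Toy features: (head, trunk, legs), each a 2-D vector.
training = [
    ("cow_A", fuse([1.0, 0.0], [0.2, 0.1], [0.0, 0.9])),
    ("cow_A", fuse([0.9, 0.1], [0.3, 0.1], [0.1, 0.8])),
    ("cow_B", fuse([0.0, 1.0], [0.8, 0.7], [0.9, 0.0])),
    ("cow_B", fuse([0.1, 0.9], [0.7, 0.8], [0.8, 0.1])),
]
model = centroids(training)
predicted = classify(fuse([0.95, 0.05], [0.25, 0.1], [0.05, 0.85]), model)
print(predicted)
```

Concatenation keeps the contribution of each body part in a fixed position of the descriptor, so a classifier can still rely on the trunk and leg features when, say, the head is partly occluded.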

3 UAV image target detection

Because livestock range over large areas on farms, unmanned aerial vehicles (UAVs) are often used to take aerial photographs and monitor livestock activity. Although the processing speed of UAV hardware has improved greatly, algorithm performance still affects how well UAV-based real-time detection works, and deep learning can meet this demand. Using UAV aerial video data, Andrew et al. first detected and tracked individual Holstein cows with a region-based convolutional neural network (R-CNN) and the Kernelized Correlation Filter (KCF), and then identified individual cows with an Inception V3-LSTM network structure, reaching a final recognition accuracy of 98.1%. Shao et al. and Barbedo et al. used CNN models to detect and count cattle in UAV images.
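Counting animals from detector output, as in the UAV studies above, typically involves discarding low-confidence boxes and merging duplicate boxes that cover the same animal. The sketch below shows one generic way to do this with a score threshold and greedy non-maximum suppression based on intersection over union (IoU). The boxes, scores and threshold values are invented for illustration, and the cited studies may well use different post-processing.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def count_animals(detections, score_threshold=0.5, iou_threshold=0.5):
    """Filter detections by score, then apply greedy non-maximum suppression.

    detections: list of (box, score) pairs.
    Returns the kept (box, score) pairs; their number is the animal count.
    Both thresholds are illustrative defaults.
    """
    candidates = sorted(
        (d for d in detections if d[1] >= score_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap strongly with a better one.
        if all(iou(box, kept_box) < iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept

detections = [
    ((10, 10, 50, 50), 0.95),
    ((12, 11, 52, 49), 0.80),      # near-duplicate of the first box
    ((100, 100, 140, 150), 0.90),
    ((200, 200, 220, 220), 0.30),  # below the score threshold
]
kept = count_animals(detections)
print(len(kept))
```

Both thresholds trade missed animals against double counting, so counts reported with different settings are not directly comparable.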

These studies show that applying deep learning to object detection and recognition in UAV images is feasible. Applying deep learning on other hardware platforms (robots, ground vehicles, etc.) to build intelligent monitoring systems is also one of the main trends for future development.

4 Summary

Although deep learning has made progress in animal detection and identification, some problems remain, such as the lack of benchmark datasets and evaluation criteria. Comparisons between existing methods are not entirely reliable because the studies use different datasets, preprocessing techniques, metrics and models. Specifically, object detection results depend directly on the samples: when a sample contains only a single individual, or the target is relatively prominent, the accuracy of object detection and individual recognition is high. The choice of target area and feature extraction method also directly affects the final results, as do the external environment (light intensity, occlusion, etc.), the shooting angle and the image quality. It is therefore still necessary to explore building sample sets from multiple perspectives for target detection and individual recognition. For example, in face-based individual recognition research, most datasets consist of images taken from the front of the face, whereas in practice the head appears at many angles. The actual requirement, when monitoring video on site or remotely, is to capture an image at random and then detect and identify the individual animal and its related information. More complex sample sets (multi-angle, day and night) are therefore needed to simulate real breeding scenes, so that intelligent algorithms applicable to different scenarios can be built and efficient, accurate and easy-to-operate detection and identification systems can be developed further.
