INTRODUCTION
Improving production efficiency, ensuring animal welfare, and reducing environmental impact require technologies for growth estimation, individual identification, and behavior monitoring [1–3]. As computer technology has advanced, livestock monitoring systems have evolved with it. These systems are broadly categorized into sensor-based contact methods and video-based non-contact methods [4–7]. Sensor-based contact methods collect behavioral data from a sensor attached to the ear or gather real-time information from a microchip implanted in the neck of an animal. However, these methods are prone to sensor failure, difficult to scale to large populations, and stressful for the animals [8]. In particular, radio frequency identification (RFID) tags, which are widely used because of their low cost, suffer from a short read range, an inability to read multiple tags simultaneously, and time-consuming operation, and the attachment process itself can stress the animals [9–12]. Video-based non-contact monitoring technologies, by contrast, require no physical contact. They therefore eliminate handling stress, allow an animal's condition to be checked remotely, and enable monitoring even at night.
Recently, the rapid advancement of computer technologies, including deep learning algorithms, has enabled the analysis of accumulated data to continuously monitor and analyze animal conditions without human intervention, resulting in efficient and automated monitoring systems. This review examines and summarizes research related to video processing and convolutional neural network (CNN)-based deep learning for animal face recognition [13–36], identification [8,28,37–44] and re-identification [45], and assesses its applicability to precision livestock farming (PLF) for improving animal welfare and production efficiency (Table 1 and Fig. 1).
Table 1. Studies on animal face recognition, identification, and re-identification using deep learning

| Research area | Reference | Target animal | Dataset | Pre-trained/transfer learning | Feature | Algorithm |
|---|---|---|---|---|---|---|
| Wildlife recognition | [26] | Wildlife | Wildlife Spotter | × | − | Lite AlexNet, VGG-16, ResNet50 |
| Wildlife recognition | [27] | Wildlife | Fishmarket, MS COCO 2017 | × | − | WildARe-YOLO |
| Wildlife face recognition | [29] | Chimpanzee | Self-created dataset | × | Annotation automation framework | SSD, CNN |
| Wildlife face recognition | [25] | Giant panda | Self-created dataset, ImageNet | ○ | Pre-trained AlexNet, GoogLeNet, ResNet-50, VGG-16 | NIPALS |
| Wildlife face recognition | [21] | Panda | Self-created dataset, COCO | ○ | Pre-trained Faster R-CNN, fine-tuned ResNet-50 | DNN |
| Wildlife face recognition | [39] | Golden snub-nosed monkey | Self-created dataset | × | − | Faster R-CNN |
| Livestock face recognition | [24] | Pig | Self-created dataset | × | Automatic selection of training and testing data | Haar cascade, deep CNN |
| Livestock face recognition | [31] | Sheep | Self-created dataset | × | − | YOLOv5s, RepVGG |
| Livestock face recognition | [28] | Aberdeen-Angus cow | Self-created dataset | ○ | Pre-trained VGGFACE, VGGFACE2 | − |
| Livestock face recognition | [34] | Cattle | Self-created dataset | × | Embedded system, automatic dataset processing | CNN |
| Livestock face recognition | [35] | Cattle | Self-created dataset | × | Channel pruning | YOLOv5 |
| Identification | [37] | Cattle | Self-created dataset | × | − | Inception-V3 CNN, LSTM |
| Identification | [42] | Cattle | ImageNet, COCO | ○ | Mobile devices | YOLOv5, ResNet18 landmark |
| Identification | [44] | Horse, etc. | THDD dataset | ○ | Hybrid | YOLOv7, SIFT, FLANN |
| Re-identification | [40] | Amur tiger | ATRW, ImageNet | ○ | Pre-trained SSD-MobileNet-v1, SSD-MobileNet-v2, DarkNet | YOLOv3 |
PLF has grown alongside advancements in sensing technology, big data, and deep learning. PLF applies these technologies to individual recognition and behavior monitoring, feed intake and weight measurement, barn temperature control, body temperature and estrus detection, activity levels, gait, body condition, and carcass traits [1–3]. The goal of PLF is to enhance farm management efficiency, conserve resources, improve animal welfare, and maximize productivity by implementing real-time data monitoring and automated management systems.
The monitoring methods used in PLF are categorized into sensor-based contact methods and video-based non-contact methods. Contact methods involve attaching devices such as collars, bands, ear tags, and RFID tags to animals to collect data. While these methods can gather accurate physiological data, they may cause stress to the animals and are challenging to manage and maintain at scale [8–12]. Non-contact methods collect data remotely, without direct contact, using tools such as CCTV, specialized cameras, drone cameras, and sound detection systems, and rely primarily on analyzing video and image data [4–7]. Although these methods may be less accurate than contact-based methods, they benefit animal welfare because they cause no handling stress. They also allow monitoring over a relatively wide area and are more economical in terms of equipment management and maintenance.
Non-contact monitoring is primarily performed through object detection [46–49], a technology that detects objects in images or videos and indicates the location of each object [50–52]. Even more detailed analysis becomes possible when object detection is combined with a CNN [13–21], because CNNs enable powerful feature extraction while preserving the spatial structure of large volumes of images. Furthermore, various architectures and high-performance algorithms have been developed. Recently, research has addressed not only object recognition [22–36] but also object identification [28,45].
Object recognition involves classifying the type of object detected in an image or video, thereby distinguishing a specific object from other objects. Object identification goes further by matching the recognized object against a database to determine which specific individual it is. As a real-world application of object recognition, inter-species recognition research aims to recognize faces across different animal species so that a single system can handle multiple species [34].
The field of human face recognition is already widely used in biometric authentication; the deep learning-based algorithm ArcFace, which converts the features of each face into embedding vectors, achieves an accuracy of 99.78% [24]. The field of animal recognition and identification, by contrast, has attracted significant research in recent years but has produced fewer results. Animal identification involves distinguishing and recognizing specific animals; it can be applied to monitoring individual animals' health, behavior, and reproductive status, and to the protection of endangered species [27]. Technologies for animal face identification, recognition, re-identification, and inter-species recognition can be used to monitor the health status, growth patterns, and behavior patterns of individual animals. For wild animals, these technologies play a crucial role in conservation and research by helping to estimate population sizes and monitor migration paths [41].
Understanding the health and behavior patterns of animals in the livestock sector is crucial for early disease detection, diagnosis, and animal welfare. As a result, animal recognition technologies are essential in PLF [22]. Furthermore, research on animal re-identification is also being conducted. This research aims to recognize previously identified animals for long-term monitoring of behavior patterns, survival rates, and migration paths [8,41–45,48,53]. To accurately identify individual animals, it is necessary to precisely detect their location within an image through object detection and accurately classify them. The more accurate the detection results are, the more accurate the recognition results will be.
Traditional object detection algorithms rely on hand-crafted features based on color, gradient, texture, and shape, combined with classifiers such as k-nearest neighbors (KNN), support vector machines (SVM), and Bayesian classifiers. These methods are suitable for detecting small, distinct objects, but they are less accurate and inefficient for real-world images that contain background noise. Object detection has improved significantly in accuracy thanks to advances in machine and deep learning, and it is now used in various fields, including PLF for non-invasive identification [46,47].
Generally, deep learning-based object detection algorithms can be divided into one- and two-stage methods. One-stage algorithms process the image only once within the network to directly extract features, classify them, and determine their location. Examples include You Only Look Once (YOLO) and Single Shot MultiBox Detector (SSD). On the other hand, two-stage algorithms, such as R-CNN, Fast R-CNN, and Faster R-CNN, first select region proposals within the image, and then classify and refine the boundaries of the objects in each region. These algorithms require large training and validation datasets to show accurate learning results.
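To make the distinction concrete, the sketch below runs a pre-trained two-stage detector (Faster R-CNN via torchvision) on a single image. The model choice, the input file name, and the 0.8 score threshold are illustrative assumptions, not details from the reviewed studies.

```python
# A minimal detection sketch (assumed setup: PyTorch + torchvision).
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Load a COCO pre-trained two-stage detector.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = Image.open("cow.jpg").convert("RGB")  # hypothetical input image
with torch.no_grad():
    prediction = model([to_tensor(image)])[0]

keep = prediction["scores"] > 0.8        # illustrative confidence threshold
boxes = prediction["boxes"][keep]        # [x1, y1, x2, y2] for each object
labels = prediction["labels"][keep]      # COCO class indices
```

A one-stage detector such as YOLO or SSD would produce the same kind of boxes and scores in a single forward pass, generally trading some accuracy for speed.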
Recording and observing animal behavior through video is common, but manually processing large amounts of data requires significant time and labor. Animals pose particular difficulties: individual characteristics differ across species, living environments are diverse, and animals do not cooperate during image acquisition, so the collected data are often insufficient for adequate training. In fields such as image recognition, video processing, and speech recognition, CNNs require a substantial amount of training data to produce an effective recognition system [13–21].
Animal recognition and identification datasets are designed to distinguish and identify animals at the species or individual level. These datasets include images or videos of animals, as well as metadata describing the characteristics of each animal. Recently, there has been increasing interest in long-term tracking to observe how individual animals change and behave over time and in different environments. This has led to the use of animal re-identification datasets. These datasets are used to re-identify specific animals across various times, locations, or other conditions [41]. However, animal re-identification datasets are not widely available, and the few well-summarized datasets often have small data sizes, limited annotations, and images captured in non-wild settings.
Fortunately, with the advancement of facial recognition technology, more and more open-source datasets are being made available for research, and animal datasets are becoming increasingly diverse. Labeled Faces in the Wild (LFW) provides a total of 13,233 annotated face images from 5,749 people in natural and complex environments [54]. ImageNet offers over 14 million images, including animal images with backgrounds, categorized into 27 major categories and over 20,000 subcategories [55]. PASCAL visual object classes (VOC) includes approximately 11,530 images containing 27,450 objects, with bounding boxes and pixel-level masks encoded by class [56]. Datasets that include various animals are Animal Web [57], which contains over 21,000 species-specific face images, Animals with Attributes [58], which includes 37,322 images from 50 species in versions 1 and 2, Animal Faces-HQ [59], which contains a total of 15,000 high-resolution animal face images from three categories (dogs, cats, and wild animals), and ZooAnimal Faces (https://www.kaggle.com/datasets/jirkadaberger/zoo-animals), which includes face images of zoo animals.
Wild animal image datasets captured in various environments are mainly collected through automatic camera traps and include metadata such as species, location, date, and time. Notable datasets include Smithsonian Wild provided by the Smithsonian Conservation Biology Institute, AfriCam (https://emammal.si.edu/), Caltech Camera Traps (https://beerys.github.io/CaltechCameraTraps/) provided by the California Institute of Technology, and Wild Animal Face, which is extensively used in computer vision and machine learning research for training and evaluating animal face recognition models. Datasets collected for specific wild animal research include Amur Tiger Re-identification in the Wild, which contains images of wild Amur tigers [45], the Grévy’s zebra dataset (https://datasets.wri.org/dataset/grevy-s-zebra-population-in-kenya-1977-78) containing images of Grevy’s zebras in Kenya, Chimpanzee Faces in the Wild (ChimpFace), which stores images of wild chimpanzee faces, and the African elephant dataset, which includes images of various ear shapes and facial features of African elephants. The Animal Movement and Location dataset collects movement patterns and location information of wild animals and is used in re-identification research.
With the increasing importance of PLF, livestock image datasets are also being actively collected. Notable datasets include CattleCV (https://www.kaggle.com/datasets/trainingdatapro/cows-detection-dataset), which contains thousands of cattle images and health data, Afimilk Cow, and Dairy Cattle Behavior. Pig image datasets include PigPeNet, which contains over 10,000 pig face images, and RiseNet, which includes 7,647 pig face images collected from 57 videos [34]. Other livestock image datasets include ThoDTEL 2015, which contains 1,410 images from 50 horses, Sheep Face, which contains hundreds of sheep face images, Goat-21, which contains approximately 2,100 goat face images, and Poultry-10K, which contains about 10,000 chicken images (https://livestockdata.org/datasets).
There are various ways to improve the performance of machine or deep learning models. Images collected in different environments often contain noise: subjects obscured by obstacles, or frames darkened or blurred by lighting. Data pre-processing is therefore necessary to improve data quality before model training and analysis, enhancing the model's efficiency and accuracy. Image pre-processing includes resizing images for consistent input, improving image quality, and restoring images to make analysis easier, using techniques such as histogram equalization, grayscale conversion, image smoothing, noise removal, and image restoration. Additionally, to increase the generalization performance of the model and prevent overfitting to the same data, data augmentation is performed to artificially increase the diversity of the dataset and extend the limited data. Image augmentation techniques include mirror imaging, rotation, scale transformation, translation, left-right flipping, zooming in/out, color dithering, noise addition, and distortion [60–62].
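As an illustration of the augmentation techniques listed above, the following sketch composes a few of them with torchvision; the specific transforms and parameter values are assumptions chosen for the example, not a pipeline from any cited study.

```python
# Illustrative augmentation pipeline (assumed setup: torchvision).
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),  # resize + zoom in/out
    transforms.RandomHorizontalFlip(p=0.5),               # mirror imaging / flipping
    transforms.RandomRotation(degrees=15),                # rotation
    transforms.ColorJitter(brightness=0.2, contrast=0.2,
                           saturation=0.2),               # color dithering
    transforms.ToTensor(),
])
# Applying `augment` to the same image repeatedly yields different variants,
# artificially extending a limited dataset.
```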
Training recognition models using deep learning requires a vast amount of training data. Even when utilizing open datasets or performing image augmentation, it is often challenging to secure a sufficient amount of labeled image data for specific animals. In such situations, pre-training and transfer learning are used to improve model performance and enable efficient training [11,46,63]. Pre-training involves using large-scale datasets like ImageNet to pre-train the model to learn general features and set stable initial weights. This accelerates training and enhances model performance. The process of adjusting the weights of a pre-trained model to fit a new task is called fine-tuning [13,34], and it is used to achieve optimal performance. Recent studies actively explore enhancing network performance through both pre-training and fine-tuning.
Transfer learning is a technique that utilizes a pre-trained model for a new task by using the lower layers of the pre-trained model as feature extractors. By retraining a model learned from a previous task, transfer learning allows rapid learning on new datasets and improves model performance even in data-scarce situations. Even when data is sufficient, using the weights of an existing model as initial values through transfer learning can reduce the training time and allow training to proceed efficiently, thereby improving performance.
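A minimal sketch of this idea in PyTorch is shown below: an ImageNet pre-trained ResNet-50 serves as a frozen feature extractor, and only a new classification head is fine-tuned. The number of identities and the learning rate are placeholder assumptions.

```python
# Transfer-learning sketch (assumed setup: PyTorch + torchvision).
import torch
import torch.nn as nn
import torchvision

num_identities = 50  # hypothetical number of individual animals

model = torchvision.models.resnet50(weights="IMAGENET1K_V2")  # pre-trained weights
for param in model.parameters():
    param.requires_grad = False                  # freeze the pre-trained lower layers

model.fc = nn.Linear(model.fc.in_features, num_identities)   # new task-specific head

# Fine-tuning: only the new head is updated; unfreezing more layers later
# allows deeper adaptation once training is stable.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```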
To recognize animal faces and identify the species or individuals from given images or video frames, it is necessary to extract animal face features using deep learning models like CNNs and train classifiers based on these features. CNNs introduce convolutional layers within the network to learn feature maps that represent the spatial similarities of patterns found in images. This makes them effective deep learning models for processing and analyzing visual data like images or videos [16,17].
CNNs consist of convolutional layers, which extract local features from the input image, pooling layers, which reduce the spatial size to decrease computation and emphasize important features, and fully connected layers, which perform classification tasks at the end of the network. The training process uses a backpropagation algorithm to calculate the gradient of the loss function and to update the network weights, and employs optimization techniques such as gradient descent to minimize errors.
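The structure described above can be summarized in a few lines of PyTorch; the layer widths and the 224 × 224 input size are illustrative assumptions.

```python
# A small CNN of the kind described above (assumed setup: PyTorch).
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # convolutional layer: local features
            nn.ReLU(),
            nn.MaxPool2d(2),                             # pooling layer: spatial reduction
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 56 * 56, num_classes)  # fully connected layer

    def forward(self, x):                     # x: (batch, 3, 224, 224)
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))
```

Training such a network with a cross-entropy loss and a gradient-descent optimizer, with backpropagation computing the gradients, follows the standard deep learning loop.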
Standard CNN frameworks include AlexNet, VGG16, GoogLeNet/InceptionNet, ResNet, and CapsNet [19]. With the advancement of deep learning technologies such as CNNs, research on recognizing, identifying, or re-identifying animal faces using these technologies has been actively progressing. Animal face recognition is the process of determining whether a detected animal face belongs to a specific animal or species. Distinct from this, animal face detection involves locating the face of an animal in an image or video, identifying the position of the face, and marking the area with a box.
Animal face identification is the process of confirming whether a recognized animal face belongs to a specific individual within the same species. Re-identification refers to repeatedly identifying the same animal over time and across different locations. Re-identification techniques rely on algorithms that measure the similarity between feature vectors and compare them against an existing database to determine whether two observations belong to the same individual. These techniques are necessary for tracking individuals and analyzing behaviors.
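The database-matching step common to identification and re-identification can be sketched as a nearest-neighbor search over embedding vectors; the 512-dimensional embedding size and the random placeholder data below are purely illustrative.

```python
# Cosine-similarity matching sketch (assumed setup: NumPy).
import numpy as np

def cosine_scores(query: np.ndarray, gallery: np.ndarray) -> np.ndarray:
    """Similarity between one query embedding and all gallery embeddings."""
    query = query / np.linalg.norm(query)
    gallery = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return gallery @ query

gallery = np.random.rand(100, 512)  # stored embeddings of known individuals
query = np.random.rand(512)         # embedding of a newly detected face

scores = cosine_scores(query, gallery)
best_match = int(np.argmax(scores))  # most similar known individual
# A threshold on scores[best_match] decides between "same individual"
# and "previously unseen individual".
```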
Experiments in 2018 were conducted to classify animal and non-animal images using the Wildlife Spotter dataset, and to recognize and identify birds, rats, bandicoots, rabbits, wallabies, and other mammals using three CNN architectures: Lite AlexNet, VGG-16, and ResNet50 [28]. The results showed that ResNet50 achieved the highest accuracy and performance. However, while fine-tuning slightly improved the performance of VGG-16, it decreased the performance of ResNet50 due to overfitting.
In a study published in 2024 [29], a lightweight wildlife-recognition model, WildARe-YOLO, was proposed and evaluated on the Wild Animal Facing Extinction, Fishmarket, and Microsoft Common Objects in Context (MS COCO) 2017 datasets. Compared with recent deep learning models, the proposed technique increased frames per second (FPS) by 17.65%, reduced model parameters by 28.55%, and decreased floating-point operations (FLOPs) by 50.92%. In a paper published in 2019, a deep learning-based automated pipeline was developed to efficiently annotate datasets by providing a toolset and an automated framework. This pipeline identifies and tracks individuals, and provides gender and identity recognition, from a video archive collected over 14 years from 23 chimpanzees [31].
Annotation was performed with the web-based VGG Image Annotator (VIA) by drawing tight bounding boxes around each chimpanzee's head. The proposed model achieved 84% accuracy in 60 ms on a Titan X GPU and in 30 seconds on a standard CPU, surpassing expert annotators in both speed and accuracy. Using 50 hours of frontal, side, and extreme-side videos, an SSD model was employed to detect faces, and a deep CNN model was trained for face and gender recognition. The recognition model trained with the generated annotations achieved 92.47% identity recognition accuracy and 96.16% gender recognition accuracy; using only frontal faces, it achieved 95.07% and 97.36%, respectively.
Matkowski et al. [27] obtained 163 images from 28 Chengdu giant pandas and manually extracted frontal-face images. They proposed a two-stage algorithm to recognize panda faces using a classifier based on the NIPALS algorithm, which was also used to calculate comparison scores between panda images. Compared with networks pre-trained on ImageNet (AlexNet, GoogLeNet, ResNet-50, and VGG-16), the proposed method achieved 6.43% and 8.59% higher accuracy than the second-best performer, ResNet-50.
Another study built a dataset of 6,441 images from 218 pandas, with manual annotations of panda faces, ears, eyes, noses, and mouths [23]. A Faster R-CNN detection network pre-trained on the COCO dataset was applied for face detection, and normalized face images were fed to a deep neural network (DNN), yielding a fully automated deep learning algorithm for panda face recognition. A fine-tuned ResNet-50 was then used to verify panda IDs, achieving 96.27% accuracy in panda recognition and 100% accuracy in detection.
In 2020, a deep network model called Tri-AI was developed; it was reported that the model could quickly detect, identify, and track individuals using Faster R-CNN from videos or still images in a dataset containing 102,399 images of 1,040 known individuals [40]. In frame-by-frame detection and identification of 22 golden snub-nosed monkeys on a test set of 10 videos, the model demonstrated a face detection accuracy of 98.70%, an individual identification accuracy of 92.01%, and an accuracy of 87.03% for identifying new individuals.
Wildlife recognition technologies play a crucial role in achieving various ecological and conservation goals, such as protecting endangered species, tracking population numbers, and monitoring behavior. Deep learning models like ResNet, Faster R-CNN, and YOLO are widely utilized for wildlife detection and identification, with their performance heavily influenced by the quality and quantity of datasets. Additionally, significant efforts are being made to develop lightweight models and high-performance algorithms that reduce computational costs while maintaining high accuracy.
For pig face recognition, an adaptive approach was proposed that automatically selects high-quality training and test data before applying a deep CNN, together with an augmentation approach to improve accuracy [26]. The method measures the structural similarity index (SSIM) between pig face images to remove near-identical frames and uses a two-stage Haar cascade classifier to automatically detect pig faces and eyes. After selecting high-quality training and test images, the deep CNN recognizes the pig faces.
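A hedged sketch of such SSIM-based frame filtering is given below, using OpenCV and scikit-image; the 0.95 threshold is an assumed value for illustration, not the one used in the study.

```python
# Near-duplicate frame removal via SSIM (assumed setup: OpenCV + scikit-image).
import cv2
from skimage.metrics import structural_similarity as ssim

def filter_frames(frames, threshold=0.95):
    """Keep a frame only if it differs enough from the last kept frame."""
    kept = [frames[0]]
    for frame in frames[1:]:
        prev = cv2.cvtColor(kept[-1], cv2.COLOR_BGR2GRAY)
        curr = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if ssim(prev, curr) < threshold:  # low similarity -> genuinely new frame
            kept.append(frame)
    return kept
```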
Meanwhile, a technique was proposed to improve the accuracy and robustness of sheep face recognition [32]. Faces detected by YOLOv5's object detection algorithm in images taken at various distances and angles are cropped, important features are extracted with the Shuffle Attention (SA) [63] spatial-channel attention mechanism and the Reparameterizable VGG (RepVGG) algorithm, and features of the same scale are fused. The SA block enhances the network's feature extraction ability, while the RepVGG block improves recognition efficiency through lossless compression. The proposed model achieved 95.95% accuracy on a side-face dataset, 97.64% on a frontal-face dataset, and 99.43% on a full-face dataset.

A study of cow face recognition applied transfer learning, additional data augmentation, and fine-tuning to an RGB dataset containing 315 face images of 91 Aberdeen-Angus cows. The pre-trained neural networks VGGFACE and VGGFACE2 were compared, with VGGFACE2 achieving the better accuracy of 97.1% [30].
In a 2022 study, Li [35] constructed a dataset of 10,239 cow face images collected under various angles and lighting conditions from 103 cows on a farm, and proposed a lightweight neural network of six convolutional layers for cow face recognition. The network uses global average pooling instead of fully connected layers on top of the convolutional layers, reducing the parameter count to 0.17 M, the model size to 2.01 MB, and the computation to 9.17 million floating-point operations (MFLOPs). The model achieved a recognition accuracy of 98.7%, and Gradient-weighted Class Activation Mapping (Grad-CAM) was used to visualize and confirm which features were being extracted. The small size of the model also allows it to run on embedded systems or portable devices, enabling real-time cow identification [35].
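The parameter savings from replacing fully connected layers with global average pooling can be seen in a small comparison; the channel count, feature-map size, and class count below are assumptions for illustration, not the dimensions of Li's network.

```python
# Why global average pooling shrinks the classifier head (assumed setup: PyTorch).
import torch.nn as nn

# Conventional head: flatten a 256 x 7 x 7 feature map into a linear layer.
head_with_fc = nn.Sequential(
    nn.Flatten(),
    nn.Linear(256 * 7 * 7, 103),   # ~1.29 M parameters for 103 classes
)

# GAP head: average each channel to a single value first.
head_with_gap = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),       # 256 x 7 x 7 -> 256 x 1 x 1
    nn.Flatten(),
    nn.Linear(256, 103),           # ~26 K parameters
)
```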
In a 2024 study, Weng proposed a method for automatically detecting cow faces using a YOLOv5 network-based approach. The dataset consisted of images taken at various angles of 80 cows (Simmental beef cattle and Holstein dairy cows) at a farm in Hohhot, Inner Mongolia, using five smartphones. The study applied channel pruning and model quantization to reduce the model size, the number of parameters, and FLOPs by 86.10%, 88.19%, and 63.25%, respectively, compared to the original YOLOv5 model. This enabled real-time cow face detection on mobile devices [36].
An identification method was proposed that uses the Inception-V3 CNN to extract image features from each frame and trains a long short-term memory (LSTM) network to capture temporal information and identify individual animals [39]. Combining the strengths of the Inception-V3 and LSTM networks, the cattle recognition method achieved 88% accuracy on 15-frame video clips and 91% on 20-frame clips. These results were superior to frameworks using only CNNs and demonstrated the method's ability to extract and learn additional identification-related information from video data.
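The CNN-plus-LSTM idea can be outlined as follows: per-frame feature vectors from a CNN backbone form a sequence that an LSTM classifies. The 2,048-dimensional feature size matches Inception-V3's standard pooled output; the hidden size is an assumption for the sketch.

```python
# CNN + LSTM identification sketch (assumed setup: PyTorch).
import torch
import torch.nn as nn

class CnnLstmIdentifier(nn.Module):
    def __init__(self, num_ids: int, feat_dim: int = 2048, hidden: int = 256):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_ids)

    def forward(self, frame_features):       # (batch, num_frames, feat_dim)
        _, (h, _) = self.lstm(frame_features) # h: final hidden state per clip
        return self.fc(h[-1])                 # identity logits per video clip
```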
Li et al. [41] conducted a re-identification study on the Amur Tiger Re-identification in the Wild (ATRW) dataset, built from 92 Amur tigers, a critically endangered species with fewer than 600 individuals remaining. The dataset includes 8,076 high-resolution video clips capturing tigers in various poses and lighting conditions, annotated with bounding boxes, pose keypoints, and tiger identities. The study used deep models to perform re-identification of Amur tigers. It also benchmarked object detectors: SSD-MobileNet-v1 [52] and SSD-MobileNet-v2 [26] with ImageNet pre-trained backbones, TinyDSOD [17] trained from scratch on the training set, and YOLOv3 [24] with the ImageNet pre-trained DarkNet backbone, demonstrating that these models can be used for the protection and management of individual animals.
Dac et al. [43] proposed a face recognition pipeline for Holstein-Friesian dairy cows recorded in fixed-frame RGB videos at a robotic dairy farm at Dookie College, University of Melbourne, Victoria, Australia. In the pipeline, a MobileNetV2 model pre-trained and fine-tuned on widely known public datasets such as ImageNet and COCO produces face representations that are registered in a database. For input cow images, a YOLOv5 model detects the face and extracts the facial region, and landmark features such as the eyes and nose are extracted with a ResNet18-based landmark prediction model. Finally, faces are encoded as embedding features by a ResNet101-based model, and face matching compares similarity scores between the encoded result and the embeddings of other cow faces in the database. The method was tested on an NVIDIA Jetson Nano device for real-time operation, achieving 84% accuracy for 89 cows captured more than twice [43].
Qiao et al. [51] proposed a deep learning framework for cow identification using a dataset of 363 videos collected from 50 cows. Spatial features were extracted using CNNs, while spatiotemporal information across sequential frames was learned using a bidirectional long short-term memory (BiLSTM) network. The proposed model achieved 93.3% accuracy and 91.0% recall, outperforming existing methods such as Inception-V3, MLP, SimpleRNN, LSTM, and BiLSTM.
Ahmad et al. [8] introduced a method for automatically identifying animals by detecting their faces and muzzles with the YOLOv7 model and then extracting muzzle pattern features with the Scale-Invariant Feature Transform (SIFT) algorithm. The extracted features are matched against a database using the Fast Library for Approximate Nearest Neighbors (FLANN) algorithm. The method achieved over 99.5% accuracy in cow identification and demonstrated a lightweight structure and real-time performance, making it suitable for embedded systems and mobile devices. Deep learning often relies on high-performance computing hardware, limiting its application on mobile devices; however, as small mobile devices have become widespread, recent studies [8,35,36] have focused on improving detection accuracy and speed while reducing computational cost, or on quickly and accurately detecting obstacles in outdoor environments [36,64–67]. Similar research has also begun in the field of livestock face recognition.
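As an illustration of the SIFT-plus-FLANN matching stage in such an approach, the sketch below uses standard OpenCV calls. It is a hedged illustration of the technique, not the authors' implementation; the file names and the 0.7 ratio threshold are assumptions.

```python
# Muzzle-pattern matching with SIFT + FLANN (assumed setup: OpenCV).
import cv2

sift = cv2.SIFT_create()

query = cv2.imread("muzzle_query.png", cv2.IMREAD_GRAYSCALE)      # hypothetical crop
template = cv2.imread("muzzle_gallery.png", cv2.IMREAD_GRAYSCALE) # stored template

_, desc_query = sift.detectAndCompute(query, None)
_, desc_template = sift.detectAndCompute(template, None)

# FLANN with a KD-tree index (algorithm 1), then Lowe's ratio test.
flann = cv2.FlannBasedMatcher({"algorithm": 1, "trees": 5}, {"checks": 50})
matches = flann.knnMatch(desc_query, desc_template, k=2)
good = [m for m, n in matches if m.distance < 0.7 * n.distance]

print(f"{len(good)} good matches")  # more matches -> more likely the same animal
```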
CONCLUSION
This review examines contactless techniques for animal face recognition, identification, and re-identification. In the data collection phase, animal face images are captured under various angles and lighting conditions, and data preprocessing normalizes the images to enhance the efficiency and accuracy of model training. Data augmentation and transfer learning (e.g., using pre-trained models like VGG and ResNet) are employed to address data scarcity, followed by fine-tuning to adapt the models to specific animal datasets. The integration of video processing and CNN-based deep learning presents a highly promising approach for PLF. These technologies enhance production efficiency, improve animal welfare, and reduce environmental impact. They provide accurate and efficient tools for growth estimation, individual identification, and behavior monitoring, driving innovation in livestock management.