INTRODUCTION
The management of pig reproduction is directly related to the success or failure of pig farms [1–3]. Methods for diagnosing pregnancy in sows therefore have a significant impact on reproductive management and are essential in pig farming [4–6]. Accurate diagnosis can increase reproductive output by shortening the non-pregnant period of sows and increasing the number of births. Pregnancy in sows can be confirmed by observation for return of estrus, vaginal biopsy, serum analysis, hormone measurement, and ultrasound detection [7–9]. However, if the sow shows no clear signs of pregnancy, a manager who is inexperienced or lacks time and labor may not notice the pregnancy until the due date. In such cases, the pregnant sow cannot receive proper care, and miscarriage can occur in stressful situations [10]. These issues increase feed, management, and labor costs, which has a major adverse effect on profitability. Pregnancy diagnosis therefore has a large effect on reproduction and on the success or failure of pig farms. As the necessity of diagnosing pregnancy in sows has been emphasized, many institutions and organizations have conducted research, and a variety of methods are used for this purpose [11]. Cameron [12] gave a detailed description of the reproductive tract of the sow as felt by rectal examination. Lin et al. [13] showed the expression of αV and β3 integrin subunits in the endometrium during implantation in pigs. Zhou et al. [14] hypothesized that circulating exosome-derived miRNAs might be used to differentiate pregnancy status as early as several days after insemination in pigs, and successfully identified circulating exosomal miRNA profiles in the serum of pigs in early pregnancy.
Kauffold and Althouse [15] reviewed the current status of B-mode ultrasonography in pig reproduction and how this technology can be of value in pig production medicine. Kauffold et al. [16] also provided an overview of the principles and clinical uses of real-time ultrasonography (RTU) for addressing swine reproductive performance. Kousenidis et al. [17] studied the ultrasonic typification of sows to develop a methodology for pregnancy diagnosis and suggested that detailed real-time ultrasonic scanning can help predict litter size and support the precise management of pregnant sows.
In this study, we developed a computer-aided diagnosis (CADx) method to diagnose pregnancy in sows using ultrasound images, which has advantages over the methods mentioned above in terms of simplicity, low cost, and high accuracy. CADx is expected to provide additional information to pig farmers by showing the diagnostic result of the artificial intelligence, assisting the farmer in making a diagnostic decision about the image. We compared the accuracy of three computerized classification approaches under two types of noise: Gaussian and speckle. Of the three approaches selected, the Inception model is one of the most widely used convolutional neural network (CNN) models, Xception is based on Inception with depthwise separable convolution, and EfficientNet is a model that achieved state-of-the-art (SOTA) performance on image classification tasks with far fewer parameters. We added Gaussian and speckle noise because ultrasound images are usually corrupted by them. Although the issues that can be explored in one study are only a small fraction of those involved in the entire CADx process of sow pregnancy diagnosis, this study is expected to provide useful information for the design of a robust CADx system that uses ultrasound images.
MATERIALS AND METHODS
Ultrasound images of pregnant and non-pregnant sows were collected by experts and used as the dataset for training and performance evaluation of pregnancy diagnosis using deep learning algorithms. In consideration of use in various environments in pig farms, ultrasound images containing noise were generated and were used together with the other images in the performance evaluation. To find the optimal method for diagnosing sow pregnancy, we compared the performance of several classification algorithms.
A dataset was collected from the files of sows that had undergone ultrasound imaging in the hog barn of the National Institute of Animal Science (NIAS) located in Cheonan, with the approval of the Institutional Animal Care and Use Committee (IACUC) of the Rural Development Administration (approval No. NIAS-2021-538). All ultrasound images were acquired by trained experts using a MyLab™OmegaVET (Esaote) ultrasonic device and a convex array ultrasound transducer AC2541 (Esaote) with a 1.0–8.0 MHz frequency range. We acquired 5,292 ultrasound images of pregnant sows and 5,367 ultrasound images of non-pregnant sows from 44 sows. Among them, 29 sows were at least 23 days pregnant and 15 sows were not pregnant. The images of pregnant sows were confirmed by the experts. The ultrasound images were collected in GEN-M format in the 4.0–6.0 MHz frequency range with general resolution and middle penetration. The collected images were extracted at 860 × 808 resolution in Bitmap Image format, which is lossless and uncompressed, to minimize feature loss.
The 5,292 ultrasound images of pregnant sows were divided into 4,241 images (88 with invisible embryonic sacs) for training and 1,051 images (14 with invisible embryonic sacs) for performance evaluation. It is difficult even for experts to accurately identify pregnancy in images with invisible embryonic sacs. Of the 5,367 ultrasound images of non-pregnant sows, 4,231 images were used for training and 1,136 images were used for performance evaluation. Overall, the training set consisted of 4,241 images of pregnant and 4,231 images of non-pregnant sows, and the test set (Dataset-A) consisted of 1,051 images of pregnant and 1,136 images of non-pregnant sows. The part of the test set (Dataset-A) in which the embryonic sac was not visible was used to form a second test set (Dataset-B). The specifications of the images are shown in Fig. 1.
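The split described above can be tallied in a few lines. The sketch below only restates the counts given in the text and checks that they add up; the dictionary layout and variable names are illustrative and not part of the study.

```python
# Image counts as stated in the text (names are illustrative).
SPLIT = {
    "pregnant": {"train": 4241, "test": 1051, "total": 5292},
    "non_pregnant": {"train": 4231, "test": 1136, "total": 5367},
}
# Pregnant-sow images in which the embryonic sac is not visible.
INVISIBLE_SAC = {"train": 88, "test": 14}

for name, s in SPLIT.items():
    # Training and test counts must add up to the class total.
    assert s["train"] + s["test"] == s["total"], name
    print(f"{name}: test share = {s['test'] / s['total']:.3f}")

train_total = sum(s["train"] for s in SPLIT.values())
test_total = sum(s["test"] for s in SPLIT.values())
# Fraction of pregnant training images with an invisible embryonic sac.
invisible_train_share = INVISIBLE_SAC["train"] / SPLIT["pregnant"]["train"]

print(train_total, test_total, f"{invisible_train_share:.3f}")
```

The last figure makes explicit how small the share of invisible-sac images is within the pregnant training images.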
Noise is an unwanted phenomenon that is ubiquitous in digital ultrasound images. It can appear in different forms and distributions, such as speckle and Gaussian. Diagnosis of pregnancy in sows using an ultrasound device can be performed in various situations depending on the surrounding environment [18]. Speckle noise is multiplicative and independent. It is the result of interference between returning light from rough surfaces and the aperture, creating a granular pattern in the camera sensor. This type of noise affects both the resolution and contrast of ultrasound images. Gaussian noise is another type of noise; it is additive and independent. It can be the product of sources such as amplifiers, shot noise, and film grain noise, among others [19]. The configuration of ultrasonic devices and probes used in all pig farms is the same as that of this study. In addition, the frequency used to diagnose pregnancy depends on the physical characteristics of the sow, and the ultrasound image can contain Gaussian noise and speckle noise depending on the surrounding environment. Therefore, we added these two types of noise to the ultrasound images to make them similar to images acquired in typical farm situations [20,21]. Speckle noise with variance 0.7 and Gaussian noise with zero mean and variance 0.02 were added to the 1,051 ultrasound images of pregnant sows and the 1,136 ultrasound images of non-pregnant sows used for testing; a second noisy set was generated in the same way with speckle noise of variance 0.4 and Gaussian noise of variance 0.01. Each noisy test set contains the same number of images as the original test set, and noisy images were not used in the training stage. The noisy test images are shown in Fig. 2. The noisy images were used together with the original images for performance evaluation to verify that the deep learning-based classification algorithm is robust in various environments.
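The noise generation can be sketched as follows. This is a minimal illustration, not the implementation used in the study. It assumes pixel intensities scaled to [0, 1], models speckle as pixel + pixel × n with n drawn from a zero-mean Gaussian of the stated variance (one common convention), applies speckle before Gaussian noise, and clips the result. The function names, the order of application, and the toy image are all assumptions.

```python
import random

def add_speckle(img, var, rng):
    """Multiplicative noise: pixel + pixel * n, with n ~ N(0, var)."""
    sd = var ** 0.5
    return [[p + p * rng.gauss(0.0, sd) for p in row] for row in img]

def add_gaussian(img, var, rng, mean=0.0):
    """Additive noise: pixel + n, with n ~ N(mean, var)."""
    sd = var ** 0.5
    return [[p + rng.gauss(mean, sd) for p in row] for row in img]

def clip01(img):
    """Clamp intensities back into the assumed [0, 1] range."""
    return [[min(1.0, max(0.0, p)) for p in row] for row in img]

def make_noisy(img, speckle_var, gaussian_var, seed=0):
    """Apply speckle, then Gaussian noise (order is an assumption), then clip."""
    rng = random.Random(seed)
    noisy = add_speckle(img, speckle_var, rng)
    noisy = add_gaussian(noisy, gaussian_var, rng)
    return clip01(noisy)

# The two noise settings described in the text.
WEAKER_NOISE = {"speckle_var": 0.4, "gaussian_var": 0.01}
STRONGER_NOISE = {"speckle_var": 0.7, "gaussian_var": 0.02}

# Tiny stand-in for a grayscale ultrasound frame (hypothetical values).
image = [[0.0, 0.25, 0.5], [0.5, 0.75, 1.0]]
noisy_weak = make_noisy(image, **WEAKER_NOISE)
noisy_strong = make_noisy(image, **STRONGER_NOISE)
```

Because speckle is multiplicative, a black pixel is unchanged by `add_speckle`, whereas `add_gaussian` perturbs every pixel regardless of its intensity.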
To develop a method for diagnosing pregnancy in sows that can be used in real time in various environments with high processing speed and low computational cost, we decided to use a deep learning-based classification algorithm [22]. Such an algorithm has high accuracy owing to its neural network structure and a high processing speed because no position calculation is required, so it is considered ideal for diagnosing pregnancy in real time. To select an optimal classification algorithm for pregnancy detection in sow ultrasound images, several deep learning-based classification algorithms known for high performance were used. Inception-v4, Xception, and EfficientNetV2 were each trained on the ultrasound images to generate trained weights. Performance on the original and noisy ultrasound images was then evaluated and compared to select the optimal algorithm.
The Inception model has been one of the most widely used CNN models since the release of TensorFlow [23]. The core of the Inception model is a Conv layer called the inception module. Conventional Conv layers usually operate on data composed of width, height, and channels. Width and height decrease through max-pooling as the network gets deeper, while the number of channels increases. The Inception model uses a 1 × 1 Conv, that is, a filter of size 1 × 1, to decrease the number of channels. Through this, a fully connected computation across channels, called network-in-network, is performed, and a compression effect of reducing the dimension can be achieved. The 1 × 1 Conv structure of Inception was therefore able to increase accuracy and reduce the amount of computation. Inception-v2 modifies the existing inception module. To reduce the amount of computation, module A applies factorization by replacing a 5 × 5 Conv with two 3 × 3 Conv layers, and module B applies asymmetric factorization. To reduce the grid size of the feature map, module C was created by combining a pooling-to-Conv structure and a Conv-to-pooling structure in parallel, and these modules replaced the existing inception module. Inception-v3 has the same structure as Inception-v2, with various techniques such as RMSProp, label smoothing, factorized 7 × 7 convolution, and BN-auxiliary applied to increase performance. Inception-v4, which is used in this study, differs from Inception-v3 in that the modules that change the grid size are separated out. Along with inception modules A, B, and C, reduction modules A and B, which reduce the size of the grid, have been added to improve accuracy. The structure of Inception-v4 is shown in Fig. 3.
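The savings from factorization and from 1 × 1 channel reduction can be made concrete by counting convolution weights. The sketch below is only an illustration: the channel counts (64, 256, 32) are hypothetical and are not taken from the Inception-v4 architecture, and biases are ignored.

```python
def conv_params(k_h, k_w, c_in, c_out):
    """Number of weights in a k_h x k_w convolution (bias ignored)."""
    return k_h * k_w * c_in * c_out

C = 64  # hypothetical channel count, same for input and output

# Factorization: one 5 x 5 Conv versus two stacked 3 x 3 Convs.
five_by_five = conv_params(5, 5, C, C)
two_three_by_three = 2 * conv_params(3, 3, C, C)

# Asymmetric factorization: one 7 x 7 Conv versus a 1 x 7 followed by a 7 x 1.
seven_by_seven = conv_params(7, 7, C, C)
asymmetric = conv_params(1, 7, C, C) + conv_params(7, 1, C, C)

# 1 x 1 channel reduction before an expensive filter (hypothetical 256 -> 32 -> 64).
direct = conv_params(5, 5, 256, 64)
reduced = conv_params(1, 1, 256, 32) + conv_params(5, 5, 32, 64)

print(five_by_five, two_three_by_three)
print(seven_by_seven, asymmetric)
print(direct, reduced)
```

With equal input and output channels, two 3 × 3 layers need 18/25 of the weights of a single 5 × 5 layer while covering the same 5 × 5 receptive field.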
Xception is based on Inception, but it applies a modified form of depthwise separable convolution [24]. Xception goes further than the existing inception module and aims to completely separate cross-channel correlations from spatial correlations. As shown in Fig. 4, the correlation between channels is first mapped through a 1 × 1 Conv, as in the existing inception module, and the spatial correlation is then mapped separately for each output channel. Through this, Xception achieved higher classification accuracy than Inception-v3, which has a similar scale, and it is used as a pretrained model for various encoders because of its simple concept and structure and its high performance.
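The separation of channel mixing from spatial filtering can be illustrated with a toy forward pass in the order described above: a 1 × 1 (pointwise) step first, followed by a per-channel spatial step. This is a simplified sketch with hypothetical weights, without padding, stride, bias, or activation, and it is not the Xception implementation itself.

```python
def pointwise(x, w):
    """1 x 1 convolution: mixes channels at each pixel.
    x: [C_in][H][W], w: [C_out][C_in]."""
    h, wd = len(x[0]), len(x[0][0])
    return [[[sum(w_o[c] * x[c][i][j] for c in range(len(x)))
              for j in range(wd)] for i in range(h)] for w_o in w]

def depthwise(x, kernels):
    """Per-channel k x k filter (valid, stride 1, no kernel flip, as is usual
    in deep learning libraries); channels are never mixed.
    x: [C][H][W], kernels: [C][k][k]."""
    out = []
    for ch, ker in zip(x, kernels):
        k = len(ker)
        h, wd = len(ch) - k + 1, len(ch[0]) - k + 1
        out.append([[sum(ker[a][b] * ch[i + a][j + b]
                         for a in range(k) for b in range(k))
                     for j in range(wd)] for i in range(h)])
    return out

# One-channel 4 x 4 input, as for a grayscale frame (toy values).
x = [[[1.0] * 4 for _ in range(4)]]
# Pointwise weights mapping 1 input channel to 2 output channels (hypothetical).
w = [[1.0], [2.0]]
# One 3 x 3 kernel of ones per output channel (hypothetical).
kernels = [[[1.0] * 3 for _ in range(3)] for _ in range(2)]

y = depthwise(pointwise(x, w), kernels)
print(y)
```

Each output channel is filtered spatially by its own kernel, so the only place where channels interact is the pointwise step.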
EfficientNetV1 is a model that achieved SOTA performance on image classification tasks in 2019 with far fewer parameters than other models [25]. The performance of a CNN tends to be proportional to the scale of the model, and many studies have sought to improve performance by scaling up the model. There are three methods of scaling up: deepening the network, increasing the channel width, and increasing the resolution of the input image. EfficientNetV1 found the optimal combination of these three through automated machine learning [26] and proposed a compound scaling method to achieve high performance even with a small model. EfficientNetV2 keeps the existing structure and increases the training speed while maintaining accuracy through progressive learning, which gradually increases the input image size, together with a non-uniform scaling technique that compensates for progressive learning [27]. The basic structure of EfficientNetV2 is shown in Fig. 5.
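For reference, compound scaling as formulated for EfficientNet ties the three scaling dimensions to a single coefficient. The symbols below are not used elsewhere in this paper: φ is the user-chosen compound coefficient, and α, β, γ are constants found by a small search on the baseline network.

```latex
\begin{aligned}
\text{depth: } d &= \alpha^{\phi}, \qquad
\text{width: } w = \beta^{\phi}, \qquad
\text{resolution: } r = \gamma^{\phi} \\
\text{subject to } \quad & \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2,
\qquad \alpha \ge 1,\ \beta \ge 1,\ \gamma \ge 1
\end{aligned}
```

Under this constraint, the computational cost grows by roughly a factor of 2^φ, because cost scales linearly with depth and quadratically with width and resolution.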
Three algorithms were selected for ultrasound pregnancy diagnosis. Inception-v4 was selected because it reduces computational complexity through the inception module, achieving fast processing and high accuracy. Xception was selected because it applies depthwise separable convolution, which suits ultrasound images that are essentially one-channel grayscale. EfficientNetV2 was selected because it performs classification with an optimal combination of scaling factors found through automated machine learning, which is useful for ultrasound images, for which frequency bands exist but an accurate image resolution cannot be defined.
Inception-v4, Xception, and EfficientNetV2 were trained for pregnancy diagnosis in sows. The 5,292 ultrasound images of pregnant sows were divided into 4,241 for training and 1,051 for testing. The 5,367 ultrasound images of non-pregnant sows were divided into 4,231 for training and 1,136 for testing. The training images were further divided into training and validation sets at a ratio of 8:2. Training of the network models continued until the validation loss converged. All training and performance evaluations were performed using Windows 10 x64, CUDA 10.1 with cuDNN, and Python 3.7.4 on a machine with the following specifications: Intel(R) Xeon(R) W-2133, NVIDIA TITAN Xp, and 128 GB RAM.
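The 8:2 split and the stopping rule can be sketched as follows. The text does not state whether the split was random or stratified by class, nor how convergence of the validation loss was judged. The random shuffle, the `patience` and `min_delta` parameters, and the function names below are therefore assumptions made for illustration.

```python
import random

def split_train_val(items, val_fraction=0.2, seed=0):
    """Shuffle and split items into (train, validation) at 1 - val_fraction : val_fraction."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_val = round(len(items) * val_fraction)
    return items[n_val:], items[:n_val]

def has_converged(val_losses, patience=5, min_delta=1e-3):
    """True when the last `patience` epochs did not improve on the
    earlier best validation loss by at least `min_delta`."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    best_recent = min(val_losses[-patience:])
    return best_before - best_recent < min_delta

# 4,241 pregnant + 4,231 non-pregnant training images, represented by indices.
training_images = range(4241 + 4231)
train_set, val_set = split_train_val(training_images)
print(len(train_set), len(val_set))

# Hypothetical validation-loss curves, only to exercise the stopping rule.
print(has_converged([1.0, 0.5, 0.3, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]))
print(has_converged([1.0, 0.8, 0.6, 0.4, 0.2, 0.1, 0.05], patience=3))
```

A stratified split would preserve the class ratio in both subsets. With the nearly balanced classes used here, a plain random split is expected to give a similar result.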
RESULTS AND DISCUSSION
The performance of pregnancy diagnosis in sows was evaluated using the weights trained with Inception-v4, Xception, and EfficientNetV2. The overall structure of the study is shown in Fig. 6. The dataset used for the performance evaluation was divided into Dataset-A and Dataset-B. Dataset-A consisted of 1,051 ultrasound images of pregnant sows, covering all situations including those with visible embryonic sacs, and 1,136 ultrasound images of non-pregnant sows. Dataset-B, which is a subset of Dataset-A, consisted of 14 ultrasound images of pregnant sows with invisible embryonic sacs and 14 ultrasound images of non-pregnant sows. Each of Dataset-A and Dataset-B was further divided according to the noise applied: original (no added noise), NoiseT1 (original images with added speckle noise of variance 0.4 and Gaussian noise of zero mean and variance 0.01), and NoiseT2 (original images with added speckle noise of variance 0.7 and Gaussian noise of zero mean and variance 0.02). Therefore, a total of six test datasets were used for performance evaluation: Original Dataset-A, Original Dataset-B, NoiseT1 Dataset-A, NoiseT1 Dataset-B, NoiseT2 Dataset-A, and NoiseT2 Dataset-B.
The ultrasound images used in the study were organized as shown in Table 1. Ultrasound images in Dataset-A and Dataset-B were classified as pregnant or non-pregnant using the weights trained with Inception-v4, Xception, and EfficientNetV2. A confusion matrix consisting of true positive (TP), true negative (TN), false positive (FP), and false negative (FN) counts was used for evaluation. TP is the case in which a pregnant sow is predicted as pregnant, and TN is the case in which a non-pregnant sow is predicted as non-pregnant. FP is the case in which a non-pregnant sow is incorrectly predicted as pregnant, and FN is the case in which a pregnant sow is incorrectly predicted as non-pregnant. We also employed the performance metrics of sensitivity, specificity, and accuracy to evaluate the pregnancy diagnosis performance. Sensitivity is calculated as TP / (TP+FN) and is the proportion of all pregnant cases that are determined as pregnant; specificity is calculated as TN / (TN+FP) and is the proportion of all non-pregnant cases that are determined as non-pregnant. Accuracy, calculated as (TP+TN) / (TP+TN+FP+FN), includes all elements of sensitivity and specificity and reflects the overall pregnancy diagnosis performance.
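The three metrics follow directly from the confusion-matrix counts. The sketch below uses the standard definitions, with specificity computed over all non-pregnant cases, TN / (TN + FP). The example counts are hypothetical, sized like Dataset-B (14 pregnant and 14 non-pregnant images), and are not values taken from the result tables.

```python
def diagnosis_metrics(tp, tn, fp, fn):
    """Return sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # pregnant correctly identified / all pregnant
    specificity = tn / (tn + fp)   # non-pregnant correctly identified / all non-pregnant
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"sensitivity": sensitivity, "specificity": specificity, "accuracy": accuracy}

# Hypothetical counts: 12 of 14 pregnant and 14 of 14 non-pregnant images correct.
metrics = diagnosis_metrics(tp=12, tn=14, fp=0, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With only 14 images per class, a single misclassified image changes sensitivity or specificity by about 0.07, which is worth keeping in mind when reading results on a test set of this size.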
The results of the ultrasound pregnancy diagnosis performance evaluation for Dataset-A are shown in Table 2. Xception achieved the highest overall performance. On the original ultrasound images, Xception, EfficientNetV2, and Inception-v4 achieved accuracies of 0.98, 0.99, and 0.98, respectively. However, when noise was added, the performance of EfficientNetV2 and Inception-v4 decreased significantly, whereas the performance of Xception was reduced by only 0.02, a minor difference from the original. Results for Dataset-B are shown in Table 3; again, Xception achieved the highest overall performance. On the original ultrasound images, Xception, EfficientNetV2, and Inception-v4 achieved accuracies of 0.89, 0.82, and 0.93, respectively. Dataset-B is difficult to classify even for experts because the embryonic sacs are not visible, yet the proposed method achieved high overall performance. When the ultrasound images contained noise, the performance of EfficientNetV2 and Inception-v4 decreased significantly. Although the performance of Xception was also reduced from the original, the difference was only 0.04. Dataset-B shows a lower sensitivity than Dataset-A. This is thought to be because the number of images with invisible embryonic sacs is not sufficient for training: only 88 of the 4,241 training images of pregnant sows. On the other hand, specificity was 1.00 for all models in Dataset-B. This is the opposite situation: the non-pregnant class was trained on many images but evaluated on only 14 images. Although there was a data imbalance problem in Dataset-B, we were able to confirm unbiased performance through the comparison of the three classification algorithms.
The classification algorithms used in this study have high performance. When tested with the original ultrasound images, they achieved high performance on both Dataset-A and Dataset-B. However, when noise was included or the intensity of the noise was increased, performance decreased drastically for all models except Xception. Xception maps the correlation between channels and then maps the spatial correlation; that is, channel-wise and spatial relationships are handled separately by the depthwise separable convolution. Two types of noise were added to the ultrasound images according to the characteristics of ultrasonography. Xception, which is based on a CNN structure, is robust against noise when extracting spatial features. Furthermore, against speckle noise, which has three channels unlike the single channel of ultrasonography, it is presumed that robust classification was achieved by extracting the channel and spatial features separately. As a result, the Xception classification algorithm was found to be the best choice for pregnancy diagnosis using ultrasound images.
CONCLUSION
In this study, ultrasonography-based deep learning algorithms for diagnosing pregnancy in sows were proposed. Inception-v4, Xception, and EfficientNetV2 were used as the deep learning-based classification algorithms. Gaussian noise (variance 0.01 and 0.02) and speckle noise (variance 0.4 and 0.7) were added to the ultrasound images, because ultrasound images are easily affected by noise from the surrounding environment.
The pregnancy diagnosis algorithms achieved good overall performance. The algorithms performed well on ultrasound images with visible embryonic sacs. Even on ultrasound images with invisible embryonic sacs, which are difficult for experts to distinguish, the algorithms achieved accuracies of up to 0.93. When the embryonic sac was visible in an ultrasound image containing noise, the accuracy reached 0.98. For ultrasound images with noise and invisible embryonic sacs, accuracy was reduced to 0.89. The Xception algorithm showed robustness against noise and achieved high overall performance. In future work, we plan to collect more images with invisible embryonic sacs, as the current study had only a few of these. In addition, this study considered only pregnancies of at least 23 days; we therefore plan to include pregnancies between 10 and 23 days.