Enhancing Animal Breeding through Quality Control in Genomic Data - A Review
Abstract
High-throughput genotyping and sequencing have revolutionized animal breeding by providing access to vast amounts of genomic data that facilitate precise selection for desirable traits. The shift from traditional methods to genomic selection relies on dense marker information for predicting breeding values. However, the success of genomic selection depends heavily on the accuracy and quality of the genomic data. Inaccurate or low-quality data can lead to flawed predictions, compromising breeding programs and reducing genetic gain. Stringent quality control (QC) measures are therefore essential at every stage of data processing. QC of genomic data involves managing single nucleotide polymorphism (SNP) quality, assessing call rates, and filtering on minor allele frequency (MAF) and Hardy-Weinberg equilibrium (HWE). High-quality SNP data is crucial because genotyping errors can bias estimates of breeding values. Cost-effective low-density genotyping platforms often require imputation to infer missing genotypes. QC is vital for genomic selection, genome-wide association studies (GWAS), and population genetics analyses because it ensures data accuracy and reliability. This paper reviews QC strategies for genomic data and emphasizes their applications in animal breeding programs. By examining various QC tools and methods, it highlights the importance of data integrity for successful outcomes in genomic selection, GWAS, and population analyses, and the critical role of robust QC in enhancing the reliability of genomic predictions and advancing animal breeding practices.
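As an illustration of the SNP-level filters named above (call rate, MAF, and HWE), the following minimal Python sketch checks a single SNP. It is not taken from any tool covered in this review. The function name `snp_passes_qc`, the 0/1/2 genotype coding with `None` for a missing call, and the default thresholds are all assumptions chosen for illustration; appropriate cutoffs vary with species, platform, and study design. The HWE check here is a simple one-degree-of-freedom chi-square test, whereas an exact test may be preferable for small samples or rare alleles.

```python
import math


def snp_passes_qc(calls, min_call_rate=0.95, min_maf=0.05, min_hwe_p=1e-6):
    """Return True if one SNP passes call-rate, MAF, and HWE filters.

    calls: list of genotypes for one SNP across animals, coded as
           0, 1, or 2 copies of the counted allele, or None if missing.
    All thresholds are illustrative assumptions, not recommendations.
    """
    called = [g for g in calls if g is not None]
    n = len(called)
    if n == 0:
        return False  # no usable genotypes at all

    # Call rate: fraction of animals with a non-missing genotype.
    call_rate = n / len(calls)

    # Genotype counts and frequency of the counted allele.
    n_hom_ref = called.count(0)
    n_het = called.count(1)
    n_hom_alt = called.count(2)
    p = (2 * n_hom_alt + n_het) / (2 * n)
    q = 1 - p
    maf = min(p, q)

    # Monomorphic SNPs carry no information and would break the HWE test.
    if call_rate < min_call_rate or maf == 0 or maf < min_maf:
        return False

    # Chi-square test against Hardy-Weinberg expected genotype counts.
    observed = [n_hom_ref, n_het, n_hom_alt]
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Upper-tail p-value of a chi-square statistic with 1 degree of freedom.
    hwe_p = math.erfc(math.sqrt(chi2 / 2))
    return hwe_p >= min_hwe_p
```

Applying the function to every SNP in a panel gives a keep/discard decision per marker. In practice, animal-level checks such as per-sample call rate are usually run alongside these marker-level filters.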