Background

Genomic selection (GS) [1, 2] has been intensively used in routine genomic evaluations of pigs, especially in developed agricultural countries [3]. In the Chinese pig industry, GS is a newly introduced technology, and a small number of pig companies have started applying GS as a routine genetic evaluation approach. Due to the different types of single nucleotide polymorphism (SNP) arrays available on the fiercely competitive market and the limited knowledge of the performance of these SNP arrays, many pig companies tend to use different SNP arrays to genotype their pigs in the initial stage of implementing GS. Consequently, pigs within one population can be genotyped with different SNP arrays. This has also been reported in a study on dairy cattle [4]. SNP arrays usually contain a large number of unique SNPs that are not shared with other chips. Thus, the integration of genomic information from different SNP arrays and the application of such information in pig genomic evaluation pose a challenge to these pig companies. The imputation of genotypes from a low-density to a high-density SNP panel is routinely performed [5, 6], providing a strategy for combining data from different SNP arrays for genomic evaluation. However, an appropriate strategy for integrating genomic information from different SNP arrays of medium density (i.e., 50K to 60K) for pig genomic evaluation has not yet been reported and deserves to be further investigated.

Although previous studies have demonstrated that dominance effects are not negligible [8], they are usually ignored in genetic evaluations because of the high computational requirements and the need for large-scale datasets with high proportions of full sibs [7]. With increases in computing power and the availability of SNPs, it has become feasible to estimate dominance effects accurately [8, 9]. In previous studies, dominance effects have been fitted as a ‘genotypic’ (biological) effect (d) in linear mixed models. For example, SNPs are coded as 0, 1, and 2 for genotypes AA, Aa, and aa, respectively, and the dominance effects are coded as 0, 1, and 0 for genotypes AA, Aa, and aa, respectively [8, 9]. In contrast, in traditional genetic evaluations, dominance effects are included in linear mixed models as dominance deviations. For instance, SNP dominance effects are coded as \(-2{p}^{2},2pq,\) and \(-2{q}^{2}\) for genotypes AA, Aa, and aa, respectively, where \(p\) and \(q=1-p\) are the allele frequencies at the SNP [10]. Vitezica et al. [10] referred to this parameterization as ‘classical’ (statistical). To avoid potential confusion, we use the terms ‘genotypic dominance effect’ and ‘classical dominance effect’ in this study to refer to dominance effects coded in a genotypic manner and in a dominance-deviation manner, respectively.
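The two dominance codings described above can be illustrated with a minimal sketch. It assumes that genotypes are stored as counts of allele a (0, 1, 2 for AA, Aa, aa) and that \(p\) is the observed frequency of the counted allele at each SNP. The function name `dominance_codings` and the toy data are hypothetical and are not taken from the analysis pipeline of this study.

```python
import numpy as np


def dominance_codings(genotypes):
    """Return (genotypic, classical) dominance codings for an n x m matrix.

    genotypes: counts of allele a (0 = AA, 1 = Aa, 2 = aa), as in the text.
    Assumption: p is the observed frequency of allele a per SNP, q = 1 - p.
    """
    M = np.asarray(genotypes)
    p = M.mean(axis=0) / 2.0  # per-SNP frequency of the counted allele
    q = 1.0 - p
    # Genotypic (biological) coding: 0, 1, 0 for AA, Aa, aa.
    genotypic = (M == 1).astype(float)
    # Classical (statistical) coding: -2p^2, 2pq, -2q^2 for AA, Aa, aa.
    classical = np.where(
        M == 0, -2.0 * p**2, np.where(M == 1, 2.0 * p * q, -2.0 * q**2)
    )
    return genotypic, classical


# Toy data: one SNP, 16 animals in Hardy-Weinberg proportions for p = 0.25.
toy = np.array([[0]] * 9 + [[1]] * 6 + [[2]] * 1)
genotypic, classical = dominance_codings(toy)
print(genotypic.mean(), classical.mean())
```

With genotypes in Hardy-Weinberg proportions, the classical coding has a mean of zero at each SNP, whereas the mean of the genotypic coding equals the heterozygosity \(2pq\). This is one way to see why the two parameterizations partition the genetic variance differently.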

An increasing number of studies have investigated the influence of including dominance effects in prediction models on genomic evaluations of livestock [8, 11,1). In total, there were 467,244 pigs in the pedigree. Descriptive statistical data of the phenotypes are in Table 1. The mean pedigree-based inbreeding coefficient was 0.007 (ranging from 0 to 0.267).

Table 1 Descriptive statistics

These genotypic data were included in the single-step additive genetic evaluation model and were used to calculate the pre-corrected phenotypes of each trait. The pre-corrected phenotype (\({\mathbf{y}}_{c}\)) was calculated as \({\mathbf{y}}_{c}=\widehat{\mathbf{a}}+\widehat{\mathbf{e}}\), where \(\widehat{\mathbf{a}}\) and \(\widehat{\mathbf{e}}\) were the estimated additive genetic values and residuals for each tested pig. The pre-corrected phenotypes \(({\mathbf{y}}_{c})\) of the 6614 genotyped pigs were used for the subsequent genomic prediction analysis. To evaluate the impact of dominance effects and inbreeding depression effects on genomic prediction, six genomic models were used to estimate variance components and predict total genetic effects as follows:

$$\begin{aligned}\mathrm{MA}:&\quad {\mathbf{y}}_{c}=\mathbf{1}\mu +\mathbf{Z}\mathbf{a}+\mathbf{e},\\ \mathrm{MAD}:&\quad {\mathbf{y}}_{c}=\mathbf{1}\mu +\mathbf{Z}\mathbf{a}+\mathbf{W}\mathbf{v}+\mathbf{e},\\ {\mathrm{MAD}}^{*}:&\quad {\mathbf{y}}_{c}=\mathbf{1}\mu +\mathbf{Z}\mathbf{a}+\mathbf{W}{\mathbf{v}}^{*}+\mathbf{e},\\ \mathrm{MAI}:&\quad {\mathbf{y}}_{c}=\mathbf{1}\mu +\mathbf{f}\eta +\mathbf{Z}\mathbf{a}+\mathbf{e},\\ \mathrm{MAID}:&\quad {\mathbf{y}}_{c}=\mathbf{1}\mu +\mathbf{f}\eta +\mathbf{Z}\mathbf{a}+\mathbf{W}\mathbf{v}+\mathbf{e},\\ {\mathrm{MAID}}^{*}:&\quad {\mathbf{y}}_{c}=\mathbf{1}\mu +\mathbf{f}\eta +\mathbf{Z}\mathbf{a}+\mathbf{W}{\mathbf{v}}^{*}+\mathbf{e},\end{aligned}$$

where \({\mathbf{y}}_{c}\) is the vector of pre-corrected phenotypes for each trait; \(\mu\) is the overall mean; \(\mathbf{f}\) is the vector of genomic inbreeding coefficients, calculated as \(\mathbf{1}-\frac{\mathbf{h}}{m}\), where \(\mathbf{1}\) is a vector in which all elements are 1, \(m\) is the number of SNPs, and \(\mathbf{h}\) is a vector of the number of heterozygous loci for each individual [32,33,34,35]. Thus, we investigated the distribution of the MAF of imputed SNPs and studied the highest relatedness of individuals between the imputed and reference populations. The proportion of SNPs with a low MAF was lower in Scenario 1 than in Scenario 2 (see Additional file 2: Figure S1), and the top genomic relatedness was slightly lower in Scenario 2 than in Scenario 1 (see Additional file 1: Table S6), which would probably lead to a higher imputation accuracy in Scenario 1.
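The genomic inbreeding coefficient used as a covariate in the models, \(\mathbf{f}=\mathbf{1}-\mathbf{h}/m\), can be computed directly from a genotype matrix. The sketch below assumes genotypes coded as 0, 1, and 2 with 1 denoting the heterozygote. The function name `genomic_inbreeding` and the toy matrix are hypothetical.

```python
import numpy as np


def genomic_inbreeding(genotypes):
    """Genomic inbreeding per individual, f = 1 - h / m.

    genotypes: n x m matrix coded 0/1/2, where 1 is the heterozygote.
    h: number of heterozygous loci per individual; m: number of SNPs.
    """
    M = np.asarray(genotypes)
    m = M.shape[1]
    h = (M == 1).sum(axis=1)
    return 1.0 - h / m


# Toy data: 3 individuals genotyped at 4 SNPs.
toy = np.array([[0, 1, 2, 1],
                [0, 0, 2, 2],
                [1, 1, 1, 1]])
f = genomic_inbreeding(toy)
print(f)
```

Under this definition, \(f\) is the proportion of homozygous loci, so a fully homozygous individual has \(f=1\) and an individual that is heterozygous at every locus has \(f=0\).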

Estimated variance components

In this study, the estimated narrow-sense heritabilities confirmed that ADG, BF, LMD, and AGE100 were moderately heritable and that TNB had a low heritability, in line with many other reports [13, 15, 36]. No significant difference in narrow-sense heritability was observed among the genomic models, which indicates that the additive genetic variance was consistently separated from the phenotypic variance in all genomic models, regardless of whether nonadditive effects were fitted.

In this study, the proportions of dominance variance relative to the total genetic variance for production traits were relatively low (ranging from 1.9 to 10.5%) and generally lower than those found in other studies on production traits in pigs [8, 16]. The proportion of genotypic dominance variance relative to the total genetic variance of TNB was moderate (ranging from 18.2 to 20.3%) and was similar to that reported in a previous study [17]. Our finding that the proportion of genotypic dominance variance relative to the total genetic variance was higher for TNB (20.3%) than for the production traits (~ 8.5%) in Yorkshire pigs is consistent with a previous study in Holstein cattle, which reported that the proportion of classical dominance variance relative to the total genetic variance was ~ 34.3% for another reproduction trait (calving interval), whereas it was ~ 8.5% on average for production traits (milk, fat, and protein yields) [14]. For all traits, there were no significant differences between the proportions of classical and genotypic dominance variance when standard errors were taken into account. One possible reason is that the dominance variance was too small for its two forms to be distinguished, which needs to be further investigated. Although both the classical and the genotypic dominance variances were small, the genotypic dominance variance was slightly larger than the classical dominance variance, as reported by Vitezica et al. [9]. Based on the conversion method described by Vitezica et al. [9], the estimated genotypic dominance variance can be easily converted into that obtained via the classical approach. As shown in Additional file 1: Table S7, after transformation, the estimated genetic variances from the genotypic dominance model (MAID) were close to those obtained from the classical dominance model (MAID*), which confirmed the equivalence of the estimates of dominance variance generated in this study. The standard errors of the estimates of dominance variance were still relatively large, which indicates that our dataset was too small to estimate dominance variance accurately. Therefore, more data are needed to further investigate the dominance effects in the current population.
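The relationship between the two parameterizations can be illustrated for a single biallelic locus under Hardy-Weinberg equilibrium, using standard quantitative-genetics expressions. This is a minimal sketch and not the multi-locus conversion applied in this study. The function name and the values of \(p\), \(a\), and \(d\) are hypothetical, and \(p\) is assumed to be the frequency of the allele counted in the additive coding.

```python
def single_locus_variances(p, a, d):
    """Classical and genotypic variance terms for one locus under HWE.

    p: frequency of the counted allele; a, d: additive and dominance effects.
    """
    q = 1.0 - p
    het = 2.0 * p * q                 # heterozygosity, 2pq
    alpha = a + d * (q - p)           # allele substitution effect
    return {
        "classical_additive": het * alpha**2,
        "classical_dominance": (het * d) ** 2,
        "genotypic_additive": het * a**2,
        "genotypic_dominance": het * (1.0 - het) * d**2,
        # Cross term between a and d; its expectation is zero when a and d
        # are independent random effects with zero mean.
        "cross_term": 2.0 * het * (q - p) * a * d,
    }


v = single_locus_variances(p=0.3, a=1.0, d=0.5)
total_classical = v["classical_additive"] + v["classical_dominance"]
total_genotypic = (
    v["genotypic_additive"] + v["genotypic_dominance"] + v["cross_term"]
)
print(total_classical, total_genotypic)
```

Both decompositions sum to the same total genetic variance at the locus. Because \(2pq\le 0.5\), the genotypic dominance term \(2pq(1-2pq){d}^{2}\) is never smaller than the classical term \({(2pqd)}^{2}\), which agrees with the direction of the difference observed between the two dominance variances.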

In this study, we used the pre-corrected phenotypes of the genotyped pigs as the response variables to estimate dominance variances. These genotyped pigs were not randomly sampled from the population: most of them had high estimated breeding values (EBV) and were selected as parents of the next generation. Xiang et al. [18] reported that preselection and precorrection greatly reduced the variances of the dominance effects, which suggests that the dominance variances estimated here may be biased downward. In addition, putative errors in the imputed genotypes might increase the uncertainty of genomic evaluations [37]. It should be noted that in some other studies, the proportion of dominance variance relative to the total genetic variance was found to be lower than in our study and even close to 0 [38, 39]. Previous studies have shown that this proportion is affected by various factors, including the studied population, the target traits, the types of information, and the genomic models [8, 16]. Thus, more studies are needed to further investigate the effect of these factors on dominance variance.

Estimates of inbreeding depression

As shown in Table 4, there were no large differences in the estimates of inbreeding depression parameters among the MAI, MAID, and MAID* models when standard errors were taken into account, which is in line with previous studies [14, 18]. The estimates showed that inbreeding depression had detrimental effects on ADG, LMD, AGE100, and TNB, and thus that it should be included in the model for genetic evaluation [30]. Inbreeding depression estimates for the same traits from previous studies [18, 19, 30, 36, 40] were similar to our results. For BF, the estimate was negative, i.e., inbreeding did not show a detrimental effect in this study, in agreement with results on Pietrain pigs reported in [28]. For BF in model MAI, we estimated an \(\eta\) value of − 4.749, which means that an increase of 0.10 unit in the inbreeding coefficient led to a decrease of 0.475 mm in backfat thickness. Another study reported that inbreeding depression had no effect on backfat [41], and the authors attributed this to variation in the sign of dominance effects across genes, suggesting that dominance effects at different loci might be either positive or negative [23]. Notably, the standard errors of the backfat estimates were large in our study, and the estimates of dominance effects of BF differed only slightly from 0. Therefore, larger datasets are needed to further investigate the inbreeding depression effects on BF.

The ratio of the estimated inbreeding depression effect to the phenotypic standard deviation of the trait is an indicator of the importance of the inbreeding depression effect [30]. In model MAI, the absolute values of this ratio for ADG, LMD, AGE100, and TNB were 4.023, 2.516, 3.261, and 2.197, respectively. Note that the estimate of this effect refers to an individual with 100% inbreeding. For BF, the absolute value of the ratio was 1.702, the smallest among the traits, which suggests that inbreeding depression had comparatively little impact on BF. This is consistent with the findings above.
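Because the ratios reported above refer to complete inbreeding, it can help to rescale them to a smaller change in the inbreeding coefficient. As a rough illustration, assuming the linear effect of inbreeding fitted in the models and an increase of 0.10 in the inbreeding coefficient (\(\Delta F\) and \({\sigma }_{p}\) are notation introduced here for the inbreeding increment and the phenotypic standard deviation):

$$\frac{|\widehat{\eta }|}{{\sigma }_{p}}\times \Delta F=4.023\times 0.10\approx 0.40\ \text{(ADG)},\qquad 1.702\times 0.10\approx 0.17\ \text{(BF)}.$$

That is, under these assumptions, a 0.10 increase in the inbreeding coefficient corresponds to a change of about 0.40 phenotypic standard deviations for ADG and about 0.17 for BF.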

To our knowledge, our study is the first to report the proportion of additive genetic variance that is contributed by inbreeding depression effects. According to the formula for calculating the additive variance, the proportion contributed by inbreeding depression is mainly affected by allelic frequencies, the magnitude of the estimated inbreeding depression parameter, and the number of SNPs used. As shown in Additional file 2: Figure S2, for a single locus, the value of \(2{p}_{j}{q}_{j}{\left({q}_{j}-{p}_{j}\right)}^{2}\) is largest when the frequency of the reference allele is approximately 0.15. However, even if the frequency of the reference allele was 0.15 for all loci, the proportion of additive variance contributed by inbreeding depression would not change much since it needs to be divided by the number of SNPs used, \(m\). This explains why the proportion of additive variance contributed by inbreeding depression was small for all traits in this study.
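The location of the single-locus maximum mentioned above can be checked numerically. The sketch below evaluates \(2pq{(q-p)}^{2}\) on a grid of allele frequencies and compares the result with the closed-form maximizer \(p=(1-\sqrt{0.5})/2\approx 0.146\), which follows from writing the expression as \(2x(1-4x)\) with \(x=pq\). The function and variable names are hypothetical.

```python
import numpy as np


def locus_weight(p):
    """Single-locus term 2pq(q - p)^2 as a function of allele frequency p."""
    q = 1.0 - p
    return 2.0 * p * q * (q - p) ** 2


# Search p in [0, 0.5]; the function is symmetric around p = 0.5.
p_grid = np.linspace(0.0, 0.5, 50001)
weights = locus_weight(p_grid)
p_best = p_grid[np.argmax(weights)]

# Closed form: 2x(1 - 4x) with x = pq is maximal at x = 1/8.
p_exact = (1.0 - np.sqrt(0.5)) / 2.0
print(p_best, p_exact, weights.max())
```

The maximum value of the term is 0.125, reached at a frequency of about 0.146 (and, by symmetry, at about 0.854), so each locus contributes at most 0.125 before the division by \(m\).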

Overall, the inclusion of the inbreeding depression effect in the genomic model had no significant effect on the estimation of variance components for all traits, although all of the dominance variances were slightly reduced, as also reported by Aliloo et al. [14].

Predictive abilities

Comparison of the six genomic models showed that those with inbreeding depression effects (MAI, MAID, and MAID*) had a better goodness-of-fit than the model with only additive effects (MA) for all traits, in line with Aliloo et al. [14]. This result suggests that inbreeding depression had an impact on the production traits and TNB, and thus that this effect should be explicitly fitted in genomic evaluation models. This was confirmed by the results regarding predictive ability. Previous studies have reported that including dominance effects in a genomic model can improve its predictive ability [8, 11, 15]. However, our study showed that including dominance effects in the genomic model only slightly improved predictive ability, and only for TNB. This might be related to the degree to which traits are affected by dominance. The observation that including inbreeding depression in the model improved the predictive ability whereas including dominance effects did not was also reported by Xiang et al. [18] and Aliloo et al. [14]. Our explanation for this finding is that dominance has two components that can be modeled separately [18]. The first is the directional dominance effect [18], which accumulates across loci and leads to an inbreeding depression effect that is modeled via a single covariate, with an accurately estimated effect. The second is the remaining residual dominance effects (which have a mean of zero), for which it is difficult to obtain accurate estimates using a dominance relationship matrix, especially when the sample size is not sufficient. Thus, even when dominance deviations were included in the genomic model, predictive abilities were not further improved. Nevertheless, although including dominance effects in the model did not improve its predictive ability for production traits, it did not decrease it either, which agrees with the results of a study on the total number of piglets born in Danish Yorkshire pigs [18].

Conclusions

Our results revealed that the inclusion of an inbreeding depression effect in the genomic model increased its predictive ability for the four production traits (ADG, BF, LMD, and AGE100) and the reproduction trait (TNB) studied, and that when the tested trait was strongly affected by dominance, the inclusion of dominance effects in the model further improved its predictive ability. Even when the trait was little affected by dominance, the inclusion of dominance effects in the model did not decrease its predictive ability.