Often, after fitting a complex linear mixed model (LMM), we are interested in constraining some of the variance components (VCs). There are many reasons why we might want to do this, such as when we combine data from several sources and we want to have a common parameter, or when we want to simplify a model to evaluate a specific hypothesis relating to these components.
In this snippet we will briefly explore an example where we constrain two VCs to be identical. The example comes from Becker (1984) and corresponds to data from 28 white pine (Pinus monticola) families formed by crossing 4 male to 7 female parents. The progeny was evaluated in a randomized complete block design with 4 replicates and planted in large plots. Our response of interest was epicotyl length, the part of the embryonic seedling stem from which the plant’s shoot system develops.
Using ASReml-R, the reference model fitted to this data is:
mod0 <- asreml(fixed=length~rep,
random=~female + male + female:male,
data=pine)
In the above model we have a fixed effect of replicate (), and random effects for the parents and their interaction (, and ). This LMM produces the following output in ASReml-R:
component | std.error | z.ratio | bound | %ch | |
---|---|---|---|---|---|
male | 0.1203614 | 0.1203723 | 0.9999094 | P | 0 |
female | 0.0209448 | 0.0420690 | 0.4978693 | P | 0 |
female:male | 0.1368513 | 0.0628149 | 2.1786443 | P | 0 |
units!R | 0.2004263 | 0.0314941 | 6.3639416 | P | 0 |
In this example, the variance of males ( = 0.120) is approximately six times larger than the variance for females ( = 0.021). Therefore, the additive variance from males and females are contrastingly different: .
The above results may seem biologically unreasonable for pines, where the genetic contribution of male and female parents (i.e., ) is often very similar. Hence, we expect their genetic variances to be almost identical. That is, we think . The inconsistent VC estimates from our reference model could be the result of random sampling of families, or to the limited number of parents considered in this dataset (recall that we have only 4 male and 7 female parents).
In ASReml-R, the constraint is easy to implement. This is done by creating a matrix, say , that defines the grouping of the VCs. All VCs in the same group (i.e., with the same number) will be constrained to be identical. For our problem, the matrix looks like:
## V1 V2
## female 1 1
## male 1 1
## female:male 2 1
## units!R 3 1
The first column, , indicates the grouping; hence, there are three VCs two of which ( and ) will be identical. The second column, , is used to specify the multiplicative relationships between components, but this is a topic of another ASReml-R snippet.
The construction of the above matrix can be facilitated with the following R code. This starts by requesting the table of VC parameters () from ASReml-R:
modC <- asreml(fixed=length~rep,
random=~female + male + female:male,
start.values=TRUE,
data=pine)
gam <- modC$vparameters.table
It then proceeds to generate our matrix of constraints using only the column names from the table :
M <- as.matrix(data.frame(V1=c(1,1,2,3), V2=c(1,1,1,1)))
dimnames(M)[[1]] <- gam$Component
Now, we are ready to fit the model in ASReml-R with the constraint . The matrix is incorporated using the option :
modC <- asreml(fixed=length~rep,
random=~female + male + female:male,
vcc=M,
data=pine)
The output from fitting this constrained model in ASReml-R is:
component | std.error | z.ratio | bound | %ch | |
---|---|---|---|---|---|
male | 0.0639041 | NA | NA | C | 0.5 |
female | 0.0639041 | 0.0369027 | 1.731692 | P | 0.5 |
female:male | 0.1327523 | 0.0602139 | 2.204676 | P | 0.0 |
units!R | 0.2005090 | 0.0312342 | 6.419535 | P | 0.0 |
In the above table see that = 0.064. Also note that the z-ratio for this variance component (1.731) is much higher than the z-ratios for both (1.000) or (0.498) obtained from our reference model (from the first table), indicating better use of the available information.
We compare our constrained model () against the reference model () statistically, using a likelihood ratio test (LRT). This is implemented using the function ():
lrt.asreml(modC, mod0, boundary=FALSE)
df | LR-statistic | Pr(Chisq) | |
---|---|---|---|
mod0/modC | 1 | 1.03068 | 0.309999 |
The LRT indicates that our null hypothesis of cannot be rejected in favor of our alternative hypothesis $ (p-value = 0.31). That is, despite the estimated variance for being approximatelly six times larger than the variance for under , there is no sufficient statistical evidence that they are different. Probably, this result is due to the limited number of parents and crosses considered in this study (i.e., low statistical power).
And as a final extension, for the constrained analysis we also can calculate a narrow-sense heritability using the function , as:
vpredict(modC, h2~4*V2/(V1+V2+V3+V4))
Estimate | SE | |
---|---|---|
h2 | 0.5543991 | 0.2486712 |
The above output provides us with an estimate of , 0.554 (SE = 0.249), a reasonably high heritability estimate but with a large amount of uncertainty.
The appeal of this ASReml-R approach for constraining variance components is that, for any complex linear mixed models, constraints can be easily specified in column of matrix , for most hypothesis of interest, or to evaluate models with specific biological or mathematical constraints.
Salvador A. Gezan
Becker (1984). Manual of Quantitative Genetics, 4th Edition. Academic Enterprises, Pullman, WA.
Notes: SAG May-2020