LD Weightings in Prediction

Please note that we no longer recommend using MultiBLUP; this page exists only for completeness.

In our AJHG paper, we demonstrated that using LD-adjusted kinship matrices leads to more precise estimates of variance explained (or SNP heritability). By contrast, we do not advise using them for prediction.

When a predictor receives a weighting of zero, its signal is (almost) perfectly captured by neighbouring predictors. Suppose this predictor contributes heritability. When estimating variance explained, it is not necessary to include this predictor, because its heritability contribution is picked up by the neighbouring predictors that tag its signal. By contrast, when constructing a prediction model, it is important that this predictor (or a perfect tag) remains so that it can be assigned an effect; otherwise, its effect will be spread across neighbouring predictors, which also carry non-associated signal, adding noise to the prediction model.
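The point can be illustrated with a minimal Python sketch. It is not LDAK code: the function name blup_effects, the simulated data, the fixed heritability of 0.5 and the kinship form K = Z W Z^T / sum(w) are all assumptions made for illustration. Under this assumed model, the BLUP effect of each predictor is proportional to its weighting, so a predictor with weighting zero can never be assigned an effect, and any signal it carries must be absorbed by its neighbours.

```python
import numpy as np

def blup_effects(X, y, weights, h2=0.5):
    """Illustrative BLUP effect sizes under a weighted-kinship model.

    Hypothetical helper, not part of LDAK. Assumes a known heritability h2
    and a phenotypic variance scaled to one.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = X.shape[0]
    # Standardise each predictor to mean zero and variance one.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Weighted kinship matrix: K = Z W Z^T / sum(w).
    K = (Z * w) @ Z.T / w.sum()
    # Phenotypic covariance: V = h2 * K + (1 - h2) * I.
    V = h2 * K + (1.0 - h2) * np.eye(n)
    # BLUP of effects: beta_j = h2 * w_j / sum(w) * Z_j^T V^{-1} (y - mean(y)).
    return h2 * w / w.sum() * (Z.T @ np.linalg.solve(V, y - y.mean()))

# Simulated data (assumed for illustration): one causal predictor and one
# neighbour that tags it strongly but imperfectly.
rng = np.random.default_rng(0)
n = 200
causal = rng.normal(size=n)
neighbour = 0.9 * causal + np.sqrt(1 - 0.81) * rng.normal(size=n)
X = np.column_stack([causal, neighbour])
y = causal + rng.normal(size=n)

# Causal predictor given weighting zero versus all weightings ignored.
beta_weighted = blup_effects(X, y, weights=[0.0, 1.0])
beta_unweighted = blup_effects(X, y, weights=[1.0, 1.0])
print("with weightings:   ", beta_weighted)
print("ignoring weightings:", beta_unweighted)
```

With the weightings applied, the first entry of beta_weighted is zero by construction and the neighbour alone has to carry the signal. With equal weightings, the causal predictor itself can receive an effect.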

Therefore, we advise using --ignore-weights YES both when calculating kinship matrices to use in MultiBLUP and when supplying regional kinships to the REML solver. A consequence is that the estimates of variance components might be biased, but this is not a problem when the final goal is to estimate effect sizes.