LDAK can both add and subtract kinship matrices. These features are useful for constructing complementary kinship matrices, such as those required for leave-one-chromosome-out (LOCO) mixed-model analysis (see Single-Predictor Analysis). For example, to construct kinship matrices that use all SNPs except those on a single chromosome, we could first create per-chromosome kinship matrices, then add these to obtain a genome-wide kinship matrix, then subtract each per-chromosome kinship matrix in turn from the genome-wide matrix.
Always read the screen output, which suggests arguments and estimates memory usage.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Add kinship matrices:
The main argument is --add-grm <outfile>.
The only required option is
--mgrm <kinstems> - to provide kinship matrices
LDAK will add the kinship matrices specified in <kinstems> and save with stem <outfile>.
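As a minimal sketch of how the <kinstems> file might be prepared: the snippet below assumes two existing kinship matrices with the hypothetical stems kinA and kinB, and writes one stem per line (the same layout as list.All in the example further down). The output stem kinAB and the file name list.add are also made up for illustration, and LDAK is only called if ./ldak.out is present in the current directory.

```shell
# kinA and kinB are hypothetical stems of two existing kinship matrices
printf '%s\n' kinA kinB > list.add

# Call LDAK only if the executable is in the current directory;
# otherwise just print the command that would be run
if [ -x ./ldak.out ]; then
    ./ldak.out --add-grm kinAB --mgrm list.add
else
    echo "would run: ./ldak.out --add-grm kinAB --mgrm list.add"
fi
```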
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Subtract kinship matrices:
The main argument is --sub-grm <outfile>.
Then you must use either
--mgrm <kinstems> - to provide kinship matrices
or
--grm <kinship> and --extract <extractfile> or --exclude <excludefile> - to provide one kinship matrix and either a list of predictors to extract or a list of predictors to exclude.
If using --mgrm <kinstems>, LDAK will subtract the remaining kinship matrices from the first kinship matrix (e.g., if <kinstems> provides the stems of K kinship matrices, LDAK will subtract kinship matrices 2, 3, ..., K from the first kinship matrix).
If instead using --grm <kinship> and --extract <extractfile>, LDAK will exclude from the kinship matrix with stem <kinship> the contribution of predictors not in <extractfile>. Similarly, if using --grm <kinship> and --exclude <excludefile>, LDAK will exclude from the kinship matrix with stem <kinship> the contribution of predictors in <excludefile>. With either of these two options, it is necessary to use --bfile/--gen/--sp/--speed <datastem> to specify the genetic data files. Note that this approach is used in Adaptive MultiBLUP, to remove the contributions of regions from a genome-wide kinship matrix.
In all cases, the new kinship matrix will be saved with stem <outfile>.
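Because the order of stems matters when subtracting with --mgrm, it can help to write the list through a small helper that makes the order explicit. The following is only a sketch: write_sub_list is a hypothetical shell function (not part of LDAK), and kinAll, kin1, kin2 and list.sub are made-up names.

```shell
# write_sub_list is a hypothetical helper: the first stem given is the
# matrix subtracted from, every later stem is subtracted from it
write_sub_list() {
    out=$1
    shift
    printf '%s\n' "$@" > "$out"
}

# kinAll, kin1 and kin2 are hypothetical stems of existing kinship matrices
write_sub_list list.sub kinAll kin1 kin2
echo "subtracting from: $(head -n 1 list.sub)"

# The list could then be passed to LDAK, for example:
#   ./ldak.out --sub-grm kinRest --mgrm list.sub
```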
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Example:
Here we use the binary PLINK files human.bed, human.bim and human.fam from the Test Datasets.
We assume we are performing leave-one-chromosome-out (LOCO) mixed-model analysis. Therefore, for each chromosome, we need a kinship matrix constructed using only predictors on the other chromosomes (a complementary kinship matrix). We recommend doing this assuming a thinned version of the GCTA Model, for which we must first obtain a list of predictors in approximate linkage equilibrium:
./ldak.out --thin le --bfile human --window-prune .05 --window-cm 1
./ldak.out --calc-kins-direct le --bfile human --ignore-weights YES --power -1 --extract le.in
The list of thinned predictors is saved in le.in.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
First we create per-chromosome kinship matrices:
for j in {21..22}; do
./ldak.out --calc-kins-direct le$j --bfile human --ignore-weights YES --power -1 --extract le.in --chr $j
done
Then we add these to obtain a genome-wide kinship matrix:
rm -f list.All
for j in {21..22}; do echo "le$j" >> list.All; done
./ldak.out --add-grm leAll --mgrm list.All
Finally, we subtract each per-chromosome kinship matrix in turn from the genome-wide matrix:
for j in {21..22}; do
printf 'leAll\nle%s\n' "$j" > list.$j
./ldak.out --sub-grm leN$j --mgrm list.$j
done
The complementary kinship matrices are saved with stems leN21 and leN22. Note that in this script, we loop from 21 to 22, because our example dataset contains only these two chromosomes; usually you would loop from 1 to 22.
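For a dataset with all 22 autosomes, the list files for the add and subtract steps could be prepared as in the sketch below. It is a dry run under stated assumptions: it assumes per-chromosome kinship matrices with stems le1 to le22 have already been created as above, it uses seq rather than brace expansion so that it also runs in shells without that feature, and it prints the LDAK commands instead of executing them.

```shell
# Dry run: write the list files for chromosomes 1 to 22 and print the
# LDAK commands instead of executing them
: > list.All                          # create the file, or empty it if present
for j in $(seq 1 22); do
    echo "le$j" >> list.All
done
echo "./ldak.out --add-grm leAll --mgrm list.All"

for j in $(seq 1 22); do
    # genome-wide matrix first, then the chromosome to subtract
    printf '%s\n' leAll "le$j" > "list.$j"
    echo "./ldak.out --sub-grm leN$j --mgrm list.$j"
done
```

To run the commands for real, the two echo lines can be replaced by the commands they print.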
Instead of the final script, it would have been equivalent to run
rm -f chr{1..22}; awk '{print $2 > ("chr" $1)}' human.bim
for j in {21..22}
do ./ldak.out --sub-grm leN$j --grm leAll --exclude chr$j --bfile human
done
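To see what the awk command produces, the sketch below applies the same idea to a three-line toy file in PLINK .bim layout (chromosome in column 1, predictor name in column 2). The file toy.bim, the predictor names rs1 to rs3 and the output prefix toychr are all invented for illustration.

```shell
# Build a toy .bim file: chromosome, predictor name, genetic distance,
# base-pair position, allele 1, allele 2
printf '%s\n' \
    '21 rs1 0 1000 A G' \
    '21 rs2 0 2000 C T' \
    '22 rs3 0 1500 G A' > toy.bim

# Remove stale lists, then write each predictor name to a file named
# after its chromosome (toychr21, toychr22)
rm -f toychr21 toychr22
awk '{print $2 > ("toychr" $1)}' toy.bim

cat toychr21    # predictor names on chromosome 21
cat toychr22    # predictor names on chromosome 22
```

Each resulting file holds one predictor name per line, which is the form passed to --exclude in the loop above.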