Whenever using SumHer, the first step is to construct a Tagging File, which records the (relative) expected heritability tagged by each predictor. For this it is necessary to specify a heritability model, which describes the distribution of E[h^{2}_{j}], the expected heritability (uniquely) contributed by Predictor j. SumHer allows for models of the form

E[h^{2}_{j}] \propto w_{j} [f_{j}(1-f_{j})]^{(1+power)}

where w_{j} is the predictor weighting of SNP j and f_{j} is its minor allele frequency.

The GCTA/LDSC Model assumes E[h^{2}_{j}] is constant across all predictors; this is obtained by setting w_{j}=1 and power=-1, so that the exponent (1+power) is zero and the allele-frequency term drops out. However, we instead recommend using the LDAK Model, achieved by using the LDAK weightings and power=-0.25.
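As a worked illustration of the model form (a sketch only; the allele frequencies 0.1 and 0.5 are arbitrary values chosen for the example, and both predictors are assumed to have the same weighting), the frequency term disappears exactly when 1+power is zero, while power=-0.25 leaves an exponent of 0.75:

```latex
\begin{align*}
1+\text{power} = 0:\quad
  & E[h^{2}_{j}] \propto w_{j}\,[f_{j}(1-f_{j})]^{0} = w_{j} \\
\text{power} = -0.25:\quad
  & E[h^{2}_{j}] \propto w_{j}\,[f_{j}(1-f_{j})]^{0.75} \\
\text{ratio for } f = 0.1 \text{ vs } f = 0.5 \text{ (equal } w_{j}\text{)}:\quad
  & \left(\frac{0.1 \times 0.9}{0.5 \times 0.5}\right)^{0.75}
    = 0.36^{0.75} \approx 0.46
\end{align*}
```

So, under power=-0.25 and equal weightings, a predictor with minor allele frequency 0.1 is expected to contribute a little under half the heritability of a predictor with minor allele frequency 0.5.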

To use the LDAK heritability model, you must first Get Weightings (using your Reference Panel). When doing this, it is important you use --extract to ensure that LDAK only considers predictors for which summary statistics are available and --exclude to ensure LDAK ignores predictors within the major histocompatibility complex and those in LD with large-effect loci. If your aim is to estimate Genetic Correlations, and you are uncertain whether summary statistics are strand-aligned, we recommend you also exclude predictors with ambiguous alleles. See Summary Statistics for details on how to construct the required lists of predictors.
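One possible way to identify predictors with ambiguous alleles is sketched below. It assumes the Reference Panel has a PLINK-format .bim file, with the predictor name in column 2 and the two alleles in columns 5 and 6, and it treats A/T and C/G pairs as ambiguous. The function name list_ambiguous and the output file ambiguous.txt are hypothetical, chosen for illustration; the lists described in Summary Statistics remain the recommended route.

```shell
#!/bin/bash

# Print the names (column 2) of predictors in a PLINK .bim file whose two
# alleles (columns 5 and 6) are complementary, i.e. A/T or C/G.
# Alleles are upper-cased first so that lower-case codes are also caught.
list_ambiguous() {
    awk '{
        pair = toupper($5 $6)
        if (pair == "AT" || pair == "TA" || pair == "CG" || pair == "GC") print $2
    }' "$1"
}

# Hypothetical usage (file names are assumptions):
#   list_ambiguous ref.bim > ambiguous.txt
```

The resulting list can then be combined with the other predictors you wish to exclude before computing weightings.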

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Example: for this we assume the Reference Panel is stored in binary PLINK format in the files ref.bed, ref.bim and ref.fam, and we use the files height.txt, height.predictors and height.exclude constructed in the example for Summary Statistics.

The basic way to compute weightings is to run the following two commands in order

../ldak.out --cut-weights sumsect --bfile ref --extract height.predictors --exclude height.exclude

../ldak.out --calc-weights-all sumsect --bfile ref --extract height.predictors --exclude height.exclude

The final weightings will be saved in sumsect/weights.short. An easy way to speed up this process is to instead compute weightings separately for each chromosome, then merge. A possible script for use on a cluster would be

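#!/bin/bash

# Run the job under bash
#$ -S /bin/bash

# Array job with one task per autosome (1 to 22)
#$ -t 1-22

# The scheduler sets SGE_TASK_ID to the index of the current task
number=$SGE_TASK_ID

# Cut the predictors on this chromosome into sections
../ldak.out --cut-weights sumsect$number --bfile ref --extract height.predictors --exclude height.exclude --chr $number

# Calculate the weightings for this chromosome
../ldak.out --calc-weights-all sumsect$number --bfile ref --extract height.predictors --exclude height.exclude --chr $number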

When all 22 tasks have completed, we can merge weights across chromosomes using (the first command creates the folder sumsect if it does not already exist)

mkdir -p sumsect
cat sumsect{1..22}/weights.short > sumsect/weights.short
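If some cluster tasks fail, a plain cat would quietly produce an incomplete weights file. The sketch below is one way to guard against this: it checks that each of the 22 per-chromosome files exists and is non-empty before merging. The function name merge_weights and its two arguments (the folder prefix and the output folder) are hypothetical, introduced only for this example.

```shell
#!/bin/bash

# Merge per-chromosome weight files <prefix>1 ... <prefix>22 into
# <outdir>/weights.short, refusing to merge if any file is missing or empty.
merge_weights() {
    local prefix=$1 outdir=$2 chr

    # First pass: check that every chromosome produced a non-empty file
    for chr in {1..22}; do
        if [ ! -s "${prefix}${chr}/weights.short" ]; then
            echo "missing or empty: ${prefix}${chr}/weights.short" >&2
            return 1
        fi
    done

    # Second pass: concatenate in chromosome order
    mkdir -p "$outdir"
    for chr in {1..22}; do
        cat "${prefix}${chr}/weights.short"
    done > "$outdir/weights.short"
}

# Hypothetical usage, matching the folder names in the example above:
#   merge_weights sumsect sumsect
```

If the function reports a missing file, rerun the corresponding task before merging.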