Equalise Tagging

The LDAK weightings are designed to equalise the tagging of SNPs across the genome; SNPs in regions of high linkage disequilibrium (LD) will tend to get low weightings, and vice versa. The way LDAK calculates the weightings is described in full in our paper "Improved heritability estimation from genome-wide SNPs" (AJHG, 2012). Very briefly, the method first assesses patterns of local LD by calculating a matrix of local pairwise squared correlations between SNPs. Row i of this matrix indicates to what extent the signal of SNP i is replicated by its neighbouring SNPs, so the sum of the values in Row i reflects the total amount by which the signal of SNP i is replicated. Based on this matrix, LDAK determines SNP weightings so that, for each SNP i, the sum of the values in Row i, each multiplied by the weighting of the corresponding SNP, equals (approximately) one. Originally, these weightings were calculated using the simplex method (linear optimization); however, we subsequently switched to a quadratic solver (the two approaches result in very similar weightings, but the latter is more efficient).
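
To make the condition above concrete, here is a minimal sketch in Python. It is not LDAK's solver: it swaps in plain projected gradient descent on the squared error, and it assumes the weightings are constrained to be non-negative. The function name `equalising_weights`, the iteration count and the three-SNP matrix are hypothetical, chosen only for illustration.

```python
def equalising_weights(sq_corr, n_iter=2000):
    """Approximately solve sq_corr @ w = 1 subject to w >= 0.

    Illustrative only: projected gradient descent on 0.5 * ||C w - 1||^2,
    assuming sq_corr is a symmetric matrix of squared correlations.
    """
    n = len(sq_corr)
    # Step size 1/L. For a symmetric matrix with non-negative entries, the
    # largest row sum bounds its largest eigenvalue, so its square bounds L.
    step = 1.0 / max(sum(row) for row in sq_corr) ** 2
    w = [1.0] * n  # start from even weightings
    for _ in range(n_iter):
        # How far each SNP's total weighted tagging is from one.
        resid = [sum(sq_corr[i][j] * w[j] for j in range(n)) - 1.0
                 for i in range(n)]
        # Gradient of the squared error with respect to each weighting.
        grad = [sum(sq_corr[i][j] * resid[i] for i in range(n))
                for j in range(n)]
        # Gradient step, then project back onto w >= 0 (assumed constraint).
        w = [max(0.0, w[j] - step * grad[j]) for j in range(n)]
    return w


# Hypothetical example: SNPs 1 and 2 are in high LD (squared correlation 0.8),
# while SNP 3 is independent of both.
sq_corr = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
weights = equalising_weights(sq_corr)
# Weighted row sums, which the weightings should bring close to one.
row_sums = [sum(c * w for c, w in zip(row, weights)) for row in sq_corr]
```

In this example the two correlated SNPs each receive a weighting of about 0.56 (that is, 1/1.8), while the independent SNP keeps a weighting of one, so every weighted row sum is approximately one.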
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

The following picture describes the intuition behind the LDAK weightings.

In this toy example, there are nine SNPs, X1, X2, ..., X9, with corresponding weightings w1, w2, ..., w9. SNPs 1 & 2 are highly correlated, as are SNPs 4 & 5, and also SNPs 6, 7, 8 & 9. Therefore, we view the nine SNPs as tagging only four distinct sources of underlying variation: U1 (SNPs 1 & 2), U2 (SNP 3), U3 (SNPs 4 & 5) and U4 (SNPs 6, 7, 8 & 9). If the SNPs were weighted evenly, then U4 would get twice as much weighting as U1 & U3, and four times as much weighting as U2 (because U4 is tagged by twice as many SNPs as U1 & U3, and four times as many SNPs as U2). The LDAK weightings might instead set w1=w2=1/2, w3=1, w4=w5=1/2 and w6=w7=w8=w9=1/4, so that the total weighting for each source of underlying variation is one.
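
The arithmetic of the toy example can be checked with a few lines of Python. This is only a sketch of the idealised case in which each SNP tags exactly one source; the variable names are hypothetical, and real weightings come from the squared-correlation matrix rather than from known sources.

```python
from collections import Counter

# Which source of underlying variation each SNP tags (from the toy example).
source_of = {1: "U1", 2: "U1", 3: "U2", 4: "U3", 5: "U3",
             6: "U4", 7: "U4", 8: "U4", 9: "U4"}

# Number of SNPs tagging each source.
snps_per_source = Counter(source_of.values())

# Even weighting (every w = 1): a source's total weighting is simply the
# number of SNPs that tag it.
even_totals = dict(snps_per_source)

# Equalised weighting: each SNP gets 1 / (number of SNPs tagging its source).
weights = {snp: 1.0 / snps_per_source[src] for snp, src in source_of.items()}

# Total weighting received by each source under the equalised weightings.
equal_totals = {
    src: sum(w for snp, w in weights.items() if source_of[snp] == src)
    for src in snps_per_source
}
```

Here `even_totals` gives U4 a total of four against one for U2, whereas every entry of `equal_totals` is one.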