BLD-LDAK Annotations

The BLD-LDAK and BLD-LDAK+Alpha Models use 65 SNP annotations. Click here to download the first 64 annotations (warning: the file is 660 MB). For details of how to use these annotations (including how to create the final annotation), see Calculate Taggings.

Note that the predictor names are in the form Chr:BP, where the positions are from the GRCh37/hg19 assembly. Therefore, to use these annotations, you should first ensure the genomic positions in your genetic data files are also from the GRCh37/hg19 assembly. If not, you can update them using the LiftOver Tool. Next, you should ensure the predictor names in your data files are in the form Chr:BP. As an example, here is a script to change the predictor names in a bim file called data.bim.

mv data.bim data.bim.old
awk < data.bim.old '{$2=$1":"$4;print $0}' > data.bim
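
After renaming, it can be worth confirming that every predictor name now has the form Chr:BP and that no name occurs twice (two predictors at the same position, for example at a multi-allelic site, would receive the same name). The following is a minimal sketch; the function check_bim_names is a hypothetical helper, not part of LDAK, and it assumes the standard six-column bim layout with the chromosome in column 1, the name in column 2 and the position in column 4.

```shell
# Hypothetical helper (not part of LDAK): summarize naming problems in a bim file.
check_bim_names() {
    # Count predictors whose name (column 2) is not Chr:BP (columns 1 and 4)
    awk '($2 != $1":"$4){n++} END{print n+0, "names not in Chr:BP form"}' "$1"
    # Count names that are used by more than one predictor
    awk '{seen[$2]++} END{for(k in seen){if(seen[k]>1){d++}} print d+0, "duplicated names"}' "$1"
}
```

For example, running check_bim_names data.bim prints two summary lines; both counts should be zero before the annotations are used.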
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

We obtained the 64 annotations as follows:

First we downloaded and unpacked the archive 1000G_Phase3_baselineLD_v1.1_ldscores.tgz from https://data.broadinstitute.org/alkesgroup/LDSCORE. Within the resulting folder, the .annot.gz files contain the 74 annotations of the Baseline LD Model. The BLD-LDAK Model uses Annotations 1-58 and 69-74.

We extracted all 74 annotations (plus Annotation 0, the base category) using the following commands:
rm -f bld0 base{1..74}
for j in {1..22}; do gunzip -c baselineLD_v1.1/baselineLD.$j.annot.gz | awk '(NR>1){for(j=1;j<=74;j++){if($(5+j)!=0){print $1":"$2, $(5+j) >> ("base" j)}}print $1":"$2 >> "bld0"}'; done
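
To illustrate what the extraction command writes, here is a small sketch that applies the same awk logic to a toy file with two annotations instead of 74. The file names (toy.annot.gz, toy_base1, toy_base2, toy_bld0) and all values are made up for illustration, and the column layout (CHR, BP, SNP, CM, base, then the annotations) is an assumption based on the indexing used in the command above.

```shell
# Toy input: a header line, then CHR BP SNP CM base followed by two annotations
printf 'CHR BP SNP CM base A1 A2\n1 100 rs1 0 1 1 0\n1 200 rs2 0 1 0 0.5\n' | gzip > toy.annot.gz

# Remove old outputs first, because the awk command appends to its output files
rm -f toy_bld0 toy_base1 toy_base2

# Same logic as above: skip the header, write Chr:BP plus the value for each
# non-zero annotation, and write every Chr:BP to the base category file
gunzip -c toy.annot.gz | awk '(NR>1){for(j=1;j<=2;j++){if($(5+j)!=0){print $1":"$2, $(5+j) >> ("toy_base" j)}}print $1":"$2 >> "toy_bld0"}'

cat toy_base1   # predictors with a non-zero value for the first toy annotation
cat toy_base2   # predictors with a non-zero value for the second toy annotation
cat toy_bld0    # every predictor
```

Each annotation file therefore lists only the predictors with a non-zero value, one per line, while the base category file lists all predictors.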

Then we excluded the 10 MAF bins using these two commands:
for j in {1..58}; do cp base$j bld$j; done
for j in {59..64}; do cp base$((10+j)) bld$j; done
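
A quick check after these steps is to confirm that all 64 annotation files exist and are non-empty. This is only a sketch: check_bld_files is a hypothetical helper, not part of LDAK, and it assumes the files bld1 to bld64 are in the current directory.

```shell
# Hypothetical helper (not part of LDAK): report annotation files that are missing or empty.
check_bld_files() {
    missing=0
    for j in $(seq 1 64); do
        # -s is true only if the file exists and has a size greater than zero
        if [ ! -s "bld$j" ]; then
            echo "bld$j is missing or empty"
            missing=$((missing+1))
        fi
    done
    echo "$missing of 64 annotation files missing or empty"
}
```

Running check_bld_files in the directory containing the annotations should report that 0 of 64 files are missing or empty.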
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Below are the names of the 66 annotations in the BLD-LDAK Model (the 64 provided here, plus the LDAK Weightings and the base category). Note that the 26 marked with a * are binary annotations that can be used for estimating Heritability Enrichments.

1 Coding_UCSC*
2 Coding_UCSC.extend.500
3 Conserved_LindbladToh*
4 Conserved_LindbladToh.extend.500
5 CTCF_Hoffman*
6 CTCF_Hoffman.extend.500
7 DGF_ENCODE*
8 DGF_ENCODE.extend.500
9 DHS_peaks_Trynka
10 DHS_Trynka*
11 DHS_Trynka.extend.500
12 Enhancer_Andersson*
13 Enhancer_Andersson.extend.500
14 Enhancer_Hoffman*
15 Enhancer_Hoffman.extend.500
16 FetalDHS_Trynka*
17 FetalDHS_Trynka.extend.500
18 H3K27ac_Hnisz*
19 H3K27ac_Hnisz.extend.500
20 H3K27ac_PGC2*
21 H3K27ac_PGC2.extend.500
22 H3K4me1_peaks_Trynka
23 H3K4me1_Trynka*
24 H3K4me1_Trynka.extend.500
25 H3K4me3_peaks_Trynka
26 H3K4me3_Trynka*
27 H3K4me3_Trynka.extend.500
28 H3K9ac_peaks_Trynka
29 H3K9ac_Trynka*
30 H3K9ac_Trynka.extend.500
31 Intron_UCSC*
32 Intron_UCSC.extend.500
33 PromoterFlanking_Hoffman*
34 PromoterFlanking_Hoffman.extend.500
35 Promoter_UCSC*
36 Promoter_UCSC.extend.500
37 Repressed_Hoffman*
38 Repressed_Hoffman.extend.500
39 SuperEnhancer_Hnisz*
40 SuperEnhancer_Hnisz.extend.500
41 TFBS_ENCODE*
42 TFBS_ENCODE.extend.500
43 Transcr_Hoffman*
44 Transcr_Hoffman.extend.500
45 TSS_Hoffman*
46 TSS_Hoffman.extend.500
47 UTR_3_UCSC*
48 UTR_3_UCSC.extend.500
49 UTR_5_UCSC*
50 UTR_5_UCSC.extend.500
51 WeakEnhancer_Hoffman*
52 WeakEnhancer_Hoffman.extend.500
53 Super_Enhancer_Vahedi*
54 Super_Enhancer_Vahedi.extend.500
55 Typical_Enhancer_Vahedi*
56 Typical_Enhancer_Vahedi.extend.500
57 GERP.NS
58 GERP.RSsup4
59 MAF_Adj_Predicted_Allele_Age
60 MAF_Adj_LLD_AFR
61 Recomb_Rate_10kb
62 Nucleotide_Diversity_10kb
63 Backgrd_Selection_Stat
64 CpG_Content_50kb
65 LDAK_Weightings
66 Base_Category