---------------------------- ADVANCED ANALYSIS ----------------------------

Here we sketch some of the other analyses performed in the main text, as well as some additional features of LDAK.

If including lower-quality SNPs, simply add --infos when computing kinships. The genotype scaling can be varied using the option --power. For example, to use the previous default scaling, use --power -1.

After performing REML, the .share file provides relative estimates. These are useful when interested in relative contributions (e.g., of different SNP classes). The .reml file contains the null and alternative log likelihoods and a likelihood ratio test (LRT) statistic (the null model corresponds to only covariates). To test significance, we typically computed the difference in LRT statistics between the partitioned and non-partitioned models (e.g., when comparing the GCTA and LDAK Models, we performed REML using just the genome-wide kinship matrix, then using two kinships, one computed from low-LD SNPs, the other from high-LD SNPs).

To test the contribution of DNaseI hypersensitivity sites (DHS), we downloaded DHS annotations from hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeRegDnaseClustered/wgEncodeRegDnaseClusteredV3.bed.gz. From these we created dhs.txt, which has four columns providing a unique identifier, chromosome, and start and end basepairs for each site.

Note that if two-way SNP partitioning results in a very uneven divide of SNPs, then instead of computing both kinships from scratch, it can be much quicker to first compute kinships for the smaller tranche, then subtract these from the (previously computed) genome-wide kinships using the option --sub-grm. We did this when testing DHS, as these contain less than 20% of the genome. A similar strategy can be used when considering more than two partitions.
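The four-column dhs.txt described above could be built from the downloaded BED file roughly as follows. This is only a sketch under stated assumptions: the helper name bed_to_genefile, the identifier scheme DHS_<n>, stripping the "chr" prefix, and shifting the 0-based BED start to 1-based are all illustrative choices, not taken from the original analysis. Check them against the chromosome labels and coordinates used in your own data.

```shell
# Hypothetical helper: convert BED records on stdin to four columns
# (identifier, chromosome, start, end).
# Assumptions (not from the original): identifiers of the form DHS_<n>,
# the "chr" prefix is stripped, and the 0-based BED start becomes 1-based.
bed_to_genefile() {
    awk '{ chr = $1; sub(/^chr/, "", chr); print "DHS_" NR, chr, $2 + 1, $3 }'
}

# Small made-up example standing in for the real annotation file.
printf 'chr1\t10000\t10150\nchr2\t500\t720\n' | bed_to_genefile > dhs_example.txt
cat dhs_example.txt
```

With the real annotations, the input would instead come from something like gunzip -c wgEncodeRegDnaseClusteredV3.bed.gz piped into the same helper, with the output written to dhs.txt.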
------------------------------------------------------------------------------
#To calculate enrichment of DHS, first compute kinships from DHS SNPs, then its complement
#(i.e., subtract from genome-wide kinships), then perform two-way REML
ldak --cut-genes dhs --speed data --genefile dhs.txt --ignore-weights YES
ldak --calc-kins-direct dhs --speed data --extract dhs/genes.predictors.used --weights sections/weights.all --power -0.25
echo -e "kinships/kinshipALL\ndhs" > listsub
ldak --sub-grm not_dhs --mgrm listsub
echo -e "dhs\nnot_dhs" > listdhs
ldak --reml dhs --mgrm listdhs --pheno phen.pheno --covar covar.covar --keep prune.keep \
  --top-preds top.in --speed data

#If including rare variants, then partition based on MAF
#Exact boundaries are not too important, but consider extra boundaries (say at 0.00025) if using very rare SNPs
#Likely some of the rare variants will be low quality, so add '--infos'
awk < data.stats '(NR>1 && $3>0.1){print $1}' > maf1
awk < data.stats '(NR>1 && $3>0.01 && $3<=0.1){print $1}' > maf2
awk < data.stats '(NR>1 && $3>0.0025 && $3<=0.01){print $1}' > maf3
awk < data.stats '(NR>1 && $3>0.001 && $3<=0.0025){print $1}' > maf4
awk < data.stats '(NR>1 && $3<=0.001){print $1}' > maf5
ldak --cut-kins rare --speed data --partition-number 5 --partition-prefix maf
for j in {1..5}; do
  ldak --calc-kins rare --speed data --partition $j --weights sections/weightsALL \
    --power -0.25 --infos data.infos
done
------------------------------------------------------------------------------
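Because the exact MAF boundaries may need adjusting (for example, adding one at 0.00025), it can be convenient to produce the partition lists in a single pass with the boundaries supplied as arguments. The sketch below is one possible way to do so. The helper name maf_bins and the example file contents are hypothetical. It assumes, as in the awk commands above, that the stats file has one header row, the SNP name in column 1 and the MAF in column 3.

```shell
# Hypothetical helper: split SNPs into MAF bins in one pass.
# Usage: maf_bins <stats-file> <prefix> <boundary>...  (boundaries in descending order)
# Bin i holds SNPs with MAF > boundary i (and <= boundary i-1);
# the last bin holds SNPs with MAF <= the smallest boundary.
maf_bins() {
    stats=$1; prefix=$2; shift 2
    awk -v prefix="$prefix" -v bounds="$*" '
        BEGIN { n = split(bounds, b, " ") }
        NR > 1 {
            bin = n + 1                       # default: rarest bin
            for (i = 1; i <= n; i++) if ($3 > b[i]) { bin = i; break }
            print $1 > (prefix bin)           # e.g. maf1, maf2, ...
        }' "$stats"
}

# Made-up example data (the header names are placeholders).
printf 'Predictor A1 MAF\nrs1 A 0.3\nrs2 C 0.05\nrs3 G 0.005\nrs4 T 0.002\nrs5 A 0.0005\n' > example.stats
maf_bins example.stats examplemaf 0.1 0.01 0.0025 0.001
wc -l examplemaf*
```

Note that a bin containing no SNPs produces no file, so the number of files may be smaller than the number of partitions requested. Check this before passing --partition-number to ldak.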