Genomic Partitioning

Genomic partitioning enables us to investigate the genetic architecture of complex traits (see the first application here). It involves estimating the heritability contributions of subsets of predictors in order to better understand the distribution of causal variants. These subsets will normally be disjoint (hence the term partitioning), but need not be. For example, we might compare the relative contributions of genic and inter-genic SNPs, in which case we would partition the genome into SNPs inside and outside genes; alternatively, we might perform a pathway analysis using overlapping subsets of SNPs.

To perform genomic partitioning, use the options --partition-number and --partition-prefix when calculating kinships. For example, create index files list1, list2 and list3, then when cutting the genome add --partition-number 3 and --partition-prefix list.
_ _ _ _ _ _ _ _ _ _ _ __ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

When performing genomic partitioning, the weightings should ideally be calculated over the union of all subsets being considered. If this union contains almost all predictors, it should suffice to use weightings calculated over all predictors; otherwise it is necessary to compute new weightings. Consider the following scenarios, which correspond to the binary PLINK files test.bed, test.bim and test.fam in the Test Datasets (in total, there are 5000 predictors).

A - we wish to compare h2 from predictors 1-2500 and from predictors 2501-5000. The union of these two subsets is 1-5000, so we can simply use weightings calculated over all 5000 predictors.

B - we wish to compare h2 from predictors 1-2499 and from predictors 2501-5000. The union of these two subsets no longer includes predictor 2500, so strictly we should recalculate weightings over predictors {1-2499,2501-5000}. However, accuracy is unlikely to suffer much if we instead use weightings calculated over all predictors.

C - we wish to compare h2 from {1-350}, {300-500} and {2801-5000}. Now, the union of regions is considerably different from all predictors, so we should first recalculate weightings using only the predictors {1-500,2801-5000} (the union of the three subsets).
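
The subsets in Scenario C, and their union, could be written to the files list1, list2, list3 and listALL (the names used in the example that follows) with a short shell sketch like the one below. It rests on assumptions not stated on this page: that a list file holds one predictor name per line, that "predictor k" means the k-th row of test.bim, and that the name is taken from the second column of the .bim file. The function names rows_to_list and make_scenario_c_lists are hypothetical helpers, not part of LDAK.

```shell
# Hypothetical helper: print the predictor names found in rows START..END
# of a PLINK .bim file (column 2 of a .bim file is the predictor name).
# Assumption: "predictor k" refers to the k-th row of the file.
rows_to_list() {
    # $1 = .bim file, $2 = first row, $3 = last row
    awk -v s="$2" -v e="$3" 'NR >= s && NR <= e { print $2 }' "$1"
}

# Hypothetical helper: build the three Scenario C subsets and their union.
make_scenario_c_lists() {
    bim="$1"
    rows_to_list "$bim" 1 350     > list1
    rows_to_list "$bim" 300 500   > list2
    rows_to_list "$bim" 2801 5000 > list3
    # Union of the three subsets; the awk filter keeps the first copy of
    # each name, so the overlap between list1 and list2 appears only once.
    cat list1 list2 list3 | awk '!seen[$0]++' > listALL
}
```

Running make_scenario_c_lists test.bim in the directory holding the test files would then create the four list files. Comparing the number of lines in listALL with the number of rows in test.bim gives a quick sense of how far the union falls short of all predictors (2700 of 5000 in Scenario C).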
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Example (corresponding to Scenario C above): for this we use the binary PLINK files test.bed, test.bim and test.fam available in the Test Datasets, as well as the lists of predictors list1, list2 and list3, and their union listALL. First calculate weightings.

../ldak.out --cut-weights genpart --bfile test --extract listALL
../ldak.out --calc-weights-all genpart --bfile test --extract listALL
../ldak.out --join-weights genpart --extract listALL

Weightings will be saved in genpart/weights.all and genpart/weights.short. Now calculate kinships.

../ldak.out --cut-kins genpart --bfile test --partition-number 3 --partition-prefix list
../ldak.out --calc-kins genpart --bfile test --partition 1 --weights genpart/weights.short --power -0.25
../ldak.out --calc-kins genpart --bfile test --partition 2 --weights genpart/weights.short --power -0.25
../ldak.out --calc-kins genpart --bfile test --partition 3 --weights genpart/weights.short --power -0.25
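
The three --calc-kins calls above differ only in the partition index, so with more partitions it may be convenient to run them in a loop. The sketch below reuses exactly the arguments shown above; the LDAK variable and the function name calc_partition_kins are hypothetical conveniences introduced here, not LDAK features.

```shell
# Path to the LDAK binary; defaults to the location used on this page.
LDAK="${LDAK:-../ldak.out}"

# Hypothetical helper: run --calc-kins once for each partition 1..N,
# with the same folder, data files, weightings and power as above.
calc_partition_kins() {
    n="$1"   # number of partitions (the value given to --partition-number)
    p=1
    while [ "$p" -le "$n" ]; do
        "$LDAK" --calc-kins genpart --bfile test --partition "$p" \
            --weights genpart/weights.short --power -0.25
        p=$((p + 1))
    done
}
```

For the example here, calc_partition_kins 3 would issue the same three commands as listed above.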

The three kinship matrices will be saved with stems genpart/kinships.1, genpart/kinships.2 and genpart/kinships.3. We can then use REML to estimate the heritability contributions of the three subsets of predictors.