Simulations

LDAK can generate both phenotypic and genotypic data. Simulated phenotypes are useful for testing the accuracy of SNP-based heritability analysis under different phenotypic models.

The argument for generating phenotypic values is

--make-phenos <output>

which requires the following options (to restrict to a subset of data, see Data Filtering):

--bfile/--gen/--sp/--speed <prefix> - to specify data files (see File Formats).

--weights <weightsfile> and --power <float> - to specify the Heritability Model.

--her <float> - to specify the heritability for the simulated phenotypes (i.e., the proportion of total phenotypic variation explained by the genetic contribution).

--num-phenos <integer> - to specify the number of phenotypes to generate.

--num-causals <integer> - to specify the number of predictors contributing to each phenotype (to specify that all predictors are causal, use --num-causals -1).

LDAK will first compute breeding values (the genetic contributions) and save these to <output>.breed; then it will add noise to produce phenotypes with the desired heritability, and save these to <output>.pheno (both files will be in PLINK format). The list of causal predictors and effects will be stored in <output>.effects.
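The two-step process described above (breeding values first, then noise) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LDAK's implementation: the genotypes are made-up 0/1/2 values, no weights or power scaling are applied, and the function name add_noise is hypothetical. The noise variance is chosen so that the ratio of genetic variance to total variance is approximately the requested heritability.

```python
import random
import statistics


def add_noise(breeding, her, rng):
    """Return phenotypes = breeding value + Gaussian noise, with the noise
    variance chosen so that var(breeding) / var(phenotype) is roughly `her`."""
    if not 0 < her <= 1:
        raise ValueError("her must be in (0, 1]")
    var_g = statistics.pvariance(breeding)
    # var_g / (var_g + var_e) = her  =>  var_e = var_g * (1 - her) / her
    sd_e = (var_g * (1 - her) / her) ** 0.5
    return [g + rng.gauss(0.0, sd_e) for g in breeding]


rng = random.Random(1)

# Hypothetical toy data: 2000 samples, 3 causal predictors coded 0/1/2.
n_samples, n_causals = 2000, 3
genotypes = [[rng.choice([0, 1, 2]) for _ in range(n_causals)]
             for _ in range(n_samples)]

# Effect sizes sampled from a standard normal distribution.
effects = [rng.gauss(0.0, 1.0) for _ in range(n_causals)]

# Breeding value = genetic contribution = sum of genotype * effect.
breeding = [sum(x * b for x, b in zip(row, effects)) for row in genotypes]

phenos = add_noise(breeding, her=0.5, rng=rng)

# Realized heritability in this sample; close to, but not exactly, 0.5.
realized = statistics.pvariance(breeding) / statistics.pvariance(phenos)
print(round(realized, 3))
```

The realized value fluctuates around the target because the noise is sampled; it gets closer to the requested heritability as the number of samples grows.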

By default, LDAK will pick causal predictors at random; if you would prefer to specify which predictors are causal for each phenotype, use --causals <causalsfile>. Similarly, LDAK will by default sample effect sizes from a standard normal distribution; to instead specify the effect sizes use --effects <effectsfile>. Both <causalsfile> and <effectsfile> should be text files with one row per phenotype and one column per causal predictor.

To generate binary phenotypes, add the option --prevalence <float>. LDAK will then treat the just-generated phenotypes as liabilities, so that individuals with liability above the threshold Inverse_CDF(1 - prevalence), i.e., the point of the normal distribution with a fraction equal to the prevalence lying above it, will become cases (phenotype 2), while those below this threshold will become controls (phenotype 1). These binary phenotypes will be stored in <output>.pheno, while the liabilities will be stored in <output>.liab.
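The liability-threshold step can be illustrated as follows. This is a minimal sketch, not LDAK's code: it assumes the liabilities are standardized before thresholding and that the threshold is the upper-tail quantile of the standard normal distribution, so that a fraction of roughly `prevalence` of individuals become cases. The function name liabilities_to_binary is hypothetical.

```python
import random
import statistics
from statistics import NormalDist


def liabilities_to_binary(liabilities, prevalence):
    """Code individuals as cases (2) or controls (1) by thresholding
    standardized liabilities at the upper-tail normal quantile."""
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must be in (0, 1)")
    mean = statistics.fmean(liabilities)
    sd = statistics.pstdev(liabilities)
    # Assumption: a fraction `prevalence` of a standard normal lies above this.
    threshold = NormalDist().inv_cdf(1 - prevalence)
    return [2 if (x - mean) / sd > threshold else 1 for x in liabilities]


rng = random.Random(3)
liab = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # toy liabilities
binary = liabilities_to_binary(liab, prevalence=0.1)

# Observed fraction of cases; approximately the requested prevalence.
case_fraction = binary.count(2) / len(binary)
print(round(case_fraction, 3))
```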

To generate correlated phenotypes, add the option --bivar <float>; LDAK will then generate pairs of traits with the specified genetic correlation. For example, if you use --her 0.8, --num-phenos 4 and --bivar 0.5, then LDAK will generate four phenotypes, each with heritability 0.8, such that Phenotypes 1 and 2 will have genetic correlation 0.5, and likewise Phenotypes 3 and 4 (whereas the phenotype pairs 1 and 3, 1 and 4, 2 and 3, and 2 and 4 will be uncorrelated).
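One common way to obtain a pair of traits with a given genetic correlation is to give both traits the same causal predictors and draw correlated effect sizes. The text above does not say how LDAK does this, so the sketch below is an assumption for illustration only; the helper names correlated_effects and pearson are hypothetical.

```python
import math
import random


def correlated_effects(n_causals, rho, rng):
    """Draw two vectors of standard normal effects with correlation rho."""
    first = [rng.gauss(0.0, 1.0) for _ in range(n_causals)]
    indep = [rng.gauss(0.0, 1.0) for _ in range(n_causals)]
    # Mix the first vector with independent noise to reach correlation rho.
    second = [rho * a + math.sqrt(1 - rho ** 2) * z
              for a, z in zip(first, indep)]
    return first, second


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(7)

# Two independent pairs, mirroring --num-phenos 4 with --bivar 0.5.
e1, e2 = correlated_effects(5000, 0.5, rng)
e3, e4 = correlated_effects(5000, 0.5, rng)

r12 = pearson(e1, e2)  # within a pair: close to 0.5
r13 = pearson(e1, e3)  # across pairs: close to 0
print(round(r12, 2), round(r13, 2))
```

Because the two pairs are drawn independently, effects from different pairs are uncorrelated, which matches the pattern described for Phenotypes 1 to 4.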
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

The argument for generating SNP values is

--make-snps <output>

which requires the following two options:

--num-samples <integer> - to specify the number of samples.

--num-snps <integer> - to specify the number of SNPs.

LDAK will create a dataset, saved in binary PLINK format to <output>.bed, <output>.bim and <output>.fam. The generation process is deliberately simple: it assumes Hardy-Weinberg equilibrium and linkage equilibrium (i.e., SNPs are independent of one another). The default MAF range of SNPs is 0 to 0.5, but this can be changed using --maf-low <float> and --maf-high <float>.
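To make the two equilibrium assumptions concrete, the following sketch simulates genotypes in plain Python. It is not LDAK's implementation: in particular, drawing each MAF uniformly from the allowed range is an assumption (the text above gives only the range), and the function simulate_snps is hypothetical. Hardy-Weinberg equilibrium is represented by drawing the two alleles of each genotype independently, and linkage equilibrium by drawing each SNP independently of the others.

```python
import random


def simulate_snps(num_samples, num_snps, maf_low=0.0, maf_high=0.5, rng=None):
    """Return (mafs, genotypes); genotypes[i][j] counts minor alleles (0/1/2)
    for sample i at SNP j."""
    rng = rng or random.Random()
    # Assumption: one MAF per SNP, uniform on [maf_low, maf_high].
    mafs = [rng.uniform(maf_low, maf_high) for _ in range(num_snps)]
    genotypes = [
        # Two independent allele draws per SNP (Hardy-Weinberg equilibrium);
        # SNPs are drawn independently of each other (linkage equilibrium).
        [int(rng.random() < f) + int(rng.random() < f) for f in mafs]
        for _ in range(num_samples)
    ]
    return mafs, genotypes


rng = random.Random(11)
mafs, genotypes = simulate_snps(2000, 5, maf_low=0.1, maf_high=0.5, rng=rng)

# Observed allele frequency of the first SNP; close to its simulated MAF.
freq0 = sum(row[0] for row in genotypes) / (2 * len(genotypes))
print(round(mafs[0], 3), round(freq0, 3))
```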