LDAK allows generation of phenotypic and genotypic data. The former can be used for testing the accuracy of SNP-based heritability analysis for different phenotypic models.
The argument for generating phenotypic values is
--make-phenos <output>
which requires the following options (to restrict to a subset of data, see Data Filtering):
--bfile/--gen/--sp/--speed <prefix> - to specify data files (see File Formats).
--weights <weightsfile> and --power <float> - to specify the Heritability Model.
--her <float> - to specify the heritability for the simulated phenotypes (i.e., the proportion of total phenotypic variation explained by the genetic contribution).
--num-phenos <integer> - to specify the number of phenotypes to generate.
--num-causals <integer> - to specify the number of predictors contributing to each phenotype (to specify that all predictors are causal, use --num-causals -1).
LDAK will first compute breeding values (the genetic contributions) and save these to <output>.breed; then it will add noise to produce phenotypes with the desired heritability, and save these to <output>.pheno (both files will be in PLINK format). The list of causal predictors and effects will be stored in <output>.effects.
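The two-step process above (breeding values first, then noise) can be illustrated with a minimal Python sketch. This is not LDAK's implementation: the function name simulate_phenotype, the toy genotypes and the choice to rescale the breeding values to variance exactly equal to the heritability are all assumptions made for illustration.

```python
import random
import statistics


def simulate_phenotype(genotypes, effects, her, rng):
    """Return (breeding_values, phenotypes) for one trait.

    genotypes: one row per sample, one allele count (0/1/2) per causal predictor.
    effects:   one effect size per causal predictor.
    her:       target heritability, strictly between 0 and 1.
    """
    # Raw genetic contribution: effect-weighted sum of causal genotypes.
    raw = [sum(g * b for g, b in zip(row, effects)) for row in genotypes]

    # Rescale so the breeding values have mean 0 and variance `her`
    # (an assumption of this sketch, not necessarily what LDAK does).
    mean = statistics.fmean(raw)
    sd = statistics.pstdev(raw)
    breed = [(v - mean) / sd * her ** 0.5 for v in raw]

    # Add independent Gaussian noise with variance 1 - her.
    noise_sd = (1.0 - her) ** 0.5
    pheno = [b + rng.gauss(0.0, noise_sd) for b in breed]
    return breed, pheno


rng = random.Random(1)
num_samples, num_causals = 4000, 20
# Toy genotypes: two independent allele draws with frequency 0.3 (hypothetical data).
genotypes = [[(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(num_causals)]
             for _ in range(num_samples)]
# Effect sizes sampled from a standard normal distribution.
effects = [rng.gauss(0.0, 1.0) for _ in range(num_causals)]

breed, pheno = simulate_phenotype(genotypes, effects, her=0.5, rng=rng)
# Realised heritability: genetic variance divided by phenotypic variance.
print(statistics.pvariance(breed) / statistics.pvariance(pheno))
```

The printed ratio should be close to the requested heritability, with the difference shrinking as the number of samples grows.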
By default, LDAK will pick causal predictors at random; if you would prefer to specify which predictors are causal for each phenotype, use --causals <causalsfile>. Similarly, LDAK will by default sample effect sizes from a standard normal distribution; to instead specify the effect sizes use --effects <effectsfile>. Both <causalsfile> and <effectsfile> should be text files with one row per phenotype and one column per causal predictor.
To generate binary phenotypes, add the option --prevalence. LDAK will then treat the just-generated phenotypes as liabilities, so that individuals with value above Inverse_CDF(prevalence) will become cases (phenotype 2), while those below this threshold will become controls (phenotype 1). These binary phenotypes will be stored in <output>.pheno, while the liabilities will be stored in <output>.liab.
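The liability-threshold step can be sketched as follows. The sketch assumes that liabilities are standard normal and that the threshold written above as Inverse_CDF(prevalence) is the upper-tail point, i.e. the (1 - prevalence) quantile, so that the expected fraction of cases equals the prevalence. The function name liabilities_to_binary is hypothetical, and LDAK's own calculation may differ.

```python
import random
from statistics import NormalDist


def liabilities_to_binary(liabilities, prevalence):
    """Map liabilities to PLINK-style codes (2 = case, 1 = control)."""
    # Assumed threshold: the point a standard normal liability exceeds
    # with probability equal to the prevalence.
    threshold = NormalDist().inv_cdf(1.0 - prevalence)
    return [2 if value > threshold else 1 for value in liabilities]


rng = random.Random(2)
# Hypothetical liabilities, drawn directly from a standard normal.
liabilities = [rng.gauss(0.0, 1.0) for _ in range(20000)]
binary = liabilities_to_binary(liabilities, prevalence=0.1)
# Observed fraction of cases.
print(binary.count(2) / len(binary))
```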
To generate correlated phenotypes, add the option --bivar; LDAK will then generate pairs of traits with the specified genetic correlation. For example, if you use --her 0.8, --num-phenos 4 and --bivar 0.5, then LDAK will generate four phenotypes with heritability 0.8, such that Phenotypes 1 and 2 will have genetic correlation 0.5, and likewise Phenotypes 3 and 4 (whereas phenotype pairs 1 and 3, 1 and 4, 2 and 3, 2 and 4 will be uncorrelated).
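One way to picture the pairing is the Python sketch below. It assumes that both traits in a pair share the same causal predictors and that the genetic correlation is induced by drawing correlated effect sizes; the text above does not say how LDAK achieves the correlation, so this mechanism and the function names trait_pairs, correlated_effects and pearson are illustrative assumptions only.

```python
import random


def trait_pairs(num_phenos):
    """Pair consecutive phenotypes: (1, 2), (3, 4), ..."""
    return [(i, i + 1) for i in range(1, num_phenos, 2)]


def correlated_effects(num_causals, rho, rng):
    """Draw two standard normal effect vectors with expected correlation rho."""
    first = [rng.gauss(0.0, 1.0) for _ in range(num_causals)]
    indep = [rng.gauss(0.0, 1.0) for _ in range(num_causals)]
    # Mix the first vector with independent noise so the variance stays 1.
    second = [rho * a + (1.0 - rho ** 2) ** 0.5 * z for a, z in zip(first, indep)]
    return first, second


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(3)
print(trait_pairs(4))
b1, b2 = correlated_effects(20000, rho=0.5, rng=rng)
print(pearson(b1, b2))
```

Effect vectors drawn for different pairs would be generated independently, which is why phenotypes from different pairs are uncorrelated.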
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
The argument for generating SNP values is
--make-snps <output>
which requires the following two options:
--num-samples <integer> - to specify the number of samples.
--num-snps <integer> - to specify the number of SNPs.
LDAK will create a simple dataset, saved in binary PLINK format to <output>.bed, <output>.bim and <output>.fam. The generation process is very simple, assuming Hardy-Weinberg equilibrium and linkage equilibrium. The default MAF range of SNPs is 0 to 0.5, but this can be changed using --maf-low <float> and --maf-high <float>.
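The generation process can be sketched in Python as below. Hardy-Weinberg equilibrium is modelled by drawing the two alleles of each genotype independently, and linkage equilibrium by drawing each SNP independently of the others. Drawing each MAF uniformly from the allowed range is an assumption of this sketch (the text above does not state the distribution), and the function make_snps is hypothetical; it returns in-memory lists rather than writing PLINK files.

```python
import random


def make_snps(num_samples, num_snps, maf_low=0.0, maf_high=0.5, rng=None):
    """Return (genotypes, mafs): a samples-by-SNPs matrix of 0/1/2 allele counts."""
    rng = rng or random.Random()
    genotypes = [[0] * num_snps for _ in range(num_samples)]
    mafs = []
    for j in range(num_snps):
        # Assumed: MAF drawn uniformly between maf_low and maf_high.
        maf = rng.uniform(maf_low, maf_high)
        mafs.append(maf)
        for i in range(num_samples):
            # Hardy-Weinberg equilibrium: two independent allele draws.
            genotypes[i][j] = (rng.random() < maf) + (rng.random() < maf)
    return genotypes, mafs


rng = random.Random(4)
genotypes, mafs = make_snps(5000, 3, maf_low=0.2, maf_high=0.3, rng=rng)
for j, maf in enumerate(mafs):
    # Compare the sampled MAF with the observed allele frequency.
    observed = sum(row[j] for row in genotypes) / (2 * len(genotypes))
    print(round(maf, 3), round(observed, 3))
```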