Test Datasets

Click here to download some test datasets (compressed size 34 MB). These include sample genotype data for humans (from the 1000 Genomes Project) and for mice (from the Heterogeneous Stock collection), as well as simulated phenotypes.

These data can be used with the example commands on this website (as well as for the practical available within the Short Courses). Please be aware that the example commands often begin with ./ldak.out; you should replace ./ldak.out with the name of the LDAK executable you obtained from Downloads (e.g., replace ./ldak.out with ./ldak5.1.linux or ./ldak5.1.mac).
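If you would rather paste the example commands unchanged, one option is to create a link called ldak.out that points at the executable you downloaded. The following is a minimal sketch, not part of LDAK itself: the helper name use_ldak is hypothetical, and it assumes the executable (here ./ldak5.1.linux) sits in the current directory.

```shell
# Hypothetical helper (not provided by LDAK): make ./ldak.out point at
# the executable you actually downloaded.
use_ldak() {
  exe="$1"
  # Stop early if the named file does not exist
  if [ ! -f "$exe" ]; then
    echo "executable not found: $exe" >&2
    return 1
  fi
  # Make sure the file can be run, then create (or refresh) the link
  chmod +x "$exe"
  ln -sf "$exe" ldak.out
}
```

For example, running use_ldak ./ldak5.1.linux (or use_ldak ./ldak5.1.mac) once in your working directory lets commands that begin with ./ldak.out run as written.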

Note that many of the example commands use awk. This tool is very efficient at processing large files and is usually installed by default on any UNIX operating system. You can read more about awk here.
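As a brief illustration of the kind of awk usage that appears in the examples, the sketch below works on a small made-up file. The file name toy.fam and its contents are assumptions for illustration only (six whitespace-separated columns, in the style of a PLINK .fam file); they are not part of the test datasets.

```shell
# Create a small made-up file (hypothetical; not one of the test datasets)
cat > toy.fam <<'EOF'
fam1 ind1 0 0 1 -9
fam2 ind2 0 0 2 -9
fam3 ind3 0 0 1 -9
EOF

# Save the first two columns of every row to a new file
awk '{print $1, $2}' toy.fam > toy.ids

# Count the rows whose fifth column equals 1 (prints 2 for this toy file)
awk '$5 == 1 {n++} END {print n+0}' toy.fam
```

awk reads the file one line at a time, so the same commands work without modification on files with many thousands of rows.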

For simplicity, most of the example commands assume that the test datasets have already undergone quality control, which is not the case. In fact, the example for Quality Control shows that the human data contain ancestral outliers, which should be excluded from all analyses, as well as related samples, which should be excluded from heritability analyses.