Calculate Statistics

LDAK can compute basic metrics (e.g., allele frequencies and missing rates) for genetic data files. These can subsequently be used as part of Quality Control.

Always read the screen output, which suggests arguments and estimates memory usage.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

The main argument is --calc-stats <outfile>.

The only required argument is

--bfile/--chiamo/--sp/--speed <prefix> - to specify genetic data files (see File Formats)

The output file <outfile>.stats contains, for each SNP, the observed frequency of the A1 allele, the minor allele frequency (MAF) and the call rate. If the data files provide SNP probabilities, then <outfile>.stats will also contain information scores. The output file <outfile>.missing contains the missing and heterozygosity rates for each individual.
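
To make these metrics concrete, here is a minimal Python sketch of how they are conventionally defined. It is not LDAK's code and may differ from LDAK in details. It assumes genotypes are coded as counts of the A1 allele (0, 1 or 2) with None for a missing call, and that frequencies and heterozygosity rates are taken over non-missing genotypes only. The function names snp_stats and individual_stats and the toy data are hypothetical.

```python
# Illustrative definitions only; not taken from LDAK.
# Assumption: genotypes are A1 allele counts (0, 1, 2); None marks a missing call.

def snp_stats(genotypes):
    """Return (A1 frequency, MAF, call rate) for one SNP across individuals."""
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes)
    if not called:
        # No observed genotypes, so the frequencies are undefined.
        return None, None, call_rate
    # Each individual carries two alleles, hence the factor of 2.
    a1_freq = sum(called) / (2 * len(called))
    maf = min(a1_freq, 1 - a1_freq)
    return a1_freq, maf, call_rate


def individual_stats(genotypes):
    """Return (missing rate, heterozygosity rate) for one individual across SNPs."""
    called = [g for g in genotypes if g is not None]
    missing_rate = 1 - len(called) / len(genotypes)
    # Assumption: heterozygosity is the share of non-missing genotypes equal to 1.
    het_rate = sum(1 for g in called if g == 1) / len(called) if called else None
    return missing_rate, het_rate


# Toy data: rows are individuals, columns are SNPs.
data = [
    [0, 1, 2],
    [1, None, 2],
    [2, 1, 2],
    [1, 1, None],
]
snp_columns = list(zip(*data))  # transpose to get one tuple per SNP

for j, column in enumerate(snp_columns):
    print("SNP", j, snp_stats(column))
for i, row in enumerate(data):
    print("Individual", i, individual_stats(row))
```

In this toy data the third SNP has only A1 alleles among its called genotypes, so its MAF is 0 and its call rate is 0.75. SNPs like this are typical candidates for removal during Quality Control.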
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Example:

Here we use the binary PLINK files human.bed, human.bim and human.fam from the Test Datasets.

./ldak.out --calc-stats stats --bfile human

The metrics are stored in stats.stats and stats.missing.