TetraHer and QuantHer

TetraHer is our new tool for estimating liability heritability of binary phenotypes (e.g., diseases). QuantHer is an analogous method for estimating heritability of quantitative phenotypes. Note that TetraHer and QuantHer require individual-level data for related samples; if you instead have individual-level data for unrelated samples, you should probably estimate SNP heritability using REML, Haseman-Elston Regression or PCGC.

Always read the screen output, which suggests arguments and estimates memory usage.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

TetraHer:

The main argument is --family-binary <outfile>.

The two required options are

--relatives <pairsfile> - to specify pairs of related individuals. <pairsfile> should have either five or six columns. Columns 1 & 2 should provide the two IDs for the first individual in each pair, while Columns 3 & 4 should provide the two IDs for the second individual in each pair. Column 5 should specify the relatedness between the pair, while Column 6 (if provided) should specify the environmental similarity. See Relatives File for more details, and scripts for constructing this file.

--pheno <phenofile> - to specify phenotypes (in PLINK format). Samples without a phenotype will be excluded. If <phenofile> contains more than one phenotype, specify which should be used with --mpheno <integer>.
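To make the two input layouts above concrete, the following sketch writes a toy pairs file and a toy phenotype file, then checks their column counts. The file names toy.relatives and toy.pheno, the IDs, the absence of a header row, and the relatedness values (0.5 for full siblings, 0.25 for half siblings) are all assumptions made for illustration; see Relatives File for the values LDAK actually expects.

```shell
# Toy pairs file (hypothetical IDs and values; no header row is assumed).
# Columns: two IDs of individual 1, two IDs of individual 2, relatedness.
printf '%s\n' \
  'fam1 ind1 fam1 ind2 0.5' \
  'fam1 ind1 fam1 ind3 0.25' \
  'fam2 ind4 fam2 ind5 0.5' > toy.relatives

# Count rows that do not have exactly five or six columns.
bad_rows=$(awk 'NF != 5 && NF != 6 { n++ } END { print n + 0 }' toy.relatives)
echo "rows with wrong column count: $bad_rows"

# Toy PLINK-format phenotype file: two ID columns, then one column per phenotype.
# (Hypothetical values; the second phenotype is only there to show the count.)
printf '%s\n' \
  'fam1 ind1 0 1.7' \
  'fam1 ind2 1 2.3' \
  'fam2 ind4 0 0.9' > toy.pheno

# Number of phenotypes = number of columns minus the two ID columns.
num_pheno=$(awk 'NR == 1 { print NF - 2 }' toy.pheno)
echo "toy.pheno contains $num_pheno phenotype(s)"
```

When the phenotype count is greater than one, --mpheno <integer> is needed to select which phenotype is analysed.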

Use --covar <covarfile> to provide covariates (in PLINK format) as fixed effects in the regression; when calculating heritabilities, the phenotypic variance explained by these will be discounted.

Use --prevalence <prevalence> to specify the population prevalence of the phenotype (otherwise, LDAK will assume the population prevalence equals the sample prevalence).

The main output file is <outfile>.mle. The row labelled "Genetic" reports the estimated heritability (h2L), while the row labelled "Environmental" reports the estimated contribution of common environment (h2C).
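A small helper for pulling out the two rows mentioned above might look like the following. This is only a sketch: it assumes the labels "Genetic" and "Environmental" appear verbatim in <outfile>.mle, and the function name show_estimates is hypothetical. The exact layout of the .mle file is not described on this page.

```shell
# Print the rows of <outfile>.mle that mention "Genetic" or "Environmental".
# Usage: show_estimates <outfile>   (reads <outfile>.mle)
show_estimates() {
  grep -E 'Genetic|Environmental' "$1.mle"
}
```

For example, after running an analysis with --family-binary tetraher1, calling show_estimates tetraher1 would print the matching rows of tetraher1.mle.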
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

QuantHer:

The main argument is --family-quant <outfile>.

The two required options are

--relatives <pairsfile> - to specify pairs of related individuals. <pairsfile> should have either five or six columns. Columns 1 & 2 should provide the two IDs for the first individual in each pair, while Columns 3 & 4 should provide the two IDs for the second individual in each pair. Column 5 should specify the relatedness between the pair, while Column 6 (if provided) should specify the environmental similarity. See Relatives File for more details, and scripts for constructing this file.

--pheno <phenofile> - to specify phenotypes (in PLINK format). Samples without a phenotype will be excluded. If <phenofile> contains more than one phenotype, specify which should be used with --mpheno <integer>.

Use --covar <covarfile> to provide covariates (in PLINK format) as fixed effects in the regression; when calculating heritabilities, the phenotypic variance explained by these will be discounted.

If the phenotype is binary, you can use --prevalence <float> to specify the population prevalence, and LDAK will additionally report heritability estimates on the liability scale (however, in general, it is preferable to instead use --family-binary).
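For orientation, a commonly used conversion from the observed scale to the liability scale (Lee et al., 2011) is shown below. Whether LDAK applies exactly this formula when --prevalence is used with --family-quant is an assumption, not something stated on this page. Here K denotes the population prevalence, P the proportion of cases in the sample, and z the standard normal density at the liability threshold.

```latex
h^2_L = h^2_O \, \frac{K^2 (1-K)^2}{z^2 \, P (1-P)},
\qquad z = \phi\!\left(\Phi^{-1}(1-K)\right)
```

As an illustration, with K = 0.1 and P = 0.2 we get z ≈ 0.1755, so the scaling factor is 0.0081 / (0.0308 × 0.16) ≈ 1.64.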

The main output file is <outfile>.mle. The row labelled "Genetic" reports the estimated heritability (h2O), while the row labelled "Environmental" reports the estimated contribution of common environment (h2C).
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Example:

Here we use the relatives files disease.relatives and disease.enviro, the phenotype file disease.pheno, and the covariates file disease.covar from the Test Datasets. The phenotypes are from a simulated disease with liability heritability 0.5 and prevalence 0.1.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

1 - TetraHer.

The simplest TetraHer analysis uses the command

./ldak.out --family-binary tetraher1 --relatives disease.relatives --pheno disease.pheno

By viewing tetraher1.mle, we see that the estimated heritability (on the liability scale) is 0.54 (SD 0.05).

The phenotype has been ascertained (this is evident from the fact that 20% of the individuals in disease.pheno are cases, which is twice the population prevalence). We can allow for this ascertainment using the command

./ldak.out --family-binary tetraher2 --relatives disease.relatives --pheno disease.pheno --prevalence 0.1

Now the estimate of heritability is 0.45 (SD 0.04).
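The ascertainment check mentioned above (comparing the fraction of cases in the phenotype file with the population prevalence) can be sketched as follows. The function name sample_prevalence is hypothetical. The sketch assumes the phenotype is in column 3, that cases are coded 1 and controls 0, and that missing values are written as NA; check how your own file is coded before relying on it.

```shell
# Fraction of cases among samples with a non-missing phenotype.
# Assumes column 3 holds the phenotype, cases are coded 1, controls 0,
# and missing values are written as NA (all assumptions for this sketch).
sample_prevalence() {
  awk '$3 != "NA" { n++; if ($3 == 1) cases++ }
       END { if (n > 0) printf "%.3f\n", cases / n }' "$1"
}
```

Running sample_prevalence disease.pheno and comparing the result with the value passed to --prevalence indicates whether cases are over-represented in the sample.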

The file disease.covar contains ages for each individual. To include these in the analysis, we use the command

./ldak.out --family-binary tetraher3 --relatives disease.relatives --pheno disease.pheno --prevalence 0.1 --covar disease.covar

The revised estimate of heritability is 0.56 (SD 0.05). The increase compared to the previous analysis reflects that age is an important covariate (e.g., we see from the file tetraher3.mle that it is estimated to explain 18% of liability variation).
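As a rough consistency check on these numbers, the following back-of-the-envelope calculation assumes that the estimated genetic variance is essentially unchanged when age is added, and that discounting simply removes the covariate share from the denominator. Neither assumption is stated by LDAK; the symbols are introduced here only for illustration.

```latex
h^2_{\text{adjusted}} \approx \frac{h^2_{\text{unadjusted}}}{1 - r^2_{\text{covar}}}
= \frac{0.45}{1 - 0.18} \approx 0.55
```

This is close to the reported estimate of 0.56.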

To allow for the contribution of common environment, we use the command

./ldak.out --family-binary tetraher4 --relatives disease.enviro --pheno disease.pheno --prevalence 0.1 --covar disease.covar

Now the estimated heritability is 0.32 (SD 0.18), while the estimated contribution of common environment is 0.12 (SD 0.10). The reduced precision of estimates reflects both the difficulty of estimating common environment (because environmental similarity tends to correlate with genetic similarity), and that this dataset contains only full-siblings and half-siblings (higher precision would be obtained if the dataset also contained identical twins).
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

2 - QuantHer.

Note that the following three commands continue to use the binary phenotype disease.pheno, even though QuantHer is primarily designed for continuous phenotypes.

The simplest QuantHer analysis uses the command

./ldak.out --family-quant quanther1 --relatives disease.relatives --pheno disease.pheno

By viewing quanther1.mle, we see that the estimated heritability (on the observed scale) is 0.29 (SD 0.03).

We can include the covariate file disease.covar using the command

./ldak.out --family-quant quanther2 --relatives disease.relatives --pheno disease.pheno --covar disease.covar

The revised estimate of heritability is 0.31 (SD 0.03).

To allow for the contribution of common environment, we use the command

./ldak.out --family-quant quanther3 --relatives disease.enviro --pheno disease.pheno --covar disease.covar

Now the estimated heritability is 0.19 (SD 0.10), while the estimated contribution of common environment is 0.07 (SD 0.05).