---
# YAML header created by ox-ravel
title: Twin models
author: Klaus Holst & Thomas Scheike
output:
rmarkdown::html_vignette:
fig_caption: yes
vignette: >
%\VignetteIndexEntry{Twin models}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
This document provides a brief tutorial to analyzing twin data using the
**`mets`** package:
```{r include=FALSE,echo=FALSE,message=FALSE,warning=FALSE }
options(warn=-1, family="Times")
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
dpi=50,
fig.width=7.15, fig.height=5.5,
out.width="600px",
fig.retina=1
##dev="png",
##dpi=72,
## out.width = "70%")
)
library("mets")
```
\(
\newcommand{\cov}{\mathbb{C}\text{ov}}
\newcommand{\cor}{\mathbb{C}\text{or}}
\newcommand{\var}{\mathbb{V}\text{ar}}
\newcommand{\E}{\mathbb{E}}
\newcommand{\unitfrac}[2]{#1/#2}
\newcommand{\n}{}
\)
```{r install, eval=FALSE, echo=FALSE}
# install.packages("remotes")
remotes::install_github("kkholst/mets", dependencies="Suggests")
```
# Twin analysis, continuous traits
In the following we examine the heritability of Body Mass
Index\n{}korkeila_bmi_1991 hjelmborg_bmi_2008, based on data
on self-reported BMI-values from a random sample of 11,411 same-sex
twins. First, we will load data
```{r twinbmi }
library(mets)
data("twinbmi")
head(twinbmi)
```
The data is on *long* format with one subject per row.
- **`tvparnr`:** twin id
- **`bmi`:** Body Mass Index (\(\mathrm{kg}/{\mathrm{m}^2}\))
- **`age`:** Age (years)
- **`gender`:** Gender factor (male,female)
- **`zyg`:** zygosity (MZ, DZ)
We transpose the data allowing us to do pairwise analyses
```{r twinwide }
twinwide <- fast.reshape(twinbmi, id="tvparnr",varying=c("bmi"))
head(twinwide)
```
Next we plot the association within each zygosity group
```{r scatterdens, echo=FALSE,message=FALSE,warning=FALSE }
library("cowplot")
scatterdens <- function(x) {
require(ggplot2)
sp <- ggplot(x,
aes_string(colnames(x)[1], colnames(x)[2])) +
theme_minimal() +
geom_point(alpha=0.3) + geom_density_2d()
xdens <- ggplot(x, aes_string(colnames(x)[1],fill=1)) +
theme_minimal() +
geom_density(alpha=.5)+
theme(axis.text.x = element_blank(),
legend.position = "none") + labs(x=NULL)
ydens <- ggplot(x, aes_string(colnames(x)[2],fill=1)) +
theme_minimal() +
geom_density(alpha=.5) +
theme(axis.text.y = element_blank(),
axis.text.x = element_text(angle=90, vjust=0),
legend.position = "none") +
labs(x=NULL) +
coord_flip()
g <- plot_grid(xdens,NULL,sp,ydens,
ncol=2,nrow=2,
rel_widths=c(4,1.4),rel_heights=c(1.4,4))
return(g)
}
```
We here show the log-transformed data which is slightly more symmetric
and more appropiate for the twin analysis (see Figure \@ref(fig:scatter1) and \@ref(fig:scatter2))
```{r scatter1, warning=FALSE,message=FALSE,fig.cap="Scatter plot of logarithmic BMI measurements in MZ twins" }
mz <- log(subset(twinwide, zyg=="MZ")[,c("bmi1","bmi2")])
scatterdens(mz)
```
```{r scatter2, warning=FALSE,message=FALSE,fig.cap="Scatter plot of logarithmic BMI measurements in DZ twins" }
dz <- log(subset(twinwide, zyg=="DZ")[,c("bmi1","bmi2")])
scatterdens(dz)
```
The plots and raw association measures shows considerable stronger
dependence in the MZ twins, thus indicating genetic influence of the
trait
```{r }
cor.test(mz[,1],mz[,2], method="spearman")
```
```{r }
cor.test(dz[,1],dz[,2], method="spearman")
```
Ńext we examine the marginal distribution (GEE model with working
independence)
```{r gee }
l0 <- lm(bmi ~ gender + I(age-40), data=twinbmi)
estimate(l0, id=twinbmi$tvparnr)
```
```{r }
library("splines")
l1 <- lm(bmi ~ gender*ns(age,3), data=twinbmi)
marg1 <- estimate(l1, id=twinbmi$tvparnr)
```
```{r marg1, warning=FALSE,message=FALSE,fig.cap="Marginal association between BMI and Age for males and females."}
dm <- lava::Expand(twinbmi,
bmi=0,
gender=c("male"),
age=seq(33,61,length.out=50))
df <- lava::Expand(twinbmi,
bmi=0,
gender=c("female"),
age=seq(33,61,length.out=50))
plot(marg1, function(p) model.matrix(l1,data=dm)%*%p,
data=dm["age"], ylab="BMI", xlab="Age",
ylim=c(22,26.5))
plot(marg1, function(p) model.matrix(l1,data=df)%*%p,
data=df["age"], col="red", add=TRUE)
legend("bottomright", c("Male","Female"),
col=c("black","red"), lty=1, bty="n")
```
## Polygenic model
We can decompose the trait into the following variance components
\begin{align*}
Y_i = A_i + D_i + C + E_i, \quad i=1,2
\end{align*}
- **\(A\):** Additive genetic effects of alleles
- **\(D\):** Dominante genetic effects of alleles
- **\(C\):** Shared environmental effects
- **\(E\):** Unique environmental genetic effects
Dissimilarity of MZ twins arises from unshared environmental effects
only, \(\cor(E_1,E_2)=0\) and
\begin{align*}
\cor(A_1^{MZ},A_2^{MZ}) = 1, \quad
\cor(D_1^{MZ},D_2^{MZ}) = 1,
\end{align*}
\begin{align*}
\cor(A_1^{DZ},A_2^{DZ}) = 0.5, \quad
\cor(D_1^{DZ},D_2^{DZ}) = 0.25,
\end{align*}
\begin{align*}
Y_i = A_i + C_i + D_i + E_i
\end{align*}
\begin{align*}
A_i \sim\mathcal{N}(0,\sigma_A^2), C_i
\sim\mathcal{N}(0,\sigma_C^2), D_i
\sim\mathcal{N}(0,\sigma_D^2),
E_i \sim\mathcal{N}(0,\sigma_E^2)
\end{align*}
\begin{gather*}
\cov(Y_{1},Y_{2}) = \\
\begin{pmatrix}
\sigma_A^2 & 2\Phi\sigma_A^2 \\
2\Phi\sigma_A^2 & \sigma_A^2
\end{pmatrix} +
\begin{pmatrix}
\sigma_C^2 & \sigma_C^2 \\
\sigma_C^2 & \sigma_C^2
\end{pmatrix} +
\begin{pmatrix}
\sigma_D^2 & \Delta_{7}\sigma_D^2 \\
\Delta_{7}\sigma_D^2 & \sigma_D^2
\end{pmatrix} +
\begin{pmatrix}
\sigma_E^2 & 0 \\
0 & \sigma_E^2
\end{pmatrix}
\end{gather*}
```{r}
dd <- na.omit(twinbmi)
```
Saturated model (different marginals in MZ and DZ twins and different marginals
for twin 1 and twin 2):
```{r lmsat, eval=FALSE}
l0 <- twinlm(bmi ~ age+gender, data=dd, DZ="DZ", zyg="zyg", id="tvparnr", type="sat")
```
Different marginals for MZ and DZ twins (but same marginals within a pair)
```{r lmflex, eval=FALSE}
lf <- twinlm(bmi ~ age+gender, data=dd,DZ="DZ", zyg="zyg", id="tvparnr", type="flex")
```
Same marginals but free correlation with MZ, DZ
```{r lmeqmarg}
lu <- twinlm(bmi ~ age+gender, data=dd, DZ="DZ", zyg="zyg", id="tvparnr", type="eqmarg")
estimate(lu)
```
A formal test of genetic effects can be obtained by comparing the MZ and DZ correlation:
```{r }
estimate(lu,lava::contr(5:6,6))
```
We also consider the ACE model
```{r ace}
ace0 <- twinlm(bmi ~ age+gender, data=dd, DZ="DZ", zyg="zyg", id="tvparnr", type="ace")
summary(ace0)
```
# Bibliography
[korkeila_bmi_1991] Korkeila, Kaprio, Rissanen & Koskenvuo, Effects of gender and age on the heritability of body mass index, Int J Obes, 15(10), 647-654 (1991). [↩](#b71edfd9bc946c317f4a732845bcaf93)
[hjelmborg_bmi_2008] Hjelmborg, Fagnani, Silventoinen, McGue, Korkeila, Christensen, Rissanen & Kaprio, Genetic influences on growth traits of BMI: a longitudinal study of adult twins, Obesity (Silver Spring), 16(4), 847-852 (2008). [↩](#718839fcb6ade82ebb2d7de853582b80)