--- # YAML header created by ox-ravel title: Twin models author: Klaus Holst & Thomas Scheike output: rmarkdown::html_vignette: fig_caption: yes vignette: > %\VignetteIndexEntry{Twin models} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- This document provides a brief tutorial to analyzing twin data using the **`mets`** package: ```{r include=FALSE,echo=FALSE,message=FALSE,warning=FALSE } options(warn=-1, family="Times") knitr::opts_chunk$set( collapse = TRUE, comment = "#>", dpi=50, fig.width=7.15, fig.height=5.5, out.width="600px", fig.retina=1 ##dev="png", ##dpi=72, ## out.width = "70%") ) library("mets") ``` \( \newcommand{\cov}{\mathbb{C}\text{ov}} \newcommand{\cor}{\mathbb{C}\text{or}} \newcommand{\var}{\mathbb{V}\text{ar}} \newcommand{\E}{\mathbb{E}} \newcommand{\unitfrac}[2]{#1/#2} \newcommand{\n}{} \) ```{r install, eval=FALSE, echo=FALSE} # install.packages("remotes") remotes::install_github("kkholst/mets", dependencies="Suggests") ``` # Twin analysis, continuous traits In the following we examine the heritability of Body Mass Index\n{}korkeila_bmi_1991 hjelmborg_bmi_2008, based on data on self-reported BMI-values from a random sample of 11,411 same-sex twins. First, we will load data ```{r twinbmi } library(mets) data("twinbmi") head(twinbmi) ``` The data is on *long* format with one subject per row. - **`tvparnr`:** twin id - **`bmi`:** Body Mass Index (\(\mathrm{kg}/{\mathrm{m}^2}\)) - **`age`:** Age (years) - **`gender`:** Gender factor (male,female) - **`zyg`:** zygosity (MZ, DZ) We transpose the data allowing us to do pairwise analyses ```{r twinwide } twinwide <- fast.reshape(twinbmi, id="tvparnr",varying=c("bmi")) head(twinwide) ``` Next we plot the association within each zygosity group ```{r scatterdens, echo=FALSE,message=FALSE,warning=FALSE } library("cowplot") scatterdens <- function(x) { require(ggplot2) sp <- ggplot(x, aes_string(colnames(x)[1], colnames(x)[2])) + theme_minimal() + geom_point(alpha=0.3) + geom_density_2d() xdens <- ggplot(x, aes_string(colnames(x)[1],fill=1)) + theme_minimal() + geom_density(alpha=.5)+ theme(axis.text.x = element_blank(), legend.position = "none") + labs(x=NULL) ydens <- ggplot(x, aes_string(colnames(x)[2],fill=1)) + theme_minimal() + geom_density(alpha=.5) + theme(axis.text.y = element_blank(), axis.text.x = element_text(angle=90, vjust=0), legend.position = "none") + labs(x=NULL) + coord_flip() g <- plot_grid(xdens,NULL,sp,ydens, ncol=2,nrow=2, rel_widths=c(4,1.4),rel_heights=c(1.4,4)) return(g) } ``` We here show the log-transformed data which is slightly more symmetric and more appropiate for the twin analysis (see Figure \@ref(fig:scatter1) and \@ref(fig:scatter2)) ```{r scatter1, warning=FALSE,message=FALSE,fig.cap="Scatter plot of logarithmic BMI measurements in MZ twins" } mz <- log(subset(twinwide, zyg=="MZ")[,c("bmi1","bmi2")]) scatterdens(mz) ``` ```{r scatter2, warning=FALSE,message=FALSE,fig.cap="Scatter plot of logarithmic BMI measurements in DZ twins" } dz <- log(subset(twinwide, zyg=="DZ")[,c("bmi1","bmi2")]) scatterdens(dz) ``` The plots and raw association measures shows considerable stronger dependence in the MZ twins, thus indicating genetic influence of the trait ```{r } cor.test(mz[,1],mz[,2], method="spearman") ``` ```{r } cor.test(dz[,1],dz[,2], method="spearman") ``` Ńext we examine the marginal distribution (GEE model with working independence) ```{r gee } l0 <- lm(bmi ~ gender + I(age-40), data=twinbmi) estimate(l0, id=twinbmi$tvparnr) ``` ```{r } library("splines") l1 <- lm(bmi ~ gender*ns(age,3), data=twinbmi) marg1 <- estimate(l1, id=twinbmi$tvparnr) ``` ```{r marg1, warning=FALSE,message=FALSE,fig.cap="Marginal association between BMI and Age for males and females."} dm <- lava::Expand(twinbmi, bmi=0, gender=c("male"), age=seq(33,61,length.out=50)) df <- lava::Expand(twinbmi, bmi=0, gender=c("female"), age=seq(33,61,length.out=50)) plot(marg1, function(p) model.matrix(l1,data=dm)%*%p, data=dm["age"], ylab="BMI", xlab="Age", ylim=c(22,26.5)) plot(marg1, function(p) model.matrix(l1,data=df)%*%p, data=df["age"], col="red", add=TRUE) legend("bottomright", c("Male","Female"), col=c("black","red"), lty=1, bty="n") ``` ## Polygenic model We can decompose the trait into the following variance components \begin{align*} Y_i = A_i + D_i + C + E_i, \quad i=1,2 \end{align*} - **\(A\):** Additive genetic effects of alleles - **\(D\):** Dominante genetic effects of alleles - **\(C\):** Shared environmental effects - **\(E\):** Unique environmental genetic effects Dissimilarity of MZ twins arises from unshared environmental effects only, \(\cor(E_1,E_2)=0\) and \begin{align*} \cor(A_1^{MZ},A_2^{MZ}) = 1, \quad \cor(D_1^{MZ},D_2^{MZ}) = 1, \end{align*} \begin{align*} \cor(A_1^{DZ},A_2^{DZ}) = 0.5, \quad \cor(D_1^{DZ},D_2^{DZ}) = 0.25, \end{align*} \begin{align*} Y_i = A_i + C_i + D_i + E_i \end{align*} \begin{align*} A_i \sim\mathcal{N}(0,\sigma_A^2), C_i \sim\mathcal{N}(0,\sigma_C^2), D_i \sim\mathcal{N}(0,\sigma_D^2), E_i \sim\mathcal{N}(0,\sigma_E^2) \end{align*} \begin{gather*} \cov(Y_{1},Y_{2}) = \\ \begin{pmatrix} \sigma_A^2 & 2\Phi\sigma_A^2 \\ 2\Phi\sigma_A^2 & \sigma_A^2 \end{pmatrix} + \begin{pmatrix} \sigma_C^2 & \sigma_C^2 \\ \sigma_C^2 & \sigma_C^2 \end{pmatrix} + \begin{pmatrix} \sigma_D^2 & \Delta_{7}\sigma_D^2 \\ \Delta_{7}\sigma_D^2 & \sigma_D^2 \end{pmatrix} + \begin{pmatrix} \sigma_E^2 & 0 \\ 0 & \sigma_E^2 \end{pmatrix} \end{gather*} ```{r} dd <- na.omit(twinbmi) ``` Saturated model (different marginals in MZ and DZ twins and different marginals for twin 1 and twin 2): ```{r lmsat, eval=FALSE} l0 <- twinlm(bmi ~ age+gender, data=dd, DZ="DZ", zyg="zyg", id="tvparnr", type="sat") ``` Different marginals for MZ and DZ twins (but same marginals within a pair) ```{r lmflex, eval=FALSE} lf <- twinlm(bmi ~ age+gender, data=dd,DZ="DZ", zyg="zyg", id="tvparnr", type="flex") ``` Same marginals but free correlation with MZ, DZ ```{r lmeqmarg} lu <- twinlm(bmi ~ age+gender, data=dd, DZ="DZ", zyg="zyg", id="tvparnr", type="eqmarg") estimate(lu) ``` A formal test of genetic effects can be obtained by comparing the MZ and DZ correlation: ```{r } estimate(lu,lava::contr(5:6,6)) ``` We also consider the ACE model ```{r ace} ace0 <- twinlm(bmi ~ age+gender, data=dd, DZ="DZ", zyg="zyg", id="tvparnr", type="ace") summary(ace0) ``` # Bibliography [korkeila_bmi_1991] Korkeila, Kaprio, Rissanen & Koskenvuo, Effects of gender and age on the heritability of body mass index, Int J Obes, 15(10), 647-654 (1991). [↩](#b71edfd9bc946c317f4a732845bcaf93) [hjelmborg_bmi_2008] Hjelmborg, Fagnani, Silventoinen, McGue, Korkeila, Christensen, Rissanen & Kaprio, Genetic influences on growth traits of BMI: a longitudinal study of adult twins, Obesity (Silver Spring), 16(4), 847-852 (2008). [↩](#718839fcb6ade82ebb2d7de853582b80)