---
title: "Estimating a relative risk or risk difference with a binary exposure"
author: Klaus Kähler Holst
date: "`r Sys.Date()`"
output:
  knitr:::html_vignette:
    fig_caption: yes
    fig_width: 7.15
    fig_height: 5.5
vignette: >
  %\VignetteIndexEntry{Estimating a relative risk or risk difference with a binary exposure}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
 collapse = TRUE,
 #dev="png",
 comment = "#>"
)
library(lava)
```

Introduction
============

Let \(Y\) be a *binary response*, \(A\) a *binary exposure*, and \(V\)
a *vector of covariates*.

![DAG for the statistical model with the dashed edge representing a potential
interaction between exposure \(A\) and covariates \(V\).](dag1.svg)


In a common setting, the main interest lies
in quantifying the treatment effect, \(\nu\), of \(A\) on \(Y\) adjusting for the set of
covariates, and often a standard approach is to use a Generalized
Linear Model (GLM):

\[g\{ E(Y\mid A,V) \} = A\nu^tW +
\underset{\mathrm{nuisance}}{\mu^tZ}\]

with link function \(g\), and \(W = w(V)\), \(Z= v(V)\) known vector
functions of \(V\).

The canonical link (logit) leads to nice computational properties
(logistic regression) and parameters with an odds-ratio
interpretation. But ORs are not *collapsible* even under
randomization. For example

\[
E(Y\mid X) = E[ E(Y\mid X,Z) \mid X ] = E[\operatorname{expit}( \mu +
\alpha X + \beta Z ) \mid X]
\neq \operatorname{expit}[\mu + \alpha X + \beta E(Z\mid X)],
\]

When marginalizing we leave the class of logistic regression. This
non-collapsibility makes it hard to interpret odds-ratios and to
compare results from *different studies*


**Relative risks** (and risk differences) are collapsible and
generally considered easier to interpret than odds-ratios.
[Richardson et al](https://doi.org/10.1080/01621459.2016.1192546)
(JASA, 2017) proposed a regression model for a binary exposures which
solves the computational problems and need for parameter contraints
that are associated with using for example binomial regression with a
log-link function (or identify link for the risk difference) to obtain
such parameter estimates. In the following we consider the relative
risk as the **target parameter**

\[
\mathrm{RR}(v) = \frac{P(Y=1\mid A=1, V=v)}{P(Y=1\mid A=0, V=v)}.
\]

Let \(p_a(V) = P(Y \mid A=a, V), a\in\{0,1\}\), the idea is then to
posit a linear model for \[ \theta(v) = \log \big(RR(v)\big) \],
i.e.,
\[\log \big(RR(v)\big) = \alpha^Tv,\]

and a **nuisance model** for the *odds-product*
\[ \phi(v) =
\log\left(\frac{p_{0}(v)p_{1}(v)}{(1-p_{0}(v))(1-p_{1}(v))}\right) \]

noting that these two parameters are *variation independent* as illustrated
by the below L'Abbé plot.

```{r, fig.width=5, fig.height=5}
  p0 <- seq(0,1,length.out=100)
  p1 <- function(p0,op) 1/(1+(op*(1-p0)/p0)^-1)
  plot(0, type="n", xlim=c(0,1), ylim=c(0,1),
     xlab="P(Y=1|A=0)", ylab="P(Y=1|A=1)", main="Constant odds product")
  for (op in exp(seq(-6,6,by=.25))) lines(p0,p1(p0,op), col="lightblue")
```

```{r, fig.width=5, fig.height=5, fig.caption=' '}
  p0 <- seq(0,1,length.out=100)
  p1 <- function(p0,rr) rr*p0
  plot(0, type="n", xlim=c(0,1), ylim=c(0,1),
     xlab="P(Y=1|A=0)", ylab="P(Y=1|A=1)", main="Constant relative risk")
  for (rr in exp(seq(-3,3,by=.25))) lines(p0,p1(p0,rr), col="lightblue")
```

Similarly, a model can be constructed for the risk-difference on the
following scale

\[\theta(v) = \operatorname{arctanh} \big(RD(v)\big).\]


Simulation
============

First the `targeted` package is loaded
```{r setup}
library(targeted)
```

This automatically imports [lava](https://arxiv.org/abs/1206.3421)
[(CRAN)](https://CRAN.R-project.org/package=lava)
which we can use to simulate from the Relative-Risk Odds-Product (RR-OP) model.

```{r}
m <- lvm(a ~ x,
         lp.target ~ 1,
         lp.nuisance ~ x+z)
m <- binomial.rr(m, response="y", exposure="a", target.model="lp.target", nuisance.model="lp.nuisance")
```

The `lvm` call first defines the linear predictor for the exposure to
be of the form

\[\mathrm{LP}_A := \mu_A + \alpha X\]

and the linear predictors for the /target parameter/ (relative risk) and the /nuisance
parameter/ (odds product) to be of the form

\[\mathrm{LP}_{RR} := \mu_{RR},\]

\[\mathrm{LP}_{OP} := \mu_{OP} + \beta_x X + \beta_z Z.\]

The covariates are by default assumed to be independent and standard
normal \(X, Z\sim\mathcal{N}(0,1)\), but their distribution can easily
be altered using the `lava::distribution` method.

The `binomial.rr` function
```{r}
args(binomial.rr)
```

then defines the link functions, i.e.,

\[\operatorname{logit}(E[A\mid X,Z]) = \mu_A + \alpha X,\]

\[\operatorname{log}(E[Y\mid X,Z, A=1]/E[Y\mid X, A=0]) = \mu_{RR},\]

\[\operatorname{log}\{p_1(X,Z)p_0(X,Z)/[(1-p_1(X,Z))(1-p_0(X,Z))]\} =
\mu_{OP}+\beta_x X + \beta_z Z\]

with \(p_a(X,Z)=E(Y\mid A=a,X,Z)\).

The risk-difference model with the RD parameter modeled on the
\(\operatorname{arctanh}\) scale can be defined similarly using the `binomial.rd`
method
```{r}
args(binomial.rd)
```

We can inspect the parameter names of the modeled
```{r}
coef(m)
```

Here the intercepts of the model are simply given the same name as the
variables, such that \(\mu_A\) becomes `a`, and the other regression
coefficients are labeled using the "~"-formula notation, e.g.,
\(\alpha\) becomes `a~x`.

Intercepts are by default set to zero and regression parameters set to
one in the simulation. Hence to simulate from the model with
\((mu_A, \mu_{RR}, \mu_{OP}, \alpha, \beta_x, \beta_z)^T =
(-1,1,-2,2,1,1)^T\), we define the parameter vector ``p`` given by

```{r}
p <- c('a'=-1, 'lp.target'=1, 'lp.nuisance'=-1, 'a~x'=2)
```

and then simulate from the model using the ``sim`` method

```{r}
d <- sim(m, 1e4, p=p, seed=1)

head(d)
```

Notice, that in this simulated data the **target parameter**
\(\mu_{RR}\) has been set to `lp.target =` `r p['lp.target']`.

Estimation
============

MLE
------------

We start by fitting the model using the maximum likelihood estimator.

```{r}
args(riskreg_mle)
```

The ``riskreg_mle`` function takes vectors/matrices as input arguments
with the response ``y``, exposure ``a``, target parameter design
matrix ``x1`` (i.e., the matrix \(W\) at the start of this text), and
the nuisance model design matrix ``x2`` (odds product).

We first consider the case of a correctly specified model, hence we do
not consider any interactions with the exposure for the odds product
and simply let ``x1`` be a vector of ones, whereas the design matrix
for the log-odds-product depends on both \(X\) and \(Z\)

```{r}
x1 <- model.matrix(~1, d)
x2 <- model.matrix(~x+z, d)

fit1 <- with(d, riskreg_mle(y, a, x1, x2, type="rr"))
fit1
```

The parameters are presented in the same order as the columns of
``x1``and ``x2``, hence the target parameter estimate is in the first row

```{r}
estimate(fit1, keep=1)
```

DRE
------------
We next fit the model using a double robust estimator (DRE) which introduces
a model for the exposure \(E(A=1\mid V)\) (propensity model). The double-robustness
stems from the fact that the this estimator remains consistent in the
union model where either the odds-product model or the propensity
model is correctly specified. With both models correctly specified the
estimator is efficient.

```{r}
with(d, riskreg_fit(y, a, target=x1, nuisance=x2, propensity=x2, type="rr"))

```

The usual /formula/-syntax can be applied using the ``riskreg``
function. Here we illustrate the double-robustness by using a *wrong propensity
model* but a correct nuisance paramter (odds-product) model:
```{r}
  riskreg(y~a, nuisance=~x+z, propensity=~z, data=d, type="rr")
```

Or vice-versa
```{r}
  riskreg(y~a, nuisance=~z, propensity=~x+z, data=d, type="rr")
```

whereas the MLE in this case yields a biased estimate of the relative risk:
```{r}
  fit2 <- with(d, riskreg_mle(y, a, x1=model.matrix(~1,d), x2=model.matrix(~z, d)))
  estimate(fit2, keep=1)
```

Interactions
------------

The more general model where
\[\log RR(V) = A \alpha^TV\]
for a subset \(V\) of the covariates can be estimated using the
``target`` argument:
```{r}
fit <- riskreg(y~a, target=~x, nuisance=~x+z, data=d)
fit
```
As expected we do not see any evidence of an effect of \(X\) on the
relative risk with the 95% confidence limits clearly overlapping zero.

Note, that when the ``propensity`` argument is omitted as above, the same design
matrix is used for both the odds-product model and the propensity model.


Risk-difference
------------

The syntax for fitting the risk-difference model is similar.
To illustrate this we simulate some new data from this model

```{r}
m2 <- binomial.rd(m, response="y", exposure="a", target.model="lp.target", nuisance.model="lp.nuisance")
d2 <- sim(m2, 1e4, p=p)
```

And we can then fit the DRE with the syntax

```{r}
riskreg(y~a, nuisance=~x+z, data=d2, type="rd")
```


Influence-function
------------
The DRE is a *regular and asymptotic linear* (RAL) estimator, hence
\[\sqrt{n}(\widehat{\alpha}_{\mathrm{DRE}} - \alpha) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} \phi_{\mathrm{eff}}(Z_{i}) + o_{p}(1)\]
where \(Z_i = (Y_i, A_i, V_i), i=1,\ldots,n\) are the i.i.d. observations
and \(\phi_{\mathrm{eff}}\) is the *influence function*.

The influence function can be extracted using the ``IC`` method
```{r}
head(IC(fit))
```

SessionInfo
============


```{r}
sessionInfo()
```