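For recurrent events data it is often of interest to compute basic descriptive quantities to get some basic understanding of the phenomenon studied. We here demonstrate how one can compute the marginal mean, fit Ghosh-Lin regression models for the marginal mean (possibly with composite outcomes), fit the two-stage recurrent events model, and estimate the probability of exceeding k events.
We also show how to improve the efficiency of the estimator of the recurrent events marginal mean.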
In addition, several tools can be used for simulating recurrent events and bivariate recurrent events data, also with a possible terminating event.
We start by simulating some recurrent events data with two types of events with cumulative hazards $\Lambda_1(t)$ and $\Lambda_2(t)$,
where we consider types 1 and 2, and with the cumulative hazard of the terminal event given by $\Lambda_D(t)$. We let the events be independent, but could also specify a random effects structure to generate dependence.
When simulating data we can impose various random-effects structures to generate dependence:
Dependence=0: The intensities can be independent.
Dependence=1: We can draw one gamma distributed random effect $Z$. Then the intensities are $Z \lambda_1(t)$, $Z \lambda_2(t)$ and $Z \lambda_D(t)$.
Dependence=2: We can draw normally distributed random effects $Z_1, Z_2, Z_d$ where the variance (var.z) and correlation (cor.mat) can be specified. Then the intensities are $\exp(Z_1) \lambda_1(t)$, $\exp(Z_2) \lambda_2(t)$ and $\exp(Z_d) \lambda_D(t)$.
Dependence=3: We can draw gamma distributed random effects $Z_1, Z_2, Z_d$
where the sum-structure can be specified via a matrix cor.mat. We compute
Z̃j = ∑kZkcor.mat(j, k)
for j = 1, 2, 3.
Then the intensities are $\tilde Z_1 \lambda_1(t)$, $\tilde Z_2 \lambda_2(t)$ and $\tilde Z_3 \lambda_D(t)$.
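As a rough illustration of how a shared random effect generates dependence, the sketch below deliberately avoids the package simulators and uses a much simpler setting: Poisson counts over a fixed window, with and without a common gamma frailty of mean 1. The rates lam1 and lam2, the window length tau and the variance var.z are made-up values, not quantities taken from base1cumhaz or base4cumhaz.

```r
set.seed(1)
n     <- 5000
tau   <- 10                 # length of the observation window (assumed)
lam1  <- 0.3; lam2 <- 0.1   # constant rates for the two event types (assumed)
var.z <- 1                  # frailty variance (assumed)

# gamma frailty with mean 1 and variance var.z
Z <- rgamma(n, shape = 1 / var.z, scale = var.z)

# analogue of Dependence=0: the counts share no random effect
N1.ind <- rpois(n, lam1 * tau)
N2.ind <- rpois(n, lam2 * tau)

# analogue of Dependence=1: both rates are multiplied by the same Z
N1.dep <- rpois(n, Z * lam1 * tau)
N2.dep <- rpois(n, Z * lam2 * tau)

cor.ind <- cor(N1.ind, N2.ind)   # close to 0
cor.dep <- cor(N1.dep, N2.dep)   # clearly positive
c(independent = cor.ind, shared.frailty = cor.dep)
```

The shared frailty makes subjects with a large Z experience more events of both types, which is what produces the positive correlation.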
We return to how to run the different set-ups later and start by simulating independent processes.
The key functions are simRecurrent, simRecurrentII and simRecurrentIII, which are all used in the examples below.
In addition we can simulate data from the Ghosh-Lin model, and from models where the marginal rates among survivors are on Cox form;
see the examples below for specific models.
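We here mention two utility functions that are used below: count.history, which adds each subject's event history (such as the number of previous events) to the data, and mets:::showfitsimIII, which is called to compare simulated data with the cumulative hazards used to generate it.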
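We start by estimating the marginal mean $E(N_1(t \wedge D))$ where $D$ is the timing of the terminal event. The marginal mean is the average number of events seen before time $t$.
This is based on rate models for the type 1 events, $E(dN_1(t) | D > t)$, and for the terminal event,
and is defined as $\mu_1(t) = E(N_1^*(t))$, which can be written as
$$\mu_1(t) = \int_0^t S(u) \, dR_1(u),$$
where $S(t) = P(D \geq t)$ and $dR_1(t) = E(dN_1^*(t) | D > t)$.
It can therefore be estimated by combining the Kaplan-Meier estimator $\hat S(t)$ with the Nelson-Aalen estimator
$$\hat R_1(t) = \sum_i \int_0^t \frac{1}{Y_\bullet(s)} \, dN_{1i}(s),$$
where $Y_\bullet(t) = \sum_i Y_i(t)$, such that the estimator is
$$\hat \mu_1(t) = \int_0^t \hat S(u) \, d\hat R_1(u).$$
This estimator was suggested by Cook & Lawless (1997), and developed further in Ghosh & Lin (2000).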
The variance can be estimated based on the asymptotic expansion of $\hat \mu_1(t) - \mu_1(t)$ as a sum of mean-zero processes, as in Ghosh & Lin (2000).
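To make the construction concrete, here is a minimal base R sketch of the point estimate only: a Kaplan-Meier estimate of survival is combined with Nelson-Aalen increments for the recurrent events. The toy data are made up, everyone is assumed to be followed from time 0, and no standard errors are computed; this is not the code used by recurrentMarginal.

```r
# Toy data (made up): follow-up starts at 0 for everyone
end    <- c(5, 8, 3, 10, 7)   # death or censoring time
death  <- c(1, 0, 1, 0, 1)    # 1 = died at 'end', 0 = censored
events <- list(c(1, 4), c(2, 6, 7), 2, c(3, 9), numeric(0))  # event times per subject

ev.all <- unlist(events)
times  <- sort(unique(c(ev.all, end[death == 1])))

atrisk <- sapply(times, function(s) sum(end >= s))               # number at risk at s
dN1    <- sapply(times, function(s) sum(ev.all == s))            # recurrent events at s
dNd    <- sapply(times, function(s) sum(end == s & death == 1))  # deaths at s

S       <- cumprod(1 - dNd / atrisk)   # Kaplan-Meier just after each time
S.minus <- c(1, head(S, -1))           # left limit, estimates P(D >= s)
dR1     <- dN1 / atrisk                # Nelson-Aalen increments
mu1     <- cumsum(S.minus * dR1)       # estimated marginal mean at 'times'

cbind(times, atrisk, S.minus, dR1, mu1)
```

The left limit of the Kaplan-Meier estimate is used because a subject who dies at time s is still at risk for a recurrent event at s.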
We start by generating some data to illustrate the computation of the marginal mean
data(base1cumhaz)
data(base4cumhaz)
data(drcumhaz)
ddr <- drcumhaz
base1 <- base1cumhaz
base4 <- base4cumhaz
rr <- simRecurrent(200,base1,death.cumhaz=ddr)
rr$x <- rnorm(nrow(rr))
rr$strata <- floor((rr$id-0.01)/100)
dlist(rr,.~id| id %in% c(1,7,9))
#> id: 1
#> entry time status rr2 dtime fdeath death start stop rr1 x
#> 1 0.0 451.1 1 1 3291 1 0 0.0 451.1 1 1.5212
#> 201 451.1 2687.9 1 1 3291 1 0 451.1 2687.9 1 0.3290
#> 337 2687.9 3290.8 0 1 3291 1 1 2687.9 3290.8 1 -0.4887
#> strata
#> 1 0
#> 201 0
#> 337 0
#> ------------------------------------------------------------
#> id: 7
#> entry time status rr2 dtime fdeath death start stop rr1 x strata
#> 7 0 658.3 0 1 658.3 1 1 0 658.3 1 -0.04719 0
#> ------------------------------------------------------------
#> id: 9
#> entry time status rr2 dtime fdeath death start stop rr1 x strata
#> 9 0.0 433.5 1 1 505.3 1 0 0.0 433.5 1 -0.3530 0
#> 205 433.5 505.3 0 1 505.3 1 1 433.5 505.3 1 0.7694 0
The status variable keeps track of the recurrent events and their type, and death keeps track of the timing of death.
To compute the marginal mean we simply estimate the two rate functions, for the number of events of interest and for death, using the phreg function (to start without covariates). The estimates are then combined, with standard error computation, in the recurrentMarginal function.
# to fit non-parametric models with just a baseline
xr <- phreg(Surv(entry,time,status)~cluster(id),data=rr)
dr <- phreg(Surv(entry,time,death)~cluster(id),data=rr)
par(mfrow=c(1,3))
plot(dr,se=TRUE)
title(main="death")
plot(xr,se=TRUE)
# robust standard errors
rxr <- robust.phreg(xr,fixbeta=1)
plot(rxr,se=TRUE,robust=TRUE,add=TRUE,col=4)
# marginal mean of expected number of recurrent events
out <- recurrentMarginal(xr,dr)
plot(out,se=TRUE,ylab="marginal mean",col=2)
We can also extract the estimate at different time points:
summary(out,times=c(1000,2000))$pbaseci[[1]]
#> new.time mean se-mean CI-2.5% CI-97.5% strata
#> 259 1000 1.29 0.0989616 1.109913 1.499306 1
#> 363 2000 1.81 0.1381837 1.558450 2.102153 1
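The printed limits are not symmetric around the estimate, which is what one would expect from an interval computed on the log scale. The sketch below checks this reading for the first row, using only the printed numbers; that the interval is built this way is an inference from the output, not a statement about the implementation.

```r
est <- 1.29        # mean at time 1000, as printed above
se  <- 0.0989616   # se-mean at time 1000, as printed above

se.log <- se / est                                  # delta method: se of log(estimate)
ci     <- est * exp(c(-1, 1) * qnorm(0.975) * se.log)
ci
```

The result agrees closely with the printed CI-2.5% and CI-97.5% values; small differences are expected because the printed mean is rounded.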
The marginal mean can also be estimated in a stratified case:
xr <- phreg(Surv(entry,time,status)~strata(strata)+cluster(id),data=rr)
dr <- phreg(Surv(entry,time,death)~strata(strata)+cluster(id),data=rr)
par(mfrow=c(1,3))
plot(dr,se=TRUE)
title(main="death")
plot(xr,se=TRUE)
rxr <- robust.phreg(xr,fixbeta=1)
plot(rxr,se=TRUE,robust=TRUE,add=TRUE,col=1:2)
out <- recurrentMarginal(xr,dr)
plot(out,se=TRUE,ylab="marginal mean",col=1:2)
Further, if we adjust for covariates in the two rate models we can still predict the marginal mean. What can be plotted is the baseline marginal mean, that is, the marginal mean for covariates equal to 0 in both models. Predictions for specific covariates can also be obtained with the recmarg function (recurrent marginal mean, used solely for predictions without standard error computation).
# cox case
xr <- phreg(Surv(entry,time,status)~x+cluster(id),data=rr)
dr <- phreg(Surv(entry,time,death)~x+cluster(id),data=rr)
par(mfrow=c(1,3))
plot(dr,se=TRUE)
title(main="death")
plot(xr,se=TRUE)
rxr <- robust.phreg(xr)
plot(rxr,se=TRUE,robust=TRUE,add=TRUE,col=1:2)
out <- recurrentMarginal(xr,dr)
plot(out,se=TRUE,ylab="marginal mean",col=1:2)
# predictions without se's
outX <- recmarg(xr,dr,Xr=1,Xd=1)
bplot(outX,add=TRUE,col=3)
Here we simulate multiple event types and two causes of death:
rr <- simRecurrentIII(100,list(base1,base1,base4),death.cumhaz=list(ddr,base4),cens=3/5000,dependence=0)
dtable(rr,~status+death,level=2)
#>
#> status
#> death 0 1 2 3
#> 0 38 113 119 8
#> 1 51 0 0 0
#> 2 11 0 0 0
mets:::showfitsimIII(rr,list(base1,base1,base4),list(ddr,base4))
We now simulate some data where there is strong heterogeneity, such that we can improve the efficiency for censored survival data. The augmentation is a regression on the history for each subject. The regression terms can be given by name, Nt, Nt2 (Nt squared), expNt (exp(-Nt)), NtexpNt (Nt*exp(-Nt)), or specified directly in the formula. This was developed in Cortese and Scheike (2022).
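To see what these history terms look like, the following base R sketch builds the corresponding regressors for a few hypothetical values of Nt (the number of previous events). It only illustrates the terms named above; how recurrentMarginalAIPCW uses them internally is not shown.

```r
hist <- data.frame(Nt = 0:4)   # hypothetical numbers of previous events

# the same terms as in the directly specified augmentation formula
aug <- model.matrix(~ Nt + I(Nt^2) + I(exp(-Nt)) + I(Nt * exp(-Nt)), data = hist)
round(aug, 3)
```

The columns exp(-Nt) and Nt*exp(-Nt) are bounded, so they let the regression react to the first few events without being dominated by subjects with very many events.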
rr <- simRecurrentII(200,base1,base4,death.cumhaz=ddr,cens=3/5000,dependence=4,var.z=1)
rr <- count.history(rr)
rr <- transform(rr,statusD=status)
rr <- dtransform(rr,statusD=3,death==1)
dtable(rr,~statusD+status+death,level=2,response=1)
#>
#> statusD
#> status 0 1 2 3
#> 0 95 0 0 105
#> 1 0 287 0 0
#> 2 0 0 32 0
#>
#> statusD
#> death 0 1 2 3
#> 0 95 287 32 0
#> 1 0 0 0 105
xr <- phreg(Surv(start,stop,status==1)~cluster(id),data=rr)
dr <- phreg(Surv(start,stop,death)~cluster(id),data=rr)
# marginal mean of expected number of recurrent events
out <- recurrentMarginal(xr,dr)
times <- 500*(1:10)
recEFF1 <- recurrentMarginalAIPCW(Event(start,stop,statusD)~cluster(id),data=rr,times=times,cens.code=0,
death.code=3,cause=1,augment.model=~Nt)
with( recEFF1, cbind(times,muP,semuP,muPAt,semuPAt,semuPAt/semuP))
#> times muP semuP muPAt semuPAt
#> [1,] 500 0.7883893 0.08775967 0.8057351 0.08751332 0.9971929
#> [2,] 1000 1.1244065 0.13070316 1.1668609 0.12959723 0.9915386
#> [3,] 1500 1.5822884 0.19192917 1.6166893 0.18909020 0.9852082
#> [4,] 2000 2.1152757 0.28820098 2.1048191 0.26294823 0.9123780
#> [5,] 2500 2.6582616 0.40030720 2.4874446 0.31938256 0.7978437
#> [6,] 3000 3.0780951 0.52159370 2.7085295 0.36263192 0.6952383
#> [7,] 3500 3.4874035 0.57804925 2.9020299 0.43312174 0.7492817
#> [8,] 4000 3.7493609 0.62909428 3.0480868 0.48090357 0.7644380
#> [9,] 4500 3.7493609 0.62909428 3.0480868 0.48090357 0.7644380
#> [10,] 5000 3.7493609 0.62909428 3.0480868 0.48090357 0.7644380
times <- 500*(1:10)
###recEFF14 <- recurrentMarginalAIPCW(Event(start,stop,statusD)~cluster(id),data=rr,times=times,cens.code=0,
###death.code=3,cause=1,augment.model=~Nt+Nt2+expNt+NtexpNt)
###with(recEFF14,cbind(times,muP,semuP,muPAt,semuPAt,semuPAt/semuP))
recEFF14 <- recurrentMarginalAIPCW(Event(start,stop,statusD)~cluster(id),data=rr,times=times,cens.code=0,
death.code=3,cause=1,augment.model=~Nt+I(Nt^2)+I(exp(-Nt))+ I( Nt*exp(-Nt)))
with(recEFF14,cbind(times,muP,semuP,muPAt,semuPAt,semuPAt/semuP))
#> times muP semuP muPAt semuPAt
#> [1,] 500 0.7883893 0.08775967 0.7905297 0.08726473 0.9943603
#> [2,] 1000 1.1244065 0.13070316 1.1395816 0.12910248 0.9877533
#> [3,] 1500 1.5822884 0.19192917 1.5732924 0.18709270 0.9748008
#> [4,] 2000 2.1152757 0.28820098 1.9996717 0.25896134 0.8985443
#> [5,] 2500 2.6582616 0.40030720 2.3825506 0.30941522 0.7729444
#> [6,] 3000 3.0780951 0.52159370 2.4772598 0.34057804 0.6529566
#> [7,] 3500 3.4874035 0.57804925 2.5423090 0.39879811 0.6899033
#> [8,] 4000 3.7493609 0.62909428 2.5658396 0.42374850 0.6735850
#> [9,] 4500 3.7493609 0.62909428 2.5658396 0.42374850 0.6735850
#> [10,] 5000 3.7493609 0.62909428 2.5658396 0.42374850 0.6735850
bplot(out,se=TRUE,ylab="marginal mean",col=2)
k <- 1
for (t in times) {
ci1 <- c(recEFF1$muPAt[k]-1.96*recEFF1$semuPAt[k],
recEFF1$muPAt[k]+1.96*recEFF1$semuPAt[k])
ci2 <- c(recEFF1$muP[k]-1.96*recEFF1$semuP[k],
recEFF1$muP[k]+1.96*recEFF1$semuP[k])
lines(rep(t,2)-2,ci2,col=2,lty=1,lwd=2)
lines(rep(t,2)+2,ci1,col=1,lty=1,lwd=2)
k <- k+1
}
legend("bottomright",c("Eff-pred","standard"),lty=1,col=c(1,2))
In the case where covariates might be important but we are still interested in the marginal mean, we can also augment with respect to these covariates:
n <- 200
X <- matrix(rbinom(n*2,1,0.5),n,2)
colnames(X) <- paste("X",1:2,sep="")
###
r1 <- exp( X %*% c(0.3,-0.3))
rd <- exp( X %*% c(0.3,-0.3))
rc <- exp( X %*% c(0,0))
fz <- NULL
rr <- mets:::simGLcox(n,base1,ddr,var.z=0,r1=r1,rd=rd,rc=rc,fz,model="twostage",cens=3/5000)
rr <- cbind(rr,X[rr$id+1,])
dtable(rr,~statusD+status+death,level=2,response=1)
#>
#> statusD
#> status 0 1 3
#> 0 86 0 114
#> 1 0 584 0
#>
#> statusD
#> death 0 1 3
#> 0 86 396 0
#> 1 0 188 114
times <- seq(500,5000,by=500)
recEFF1x <- recurrentMarginalAIPCW(Event(start,stop,statusD)~cluster(id),data=rr,times=times,
cens.code=0,death.code=3,cause=1,augment.model=~X1+X2)
with(recEFF1x, cbind(muP,muPA,muPAt,semuP,semuPA,semuPAt,semuPAt/semuP))
#> muP muPA muPAt semuP semuPA semuPAt
#> [1,] 1.048797 1.041365 1.037753 0.08722635 0.08710396 0.08701823 0.9976140
#> [2,] 1.821124 1.803973 1.785595 0.15935610 0.15840140 0.15825132 0.9930672
#> [3,] 2.507231 2.459431 2.466326 0.22858120 0.22557601 0.22497497 0.9842234
#> [4,] 3.402783 3.306165 3.305969 0.33359534 0.32597388 0.32473520 0.9734405
#> [5,] 4.103953 4.067034 3.988912 0.42935713 0.42480571 0.42197375 0.9828036
#> [6,] 4.981871 4.960978 4.880440 0.61935904 0.61169264 0.60738655 0.9806696
#> [7,] 5.822659 5.903066 5.755781 0.85832570 0.83711396 0.82758340 0.9641834
#> [8,] 6.977489 7.043288 6.890696 1.17681304 1.14359505 1.10953199 0.9428277
#> [9,] 7.542910 7.616148 7.327451 1.25610973 1.23432570 1.20129472 0.9563613
#> [10,] 8.056929 7.993268 7.672123 1.33485953 1.32500121 1.29382677 0.9692606
xr <- phreg(Surv(start,stop,status==1)~cluster(id),data=rr)
dr <- phreg(Surv(start,stop,death)~cluster(id),data=rr)
out <- recurrentMarginal(xr,dr)
mets::summaryTimeobject(out$times,out$mu,times=times,se.mu=out$se.mu)
#> times mean se-mean CI-2.5% CI-97.5%
#> 1 500 0.7221083 0.04977059 0.6308601 0.8265547
#> 2 1000 0.9838696 0.08493678 0.8307159 1.1652591
#> 3 1500 1.1002233 0.10835948 0.9070801 1.3344923
#> 4 2000 1.1682893 0.13539495 0.9308967 1.4662207
#> 5 2500 1.2043173 0.15523610 0.9354473 1.5504671
#> 6 3000 1.2315813 0.17433185 0.9331947 1.6253764
#> 7 3500 1.2551414 0.19303517 0.9284943 1.6967040
#> 8 4000 1.2875013 0.22042825 0.9204792 1.8008658
#> 9 4500 1.3033452 0.23379433 0.9169970 1.8524691
#> 10 5000 1.3177487 0.24628963 0.9135631 1.9007572
One can also do regression modelling, using the model $E(N_1^*(t) | X) = \Lambda_0(t) \exp(X^T \beta)$, for which Ghosh and Lin suggested IPCW score equations that are implemented in the recreg function of mets.
First we generate data from a Ghosh-Lin model with $\beta = (0.3, -0.3)$, the values used for r1 in the code, and the baseline given by base1. This is done under the assumption that the death rate given covariates is on Cox form with baseline ddr:
n <- 100
X <- matrix(rbinom(n*2,1,0.5),n,2)
colnames(X) <- paste("X",1:2,sep="")
###
r1 <- exp( X %*% c(0.3,-0.3))
rd <- exp( X %*% c(0.3,-0.3))
rc <- exp( X %*% c(0,0))
fz <- NULL
rr <- mets:::simGLcox(n,base1,ddr,var.z=1,r1=r1,rd=rd,rc=rc,fz,cens=1/5000,type=2)
rr <- cbind(rr,X[rr$id+1,])
out <- recreg(Event(start,stop,statusD)~X1+X2+cluster(id),data=rr,cause=1,death.code=3,cens.code=0)
outs <- recreg(Event(start,stop,statusD)~X1+X2+cluster(id),data=rr,cause=1,death.code=3,cens.code=0,
cens.model=~strata(X1,X2))
summary(out)$coef
#> Estimate S.E. dU^-1/2 P-value
#> X1 0.3382084 0.4404489 0.09219668 0.4425633
#> X2 -0.5158296 0.4534145 0.08860515 0.2552643
summary(outs)$coef
#> Estimate S.E. dU^-1/2 P-value
#> X1 0.1141424 0.3903867 0.09359494 0.7699939
#> X2 -0.5951611 0.4079918 0.08906403 0.1446319
## checking baseline
par(mfrow=c(1,1))
plot(out)
plot(outs,add=TRUE,col=2)
lines(scalecumhaz(base1,1),col=3,lwd=2)
We note that for the extended censoring model we gain a little efficiency and that the estimates are close to the true values.
It is also possible to do IPCW regression at a fixed time point:
outipcw <- recregIPCW(Event(start,stop,statusD)~X1+X2+cluster(id),data=rr,cause=1,death.code=3,
cens.code=0,times=2000)
outipcws <- recregIPCW(Event(start,stop,statusD)~X1+X2+cluster(id),data=rr,cause=1,death.code=3,
cens.code=0,times=2000,cens.model=~strata(X1,X2))
summary(outipcw)$coef
#> Estimate Std.Err 2.5% 97.5% P-value
#> (Intercept) 1.59438391 0.2243043 1.154756 2.0340123 1.176266e-12
#> X1 0.01778425 0.3431120 -0.654703 0.6902715 9.586624e-01
#> X2 -0.58372246 0.3499427 -1.269598 0.1021527 9.530553e-02
summary(outipcws)$coef
#> Estimate Std.Err 2.5% 97.5% P-value
#> (Intercept) 1.50914396 0.2244572 1.0692159 1.9490721 1.773794e-11
#> X1 0.09860899 0.3321168 -0.5523279 0.7495459 7.665348e-01
#> X2 -0.47761299 0.3330131 -1.1303067 0.1750808 1.515104e-01
We can also consider the Mao-Lin type composite outcome, where we count, for example, both the cause 1 events and the deaths:
out <- recreg(Event(start,stop,statusD)~X1+X2+cluster(id),data=rr,cause=c(1,3),
death.code=3,cens.code=0)
summary(out)$coef
#> Estimate S.E. dU^-1/2 P-value
#> X1 0.3043595 0.3823177 0.08592529 0.4259794
#> X2 -0.4742779 0.3962738 0.08286462 0.2313675
We also demonstrate that this can be done with competing causes of death (changing some of the cause 3 deaths to cause 4) and with weights w1, w2 that follow the causes, here 1 and 3.
rr$binf <- rbinom(nrow(rr),1,0.5)
rr$statusDC <- rr$statusD
rr <- dtransform(rr,statusDC=4, statusD==3 & binf==0)
rr$weight <- 1
rr <- dtransform(rr,weight=2,statusDC==3)
outC <- recreg(Event(start,stop,statusDC)~X1+X2+cluster(id),data=rr,cause=c(1,3),
death.code=c(3,4),cens.code=0)
summary(outC)$coef
#> Estimate S.E. dU^-1/2 P-value
#> X1 0.3040292 0.4085561 0.08891149 0.4567825
#> X2 -0.5025987 0.4253220 0.08588151 0.2373287
outCW <- recreg(Event(start,stop,statusDC)~X1+X2+cluster(id),data=rr,cause=c(1,3),
death.code=c(3,4),cens.code=0,wcomp=c(1,2))
summary(outCW)$coef
#> Estimate S.E. dU^-1/2 P-value
#> X1 0.2740041 0.3817591 0.08597675 0.4729172
#> X2 -0.4908230 0.4009705 0.08339607 0.2209192
bplot(out,ylab="Mean composite")
bplot(outC,col=2,add=TRUE)
bplot(outCW,col=3,add=TRUE)
Predictions and standard errors can be computed via the iid decompositions of the baseline and the regression coefficients. We illustrate this for the standard Ghosh-Lin model; it requires that the model is fitted with the option cox.prep=TRUE.
out <- recreg(Event(start,stop,statusD)~X1+X2+cluster(id),data=rr,cause=1,death.code=3,cens.code=0,
cox.prep=TRUE)
summary(out)
#>
#> n events
#> 626 526
#>
#> 100 clusters
#> coeffients:
#> Estimate S.E. dU^-1/2 P-value
#> X1 0.338208 0.440449 0.092197 0.4426
#> X2 -0.515830 0.453415 0.088605 0.2553
#>
#> exp(coeffients):
#> Estimate 2.5% 97.5%
#> X1 1.40243 0.59152 3.3250
#> X2 0.59701 0.24549 1.4519
baseiid <- IIDbaseline.cifreg(out,time=3000)
GLprediid(baseiid,rr[1:5,])
#> pred se-log lower upper
#> [1,] 7.491016 0.3329986 3.900244 14.38764
#> [2,] 7.491016 0.3329986 3.900244 14.38764
#> [3,] 7.491016 0.3329986 3.900244 14.38764
#> [4,] 7.491016 0.3329986 3.900244 14.38764
#> [5,] 7.491016 0.3329986 3.900244 14.38764
The Ghosh-Lin model can be made more efficient by the regression augmentation method: the augmentation is computed first, and the augmented estimator is then computed in a second step (Cortese and Scheike (2023)):
outA <- recreg(Event(start,stop,statusD)~X1+X2+cluster(id),data=rr,cause=1,death.code=3,
cens.code=0,augment.model=~Nt+X1+X2)
summary(outA)$coef
#> Estimate S.E. dU^-1/2 P-value
#> X1 0.3561797 0.4059643 0.09238904 0.3802872
#> X2 -0.6573313 0.4122291 0.08988930 0.1108067
We note that the simple augmentation improves the standard errors as expected. The data were generated so that the process does not depend on the previous number of events, so it would suffice to augment only with the covariates.
Above we simulated data with a terminal event on Cox form and recurrent events satisfying the Ghosh-Lin model.
Now we fit the two-stage model (recreg must be called with cox.prep=TRUE):
out <- recreg(Event(start,stop,statusD)~X1+X2+cluster(id),data=rr,
cause=1,death.code=3,cens.code=0,cox.prep=TRUE)
outs <- phreg(Event(start,stop,statusD==3)~X1+X2+cluster(id),data=rr)
tsout <- twostageREC(outs,out,data=rr)
summary(tsout)
#> Ghosh-Lin(recurrent)-Cox(terminal) mean model
#>
#> 100 clusters
#> coeffients:
#> Estimate Std.Err 2.5% 97.5% P-value
#> dependence1 0.70754 0.11720 0.47783 0.93725 0
#>
#> var,shared:
#> Estimate Std.Err 2.5% 97.5% P-value
#> dependence1 0.70754 0.11720 0.47783 0.93725 0
Standard errors are computed assuming that the parameters of out and outs are both known, and are therefore probably a bit too small. We could do a bootstrap to get more reliable standard errors.
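One way such a bootstrap could be organised is to resample whole subjects, so that all rows of a subject stay together. The helper cluster_bootstrap below is hypothetical (it is not part of mets) and is shown on made-up data with a simple mean as the statistic; for the two-stage model the statistic would have to refit all steps on each resampled data set.

```r
# Hypothetical helper: resample clusters (subjects) with replacement and
# re-evaluate 'statistic' on each bootstrap data set
cluster_bootstrap <- function(data, id, statistic, B = 200) {
  rows <- split(seq_len(nrow(data)), data[[id]])   # row indices per subject
  replicate(B, {
    draw <- sample(names(rows), length(rows), replace = TRUE)
    idx  <- rows[draw]
    boot <- data[unlist(idx, use.names = FALSE), , drop = FALSE]
    # relabel so that a subject drawn twice counts as two distinct clusters
    boot[[id]] <- rep(seq_along(idx), lengths(idx))
    statistic(boot)
  })
}

# Toy usage with made-up clustered data
set.seed(2)
toy   <- data.frame(id = rep(1:30, each = 3))
toy$y <- rnorm(30)[toy$id] + rnorm(90, sd = 0.5)
boot.est <- cluster_bootstrap(toy, "id", function(d) mean(d$y), B = 200)
sd(boot.est)   # bootstrap standard error of the mean
```

Relabelling the subject identifier matters because functions that use cluster(id) would otherwise merge repeated draws of the same subject into one cluster.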
The function simGLcox can simulate data where the recurrent process has mean on Ghosh-Lin form. The key is that, given covariates $X$ and a possible frailty $Z$, the mean is $E(N_1^*(t) | X, Z) = \int_0^t S(u | X, Z) \, dR(u | X, Z)$. Therefore choosing $dR(t | X, Z) = Z \exp(X^T \beta) \, d\Lambda_0(t) / S(t | X, Z)$, with $E(Z) = 1$, leads to a Ghosh-Lin model. We can choose the survival model to have Cox form among survivors by the option model="twostage"; otherwise model="frailty" uses the survival model with rate $Z \lambda_d(t) r_d$. The frailty $Z$ is gamma distributed with a variance that can be specified. The simulations are based on a piecewise-linear approximation of the hazard functions for $S(t | X, Z)$ and $R(t | X, Z)$.
n <- 100
X <- matrix(rbinom(n*2,1,0.5),n,2)
colnames(X) <- paste("X",1:2,sep="")
###
r1 <- exp( X %*% c(0.3,-0.3))
rd <- exp( X %*% c(0.3,-0.3))
rc <- exp( X %*% c(0,0))
rr <- mets:::simGLcox(n,base1,ddr,var.z=0,r1=r1,rd=rd,rc=rc,model="twostage",cens=3/5000)
rr <- cbind(rr,X[rr$id+1,])
We can also simulate from models where both the terminal event and the rate among survivors are on Cox form.
Underlying these models we have a shared frailty model:
rr <- mets:::simGLcox(100,base1,ddr,var.z=1,r1=r1,rd=rd,rc=rc,type=3,cens=3/5000)
rr <- cbind(rr,X[rr$id+1,])
margsurv <- phreg(Surv(start,stop,statusD==3)~X1+X2+cluster(id),rr)
recurrent <- phreg(Surv(start,stop,statusD==1)~X1+X2+cluster(id),rr)
estimate(margsurv)
#> Estimate Std.Err 2.5% 97.5% P-value
#> X1 0.5954 0.2680 0.07017 1.1207 0.0263
#> X2 -0.3676 0.2593 -0.87589 0.1407 0.1563
estimate(recurrent)
#> Estimate Std.Err 2.5% 97.5% P-value
#> X1 0.9617 0.2707 0.4311 1.4923 0.0003816
#> X2 -0.8636 0.2535 -1.3604 -0.3668 0.0006568
par(mfrow=c(1,2));
plot(margsurv); lines(ddr,col=3);
plot(recurrent); lines(base1,col=3)
We can simulate data with underlying dependence from the two-stage model (simGLcox) or using the simRecurrent random effects models, for Cox-Cox or Ghosh-Lin-Cox models.
Here the marginals are on Cox-Cox form or on Ghosh-Lin-Cox form.
The simulation draws covariates from data and simulates data that has the given marginals.
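The mean is a useful summary measure, but it is also easy and useful to look at other simple summary measures such as the probability of exceeding $k$ events, $P(N_1^*(t) \geq k)$. With $T_k = \inf\{t : N_1^*(t) = k\}$ denoting the time of the $k$th event, this is the probability that $T_k$ occurs before $t$ and before $D$,
which is thus equivalent to a cumulative incidence of $T_k$ with $D$ as a competing event. We denote the estimate of this cumulative incidence as $\hat F_k(t)$.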
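We note also that $N_1^*(t)^2$ can be written as
$$N_1^*(t)^2 = \sum_{k=0}^K \int_0^t I(N_1^*(s-) = k) f(k) \, dN_1^*(s)$$
with $f(k) = (k+1)^2 - k^2$ and $K$ the maximal number of events, such that its mean can be written as
$$\sum_{k=0}^K \int_0^t S(s) f(k) P(N_1^*(s-) = k | D \geq s) E(dN_1^*(s) | N_1^*(s-) = k, D > s)$$
and estimated by
$$\tilde \mu_{1,2}(t) = \sum_{i=1}^n \int_0^t \hat S(s) f(N_{i1}(s-)) \frac{1}{Y_\bullet(s)} \, dN_{i1}(s).$$
That is very similar to the "product-limit" estimator for $E((N_1^*(t))^2)$,
$$\hat \mu_{1,2}(t) = \sum_{k=0}^K k^2 (\hat F_k(t) - \hat F_{k+1}(t)).$$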
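We use the estimator of the probability of exceeding "k" events based on the fact that $I(N_1^*(t) \geq k)$ is equivalent to
$$\int_0^t I(N_1^*(s-) = k-1) \, dN_1^*(s),$$
suggesting that its mean can be computed as
$$\int_0^t S(s) P(N_1^*(s-) = k-1 | D \geq s) E(dN_1^*(s) | N_1^*(s-) = k-1, D > s)$$
and estimated by
$$\tilde F_k(t) = \sum_{i=1}^n \int_0^t \hat S(s) I(N_{i1}(s-) = k-1) \frac{1}{Y_\bullet(s)} \, dN_{i1}(s).$$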
To compute these estimators we use the prob.exceed.recurrent function
rr <- simRecurrentII(200,base1,base4,death.cumhaz=ddr,cens=3/5000,dependence=4,var.z=1)
rr <- transform(rr,statusD=status)
rr <- dtransform(rr,statusD=3,death==1)
rr <- count.history(rr)
dtable(rr,~statusD)
#>
#> statusD
#> 0 1 2 3
#> 82 254 19 118
oo <- prob.exceed.recurrent(Event(entry,time,statusD)~cluster(id),rr,cause=1,death.code=3)
plot(oo,types=1:5)
We can also look at the mean and variance based on the estimators just described
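As a small numerical check of how a mean and a variance follow from exceedance probabilities, the sketch below uses the identities E(N) = sum over k of P(N >= k) and E(N^2) = sum over k of k^2 (P(N >= k) - P(N >= k+1)). The vector Fk is a stand-in computed from a Poisson distribution with an assumed mean lambda; it is not output from prob.exceed.recurrent, whose return structure is not assumed here.

```r
lambda <- 2      # assumed Poisson mean, for illustration only
K      <- 60     # truncation point for the number of events
k      <- 1:K

Fk  <- ppois(k - 1, lambda, lower.tail = FALSE)   # P(N >= k)
Fk1 <- c(Fk[-1], 0)                               # P(N >= k + 1), set to 0 beyond K

mean.N <- sum(Fk)                    # E(N)
m2.N   <- sum(k^2 * (Fk - Fk1))      # E(N^2)
var.N  <- m2.N - mean.N^2            # Var(N)
c(mean = mean.N, variance = var.N)
```

For a Poisson count both numbers should equal lambda, which gives a quick sanity check of the two sums.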
We now generate recurrent events with two types of events. We start by generating data as before where all events are independent.
rr <- simRecurrentII(200,base1,cumhaz2=base4,death.cumhaz=ddr)
rr <- count.history(rr)
dtable(rr,~death+status)
#>
#> status 0 1 2
#> death
#> 0 16 562 93
#> 1 184 0 0
Based on this we can also estimate the joint distribution function, that is, the probability $P(N_1(t) \geq k_1, N_2(t) \geq k_2)$.
Using the normally distributed random effects we consider 4 different settings. We have variance 0.5 for all random effects and change the correlation. We let the correlation between the random effects associated with $N_1$ and $N_2$ be denoted $\rho_{12}$, let the correlation between the random effects associated with $N_j$ and the terminal event $D$ be denoted $\rho_{j3}$, and organize all correlations in a vector $\rho = (\rho_{12}, \rho_{13}, \rho_{23})$.
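The mapping from the vector of correlations to a 3 x 3 correlation matrix can be sketched as follows. The helper rho_to_cormat is hypothetical and only for illustration; the draw via a Cholesky factor shows one standard way to generate normal random effects with the requested variances and correlations, which is not necessarily how simRecurrentII does it.

```r
# Hypothetical helper: rho = (rho12, rho13, rho23) -> correlation matrix
rho_to_cormat <- function(rho) {
  stopifnot(length(rho) == 3)
  m <- diag(3)
  m[1, 2] <- m[2, 1] <- rho[1]   # N1 and N2
  m[1, 3] <- m[3, 1] <- rho[2]   # N1 and D
  m[2, 3] <- m[3, 2] <- rho[3]   # N2 and D
  m
}

cor.mat <- rho_to_cormat(c(-0.4, 0.5, 0.5))
# a valid correlation matrix must be positive definite
min(eigen(cor.mat, symmetric = TRUE)$values) > 0

# draw normal random effects with variance 0.5 and this correlation
var.z <- c(0.5, 0.5, 0.5)
Sigma <- diag(sqrt(var.z)) %*% cor.mat %*% diag(sqrt(var.z))
set.seed(3)
Z <- matrix(rnorm(2000 * 3), ncol = 3) %*% chol(Sigma)
round(cor(Z), 2)
```

Checking positive definiteness first is useful because not every combination of three correlations corresponds to a valid joint distribution.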
data(base1cumhaz)
data(base4cumhaz)
data(drcumhaz)
dr <- drcumhaz
base1 <- base1cumhaz
base4 <- base4cumhaz
par(mfrow=c(1,3))
var.z <- c(0.5,0.5,0.5)
# independent random effects (identity correlation matrix)
cor.mat <- corM <- rbind(c(1.0, 0.0, 0.0), c(0.0, 1.0, 0.0), c(0.0, 0.0, 1.0))
rr <- simRecurrentII(200,base1,base4,death.cumhaz=dr,var.z=var.z,cor.mat=cor.mat,dependence=2)
rr <- count.history(rr,types=1:2)
### cor(attr(rr,"z"))
### coo <- covarianceRecurrent(rr,1,2,status="status",start="entry",stop="time")
### plot(coo,main ="Scenario I")
var.z <- c(0.5,0.5,0.5)
# death related to both causes in same way
cor.mat <- corM <- rbind(c(1.0, 0.0, 0.5), c(0.0, 1.0, 0.5), c(0.5, 0.5, 1.0))
rr <- simRecurrentII(200,base1,base4,death.cumhaz=dr,var.z=var.z,cor.mat=cor.mat,dependence=2)
rr <- count.history(rr,types=1:2)
### coo <- covarianceRecurrent(rr,1,2,status="status",start="entry",stop="time")
### par(mfrow=c(1,3))
### plot(coo,main ="Scenario II")
var.z <- c(0.5,0.5,0.5)
# positive dependence for N1 and N2 all related in same way
cor.mat <- corM <- rbind(c(1.0, 0.5, 0.5), c(0.5, 1.0, 0.5), c(0.5, 0.5, 1.0))
rr <- simRecurrentII(200,base1,base4,death.cumhaz=dr,var.z=var.z,cor.mat=cor.mat,dependence=2)
rr <- count.history(rr,types=1:2)
### coo <- covarianceRecurrent(rr,1,2,status="status",start="entry",stop="time")
### par(mfrow=c(1,3))
### plot(coo,main="Scenario III")
var.z <- c(0.5,0.5,0.5)
# negative dependence for N1 and N2 all related in same way
cor.mat <- corM <- rbind(c(1.0, -0.4, 0.5), c(-0.4, 1.0, 0.5), c(0.5, 0.5, 1.0))
rr <- simRecurrentII(200,base1,base4,death.cumhaz=dr,var.z=var.z,cor.mat=cor.mat,dependence=2)
rr <- count.history(rr,types=1:2)
### coo <- covarianceRecurrent(rr,1,2,status="status",start="entry",stop="time")
### par(mfrow=c(1,3))
### plot(coo,main="Scenario IV")
sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.2 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] splines stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] ggplot2_3.5.1 cowplot_1.1.3 timereg_2.0.6 survival_3.8-3 mets_1.3.5
#> [6] rmarkdown_2.29
#>
#> loaded via a namespace (and not attached):
#> [1] sass_0.4.9 future_1.34.0 lattice_0.22-6
#> [4] listenv_0.9.1 digest_0.6.37 magrittr_2.0.3
#> [7] evaluate_1.0.3 grid_4.4.2 mvtnorm_1.3-3
#> [10] fastmap_1.2.0 jsonlite_1.9.0 Matrix_1.7-2
#> [13] scales_1.3.0 isoband_0.2.7 codetools_0.2-20
#> [16] numDeriv_2016.8-1.1 jquerylib_0.1.4 lava_1.8.1
#> [19] cli_3.6.4 rlang_1.1.5 parallelly_1.42.0
#> [22] future.apply_1.11.3 munsell_0.5.1 withr_3.0.2
#> [25] cachem_1.1.0 yaml_2.3.10 tools_4.4.2
#> [28] parallel_4.4.2 ucminf_1.2.2 colorspace_2.1-1
#> [31] globals_0.16.3 buildtools_1.0.0 vctrs_0.6.5
#> [34] R6_2.6.1 lifecycle_1.0.4 MASS_7.3-64
#> [37] pkgconfig_2.0.3 bslib_0.9.0 pillar_1.10.1
#> [40] gtable_0.3.6 glue_1.8.0 Rcpp_1.0.14
#> [43] xfun_0.51 tibble_3.2.1 sys_3.4.3
#> [46] knitr_1.49 farver_2.1.2 htmltools_0.5.8.1
#> [49] labeling_0.4.3 maketools_1.3.2 compiler_4.4.2