| Title: | Analysis of Multivariate Event Times |
|---|---|
| Description: | Implementation of various statistical models for multivariate event history data <doi:10.1007/s10985-013-9244-x>. Including multivariate cumulative incidence models <doi:10.1002/sim.6016>, and bivariate random effects probit models (Liability models) <doi:10.1016/j.csda.2015.01.014>. Modern methods for survival analysis, including regression modelling (Cox, Fine-Gray, Ghosh-Lin, Binomial regression) with fast computation of influence functions. |
| Authors: | Klaus K. Holst [aut, cre], Thomas Scheike [aut] |
| Maintainer: | Klaus K. Holst <[email protected]> |
| License: | Apache License (== 2.0) |
| Version: | 1.3.11 |
| Built: | 2026-06-02 14:56:44 UTC |
| Source: | https://github.com/kkholst/mets |
Fits a fast Lin-Ying additive hazards model with a possibly stratified baseline. Robust variance is the default variance estimate in the summary.
aalenMets(formula, data = data, no.baseline = FALSE, ...)aalenMets(formula, data = data, no.baseline = FALSE, ...)
formula |
Formula with a 'Surv' outcome (similar to |
data |
Data frame. |
no.baseline |
Logical; if |
... |
Additional arguments passed to |
Influence functions (IID) follow the numerical order of the given cluster variable.
Ordering by $id aligns the IID terms with the dataset order.
An object of class "aalenMets" (extends "phreg") containing:
coef |
Estimated coefficients. |
var |
Robust variance-covariance matrix. |
iid |
Influence functions. |
intZHZ |
Integrated ZHZ matrix. |
gamma |
Coefficient estimates. |
Thomas Scheike
data(bmt) bmt$time <- bmt$time + runif(408) * 0.001 out <- aalenMets(Surv(time, cause == 1) ~ tcell + platelet + age, data = bmt) summary(out) ## Comparison with timereg::aalen ## out2 <- timereg::aalen( ## Surv(time, cause == 1) ~ const(tcell) + const(platelet) + const(age), ## data = bmt ## ) ## summary(out2)data(bmt) bmt$time <- bmt$time + runif(408) * 0.001 out <- aalenMets(Surv(time, cause == 1) ~ tcell + platelet + age, data = bmt) summary(out) ## Comparison with timereg::aalen ## out2 <- timereg::aalen( ## Surv(time, cause == 1) ~ const(tcell) + const(platelet) + const(age), ## data = bmt ## ) ## summary(out2)
Data from speff2trial
Randomized study
Hammer et al. 1996, speff2trial package.
data(ACTG175)data(ACTG175)
Estimates the bivariate cumulative incidence function (concordance) for paired
data (e.g., twins, family members) in the presence of competing risks. The function
handles both the IPCW (Inverse Probability of Censoring Weighting) estimator and
the Aalen-Johansen estimator (via prodlim).
bicomprisk( formula, data, cause = c(1, 1), cens = 0, causes, indiv, strata = NULL, id, num, max.clust = 1000, marg = NULL, se.clusters = NULL, wname = NULL, prodlim = FALSE, messages = TRUE, model, return.data = 0, uniform = 0, conservative = 1, resample.iid = 1, ... )bicomprisk( formula, data, cause = c(1, 1), cens = 0, causes, indiv, strata = NULL, id, num, max.clust = 1000, marg = NULL, se.clusters = NULL, wname = NULL, prodlim = FALSE, messages = TRUE, model, return.data = 0, uniform = 0, conservative = 1, resample.iid = 1, ... )
formula |
Formula with an |
data |
Data frame containing the variables. |
cause |
Vector of cause codes for which to estimate the bivariate cumulative incidence
(default |
cens |
Censoring code (default 0). |
causes |
Vector of all possible causes (optional, inferred from data if missing). |
indiv |
Variable indicating individual within a pair (optional, inferred from |
strata |
Variable for stratification (optional, can be specified in formula). |
id |
Clustering variable (pair ID). Required. |
num |
Variable for numbering individuals within pairs (optional, auto-generated if missing). |
max.clust |
Maximum number of clusters to use for IID decomposition in |
marg |
Optional marginal cumulative incidence object (from |
se.clusters |
Vector of cluster indices or column name in |
wname |
Name of an additional weight variable for paired competing risks data. |
prodlim |
Logical; if TRUE, uses the |
messages |
Control amount of output (0 = silent, 1 = messages). |
model |
Type of competing risk model for |
return.data |
If 1, returns the reshaped data; if 2, returns only the data; otherwise returns the model. |
uniform |
Logical; if TRUE, computes uniform standard errors based on resampling. |
conservative |
Logical; if TRUE, uses conservative standard errors (recommended for large datasets). |
resample.iid |
Logical; if TRUE, returns IID residual processes for further computations. |
... |
Additional arguments passed to |
The concordance function is defined as the probability that both members
of a pair experience the event of interest by time :
where is the event time and is the cause of failure for individual .
The function supports:
Stratified analysis (e.g., by zygosity in twin studies).
Clustering for robust standard errors.
Both IPCW and Aalen-Johansen estimation methods.
Resampling for uniform standard errors.
IID decomposition for further inference (e.g., casewise concordance tests).
An object of class "bicomprisk" (or "bicomprisk.strata" if stratified) containing:
model |
List of fitted models for each stratum (or a single model). |
strata |
Names of strata (if applicable). |
N |
Number of strata (if applicable). |
time |
Event times. |
P1 |
Bivariate cumulative incidence estimates. |
se.P1 |
Standard errors of the estimates. |
P1.iid |
IID decomposition (if |
clusters |
Cluster assignments (if |
Thomas Scheike, Klaus K. Holst
Scheike, T. H.; Holst, K. K. & Hjelmborg, J. B. (2014). Estimating twin concordance for bivariate competing risks twin data. Statistics in Medicine, 33, 1193-1204.
library("timereg") ## Simulated data example prt <- sim_nordic_random(2000,delayed=TRUE,ptrunc=0.7, cordz=0.5,cormz=2,lam0=0.3) ## Bivariate competing risk, concordance estimates p11 <- bicomprisk(Event(time,cause)~strata(zyg)+id(id),data=prt,cause=c(1,1)) p11mz <- p11$model$"MZ" p11dz <- p11$model$"DZ" par(mfrow=c(1,2)) ## Concordance plot(p11mz,ylim=c(0,0.1)); plot(p11dz,ylim=c(0,0.1)); ## Entry time, truncation weighting ### Other weighting procedure prtl <- prt[!prt$truncated,] prt2 <- ipw2(prtl,cluster="id",same.cens=TRUE, time="time",cause="cause",entrytime="entry", pairs=TRUE,strata="zyg",obs.only=TRUE) prt22 <- fast.reshape(prt2,id="id") prt22$event <- (prt22$cause1==1)*(prt22$cause2==1)*1 prt22$timel <- pmax(prt22$time1,prt22$time2) ipwc <- timereg::comp.risk(Event(timel,event)~-1+factor(zyg1), data=prt22,cause=1,n.sim=0,model="rcif2",times=50:90, weights=prt22$weights1,cens.weights=rep(1,nrow(prt22))) p11wmz <- ipwc$cum[,2] p11wdz <- ipwc$cum[,3] lines(ipwc$cum[,1],p11wmz,col=3) lines(ipwc$cum[,1],p11wdz,col=3)library("timereg") ## Simulated data example prt <- sim_nordic_random(2000,delayed=TRUE,ptrunc=0.7, cordz=0.5,cormz=2,lam0=0.3) ## Bivariate competing risk, concordance estimates p11 <- bicomprisk(Event(time,cause)~strata(zyg)+id(id),data=prt,cause=c(1,1)) p11mz <- p11$model$"MZ" p11dz <- p11$model$"DZ" par(mfrow=c(1,2)) ## Concordance plot(p11mz,ylim=c(0,0.1)); plot(p11dz,ylim=c(0,0.1)); ## Entry time, truncation weighting ### Other weighting procedure prtl <- prt[!prt$truncated,] prt2 <- ipw2(prtl,cluster="id",same.cens=TRUE, time="time",cause="cause",entrytime="entry", pairs=TRUE,strata="zyg",obs.only=TRUE) prt22 <- fast.reshape(prt2,id="id") prt22$event <- (prt22$cause1==1)*(prt22$cause2==1)*1 prt22$timel <- pmax(prt22$time1,prt22$time2) ipwc <- timereg::comp.risk(Event(timel,event)~-1+factor(zyg1), data=prt22,cause=1,n.sim=0,model="rcif2",times=50:90, weights=prt22$weights1,cens.weights=rep(1,nrow(prt22))) p11wmz <- ipwc$cum[,2] p11wdz <- ipwc$cum[,3] lines(ipwc$cum[,1],p11wmz,col=3) lines(ipwc$cum[,1],p11wdz,col=3)
The pairwise pairwise odds ratio model provides an alternative to the alternating logistic regression (ALR).
binomial_twostage( margbin, data = parent.frame(), method = "nr", detail = 0, clusters = NULL, silent = 1, weights = NULL, theta = NULL, theta.des = NULL, var.link = 0, var.par = 1, var.func = NULL, iid = 1, notaylor = 1, model = "plackett", marginal.p = NULL, beta.iid = NULL, Dbeta.iid = NULL, strata = NULL, max.clust = NULL, se.clusters = NULL, numDeriv = 0, random.design = NULL, pairs = NULL, dim.theta = NULL, additive.gamma.sum = NULL, pair.ascertained = 0, case.control = 0, no.opt = FALSE, twostage = 1, beta = NULL, ... )binomial_twostage( margbin, data = parent.frame(), method = "nr", detail = 0, clusters = NULL, silent = 1, weights = NULL, theta = NULL, theta.des = NULL, var.link = 0, var.par = 1, var.func = NULL, iid = 1, notaylor = 1, model = "plackett", marginal.p = NULL, beta.iid = NULL, Dbeta.iid = NULL, strata = NULL, max.clust = NULL, se.clusters = NULL, numDeriv = 0, random.design = NULL, pairs = NULL, dim.theta = NULL, additive.gamma.sum = NULL, pair.ascertained = 0, case.control = 0, no.opt = FALSE, twostage = 1, beta = NULL, ... )
margbin |
Marginal binomial model |
data |
data frame |
method |
Scoring method "nr", for lava NR optimizer |
detail |
Detail |
clusters |
Cluster variable |
silent |
Debug information |
weights |
Weights for log-likelihood, can be used for each type of outcome in 2x2 tables. |
theta |
Starting values for variance components |
theta.des |
design for dependence parameters, when pairs are given the indeces of the theta-design for this pair, is given in pairs as column 5 |
var.link |
Link function for variance |
var.par |
parametrization |
var.func |
when alternative parametrizations are used this function can specify how the paramters are related to the |
iid |
Calculate i.i.d. decomposition when iid>=1, when iid=2 then avoids adding the uncertainty for marginal paramters for additive gamma model (default). |
notaylor |
Taylor expansion |
model |
model |
marginal.p |
vector of marginal probabilities |
beta.iid |
iid decomposition of marginal probability estimates for each subject, if based on GLM model this is computed. |
Dbeta.iid |
derivatives of marginal model wrt marginal parameters, if based on GLM model this is computed. |
strata |
strata for fitting: considers only pairs where both are from same strata |
max.clust |
max clusters |
se.clusters |
clusters for iid decomposition for roubst standard errors |
numDeriv |
uses Fisher scoring aprox of second derivative if 0, otherwise numerical derivatives |
random.design |
random effect design for additive gamma model, when pairs are given the indeces of the pairs random.design rows are given as columns 3:4 |
pairs |
matrix with rows of indeces (two-columns) for the pairs considered in the pairwise composite score, useful for case-control sampling when marginal is known. |
dim.theta |
dimension of theta when pairs and pairs specific design is given. That is when pairs has 6 columns. |
additive.gamma.sum |
this is specification of the lamtot in the models via a matrix that is multiplied onto the parameters theta (dimensions=(number random effects x number of theta parameters), when null then sums all parameters. Default is a matrix of 1's |
pair.ascertained |
if pairs are sampled only when there are events in the pair i.e. Y1+Y2>=1. |
case.control |
if data is case control data for pair call, and here 2nd column of pairs are probands (cases or controls) |
no.opt |
for not optimizing |
twostage |
default twostage=1, to fit MLE use twostage=0 |
beta |
is starting value for beta for MLE version |
... |
for NR of lava |
The reported standard errors are based on a cluster corrected score equations from the pairwise likelihoods assuming that the marginals are known. This gives correct standard errors in the case of the Odds-Ratio model (Plackett distribution) for dependence, but incorrect standard errors for the Clayton-Oakes types model (that is also called "gamma"-frailty). For the additive gamma version of the standard errors are adjusted for the uncertainty in the marginal models via an iid deomposition using the iid() function of lava. For the clayton oakes model that is not speicifed via the random effects these can be fixed subsequently using the iid influence functions for the marginal model, but typically this does not change much.
For the Clayton-Oakes version of the model, given the gamma distributed random effects it is assumed that the probabilities are indpendent, and that the marginal survival functions are on logistic form
therefore conditional on the random effect the probability of the event is
Can also fit a structured additive gamma random effects model, such the ACE, ADE model for survival data:
Now random.design specificies the random effects for each subject within a cluster. This is
a matrix of 1's and 0's with dimension n x d. With d random effects.
For a cluster with two subjects, we let the random.design rows be
and .
Such that the random effects for subject
1 is
, for d random effects. Each random effect
has an associated parameter . By construction
subjects 1's random effect are Gamma distributed with
mean
and variance . Note that the random effect
has mean 1 and variance .
It is here asssumed that is fixed over all clusters
as it would be for the ACE model below.
The DEFAULT parametrization uses the variances of the random effecs (var.par=1)
For alternative parametrizations (var.par=0) one can specify how the parameters relate
to with the function
Based on these parameters the relative contribution (the heritability, h) is
equivalent to the expected values of the random effects
Given the random effects the probabilities are independent and on the form
with the inverse laplace of the gamma distribution with mean 1 and variance lamtot.
The parameters
are related to the parameters of the model
by a regression construction (d x k), that links the
parameters
with the (k) underlying parameters
here using theta.des to specify these low-dimension association. Default is a diagonal matrix.
Thomas Scheike
Two-stage binomial modelling
data(twinstut) twinstut0 <- subset(twinstut, tvparnr<4000) twinstut <- twinstut0 twinstut$binstut <- (twinstut$stutter=="yes")*1 theta.des <- model.matrix( ~-1+factor(zyg),data=twinstut) margbin <- glm(binstut~factor(sex)+age,data=twinstut,family=binomial()) bin <- binomial_twostage(margbin,data=twinstut,var.link=1, clusters=twinstut$tvparnr,theta.des=theta.des,detail=0) summary(bin) twinstut$cage <- scale(twinstut$age) theta.des <- model.matrix( ~-1+factor(zyg)+cage,data=twinstut) bina <- binomial_twostage(margbin,data=twinstut,var.link=1, clusters=twinstut$tvparnr,theta.des=theta.des) summary(bina) theta.des <- model.matrix( ~-1+factor(zyg)+factor(zyg)*cage,data=twinstut) bina <- binomial_twostage(margbin,data=twinstut,var.link=1, clusters=twinstut$tvparnr,theta.des=theta.des) summary(bina) ### use of clayton oakes binomial additive gamma model ########################################################### ## Reduce Ex.Timings data <- sim_binClaytonOakes_family_ace(1000,2,1,beta=NULL,alpha=NULL) margbin <- glm(ybin~x,data=data,family=binomial()) margbin head(data) data$number <- c(1,2,3,4) data$child <- 1*(data$number==3) ### make ace random effects design out <- ace_family_design(data,member="type",id="cluster") out$pardes head(out$des.rv) bints <- binomial_twostage(margbin,data=data, clusters=data$cluster,detail=0,var.par=1, theta=c(2,1),var.link=0, random.design=out$des.rv,theta.des=out$pardes) summary(bints) data <- sim_binClaytonOakes_twin_ace(1000,2,1,beta=NULL,alpha=NULL) out <- twin_polygen_design(data,id="cluster",zygname="zygosity") out$pardes head(out$des.rv) margbin <- glm(ybin~x,data=data,family=binomial()) bintwin <- binomial_twostage(margbin,data=data, clusters=data$cluster,var.par=1, theta=c(2,1),random.design=out$des.rv,theta.des=out$pardes) summary(bintwin) concordanceTwinACE(bintwin)data(twinstut) twinstut0 <- subset(twinstut, tvparnr<4000) twinstut <- twinstut0 twinstut$binstut <- (twinstut$stutter=="yes")*1 theta.des <- model.matrix( ~-1+factor(zyg),data=twinstut) margbin <- glm(binstut~factor(sex)+age,data=twinstut,family=binomial()) bin <- binomial_twostage(margbin,data=twinstut,var.link=1, clusters=twinstut$tvparnr,theta.des=theta.des,detail=0) summary(bin) twinstut$cage <- scale(twinstut$age) theta.des <- model.matrix( ~-1+factor(zyg)+cage,data=twinstut) bina <- binomial_twostage(margbin,data=twinstut,var.link=1, clusters=twinstut$tvparnr,theta.des=theta.des) summary(bina) theta.des <- model.matrix( ~-1+factor(zyg)+factor(zyg)*cage,data=twinstut) bina <- binomial_twostage(margbin,data=twinstut,var.link=1, clusters=twinstut$tvparnr,theta.des=theta.des) summary(bina) ### use of clayton oakes binomial additive gamma model ########################################################### ## Reduce Ex.Timings data <- sim_binClaytonOakes_family_ace(1000,2,1,beta=NULL,alpha=NULL) margbin <- glm(ybin~x,data=data,family=binomial()) margbin head(data) data$number <- c(1,2,3,4) data$child <- 1*(data$number==3) ### make ace random effects design out <- ace_family_design(data,member="type",id="cluster") out$pardes head(out$des.rv) bints <- binomial_twostage(margbin,data=data, clusters=data$cluster,detail=0,var.par=1, theta=c(2,1),var.link=0, random.design=out$des.rv,theta.des=out$pardes) summary(bints) data <- sim_binClaytonOakes_twin_ace(1000,2,1,beta=NULL,alpha=NULL) out <- twin_polygen_design(data,id="cluster",zygname="zygosity") out$pardes head(out$des.rv) margbin <- glm(ybin~x,data=data,family=binomial()) bintwin <- binomial_twostage(margbin,data=data, clusters=data$cluster,var.par=1, theta=c(2,1),random.design=out$des.rv,theta.des=out$pardes) summary(bintwin) concordanceTwinACE(bintwin)
Fits a binomial regression model for a specific time point in the presence of right-censored data and competing risks. This function implements the Inverse Probability of Censoring Weighted (IPCW) estimating equation approach.
binreg( formula, data, cause = 1, time = NULL, beta = NULL, type = c("II", "I"), offset = NULL, weights = NULL, cens.weights = NULL, cens.model = ~+1, se = TRUE, kaplan.meier = TRUE, cens.code = 0, no.opt = FALSE, method = "nr", augmentation = NULL, outcome = c("cif", "rmst", "rmtl"), model = c("default", "logit", "exp", "lin"), Ydirect = NULL, ... )binreg( formula, data, cause = 1, time = NULL, beta = NULL, type = c("II", "I"), offset = NULL, weights = NULL, cens.weights = NULL, cens.model = ~+1, se = TRUE, kaplan.meier = TRUE, cens.code = 0, no.opt = FALSE, method = "nr", augmentation = NULL, outcome = c("cif", "rmst", "rmtl"), model = c("default", "logit", "exp", "lin"), Ydirect = NULL, ... )
formula |
A formula object specifying the outcome and covariates.
The outcome must be an |
data |
A data frame containing the variables in the formula. |
cause |
Numeric vector or scalar indicating the cause of interest for the competing risks. |
time |
Numeric scalar indicating the time point of interest for the cumulative incidence. |
beta |
Optional numeric vector of starting values for the coefficients. Defaults to zeros. |
type |
Character string. Either |
offset |
Optional numeric vector of offsets for the linear predictor. |
weights |
Optional numeric vector of weights for the score equations. |
cens.weights |
Optional numeric vector of pre-calculated censoring weights. If NULL, they are estimated internally. |
cens.model |
A formula specifying the censoring model. Defaults to |
se |
Logical. If TRUE, computes standard errors based on IPCW influence functions. |
kaplan.meier |
Logical. If TRUE, uses Kaplan-Meier estimator for IPCW weights; otherwise uses exponential baseline. |
cens.code |
Numeric code representing censored observations in the status variable. Defaults to 0. |
no.opt |
Logical. If TRUE, optimization is skipped and starting values are used. |
method |
Character string. Optimization method: |
augmentation |
Optional numeric vector for additional augmentation terms. |
outcome |
Character string. Outcome type: |
model |
Character string. Link function: |
Ydirect |
Optional numeric vector. If provided, this outcome is used instead of constructing one from |
... |
Additional arguments passed to lower-level optimization functions. |
The model estimates the probability:
Based on binomial regresion IPCW response estimating equation:
where
is IPCW adjustment of the response
. Two types of estimators are available:
type="I": Solves the standard IPCW estimating equation.
type="II": Adds a censoring augmentation term for efficiency gains,
solving
.
Alternatively, logitIPCW performs a standard logistic regression with
IPCW weights applied directly to the likelihood. Thus solving
a standard logistic regression with IPCW weights.
The variance estimation is based on squared influence functions, with options for naive variance (assuming known censoring) and robust variance (accounting for censoring model estimation).
Censoring model may depend on strata (cens.model=~strata(gX)).
An object of class "binreg" containing coefficients, variance-covariance matrix,
influence functions (iid), and model details.
Blanche PF, Holt A, Scheike T (2022). "On logistic regression with right censored data, with or without competing risks, and its use for estimating treatment effects." Lifetime data analysis, 29, 441–482.
Scheike TH, Zhang MJ, Gerds TA (2008). "Predicting cumulative incidence probability by direct binomial regression." Biometrika, 95(1), 205–220.
Thomas Scheike
data(bmt); bmt$time <- bmt$time+runif(408)*0.001 # logistic regresion with IPCW binomial regression out <- binreg(Event(time,cause)~tcell+platelet,bmt,time=50) summary(out) head(iid(out)) predict(out,data.frame(tcell=c(0,1),platelet=c(1,1)),se=TRUE) outs <- binreg(Event(time,cause)~tcell+platelet,bmt,time=50,cens.model=~strata(tcell,platelet)) summary(outs) ## glm with IPCW weights outl <- logitIPCW(Event(time,cause)~tcell+platelet,bmt,time=50) summary(outl) ########################################## ### risk-ratio of different causes ####### ########################################## data(bmt) bmt$id <- 1:nrow(bmt) bmt$status <- bmt$cause bmt$strata <- 1 bmtdob <- bmt bmtdob$strata <-2 bmtdob <- dtransform(bmtdob,status=1,cause==2) bmtdob <- dtransform(bmtdob,status=2,cause==1) ### bmtdob <- rbind(bmt,bmtdob) dtable(bmtdob,cause+status~strata) cif1 <- cif(Event(time,cause)~+1,bmt,cause=1) cif2 <- cif(Event(time,cause)~+1,bmt,cause=2) plot(cif1) plot(cif2,add=TRUE,col=2) cifs1 <- binreg(Event(time,cause)~tcell+platelet+age,bmt,cause=1,time=50) cifs2 <- binreg(Event(time,cause)~tcell+platelet+age,bmt,cause=2,time=50) summary(cifs1) summary(cifs2) cifdob <- binreg(Event(time,status)~-1+factor(strata)+ tcell*factor(strata)+platelet*factor(strata)+age*factor(strata) +cluster(id),bmtdob,cause=1,time=50,cens.model=~strata(strata)) summary(cifdob) head(iid(cifdob)) newdata <- data.frame(tcell=1,platelet=1,age=0,strata=1:2,id=1) riskratio <- function(p) { cifdob$coef <- p p <- predict(cifdob,newdata,se=0) return(p[1]/p[2]) } lava::estimate(cifdob,f=riskratio) predict(cifdob,newdata) (p1 <- predict(cifs1,newdata)) (p2 <- predict(cifs2,newdata)) p1[1,1]/p2[1,1]data(bmt); bmt$time <- bmt$time+runif(408)*0.001 # logistic regresion with IPCW binomial regression out <- binreg(Event(time,cause)~tcell+platelet,bmt,time=50) summary(out) head(iid(out)) predict(out,data.frame(tcell=c(0,1),platelet=c(1,1)),se=TRUE) outs <- binreg(Event(time,cause)~tcell+platelet,bmt,time=50,cens.model=~strata(tcell,platelet)) summary(outs) ## glm with IPCW weights outl <- logitIPCW(Event(time,cause)~tcell+platelet,bmt,time=50) summary(outl) ########################################## ### risk-ratio of different causes ####### ########################################## data(bmt) bmt$id <- 1:nrow(bmt) bmt$status <- bmt$cause bmt$strata <- 1 bmtdob <- bmt bmtdob$strata <-2 bmtdob <- dtransform(bmtdob,status=1,cause==2) bmtdob <- dtransform(bmtdob,status=2,cause==1) ### bmtdob <- rbind(bmt,bmtdob) dtable(bmtdob,cause+status~strata) cif1 <- cif(Event(time,cause)~+1,bmt,cause=1) cif2 <- cif(Event(time,cause)~+1,bmt,cause=2) plot(cif1) plot(cif2,add=TRUE,col=2) cifs1 <- binreg(Event(time,cause)~tcell+platelet+age,bmt,cause=1,time=50) cifs2 <- binreg(Event(time,cause)~tcell+platelet+age,bmt,cause=2,time=50) summary(cifs1) summary(cifs2) cifdob <- binreg(Event(time,status)~-1+factor(strata)+ tcell*factor(strata)+platelet*factor(strata)+age*factor(strata) +cluster(id),bmtdob,cause=1,time=50,cens.model=~strata(strata)) summary(cifdob) head(iid(cifdob)) newdata <- data.frame(tcell=1,platelet=1,age=0,strata=1:2,id=1) riskratio <- function(p) { cifdob$coef <- p p <- predict(cifdob,newdata,se=0) return(p[1]/p[2]) } lava::estimate(cifdob,f=riskratio) predict(cifdob,newdata) (p1 <- predict(cifs1,newdata)) (p2 <- predict(cifs2,newdata)) p1[1,1]/p2[1,1]
Estimates the average treatment effect (ATE) for censored
competing risks data using binomial regression with Inverse Probability of
Censoring Weighting (IPCW).
binregATE( formula, data, cause = 1, time = NULL, beta = NULL, treat.model = ~+1, cens.model = ~+1, offset = NULL, weights = NULL, cens.weights = NULL, se = TRUE, type = c("II", "I"), kaplan.meier = TRUE, cens.code = 0, no.opt = FALSE, method = "nr", augmentation = NULL, outcome = c("cif", "rmst", "rmtl"), model = c("default", "logit", "exp", "lin"), Ydirect = NULL, typeATE = "II", ... )binregATE( formula, data, cause = 1, time = NULL, beta = NULL, treat.model = ~+1, cens.model = ~+1, offset = NULL, weights = NULL, cens.weights = NULL, se = TRUE, type = c("II", "I"), kaplan.meier = TRUE, cens.code = 0, no.opt = FALSE, method = "nr", augmentation = NULL, outcome = c("cif", "rmst", "rmtl"), model = c("default", "logit", "exp", "lin"), Ydirect = NULL, typeATE = "II", ... )
formula |
A formula object specifying the outcome and covariates (see |
data |
A data frame containing the variables in the formula. |
cause |
Numeric scalar indicating the cause of interest for competing risks. |
time |
Numeric scalar indicating the time point of interest. |
beta |
Optional numeric vector of starting values for the coefficients. |
treat.model |
A formula specifying the logistic treatment model given covariates
(e.g., |
cens.model |
A formula specifying the censoring model. Only stratified Cox models
without covariates are supported (e.g., |
offset |
Optional numeric vector of offsets for the partial likelihood. |
weights |
Optional numeric vector of weights for the score equations. |
cens.weights |
Optional numeric vector of pre-calculated censoring weights.
If |
se |
Logical. If |
type |
Character string. Either |
kaplan.meier |
Logical. If |
cens.code |
Numeric code representing censored observations in the status variable. |
no.opt |
Logical. If |
method |
Character string. Optimization method: |
augmentation |
Optional numeric vector for augmenting binomial regression. |
outcome |
Character string. Outcome type: |
model |
Character string. Link function for the outcome model: |
Ydirect |
Optional numeric vector. Use this outcome Y with IPCW version instead of
constructing one from |
typeATE |
Character string. Either |
... |
Additional arguments passed to lower-level functions (e.g., |
Under standard causal assumptions, the ATE can be estimated. These assumptions include:
Consistency: The observed outcome equals the potential outcome under the observed treatment.
Ignorability: (treatment assignment is independent of potential outcomes given covariates).
Positivity: All treatment levels have non-zero probability given covariates.
The first covariate in the competing risks regression model must be the treatment variable,
which should be coded as a factor. If the factor has more than two levels, multinomial
logistic regression (mlogit) is used for propensity score modeling. In the absence
of censoring, this reduces to ordinary logistic regression.
The ATE is estimated using standard doubly robust estimating equations that are
IPCW-censoring adjusted. As an alternative to binomial regression,
logitIPCWATE provides an IPCW-weighted version of standard logistic regression.
When typeATE = "II", the estimating equation is augmented with:
when estimating the mean outcome for the treated group.
An object of class c("binreg", "ATE") containing:
coef |
Estimated coefficients from the outcome model. |
riskDR |
Double-robust marginal risk estimates for each treatment level. |
riskG |
G-formula marginal risk estimates for each treatment level. |
difriskDR |
Difference in risks (ATE) using double-robust estimator. |
difriskG |
Difference in risks (ATE) using G-formula estimator. |
riskDR.iid, riskG.iid
|
Influence functions for marginal risk estimates. |
var, var.riskDR, var.riskG
|
Variance-covariance matrices. |
se.coef, se.riskDR, se.riskG
|
Standard errors. |
Blanche PF, Holt A, Scheike T (2022). "On logistic regression with right censored data, with or without competing risks, and its use for estimating treatment effects." Lifetime Data Analysis, 29, 441–482.
Thomas Scheike
binreg, logitIPCWATE, logitATE,
binregG
[kumarsim()] [kumarsimRCT()]
data(bmt) dfactor(bmt) <- ~. brs <- binregATE(Event(time,cause)~tcell.f+platelet+age,bmt,time=50,cause=1, treat.model=tcell.f~platelet+age) summary(brs) head(brs$riskDR.iid) head(brs$riskG.iid) brsi <- binregATE(Event(time,cause)~tcell.f+tcell.f*platelet+tcell.f*age,bmt,time=50,cause=1, treat.model=tcell.f~platelet+age) summary(brsi) head(brs$riskDR.iid) head(brs$riskG.iid)data(bmt) dfactor(bmt) <- ~. brs <- binregATE(Event(time,cause)~tcell.f+platelet+age,bmt,time=50,cause=1, treat.model=tcell.f~platelet+age) summary(brs) head(brs$riskDR.iid) head(brs$riskG.iid) brsi <- binregATE(Event(time,cause)~tcell.f+tcell.f*platelet+tcell.f*age,bmt,time=50,cause=1, treat.model=tcell.f~platelet+age) summary(brsi) head(brs$riskDR.iid) head(brs$riskG.iid)
Estimates the casewise concordance based on concordance and marginal estimates
obtained from binreg objects. Uses cluster-based IID for standard errors,
which are often better than those from casewise (which can be conservative).
binregCasewise(concbreg, margbreg, zygs = c("DZ", "MZ"), newdata = NULL, ...)binregCasewise(concbreg, margbreg, zygs = c("DZ", "MZ"), newdata = NULL, ...)
concbreg |
Concordance object from |
margbreg |
Marginal estimate object from |
zygs |
Order of zygosity for estimation (e.g., |
newdata |
Data frame to give instead of |
... |
Arguments passed to |
A list containing:
coef |
Exponentiated coefficients (ratios). |
logcoef |
Log-scale coefficients and standard errors. |
Thomas Scheike
data(prt) prt <- force_same_cens(prt,cause="status") dd <- bicompriskData(Event(time, status)~strata(zyg)+id(id), data=prt, cause=c(2, 2)) newdata <- data.frame(zyg=c("DZ","MZ"),id=1) ## concordance bcif1 <- binreg(Event(time,status)~-1+factor(zyg)+cluster(id), data=dd, time=80, cause=1, cens.model=~strata(zyg)) pconc <- predict(bcif1,newdata) ## marginal estimates mbcif1 <- binreg(Event(time,status)~cluster(id), data=prt, time=80, cause=2) mc <- predict(mbcif1,newdata) cse <- binregCasewise(bcif1,mbcif1) csedata(prt) prt <- force_same_cens(prt,cause="status") dd <- bicompriskData(Event(time, status)~strata(zyg)+id(id), data=prt, cause=c(2, 2)) newdata <- data.frame(zyg=c("DZ","MZ"),id=1) ## concordance bcif1 <- binreg(Event(time,status)~-1+factor(zyg)+cluster(id), data=dd, time=80, cause=1, cens.model=~strata(zyg)) pconc <- predict(bcif1,newdata) ## marginal estimates mbcif1 <- binreg(Event(time,status)~cluster(id), data=prt, time=80, cause=2) mc <- predict(mbcif1,newdata) cse <- binregCasewise(bcif1,mbcif1) cse
Computes the G-estimator (G-formula) for standardized risk estimates based on a
fitted binreg object. The G-estimator standardizes predictions over the
covariate distribution in the data:
binregG(x, data, Avalues = NULL, varname = NULL)binregG(x, data, Avalues = NULL, varname = NULL)
x |
An object of class |
data |
A data frame containing the covariates to be used for averaging the risk estimates. This should ideally be the same data used to fit the model, or a representative sample. |
Avalues |
Numeric or factor vector specifying the values of the first covariate
(
|
varname |
Optional character string specifying the name of the variable to be
treated as the treatment/exposure variable. If |
This function assumes that the first covariate in the original model formula
represents the treatment or exposure variable (). It calculates the
marginal risk for specified values of by averaging the conditional
predictions over the observed covariate distribution .
The function returns influence functions for these risk estimates, allowing for the computation of standard errors and confidence intervals.
If the first covariate is a factor, contrasts between all levels are computed
automatically. If it is continuous, specific values must be provided via
Avalues.
An object of class "survivalG" containing:
risk |
A table of standardized risk estimates for each value of |
risk.iid |
Influence functions for the standardized risk estimates. |
difference |
Pairwise differences in risks between levels of |
ratio |
Risk ratios between levels of |
vcov |
Variance-covariance matrix of the risk estimates. |
model |
The link function used in the original model. |
Blanche PF, Holt A, Scheike T (2022). "On logistic regression with right censored data, with or without competing risks, and its use for estimating treatment effects." Lifetime Data Analysis, 29, 441–482.
Thomas Scheike
data(bmt); bmt$time <- bmt$time+runif(408)*0.001 bmt$event <- (bmt$cause!=0)*1 b1 <- binreg(Event(time,cause)~age+tcell+platelet,bmt,cause=1,time=50) sb1 <- binregG(b1,bmt,Avalues=c(0,1,2)) summary(sb1)data(bmt); bmt$time <- bmt$time+runif(408)*0.001 bmt$event <- (bmt$cause!=0)*1 b1 <- binreg(Event(time,cause)~age+tcell+platelet,bmt,cause=1,time=50) sb1 <- binregG(b1,bmt,Avalues=c(0,1,2)) summary(sb1)
Estimates the percentage of the restricted mean time lost (RMTL) that is attributable to a specific cause and models how covariates affect this percentage using IPCW regression.
binregRatio( formula, data, cause = 1, time = NULL, beta = NULL, type = c("III", "II", "I"), offset = NULL, weights = NULL, cens.weights = NULL, cens.model = ~+1, se = TRUE, relative.to.causes = NULL, kaplan.meier = TRUE, cens.code = 0, no.opt = FALSE, method = "nr", augmentation = NULL, outcome = c("rmtl", "cif"), model = c("logit", "exp", "lin"), Ydirect = NULL, ... )binregRatio( formula, data, cause = 1, time = NULL, beta = NULL, type = c("III", "II", "I"), offset = NULL, weights = NULL, cens.weights = NULL, cens.model = ~+1, se = TRUE, relative.to.causes = NULL, kaplan.meier = TRUE, cens.code = 0, no.opt = FALSE, method = "nr", augmentation = NULL, outcome = c("rmtl", "cif"), model = c("logit", "exp", "lin"), Ydirect = NULL, ... )
formula |
Formula with an outcome (see |
data |
Data frame containing the variables. |
cause |
Numeric code of the cause of interest. |
time |
Time point |
beta |
Starting values for optimization (default NULL, uses zeros). |
type |
Type of estimator: |
offset |
Offsets for the partial likelihood. |
weights |
Weights for the score equations. |
cens.weights |
External censoring weights (if provided, |
cens.model |
Formula for the censoring model (default |
se |
Logical; if TRUE, computes standard errors based on IPCW (default TRUE). |
relative.to.causes |
If not NULL, compares the RMTL of the specified |
kaplan.meier |
Logical; if TRUE, uses Kaplan-Meier for IPCW weights; if FALSE,
uses |
cens.code |
Censoring code (default 0). |
no.opt |
Logical; if TRUE, skips optimization and uses |
method |
Optimization method: |
augmentation |
Initial augmentation term (used for type "II" and "III"). |
outcome |
Outcome type: |
model |
Link function: |
Ydirect |
Matrix with two columns (numerator, denominator) to use directly as the response. |
... |
Additional arguments passed to lower-level functions. |
Let the total years lost be and the years lost due to cause 1 be
. The function models the ratio:
Estimation is based on a binomial regression IPCW response estimating equation:
where is the IPCW adjustment.
The function supports three types of estimators:
"I": Classical outcome IPCW regression (no augmentation).
"II": Adds a censoring augmentation term
to improve efficiency (requires an initial estimate of ).
"III": Adds a more complex augmentation term separating the expectations of
and for further efficiency gains.
The variance is based on the squared influence functions (IID). A "naive" variance (assuming known censoring) is also provided for comparison.
An object of class "binreg" and "ratio" containing:
coef |
Coefficient estimates. |
se.coef |
Standard errors. |
var |
Variance-covariance matrix. |
iid |
Influence function decomposition (with censoring adjustment). |
iidI |
Influence function without censoring adjustment. |
naive.var |
Variance assuming known censoring. |
time |
Time point used. |
cause |
Cause of interest. |
Causes |
Set of causes considered in the denominator. |
Yipcw |
IPCW-adjusted response matrix. |
coefI, varI
|
Results from the initial (type "I") fit. |
augmentation |
Final augmentation term used. |
Thomas Scheike
Scheike, T. & Tanaka, S. (2025). Restricted mean time lost ratio regression: Percentage of restricted mean time lost due to specific cause. WIP.
data(bmt); bmt$time <- bmt$time+runif(408)*0.001 rmtl30 <- rmstIPCW(Event(time,cause!=0)~platelet+tcell+age, bmt, time=30, cause=1, outcome="rmtl") rmtl301 <- rmstIPCW(Event(time,cause)~platelet+tcell+age, bmt, time=30, cause=1) rmtl302 <- rmstIPCW(Event(time,cause)~platelet+tcell+age, bmt, time=30, cause=2) estimate(rmtl30) estimate(rmtl301) estimate(rmtl302) ## Percentage of total RMTL due to cause 1 rmtlratioI <- rmtlRatio(Event(time,cause)~platelet+tcell+age, bmt, time=30, cause=1) summary(rmtlratioI) newdata <- data.frame(platelet=1, tcell=1, age=1) pp <- predict(rmtlratioI, newdata) pp ## Percentage of total cumulative incidence due to cause 1 cifratio <- binregRatio(Event(time,cause)~platelet+tcell+age, bmt, time=30, cause=1, model="cif") summary(cifratio) pp <- predict(cifratio, newdata) ppdata(bmt); bmt$time <- bmt$time+runif(408)*0.001 rmtl30 <- rmstIPCW(Event(time,cause!=0)~platelet+tcell+age, bmt, time=30, cause=1, outcome="rmtl") rmtl301 <- rmstIPCW(Event(time,cause)~platelet+tcell+age, bmt, time=30, cause=1) rmtl302 <- rmstIPCW(Event(time,cause)~platelet+tcell+age, bmt, time=30, cause=2) estimate(rmtl30) estimate(rmtl301) estimate(rmtl302) ## Percentage of total RMTL due to cause 1 rmtlratioI <- rmtlRatio(Event(time,cause)~platelet+tcell+age, bmt, time=30, cause=1) summary(rmtlratioI) newdata <- data.frame(platelet=1, tcell=1, age=1) pp <- predict(rmtlratioI, newdata) pp ## Percentage of total cumulative incidence due to cause 1 cifratio <- binregRatio(Event(time,cause)~platelet+tcell+age, bmt, time=30, cause=1, model="cif") summary(cifratio) pp <- predict(cifratio, newdata) pp
Estimates the average treatment effect of treatment regime
under two-stage randomization. The estimator can be augmented using information from
both randomizations and dynamic censoring augmentation to improve efficiency.
binregTSR( formula, data, cause = 1, time = NULL, cens.code = 0, response.code = NULL, augmentR0 = NULL, treat.model0 = ~+1, augmentR1 = NULL, treat.model1 = ~+1, augmentC = NULL, cens.model = ~+1, estpr = c(1, 1), response.name = NULL, offset = NULL, weights = NULL, cens.weights = NULL, beta = NULL, kaplan.meier = TRUE, no.opt = FALSE, method = "nr", augmentation = NULL, outcome = c("cif", "rmst", "rmst-cause"), model = "exp", Ydirect = NULL, return.dataw = 0, pi0 = 0.5, pi1 = 0.5, cens.time.fixed = 1, outcome.iid = 1, meanCs = 0, ... )binregTSR( formula, data, cause = 1, time = NULL, cens.code = 0, response.code = NULL, augmentR0 = NULL, treat.model0 = ~+1, augmentR1 = NULL, treat.model1 = ~+1, augmentC = NULL, cens.model = ~+1, estpr = c(1, 1), response.name = NULL, offset = NULL, weights = NULL, cens.weights = NULL, beta = NULL, kaplan.meier = TRUE, no.opt = FALSE, method = "nr", augmentation = NULL, outcome = c("cif", "rmst", "rmst-cause"), model = "exp", Ydirect = NULL, return.dataw = 0, pi0 = 0.5, pi1 = 0.5, cens.time.fixed = 1, outcome.iid = 1, meanCs = 0, ... )
formula |
Formula with outcome (see |
data |
Data frame containing all variables. |
cause |
Cause of interest for competing risks (default 1). |
time |
Time point for estimation. |
cens.code |
Censoring code (default 0). |
response.code |
Code of status indicating response at which 2nd randomization occurs. |
augmentR0 |
Covariates for augmentation model of the first randomization. |
treat.model0 |
Logistic treatment model for the first randomization. |
augmentR1 |
Covariates for augmentation model of the second randomization. |
treat.model1 |
Logistic treatment model for the second randomization. |
augmentC |
Covariates for censoring augmentation model. |
cens.model |
Stratification for censoring model based on observed covariates. |
estpr |
Logical; estimate randomization probabilities using model (default TRUE). |
response.name |
Name of response variable (reads from |
offset |
Not implemented. |
weights |
Not implemented. |
cens.weights |
Can be provided externally. |
beta |
Starting values for optimization. |
kaplan.meier |
Logical; use Kaplan-Meier for censoring weights rather than exp cumulative hazard. |
no.opt |
Not implemented. |
method |
Not implemented. |
augmentation |
Not implemented. |
outcome |
Outcome type: |
model |
Not implemented, uses linear regression for augmentation. |
Ydirect |
Use this Y instead of outcome constructed inside the program. |
return.dataw |
Logical; return weighted data for all treatment regimes. |
pi0 |
Known randomization probabilities for first randomization. |
pi1 |
Known randomization probabilities for second randomization. |
cens.time.fixed |
Logical; use time-dependent weights for censoring estimation. |
outcome.iid |
Logical; get iid contribution from outcome model. |
meanCs |
Logical; indicates censoring augmentation is centered by |
... |
Additional arguments to lower-level functions. |
The method solves the estimating equation:
where:
uses covariates from augmentR0
uses covariates from augmentR1
is the censoring augmentation
Standard errors are estimated using the influence function of all estimators, enabling tests of differences to be computed subsequently. The method handles both survival data and competing risks data, and supports multiple treatment levels.
An object of class "binregTSR" containing:
riskG |
Simple estimator results (coefficient, SE). |
riskG0 |
First randomization augmentation results. |
riskG1 |
Second randomization augmentation results. |
riskG01 |
Both randomizations augmentation results. |
riskG.iid |
Influence functions for all estimators. |
varG |
Variance-covariance matrices. |
MGc |
Censoring martingale contributions. |
CensAugment.times |
Censoring augmentation terms. |
dynCens.coef |
Dynamic censoring coefficients. |
dataW |
Weighted data (if |
Thomas Scheike
Scheike, T. H. (2024). Two-stage randomization analysis for survival data. mets package documentation.
ddf <- mets:::gsim(200,covs=1,null=0,cens=1,ce=2) bb <- binregTSR(Event(entry,time,status)~+1+cluster(id),ddf$datat,time=2,cause=c(1), cens.code=0,treat.model0=A0.f~+1,treat.model1=A1.f~A0.f, augmentR1=~X11+X12+TR,augmentR0=~X01+X02, augmentC=~A01+A02+X01+X02+A11t+A12t+X11+X12+TR, response.code=2) summary(bb)ddf <- mets:::gsim(200,covs=1,null=0,cens=1,ce=2) bb <- binregTSR(Event(entry,time,status)~+1+cluster(id),ddf$datat,time=2,cause=c(1), cens.code=0,treat.model0=A0.f~+1,treat.model1=A1.f~A0.f, augmentR1=~X11+X12+TR,augmentR0=~X01+X02, augmentC=~A01+A02+X01+X02+A11t+A12t+X11+X12+TR, response.code=2) summary(bb)
Bivariate Probit model
biprobit( x, data, id, rho = ~1, num = NULL, strata = NULL, eqmarg = TRUE, indep = FALSE, weights = NULL, weights.fun = function(x) ifelse(any(x <= 0), 0, max(x)), randomeffect = FALSE, vcov = "robust", pairs.only = FALSE, allmarg = !is.null(weights), control = list(trace = 0), messages = 1, constrain = NULL, table = pairs.only, p = NULL, ... )biprobit( x, data, id, rho = ~1, num = NULL, strata = NULL, eqmarg = TRUE, indep = FALSE, weights = NULL, weights.fun = function(x) ifelse(any(x <= 0), 0, max(x)), randomeffect = FALSE, vcov = "robust", pairs.only = FALSE, allmarg = !is.null(weights), control = list(trace = 0), messages = 1, constrain = NULL, table = pairs.only, p = NULL, ... )
x |
formula (or vector) |
data |
data.frame |
id |
The name of the column in the dataset containing the cluster id-variable. |
rho |
Formula specifying the regression model for the dependence parameter |
num |
Optional name of order variable |
strata |
Strata |
eqmarg |
If TRUE same marginals are assumed (exchangeable) |
indep |
Independence |
weights |
Weights |
weights.fun |
Function defining the bivariate weight in each cluster |
randomeffect |
If TRUE a random effect model is used (otherwise correlation parameter is estimated allowing for both negative and positive dependence) |
vcov |
Type of standard errors to be calculated |
pairs.only |
Include complete pairs only? |
allmarg |
Should all marginal terms be included |
control |
Control argument parsed on to the optimization routine. Starting values may be parsed as ' |
messages |
Control amount of messages shown |
constrain |
Vector of parameter constraints (NA where free). Use this to set an offset. |
table |
Type of estimation procedure |
p |
Parameter vector p in which to evaluate log-Likelihood and score function |
... |
Optional arguments |
data(prt) prt0 <- subset(prt,country=="Denmark") a <- biprobit(cancer~1+zyg, ~1+zyg, data=prt0, id="id") predict(a, newdata=lava::Expand(prt, zyg=c("MZ"))) ## b <- biprobit(cancer~1+zyg, ~1+zyg, data=prt0, id="id",pairs.only=TRUE) ## predict(b,newdata=lava::Expand(prt,zyg=c("MZ","DZ"))) ## Reduce Ex.Timings n <- 2e3 x <- sort(runif(n, -1, 1)) y <- rmvn(n, c(0,0), rho=cbind(tanh(x)))>0 d <- data.frame(y1=y[,1], y2=y[,2], x=x) dd <- fast.reshape(d) a <- biprobit(y~1+x,rho=~1+x,data=dd,id="id") summary(a, mean.contrast=c(1,.5), cor.contrast=c(1,.5)) with(predict(a,data.frame(x=seq(-1,1,by=.1))), plot(p00~x,type="l")) pp <- predict(a,data.frame(x=seq(-1,1,by=.1)),which=c(1)) plot(pp[,1]~pp$x, type="l", xlab="x", ylab="Concordance", lwd=2, xaxs="i") lava::confband(pp$x,pp[,2],pp[,3],polygon=TRUE,lty=0,col=lava::Col(1)) pp <- predict(a,data.frame(x=seq(-1,1,by=.1)),which=c(9)) ## rho plot(pp[,1]~pp$x, type="l", xlab="x", ylab="Correlation", lwd=2, xaxs="i") lava::confband(pp$x,pp[,2],pp[,3],polygon=TRUE,lty=0,col=lava::Col(1)) with(pp, lines(x,tanh(-x),lwd=2,lty=2)) xp <- seq(-1,1,length.out=6); delta <- mean(diff(xp)) a2 <- biprobit(y~1+x,rho=~1+I(cut(x,breaks=xp)),data=dd,id="id") pp2 <- predict(a2,data.frame(x=xp[-1]-delta/2),which=c(9)) ## rho lava::confband(pp2$x,pp2[,2],pp2[,3],center=pp2[,1]) ## Time ## Not run: a <- biprobit.time(cancer~1, rho=~1+zyg, id="id", data=prt, eqmarg=TRUE, cens.formula=Surv(time,status==0)~1, breaks=seq(75,100,by=3),fix.censweights=TRUE) a <- biprobit.time2(cancer~1+zyg, rho=~1+zyg, id="id", data=prt0, eqmarg=TRUE, cens.formula=Surv(time,status==0)~zyg, breaks=100) #a1 <- biprobit.time2(cancer~1, rho=~1, id="id", data=subset(prt0,zyg=="MZ"), eqmarg=TRUE, # cens.formula=Surv(time,status==0)~1, # breaks=100,pairs.only=TRUE) #a2 <- biprobit.time2(cancer~1, rho=~1, id="id", data=subset(prt0,zyg=="DZ"), eqmarg=TRUE, # cens.formula=Surv(time,status==0)~1, # breaks=100,pairs.only=TRUE) ## End(Not run)data(prt) prt0 <- subset(prt,country=="Denmark") a <- biprobit(cancer~1+zyg, ~1+zyg, data=prt0, id="id") predict(a, newdata=lava::Expand(prt, zyg=c("MZ"))) ## b <- biprobit(cancer~1+zyg, ~1+zyg, data=prt0, id="id",pairs.only=TRUE) ## predict(b,newdata=lava::Expand(prt,zyg=c("MZ","DZ"))) ## Reduce Ex.Timings n <- 2e3 x <- sort(runif(n, -1, 1)) y <- rmvn(n, c(0,0), rho=cbind(tanh(x)))>0 d <- data.frame(y1=y[,1], y2=y[,2], x=x) dd <- fast.reshape(d) a <- biprobit(y~1+x,rho=~1+x,data=dd,id="id") summary(a, mean.contrast=c(1,.5), cor.contrast=c(1,.5)) with(predict(a,data.frame(x=seq(-1,1,by=.1))), plot(p00~x,type="l")) pp <- predict(a,data.frame(x=seq(-1,1,by=.1)),which=c(1)) plot(pp[,1]~pp$x, type="l", xlab="x", ylab="Concordance", lwd=2, xaxs="i") lava::confband(pp$x,pp[,2],pp[,3],polygon=TRUE,lty=0,col=lava::Col(1)) pp <- predict(a,data.frame(x=seq(-1,1,by=.1)),which=c(9)) ## rho plot(pp[,1]~pp$x, type="l", xlab="x", ylab="Correlation", lwd=2, xaxs="i") lava::confband(pp$x,pp[,2],pp[,3],polygon=TRUE,lty=0,col=lava::Col(1)) with(pp, lines(x,tanh(-x),lwd=2,lty=2)) xp <- seq(-1,1,length.out=6); delta <- mean(diff(xp)) a2 <- biprobit(y~1+x,rho=~1+I(cut(x,breaks=xp)),data=dd,id="id") pp2 <- predict(a2,data.frame(x=xp[-1]-delta/2),which=c(9)) ## rho lava::confband(pp2$x,pp2[,2],pp2[,3],center=pp2[,1]) ## Time ## Not run: a <- biprobit.time(cancer~1, rho=~1+zyg, id="id", data=prt, eqmarg=TRUE, cens.formula=Surv(time,status==0)~1, breaks=seq(75,100,by=3),fix.censweights=TRUE) a <- biprobit.time2(cancer~1+zyg, rho=~1+zyg, id="id", data=prt0, eqmarg=TRUE, cens.formula=Surv(time,status==0)~zyg, breaks=100) #a1 <- biprobit.time2(cancer~1, rho=~1, id="id", data=subset(prt0,zyg=="MZ"), eqmarg=TRUE, # cens.formula=Surv(time,status==0)~1, # breaks=100,pairs.only=TRUE) #a2 <- biprobit.time2(cancer~1, rho=~1, id="id", data=subset(prt0,zyg=="DZ"), eqmarg=TRUE, # cens.formula=Surv(time,status==0)~1, # breaks=100,pairs.only=TRUE) ## End(Not run)
Sample blockwise from clustered data
blocksample(data, size, idvar = NULL, replace = TRUE, ...)blocksample(data, size, idvar = NULL, replace = TRUE, ...)
data |
Data frame |
size |
Size of samples |
idvar |
Column defining the clusters |
replace |
Logical indicating wether to sample with replacement |
... |
additional arguments to lower level functions |
Original id is stored in the attribute 'id'
data.frame
Klaus K. Holst
d <- data.frame(x=rnorm(5), z=rnorm(5), id=c(4,10,10,5,5), v=rnorm(5)) (dd <- blocksample(d,size=20,~id)) attributes(dd)$id ## Not run: blocksample(data.table::data.table(d),1e6,~id) ## End(Not run) d <- data.frame(x=c(1,rnorm(9)), z=rnorm(10), id=c(4,10,10,5,5,4,4,5,10,5), id2=c(1,1,2,1,2,1,1,1,1,2), v=rnorm(10)) dsample(d,~id, size=2) dsample(d,.~id+id2) dsample(d,x+z~id|x>0,size=5)d <- data.frame(x=rnorm(5), z=rnorm(5), id=c(4,10,10,5,5), v=rnorm(5)) (dd <- blocksample(d,size=20,~id)) attributes(dd)$id ## Not run: blocksample(data.table::data.table(d),1e6,~id) ## End(Not run) d <- data.frame(x=c(1,rnorm(9)), z=rnorm(10), id=c(4,10,10,5,5,4,4,5,10,5), id2=c(1,1,2,1,2,1,1,1,1,2), v=rnorm(10)) dsample(d,~id, size=2) dsample(d,.~id+id2) dsample(d,x+z~id|x>0,size=5)
Bone marrow transplant data with 408 rows and 5 columns.
The data has 408 rows and 5 columns.
a numeric vector code. Survival status. 1: dead from treatment related causes, 2: relapse , 0: censored.
a numeric vector. Survival time.
a numeric vector code. Plalelet 1: more than 100 x
per L, 0: less.
a numeric vector. T-cell depleted BMT 1:yes, 0:no.
a numeric vector code. Age of patient, scaled and centered ((age-35)/15).
Simulated data
NN
data(bmt) names(bmt)data(bmt) names(bmt)
Liability-threshold model for twin data
bptwin( x, data, id, zyg, DZ, group = NULL, num = NULL, weights = NULL, weights.fun = function(x) ifelse(any(x <= 0), 0, max(x)), strata = NULL, messages = 1, control = list(trace = 0), type = "ace", eqmean = TRUE, pairs.only = FALSE, samecens = TRUE, allmarg = samecens & !is.null(weights), stderr = TRUE, robustvar = TRUE, p, indiv = FALSE, constrain, varlink, ... )bptwin( x, data, id, zyg, DZ, group = NULL, num = NULL, weights = NULL, weights.fun = function(x) ifelse(any(x <= 0), 0, max(x)), strata = NULL, messages = 1, control = list(trace = 0), type = "ace", eqmean = TRUE, pairs.only = FALSE, samecens = TRUE, allmarg = samecens & !is.null(weights), stderr = TRUE, robustvar = TRUE, p, indiv = FALSE, constrain, varlink, ... )
x |
Formula specifying effects of covariates on the response. |
data |
|
id |
The name of the column in the dataset containing the twin-id variable. |
zyg |
The name of the column in the dataset containing the zygosity variable. |
DZ |
Character defining the level in the zyg variable corresponding to the dyzogitic twins. |
group |
Optional. Variable name defining group for interaction analysis (e.g., gender) |
num |
Optional twin number variable |
weights |
Weight matrix if needed by the chosen estimator (IPCW) |
weights.fun |
Function defining a single weight each individual/cluster |
strata |
Strata |
messages |
Control amount of messages shown |
control |
Control argument parsed on to the optimization routine. Starting values may be parsed as ' |
type |
Character defining the type of analysis to be performed. Should be a subset of "acde" (additive genetic factors, common environmental factors, dominant genetic factors, unique environmental factors). |
eqmean |
Equal means (with type="cor")? |
pairs.only |
Include complete pairs only? |
samecens |
Same censoring |
allmarg |
Should all marginal terms be included |
stderr |
Should standard errors be calculated? |
robustvar |
If TRUE robust (sandwich) variance estimates of the variance are used |
p |
Parameter vector p in which to evaluate log-Likelihood and score function |
indiv |
If TRUE the score and log-Likelihood contribution of each twin-pair |
constrain |
Development argument |
varlink |
Link function for variance parameters |
... |
Additional arguments to lower level functions |
Klaus K. Holst
twinlm, twinlm.time, twinlm.strata, twinsim
data(twinstut) b0 <- bptwin(stutter~sex, data=droplevels( subset(twinstut, zyg%in%c("mz","dz") & tvparnr<5e3) ), id="tvparnr",zyg="zyg",DZ="dz",type="ae") summary(b0)data(twinstut) b0 <- bptwin(stutter~sex, data=droplevels( subset(twinstut, zyg%in%c("mz","dz") & tvparnr<5e3) ), id="tvparnr",zyg="zyg",DZ="dz",type="ae") summary(b0)
Data from CALGB 8923
Data from smart design id: id of subject status : 1-death, 2-response for second randomization, 0-censoring A0 : treament at first randomization A1 : treament at second randomization At.f : treament given at record (A0 or A1) TR : time of response sex : 0-males, 1-females consent: 1 if agrees to 2nd randomization, censored if not R: 1 if response trt1: A0 trt2: A1
https://github.com/ycchao/code_Joint_model_SMART
data(calgb8923)data(calgb8923)
Estimates the casewise concordance based on concordance and marginal estimates
derived from prodlim objects. Unlike test_casewise, this function
does not perform hypothesis testing but focuses on estimation and plotting.
casewise(conc, marg, cause.marg)casewise(conc, marg, cause.marg)
conc |
Concordance object from |
marg |
Marginal cumulative incidence object from |
cause.marg |
Specifies which cause should be used for the marginal CIF based on the |
An object of class "casewise" containing:
casewise |
Matrix with time, casewise concordance, and standard errors. |
marg |
Matrix with time, marginal CIF, and standard errors. |
concordance |
Matrix with time, concordance, and standard errors. |
timer |
Time points used. |
P1, se.P1
|
Extracted concordance values and SEs. |
Thomas Scheike
## Reduce Ex.Timings library(prodlim) data(prt); prt <- force_same_cens(prt,cause="status") ### marginal cumulative incidence of prostate cancer outm <- prodlim(Hist(time,status)~+1,data=prt) times <- 60:100 cifmz <- predict(outm,cause=2,time=times,newdata=data.frame(zyg="MZ")) cifdz <- predict(outm,cause=2,time=times,newdata=data.frame(zyg="DZ")) ### concordance for MZ and DZ twins cc <- bicomprisk(Event(time,status)~strata(zyg)+id(id),data=prt,cause=c(2,2),prodlim=TRUE) cdz <- cc$model$"DZ" cmz <- cc$model$"MZ" cdz <- casewise(cdz,outm,cause.marg=2) cmz <- casewise(cmz,outm,cause.marg=2) plot(cmz,ci=NULL,ylim=c(0,0.5),xlim=c(60,100),legend=TRUE,col=c(3,2,1)) par(new=TRUE) plot(cdz,ci=NULL,ylim=c(0,0.5),xlim=c(60,100),legend=TRUE) summary(cdz) summary(cmz)## Reduce Ex.Timings library(prodlim) data(prt); prt <- force_same_cens(prt,cause="status") ### marginal cumulative incidence of prostate cancer outm <- prodlim(Hist(time,status)~+1,data=prt) times <- 60:100 cifmz <- predict(outm,cause=2,time=times,newdata=data.frame(zyg="MZ")) cifdz <- predict(outm,cause=2,time=times,newdata=data.frame(zyg="DZ")) ### concordance for MZ and DZ twins cc <- bicomprisk(Event(time,status)~strata(zyg)+id(id),data=prt,cause=c(2,2),prodlim=TRUE) cdz <- cc$model$"DZ" cmz <- cc$model$"MZ" cdz <- casewise(cdz,outm,cause.marg=2) cmz <- casewise(cmz,outm,cause.marg=2) plot(cmz,ci=NULL,ylim=c(0,0.5),xlim=c(60,100),legend=TRUE,col=c(3,2,1)) par(new=TRUE) plot(cdz,ci=NULL,ylim=c(0,0.5),xlim=c(60,100),legend=TRUE) summary(cdz) summary(cmz)
Computes casewise concordance probability and confidence interval from counts of concordant and discordant pairs using a binomial GLM.
casewise_bin(nc, nd)casewise_bin(nc, nd)
nc |
number of concordant pairs. |
nd |
number of discordant pairs. |
A list with p.casewise (estimated probability) and
ci.casewise (confidence interval).
Computes cumulative incidence functions with robust standard errors.
cif(formula, data = data, cause = 1, cens.code = 0, death.code = NULL, ...)cif(formula, data = data, cause = 1, cens.code = 0, death.code = NULL, ...)
formula |
Formula with 'Event' outcome and |
data |
Data frame. |
cause |
Cause of interest (default is |
cens.code |
Censoring code (default is |
death.code |
Alternative to |
... |
Additional arguments passed to lower-level functions. |
An object of class "cif" (extends "phreg") containing:
cumhaz |
Cumulative incidence estimates. |
se.cumhaz |
Standard errors. |
cause |
Cause of interest. |
Thomas Scheike
data(bmt) bmt$cluster <- sample(1:100, 408, replace = TRUE) out1 <- cif(Event(time, cause) ~ +1, data = bmt, cause = 1) out2 <- cif(Event(time, cause) ~ +1 + cluster(cluster), data = bmt, cause = 1) par(mfrow = c(1, 2)) plot(out1, se = TRUE) plot(out2, se = TRUE)data(bmt) bmt$cluster <- sample(1:100, 408, replace = TRUE) out1 <- cif(Event(time, cause) ~ +1, data = bmt, cause = 1) out2 <- cif(Event(time, cause) ~ +1 + cluster(cluster), data = bmt, cause = 1) par(mfrow = c(1, 2)) plot(out1, se = TRUE) plot(out2, se = TRUE)
Computes the Restricted Mean Time Lost (RMTL) for competing risks based on the integrated Aalen-Johansen estimator.
cif_yearslost(formula, data = data, cens.code = 0, times = NULL, ...)cif_yearslost(formula, data = data, cens.code = 0, times = NULL, ...)
formula |
Formula for |
data |
Data frame for calculations. |
cens.code |
Censoring code (needed to separate event codes from censorings). |
times |
Possible times for which to report restricted mean. Summary displays estimates for these times. |
... |
Additional arguments passed to |
A set of time points can be given to be returned in the summary, but the function
computes years-lost for all event times and can be plotted/viewed.
The RMTL for a specific time-point can also be computed using the rmstIPCW function.
An object of class "resmean_phreg" containing:
cumhaz |
Matrix of cumulative hazards (years lost). |
se.cumhaz |
Standard errors. |
intF1times |
Years lost at specified times. |
causes |
Vector of cause codes. |
Thomas Scheike
data(bmt) bmt$time <- bmt$time + runif(408) * 0.001 ## Years lost decomposed into causes drm1 <- cif_yearslost(Event(time, cause) ~ strata(tcell, platelet), data = bmt, times = c(40, 50)) par(mfrow = c(1, 2)) plot(drm1, cause = 1, se = 1) plot(drm1, cause = 2, se = 1) summary(drm1) estimate(drm1, cause = 1) estimate(drm1, cause = 2) ## Comparing populations drm1 <- cif_yearslost(Event(time, cause) ~ strata(tcell, platelet), data = bmt, times = 40) summary(drm1, contrast = list(1:4)) e1 <- estimate(drm1) estimate(e1, rbind(c(1, -1, 0, 0)))data(bmt) bmt$time <- bmt$time + runif(408) * 0.001 ## Years lost decomposed into causes drm1 <- cif_yearslost(Event(time, cause) ~ strata(tcell, platelet), data = bmt, times = c(40, 50)) par(mfrow = c(1, 2)) plot(drm1, cause = 1, se = 1) plot(drm1, cause = 2, se = 1) summary(drm1) estimate(drm1, cause = 1) estimate(drm1, cause = 2) ## Comparing populations drm1 <- cif_yearslost(Event(time, cause) ~ strata(tcell, platelet), data = bmt, times = 40) summary(drm1, contrast = list(1:4)) e1 <- estimate(drm1) estimate(e1, rbind(c(1, -1, 0, 0)))
Fits a regression model for the cumulative incidence function (CIF) in the presence of competing risks. Supports two link functions:
propodds=1 (default): Logistic link model (logit of CIF), providing Odds Ratio (OR) interpretations.
propodds=NULL: Fine-Gray (cloglog) regression model, providing subdistribution hazard ratio interpretations.
cifreg( formula, data, propodds = 1, cause = 1, cens.code = 0, no.codes = NULL, death.code = NULL, ... )cifreg( formula, data, propodds = 1, cause = 1, cens.code = 0, no.codes = NULL, death.code = NULL, ... )
formula |
Formula with an 'Event' outcome. |
data |
Data frame containing the variables. |
propodds |
Logical; if |
cause |
Cause of interest (default is 1). |
cens.code |
Code for censoring (default is 0). |
no.codes |
Event codes to be ignored when identifying competing causes (useful for administrative censoring). |
death.code |
Codes for death (terminal events). If |
... |
Additional arguments passed to |
For the Fine-Gray model, the score equations are:
summed over clusters and returned as iid.naive (multiplied by the inverse of the second derivative).
Here,
,
, and
.
The full influence function (IID decomposition) for the beta coefficients includes a censoring term:
which is returned as the iid component.
For the logistic link model, standard errors may be slightly underestimated because uncertainty from the
recursive baseline estimation is not fully accounted for. For smaller datasets, it is recommended to use
the prop.odds.subdist function from the timereg package (which uses more efficient weights)
or to bootstrap the standard errors.
An object of class "cifreg" (extending "phreg") containing:
coef |
Estimated coefficients. |
var |
Robust variance-covariance matrix. |
iid |
Influence functions for the coefficients. |
cumhaz |
Cumulative incidence estimates. |
propodds |
Indicator of the link function used. |
Thomas Scheike
## data with no ties data(bmt,package="mets") bmt$time <- bmt$time+runif(nrow(bmt))*0.01 bmt$id <- 1:nrow(bmt) ## logistic link OR interpretation or=cifreg(Event(time,cause)~tcell+platelet+age,data=bmt,cause=1) summary(or) par(mfrow=c(1,2)) plot(or) nd <- data.frame(tcell=c(1,0),platelet=0,age=0) por <- predict(or,nd) plot(por) ## approximate standard errors por <-mets:::predict.phreg(or,nd) plot(por,se=1) ## Fine-Gray model fg=cifregFG(Event(time,cause)~tcell+platelet+age,data=bmt,cause=1) summary(fg) ##fg=recreg(Event(time,cause)~tcell+platelet+age,data=bmt,cause=1,death.code=2) ##summary(fg) plot(fg) nd <- data.frame(tcell=c(1,0),platelet=0,age=0) pfg <- predict(fg,nd,se=1) plot(pfg,se=1) ## bt <- iidBaseline(fg,time=30) ## bt <- IIDrecreg(fg$cox.prep,fg,time=30) ## not run to avoid timing issues ## gofFG(Event(time,cause)~tcell+platelet+age,data=bmt,cause=1) sfg <- cifregFG(Event(time,cause)~strata(tcell)+platelet+age,data=bmt,cause=1) summary(sfg) plot(sfg) ### predictions with CI based on iid decomposition of baseline and beta ### these are used in the predict function above fg <- cifregFG(Event(time,cause)~tcell+platelet+age,data=bmt,cause=1) Biid <- iidBaseline(fg,time=20) pfg1 <- FGprediid(Biid,nd) pfg1## data with no ties data(bmt,package="mets") bmt$time <- bmt$time+runif(nrow(bmt))*0.01 bmt$id <- 1:nrow(bmt) ## logistic link OR interpretation or=cifreg(Event(time,cause)~tcell+platelet+age,data=bmt,cause=1) summary(or) par(mfrow=c(1,2)) plot(or) nd <- data.frame(tcell=c(1,0),platelet=0,age=0) por <- predict(or,nd) plot(por) ## approximate standard errors por <-mets:::predict.phreg(or,nd) plot(por,se=1) ## Fine-Gray model fg=cifregFG(Event(time,cause)~tcell+platelet+age,data=bmt,cause=1) summary(fg) ##fg=recreg(Event(time,cause)~tcell+platelet+age,data=bmt,cause=1,death.code=2) ##summary(fg) plot(fg) nd <- data.frame(tcell=c(1,0),platelet=0,age=0) pfg <- predict(fg,nd,se=1) plot(pfg,se=1) ## bt <- iidBaseline(fg,time=30) ## bt <- IIDrecreg(fg$cox.prep,fg,time=30) ## not run to avoid timing issues ## gofFG(Event(time,cause)~tcell+platelet+age,data=bmt,cause=1) sfg <- cifregFG(Event(time,cause)~strata(tcell)+platelet+age,data=bmt,cause=1) summary(sfg) plot(sfg) ### predictions with CI based on iid decomposition of baseline and beta ### these are used in the predict function above fg <- cifregFG(Event(time,cause)~tcell+platelet+age,data=bmt,cause=1) Biid <- iidBaseline(fg,time=20) pfg1 <- FGprediid(Biid,nd) pfg1
Convenience wrapper for cifreg that specifically fits the Fine-Gray model
by setting propodds=NULL. This provides subdistribution hazard ratio interpretations.
cifregFG(formula, data, propodds = NULL, ...)cifregFG(formula, data, propodds = NULL, ...)
formula |
Formula with an 'Event' outcome. |
data |
Data frame. |
propodds |
Set to |
... |
Additional arguments passed to |
An object of class "cifreg" (extending "phreg") with propodds=NULL.
Thomas Scheike
Clayton-Oakes frailty model
ClaytonOakes( formula, data = parent.frame(), cluster, var.formula = ~1, cuts = NULL, type = "piecewise", start, control = list(), var.invlink = exp, ... )ClaytonOakes( formula, data = parent.frame(), cluster, var.formula = ~1, cuts = NULL, type = "piecewise", start, control = list(), var.invlink = exp, ... )
formula |
formula specifying the marginal proportional (piecewise constant) hazard structure with the right-hand-side being a survival object (Surv) specifying the entry time (optional), the follow-up time, and event/censoring status at follow-up. The clustering can be specified using the special function |
data |
Data frame |
cluster |
Variable defining the clustering (if not given in the formula) |
var.formula |
Formula specifying the variance component structure (if not given via the cluster special function in the formula) using a linear model with log-link. |
cuts |
Cut points defining the piecewise constant hazard |
type |
when equal to |
start |
Optional starting values |
control |
Control parameters to the optimization routine |
var.invlink |
Inverse link function for variance structure model |
... |
Additional arguments |
Klaus K. Holst
set.seed(1) d <- subset(sim_ClaytonOakes(500,4,2,1,stoptime=2,left=2),truncated) e <- ClaytonOakes(survival::Surv(lefttime,time,status)~x+cluster(~1,cluster), cuts=c(0,0.5,1,2),data=d) e d2 <- sim_ClaytonOakes(500,4,2,1,stoptime=2,left=0) d2$z <- rep(1,nrow(d2)); d2$z[d2$cluster%in%sample(d2$cluster,100)] <- 0 ## Marginal=Cox Proportional Hazards model: ## ts <- ClaytonOakes(survival::Surv(time,status)~timereg::prop(x)+cluster(~1,cluster), ## data=d2,type="two.stage") ## Marginal=Aalens additive model: ## ts2 <- ClaytonOakes(survival::Surv(time,status)~x+cluster(~1,cluster), ## data=d2,type="two.stage") ## Marginal=Piecewise constant: e2 <- ClaytonOakes(survival::Surv(time,status)~x+cluster(~-1+factor(z),cluster), cuts=c(0,0.5,1,2),data=d2) e2 e0 <- ClaytonOakes(survival::Surv(time,status)~cluster(~-1+factor(z),cluster), cuts=c(0,0.5,1,2),data=d2) ##ts0 <- ClaytonOakes(survival::Surv(time,status)~cluster(~1,cluster), ## data=d2,type="two.stage") ##plot(ts0) ##plot(e0) e3 <- ClaytonOakes(survival::Surv(time,status)~x+cluster(~1,cluster),cuts=c(0,0.5,1,2), data=d,var.invlink=identity) e3set.seed(1) d <- subset(sim_ClaytonOakes(500,4,2,1,stoptime=2,left=2),truncated) e <- ClaytonOakes(survival::Surv(lefttime,time,status)~x+cluster(~1,cluster), cuts=c(0,0.5,1,2),data=d) e d2 <- sim_ClaytonOakes(500,4,2,1,stoptime=2,left=0) d2$z <- rep(1,nrow(d2)); d2$z[d2$cluster%in%sample(d2$cluster,100)] <- 0 ## Marginal=Cox Proportional Hazards model: ## ts <- ClaytonOakes(survival::Surv(time,status)~timereg::prop(x)+cluster(~1,cluster), ## data=d2,type="two.stage") ## Marginal=Aalens additive model: ## ts2 <- ClaytonOakes(survival::Surv(time,status)~x+cluster(~1,cluster), ## data=d2,type="two.stage") ## Marginal=Piecewise constant: e2 <- ClaytonOakes(survival::Surv(time,status)~x+cluster(~-1+factor(z),cluster), cuts=c(0,0.5,1,2),data=d2) e2 e0 <- ClaytonOakes(survival::Surv(time,status)~cluster(~-1+factor(z),cluster), cuts=c(0,0.5,1,2),data=d2) ##ts0 <- ClaytonOakes(survival::Surv(time,status)~cluster(~1,cluster), ## data=d2,type="two.stage") ##plot(ts0) ##plot(e0) e3 <- ClaytonOakes(survival::Surv(time,status)~x+cluster(~1,cluster),cuts=c(0,0.5,1,2), data=d,var.invlink=identity) e3
Finds subjects related to same cluster
cluster_index( clusters, index.type = FALSE, num = NULL, Rindex = 0, mat = NULL, return.all = FALSE, code.na = NA )cluster_index( clusters, index.type = FALSE, num = NULL, Rindex = 0, mat = NULL, return.all = FALSE, code.na = NA )
clusters |
list of indeces |
index.type |
if TRUE then already list of integers of index.type |
num |
to get numbering according to num-type in separate columns |
Rindex |
index starts with 1, in C is it is 0 |
mat |
to return matrix of indeces |
return.all |
return all arguments |
code.na |
how to code missing values |
Klaus Holst, Thomas Scheike
Cluster indeces
familycluster_index familyclusterWithProbands.index
i<-c(1,1,2,2,1,3) d<- cluster_index(i) print(d) type<-c("m","f","m","c","c","c") d<- cluster_index(i,num=type,Rindex=1) print(d)i<-c(1,1,2,2,1,3) d<- cluster_index(i) print(d) type<-c("m","f","m","c","c","c") d<- cluster_index(i,num=type,Rindex=1) print(d)
Reduces the number of unique clusters by binning into quantile groups.
coarse_clust(clusters, max.clust = 100)coarse_clust(clusters, max.clust = 100)
clusters |
vector of cluster identifiers. |
max.clust |
maximum number of coarsened clusters. |
Integer vector of coarsened cluster indices (0-based).
Concordance for Twins
concordanceCor( object, cif1, cif2 = NULL, messages = TRUE, model = NULL, coefs = NULL, ... )concordanceCor( object, cif1, cif2 = NULL, messages = TRUE, model = NULL, coefs = NULL, ... )
object |
Output from the cor_cif, rr_cif or or_cif function |
cif1 |
Marginal cumulative incidence |
cif2 |
Marginal cumulative incidence of other cause (cause2) if it is different from cause1 |
messages |
To print messages |
model |
Specfifies wich model that is considered if object not given. |
coefs |
Specfifies dependence parameters if object is not given. |
... |
Extra arguments, not used. |
The concordance is the probability that both twins have experienced the event of interest and is defined as
Similarly, the casewise concordance is
that is the probability that twin "2" has the event given that twins "1" has.
Thomas Scheike
Estimating twin concordance for bivariate competing risks twin data Thomas H. Scheike, Klaus K. Holst and Jacob B. Hjelmborg, Statistics in Medicine 2014, 1193-1204
Estimating Twin Pair Concordance for Age of Onset. Thomas H. Scheike, Jacob V B Hjelmborg, Klaus K. Holst, 2015 in Behavior genetics DOI:10.1007/s10519-015-9729-3
Fits a parametric model for the log-cross-odds-ratio for the
predictive effect of for the cumulative incidence curves for
experiencing cause i given that has experienced a cause k :
with the log cross odds ratio being
the conditional odds divided by the unconditional odds, with the odds being, respectively
and
Here is the complement event of ,
is the distribution given covariates
( are subject specific and are cluster specific covariates), and
is a function that is the simple identity
by default.
cor_cif( cif, data, cause = NULL, times = NULL, cause1 = 1, cause2 = 1, cens.code = NULL, cens.model = "KM", Nit = 40, detail = 0, clusters = NULL, theta = NULL, theta.des = NULL, step = 1, sym = 0, weights = NULL, par.func = NULL, dpar.func = NULL, dimpar = NULL, score.method = "nlminb", same.cens = FALSE, censoring.weights = NULL, silent = 1, ... )cor_cif( cif, data, cause = NULL, times = NULL, cause1 = 1, cause2 = 1, cens.code = NULL, cens.model = "KM", Nit = 40, detail = 0, clusters = NULL, theta = NULL, theta.des = NULL, step = 1, sym = 0, weights = NULL, par.func = NULL, dpar.func = NULL, dimpar = NULL, score.method = "nlminb", same.cens = FALSE, censoring.weights = NULL, silent = 1, ... )
cif |
a model object from the timereg::comp.risk function with the marginal cumulative incidence of cause1, i.e., the event of interest, and whose odds the comparision is compared to the conditional odds given cause2 |
data |
a data.frame with the variables. |
cause |
specifies the causes related to the death times, the value cens.code is the censoring value. When missing it comes from marginal cif. |
times |
time-vector that specifies the times used for the estimating euqations for the cross-odds-ratio estimation. |
cause1 |
specificies the cause considered. |
cause2 |
specificies the cause that is conditioned on. |
cens.code |
specificies the code for the censoring if NULL then uses the one from the marginal cif model. |
cens.model |
specified which model to use for the ICPW, KM is Kaplan-Meier alternatively it may be "cox" |
Nit |
number of iterations for Newton-Raphson algorithm. |
detail |
if 0 no details are printed during iterations, if 1 details are given. |
clusters |
specifies the cluster structure. |
theta |
specifies starting values for the cross-odds-ratio parameters of the model. |
theta.des |
specifies a regression design for the cross-odds-ratio parameters. |
step |
specifies the step size for the Newton-Raphson algorithm. |
sym |
specifies if symmetry is used in the model. |
weights |
weights for estimating equations. |
par.func |
parfunc |
dpar.func |
dparfunc |
dimpar |
dimpar |
score.method |
"nlminb", can also use "nr". |
same.cens |
if true then censoring within clusters are assumed to be the same variable, default is independent censoring. |
censoring.weights |
these probabilities are used for the bivariate censoring dist. |
silent |
1 to suppress output about convergence related issues. |
... |
Not used. |
The OR dependence measure is given by
This measure is numerically more stabile than the COR measure, and is symetric in i,k.
The RR dependence measure is given by
This measure is numerically more stabile than the COR measure, and is symetric in i,k.
The model is fitted under symmetry (sym=1), i.e., such that it is assumed
that and can be interchanged and leads to
the same cross-odd-ratio (i.e.
,
as would be expected for twins
or without symmetry as might be the case with mothers and daughters (sym=0).
may be specified as an R-function of the parameters,
see example below, but the default is that it is simply .
returns an object of type 'cor'. With the following arguments:
theta |
estimate of proportional odds parameters of model. |
var.theta |
variance for gamma. |
hess |
the derivative of the used score. |
score |
scores at final stage. |
score |
scores at final stage. |
theta.iid |
matrix of iid decomposition of parametric effects. |
Thomas Scheike
Cross odds ratio Modelling of dependence for Multivariate Competing Risks Data, Scheike and Sun (2012), Biostatistics.
A Semiparametric Random Effects Model for Multivariate Competing Risks Data, Scheike, Zhang, Sun, Jensen (2010), Biometrika.
## Reduce Ex.Timings library("timereg") data(multcif); multcif$cause[multcif$cause==0] <- 2 zyg <- rep(rbinom(200,1,0.5),each=2) theta.des <- model.matrix(~-1+factor(zyg)) times=seq(0.05,1,by=0.05) # to speed up computations use only these time-points add <- timereg::comp.risk(Event(time,cause)~+1+cluster(id),data=multcif,cause=1, n.sim=0,times=times,model="fg",max.clust=NULL) add2 <- timereg::comp.risk(Event(time,cause)~+1+cluster(id),data=multcif,cause=2, n.sim=0,times=times,model="fg",max.clust=NULL) out1 <- cor_cif(add,data=multcif,cause1=1,cause2=1) summary(out1) out2 <- cor_cif(add,data=multcif,cause1=1,cause2=1,theta.des=theta.des) summary(out2) ##out3 <- cor_cif(add,data=multcif,cause1=1,cause2=2,cif2=add2) ##summary(out3) ########################################################### # investigating further models using parfunc and dparfunc ########################################################### set.seed(100) prt<-sim_nordic_random(2000,cordz=2,cormz=5) prt$status <-prt$cause table(prt$status) times <- seq(40,100,by=10) cifmod <- timereg::comp.risk(Event(time,cause)~+1+cluster(id),data=prt, cause=1,n.sim=0, times=times,conservative=1,max.clust=NULL,model="fg") theta.des <- model.matrix(~-1+factor(zyg),data=prt) parfunc <- function(par,t,pardes) { par <- pardes %*% c(par[1],par[2]) + pardes %*% c( par[3]*(t-60)/12,par[4]*(t-60)/12) par } head(parfunc(c(0.1,1,0.1,1),50,theta.des)) dparfunc <- function(par,t,pardes) { dpar <- cbind(pardes, t(t(pardes) * c( (t-60)/12,(t-60)/12)) ) dpar } head(dparfunc(c(0.1,1,0.1,1),50,theta.des)) names(prt) or1 <- or_cif(cifmod,data=prt,cause1=1,cause2=1,theta.des=theta.des, same.cens=TRUE,theta=c(0.6,1.1,0.1,0.1), par.func=parfunc,dpar.func=dparfunc,dimpar=4, score.method="nr",detail=1) summary(or1) ## cor1 <- cor_cif(cifmod,data=prt,cause1=1,cause2=1,theta.des=theta.des, ## same.cens=TRUE,theta=c(0.5,1.0,0.1,0.1), ## par.func=parfunc,dpar.func=dparfunc,dimpar=4, ## control=list(trace=TRUE),detail=1) ## summary(cor1) ### piecewise contant OR model gparfunc <- function(par,t,pardes) { cuts <- c(0,80,90,120) grop <- diff(t<cuts) paru <- (pardes[,1]==1) * sum(grop*par[1:3]) + (pardes[,2]==1) * sum(grop*par[4:6]) paru } dgparfunc <- function(par,t,pardes) { cuts <- c(0,80,90,120) grop <- diff(t<cuts) par1 <- matrix(c(grop),nrow(pardes),length(grop),byrow=TRUE) parmz <- par1* (pardes[,1]==1) pardz <- (pardes[,2]==1) * par1 dpar <- cbind( parmz,pardz) dpar } head(dgparfunc(rep(0.1,6),50,theta.des)) head(gparfunc(rep(0.1,6),50,theta.des)) or1g <- or_cif(cifmod,data=prt,cause1=1,cause2=1, theta.des=theta.des, same.cens=TRUE, par.func=gparfunc,dpar.func=dgparfunc, dimpar=6,score.method="nr",detail=1) summary(or1g) names(or1g) head(or1g$theta.iid)## Reduce Ex.Timings library("timereg") data(multcif); multcif$cause[multcif$cause==0] <- 2 zyg <- rep(rbinom(200,1,0.5),each=2) theta.des <- model.matrix(~-1+factor(zyg)) times=seq(0.05,1,by=0.05) # to speed up computations use only these time-points add <- timereg::comp.risk(Event(time,cause)~+1+cluster(id),data=multcif,cause=1, n.sim=0,times=times,model="fg",max.clust=NULL) add2 <- timereg::comp.risk(Event(time,cause)~+1+cluster(id),data=multcif,cause=2, n.sim=0,times=times,model="fg",max.clust=NULL) out1 <- cor_cif(add,data=multcif,cause1=1,cause2=1) summary(out1) out2 <- cor_cif(add,data=multcif,cause1=1,cause2=1,theta.des=theta.des) summary(out2) ##out3 <- cor_cif(add,data=multcif,cause1=1,cause2=2,cif2=add2) ##summary(out3) ########################################################### # investigating further models using parfunc and dparfunc ########################################################### set.seed(100) prt<-sim_nordic_random(2000,cordz=2,cormz=5) prt$status <-prt$cause table(prt$status) times <- seq(40,100,by=10) cifmod <- timereg::comp.risk(Event(time,cause)~+1+cluster(id),data=prt, cause=1,n.sim=0, times=times,conservative=1,max.clust=NULL,model="fg") theta.des <- model.matrix(~-1+factor(zyg),data=prt) parfunc <- function(par,t,pardes) { par <- pardes %*% c(par[1],par[2]) + pardes %*% c( par[3]*(t-60)/12,par[4]*(t-60)/12) par } head(parfunc(c(0.1,1,0.1,1),50,theta.des)) dparfunc <- function(par,t,pardes) { dpar <- cbind(pardes, t(t(pardes) * c( (t-60)/12,(t-60)/12)) ) dpar } head(dparfunc(c(0.1,1,0.1,1),50,theta.des)) names(prt) or1 <- or_cif(cifmod,data=prt,cause1=1,cause2=1,theta.des=theta.des, same.cens=TRUE,theta=c(0.6,1.1,0.1,0.1), par.func=parfunc,dpar.func=dparfunc,dimpar=4, score.method="nr",detail=1) summary(or1) ## cor1 <- cor_cif(cifmod,data=prt,cause1=1,cause2=1,theta.des=theta.des, ## same.cens=TRUE,theta=c(0.5,1.0,0.1,0.1), ## par.func=parfunc,dpar.func=dparfunc,dimpar=4, ## control=list(trace=TRUE),detail=1) ## summary(cor1) ### piecewise contant OR model gparfunc <- function(par,t,pardes) { cuts <- c(0,80,90,120) grop <- diff(t<cuts) paru <- (pardes[,1]==1) * sum(grop*par[1:3]) + (pardes[,2]==1) * sum(grop*par[4:6]) paru } dgparfunc <- function(par,t,pardes) { cuts <- c(0,80,90,120) grop <- diff(t<cuts) par1 <- matrix(c(grop),nrow(pardes),length(grop),byrow=TRUE) parmz <- par1* (pardes[,1]==1) pardz <- (pardes[,2]==1) * par1 dpar <- cbind( parmz,pardz) dpar } head(dgparfunc(rep(0.1,6),50,theta.des)) head(gparfunc(rep(0.1,6),50,theta.des)) or1g <- or_cif(cifmod,data=prt,cause1=1,cause2=1, theta.des=theta.des, same.cens=TRUE, par.func=gparfunc,dpar.func=dgparfunc, dimpar=6,score.method="nr",detail=1) summary(or1g) names(or1g) head(or1g$theta.iid)
For each subject and each row of data, counts the number of prior
events of specified types in the recurrent event history. The resulting
count columns can be used as time-dependent covariates in subsequent models,
e.g. to capture event-history dependence in the recurrent event rate.
count_history( data, status = "status", id = "id", types = 1, names.count = "Count", lag = TRUE, multitype = FALSE, marks = NULL )count_history( data, status = "status", id = "id", types = 1, names.count = "Count", lag = TRUE, multitype = FALSE, marks = NULL )
data |
A data frame in counting-process format, with one row per event interval per subject. |
status |
Name of the column containing event status codes. Default is
|
id |
Name of the column containing subject identifiers. Default is
|
types |
Integer vector of status codes to count. Each value in
|
names.count |
Prefix for the names of the new count columns. The
status code is appended, e.g. |
lag |
Logical. If |
multitype |
Logical. If |
marks |
Optional numeric vector of weights applied to events when
|
When lag = TRUE (default), the count at each row reflects events that
occurred strictly before the current time point (i.e. ), making
it suitable as a left-continuous covariate in counting-process models.
The input data frame data with one or more new integer columns
appended. With multitype = FALSE, columns are named
paste0(names.count, k) for each k in types; with
multitype = TRUE, a single column named
paste0(names.count, types[1]) is added. An internal bookkeeping
column lbnr__id is also added.
Thomas Scheike
data(hfactioncpx12) hf <- hfactioncpx12 dtable(hf, ~status) ## Separate counts for event types 1 and 2 rr <- count_history(hf, types = 1:2, id = "id", status = "status") dtable(rr, ~"Count*" + status, level = 1)data(hfactioncpx12) hf <- hfactioncpx12 dtable(hf, ~status) ## Separate counts for event types 1 and 2 rr <- count_history(hf, types = 1:2, id = "id", status = "status") dtable(rr, ~"Count*" + status, level = 1)
Rates for HPN program for patients of Copenhagen Cohort
crbsi: cumulative rate of catheter related bloodstream infection in HPN patients of Copenhagen mechanical: cumulative rate of Mechanical (hole/defect) complication for catheter of HPN patients of Copenhagen trombo: cumulative rate of Occlusion/Thrombosis complication for catheter of HPN patients of Copenhagen terminal: rate of terminal event, patients leaving the HPN program
Estimated data
A wrapper function for interval_logitsurv_discrete that simplifies the
interface for discrete time-to-event data where the event time is observed exactly
or as a factor level. It converts a factor response into an interval-censored format
internally.
cumoddsreg(formula, data, ...)cumoddsreg(formula, data, ...)
formula |
Formula with a factor response on the left-hand side (representing the event time) and covariates on the right. |
data |
Data frame. |
... |
Arguments passed to |
An object of class "cumoddsreg" with the same structure as
interval_logitsurv_discrete.
Thomas Scheike
aggregating for for data frames
daggregate( data, y = NULL, x = NULL, subset, ..., fun = "summary", regex = mets.options()$regex, missing = FALSE, remove.empty = FALSE, matrix = FALSE, silent = FALSE, na.action = na.pass, convert = NULL )daggregate( data, y = NULL, x = NULL, subset, ..., fun = "summary", regex = mets.options()$regex, missing = FALSE, remove.empty = FALSE, matrix = FALSE, silent = FALSE, na.action = na.pass, convert = NULL )
data |
data.frame |
y |
name of variable, or formula, or names of variables on data frame. |
x |
name of variable, or formula, or names of variables on data frame. |
subset |
subset expression |
... |
additional arguments to lower level functions |
fun |
function defining aggregation |
regex |
interpret x,y as regular expressions |
missing |
Missing used in groups (x) |
remove.empty |
remove empty groups from output |
matrix |
if TRUE a matrix is returned instead of an array |
silent |
suppress messages |
na.action |
How model.frame deals with 'NA's |
convert |
if TRUE try to coerce result into matrix. Can also be a user-defined function |
data("sTRACE") daggregate(iris, "^.e.al", x="Species", fun=cor, regex=TRUE) daggregate(iris, Sepal.Length+Petal.Length ~Species, fun=summary) daggregate(iris, log(Sepal.Length)+I(Petal.Length>1.5) ~ Species, fun=summary) daggregate(iris, "*Length*", x="Species", fun=head) daggregate(iris, "^.e.al", x="Species", fun=tail, regex=TRUE) daggregate(sTRACE, status~ diabetes, fun=table) daggregate(sTRACE, status~ diabetes+sex, fun=table) daggregate(sTRACE, status + diabetes+sex ~ vf+I(wmi>1.4), fun=table) daggregate(iris, "^.e.al", x="Species",regex=TRUE) dlist(iris,Petal.Length+Sepal.Length ~ Species |Petal.Length>1.3 & Sepal.Length>5, n=list(1:3,-(3:1))) daggregate(iris, I(Sepal.Length>7)~Species | I(Petal.Length>1.5)) daggregate(iris, I(Sepal.Length>7)~Species | I(Petal.Length>1.5), fun=table) dsum(iris, .~Species, matrix=TRUE, missing=TRUE) par(mfrow=c(1,2)) data(iris) drename(iris) <- ~. daggregate(iris,'sepal*'~species|species!="virginica",fun=plot) daggregate(iris,'sepal*'~I(as.numeric(species))|I(as.numeric(species))!=1,fun=summary) dnumeric(iris) <- ~species daggregate(iris,'sepal*'~species.n|species.n!=1,fun=summary)data("sTRACE") daggregate(iris, "^.e.al", x="Species", fun=cor, regex=TRUE) daggregate(iris, Sepal.Length+Petal.Length ~Species, fun=summary) daggregate(iris, log(Sepal.Length)+I(Petal.Length>1.5) ~ Species, fun=summary) daggregate(iris, "*Length*", x="Species", fun=head) daggregate(iris, "^.e.al", x="Species", fun=tail, regex=TRUE) daggregate(sTRACE, status~ diabetes, fun=table) daggregate(sTRACE, status~ diabetes+sex, fun=table) daggregate(sTRACE, status + diabetes+sex ~ vf+I(wmi>1.4), fun=table) daggregate(iris, "^.e.al", x="Species",regex=TRUE) dlist(iris,Petal.Length+Sepal.Length ~ Species |Petal.Length>1.3 & Sepal.Length>5, n=list(1:3,-(3:1))) daggregate(iris, I(Sepal.Length>7)~Species | I(Petal.Length>1.5)) daggregate(iris, I(Sepal.Length>7)~Species | I(Petal.Length>1.5), fun=table) dsum(iris, .~Species, matrix=TRUE, missing=TRUE) par(mfrow=c(1,2)) data(iris) drename(iris) <- ~. daggregate(iris,'sepal*'~species|species!="virginica",fun=plot) daggregate(iris,'sepal*'~I(as.numeric(species))|I(as.numeric(species))!=1,fun=summary) dnumeric(iris) <- ~species daggregate(iris,'sepal*'~species.n|species.n!=1,fun=summary)
Derivatives of the bivariate normal cumulative distribution function
Dbvn(p,design=function(p,...) { return(list(mu=cbind(p[1],p[1]), dmu=cbind(1,1), S=matrix(c(p[2],p[3],p[3],p[4]),ncol=2), dS=rbind(c(1,0,0,0),c(0,1,1,0),c(0,0,0,1))) )}, Y=cbind(0,0))Dbvn(p,design=function(p,...) { return(list(mu=cbind(p[1],p[1]), dmu=cbind(1,1), S=matrix(c(p[2],p[3],p[3],p[4]),ncol=2), dS=rbind(c(1,0,0,0),c(0,1,1,0),c(0,0,0,1))) )}, Y=cbind(0,0))
p |
Parameter vector |
design |
Design function with defines mean, derivative of mean, variance, and derivative of variance with respect to the parameter p |
Y |
column vector where the CDF is evaluated |
Klaus K. Holst
Calculate summary statistics grouped by variable
dby( data, INPUT, ..., ID = NULL, ORDER = NULL, SUBSET = NULL, SORT = 0, COMBINE = !REDUCE, NOCHECK = FALSE, ARGS = NULL, NAMES, COLUMN = FALSE, REDUCE = FALSE, REGEX = mets.options()$regex, ALL = TRUE )dby( data, INPUT, ..., ID = NULL, ORDER = NULL, SUBSET = NULL, SORT = 0, COMBINE = !REDUCE, NOCHECK = FALSE, ARGS = NULL, NAMES, COLUMN = FALSE, REDUCE = FALSE, REGEX = mets.options()$regex, ALL = TRUE )
data |
Data.frame |
INPUT |
Input variables (character or formula) |
... |
functions |
ID |
id variable |
ORDER |
(optional) order variable |
SUBSET |
(optional) subset expression |
SORT |
sort order (id+order variable) |
COMBINE |
If TRUE result is appended to data |
NOCHECK |
No sorting or check for missing data |
ARGS |
Optional list of arguments to functions (...) |
NAMES |
Optional vector of column names |
COLUMN |
If TRUE do the calculations for each column |
REDUCE |
Reduce number of redundant rows |
REGEX |
Allow regular expressions |
ALL |
if FALSE only the subset will be returned |
Calculate summary statistics grouped by
dby2 for column-wise calculations
Klaus K. Holst and Thomas Scheike
n <- 4 k <- c(3,rbinom(n-1,3,0.5)+1) N <- sum(k) d <- data.frame(y=rnorm(N),x=rnorm(N), id=rep(seq(n),k),num=unlist(sapply(k,seq)) ) d2 <- d[sample(nrow(d)),] dby(d2, y~id, mean) dby(d2, y~id + order(num), cumsum) dby(d,y ~ id + order(num), dlag) dby(d,y ~ id + order(num), dlag, ARGS=list(k=1:2)) dby(d,y ~ id + order(num), dlag, ARGS=list(k=1:2), NAMES=c("l1","l2")) dby(d, y~id + order(num), mean=mean, csum=cumsum, n=length) dby(d2, y~id + order(num), a=cumsum, b=mean, N=length, l1=function(x) c(NA,x)[-length(x)] ) dby(d, y~id + order(num), nn=seq_along, n=length) dby(d, y~id + order(num), nn=seq_along, n=length) d <- d[,1:4] dby(d, x<0) <- list(z=mean) d <- dby(d, is.na(z), z=1) f <- function(x) apply(x,1,min) dby(d, y+x~id, min=f) dby(d,y+x~id+order(num), function(x) x) f <- function(x) { cbind(cumsum(x[,1]),cumsum(x[,2]))/sum(x)} dby(d, y+x~id, f) ## column-wise a <- d dby2(a, mean, median, REGEX=TRUE) <- '^[y|x]'~id a ## wildcards dby2(a,'y*'+'x*'~id,mean) ## subset dby(d, x<0) <- list(z=NA) d dby(d, y~id|x>-1, v=mean,z=1) dby(d, y+x~id|x>-1, mean, median, COLUMN=TRUE) dby2(d, y+x~id|x>0, mean, REDUCE=TRUE) dby(d,y~id|x<0,mean,ALL=FALSE) a <- iris a <- dby(a,y=1) dby(a,Species=="versicolor") <- list(y=2)n <- 4 k <- c(3,rbinom(n-1,3,0.5)+1) N <- sum(k) d <- data.frame(y=rnorm(N),x=rnorm(N), id=rep(seq(n),k),num=unlist(sapply(k,seq)) ) d2 <- d[sample(nrow(d)),] dby(d2, y~id, mean) dby(d2, y~id + order(num), cumsum) dby(d,y ~ id + order(num), dlag) dby(d,y ~ id + order(num), dlag, ARGS=list(k=1:2)) dby(d,y ~ id + order(num), dlag, ARGS=list(k=1:2), NAMES=c("l1","l2")) dby(d, y~id + order(num), mean=mean, csum=cumsum, n=length) dby(d2, y~id + order(num), a=cumsum, b=mean, N=length, l1=function(x) c(NA,x)[-length(x)] ) dby(d, y~id + order(num), nn=seq_along, n=length) dby(d, y~id + order(num), nn=seq_along, n=length) d <- d[,1:4] dby(d, x<0) <- list(z=mean) d <- dby(d, is.na(z), z=1) f <- function(x) apply(x,1,min) dby(d, y+x~id, min=f) dby(d,y+x~id+order(num), function(x) x) f <- function(x) { cbind(cumsum(x[,1]),cumsum(x[,2]))/sum(x)} dby(d, y+x~id, f) ## column-wise a <- d dby2(a, mean, median, REGEX=TRUE) <- '^[y|x]'~id a ## wildcards dby2(a,'y*'+'x*'~id,mean) ## subset dby(d, x<0) <- list(z=NA) d dby(d, y~id|x>-1, v=mean,z=1) dby(d, y+x~id|x>-1, mean, median, COLUMN=TRUE) dby2(d, y+x~id|x>0, mean, REDUCE=TRUE) dby(d,y~id|x<0,mean,ALL=FALSE) a <- iris a <- dby(a,y=1) dby(a,Species=="versicolor") <- list(y=2)
summary, tables, and correlations for data frames
dcor(data, y = NULL, x = NULL, use = "pairwise.complete.obs", ...)dcor(data, y = NULL, x = NULL, use = "pairwise.complete.obs", ...)
data |
if x is formula or names for data frame then data frame is needed. |
y |
name of variable, or fomula, or names of variables on data frame. |
x |
possible group variable |
use |
how to handle missing values |
... |
Optional additional arguments |
Klaus K. Holst and Thomas Scheike
data("sTRACE",package="timereg") dt<- sTRACE dt$time2 <- dt$time^2 dt$wmi2 <- dt$wmi^2 head(dt) dcor(dt) dcor(dt,~time+wmi) dcor(dt,~time+wmi,~vf+chf) dcor(dt,time+wmi~vf+chf) dcor(dt,c("time*","wmi*"),~vf+chf)data("sTRACE",package="timereg") dt<- sTRACE dt$time2 <- dt$time^2 dt$wmi2 <- dt$wmi^2 head(dt) dcor(dt) dcor(dt,~time+wmi) dcor(dt,~time+wmi,~vf+chf) dcor(dt,time+wmi~vf+chf) dcor(dt,c("time*","wmi*"),~vf+chf)
Cut variables, if breaks are given these are used, otherwise cuts into using group size given by probs, or equispace groups on range. Default is equally sized groups if possible
dcut( data, y = NULL, x = NULL, breaks = 4, probs = NULL, equi = FALSE, regex = mets.options()$regex, sep = NULL, na.rm = TRUE, labels = NULL, all = FALSE, ... )dcut( data, y = NULL, x = NULL, breaks = 4, probs = NULL, equi = FALSE, regex = mets.options()$regex, sep = NULL, na.rm = TRUE, labels = NULL, all = FALSE, ... )
data |
if x is formula or names for data frame then data frame is needed. |
y |
name of variable, or fomula, or names of variables on data frame. |
x |
name of variable, or fomula, or names of variables on data frame. |
breaks |
number of breaks, for variables or vector of break points, |
probs |
groups defined from quantiles |
equi |
for equi-spaced breaks |
regex |
for regular expressions. |
sep |
seperator for naming of cut names. |
na.rm |
to remove NA for grouping variables. |
labels |
to use for cut groups |
all |
to do all variables, even when breaks are not unique |
... |
Optional additional arguments |
Klaus K. Holst and Thomas Scheike
data("sTRACE",package="timereg") sTRACE$age2 <- sTRACE$age^2 sTRACE$age3 <- sTRACE$age^3 mm <- dcut(sTRACE,~age+wmi) head(mm) mm <- dcut(sTRACE,catage4+wmi4~age+wmi) head(mm) mm <- dcut(sTRACE,~age+wmi,breaks=c(2,4)) head(mm) mm <- dcut(sTRACE,c("age","wmi")) head(mm) mm <- dcut(sTRACE,~.) head(mm) mm <- dcut(sTRACE,c("age","wmi"),breaks=c(2,4)) head(mm) gx <- dcut(sTRACE$age) head(gx) ## Removes all cuts variables with these names wildcards mm1 <- drm(mm,c("*.2","*.4")) head(mm1) ## wildcards, for age, age2, age4 and wmi head(dcut(mm,c("a*","?m*"))) ## with direct asignment drm(mm) <- c("*.2","*.4") head(mm) dcut(mm) <- c("age","*m*") dcut(mm) <- ageg1+wmig1~age+wmi head(mm) ############################ ## renaming ############################ head(mm) drename(mm, ~Age+Wmi) <- c("wmi","age") head(mm) mm1 <- mm ## all names to lower drename(mm1) <- ~. head(mm1) ## A* to lower mm2 <- drename(mm,c("A*","W*")) head(mm2) drename(mm) <- "A*" head(mm) dd <- data.frame(A_1=1:2,B_1=1:2) funn <- function(x) gsub("_",".",x) drename(dd) <- ~. drename(dd,fun=funn) <- ~. names(dd)data("sTRACE",package="timereg") sTRACE$age2 <- sTRACE$age^2 sTRACE$age3 <- sTRACE$age^3 mm <- dcut(sTRACE,~age+wmi) head(mm) mm <- dcut(sTRACE,catage4+wmi4~age+wmi) head(mm) mm <- dcut(sTRACE,~age+wmi,breaks=c(2,4)) head(mm) mm <- dcut(sTRACE,c("age","wmi")) head(mm) mm <- dcut(sTRACE,~.) head(mm) mm <- dcut(sTRACE,c("age","wmi"),breaks=c(2,4)) head(mm) gx <- dcut(sTRACE$age) head(gx) ## Removes all cuts variables with these names wildcards mm1 <- drm(mm,c("*.2","*.4")) head(mm1) ## wildcards, for age, age2, age4 and wmi head(dcut(mm,c("a*","?m*"))) ## with direct asignment drm(mm) <- c("*.2","*.4") head(mm) dcut(mm) <- c("age","*m*") dcut(mm) <- ageg1+wmig1~age+wmi head(mm) ############################ ## renaming ############################ head(mm) drename(mm, ~Age+Wmi) <- c("wmi","age") head(mm) mm1 <- mm ## all names to lower drename(mm1) <- ~. head(mm1) ## A* to lower mm2 <- drename(mm,c("A*","W*")) head(mm2) drename(mm) <- "A*" head(mm) dd <- data.frame(A_1=1:2,B_1=1:2) funn <- function(x) gsub("_",".",x) drename(dd) <- ~. drename(dd,fun=funn) <- ~. names(dd)
Data on dermal ridge counts in left and right hand in (nuclear) families
Data on 50 families with ridge counts in left and right hand for moter, father and each child. Family id in 'family' and gender and child number in 'sex' and 'child'.
Sarah B. Holt (1952). Genetics of dermal ridges: bilateral asymmetry in finger ridge-counts. Annals of Eugenics 17 (1), pp.211–231. DOI: 10.1111/j.1469-1809.1952.tb02513.x
data(dermalridges) fast.reshape(dermalridges,id="family",varying=c("child.left","child.right","sex"))data(dermalridges) fast.reshape(dermalridges,id="family",varying=c("child.left","child.right","sex"))
Data on dermal ridge counts in left and right hand in (nuclear) families
Data on dermal ridge counts (left and right hand) in 18 monozygotic twin pairs.
Sarah B. Holt (1952). Genetics of dermal ridges: bilateral asymmetry in finger ridge-counts. Annals of Eugenics 17 (1), pp.211–231. DOI: 10.1111/j.1469-1809.1952.tb02513.x
data(dermalridgesMZ) fast.reshape(dermalridgesMZ,id="id",varying=c("left","right"))data(dermalridgesMZ) fast.reshape(dermalridgesMZ,id="id",varying=c("left","right"))
The data was colleceted to test a laser treatment for delaying blindness in patients with dibetic retinopathy. The subset of 197 patiens given in Huster et al. (1989) is used.
This data frame contains the following columns:
a numeric vector. Patient code.
a numeric vector. Age of patient at diagnosis.
a numeric vector. Survival time: time to blindness or censoring.
a numeric vector code. Survival status. 1: blindness, 0: censored.
a numeric vector code. Random eye selected for treatment. 1: left eye 2: right eye.
a numeric vector. 1: treatment 0: untreated.
a numeric vector code. 1: younger than 20, 2: older than 20.
Huster W.J. and Brookmeyer, R. and Self. S. (1989) MOdelling paired survival data with covariates, Biometrics 45, 145-56.
data(diabetes) names(diabetes)data(diabetes) names(diabetes)
Split a data set and run function
divide_conquer(func = NULL, data, size, splits, id = NULL, ...)divide_conquer(func = NULL, data, size, splits, id = NULL, ...)
func |
called function |
data |
data-frame |
size |
size of splits |
splits |
number of splits (ignored if size is given) |
id |
optional cluster variable |
... |
Additional arguments to lower level functions |
Thomas Scheike, Klaus K. Holst
## avoid dependency on timereg ## library(timereg) ## data(TRACE) ## res <- divide_conquer(prop.odds,TRACE, ## formula=Event(time,status==9)~chf+vf+age,n.sim=0,size=200)## avoid dependency on timereg ## library(timereg) ## data(TRACE) ## res <- divide_conquer(prop.odds,TRACE, ## formula=Event(time,status==9)~chf+vf+age,n.sim=0,size=200)
Lag operator
dlag(data, x, k = 1, combine = TRUE, simplify = TRUE, names, ...)dlag(data, x, k = 1, combine = TRUE, simplify = TRUE, names, ...)
data |
data.frame or vector |
x |
optional column names or formula |
k |
lag (vector of integers) |
combine |
combine results with original data.frame |
simplify |
Return vector if possible |
names |
optional new column names |
... |
additional arguments to lower level functions |
d <- data.frame(y=1:10,x=c(10:1)) dlag(d,k=1:2) dlag(d,~x,k=0:1) dlag(d$x,k=1) dlag(d$x,k=-1:2, names=letters[1:4])d <- data.frame(y=1:10,x=c(10:1)) dlag(d,k=1:2) dlag(d,~x,k=0:1) dlag(d$x,k=1) dlag(d$x,k=-1:2, names=letters[1:4])
listing for data frames
dprint(data, y = NULL, n = 0, ..., x = NULL)dprint(data, y = NULL, n = 0, ..., x = NULL)
data |
if x is formula or names for data frame then data frame is needed. |
y |
name of variable, or fomula, or names of variables on data frame. |
n |
Index of observations to print (default c(1:nfirst, n-nlast:nlast) |
... |
Optional additional arguments (nfirst,nlast, and print options) |
x |
possible group variable |
Klaus K. Holst and Thomas Scheike
m <- lava::lvm(letters) d <- lava::sim(m, 20) dlist(d,~a+b+c) dlist(d,~a+b+c|a<0 & b>0) ## listing all : dlist(d,~a+b+c|a<0 & b>0,n=0) dlist(d,a+b+c~I(d>0)|a<0 & b>0) dlist(d,.~I(d>0)|a<0 & b>0) dlist(d,~a+b+c|a<0 & b>0, nlast=0) dlist(d,~a+b+c|a<0 & b>0, nfirst=3, nlast=3) dlist(d,~a+b+c|a<0 & b>0, 1:5) dlist(d,~a+b+c|a<0 & b>0, -(5:1)) dlist(d,~a+b+c|a<0 & b>0, list(1:5,50:55,-(5:1))) dprint(d,a+b+c ~ I(d>0) |a<0 & b>0, list(1:5,50:55,-(5:1)))m <- lava::lvm(letters) d <- lava::sim(m, 20) dlist(d,~a+b+c) dlist(d,~a+b+c|a<0 & b>0) ## listing all : dlist(d,~a+b+c|a<0 & b>0,n=0) dlist(d,a+b+c~I(d>0)|a<0 & b>0) dlist(d,.~I(d>0)|a<0 & b>0) dlist(d,~a+b+c|a<0 & b>0, nlast=0) dlist(d,~a+b+c|a<0 & b>0, nfirst=3, nlast=3) dlist(d,~a+b+c|a<0 & b>0, 1:5) dlist(d,~a+b+c|a<0 & b>0, -(5:1)) dlist(d,~a+b+c|a<0 & b>0, list(1:5,50:55,-(5:1))) dprint(d,a+b+c ~ I(d>0) |a<0 & b>0, list(1:5,50:55,-(5:1)))
Regression for data frames with dutility call
dreg( data, y, x = NULL, z = NULL, x.oneatatime = TRUE, x.base.names = NULL, z.arg = c("clever", "base", "group", "condition"), fun. = lm, summary. = summary, regex = FALSE, convert = NULL, doSummary = TRUE, special = NULL, equal = TRUE, test = 1, ... )dreg( data, y, x = NULL, z = NULL, x.oneatatime = TRUE, x.base.names = NULL, z.arg = c("clever", "base", "group", "condition"), fun. = lm, summary. = summary, regex = FALSE, convert = NULL, doSummary = TRUE, special = NULL, equal = TRUE, test = 1, ... )
data |
data frame |
y |
name of variable, or fomula, or names of variables on data frame. |
x |
name of variable, or fomula, or names of variables on data frame. |
z |
name of variable, or fomula, or names of variables on data frame. |
x.oneatatime |
x's one at a time |
x.base.names |
base covarirates |
z.arg |
what is Z, c("clever","base","group","condition"), clever decides based on type of Z, base means that Z is used as fixed baseline covaraites for all X, group means the analyses is done based on groups of Z, and condition means that Z specifies a condition on the data |
fun. |
function lm is default |
summary. |
summary to use |
regex |
regex |
convert |
convert |
doSummary |
doSummary or not |
special |
special's |
equal |
to do pairwise stuff |
test |
development argument |
... |
Additional arguments for fun |
Klaus K. Holst, Thomas Scheike
##' data(iris) dat <- iris drename(dat) <- ~. names(dat) set.seed(1) dat$time <- runif(nrow(dat)) dat$time1 <- runif(nrow(dat)) dat$status <- rbinom(nrow(dat),1,0.5) dat$S1 <- with(dat, Surv(time,status)) dat$S2 <- with(dat, Surv(time1,status)) dat$id <- 1:nrow(dat) mm <- dreg(dat, "*.length"~"*.width"|I(species=="setosa" & status==1)) mm <- dreg(dat, "*.length"~"*.width"|species+status) mm <- dreg(dat, "*.length"~"*.width"|species) mm <- dreg(dat, "*.length"~"*.width"|species+status,z.arg="group") ## Reduce Ex.Timings y <- "S*"~"*.width" xs <- dreg(dat, y, fun.=phreg) ## xs <- dreg(dat, y, fun.=survdiff) y <- "S*"~"*.width" xs <- dreg(dat, y, x.oneatatime=FALSE, fun.=phreg) ## under condition y <- S1~"*.width"|I(species=="setosa" & sepal.width>3) xs <- dreg(dat, y, z.arg="condition", fun.=phreg) xs <- dreg(dat, y, fun.=phreg) ## under condition y <- S1~"*.width"|species=="setosa" xs <- dreg(dat, y, z.arg="condition", fun.=phreg) xs <- dreg(dat, y, fun.=phreg) ## with baseline after | y <- S1~"*.width"|sepal.length xs <- dreg(dat, y, fun.=phreg) ## by group by species, not working y <- S1~"*.width"|species ss <- split(dat, paste(dat$species, dat$status)) xs <- dreg(dat, y, fun.=phreg) ## species as base, species is factor so assumes that this is grouping y <- S1~"*.width"|species xs <- dreg(dat, y, z.arg="base", fun.=phreg) ## background var after | and then one of x's at at time y <- S1~"*.width"|status+"sepal*" xs <- dreg(dat, y, fun.=phreg) ## background var after | and then one of x's at at time ##y <- S1~"*.width"|status+"sepal*" ##xs <- dreg(dat, y, x.oneatatime=FALSE, fun.=phreg) ##xs <- dreg(dat, y, fun.=phreg) ## background var after | and then one of x's at at time ##y <- S1~"*.width"+factor(species) ##xs <- dreg(dat, y, fun.=phreg) ##xs <- dreg(dat, y, fun.=phreg, x.oneatatime=FALSE) y <- S1~"*.width"|factor(species) xs <- dreg(dat, y, z.arg="base", fun.=phreg) y <- S1~"*.width"|cluster(id)+factor(species) xs <- dreg(dat, y, z.arg="base", fun.=phreg) xs <- dreg(dat, y, z.arg="base", fun.=survival::coxph) ## under condition with groups y <- S1~"*.width"|I(sepal.length>4) xs <- dreg(subset(dat, species=="setosa"), y,z.arg="group",fun.=phreg) ## under condition with groups y <- S1~"*.width"+I(log(sepal.length))|I(sepal.length>4) xs <- dreg(subset(dat, species=="setosa"), y,z.arg="group",fun.=phreg) y <- S1~"*.width"+I(dcut(sepal.length))|I(sepal.length>4) xs <- dreg(subset(dat,species=="setosa"), y,z.arg="group",fun.=phreg) ff <- function(formula,data,...) { ss <- survfit(formula,data,...) kmplot(ss,...) return(ss) } if (interactive()) { dcut(dat) <- ~"*.width" y <- S1~"*.4"|I(sepal.length>4) par(mfrow=c(1, 2)) xs <- dreg(dat, y, fun.=ff) }##' data(iris) dat <- iris drename(dat) <- ~. names(dat) set.seed(1) dat$time <- runif(nrow(dat)) dat$time1 <- runif(nrow(dat)) dat$status <- rbinom(nrow(dat),1,0.5) dat$S1 <- with(dat, Surv(time,status)) dat$S2 <- with(dat, Surv(time1,status)) dat$id <- 1:nrow(dat) mm <- dreg(dat, "*.length"~"*.width"|I(species=="setosa" & status==1)) mm <- dreg(dat, "*.length"~"*.width"|species+status) mm <- dreg(dat, "*.length"~"*.width"|species) mm <- dreg(dat, "*.length"~"*.width"|species+status,z.arg="group") ## Reduce Ex.Timings y <- "S*"~"*.width" xs <- dreg(dat, y, fun.=phreg) ## xs <- dreg(dat, y, fun.=survdiff) y <- "S*"~"*.width" xs <- dreg(dat, y, x.oneatatime=FALSE, fun.=phreg) ## under condition y <- S1~"*.width"|I(species=="setosa" & sepal.width>3) xs <- dreg(dat, y, z.arg="condition", fun.=phreg) xs <- dreg(dat, y, fun.=phreg) ## under condition y <- S1~"*.width"|species=="setosa" xs <- dreg(dat, y, z.arg="condition", fun.=phreg) xs <- dreg(dat, y, fun.=phreg) ## with baseline after | y <- S1~"*.width"|sepal.length xs <- dreg(dat, y, fun.=phreg) ## by group by species, not working y <- S1~"*.width"|species ss <- split(dat, paste(dat$species, dat$status)) xs <- dreg(dat, y, fun.=phreg) ## species as base, species is factor so assumes that this is grouping y <- S1~"*.width"|species xs <- dreg(dat, y, z.arg="base", fun.=phreg) ## background var after | and then one of x's at at time y <- S1~"*.width"|status+"sepal*" xs <- dreg(dat, y, fun.=phreg) ## background var after | and then one of x's at at time ##y <- S1~"*.width"|status+"sepal*" ##xs <- dreg(dat, y, x.oneatatime=FALSE, fun.=phreg) ##xs <- dreg(dat, y, fun.=phreg) ## background var after | and then one of x's at at time ##y <- S1~"*.width"+factor(species) ##xs <- dreg(dat, y, fun.=phreg) ##xs <- dreg(dat, y, fun.=phreg, x.oneatatime=FALSE) y <- S1~"*.width"|factor(species) xs <- dreg(dat, y, z.arg="base", fun.=phreg) y <- S1~"*.width"|cluster(id)+factor(species) xs <- dreg(dat, y, z.arg="base", fun.=phreg) xs <- dreg(dat, y, z.arg="base", fun.=survival::coxph) ## under condition with groups y <- S1~"*.width"|I(sepal.length>4) xs <- dreg(subset(dat, species=="setosa"), y,z.arg="group",fun.=phreg) ## under condition with groups y <- S1~"*.width"+I(log(sepal.length))|I(sepal.length>4) xs <- dreg(subset(dat, species=="setosa"), y,z.arg="group",fun.=phreg) y <- S1~"*.width"+I(dcut(sepal.length))|I(sepal.length>4) xs <- dreg(subset(dat,species=="setosa"), y,z.arg="group",fun.=phreg) ff <- function(formula,data,...) { ss <- survfit(formula,data,...) kmplot(ss,...) return(ss) } if (interactive()) { dcut(dat) <- ~"*.width" y <- S1~"*.4"|I(sepal.length>4) par(mfrow=c(1, 2)) xs <- dreg(dat, y, fun.=ff) }
levels shows levels for variables in data frame, relevel relevels a factor in data.frame
drelevel( data, y = NULL, x = NULL, ref = NULL, newlevels = NULL, regex = mets.options()$regex, sep = NULL, overwrite = FALSE, ... )drelevel( data, y = NULL, x = NULL, ref = NULL, newlevels = NULL, regex = mets.options()$regex, sep = NULL, overwrite = FALSE, ... )
data |
if x is formula or names for data frame then data frame is needed. |
y |
name of variable, or fomula, or names of variables on data frame. |
x |
name of variable, or fomula, or names of variables on data frame. |
ref |
new reference variable |
newlevels |
to combine levels of factor in data frame |
regex |
for regular expressions. |
sep |
seperator for naming of cut names. |
overwrite |
to overwrite variable |
... |
Optional additional arguments |
Klaus K. Holst and Thomas Scheike
data(mena) dstr(mena) dfactor(mena) <- ~twinnum dnumeric(mena) <- ~twinnum.f dstr(mena) mena2 <- drelevel(mena,"cohort",ref="(1980,1982]") mena2 <- drelevel(mena,~cohort,ref="(1980,1982]") mena2 <- drelevel(mena,cohortII~cohort,ref="(1980,1982]") dlevels(mena) dlevels(mena2) drelevel(mena,ref="(1975,1977]") <- ~cohort drelevel(mena,ref="(1980,1982]") <- ~cohort dlevels(mena,"coh*") dtable(mena,"coh*",level=1) ### level 1 of zyg as baseline for new variable drelevel(mena,ref=1) <- ~zyg drelevel(mena,ref=c("DZ","[1973,1975]")) <- ~ zyg+cohort drelevel(mena,ref=c("DZ","[1973,1975]")) <- zygdz+cohort.early~ zyg+cohort ### level 2 of zyg and cohort as baseline for new variables drelevel(mena,ref=2) <- ~ zyg+cohort dlevels(mena) ##################### combining factor levels with newlevels argument dcut(mena,labels=c("I","II","III","IV")) <- cat4~agemena dlevels(drelevel(mena,~cat4,newlevels=1:3)) dlevels(drelevel(mena,ncat4~cat4,newlevels=3:2)) drelevel(mena,newlevels=3:2) <- ncat4~cat4 dlevels(mena) dlevels(drelevel(mena,nca4~cat4,newlevels=list(c(1,4),2:3))) drelevel(mena,newlevels=list(c(1,4),2:3)) <- nca4..2 ~ cat4 dlevels(mena) drelevel(mena,newlevels=list("I-III"=c("I","II","III"),"IV"="IV")) <- nca4..3 ~ cat4 dlevels(mena) drelevel(mena,newlevels=list("I-III"=c("I","II","III"))) <- nca4..4 ~ cat4 dlevels(mena) drelevel(mena,newlevels=list(group1=c("I","II","III"))) <- nca4..5 ~ cat4 dlevels(mena) drelevel(mena,newlevels=list(g1=c("I","II","III"),g2="IV")) <- nca4..6 ~ cat4 dlevels(mena)data(mena) dstr(mena) dfactor(mena) <- ~twinnum dnumeric(mena) <- ~twinnum.f dstr(mena) mena2 <- drelevel(mena,"cohort",ref="(1980,1982]") mena2 <- drelevel(mena,~cohort,ref="(1980,1982]") mena2 <- drelevel(mena,cohortII~cohort,ref="(1980,1982]") dlevels(mena) dlevels(mena2) drelevel(mena,ref="(1975,1977]") <- ~cohort drelevel(mena,ref="(1980,1982]") <- ~cohort dlevels(mena,"coh*") dtable(mena,"coh*",level=1) ### level 1 of zyg as baseline for new variable drelevel(mena,ref=1) <- ~zyg drelevel(mena,ref=c("DZ","[1973,1975]")) <- ~ zyg+cohort drelevel(mena,ref=c("DZ","[1973,1975]")) <- zygdz+cohort.early~ zyg+cohort ### level 2 of zyg and cohort as baseline for new variables drelevel(mena,ref=2) <- ~ zyg+cohort dlevels(mena) ##################### combining factor levels with newlevels argument dcut(mena,labels=c("I","II","III","IV")) <- cat4~agemena dlevels(drelevel(mena,~cat4,newlevels=1:3)) dlevels(drelevel(mena,ncat4~cat4,newlevels=3:2)) drelevel(mena,newlevels=3:2) <- ncat4~cat4 dlevels(mena) dlevels(drelevel(mena,nca4~cat4,newlevels=list(c(1,4),2:3))) drelevel(mena,newlevels=list(c(1,4),2:3)) <- nca4..2 ~ cat4 dlevels(mena) drelevel(mena,newlevels=list("I-III"=c("I","II","III"),"IV"="IV")) <- nca4..3 ~ cat4 dlevels(mena) drelevel(mena,newlevels=list("I-III"=c("I","II","III"))) <- nca4..4 ~ cat4 dlevels(mena) drelevel(mena,newlevels=list(group1=c("I","II","III"))) <- nca4..5 ~ cat4 dlevels(mena) drelevel(mena,newlevels=list(g1=c("I","II","III"),g2="IV")) <- nca4..6 ~ cat4 dlevels(mena)
Removes terms such as strata() or cluster() from a
formula/terms object.
drop.specials(x, components, ...)drop.specials(x, components, ...)
x |
a formula object. |
components |
character vector of special term names to remove. |
... |
additional arguments passed to |
The updated formula with an attribute "variables" containing
the extracted variable names from removed specials.
Sort data according to columns in data frame
dsort(data, x, ..., decreasing = FALSE, return.order = FALSE)dsort(data, x, ..., decreasing = FALSE, return.order = FALSE)
data |
Data frame |
x |
variable to order by |
... |
additional variables to order by |
decreasing |
sort order (vector of length x) |
return.order |
return order |
data.frame
data(data="hubble",package="lava") dsort(hubble, "sigma") dsort(hubble, hubble$sigma,"v") dsort(hubble,~sigma+v) dsort(hubble,~sigma-v) ## with direct asignment dsort(hubble) <- ~sigma-vdata(data="hubble",package="lava") dsort(hubble, "sigma") dsort(hubble, hubble$sigma,"v") dsort(hubble,~sigma+v) dsort(hubble,~sigma-v) ## with direct asignment dsort(hubble) <- ~sigma-v
Constructs simple linear spline on a data frame using the formula syntax of dutils that is adds (x-cuti)* (x>cuti) to the data-set for each knot of the spline. The full spline is thus given by x and spline variables added to the data-set.
dspline( data, y = NULL, x = NULL, breaks = 4, probs = NULL, equi = FALSE, regex = mets.options()$regex, sep = NULL, na.rm = TRUE, labels = NULL, all = FALSE, ... )dspline( data, y = NULL, x = NULL, breaks = 4, probs = NULL, equi = FALSE, regex = mets.options()$regex, sep = NULL, na.rm = TRUE, labels = NULL, all = FALSE, ... )
data |
if x is formula or names for data frame then data frame is needed. |
y |
name of variable, or fomula, or names of variables on data frame. |
x |
name of variable, or fomula, or names of variables on data frame. |
breaks |
number of breaks, for variables or vector of break points, |
probs |
groups defined from quantiles |
equi |
for equi-spaced breaks |
regex |
for regular expressions. |
sep |
seperator for naming of cut names. |
na.rm |
to remove NA for grouping variables. |
labels |
to use for cut groups |
all |
to do all variables, even when breaks are not unique |
... |
Optional additional arguments |
Thomas Scheike
data(TRACE) TRACE <- dspline(TRACE,~wmi,breaks=c(1,1.3,1.7)) cca <- survival::coxph(Surv(time,status==9)~age+vf+chf+wmi,data=TRACE) cca2 <- survival::coxph(Surv(time,status==9)~age+wmi+vf+chf+ wmi.spline1+wmi.spline2+wmi.spline3,data=TRACE) anova(cca,cca2) nd <- data.frame(age=50,vf=0,chf=0,wmi=seq(0.4,3,by=0.01)) nd <- dspline(nd,~wmi,breaks=c(1,1.3,1.7)) pl <- predict(cca2,newdata=nd) plot(nd$wmi,pl,type="l")data(TRACE) TRACE <- dspline(TRACE,~wmi,breaks=c(1,1.3,1.7)) cca <- survival::coxph(Surv(time,status==9)~age+vf+chf+wmi,data=TRACE) cca2 <- survival::coxph(Surv(time,status==9)~age+wmi+vf+chf+ wmi.spline1+wmi.spline2+wmi.spline3,data=TRACE) anova(cca,cca2) nd <- data.frame(age=50,vf=0,chf=0,wmi=seq(0.4,3,by=0.01)) nd <- dspline(nd,~wmi,breaks=c(1,1.3,1.7)) pl <- predict(cca2,newdata=nd) plot(nd$wmi,pl,type="l")
tables for data frames
dtable( data, y = NULL, x = NULL, ..., level = -1, response = NULL, flat = TRUE, total = FALSE, prop = FALSE, summary = NULL )dtable( data, y = NULL, x = NULL, ..., level = -1, response = NULL, flat = TRUE, total = FALSE, prop = FALSE, summary = NULL )
data |
if x is formula or names for data frame then data frame is needed. |
y |
name of variable, or fomula, or names of variables on data frame. |
x |
name of variable, or fomula, or names of variables on data frame. |
... |
Optional additional arguments |
level |
1 for all marginal tables, 2 for all 2 by 2 tables, and null for the full table, possible versus group variable |
response |
For level=2, only produce tables with columns given by 'response' (index) |
flat |
produce flat tables |
total |
add total counts/proportions |
prop |
Proportions instead of counts (vector of margins) |
summary |
summary function |
Klaus K. Holst and Thomas Scheike
data("sTRACE",package="timereg") dtable(sTRACE,~status) dtable(sTRACE,~status+vf) dtable(sTRACE,~status+vf,level=1) dtable(sTRACE,~status+vf,~chf+diabetes) dtable(sTRACE,c("*f*","status"),~diabetes) dtable(sTRACE,c("*f*","status"),~diabetes, level=2) dtable(sTRACE,c("*f*","status"),level=1) dtable(sTRACE,~"*f*"+status,level=1) dtable(sTRACE,~"*f*"+status+I(wmi>1.4)|age>60,level=2) dtable(sTRACE,"*f*"+status~I(wmi>0.5)|age>60,level=1) dtable(sTRACE,status~dcut(age)) dtable(sTRACE,~status+vf+sex|age>60) dtable(sTRACE,status+vf+sex~+1|age>60, level=2) dtable(sTRACE,.~status+vf+sex|age>60,level=1) dtable(sTRACE,status+vf+sex~diabetes|age>60) dtable(sTRACE,status+vf+sex~diabetes|age>60, flat=FALSE) dtable(sTRACE,status+vf+sex~diabetes|age>60, level=1) dtable(sTRACE,status+vf+sex~diabetes|age>60, level=2) dtable(sTRACE,status+vf+sex~diabetes|age>60, level=2, prop=1, total=TRUE) dtable(sTRACE,status+vf+sex~diabetes|age>60, level=2, prop=2, total=TRUE) dtable(sTRACE,status+vf+sex~diabetes|age>60, level=2, prop=1:2, summary=summary)data("sTRACE",package="timereg") dtable(sTRACE,~status) dtable(sTRACE,~status+vf) dtable(sTRACE,~status+vf,level=1) dtable(sTRACE,~status+vf,~chf+diabetes) dtable(sTRACE,c("*f*","status"),~diabetes) dtable(sTRACE,c("*f*","status"),~diabetes, level=2) dtable(sTRACE,c("*f*","status"),level=1) dtable(sTRACE,~"*f*"+status,level=1) dtable(sTRACE,~"*f*"+status+I(wmi>1.4)|age>60,level=2) dtable(sTRACE,"*f*"+status~I(wmi>0.5)|age>60,level=1) dtable(sTRACE,status~dcut(age)) dtable(sTRACE,~status+vf+sex|age>60) dtable(sTRACE,status+vf+sex~+1|age>60, level=2) dtable(sTRACE,.~status+vf+sex|age>60,level=1) dtable(sTRACE,status+vf+sex~diabetes|age>60) dtable(sTRACE,status+vf+sex~diabetes|age>60, flat=FALSE) dtable(sTRACE,status+vf+sex~diabetes|age>60, level=1) dtable(sTRACE,status+vf+sex~diabetes|age>60, level=2) dtable(sTRACE,status+vf+sex~diabetes|age>60, level=2, prop=1, total=TRUE) dtable(sTRACE,status+vf+sex~diabetes|age>60, level=2, prop=2, total=TRUE) dtable(sTRACE,status+vf+sex~diabetes|age>60, level=2, prop=1:2, summary=summary)
Defines new variables under condition for data frame
dtransform(data, ...)dtransform(data, ...)
data |
is data frame |
... |
new variable definitions including possible if condition |
data(mena) xx <- dtransform(mena,ll=log(agemena)+twinnum) xx <- dtransform(mena,ll=log(agemena)+twinnum,agemena<15) xx <- dtransform(xx ,ll=100+agemena,ll2=1000,agemena>15) dsummary(xx,ll+ll2~I(agemena>15))data(mena) xx <- dtransform(mena,ll=log(agemena)+twinnum) xx <- dtransform(mena,ll=log(agemena)+twinnum,agemena<15) xx <- dtransform(xx ,ll=100+agemena,ll2=1000,agemena>15) dsummary(xx,ll+ll2~I(agemena>15))
Constructur for Event History objects
Event(time, time2 = TRUE, cause = NULL, cens.code = 0, ...)Event(time, time2 = TRUE, cause = NULL, cens.code = 0, ...)
time |
Time |
time2 |
Time 2 |
cause |
Cause |
cens.code |
Censoring code (default 0) |
... |
Additional arguments |
... content for details
Object of class Event (a matrix)
Klaus K. Holst and Thomas Scheike
t1 <- 1:10 t2 <- t1+runif(10) ca <- rbinom(10,2,0.4) (x <- Event(t1,t2,ca))t1 <- 1:10 t2 <- t1+runif(10) ca <- rbinom(10,2,0.4) (x <- Event(t1,t2,ca))
Constructs start stop formulation of event time data after a variable in the data.set. Similar to SurvSplit of the survival package but can also split after random time given in data frame.
event_split( data, time = "time", status = "status", cuts = "cuts", name.id = "id", name.start = "start", cens.code = 0, order.id = TRUE, time.group = FALSE )event_split( data, time = "time", status = "status", cuts = "cuts", name.id = "id", name.start = "start", cens.code = 0, order.id = TRUE, time.group = FALSE )
data |
data to be split |
time |
time variable. |
status |
status variable. |
cuts |
cuts variable or numeric cut (only one value) |
name.id |
name of id variable. |
name.start |
name of start variable in data, start can also be numeric "0" |
cens.code |
code for the censoring. |
order.id |
order data after id and start. |
time.group |
make variable "before"."cut" that keeps track of wether start,stop is before (1) or after cut (0). |
Thomas Scheike
set.seed(1) d <- data.frame(event=round(5*runif(5),2),start=1:5,time=2*1:5, status=rbinom(5,1,0.5),x=1:5) d d0 <- event_split(d,cuts="event",name.start=0) d0 dd <- event_split(d,cuts="event") dd ddd <- event_split(dd,cuts=3.5) ddd event_split(ddd,cuts=5.5) ### successive cutting for many values dd <- d for (cuts in seq(2,3,by=0.3)) dd <- event_split(dd,cuts=cuts) dd ########################################################################### ### same but for situation with multiple events along the time-axis ########################################################################### d <- data.frame(event1=1:5+runif(5)*0.5,start=1:5,time=2*1:5, status=rbinom(5,1,0.5),x=1:5,start0=0) d$event2 <- d$event1+0.2 d$event2[4:5] <- NA d d0 <- event_split(d,cuts="event1",name.start="start",time="time",status="status") d0 ### d00 <- event_split(d0,cuts="event2",name.start="start",time="time",status="status") d00set.seed(1) d <- data.frame(event=round(5*runif(5),2),start=1:5,time=2*1:5, status=rbinom(5,1,0.5),x=1:5) d d0 <- event_split(d,cuts="event",name.start=0) d0 dd <- event_split(d,cuts="event") dd ddd <- event_split(dd,cuts=3.5) ddd event_split(ddd,cuts=5.5) ### successive cutting for many values dd <- d for (cuts in seq(2,3,by=0.3)) dd <- event_split(dd,cuts=cuts) dd ########################################################################### ### same but for situation with multiple events along the time-axis ########################################################################### d <- data.frame(event1=1:5+runif(5)*0.5,start=1:5,time=2*1:5, status=rbinom(5,1,0.5),x=1:5,start0=0) d$event2 <- d$event1+0.2 d$event2[4:5] <- NA d d0 <- event_split(d,cuts="event1",name.start="start",time="time",status="status") d0 ### d00 <- event_split(d0,cuts="event2",name.start="start",time="time",status="status") d00
Cuts time for two time-scales, as event.split
event_split2( data, time = "time", status = "status", entry = "start", cuts = "cuts", name.id = "id", gaptime = NULL, gaptime.entry = NULL, cuttime = c("time", "gaptime"), cens.code = 0, order.id = TRUE )event_split2( data, time = "time", status = "status", entry = "start", cuts = "cuts", name.id = "id", gaptime = NULL, gaptime.entry = NULL, cuttime = c("time", "gaptime"), cens.code = 0, order.id = TRUE )
data |
data to be split |
time |
time variable. |
status |
status variable. |
entry |
name of entry variable. |
cuts |
cuts variable or numeric cut (only one value) |
name.id |
name of id variable. |
gaptime |
gaptime variable. |
gaptime.entry |
name of entry variable for gaptime. |
cuttime |
to cut after time or gaptime |
cens.code |
code for the censoring. |
order.id |
order data after id and start. |
Thomas Scheike
rr <- data.frame(time=c(500,1000),start=c(0,500),status=c(1,1),id=c(1,1)) rr$gaptime <- rr$time-rr$start rr$gapstart <- 0 rr1 <- event_split2(rr,cuts=600,cuttime="time", gaptime="gaptime",gaptime.entry="gapstart") rr2 <- event_split2(rr1,cuts=100,cuttime="gaptime",gaptime="gaptime",gaptime.entry="gapstart") dlist(rr1,start-time+status+gapstart+gaptime~id) dlist(rr2,start-time+status+gapstart+gaptime~id)rr <- data.frame(time=c(500,1000),start=c(0,500),status=c(1,1),id=c(1,1)) rr$gaptime <- rr$time-rr$start rr$gapstart <- 0 rr1 <- event_split2(rr,cuts=600,cuttime="time", gaptime="gaptime",gaptime.entry="gapstart") rr2 <- event_split2(rr1,cuts=100,cuttime="gaptime",gaptime="gaptime",gaptime.entry="gapstart") dlist(rr1,start-time+status+gapstart+gaptime~id) dlist(rr2,start-time+status+gapstart+gaptime~id)
Summary for survival analyses via the 'lifetable' function
eventpois( object, ..., timevar, time, int.len, confint = FALSE, level = 0.95, individual = FALSE, length.out = 25 )eventpois( object, ..., timevar, time, int.len, confint = FALSE, level = 0.95, individual = FALSE, length.out = 25 )
object |
glm object (poisson regression) |
... |
Contrast arguments |
timevar |
Name of time variable |
time |
Time points (optional) |
int.len |
Time interval length (optional) |
confint |
If TRUE confidence limits are supplied |
level |
Level of confidence limits |
individual |
Individual predictions |
length.out |
Length of time vector |
Summary for survival analyses via the 'lifetable' function
Klaus K. Holst
Extends a collection of cumulative hazard functions so that they all cover the same time range. Cumulative hazards that end before the maximum observed time are extrapolated linearly using either a user-supplied rate or the average hazard rate estimated from the existing cumulative hazard.
extendCums(cumA, cumB, extend = NULL)extendCums(cumA, cumB, extend = NULL)
cumA |
a cumulative hazard matrix (two columns: time, cumulative hazard) or a list of such matrices. |
cumB |
an optional second cumulative hazard matrix. If provided it is
combined with |
extend |
numeric vector of hazard rates used for extrapolation, one per
cumulative hazard. If |
A list of cumulative hazard matrices extended to the common maximum time.
rchaz, rcrisk, mets-simulation
Finds all pairs within a cluster (family)
familycluster_index(clusters, index.type = FALSE, num = NULL, Rindex = 1)familycluster_index(clusters, index.type = FALSE, num = NULL, Rindex = 1)
clusters |
list of indeces |
index.type |
argument of cluster index |
num |
num |
Rindex |
index starts with 1 in R, and 0 in C |
Klaus Holst, Thomas Scheike
Cluster indeces
cluster_index familyclusterWithProbands_index
i<-c(1,1,2,2,1,3) d<- familycluster_index(i) print(d)i<-c(1,1,2,2,1,3) d<- familycluster_index(i) print(d)
second column of pairs are the probands and the first column the related subjects
familyclusterWithProbands_index( clusters, probands, index.type = FALSE, num = NULL, Rindex = 1 )familyclusterWithProbands_index( clusters, probands, index.type = FALSE, num = NULL, Rindex = 1 )
clusters |
list of indeces giving the clusters (families) |
probands |
list of 0,1 where 1 specifices which of the subjects that are probands |
index.type |
argument passed to other functions |
num |
argument passed to other functions |
Rindex |
index starts with 1, in C is it is 0 |
Klaus Holst, Thomas Scheike
Cluster indeces
familycluster_index cluster_index
i<-c(1,1,2,2,1,3) p<-c(1,0,0,1,0,1) d<- familyclusterWithProbands_index(i,p) print(d)i<-c(1,1,2,2,1,3) p<-c(1,0,0,1,0,1) d<- familyclusterWithProbands_index(i,p) print(d)
Fast approximation
fast.approx( time, new.time, equal = FALSE, type = c("nearest", "right", "left"), sorted = FALSE, ... )fast.approx( time, new.time, equal = FALSE, type = c("nearest", "right", "left"), sorted = FALSE, ... )
time |
Original ordered time points |
new.time |
New time points |
equal |
If TRUE a list is returned with additional element |
type |
Type of matching, nearest index, nearest greater than or equal (right), number of elements smaller than y otherwise the closest value above new.time is returned. |
sorted |
Set to true if new.time is already sorted |
... |
Optional additional arguments |
Klaus K. Holst
id <- c(1,1,2,2,7,7,10,10) fast.approx(unique(id),id) t <- 0:6 n <- c(-1,0,0.1,0.9,1,1.1,1.2,6,6.5) fast.approx(t,n,type="left")id <- c(1,1,2,2,7,7,10,10) fast.approx(unique(id),id) t <- 0:6 n <- c(-1,0,0.1,0.9,1,1.1,1.2,6,6.5) fast.approx(t,n,type="left")
Converts a cluster variable to consecutive numeric indices using compiled code.
fast.cluster(x, ...)fast.cluster(x, ...)
x |
integer vector of cluster identifiers. |
... |
additional arguments (not used). |
Integer vector of 0-based cluster indices.
Fast pattern
fast.pattern(x, y, categories = 2, ...)fast.pattern(x, y, categories = 2, ...)
x |
Matrix (binary) of patterns. Optionally if |
y |
Optional matrix argument with same dimensions as |
categories |
Default 2 (binary) |
... |
Optional additional arguments |
Klaus K. Holst
X <- matrix(rbinom(100,1,0.5),ncol=4) fast.pattern(X) X <- matrix(rbinom(100,3,0.5),ncol=4) fast.pattern(X,categories=4)X <- matrix(rbinom(100,1,0.5),ncol=4) fast.pattern(X) X <- matrix(rbinom(100,3,0.5),ncol=4) fast.pattern(X,categories=4)
Fast reshape/tranpose of data
fast.reshape( data, varying, id, num, sep = "", keep, idname = "id", numname = "num", factor = FALSE, idcombine = TRUE, labelnum = FALSE, labels, regex = mets.options()$regex, dropid = FALSE, ... )fast.reshape( data, varying, id, num, sep = "", keep, idname = "id", numname = "num", factor = FALSE, idcombine = TRUE, labelnum = FALSE, labels, regex = mets.options()$regex, dropid = FALSE, ... )
data |
data.frame or matrix |
varying |
Vector of prefix-names of the time varying variables. Optional for Long->Wide reshaping. |
id |
id-variable. If omitted then reshape Wide->Long. |
num |
Optional number/time variable |
sep |
String seperating prefix-name with number/time |
keep |
Vector of column names to keep |
idname |
Name of id-variable (Wide->Long) |
numname |
Name of number-variable (Wide->Long) |
factor |
If true all factors are kept (otherwise treated as character) |
idcombine |
If TRUE and |
labelnum |
If TRUE varying variables in wide format (going from long->wide) are labeled 1,2,3,... otherwise use 'num' variable. In long-format (going from wide->long) varying variables matching 'varying' prefix are only selected if their postfix is a number. |
labels |
Optional labels for the number variable |
regex |
Use regular expressions |
dropid |
Drop id in long format (default FALSE) |
... |
Optional additional arguments |
Thomas Scheike, Klaus K. Holst
m <- lava::lvm(c(y1,y2,y3,y4)~x) d <- lava::sim(m,5) d fast.reshape(d,"y") fast.reshape(fast.reshape(d,"y"),id="id") ##### From wide-format (dd <- fast.reshape(d,"y")) ## Same with explicit setting new id and number variable/column names ## and seperator "" (default) and dropping x fast.reshape(d,"y",idname="a",timevar="b",sep="",keep=c()) ## Same with 'reshape' list-syntax fast.reshape(d,list(c("y1","y2","y3","y4")),labelnum=TRUE) ##### From long-format fast.reshape(dd,id="id") ## Restrict set up within-cluster varying variables fast.reshape(dd,"y",id="id") fast.reshape(dd,"y",id="id",keep="x",sep=".") ##### x <- data.frame(id=c(5,5,6,6,7),y=1:5,x=1:5,tv=c(1,2,2,1,2)) x (xw <- fast.reshape(x,id="id")) (xl <- fast.reshape(xw,c("y","x"),idname="id2",keep=c())) (xl <- fast.reshape(xw,c("y","x","tv"))) (xw2 <- fast.reshape(xl,id="id",num="num")) fast.reshape(xw2,c("y","x"),idname="id") ### more generally: ### varying=list(c("ym","yf","yb1","yb2"), c("zm","zf","zb1","zb2")) ### varying=list(c("ym","yf","yb1","yb2"))) ##### Family cluster example d <- mets:::sim_BinFam(3) d fast.reshape(d,var="y") fast.reshape(d,varying=list(c("ym","yf","yb1","yb2"))) d <- lava::sim(lava::lvm(~y1+y2+ya),10) d (dd <- fast.reshape(d,"y")) fast.reshape(d,"y",labelnum=TRUE) fast.reshape(dd,id="id",num="num") fast.reshape(dd,id="id",num="num",labelnum=TRUE) fast.reshape(d,c(a="y"),labelnum=TRUE) ## New column name ##### Unbalanced data m <- lava::lvm(c(y1,y2,y3,y4)~ x+z1+z3+z5) d <- lava::sim(m,3) d fast.reshape(d,c("y","z")) ##### not-varying syntax: fast.reshape(d,-c("x")) ##### Automatically define varying variables from trailing digits fast.reshape(d) ##### Prostate cancer example data(prt) head(prtw <- fast.reshape(prt,"cancer",id="id")) ftable(cancer1~cancer2,data=prtw) rm(prtw)m <- lava::lvm(c(y1,y2,y3,y4)~x) d <- lava::sim(m,5) d fast.reshape(d,"y") fast.reshape(fast.reshape(d,"y"),id="id") ##### From wide-format (dd <- fast.reshape(d,"y")) ## Same with explicit setting new id and number variable/column names ## and seperator "" (default) and dropping x fast.reshape(d,"y",idname="a",timevar="b",sep="",keep=c()) ## Same with 'reshape' list-syntax fast.reshape(d,list(c("y1","y2","y3","y4")),labelnum=TRUE) ##### From long-format fast.reshape(dd,id="id") ## Restrict set up within-cluster varying variables fast.reshape(dd,"y",id="id") fast.reshape(dd,"y",id="id",keep="x",sep=".") ##### x <- data.frame(id=c(5,5,6,6,7),y=1:5,x=1:5,tv=c(1,2,2,1,2)) x (xw <- fast.reshape(x,id="id")) (xl <- fast.reshape(xw,c("y","x"),idname="id2",keep=c())) (xl <- fast.reshape(xw,c("y","x","tv"))) (xw2 <- fast.reshape(xl,id="id",num="num")) fast.reshape(xw2,c("y","x"),idname="id") ### more generally: ### varying=list(c("ym","yf","yb1","yb2"), c("zm","zf","zb1","zb2")) ### varying=list(c("ym","yf","yb1","yb2"))) ##### Family cluster example d <- mets:::sim_BinFam(3) d fast.reshape(d,var="y") fast.reshape(d,varying=list(c("ym","yf","yb1","yb2"))) d <- lava::sim(lava::lvm(~y1+y2+ya),10) d (dd <- fast.reshape(d,"y")) fast.reshape(d,"y",labelnum=TRUE) fast.reshape(dd,id="id",num="num") fast.reshape(dd,id="id",num="num",labelnum=TRUE) fast.reshape(d,c(a="y"),labelnum=TRUE) ## New column name ##### Unbalanced data m <- lava::lvm(c(y1,y2,y3,y4)~ x+z1+z3+z5) d <- lava::sim(m,3) d fast.reshape(d,c("y","z")) ##### not-varying syntax: fast.reshape(d,-c("x")) ##### Automatically define varying variables from trailing digits fast.reshape(d) ##### Prostate cancer example data(prt) head(prtw <- fast.reshape(prt,"cancer",id="id")) ftable(cancer1~cancer2,data=prtw) rm(prtw)
Reshapes clustered long-format data to wide format efficiently using compiled code.
faster.reshape(data, clusters, index.type = FALSE, num = NULL, Rindex = 1)faster.reshape(data, clusters, index.type = FALSE, num = NULL, Rindex = 1)
data |
a matrix or data.frame to reshape. |
clusters |
vector of cluster identifiers, or column name in data. |
index.type |
logical; if TRUE, clusters are already 0-based indices. |
num |
optional within-cluster numbering variable. |
Rindex |
if 1, use R (1-based) indexing. |
A wide-format matrix.
Splits n observations into random folds for cross-validation.
folds(n, folds = 10)folds(n, folds = 10)
n |
number of observations. |
folds |
number of folds (default 10). |
A list of integer vectors, each containing indices for one fold.
Enforces the same censoring time within clusters (pairs) by censoring both subjects at the minimum censoring time when censoring occurs first.
force.same.cens( data, id = "id", time = "time", cause = "cause", entrytime = NULL, cens.code = 0 ) force_same_cens( data, id = "id", time = "time", cause = "cause", entrytime = NULL, cens.code = 0 )force.same.cens( data, id = "id", time = "time", cause = "cause", entrytime = NULL, cens.code = 0 ) force_same_cens( data, id = "id", time = "time", cause = "cause", entrytime = NULL, cens.code = 0 )
data |
a data.frame in long format. |
id |
name of the cluster/pair identifier column. |
time |
name of the time variable. |
cause |
name of the cause/status variable. |
entrytime |
optional name of left-truncation time variable. |
cens.code |
value indicating censoring in the cause variable. |
A data.frame with enforced same censoring.
Fits GLM model with treatment weights
, computes standard errors via influence functions that are returned as the IID argument. Propensity scores are fitted using either logistic regression (glm) or the multinomial model (mlogit) when more than two categories for treatment. The treatment needs to be a factor and is identified on the rhs of the "treat.model".
glm_IPTW( formula, data, treat.model = NULL, family = binomial(), id = NULL, weights = NULL, estpr = 1, pi0 = 0.5, ... )glm_IPTW( formula, data, treat.model = NULL, family = binomial(), id = NULL, weights = NULL, estpr = 1, pi0 = 0.5, ... )
formula |
for glm |
data |
data frame for risk averaging |
treat.model |
propensity score model (binary or multinomial) |
family |
of glm (logistic regression) |
id |
cluster id for standard errors |
weights |
may be given, and then uses weights*w(A) as the weights |
estpr |
to estimate propensity scores and get infuence function contribution to uncertainty |
pi0 |
fixed simple weights |
... |
arguments for glm call |
Also works with cluster argument.
Thomas Scheike
Performs cumulative score process residual tests for the proportional hazards (PH) assumption in Cox regression. The test statistics are based on the cumulative score process:
where are the martingale residuals.
## S3 method for class 'phreg' gof(object, n.sim = 1000, silent = 1, robust = NULL, ...)## S3 method for class 'phreg' gof(object, n.sim = 1000, silent = 1, robust = NULL, ...)
object |
A fitted |
n.sim |
Number of simulations for the resampling procedure (default 1000). |
silent |
Logical; if TRUE, suppresses timing estimates for long jobs. |
robust |
Logical; if TRUE, uses robust martingale-based simulations. If NULL, defaults to TRUE if a cluster term is detected in the model call. |
... |
Additional arguments passed to lower-level functions. |
P-values are computed using the Lin, Wei, and Ying (1993) resampling method, which simulates the asymptotic distribution of the supremum of the score process under the null hypothesis of proportional hazards.
The function supports two types of simulation:
Standard: Uses (counting process increments) for simulation.
Robust: Uses (martingale residuals) adjusted for clustering
if a cluster() term is present in the model.
An object of class "gof.phreg" containing:
jumptimes |
Event times used in the process. |
supUsim |
Matrix of simulated supremum values for each covariate. |
res |
Matrix with observed supremum ( |
supU |
Observed supremum values. |
pvals |
Vector of p-values for each covariate. |
score |
Cumulative score process values over time. |
simUt |
Simulated score processes. |
type |
Type of test performed ("prop"). |
robust |
Logical flag indicating if robust simulation was used. |
Thomas Scheike and Klaus K. Holst
Lin, D. Y., Wei, L. J., & Ying, Z. (1993). Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika, 80(3), 557-572.
data(sTRACE) m1 <- phreg(Surv(time,status==9)~vf+chf+diabetes, data=sTRACE) gg <- gof(m1) gg par(mfrow=c(1,3)) plot(gg) m1 <- phreg(Surv(time,status==9)~strata(vf)+chf+diabetes, data=sTRACE) gg <- gof(m1) ## Robust simulations with cluster sTRACE$id <- 1:500 m1 <- phreg(Surv(time,status==9)~vf+chf+diabetes+cluster(id), data=sTRACE) gg <- gof(m1) ggdata(sTRACE) m1 <- phreg(Surv(time,status==9)~vf+chf+diabetes, data=sTRACE) gg <- gof(m1) gg par(mfrow=c(1,3)) plot(gg) m1 <- phreg(Surv(time,status==9)~strata(vf)+chf+diabetes, data=sTRACE) gg <- gof(m1) ## Robust simulations with cluster sTRACE$id <- 1:500 m1 <- phreg(Surv(time,status==9)~vf+chf+diabetes+cluster(id), data=sTRACE) gg <- gof(m1) gg
Tests the functional form of covariates in a Cox PH model by computing cumulative residuals against a user-specified model matrix. This helps detect non-linear effects or time-varying coefficients (interaction with time).
gofM_phreg( formula, data, offset = NULL, weights = NULL, modelmatrix = NULL, n.sim = 1000, silent = 1, ... )gofM_phreg( formula, data, offset = NULL, weights = NULL, modelmatrix = NULL, n.sim = 1000, silent = 1, ... )
formula |
Formula for the Cox regression model. |
data |
Data frame. |
offset |
Offset vector. |
weights |
Weights vector. |
modelmatrix |
Matrix for cumulating residuals. Typically constructed using
|
n.sim |
Number of simulations (default 1000). |
silent |
Logical; suppresses timing estimates if TRUE. |
... |
Additional arguments passed to |
The test statistic is:
where is the model matrix (e.g., a set of basis functions for a continuous covariate)
and are the martingale residuals.
P-values are based on the Lin, Wei, and Ying (1993) resampling method. The plot shows whether the residuals are consistent with the model across the range of the covariate.
An object of class "gof.phreg" containing:
jumptimes |
Event times. |
supUsim |
Simulated supremum values. |
res |
Matrix with observed supremum and p-values for each column of |
score |
Cumulative score process. |
simUt |
Simulated processes. |
Utlast, pval.last
|
Supremum and p-value for the final time point (covariate direction). |
type |
Type of test ("modelmatrix"). |
Thomas Scheike and Klaus K. Holst
Lin, D. Y., Wei, L. J., & Ying, Z. (1993). Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika, 80(3), 557-572.
gof.phreg, gofZ_phreg, cumContr
data(TRACE) set.seed(1) TRACEsam <- blocksample(TRACE, idvar="id", replace=FALSE, 100) dcut(TRACEsam) <- ~. mm <- model.matrix(~-1+factor(wmicat.4), data=TRACEsam) m1 <- gofM_phreg(Surv(time,status==9)~vf+chf+wmi, data=TRACEsam, modelmatrix=mm) summary(m1) if (interactive()) { par(mfrow=c(2,2)) plot(m1) } ## Cumulative sums in covariates via design matrix mm <- mets:::cumContr(TRACEsam$wmi, breaks=10, equi=TRUE) m1 <- gofM_phreg(Surv(time,status==9)~strata(vf)+chf+wmi, data=TRACEsam, modelmatrix=mm, silent=0) summary(m1)data(TRACE) set.seed(1) TRACEsam <- blocksample(TRACE, idvar="id", replace=FALSE, 100) dcut(TRACEsam) <- ~. mm <- model.matrix(~-1+factor(wmicat.4), data=TRACEsam) m1 <- gofM_phreg(Surv(time,status==9)~vf+chf+wmi, data=TRACEsam, modelmatrix=mm) summary(m1) if (interactive()) { par(mfrow=c(2,2)) plot(m1) } ## Cumulative sums in covariates via design matrix mm <- mets:::cumContr(TRACEsam$wmi, breaks=10, equi=TRUE) m1 <- gofM_phreg(Surv(time,status==9)~strata(vf)+chf+wmi, data=TRACEsam, modelmatrix=mm, silent=0) summary(m1)
Tests the functional form of continuous covariates in a Cox PH model to check for
linearity. It computes cumulative residuals evaluated at a grid of covariate values .
gofZ_phreg( formula, data, vars = NULL, offset = NULL, weights = NULL, breaks = 50, equi = FALSE, n.sim = 1000, silent = 1, ... )gofZ_phreg( formula, data, vars = NULL, offset = NULL, weights = NULL, breaks = 50, equi = FALSE, n.sim = 1000, silent = 1, ... )
formula |
Formula for the Cox regression. |
data |
Data frame. |
vars |
Vector of variable names to test. If NULL, automatically detects continuous covariates with more than 2 levels. |
offset |
Offset vector. |
weights |
Weights vector. |
breaks |
Number of break points for the grid (default 50). |
equi |
Logical; if TRUE, uses equidistant breaks; if FALSE, uses quantiles. |
n.sim |
Number of simulations (default 1000). |
silent |
Logical; suppresses timing estimates. |
... |
Additional arguments passed to |
The test statistic is:
where is a design matrix based on indicator functions
for a grid of points .
The p-value is valid but depends on the chosen grid. As the number of break points increases, this test converges to the original Lin, Wei, and Ying test for linearity.
An object of class "gof.phreg" with type "Zmodelmatrix" containing:
res |
Matrix of p-values for each tested variable. |
Zres |
List of |
type |
Type of test ("Zmodelmatrix"). |
Thomas Scheike and Klaus K. Holst
data(TRACE) set.seed(1) TRACEsam <- blocksample(TRACE, idvar="id", replace=FALSE, 100) ## Test linearity of continuous covariates ## Reduce Ex.Timings m1 <- gofZ_phreg(Surv(time,status==9)~strata(vf)+chf+wmi+age, data=TRACEsam) summary(m1) plot(m1, type="z")data(TRACE) set.seed(1) TRACEsam <- blocksample(TRACE, idvar="id", replace=FALSE, 100) ## Test linearity of continuous covariates ## Reduce Ex.Timings m1 <- gofZ_phreg(Surv(time,status==9)~strata(vf)+chf+wmi+age, data=TRACEsam) summary(m1) plot(m1, type="z")
Fits a random effects model describing the dependence in the cumulative incidence curves for subjects within a cluster. Given the gamma distributed random effects it is assumed that the cumulative incidence curves are indpendent, and that the marginal cumulative incidence curves are on additive form
Grandom_cif( cif, data, cause = NULL, cif2 = NULL, times = NULL, cause1 = 1, cause2 = 1, cens.code = NULL, cens.model = "KM", Nit = 40, detail = 0, clusters = NULL, theta = NULL, theta.des = NULL, weights = NULL, step = 1, sym = 0, same.cens = FALSE, censoring.weights = NULL, silent = 1, var.link = 0, score.method = "nr", entry = NULL, estimator = 1, trunkp = 1, admin.cens = NULL, random.design = NULL, ... )Grandom_cif( cif, data, cause = NULL, cif2 = NULL, times = NULL, cause1 = 1, cause2 = 1, cens.code = NULL, cens.model = "KM", Nit = 40, detail = 0, clusters = NULL, theta = NULL, theta.des = NULL, weights = NULL, step = 1, sym = 0, same.cens = FALSE, censoring.weights = NULL, silent = 1, var.link = 0, score.method = "nr", entry = NULL, estimator = 1, trunkp = 1, admin.cens = NULL, random.design = NULL, ... )
cif |
a model object from the timereg::comp.risk function with the marginal cumulative incidence of cause2, i.e., the event that is conditioned on, and whose odds the comparision is made with respect to |
data |
a data.frame with the variables. |
cause |
specifies the causes related to the death times, the value cens.code is the censoring value. |
cif2 |
specificies model for cause2 if different from cause1. |
times |
time points |
cause1 |
cause of first coordinate. |
cause2 |
cause of second coordinate. |
cens.code |
specificies the code for the censoring if NULL then uses the one from the marginal cif model. |
cens.model |
specified which model to use for the ICPW, KM is Kaplan-Meier alternatively it may be "cox" |
Nit |
number of iterations for Newton-Raphson algorithm. |
detail |
if 0 no details are printed during iterations, if 1 details are given. |
clusters |
specifies the cluster structure. |
theta |
specifies starting values for the cross-odds-ratio parameters of the model. |
theta.des |
specifies a regression design for the cross-odds-ratio parameters. |
weights |
weights for score equations. |
step |
specifies the step size for the Newton-Raphson algorith.m |
sym |
1 for symmetri and 0 otherwise |
same.cens |
if true then censoring within clusters are assumed to be the same variable, default is independent censoring. |
censoring.weights |
Censoring probabilities |
silent |
debug information |
var.link |
if var.link=1 then var is on log-scale. |
score.method |
default uses "nlminb" optimzer, alternatively, use the "nr" algorithm. |
entry |
entry-age in case of delayed entry. Then two causes must be given. |
estimator |
estimator |
trunkp |
gives probability of survival for delayed entry, and related to entry-ages given above. |
admin.cens |
Administrative censoring |
random.design |
specifies a regression design of 0/1's for the random effects. |
... |
extra arguments. |
We allow a regression structure for the indenpendent gamma distributed random effects and their variances that may depend on cluster covariates.
random.design specificies the random effects for each subject within a cluster. This is
a matrix of 1's and 0's with dimension n x d. With d random effects.
For a cluster with two subjects, we let the random.design rows be
and .
Such that the random effects for subject
1 is
, for d random effects. Each random effect
has an associated parameter . By construction
subjects 1's random effect are Gamma distributed with
mean
and variance . Note that the random effect
has mean 1 and variance .
The parameters
are related to the parameters of the model
by a regression construction (d x k), that links the
parameters
with the (k) underlying parameters
returns an object of type 'random.cif'. With the following arguments:
theta |
estimate of parameters of model. |
var.theta |
variance for gamma. |
hess |
the derivative of the used score. |
score |
scores at final stage. |
theta.iid |
matrix of iid decomposition of parametric effects. |
Thomas Scheike
A Semiparametric Random Effects Model for Multivariate Competing Risks Data, Scheike, Zhang, Sun, Jensen (2010), Biometrika.
Cross odds ratio Modelling of dependence for Multivariate Competing Risks Data, Scheike and Sun (2013), Biostatitistics.
Scheike, Holst, Hjelmborg (2014), LIDA, Estimating heritability for cause specific hazards based on twin data
## Reduce Ex.Timings d <- sim_nordic_random(1000,delayed=TRUE, cordz=1.0,cormz=2,lam0=0.3,country=TRUE) times <- seq(50,90,by=10) addm <- timereg::comp.risk(Event(time,cause)~-1+factor(country)+cluster(id),data=d, times=times,cause=1,max.clust=NULL) ### making group indidcator mm <- model.matrix(~-1+factor(zyg),d) out1m<-random_cif(addm,data=d,cause1=1,cause2=1,theta=1, theta.des=mm,same.cens=TRUE) summary(out1m) ## this model can also be formulated as a random effects model ## but with different parameters out2m<-Grandom_cif(addm,data=d,cause1=1,cause2=1, theta=c(0.5,1),step=1.0, random.design=mm,same.cens=TRUE) summary(out2m) 1/out2m$theta out1m$theta #################################################################### ################### ACE modelling of twin data ##################### #################################################################### ### assume that zygbin gives the zygosity of mono and dizygotic twins ### 0 for mono and 1 for dizygotic twins. We now formulate and AC model zygbin <- d$zyg=="DZ" n <- nrow(d) ### random effects for each cluster des.rv <- cbind(mm,(zygbin==1)*rep(c(1,0)),(zygbin==1)*rep(c(0,1)),1) ### design making parameters half the variance for dizygotic components pardes <- rbind(c(1,0), c(0.5,0),c(0.5,0), c(0.5,0), c(0,1)) outacem <-Grandom_cif(addm,data=d,cause1=1,cause2=1, same.cens=TRUE,theta=c(0.35,0.15), step=1.0,theta.des=pardes,random.design=des.rv) summary(outacem)## Reduce Ex.Timings d <- sim_nordic_random(1000,delayed=TRUE, cordz=1.0,cormz=2,lam0=0.3,country=TRUE) times <- seq(50,90,by=10) addm <- timereg::comp.risk(Event(time,cause)~-1+factor(country)+cluster(id),data=d, times=times,cause=1,max.clust=NULL) ### making group indidcator mm <- model.matrix(~-1+factor(zyg),d) out1m<-random_cif(addm,data=d,cause1=1,cause2=1,theta=1, theta.des=mm,same.cens=TRUE) summary(out1m) ## this model can also be formulated as a random effects model ## but with different parameters out2m<-Grandom_cif(addm,data=d,cause1=1,cause2=1, theta=c(0.5,1),step=1.0, random.design=mm,same.cens=TRUE) summary(out2m) 1/out2m$theta out1m$theta #################################################################### ################### ACE modelling of twin data ##################### #################################################################### ### assume that zygbin gives the zygosity of mono and dizygotic twins ### 0 for mono and 1 for dizygotic twins. We now formulate and AC model zygbin <- d$zyg=="DZ" n <- nrow(d) ### random effects for each cluster des.rv <- cbind(mm,(zygbin==1)*rep(c(1,0)),(zygbin==1)*rep(c(0,1)),1) ### design making parameters half the variance for dizygotic components pardes <- rbind(c(1,0), c(0.5,0),c(0.5,0), c(0.5,0), c(0,1)) outacem <-Grandom_cif(addm,data=d,cause1=1,cause2=1, same.cens=TRUE,theta=c(0.35,0.15), step=1.0,theta.des=pardes,random.design=des.rv) summary(outacem)
Creates a contingency table by group from paired/clustered data, optionally combining lower and upper triangles.
grouptable( data, id, group, var, lower = TRUE, labels, order, group.labels, group.order, combine = " & ", ... )grouptable( data, id, group, var, lower = TRUE, labels, order, group.labels, group.order, combine = " & ", ... )
data |
a data.frame. |
id |
name of the cluster/pair identifier column. |
group |
name of the grouping variable (e.g., zygosity). |
var |
name of the outcome variable to tabulate. |
lower |
logical; if TRUE, fold upper triangle into lower. |
labels |
optional labels for levels of |
order |
optional ordering of factor levels. |
group.labels |
optional labels for the groups. |
group.order |
optional ordering of groups. |
combine |
separator for combining two groups (default |
... |
additional arguments. |
A table or list of tables.
haplo fun data
hapfreqs : haplo frequencies haploX: covariates and response for haplo survival discrete survival ghaplos: haplo-types for subjects of haploX data
Estimated data
Performs cycle-specific logistic regression to estimate haplotype effects on discrete
time-to-event data, accounting for phase ambiguity. Given observed genotypes
and unobserved haplotypes , the method integrates (mixes out) over the possible
haplotype configurations using the conditional probabilities .
haplo_surv_discrete( X = NULL, y = "y", time.name = "time", Haplos = NULL, id = "id", desnames = NULL, designfunc = NULL, beta = NULL, no.opt = FALSE, method = "NR", stderr = TRUE, designMatrix = NULL, response = NULL, idhap = NULL, design.only = FALSE, covnames = NULL, fam = binomial, weights = NULL, offsets = NULL, idhapweights = NULL, ... )haplo_surv_discrete( X = NULL, y = "y", time.name = "time", Haplos = NULL, id = "id", desnames = NULL, designfunc = NULL, beta = NULL, no.opt = FALSE, method = "NR", stderr = TRUE, designMatrix = NULL, response = NULL, idhap = NULL, design.only = FALSE, covnames = NULL, fam = binomial, weights = NULL, offsets = NULL, idhapweights = NULL, ... )
X |
Design matrix data frame (must be sorted by |
y |
Name of the response variable (binary, 0/1) in |
time.name |
Name of the time variable in |
Haplos |
Data frame containing |
id |
Name of the ID variable in |
desnames |
Names of the covariate columns in |
designfunc |
Function that computes the design vector given haplotypes |
beta |
Starting values for the optimization (vector of length |
no.opt |
Logical; if TRUE, skips optimization and returns estimates based on the
provided |
method |
Optimization method: |
stderr |
Logical; if FALSE, returns only the coefficient estimates. |
designMatrix |
Alternative to |
response |
Alternative to |
idhap |
Name of the ID-haplotype variable to specify different haplotypes for different IDs. |
design.only |
Logical; if TRUE, returns only the design matrices constructed for the analysis. |
covnames |
Names of covariates to extract from the object for regression output. |
fam |
Family of the model (default |
weights |
Weights following ID for the GLM component. |
offsets |
Offsets following ID for the GLM component. |
idhapweights |
Weights following ID-haplotype for the GLM component (Work in Progress). |
... |
Additional arguments passed to the optimizer ( |
The survival function is computed by averaging over the possible haplotypes:
The discrete hazard function is modeled using logistic regression:
where are time-specific intercepts (baseline hazards), is the
regression design constructed from covariates and haplotypes , and
are the regression coefficients.
The likelihood is maximized numerically. Standard errors are computed assuming that
is known (i.e., ignoring the uncertainty in haplotype estimation).
The design matrix over possible haplotypes is constructed by merging the covariate
data with the haplotype probabilities and applying a user-defined
designfunc.
An object of class "haplosurvd" containing:
coef |
Estimated coefficients (baseline time effects and haplotype/covariate effects). |
se |
Standard errors of the coefficients. |
var |
Variance-covariance matrix. |
se.robust |
Robust standard errors (if available). |
iid |
Influence function (IID) decomposition. |
ploglik |
Log-likelihood at convergence. |
gradient, hessian
|
Optimization results. |
Xhap, X, Haplos
|
Data and design matrices used. |
nid, nidhap
|
Number of IDs and ID-haplotype combinations. |
Thomas Scheike
Scheike, T. H. (2024). Discrete time survival analysis with haplotype effects. mets package documentation.
## Some haplotypes of interest types <- c("DCGCGCTCACG","DTCCGCTGACG","ITCAGTTGACG","ITCCGCTGAGG") ## Some haplotype frequencies for simulations data(haplo) hapfreqs <- haplo$hapfreqs www <- which(hapfreqs$haplotype %in% types) hapfreqs$freq[www] baseline <- hapfreqs$haplotype[9] baseline ## Design function: indicator for presence of any 'types' haplotype designftypes <- function(x, sm=0) { hap1 <- x[1] hap2 <- x[2] if (sm == 0) y <- 1 * ((hap1 == types) | (hap2 == types)) if (sm == 1) y <- 1 * (hap1 == types) + 1 * (hap2 == types) return(y) } tcoef <- c(-1.93110204, -0.47531630, -0.04118204, -1.57872602, -0.22176426, -0.13836416, 0.88830288, 0.60756224, 0.39802821, 0.32706859) ghaplos <- haplo$ghaplos haploX <- haplo$haploX haploX$time <- haploX$times Xdes <- model.matrix(~ factor(time), haploX) colnames(Xdes) <- paste("X", 1:ncol(Xdes), sep="") X <- dkeep(haploX, ~ id + y + time) X <- cbind(X, Xdes) Haplos <- dkeep(ghaplos, ~ id + "haplo*" + p) desnames <- paste("X", 1:6, sep="") # Six X's related to 6 cycles out <- haplo_surv_discrete(X=X, y="y", time.name="time", Haplos=Haplos, desnames=desnames, designfunc=designftypes) names(out$coef) <- c(desnames, types) out$coef summary(out)## Some haplotypes of interest types <- c("DCGCGCTCACG","DTCCGCTGACG","ITCAGTTGACG","ITCCGCTGAGG") ## Some haplotype frequencies for simulations data(haplo) hapfreqs <- haplo$hapfreqs www <- which(hapfreqs$haplotype %in% types) hapfreqs$freq[www] baseline <- hapfreqs$haplotype[9] baseline ## Design function: indicator for presence of any 'types' haplotype designftypes <- function(x, sm=0) { hap1 <- x[1] hap2 <- x[2] if (sm == 0) y <- 1 * ((hap1 == types) | (hap2 == types)) if (sm == 1) y <- 1 * (hap1 == types) + 1 * (hap2 == types) return(y) } tcoef <- c(-1.93110204, -0.47531630, -0.04118204, -1.57872602, -0.22176426, -0.13836416, 0.88830288, 0.60756224, 0.39802821, 0.32706859) ghaplos <- haplo$ghaplos haploX <- haplo$haploX haploX$time <- haploX$times Xdes <- model.matrix(~ factor(time), haploX) colnames(Xdes) <- paste("X", 1:ncol(Xdes), sep="") X <- dkeep(haploX, ~ id + y + time) X <- cbind(X, Xdes) Haplos <- dkeep(ghaplos, ~ id + "haplo*" + p) desnames <- paste("X", 1:6, sep="") # Six X's related to 6 cycles out <- haplo_surv_discrete(X=X, y="y", time.name="time", Haplos=Haplos, desnames=desnames, designfunc=designftypes) names(out$coef) <- c(desnames, types) out$coef summary(out)
Data from HF-action trial slightly modified from WA package, consisting of 741 nonischemic patients with baseline cardiopulmonary test duration less than or equal to 12 minutes.
Randomized study status : 1-event, 2-death, 0-censoring treatment : 1/0
WA package, Connor et al. 2009
data(hfactioncpx12)data(hfactioncpx12)
Extracts and scales influence function (IC) components from a fitted binomial regression ATE object or raw IC matrix. Depending on the object class and requested 'type', returns one or more components combined by column-binding.
## S3 method for class 'binreg' IC(x, type = c("coef", "DR", "G"), ...)## S3 method for class 'binreg' IC(x, type = c("coef", "DR", "G"), ...)
x |
An object of class |
type |
Character vector specifying which influence curve components to return. Possible values are:
If |
... |
Currently unused. |
For objects of class ATE, the function extracts and rescales
the stored influence curve components:
x$iid for coefficient ICs
x$riskDR.iid for doubly robust ATE ICs
x$riskG.iid for G-computation ICs
Each component is multiplied by its sample size (NROW(.)).
For non-ATE objects, the function returns
NROW(x$iid) * x$iid.
A numeric matrix obtained by column-binding the selected
influence curve components. Each column corresponds to one element
of type, with column names preserved.
## Not run: # default ATE behavior (DR + G) IC.binreg(fit) # only DR component IC.binreg(fit, "DR") # multiple components IC.binreg(fit, c("coef", "DR")) ## End(Not run)## Not run: # default ATE behavior (DR + G) IC.binreg(fit) # only DR component IC.binreg(fit, "DR") # multiple components IC.binreg(fit, c("coef", "DR")) ## End(Not run)
Computes the influence functions (IID decomposition) for the regression coefficients and/or the baseline cumulative hazard at a specific time point.
## S3 method for class 'phreg' IC(x, type = "robust", all = FALSE, time = NULL, baseline = NULL, ...)## S3 method for class 'phreg' IC(x, type = "robust", all = FALSE, time = NULL, baseline = NULL, ...)
x |
Object of class |
type |
Type of influence function: |
all |
Logical; if |
time |
Time point for baseline influence function (required if baseline is requested). |
baseline |
Arguments for baseline estimation. |
... |
Additional arguments. |
A matrix of influence functions. If all=TRUE, columns correspond to
regression coefficients and baseline cumulative hazard. Attributes include coef
and time.
Thomas Scheike
Computes the influence functions for the baseline cumulative hazard (and optionally
regression coefficients) for phreg, recreg, or cifregFG objects.
iidBaseline( object, time = NULL, ft = NULL, fixbeta = NULL, beta.iid = NULL, tminus = FALSE, ... )iidBaseline( object, time = NULL, ft = NULL, fixbeta = NULL, beta.iid = NULL, tminus = FALSE, ... )
object |
Object of class |
time |
Time point for baseline IID (required). |
ft |
Function to compute IID of baseline integrated against |
fixbeta |
Logical; if |
beta.iid |
Optional matrix of beta influence functions to use. |
tminus |
Logical; if |
... |
Additional arguments passed to lower-level functions. |
The decomposition is based on the formula:
where denotes the stratum and the cluster.
An object of class "iidBaseline" containing:
time |
Time point. |
base.iid |
Influence functions for the baseline. |
beta.iid |
Influence functions for the regression coefficients. |
cumhaz.time |
Cumulative hazard at the specified time. |
strata |
Strata indices. |
Thomas Scheike
Computes the inverse Laplace transform for gamma frailty simulation.
ilap(theta, t)ilap(theta, t)
theta |
frailty parameter. |
t |
value at which to evaluate. |
Numeric value of the inverse Laplace transform.
Fits a cumulative odds model for discrete time-to-event data, handling interval
censoring where the event time is known only to lie within an interval .
The model assumes:
where is the baseline cumulative odds function and are the
regression coefficients. This is equivalent to:
interval_logitsurv_discrete( formula, data, beta = NULL, no.opt = FALSE, method = "NR", stderr = TRUE, weights = NULL, offsets = NULL, exp.link = 1, increment = 1, ... )interval_logitsurv_discrete( formula, data, beta = NULL, no.opt = FALSE, method = "NR", stderr = TRUE, weights = NULL, offsets = NULL, exp.link = 1, increment = 1, ... )
formula |
Formula with an |
data |
Data frame containing the variables in the formula. |
beta |
Starting values for the optimization (vector of length |
no.opt |
Logical; if TRUE, skips optimization and returns estimates based on
the provided |
method |
Optimization method: |
stderr |
Logical; if FALSE, returns only the coefficient estimates. |
weights |
Observation weights (follows ID). |
offsets |
Offsets (follows ID). |
exp.link |
Logical; if TRUE, parameterizes increments as |
increment |
Logical; if TRUE, uses increments |
... |
Additional arguments passed to the optimizer ( |
The baseline is parameterized as the cumulative sum of exponentials
(), ensuring positivity. The regression coefficients
describe the log-odds of the event occurring by time .
The likelihood is maximized over the observed intervals:
where and are the left and right endpoints of the interval
for subject . Right-censored intervals have .
An object of class "cumoddsreg" containing:
coef |
Estimated coefficients (baseline time effects and covariate effects). |
se.coef |
Standard errors of the coefficients. |
var |
Variance-covariance matrix. |
iid |
Influence function (IID) decomposition for robust variance estimation. |
ntimes |
Number of distinct time intervals. |
utimes |
Unique time points. |
ploglik |
Log-likelihood at convergence. |
gradient, hessian
|
Optimization results. |
call |
Original function call. |
Thomas Scheike
Scheike, T. H. (2024). Discrete time survival analysis with interval censoring. mets package documentation.
cumoddsreg, predictlogitSurvd, simlogitSurvd
data(ttpd) dtable(ttpd,~entry+time2) out <- interval_logitsurv_discrete(Interval(entry,time2)~X1+X2+X3+X4,ttpd) summary(out) head(iid(out)) pred <- predictlogitSurvd(out,se=FALSE) plotSurvd(pred) ttpd <- dfactor(ttpd,fentry~entry) out <- cumoddsreg(fentry~X1+X2+X3+X4,ttpd) summary(out)data(ttpd) dtable(ttpd,~entry+time2) out <- interval_logitsurv_discrete(Interval(entry,time2)~X1+X2+X3+X4,ttpd) summary(out) head(iid(out)) pred <- predictlogitSurvd(out,se=FALSE) plotSurvd(pred) ttpd <- dfactor(ttpd,fentry~entry) out <- cumoddsreg(fentry~X1+X2+X3+X4,ttpd) summary(out)
Internal function. Calculates Inverse Probability of Censoring Weights (IPCW) and adds them to a data.frame
ipw( formula, data, cluster, same.cens = FALSE, obs.only = FALSE, weight.name = "w", trunc.prob = FALSE, weight.name2 = "wt", indi.weight = "pr", cens.model = "aalen", pairs = FALSE, theta.formula = ~1, ... )ipw( formula, data, cluster, same.cens = FALSE, obs.only = FALSE, weight.name = "w", trunc.prob = FALSE, weight.name2 = "wt", indi.weight = "pr", cens.model = "aalen", pairs = FALSE, theta.formula = ~1, ... )
formula |
Formula specifying the censoring model |
data |
data frame |
cluster |
clustering variable |
same.cens |
For clustered data, should same censoring be assumed (bivariate probability calculated as mininum of the marginal probabilities) |
obs.only |
Return data with uncensored observations only |
weight.name |
Name of weight variable in the new data.frame |
trunc.prob |
If TRUE truncation probabilities are also calculated and stored in 'weight.name2' (based on Clayton-Oakes gamma frailty model) |
weight.name2 |
Name of truncation probabilities |
indi.weight |
Name of individual censoring weight in the new data.frame |
cens.model |
Censoring model (default Aalens additive model) |
pairs |
For paired data (e.g. twins) only the complete pairs are returned (With pairs=TRUE) |
theta.formula |
Model for the dependence parameter in the Clayton-Oakes model (truncation only) |
... |
Additional arguments to censoring model |
Klaus K. Holst
## Not run: data("prt",package="mets") prtw <- ipw(Surv(time,status==0)~country, data=prt[sample(nrow(prt),5000),], cluster="id",weight.name="w") plot(0,type="n",xlim=range(prtw$time),ylim=c(0,1),xlab="Age",ylab="Probability") count <- 0 for (l in unique(prtw$country)) { count <- count+1 prtw <- prtw[order(prtw$time),] with(subset(prtw,country==l), lines(time,w,col=count,lwd=2)) } legend("topright",legend=unique(prtw$country),col=1:4,pch=-1,lty=1) ## End(Not run)## Not run: data("prt",package="mets") prtw <- ipw(Surv(time,status==0)~country, data=prt[sample(nrow(prt),5000),], cluster="id",weight.name="w") plot(0,type="n",xlim=range(prtw$time),ylim=c(0,1),xlab="Age",ylab="Probability") count <- 0 for (l in unique(prtw$country)) { count <- count+1 prtw <- prtw[order(prtw$time),] with(subset(prtw,country==l), lines(time,w,col=count,lwd=2)) } legend("topright",legend=unique(prtw$country),col=1:4,pch=-1,lty=1) ## End(Not run)
Internal function. Calculates Inverse Probability of Censoring and Truncation Weights and adds them to a data.frame
ipw2( data, times = NULL, entrytime = NULL, time = "time", cause = "cause", same.cens = FALSE, cluster = NULL, pairs = FALSE, strata = NULL, obs.only = TRUE, cens.formula = NULL, cens.code = 0, pair.cweight = "pcw", pair.tweight = "ptw", pair.weight = "weights", cname = "cweights", tname = "tweights", weight.name = "indi.weights", prec.factor = 100, ... )ipw2( data, times = NULL, entrytime = NULL, time = "time", cause = "cause", same.cens = FALSE, cluster = NULL, pairs = FALSE, strata = NULL, obs.only = TRUE, cens.formula = NULL, cens.code = 0, pair.cweight = "pcw", pair.tweight = "ptw", pair.weight = "weights", cname = "cweights", tname = "tweights", weight.name = "indi.weights", prec.factor = 100, ... )
data |
data frame |
times |
possible time argument for speciying a maximum value of time tau=max(times), to specify when things are considered censored or not. |
entrytime |
nam of entry-time for truncation. |
time |
name of time variable on data frame. |
cause |
name of cause indicator on data frame. |
same.cens |
For clustered data, should same censoring be assumed and same truncation (bivariate probability calculated as mininum of the marginal probabilities) |
cluster |
name of clustering variable |
pairs |
For paired data (e.g. twins) only the complete pairs are returned (With pairs=TRUE) |
strata |
name of strata variable to get weights stratified. |
obs.only |
Return data with uncensored observations only |
cens.formula |
model for Cox models for truncation and right censoring times. |
cens.code |
censoring.code |
pair.cweight |
Name of weight variable in the new data.frame for right censorig of pairs |
pair.tweight |
Name of weight variable in the new data.frame for left truncation of pairs |
pair.weight |
Name of weight variable in the new data.frame for right censoring and left truncation of pairs |
cname |
Name of weight variable in the new data.frame for right censoring of individuals |
tname |
Name of weight variable in the new data.frame for left truncation of individuals |
weight.name |
Name of weight variable in the new data.frame for right censoring and left truncation of individuals |
prec.factor |
To let tied censoring and truncation times come after the death times. |
... |
Additional arguments to censoring model |
Thomas Scheike
library("timereg") set.seed(1) d <- sim_nordic_random(5000,delayed=TRUE,ptrunc=0.7, cordz=0.5,cormz=2,lam0=0.3,country=FALSE) d$strata <- as.numeric(d$country)+(d$zyg=="MZ")*4 times <- seq(60,100,by=10) ## c1 <- timereg::comp.risk(Event(time,cause)~1+cluster(id),data=d,cause=1, ## model="fg",times=times,max.clust=NULL,n.sim=0) ## mm=model.matrix(~-1+zyg,data=d) ## out1<-random_cif(c1,data=d,cause1=1,cause2=1,same.cens=TRUE,theta.des=mm) ## summary(out1) ## pc1 <- predict(c1,X=1,se=0) ## plot(pc1) ## ## dl <- d[!d$truncated,] ## dl <- ipw2(dl,cluster="id",same.cens=TRUE,time="time",entrytime="entry",cause="cause", ## strata="strata",prec.factor=100) ## cl <- timereg::comp.risk(Event(time,cause)~+1+ ## cluster(id), ## data=dl,cause=1,model="fg", ## weights=dl$indi.weights,cens.weights=rep(1,nrow(dl)), ## times=times,max.clust=NULL,n.sim=0) ## pcl <- predict(cl,X=1,se=0) ## lines(pcl$time,pcl$P1,col=2) ## mm=model.matrix(~-1+factor(zyg),data=dl) ## out2<-random_cif(cl,data=dl,cause1=1,cause2=1,theta.des=mm, ## weights=dl$weights,censoring.weights=rep(1,nrow(dl))) ## summary(out2)library("timereg") set.seed(1) d <- sim_nordic_random(5000,delayed=TRUE,ptrunc=0.7, cordz=0.5,cormz=2,lam0=0.3,country=FALSE) d$strata <- as.numeric(d$country)+(d$zyg=="MZ")*4 times <- seq(60,100,by=10) ## c1 <- timereg::comp.risk(Event(time,cause)~1+cluster(id),data=d,cause=1, ## model="fg",times=times,max.clust=NULL,n.sim=0) ## mm=model.matrix(~-1+zyg,data=d) ## out1<-random_cif(c1,data=d,cause1=1,cause2=1,same.cens=TRUE,theta.des=mm) ## summary(out1) ## pc1 <- predict(c1,X=1,se=0) ## plot(pc1) ## ## dl <- d[!d$truncated,] ## dl <- ipw2(dl,cluster="id",same.cens=TRUE,time="time",entrytime="entry",cause="cause", ## strata="strata",prec.factor=100) ## cl <- timereg::comp.risk(Event(time,cause)~+1+ ## cluster(id), ## data=dl,cause=1,model="fg", ## weights=dl$indi.weights,cens.weights=rep(1,nrow(dl)), ## times=times,max.clust=NULL,n.sim=0) ## pcl <- predict(cl,X=1,se=0) ## lines(pcl$time,pcl$P1,col=2) ## mm=model.matrix(~-1+factor(zyg),data=dl) ## out2<-random_cif(cl,data=dl,cause1=1,cause2=1,theta.des=mm, ## weights=dl$weights,censoring.weights=rep(1,nrow(dl))) ## summary(out2)
Extracts unique event times from survival data, optionally restricting to concordant pairs or specific causes.
jumptimes( time, status = TRUE, id, cause, sample, sample.all = TRUE, strata = NULL, num = NULL, ... )jumptimes( time, status = TRUE, id, cause, sample, sample.all = TRUE, strata = NULL, num = NULL, ... )
time |
vector of event/censoring times. |
status |
vector of status indicators (default TRUE = event). |
id |
optional cluster identifier for concordant pair times. |
cause |
optional cause value to select. |
sample |
optional maximum number of time points to return. |
sample.all |
logical; if TRUE and sampling, include remaining times. |
strata |
not used. |
num |
optional within-cluster numbering variable. |
... |
additional arguments. |
Sorted vector of event times.
Computes Kaplan-Meier estimates with robust standard errors.
Robust variance is the default and is obtained from the predict call.
km(formula, data = data, km = TRUE, ...)km(formula, data = data, km = TRUE, ...)
formula |
Formula with 'Surv' or 'Event' outcome. |
data |
Data frame. |
km |
Logical; if |
... |
Additional arguments passed to |
An object of class "km" (extends "predictphreg") containing:
surr |
Survival probabilities. |
se.surv |
Standard errors. |
lower, upper
|
Confidence intervals. |
Thomas Scheike
data(sTRACE) sTRACE$cluster <- sample(1:100, 500, replace = TRUE) out1 <- km(Surv(time, status == 9) ~ strata(vf, chf), data = sTRACE) out2 <- km(Surv(time, status == 9) ~ strata(vf, chf) + cluster(cluster), data = sTRACE) summary(out1, times = 1:3) summary(out2, times = 1:3) par(mfrow = c(1, 2)) plot(out1, se = TRUE) plot(out2, se = TRUE)data(sTRACE) sTRACE$cluster <- sample(1:100, 500, replace = TRUE) out1 <- km(Surv(time, status == 9) ~ strata(vf, chf), data = sTRACE) out2 <- km(Surv(time, status == 9) ~ strata(vf, chf) + cluster(cluster), data = sTRACE) summary(out1, times = 1:3) summary(out2, times = 1:3) par(mfrow = c(1, 2)) plot(out1, se = TRUE) plot(out2, se = TRUE)
Life-course plot for event life data with recurrent events
lifecourse( formula, data, id = "id", group = NULL, type = "l", lty = 1, col = 1:10, alpha = 0.3, lwd = 1, recurrent.col = NULL, recurrent.lty = NULL, legend = NULL, pchlegend = NULL, by = NULL, status.legend = NULL, place.sl = "bottomright", xlab = "Time", ylab = "", add = FALSE, ... )lifecourse( formula, data, id = "id", group = NULL, type = "l", lty = 1, col = 1:10, alpha = 0.3, lwd = 1, recurrent.col = NULL, recurrent.lty = NULL, legend = NULL, pchlegend = NULL, by = NULL, status.legend = NULL, place.sl = "bottomright", xlab = "Time", ylab = "", add = FALSE, ... )
formula |
Formula (Event(start,slut,status) ~ ...) |
data |
data.frame |
id |
Id variable |
group |
group variable |
type |
Type (line 'l', stair 's', ...) |
lty |
Line type |
col |
Colour |
alpha |
transparency (0-1) |
lwd |
Line width |
recurrent.col |
col of recurrence type |
recurrent.lty |
lty's of of recurrence type |
legend |
position of optional id legend |
pchlegend |
point type legends |
by |
make separate plot for each level in 'by' (formula, name of column, or vector) |
status.legend |
Status legend |
place.sl |
Placement of status legend |
xlab |
Label of X-axis |
ylab |
Label of Y-axis |
add |
Add to existing device |
... |
Additional arguments to lower level arguments |
Thomas Scheike, Klaus K. Holst
data = data.frame(id=c(1,1,1,2,2),start=c(0,1,2,3,4),slut=c(1,2,4,4,7), type=c(1,2,3,2,3),status=c(0,1,2,1,2),group=c(1,1,1,2,2)) ll = lifecourse(Event(start,slut,status)~id,data,id="id") ll = lifecourse(Event(start,slut,status)~id,data,id="id",recurrent.col="type") ll = lifecourse(Event(start,slut,status)~id,data,id="id",group=~group,col=1:2) op <- par(mfrow=c(1,2)) ll = lifecourse(Event(start,slut,status)~id,data,id="id",by=~group) par(op) legends=c("censored","pregnant","married") ll = lifecourse(Event(start,slut,status)~id,data,id="id",group=~group,col=1:2,status.legend=legends)data = data.frame(id=c(1,1,1,2,2),start=c(0,1,2,3,4),slut=c(1,2,4,4,7), type=c(1,2,3,2,3),status=c(0,1,2,1,2),group=c(1,1,1,2,2)) ll = lifecourse(Event(start,slut,status)~id,data,id="id") ll = lifecourse(Event(start,slut,status)~id,data,id="id",recurrent.col="type") ll = lifecourse(Event(start,slut,status)~id,data,id="id",group=~group,col=1:2) op <- par(mfrow=c(1,2)) ll = lifecourse(Event(start,slut,status)~id,data,id="id",by=~group) par(op) legends=c("censored","pregnant","married") ll = lifecourse(Event(start,slut,status)~id,data,id="id",group=~group,col=1:2,status.legend=legends)
Create simple life table
## S3 method for class 'matrix' lifetable(x, strata = list(), breaks = c(), weights=NULL, confint = FALSE, ...) ## S3 method for class 'formula' lifetable(x, data=parent.frame(), breaks = c(), weights=NULL, confint = FALSE, ...)## S3 method for class 'matrix' lifetable(x, strata = list(), breaks = c(), weights=NULL, confint = FALSE, ...) ## S3 method for class 'formula' lifetable(x, data=parent.frame(), breaks = c(), weights=NULL, confint = FALSE, ...)
x |
time formula (Surv) or matrix/data.frame with columns time,status or entry,exit,status |
strata |
strata |
breaks |
time intervals |
weights |
weights variable |
confint |
if TRUE 95% confidence limits are calculated |
... |
additional arguments to lower level functions |
data |
data.frame |
Klaus K. Holst
library(timereg) data(TRACE) d <- with(TRACE,lifetable(Surv(time,status==9)~sex+vf,breaks=c(0,0.2,0.5,8.5))) lava::estimate(glm(events ~ offset(log(atrisk))+factor(int.end)*vf + sex*vf, data=d,poisson))library(timereg) data(TRACE) d <- with(TRACE,lifetable(Surv(time,status==9)~sex+vf,breaks=c(0,0.2,0.5,8.5))) lava::estimate(glm(events ~ offset(log(atrisk))+factor(int.end)*vf + sex*vf, data=d,poisson))
Simple linear spline
LinSpline(x, knots, num = TRUE, name = "Spline")LinSpline(x, knots, num = TRUE, name = "Spline")
x |
variable to make into spline |
knots |
cut points |
num |
to give names x1 x2 and so forth |
name |
name of spline expansion name.1 name.2 and so forth |
Thomas Scheike
Fits a semiparametric proportional odds model where:
Thus, covariate effects represent the odds ratio (OR) of survival.
logitSurv(formula, data, offset = NULL, weights = NULL, ...)logitSurv(formula, data, offset = NULL, weights = NULL, ...)
formula |
Formula with 'Surv' outcome (similar to |
data |
Data frame. |
offset |
Offsets for |
weights |
Weights for score equations. |
... |
Additional arguments passed to lower-level functions. |
This is equivalent to using a hazards model:
where is gamma distributed with mean and variance 1.
An object of class "phreg" with propodds=1.
Thomas Scheike
Eriksson, Frank, Li, Jianing, Scheike, Thomas, and Zhang, Mei-Jie (2015). "The proportional odds cumulative incidence model for competing risks." Biometrics, 71(3), 687–695.
data(TRACE) dcut(TRACE) <- ~. out1 <- logitSurv(Surv(time, status == 9) ~ vf + chf + strata(wmicat.4), data = TRACE) summary(out1) gof(out1) plot(out1)data(TRACE) dcut(TRACE) <- ~. out1 <- logitSurv(Surv(time, status == 9) ~ vf + chf + strata(wmicat.4), data = TRACE) summary(out1) gof(out1) plot(out1)
Mediation analysis in survival context with robust standard errors taking the weights into account via influence function computations. Mediator and exposure must be factors. This is based on numerical derivative wrt parameters for weighting. See vignette for more examples.
mediatorSurv( survmodel, weightmodel, data = data, wdata = wdata, id = "id", silent = TRUE, ... )mediatorSurv( survmodel, weightmodel, data = data, wdata = wdata, id = "id", silent = TRUE, ... )
survmodel |
with mediation model (binreg, aalenMets, phreg) |
weightmodel |
mediation model |
data |
for computations |
wdata |
weighted data expansion for computations |
id |
name of id variable, important for SE computations |
silent |
to be silent |
... |
Additional arguments to survival model |
Thomas Scheike
## Reduce Ex.Timings n <- 400 dat <- kumarsimRCT(n,rho1=0.5,rho2=0.5,rct=2,censpar=c(0,0,0,0), beta = c(-0.67, 0.59, 0.55, 0.25, 0.98, 0.18, 0.45, 0.31), treatmodel = c(-0.18, 0.56, 0.56, 0.54),restrict=1) dfactor(dat) <- dnr.f~dnr dfactor(dat) <- gp.f~gp drename(dat) <- ttt24~"ttt24*" dat$id <- 1:n dat$ftime <- 1 weightmodel <- fit <- glm(gp.f~dnr.f+preauto+ttt24,data=dat,family=binomial) wdata <- medweight(fit,data=dat) ### fitting models with and without mediator aaMss2 <- binreg(Event(time,status)~gp+dnr+preauto+ttt24+cluster(id),data=dat,time=50,cause=2) aaMss22 <- binreg(Event(time,status)~dnr+preauto+ttt24+cluster(id),data=dat,time=50,cause=2) ### estimating direct and indirect effects (under strong strong assumptions) aaMss <- binreg(Event(time,status)~dnr.f0+dnr.f1+preauto+ttt24+cluster(id), data=wdata,time=50,weights=wdata$weights,cause=2) ## to compute standard errors , requires numDeriv ll <- mediatorSurv(aaMss,fit,data=dat,wdata=wdata) summary(ll) ## not run bootstrap (to save time) ## bll <- BootmediatorSurv(aaMss,fit,data=dat,k.boot=500)## Reduce Ex.Timings n <- 400 dat <- kumarsimRCT(n,rho1=0.5,rho2=0.5,rct=2,censpar=c(0,0,0,0), beta = c(-0.67, 0.59, 0.55, 0.25, 0.98, 0.18, 0.45, 0.31), treatmodel = c(-0.18, 0.56, 0.56, 0.54),restrict=1) dfactor(dat) <- dnr.f~dnr dfactor(dat) <- gp.f~gp drename(dat) <- ttt24~"ttt24*" dat$id <- 1:n dat$ftime <- 1 weightmodel <- fit <- glm(gp.f~dnr.f+preauto+ttt24,data=dat,family=binomial) wdata <- medweight(fit,data=dat) ### fitting models with and without mediator aaMss2 <- binreg(Event(time,status)~gp+dnr+preauto+ttt24+cluster(id),data=dat,time=50,cause=2) aaMss22 <- binreg(Event(time,status)~dnr+preauto+ttt24+cluster(id),data=dat,time=50,cause=2) ### estimating direct and indirect effects (under strong strong assumptions) aaMss <- binreg(Event(time,status)~dnr.f0+dnr.f1+preauto+ttt24+cluster(id), data=wdata,time=50,weights=wdata$weights,cause=2) ## to compute standard errors , requires numDeriv ll <- mediatorSurv(aaMss,fit,data=dat,wdata=wdata) summary(ll) ## not run bootstrap (to save time) ## bll <- BootmediatorSurv(aaMss,fit,data=dat,k.boot=500)
Computes mediation weights for either binary or multinomial mediators. The important part is that the influence functions can be obtained to compute standard errors.
medweight( fit, data = data, var = NULL, name.weight = "weights", id.name = "id", ... )medweight( fit, data = data, var = NULL, name.weight = "weights", id.name = "id", ... )
fit |
either glm-binomial or mlogit (mets package) |
data |
data frame with data |
var |
is NULL reads mediator and exposure from formulae in the fit. |
name.weight |
name of weights |
id.name |
name of id variable, important for SE computations |
... |
Additional arguments to |
Thomas Scheike
The melanoma data frame has 205 rows and 7 columns. It contains data relating to survival of patients after operation for malignant melanoma collected at Odense University Hospital by K.T. Drzewiecki.
This data frame contains the following columns:
a numeric vector. Patient code.
a numeric vector code. Survival status. 1: dead from melanoma, 2: alive, 3: dead from other cause.
a numeric vector. Survival time.
a numeric vector code. Ulceration, 1: present, 0: absent.
a numeric vector. Tumour thickness (1/100 mm).
a numeric vector code. 0: female, 1: male.
Andersen, P.K., Borgan O, Gill R.D., Keiding N. (1993), Statistical Models Based on Counting Processes, Springer-Verlag.
Drzewiecki, K.T., Ladefoged, C., and Christensen, H.E. (1980), Biopsy and prognosis for cutaneous malignant melanoma in clinical stage I. Scand. J. Plast. Reconstru. Surg. 14, 141-144.
data(melanoma) names(melanoma)data(melanoma) names(melanoma)
mets
Extract and set global parameters of mets.
mets.options(...)mets.options(...)
... |
Arguments |
regex: If TRUE character vectors will be interpreted as regular expressions (dby, dcut, ...)
silent: Set to FALSE to disable various output messages
list of parameters
## Not run: mets.options(regex=TRUE) ## End(Not run)## Not run: mets.options(regex=TRUE) ## End(Not run)
Fits a multinomial regression model for a categorical outcome with levels:
for , where (baseline category).
mlogit(formula, data, offset = NULL, weights = NULL, fix.X = FALSE, ...)mlogit(formula, data, offset = NULL, weights = NULL, fix.X = FALSE, ...)
formula |
Formula with the outcome (similar to |
data |
Data frame containing the variables. |
offset |
Offsets for the partial likelihood. |
weights |
Weights for the score equations. |
fix.X |
Logical; if |
... |
Additional arguments passed to lower-level functions. |
This ensures that . The model is fitted using the phreg function
by expanding the data into a long format with strata for each category.
The coefficients represent the log-Relative-Risk (log-RR) relative to the baseline group
(the first level of the factor, which can be reset using relevel).
Standard errors are computed based on the sandwich form:
where is the score vector and is the derivative matrix.
Influence functions (possibly robust) can be obtained via the iid() function.
The response variable should be a factor.
Can also fit a cumulative odds model as a special case of interval_logitsurv_discrete.
An object of class "mlogit" (extending "phreg") containing:
coef |
Matrix of estimated coefficients (rows correspond to categories, columns to covariates). |
var |
Robust variance-covariance matrix. |
iid |
Influence functions for the coefficients. |
nlev |
Number of levels in the outcome. |
px |
Number of covariates. |
Thomas Scheike
data(bmt) bmt$id <- sample(200,408,replace=TRUE) dfactor(bmt) <- cause1f~cause drelevel(bmt,ref=3) <- cause3f~cause dlevels(bmt) mreg <- mlogit(cause1f~+1+cluster(id),bmt) summary(mreg) head(iid(mreg)) dim(iid(mreg)) mreg <- mlogit(cause1f~tcell+platelet,bmt) summary(mreg) head(iid(mreg)) dim(iid(mreg)) mreg3 <- mlogit(cause3f~tcell+platelet,bmt) summary(mreg3) ## inverse information standard errors lava::estimate(coef=mreg3$coef,vcov=mreg3$II) ## predictions based on seen response or not ## all probabilities head(predict(mreg,response=FALSE)) head(predict(mreg)) ## using newdata newdata <- data.frame(tcell=c(1,1,1),platelet=c(0,1,1),cause1f=c("2","2","0")) ## only probability of seen response predict(mreg,newdata) ## without response predict(mreg,newdata,response=FALSE) ## given indexx of P(Y=j) predict(mreg,newdata,Y=c(1,2,3)) ## reponse not given newdata <- data.frame(tcell=c(1,1,1),platelet=c(0,1,1)) predict(mreg,newdata)data(bmt) bmt$id <- sample(200,408,replace=TRUE) dfactor(bmt) <- cause1f~cause drelevel(bmt,ref=3) <- cause3f~cause dlevels(bmt) mreg <- mlogit(cause1f~+1+cluster(id),bmt) summary(mreg) head(iid(mreg)) dim(iid(mreg)) mreg <- mlogit(cause1f~tcell+platelet,bmt) summary(mreg) head(iid(mreg)) dim(iid(mreg)) mreg3 <- mlogit(cause3f~tcell+platelet,bmt) summary(mreg3) ## inverse information standard errors lava::estimate(coef=mreg3$coef,vcov=mreg3$II) ## predictions based on seen response or not ## all probabilities head(predict(mreg,response=FALSE)) head(predict(mreg)) ## using newdata newdata <- data.frame(tcell=c(1,1,1),platelet=c(0,1,1),cause1f=c("2","2","0")) ## only probability of seen response predict(mreg,newdata) ## without response predict(mreg,newdata,response=FALSE) ## given indexx of P(Y=j) predict(mreg,newdata,Y=c(1,2,3)) ## reponse not given newdata <- data.frame(tcell=c(1,1,1),platelet=c(0,1,1)) predict(mreg,newdata)
Multivariate Cumulative Incidence Function example data set
Simulated data
Computes concordance probability (joint probability of both subjects experiencing the event) given dependence parameters and random-effect variance structures from a twostage model.
Computes concordance probabilities for twin ACE/ADE models from a binomial twostage object.
Functions for constructing random-effects design matrices for twin and family models. These designs specify the genetic (A), dominance (D), common environment (C), and unique environment (E) variance components.
p11_binomial_twostage_RV( theta, rv1, rv2, p1, p2, pardes, ags = NULL, link = 0, i = 1, j = 1 ) concordanceTwostage( theta, p, rv1, rv2, theta.des, additive.gamma.sum = NULL, link = 0, var.par = 0, ... ) concordanceTwinACE( object, rv1 = NULL, rv2 = NULL, xmarg = NULL, type = "ace", ... ) kendall_ClaytonOakes_twin_ace(parg, parc, K = 10000, test = 0) kendall.ClaytonOakes.twin.ace(x, y, ...) kendall_normal_twin_ace(parg, parc, K = 10000) ascertained_pairs(pairs, data, cr.models, bin = FALSE) twin.polygen.design(x, ...) ace_family_design( data, id = "id", member = "type", mother = "mother", father = "father", child = "child", child1 = "child", type = "ace", ... ) make_pairwise_design(pairs, kinship, type = "ace")p11_binomial_twostage_RV( theta, rv1, rv2, p1, p2, pardes, ags = NULL, link = 0, i = 1, j = 1 ) concordanceTwostage( theta, p, rv1, rv2, theta.des, additive.gamma.sum = NULL, link = 0, var.par = 0, ... ) concordanceTwinACE( object, rv1 = NULL, rv2 = NULL, xmarg = NULL, type = "ace", ... ) kendall_ClaytonOakes_twin_ace(parg, parc, K = 10000, test = 0) kendall.ClaytonOakes.twin.ace(x, y, ...) kendall_normal_twin_ace(parg, parc, K = 10000) ascertained_pairs(pairs, data, cr.models, bin = FALSE) twin.polygen.design(x, ...) ace_family_design( data, id = "id", member = "type", mother = "mother", father = "father", child = "child", child1 = "child", type = "ace", ... ) make_pairwise_design(pairs, kinship, type = "ace")
theta |
dependence parameter vector. |
rv1 |
random-effects design for subject 1. |
rv2 |
random-effects design for subject 2. |
p1 |
marginal probability for subject 1. |
p2 |
marginal probability for subject 2. |
pardes |
parameter design matrix. |
ags |
additive gamma sum matrix (optional). |
link |
link function indicator (0 = identity, 1 = log). |
i |
index for subject 1. |
j |
index for subject 2. |
p |
matrix of marginal probabilities (n x 2). |
theta.des |
parameter design matrix linking theta to lambda parameters. |
additive.gamma.sum |
optional matrix for additive gamma sums. |
var.par |
if 1, parameters are rescaled by sum squared. |
... |
additional arguments. |
object |
a binomial twostage model object. |
xmarg |
optional covariate values for marginal probabilities. |
type |
model type: |
parg |
genetic variance parameter (gamma shape for genetic component). |
parc |
common environment variance parameter (gamma shape for environment). |
K |
number of simulated twin pairs (multiplied by 2 internally). |
test |
if 1, prints diagnostic correlations. |
x |
passed as |
y |
passed as |
pairs |
matrix of pair indices (n x 2). |
data |
a data.frame with twin/family data. |
cr.models |
formula specifying time and status variables. |
bin |
logical; if TRUE uses binary (prevalence) ordering rather than time ordering. |
id |
character name of the cluster (pair) identifier column. |
member |
character name of the family member type column. |
mother |
value identifying mothers in the member column. |
father |
value identifying fathers in the member column. |
child |
value identifying children in the member column. |
child1 |
column name distinguishing first child from second. |
kinship |
vector of kinship coefficients for each pair. |
twin_polygen_design creates a polygenic random-effects design for
twin pairs, distinguishing MZ and DZ twins.
twin.polygen.design is an alias for twin_polygen_design.
ace_family_design creates designs for nuclear families (mother,
father, children).
make_pairwise_design creates pairwise random-effects designs for
arbitrary kinship structures.
concordanceTwostage computes concordance probabilities from a
twostage model.
concordanceTwinACE computes concordance from a twin ACE model.
kendall_ClaytonOakes_twin_ace and kendall_normal_twin_ace
compute Kendall's tau for Clayton-Oakes and normal-frailty twin ACE models
respectively.
ascertained_pairs identifies ascertained (affected) pairs in
clustered survival data.
p11_binomial_twostage_RV computes the joint probability P(T1<=t, T2<=t)
for the additive gamma binary random effects model.
A list of concordance tables, one per pair, each containing
pmat (2x2 probability matrix), casewise (casewise concordance),
and marg (marginal probabilities).
A list of concordance tables per zygosity group.
A list with components:
pardes |
parameter design matrix linking random effects to variance parameters. |
des.rv |
random-effects design matrix for subjects. |
Klaus K. Holst, Thomas Scheike
Fits a Cox proportional hazards model using a fast C++ backend. Robust variance (sandwich estimator) is the default variance estimate in the summary.
phreg(formula, data, offset = NULL, weights = NULL, ...)phreg(formula, data, offset = NULL, weights = NULL, ...)
formula |
Formula with a 'Surv' or 'Event' outcome (similar to |
data |
Data frame containing the variables. |
offset |
Offsets for the Cox model linear predictor. |
weights |
Weights for the Cox score equations. |
... |
Additional arguments passed to lower-level functions (e.g., optimization controls). |
The influence functions (IID decomposition) follow the numerical order of the given cluster variable. Ordering the results by '$id' will align the IID terms with the original dataset order.
An object of class "phreg" containing:
coef |
Vector of estimated coefficients. |
var |
Robust variance-covariance matrix. |
beta.iid |
Influence functions for the regression coefficients. |
cumhaz |
Matrix of cumulative hazard estimates (time, cumhaz). |
se.cumhaz |
Matrix of standard errors for the cumulative hazard. |
cox.prep |
List containing preprocessed data for the Cox model. |
opt |
Optimization results (if optimization was performed). |
call |
The matched call. |
Klaus K. Holst, Thomas Scheike
plot.phreg, predict.phreg, robust_phreg
data(TRACE) dcut(TRACE) <- ~. # Fit model with clustering out1 <- phreg(Surv(time, status == 9) ~ wmi + age + strata(vf, chf) + cluster(id), data = TRACE) summary(out1) # Plotting baselines par(mfrow = c(1, 2)) plot(out1) # Computing robust variance for baseline rob1 <- robust_phreg(out1) plot(rob1, se = TRUE, robust = TRUE) # IID decomposition for regression parameters head(iid(out1)) # IID decomposition for baseline at a specific time-point Aiiid <- iid(out1, time = 30) head(Aiiid) # Combined IID decomposition (beta and baseline) dd <- iidBaseline(out1, time = 30) head(dd$beta.iid) head(dd$base.iid) # Stratified model outs <- phreg(Surv(time, status == 9) ~ strata(vf, wmicat.4) + cluster(id), data = TRACE) summary(outs) par(mfrow = c(1, 2)) plot(outs)data(TRACE) dcut(TRACE) <- ~. # Fit model with clustering out1 <- phreg(Surv(time, status == 9) ~ wmi + age + strata(vf, chf) + cluster(id), data = TRACE) summary(out1) # Plotting baselines par(mfrow = c(1, 2)) plot(out1) # Computing robust variance for baseline rob1 <- robust_phreg(out1) plot(rob1, se = TRUE, robust = TRUE) # IID decomposition for regression parameters head(iid(out1)) # IID decomposition for baseline at a specific time-point Aiiid <- iid(out1, time = 30) head(Aiiid) # Combined IID decomposition (beta and baseline) dd <- iidBaseline(out1, time = 30) head(dd$beta.iid) head(dd$base.iid) # Stratified model outs <- phreg(Surv(time, status == 9) ~ strata(vf, wmicat.4) + cluster(id), data = TRACE) summary(outs) par(mfrow = c(1, 2)) plot(outs)
Fits a Cox model with treatment weights
, where
.
phreg_IPTW( formula, data, treat.model = NULL, treat.var = NULL, weights = NULL, estpr = 1, pi0 = 0.5, se.cluster = NULL, ... )phreg_IPTW( formula, data, treat.model = NULL, treat.var = NULL, weights = NULL, estpr = 1, pi0 = 0.5, se.cluster = NULL, ... )
formula |
Formula for |
data |
Data frame for risk averaging. |
treat.model |
Propensity score model (binary or multinomial). |
treat.var |
A 1/0 variable indicating when treatment is given (for time-dependent weights). |
weights |
Optional weights to multiply with the IPTW weights. |
estpr |
(=1, default) to estimate propensity scores and include their uncertainty in the influence function. |
pi0 |
Fixed simple weights (if |
se.cluster |
To compute GEE-type standard errors when additional cluster structure is present. |
... |
Arguments for |
Standard errors are computed via influence functions that are returned as the IID argument.
Propensity scores are fitted using either logistic regression (glm) or the multinomial
model (mlogit) when there are more than two treatment categories.
The treatment variable must be a factor and is identified on the RHS of the treat.model.
Recurrent events can be considered with a start-stop structure, requiring cluster(id).
Robust standard errors are computed in all cases.
Time-dependent propensity score weights can be computed when treat.var is used. This weight
be 1 at the time of first (A_0) and 2nd treatment (A_1), then uses weights
where
is
time of 2nd randomization. The weights are constructed using a glm or mlogit model based on the data where
treat.var=1. The propensity score can be constructed for any number of treatments in a similar manner.
An object of class "phreg" with additional IPTW components:
IID |
Influence functions including propensity score uncertainty. |
iptw |
IPTW weights used. |
naive.var |
Naive variance ignoring propensity score uncertainty. |
Thomas Scheike
data <- mets:::simLT(0.7, 100, beta = 0.3, betac = 0, ce = 1, betao = 0.3) dfactor(data) <- Z.f ~ Z out <- phreg_IPTW(Surv(time, status) ~ Z.f, data = data, treat.model = Z.f ~ X) summary(out)data <- mets:::simLT(0.7, 100, beta = 0.3, betac = 0, ce = 1, betao = 0.3) dfactor(data) <- Z.f ~ Z out <- phreg_IPTW(Surv(time, status) ~ Z.f, data = data, treat.model = Z.f ~ X) summary(out)
Efficient implementation of the Lu-Tsiatis improvement using baseline covariates,
extended to competing risks and recurrent events. The results are almost equivalent
to the speffSurv function of the speff2trial package in the survival case.
A dynamic censoring augmentation regression is also computed to gain additional
efficiency from the censoring augmentation. The function handles two-stage
randomizations and recurrent events (start,stop) with cluster structure.
phreg_rct( formula, data, cause = 1, cens.code = 0, typesR = c("R0", "R1", "R01"), typesC = c("C", "dynC"), weights = NULL, augmentR0 = NULL, augmentR1 = NULL, augmentC = NULL, treat.model = ~+1, RCT = TRUE, treat.var = NULL, km = TRUE, level = 0.95, cens.model = NULL, estpr = 1, pi0 = 0.5, base.augment = FALSE, return.augmentR0 = FALSE, mlogit = FALSE, ... )phreg_rct( formula, data, cause = 1, cens.code = 0, typesR = c("R0", "R1", "R01"), typesC = c("C", "dynC"), weights = NULL, augmentR0 = NULL, augmentR1 = NULL, augmentC = NULL, treat.model = ~+1, RCT = TRUE, treat.var = NULL, km = TRUE, level = 0.95, cens.model = NULL, estpr = 1, pi0 = 0.5, base.augment = FALSE, return.augmentR0 = FALSE, mlogit = FALSE, ... )
formula |
Formula with |
data |
Data frame containing all variables referenced in the formula. |
cause |
Numeric code for the event of interest in competing risks or recurrent events. |
cens.code |
Numeric code for censoring in competing risks or recurrent events. |
typesR |
Character vector specifying augmentation types for randomization (options: "R0" for baseline, "R1" for post-baseline, "R01" for both). |
typesC |
Character vector specifying augmentation types for censoring (options: "C" for static, "dynC" for dynamic). |
weights |
Weights to be used for |
augmentR0 |
Formula for the first randomization augmentation (e.g., |
augmentR1 |
Formula for the second randomization augmentation (e.g., |
augmentC |
Formula for the censoring augmentation (e.g., |
treat.model |
Propensity score model formula (default is |
RCT |
Logical; if FALSE, uses propensity score adjustment for marginal model. |
treat.var |
Variable indicating treatment times in two-stage randomization. |
km |
Logical; use Kaplan-Meier for censoring weights (stratified on treatment). |
level |
Confidence level for intervals (default 0.95). |
cens.model |
Censoring model formula (default is |
estpr |
Numeric code (1/0); estimate propensity scores or not (default TRUE). |
pi0 |
Fixed propensity scores for randomizations (if not estimating). |
base.augment |
Logical; covariate augment baselines (only for R0 augmentation). |
return.augmentR0 |
Logical; return augmentation data. |
mlogit |
Logical; use multinomial logistic regression for propensity scores. |
... |
Additional arguments passed to |
The method improves the efficiency of the log-rank test by utilizing auxiliary baseline covariates to reduce variance, particularly useful in randomized clinical trials (RCTs) where covariate adjustment can increase power.
An object of class "phreg_rct" containing:
coefs |
Coefficient estimates for the treatment effect. |
iid |
Influence function (IID) decomposition for variance estimation. |
AugR0, AugR1, AugCdyn, AugClt
|
Augmentation terms for different strategies. |
cumhaz |
Cumulative hazards. |
var |
Variance-covariance matrix. |
se |
Standard errors. |
call |
Original function call. |
formula |
Formula used. |
data |
The data used (if requested). |
The object includes results for different augmentation combinations (R0, R1, R01, C, dynC).
Thomas Scheike
Lu, T. and Tsiatis, A. A. (2008), Improving the efficiency of the log-rank test using auxiliary covariates, Biometrika, 95, 679–694.
Scheike, T. H., Nerstroem, C. and Martinussen, T. (2026), Randomized clinical trials and the proportional hazards model for recurrent events, TEST.
## Lu, Tsiatis simulation data <- mets:::simLT(0.7,100) dfactor(data) <- Z.f~Z out <- phreg_rct(Surv(time,status)~Z.f,data=data,augmentR0=~X,augmentC=~factor(Z):X) summary(out)## Lu, Tsiatis simulation data <- mets:::simLT(0.7,100) dfactor(data) <- Z.f~Z out <- phreg_rct(Surv(time,status)~Z.f,data=data,augmentR0=~X,augmentC=~factor(Z):X) summary(out)
Fits a Cox-Weibull with cumulative hazard given by
where is the shape parameter, and
the rate parameter. We here allow a regression model for
both parameters
as defined by 'formula' and 'shape.formula' respectively.
phreg_weibull( formula, shape.formula = ~1, data, save.data = TRUE, control = list() )phreg_weibull( formula, shape.formula = ~1, data, save.data = TRUE, control = list() )
formula |
Formula for proportional hazards. The right-handside must be an [Event] or [Surv] object (with right-censoring and possibly delayed entry). |
shape.formula |
Formula for shape parameter |
data |
data.frame |
save.data |
if TRUE the data.frame is stored in the model object (for predictions and simulations) |
control |
control arguments to optimization routine [stats::nlmbin] |
The parametrization
'phreg.par' object
Klaus Kähler Holst, Thomas Scheike
[mets::phreg()]
data(sTRACE, package="mets") sTRACE$entry <- 0 fit1 <- phreg_weibull(Event(entry, time, status == 9) ~ age, shape.formula = ~age, data = sTRACE) tt <- seq(0,10, length.out=100) pr1 <- predict(fit1, newdata = sTRACE[1, ], times = tt) fit2 <- phreg(Event(time, status == 9) ~ age, data = sTRACE) pr2 <- predict(fit2, newdata = sTRACE[1, ], se = FALSE) if (interactive()) { plot(pr2$times, pr2$surv, type="s") lines(tt, pr1[,1,1], col="red", lwd=2) }data(sTRACE, package="mets") sTRACE$entry <- 0 fit1 <- phreg_weibull(Event(entry, time, status == 9) ~ age, shape.formula = ~age, data = sTRACE) tt <- seq(0,10, length.out=100) pr1 <- predict(fit1, newdata = sTRACE[1, ], times = tt) fit2 <- phreg(Event(time, status == 9) ~ age, data = sTRACE) pr2 <- predict(fit2, newdata = sTRACE[1, ], se = FALSE) if (interactive()) { plot(pr2$times, pr2$surv, type="s") lines(tt, pr1[,1,1], col="red", lwd=2) }
.. content for description (no empty lines) ..
plack_cif(cif1, cif2, object)plack_cif(cif1, cif2, object)
cif1 |
Cumulative incidence of first argument. |
cif2 |
Cumulative incidence of second argument. |
object |
or.cif object with dependence parameters. |
Thomas Scheike
Scatterplot with contours of the (kernel) estimated density
plot_twin( data, marginal.args = list(), kernsmooth.args = list(), xlab, ylab, col = "black", col2 = "lightblue", alpha = 0.3, grid = TRUE, side.plot = TRUE, ... )plot_twin( data, marginal.args = list(), kernsmooth.args = list(), xlab, ylab, col = "black", col2 = "lightblue", alpha = 0.3, grid = TRUE, side.plot = TRUE, ... )
data |
bivariate data to plot (data.frame or matrix with 2 columns) |
marginal.args |
argumemts to marginal estimator ('density' continuous data, 'barplot' for categorical ) |
kernsmooth.args |
arguments to 2d-kernel smoother |
xlab |
x-axis label |
ylab |
y-axis label |
col |
color of points |
col2 |
color of contour / density plot |
alpha |
transparency level of points |
grid |
should grid be added to the plot |
side.plot |
If TRUE subplots of the marginal distributions are added to the plot |
... |
arguments to lower level plot functions |
Klaus Kähler Holst
data("twinbmi", package="mets") twinwide <- fast.reshape(twinbmi, id="tvparnr",varying=c("bmi")) datamz <- log(subset(twinwide, zyg=="MZ")[,c("bmi1","bmi2")]) # continuous variables plot_twin(datamz) # categorical variables datamz2 <- datamz datamz2[, 1] <- cut(datamz[, 1], 4) datamz2[, 2] <- cut(datamz[, 2], 4) plot_twin(datamz2, color = TRUE) # survival variables cens1 <- rbinom(nrow(datamz), 1, 0.5) cens2 <- rbinom(nrow(datamz), 1, 0.5) datamz2[, 1] <- Event(datamz[, 1], cens1) datamz2[, 2] <- suppressWarnings(Event(datamz[, 2], cens2)) plot_twin(datamz2) rm(datamz, datamz2, cens1, cens2)data("twinbmi", package="mets") twinwide <- fast.reshape(twinbmi, id="tvparnr",varying=c("bmi")) datamz <- log(subset(twinwide, zyg=="MZ")[,c("bmi1","bmi2")]) # continuous variables plot_twin(datamz) # categorical variables datamz2 <- datamz datamz2[, 1] <- cut(datamz[, 1], 4) datamz2[, 2] <- cut(datamz[, 2], 4) plot_twin(datamz2, color = TRUE) # survival variables cens1 <- rbinom(nrow(datamz), 1, 0.5) cens2 <- rbinom(nrow(datamz), 1, 0.5) datamz2[, 1] <- Event(datamz[, 1], cens1) datamz2[, 2] <- suppressWarnings(Event(datamz[, 2], cens2)) plot_twin(datamz2) rm(datamz, datamz2, cens1, cens2)
Plotting the baselines of stratified Cox
## S3 method for class 'phreg' plot(x, ...)## S3 method for class 'phreg' plot(x, ...)
x |
phreg object |
... |
Additional arguments to baseplot funtion |
Klaus K. Holst, Thomas Scheike
data(TRACE) dcut(TRACE) <- ~. out1 <- phreg(Surv(time,status==9)~vf+chf+strata(wmicat.4),data=TRACE) par(mfrow=c(2,2)) plot(out1) plot(out1,stratas=c(0,3)) plot(out1,stratas=c(0,3),col=2:3,lty=1:2,se=TRUE) plot(out1,stratas=c(0),col=2,lty=2,se=TRUE,polygon=FALSE) plot(out1,stratas=c(0),col=matrix(c(2,1,3),1,3),lty=matrix(c(1,2,3),1,3),se=TRUE,polygon=FALSE)data(TRACE) dcut(TRACE) <- ~. out1 <- phreg(Surv(time,status==9)~vf+chf+strata(wmicat.4),data=TRACE) par(mfrow=c(2,2)) plot(out1) plot(out1,stratas=c(0,3)) plot(out1,stratas=c(0,3),col=2:3,lty=1:2,se=TRUE) plot(out1,stratas=c(0),col=2,lty=2,se=TRUE,polygon=FALSE) plot(out1,stratas=c(0),col=matrix(c(2,1,3),1,3),lty=matrix(c(1,2,3),1,3),se=TRUE,polygon=FALSE)
Multivariate normal distribution function
pmvn(lower, upper, mu, sigma, cor = FALSE)pmvn(lower, upper, mu, sigma, cor = FALSE)
lower |
lower limits |
upper |
upper limits |
mu |
mean vector |
sigma |
variance matrix or vector of correlation coefficients |
cor |
if TRUE sigma is treated as standardized (correlation matrix) |
lower <- rbind(c(0,-Inf),c(-Inf,0)) upper <- rbind(c(Inf,0),c(0,Inf)) mu <- rbind(c(1,1),c(-1,1)) sigma <- diag(2)+1 pmvn(lower=lower,upper=upper,mu=mu,sigma=sigma)lower <- rbind(c(0,-Inf),c(-Inf,0)) upper <- rbind(c(Inf,0),c(0,Inf)) mu <- rbind(c(1,1),c(-1,1)) sigma <- diag(2)+1 pmvn(lower=lower,upper=upper,mu=mu,sigma=sigma)
Computes predicted probabilities for the categorical outcome based on a mlogit object.
Can return probabilities for all categories or only for the observed response.
## S3 method for class 'mlogit' predict( object, newdata, se = TRUE, response = TRUE, Y = NULL, level = 0.95, ... )## S3 method for class 'mlogit' predict( object, newdata, se = TRUE, response = TRUE, Y = NULL, level = 0.95, ... )
object |
Object of class |
newdata |
Data frame for prediction. If missing, predictions are made for the original data. |
se |
Logical; if |
response |
Logical; if |
Y |
Vector of outcome levels to predict probabilities for (optional). |
level |
Confidence level for intervals (default 0.95). |
... |
Additional arguments. |
A matrix or data frame containing:
pred |
Predicted probabilities. |
se |
Standard errors (if |
lower, upper
|
Confidence intervals (if |
If response=FALSE, columns correspond to the levels of the outcome factor.
Thomas Scheike
Computes predictions for survival probability, cumulative hazard, or risk at specified time points for new data or existing data. Includes standard errors and confidence intervals.
## S3 method for class 'phreg' predict( object, newdata, times = NULL, individual.time = FALSE, tminus = FALSE, se = TRUE, robust = FALSE, conf.type = "log", conf.int = 0.95, km = FALSE, ... )## S3 method for class 'phreg' predict( object, newdata, times = NULL, individual.time = FALSE, tminus = FALSE, se = TRUE, robust = FALSE, conf.type = "log", conf.int = 0.95, km = FALSE, ... )
object |
Object of class |
newdata |
Data frame for prediction. If |
times |
Time points for prediction. Defaults to all unique event times in the model. |
individual.time |
Logical; if |
tminus |
Logical; if |
se |
Logical; if |
robust |
Logical; if |
conf.type |
Transformation for survival estimates: |
conf.int |
Confidence level (default 0.95). |
km |
Logical; if |
... |
Additional arguments for plotting functions. |
An object of class "predictphreg" containing:
surr |
Matrix of survival probabilities. |
cumhaz |
Matrix of cumulative hazards. |
cif |
Matrix of cumulative incidence functions (if applicable). |
times |
Vector of time points. |
surv.upper, surv.lower
|
Confidence bounds for survival. |
RR |
Relative risks. |
Thomas Scheike
prints Concordance test
## S3 method for class 'casewise' print(x, digits = 3, ...)## S3 method for class 'casewise' print(x, digits = 3, ...)
x |
output from casewise.test |
digits |
number of digits |
... |
Additional arguments to lower level functions |
Thomas Scheike
Estimates as a function of time , for a range of
thresholds , in the presence of a terminal event (death). The estimator
is based on the cumulative incidence of "reaching events", treating
death as a competing risk. Confidence intervals are computed on the log or
plain scale.
prob_exceed_recurrent( formula, data, cause = 1, death.code = 2, cens.code = 0, exceed = NULL, marks = NULL, all.cifs = FALSE, return.data = FALSE, conf.type = c("log", "plain"), level = 0.95, ... )prob_exceed_recurrent( formula, data, cause = 1, death.code = 2, cens.code = 0, exceed = NULL, marks = NULL, all.cifs = FALSE, return.data = FALSE, conf.type = c("log", "plain"), level = 0.95, ... )
formula |
A formula with an |
data |
A data frame containing all variables in |
cause |
Integer code for the recurrent event of interest. Default is
|
death.code |
Integer code for the terminal event. Default is |
cens.code |
Integer code for censoring. Default is |
exceed |
Integer vector of thresholds |
marks |
Optional numeric vector of event weights. If non- |
all.cifs |
Logical. If |
return.data |
Logical. If |
conf.type |
Type of confidence interval transformation: |
level |
Confidence level. Default is |
... |
Further arguments passed to |
For each threshold in exceed, the function identifies the
first time each subject reaches events, then fits a competing risks
model (cif) with "reaching events" as the event of
interest and death as the competing event. Strata are supported. When
marks is NULL, each event contributes equally; otherwise events
are weighted by their mark values before cumulative counts are formed.
An object of class "exceed" with the following components:
time |
Vector of evaluation time points. |
prob |
Array of dimension |
se.prob |
Standard errors of |
lower, upper
|
Pointwise confidence interval bounds. |
meanN |
Estimated mean number of events |
meanN2, varN
|
Second moment and variance of |
exceed |
Thresholds evaluated (excluding zero). |
cif.exceed |
List of fitted |
dataList |
List of datasets for each threshold (if
|
nstrata, strata.levels, strata.name
|
Stratification information. |
Use plot() and summary() methods for visualisation and
tabulation.
Thomas Scheike
Scheike, T. H., Eriksson, L., and Tribler, P. (2019). The mean, variance and correlation for bivariate recurrent events with a terminal event. Journal of the Royal Statistical Society, Series C, 68(5).
data(hfactioncpx12) dtable(hfactioncpx12, ~status) oo <- prob_exceed_recurrent(Event(entry, time, status) ~ cluster(id), hfactioncpx12, cause = 1, death.code = 2) plot(oo) summary(oo, times = c(1, 2, 5))data(hfactioncpx12) dtable(hfactioncpx12, ~status) oo <- prob_exceed_recurrent(Event(entry, time, status) ~ cluster(id), hfactioncpx12, cause = 1, death.code = 2) plot(oo) summary(oo, times = c(1, 2, 5))
Fits a random effects model describing the dependence in the cumulative incidence curves for subjects within a cluster. Given the gamma distributed random effects it is assumed that the cumulative incidence curves are indpendent, and that the marginal cumulative incidence curves are on the form
We allow a regression structure for the random effects variances that may depend on cluster covariates.
random_cif( cif, data, cause = NULL, cif2 = NULL, cause1 = 1, cause2 = 1, cens.code = NULL, cens.model = "KM", Nit = 40, detail = 0, clusters = NULL, theta = NULL, theta.des = NULL, sym = 1, step = 1, same.cens = FALSE, var.link = 0, score.method = "nr", entry = NULL, trunkp = 1, ... )random_cif( cif, data, cause = NULL, cif2 = NULL, cause1 = 1, cause2 = 1, cens.code = NULL, cens.model = "KM", Nit = 40, detail = 0, clusters = NULL, theta = NULL, theta.des = NULL, sym = 1, step = 1, same.cens = FALSE, var.link = 0, score.method = "nr", entry = NULL, trunkp = 1, ... )
cif |
a model object from the comp.risk function with the marginal cumulative incidence of cause2, i.e., the event that is conditioned on, and whose odds the comparision is made with respect to |
data |
a data.frame with the variables. |
cause |
specifies the causes related to the death times, the value cens.code is the censoring value. |
cif2 |
specificies model for cause2 if different from cause1. |
cause1 |
cause of first coordinate. |
cause2 |
cause of second coordinate. |
cens.code |
specificies the code for the censoring if NULL then uses the one from the marginal cif model. |
cens.model |
specified which model to use for the ICPW, KM is Kaplan-Meier alternatively it may be "cox" |
Nit |
number of iterations for Newton-Raphson algorithm. |
detail |
if 0 no details are printed during iterations, if 1 details are given. |
clusters |
specifies the cluster structure. |
theta |
specifies starting values for the cross-odds-ratio parameters of the model. |
theta.des |
specifies a regression design for the cross-odds-ratio parameters. |
sym |
1 for symmetry 0 otherwise |
step |
specifies the step size for the Newton-Raphson algorith.m |
same.cens |
if true then censoring within clusters are assumed to be the same variable, default is independent censoring. |
var.link |
if var.link=1 then var is on log-scale. |
score.method |
default uses "nlminb" optimzer, alternatively, use the "nr" algorithm. |
entry |
entry-age in case of delayed entry. Then two causes must be given. |
trunkp |
gives probability of survival for delayed entry, and related to entry-ages given above. |
... |
extra arguments. |
returns an object of type 'cor'. With the following arguments:
theta |
estimate of proportional odds parameters of model. |
var.theta |
variance for gamma. |
hess |
the derivative of the used score. |
score |
scores at final stage. |
score |
scores at final stage. |
theta.iid |
matrix of iid decomposition of parametric effects. |
Thomas Scheike
A Semiparametric Random Effects Model for Multivariate Competing Risks Data, Scheike, Zhang, Sun, Jensen (2010), Biometrika.
Cross odds ratio Modelling of dependence for Multivariate Competing Risks Data, Scheike and Sun (2012), work in progress.
## Reduce Ex.Timings d <- sim_nordic_random(1000,delayed=TRUE,cordz=0.5,cormz=2,lam0=0.3,country=TRUE) times <- seq(50,90,by=10) add1 <- timereg::comp.risk(Event(time,cause)~-1+factor(country)+cluster(id),data=d, times=times,cause=1,max.clust=NULL) ### making group indidcator mm <- model.matrix(~-1+factor(zyg),d) out1<-random_cif(add1,data=d,cause1=1,cause2=1,theta=1,same.cens=TRUE) summary(out1) out2<-random_cif(add1,data=d,cause1=1,cause2=1,theta=1, theta.des=mm,same.cens=TRUE) summary(out2) ######################################### ##### 2 different causes ######################################### add2 <- timereg::comp.risk(Event(time,cause)~-1+factor(country)+cluster(id),data=d, times=times,cause=2,max.clust=NULL) out3 <- random_cif(add1,data=d,cause1=1,cause2=2,cif2=add2,sym=1,same.cens=TRUE) summary(out3) ## negative dependence out4 <- random_cif(add1,data=d,cause1=1,cause2=2,cif2=add2,theta.des=mm,sym=1,same.cens=TRUE) summary(out4) ## negative dependence## Reduce Ex.Timings d <- sim_nordic_random(1000,delayed=TRUE,cordz=0.5,cormz=2,lam0=0.3,country=TRUE) times <- seq(50,90,by=10) add1 <- timereg::comp.risk(Event(time,cause)~-1+factor(country)+cluster(id),data=d, times=times,cause=1,max.clust=NULL) ### making group indidcator mm <- model.matrix(~-1+factor(zyg),d) out1<-random_cif(add1,data=d,cause1=1,cause2=1,theta=1,same.cens=TRUE) summary(out1) out2<-random_cif(add1,data=d,cause1=1,cause2=1,theta=1, theta.des=mm,same.cens=TRUE) summary(out2) ######################################### ##### 2 different causes ######################################### add2 <- timereg::comp.risk(Event(time,cause)~-1+factor(country)+cluster(id),data=d, times=times,cause=2,max.clust=NULL) out3 <- random_cif(add1,data=d,cause1=1,cause2=2,cif2=add2,sym=1,same.cens=TRUE) summary(out3) ## negative dependence out4 <- random_cif(add1,data=d,cause1=1,cause2=2,cif2=add2,theta.des=mm,sym=1,same.cens=TRUE) summary(out4) ## negative dependence
Computes the ratio of two Average Treatment Effects (ATEs), typically comparing the ATE for a specific cause (e.g., RMTL due to cause 1) to the ATE for the total RMTL.
ratioATE(rmtl, rmtl1, h = NULL, null = 1)ratioATE(rmtl, rmtl1, h = NULL, null = 1)
rmtl |
Object containing the ATE for the total RMTL (from |
rmtl1 |
Object containing the ATE for the specific cause RMTL (from |
h |
Transformation function (e.g., |
null |
Value under the null hypothesis for the ratio (default 1). |
The function transforms the estimates (e.g., using log) to compute the ratio and its standard error using the delta method and the joint influence functions.
A list containing:
ratioG |
Ratio based on the simple IPCW estimator. |
ratioDR |
Ratio based on the double robust estimator. |
Thomas Scheike
data(bmt); bmt$event <- bmt$cause!=0; dfactor(bmt) <- tcell~tcell out <- resmeanATE(Event(time,event)~tcell+platelet, data=bmt, time=40, outcome="rmtl") out1 <- resmeanATE(Event(time,cause)~tcell+platelet, data=bmt, cause=1, time=40, outcome="rmtl") ratioATE(out, out1, h=log)data(bmt); bmt$event <- bmt$cause!=0; dfactor(bmt) <- tcell~tcell out <- resmeanATE(Event(time,event)~tcell+platelet, data=bmt, time=40, outcome="rmtl") out1 <- resmeanATE(Event(time,cause)~tcell+platelet, data=bmt, cause=1, time=40, outcome="rmtl") ratioATE(out, out1, h=log)
Simulates data from a piecewise constant baseline hazard that can also be of Cox type. Censors data at the highest value of the break points.
rchaz(cumhazard, rr, n = NULL, entry = NULL, cause = 1, extend = FALSE)rchaz(cumhazard, rr, n = NULL, entry = NULL, cause = 1, extend = FALSE)
cumhazard |
Cumulative hazard matrix (columns: time, cumulative hazard), or piece-constant rates for periods defined by the first column of input. |
rr |
Relative risk vector for simulations, or alternatively when |
n |
Number of simulations if |
entry |
Delayed entry time for simulations (optional). |
cause |
Name/code of the event cause (default 1). |
extend |
To extend piecewise constant with constant rate beyond the last break point.
Default is |
For a piecewise linear cumulative hazard, the inverse is easy to compute. With delayed
entry , we compute:
where are the relative risks and is exponential with mean 1.
This quantity has survival function:
A data frame containing:
entry |
Entry times. |
time |
Event/censoring times. |
status |
Event status (1=event, 0=censored). |
rr |
Relative risks used. |
Attributes include:
cumhaz |
The cumulative hazard used. |
extend.rate |
The extension rate if used. |
Thomas Scheike
chaz <- c(0,1,1.5,2,2.1) breaks <- c(0,10, 20, 30, 40) cumhaz <- cbind(breaks,chaz) n <- 10 X <- rbinom(n,1,0.5) beta <- 0.2 rrcox <- exp(X * beta) pctime <- rchaz(cumhaz,n=10) pctimecox <- rchaz(cumhaz,rrcox,entry=runif(n))chaz <- c(0,1,1.5,2,2.1) breaks <- c(0,10, 20, 30, 40) cumhaz <- cbind(breaks,chaz) n <- 10 X <- rbinom(n,1,0.5) beta <- 0.2 rrcox <- exp(X * beta) pctime <- rchaz(cumhaz,n=10) pctimecox <- rchaz(cumhaz,rrcox,entry=runif(n))
Simulates data from multiple piecewise constant baseline hazards for competing risks. Takes the minimum of all cause-specific event times and assigns the corresponding cause.
rchazl(cumhaz, rr, causes = NULL, ...)rchazl(cumhaz, rr, causes = NULL, ...)
cumhaz |
List of cumulative hazard matrices, one for each cause. |
rr |
Matrix of relative risks (rows = subjects, columns = causes). |
causes |
Vector of cause codes to assign (default NULL, uses 1,2,...). |
... |
Additional arguments passed to |
A data frame with event times and status indicating the cause of the first event.
Thomas Scheike
Simulates data from piecwise constant baseline hazard that can also be of Cox type. Censor data at highest value of the break points for either of the cumulatives, see also sim_phregs
rcrisk( cumA, cumB, rr1 = NULL, rr2 = NULL, n = NULL, cens = NULL, rrc = NULL, extend = TRUE, causes = NULL, ... )rcrisk( cumA, cumB, rr1 = NULL, rr2 = NULL, n = NULL, cens = NULL, rrc = NULL, extend = TRUE, causes = NULL, ... )
cumA |
cumulative hazard of cause 1, or list of multiple cumulative hazards |
cumB |
cumulative hazard of cause 2 or NULL when cumA is a list |
rr1 |
number of simulations or vector of relative risk for simuations, or matrix with columns equal to number of hazards in list |
rr2 |
number of simulations or vector of relative risk for simuations. |
n |
number of simulation if rr not given, must be given when rr is not given |
cens |
to censor further , rate or cumumlative hazard |
rrc |
retlativ risk for censoring. |
extend |
to extend the cumulative hazards to largest end-point |
causes |
to assign status values for each of the causes, vector of integers |
... |
arguments for rchaz |
Thomas Scheike
data(bmt); n <- 100 cox1 <- phreg(Surv(time,cause==1)~tcell+platelet,data=bmt) cox2 <- phreg(Surv(time,cause==2)~tcell+platelet,data=bmt) X1 <- bmt[,c("tcell","platelet")] xid <- sample(1:nrow(X1),n,replace=TRUE) Z1 <- X1[xid,] Z2 <- X1[xid,] rr1 <- exp(as.matrix(Z1) %*% cox1$coef) rr2 <- exp(as.matrix(Z2) %*% cox2$coef) d <- rcrisk(cox1$cum,cox2$cum,rr1,rr2,cens=2/70) dd <- cbind(d,Z1) d <- rcrisk(cox1$cum,cox2$cum,rr1,rr2,cens=cbind(c(1,30,68),c(.01,1,3))) dd <- cbind(d,Z1) par(mfrow=c(1,3)) scox0 <- phreg(Surv(time,status==0)~tcell+platelet,data=dd) plot(scox0); lines(cbind(c(1,30,68),c(.01,1,3)),col=2) ## scox1 <- phreg(Surv(time,status==1)~tcell+platelet,data=dd) scox2 <- phreg(Surv(time,status==2)~tcell+platelet,data=dd) plot(cox1); plot(scox1,add=TRUE,col=2) plot(cox2); plot(scox2,add=TRUE,col=2) cbind(cox1$coef,scox1$coef,cox2$coef,scox2$coef) # 3 causes and censoring d3 <- rcrisk(list(cox1$cum,cox2$cum,cox1$cum),NULL,n=100,cens=cbind(c(1,30,68),c(.01,1,3))) dtable(d3,~status)data(bmt); n <- 100 cox1 <- phreg(Surv(time,cause==1)~tcell+platelet,data=bmt) cox2 <- phreg(Surv(time,cause==2)~tcell+platelet,data=bmt) X1 <- bmt[,c("tcell","platelet")] xid <- sample(1:nrow(X1),n,replace=TRUE) Z1 <- X1[xid,] Z2 <- X1[xid,] rr1 <- exp(as.matrix(Z1) %*% cox1$coef) rr2 <- exp(as.matrix(Z2) %*% cox2$coef) d <- rcrisk(cox1$cum,cox2$cum,rr1,rr2,cens=2/70) dd <- cbind(d,Z1) d <- rcrisk(cox1$cum,cox2$cum,rr1,rr2,cens=cbind(c(1,30,68),c(.01,1,3))) dd <- cbind(d,Z1) par(mfrow=c(1,3)) scox0 <- phreg(Surv(time,status==0)~tcell+platelet,data=dd) plot(scox0); lines(cbind(c(1,30,68),c(.01,1,3)),col=2) ## scox1 <- phreg(Surv(time,status==1)~tcell+platelet,data=dd) scox2 <- phreg(Surv(time,status==2)~tcell+platelet,data=dd) plot(cox1); plot(scox1,add=TRUE,col=2) plot(cox2); plot(scox2,add=TRUE,col=2) cbind(cox1$coef,scox1$coef,cox2$coef,scox2$coef) # 3 causes and censoring d3 <- rcrisk(list(cox1$cum,cox2$cum,cox1$cum),NULL,n=100,cens=cbind(c(1,30,68),c(.01,1,3))) dtable(d3,~status)
Fits a Ghosh-Lin IPCW (Inverse Probability of Censoring Weighted) Cox-type model for recurrent events in the presence of a terminal event (e.g., death).
recreg( formula, data, cause = 1, death.code = 2, cens.code = 0, cens.model = ~1, weights = NULL, offset = NULL, Gc = NULL, wcomp = NULL, marks = NULL, augmentation.type = c("lindyn.augment", "lin.augment"), ... )recreg( formula, data, cause = 1, death.code = 2, cens.code = 0, cens.model = ~1, weights = NULL, offset = NULL, Gc = NULL, wcomp = NULL, marks = NULL, augmentation.type = c("lindyn.augment", "lin.augment"), ... )
formula |
Formula with an 'Event' outcome. |
data |
Data frame containing the variables. |
cause |
Cause of interest (default is 1). |
death.code |
Codes for the terminal event/death (default is 2). |
cens.code |
Code for censoring (default is 0). |
cens.model |
Formula for a stratified Cox model without covariates used to estimate censoring probabilities. |
weights |
Weights for the score equations. |
offset |
Offsets for the model. |
Gc |
Censoring weights for the time argument. If |
wcomp |
Weights for composite outcomes (e.g., when |
marks |
A mark value vector from the data frame, specifying the mark value at all events. |
augmentation.type |
Type of augmentation when an augmentation model is given (options: |
... |
Additional arguments passed to lower-level functions. |
For the Cox-type model, the expectation is modeled as:
by solving Cox-type IPCW weighted score equations:
where
,
, and
.
The IID decomposition of the beta coefficients takes the form:
and is returned as the iid component.
Events, deaths, and censorings are specified via a start-stop structure and the Event call.
The function identifies these via a status vector and cause codes, censoring codes (cens.code),
and death codes (death.code). See examples and vignettes for details.
An object of class "recreg" (extending "phreg") containing:
coef |
Estimated coefficients. |
var |
Robust variance-covariance matrix. |
iid |
Influence functions for the coefficients. |
cumhaz |
Cumulative hazard estimates. |
se.cumhaz |
Standard errors for cumulative hazard. |
Thomas Scheike
## data with no ties data(hfactioncpx12) hf <- hfactioncpx12 hf$x <- as.numeric(hf$treatment) dd <- data.frame(treatment=levels(hf$treatment),id=1) gl <- recreg(Event(entry,time,status)~treatment+cluster(id),data=hf,cause=1,death.code=2) summary(gl) head(iid(gl)) pgl <- predict(gl,dd,se=1); plot(pgl,se=1) ## censoring stratified after treatment gls <- recreg(Event(entry,time,status)~treatment+cluster(id),data=hf, cause=1,death.code=2,cens.model=~strata(treatment)) summary(gls) glss <- recreg(Event(entry,time,status)~strata(treatment)+cluster(id),data=hf, cause=1,death.code=2,cens.model=~strata(treatment)) summary(glss) plot(glss) ## IPCW at 2 years ll2 <- recregIPCW(Event(entry,time,status)~treatment+cluster(id),data=hf, cause=1,death.code=2,time=2,cens.model=~strata(treatment)) summary(ll2) ll2i <- recregIPCW(Event(entry,time,status)~-1+treatment+cluster(id),data=hf, cause=1,death.code=2,time=2,cens.model=~strata(treatment)) summary(ll2i)## data with no ties data(hfactioncpx12) hf <- hfactioncpx12 hf$x <- as.numeric(hf$treatment) dd <- data.frame(treatment=levels(hf$treatment),id=1) gl <- recreg(Event(entry,time,status)~treatment+cluster(id),data=hf,cause=1,death.code=2) summary(gl) head(iid(gl)) pgl <- predict(gl,dd,se=1); plot(pgl,se=1) ## censoring stratified after treatment gls <- recreg(Event(entry,time,status)~treatment+cluster(id),data=hf, cause=1,death.code=2,cens.model=~strata(treatment)) summary(gls) glss <- recreg(Event(entry,time,status)~strata(treatment)+cluster(id),data=hf, cause=1,death.code=2,cens.model=~strata(treatment)) summary(glss) plot(glss) ## IPCW at 2 years ll2 <- recregIPCW(Event(entry,time,status)~treatment+cluster(id),data=hf, cause=1,death.code=2,time=2,cens.model=~strata(treatment)) summary(ll2) ll2i <- recregIPCW(Event(entry,time,status)~-1+treatment+cluster(id),data=hf, cause=1,death.code=2,time=2,cens.model=~strata(treatment)) summary(ll2i)
Computes the Inverse Probability of Censoring Weighted (IPCW) estimator for the mean number of recurrent events. Supports various estimators including the Ghosh-Lin and Lawless-Cook estimators.
recregIPCW( formula, data = data, cause = 1, cens.code = 0, death.code = 2, cens.model = ~1, km = TRUE, times = NULL, beta = NULL, offset = NULL, type = c("II", "I"), marks = NULL, weights = NULL, model = c("exp", "lin"), no.opt = FALSE, augmentation = NULL, method = "nr", se = TRUE, ... )recregIPCW( formula, data = data, cause = 1, cens.code = 0, death.code = 2, cens.model = ~1, km = TRUE, times = NULL, beta = NULL, offset = NULL, type = c("II", "I"), marks = NULL, weights = NULL, model = c("exp", "lin"), no.opt = FALSE, augmentation = NULL, method = "nr", se = TRUE, ... )
formula |
Formula with an 'Event' outcome. |
data |
Data frame. |
cause |
Cause of interest (default is 1). |
cens.code |
Censoring code (default is 0). |
death.code |
Death code (default is 2). |
cens.model |
Formula for the censoring model (default is |
km |
Logical; if |
times |
Time points for estimation (required). |
beta |
Initial values for coefficients (optional). |
offset |
Offsets. |
type |
Type of estimator: |
marks |
Mark values. |
weights |
Weights. |
model |
Model type for the mean: |
no.opt |
Logical; if |
augmentation |
Augmentation terms. |
method |
Optimization method (default is "nr"). |
se |
Logical; if |
... |
Additional arguments. |
An object of class "binreg" (extending "resmean") containing:
coef |
Estimated coefficients. |
var |
Variance-covariance matrix. |
iid |
Influence functions. |
times |
Time points. |
Y |
Observed counts. |
Thomas Scheike
Estimates the marginal mean number of recurrent events over time in the presence of a competing terminal event (e.g. death), using the nonparametric estimator of Ghosh and Lin (2000). Two proportional hazards models are fitted internally—one for the recurrent event rate and one for the terminal event—and combined to form the estimator
where is the marginal survival probability at the baseline covariate
level and is the baseline recurrent event rate among survivors.
Robust (sandwich) standard errors are computed via the influence-function
approach of Ghosh and Lin (2000).
recurrent_marginal(formula, data, cause = 1, ..., death.code = 2, test = FALSE) recurrentMarginal(formula, data, ...)recurrent_marginal(formula, data, cause = 1, ..., death.code = 2, test = FALSE) recurrentMarginal(formula, data, ...)
formula |
A formula with an |
data |
A data frame containing all variables in |
cause |
Integer code(s) for the recurrent event of interest. Default is
|
... |
Further arguments passed to |
death.code |
Integer code(s) for the terminal event. Default is |
test |
Logical. If |
Jump times must be unique within each stratum. If ties are present, use
tie_breaker to resolve them before calling this function.
An object of class "recurrent" with the following components:
mu |
Estimated marginal mean |
se.mu |
Robust standard error of |
times |
Jump times at which estimates are computed. |
St |
Marginal survival estimate |
cumhaz |
Two-column matrix of |
se.cumhaz |
Two-column matrix of |
The object carries three attributes: "logrank" (the test result when
test = TRUE, otherwise NULL), "cause", and
"death.code".
Thomas Scheike
Cook, R. J. and Lawless, J. F. (1997). Marginal analysis of recurrent events and a terminating event. Statistics in Medicine, 16, 911–924.
Ghosh, D. and Lin, D. Y. (2000). Nonparametric analysis of recurrent events and death. Biometrics, 56, 554–562.
test_logrankRecurrent, tie_breaker,
prob_exceed_recurrent
data(hfactioncpx12) hf <- hfactioncpx12 hf$x <- as.numeric(hf$treatment) ## Fit nonparametric baseline models for recurrent events and death xr <- phreg(Surv(entry, time, status == 1) ~ cluster(id), data = hf) dr <- phreg(Surv(entry, time, status == 2) ~ cluster(id), data = hf) par(mfrow = c(1, 3)) plot(dr, se = TRUE); title(main = "Death") plot(xr, se = TRUE); title(main = "Recurrent events") ## Compare naive and robust standard errors for the recurrent event rate rxr <- robust_phreg(xr, fixbeta = 1) plot(rxr, se = TRUE, robust = TRUE, add = TRUE, col = 4) ## Marginal mean via formula interface outN <- recurrent_marginal(Event(entry, time, status) ~ cluster(id), data = hf, cause = 1, death.code = 2) plot(outN, se = TRUE, col = 2, add = TRUE) summary(outN, times = 1:5) ## Stratified analysis with logrank test out <- recurrent_marginal(Event(entry, time, status) ~ strata(treatment) + cluster(id), data = hf, cause = 1, death.code = 2, test = TRUE) plot(out, se = TRUE, ylab = "Marginal mean", col = 1:2) attr(out, "logrank") summary(out, times = 1:5) ## Influence-function (iid) decomposition at a fixed time point head(iid(outN, time = 3))data(hfactioncpx12) hf <- hfactioncpx12 hf$x <- as.numeric(hf$treatment) ## Fit nonparametric baseline models for recurrent events and death xr <- phreg(Surv(entry, time, status == 1) ~ cluster(id), data = hf) dr <- phreg(Surv(entry, time, status == 2) ~ cluster(id), data = hf) par(mfrow = c(1, 3)) plot(dr, se = TRUE); title(main = "Death") plot(xr, se = TRUE); title(main = "Recurrent events") ## Compare naive and robust standard errors for the recurrent event rate rxr <- robust_phreg(xr, fixbeta = 1) plot(rxr, se = TRUE, robust = TRUE, add = TRUE, col = 4) ## Marginal mean via formula interface outN <- recurrent_marginal(Event(entry, time, status) ~ cluster(id), data = hf, cause = 1, death.code = 2) plot(outN, se = TRUE, col = 2, add = TRUE) summary(outN, times = 1:5) ## Stratified analysis with logrank test out <- recurrent_marginal(Event(entry, time, status) ~ strata(treatment) + cluster(id), data = hf, cause = 1, death.code = 2, test = TRUE) plot(out, se = TRUE, ylab = "Marginal mean", col = 1:2) attr(out, "logrank") summary(out, times = 1:5) ## Influence-function (iid) decomposition at a fixed time point head(iid(outN, time = 3))
Computes the Restricted Mean Survival Time (RMST) for stratified Kaplan-Meier or stratified Cox models with martingale standard errors.
resmean_phreg(x, times = NULL, covs = NULL, ...)resmean_phreg(x, times = NULL, covs = NULL, ...)
x |
Object of class |
times |
Possible times for which to report restricted mean. If |
covs |
Possible covariates for Cox model adjustment. |
... |
Additional arguments passed to lower-level functions. |
The standard error is computed using linear interpolation between standard errors at jump-times. This allows plotting the restricted mean as a function of time.
Years lost can be computed based on this and decomposed into years lost for different causes
using the cif_yearslost function.
An object of class "resmean_phreg" containing:
rmst |
Matrix of restricted mean survival times. |
se.rmst |
Standard errors for RMST. |
intkmtimes |
Restricted mean at specified times. |
years.lost |
Years lost (if applicable). |
Thomas Scheike
data(bmt) bmt$time <- bmt$time + runif(408) * 0.001 out1 <- phreg(Surv(time, cause != 0) ~ strata(tcell, platelet), data = bmt) rm1 <- resmean_phreg(out1, times = 10 * (1:6)) summary(rm1) e1 <- estimate(rm1) par(mfrow = c(1, 2)) plot(rm1, se = 1) plot(rm1, years.lost = TRUE, se = 1) ## Comparing populations rm1 <- resmean_phreg(out1, times = 40) e1 <- estimate(rm1) estimate(e1, rbind(c(1, -1, 0, 0)))data(bmt) bmt$time <- bmt$time + runif(408) * 0.001 out1 <- phreg(Surv(time, cause != 0) ~ strata(tcell, platelet), data = bmt) rm1 <- resmean_phreg(out1, times = 10 * (1:6)) summary(rm1) e1 <- estimate(rm1) par(mfrow = c(1, 2)) plot(rm1, se = 1) plot(rm1, years.lost = TRUE, se = 1) ## Comparing populations rm1 <- resmean_phreg(out1, times = 40) e1 <- estimate(rm1) estimate(e1, rbind(c(1, -1, 0, 0)))
Estimates the Average Treatment Effect (ATE) for Restricted Mean Survival Time (RMST) or Restricted Mean Time Lost (RMTL) in censored competing risks data using IPCW.
resmeanATE(formula, data, model = "exp", outcome = c("rmst", "rmtl"), ...)resmeanATE(formula, data, model = "exp", outcome = c("rmst", "rmtl"), ...)
formula |
Formula with an |
data |
Data frame. |
model |
Link function: |
outcome |
Outcome type: |
... |
Additional arguments passed to |
Under standard causal assumptions (Consistency, Ignorability, Positivity), the ATE is
estimated as , where is the potential outcome under treatment .
The method uses double robust estimating equations that are IPCW-adjusted for censoring.
The first covariate in the formula must be the treatment effect (a factor). If the factor has more than two levels, multinomial logistic regression (mlogit) is used for propensity score modeling.
An object of class "binregATE" containing:
riskG |
Simple IPCW estimator results. |
riskDR |
Double Robust estimator results. |
riskG.iid, riskDR.iid
|
Influence functions. |
coef |
Treatment effect estimates. |
se |
Standard errors. |
Thomas Scheike
Scheike, T. and Holst, K. K. (2024). Restricted mean time lost for survival and competing risks data using mets in R. WIP.
Scheike, T. and Tanaka, S. (2025). Restricted mean time lost ratio regression: Percentage of restricted mean time lost due to specific cause. WIP.
data(bmt); bmt$event <- bmt$cause!=0; dfactor(bmt) <- tcell~tcell out <- resmeanATE(Event(time,event)~tcell+platelet, data=bmt, time=40, treat.model=tcell~platelet, outcome="rmtl") summary(out) out1 <- resmeanATE(Event(time,cause)~tcell+platelet, data=bmt, cause=1, time=40, treat.model=tcell~platelet, outcome="rmtl") summary(out1)data(bmt); bmt$event <- bmt$cause!=0; dfactor(bmt) <- tcell~tcell out <- resmeanATE(Event(time,event)~tcell+platelet, data=bmt, time=40, treat.model=tcell~platelet, outcome="rmtl") summary(out) out1 <- resmeanATE(Event(time,cause)~tcell+platelet, data=bmt, cause=1, time=40, treat.model=tcell~platelet, outcome="rmtl") summary(out1)
Provides a fast implementation of Inverse Probability of Censoring Weighting (IPCW) regression for a single time point. It fits the model:
or, in the case of competing risks data:
which represents the "Years Lost Due to Cause" (RMTL).
resmeanIPCW(formula, data, outcome = c("rmst", "rmtl"), ...)resmeanIPCW(formula, data, outcome = c("rmst", "rmtl"), ...)
formula |
Formula with an |
data |
Data frame containing the variables. |
outcome |
Outcome type: |
... |
Additional arguments passed to |
The method solves the binomial regression IPCW response estimating equation:
where is the indicator of being uncensored
at the time of interest.
When the status variable is binary, the outcome is assumed to be (RMST).
If the status has more than two levels (competing risks), the outcome is
(RMTL for a specific cause).
The function supports:
IPCW Adjustment: Weights by the inverse of the censoring survival probability.
Augmentation: Can include an augmentation term (type="II" or "III") to improve efficiency and robustness (Double Robust estimation).
Variance Estimation: Based on the influence function, including adjustments for the estimation of the censoring model.
An object of class "binreg" containing:
coef |
Coefficient estimates. |
se |
Standard errors. |
var |
Variance-covariance matrix. |
iid |
Influence function decomposition. |
naive.var |
Variance under known censoring model (if applicable). |
time |
Time point used. |
outcome |
Type of outcome analyzed. |
Thomas Scheike
Scheike, T. and Holst, K. K. (2024). Restricted mean time lost for survival and competing risks data using mets in R. WIP.
data(bmt); bmt$time <- bmt$time+runif(nrow(bmt))*0.001 # E( min(T;t) | X ) = exp( a+b X) with IPCW estimation out <- resmeanIPCW(Event(time,cause!=0)~tcell+platelet+age, bmt, time=50, cens.model=~strata(platelet), model="exp") summary(out) ## Reduce Ex.Timings ## Weighted GLM version (RMST) out2 <- logitIPCW(Event(time,cause!=0)~tcell+platelet+age, bmt, time=50, cens.model=~strata(platelet), model="exp", outcome="rmst") summary(out2) ### Time-lost (RMTL) outtl <- resmeanIPCW(Event(time,cause!=0)~tcell+platelet+age, bmt, time=50, cens.model=~strata(platelet), model="exp", outcome="rmtl") summary(outtl) ### Same as Kaplan-Meier for full censoring model bmt$int <- with(bmt, strata(tcell, platelet)) out <- resmeanIPCW(Event(time,cause!=0)~-1+int, bmt, time=30, cens.model=~strata(platelet, tcell), model="lin") estimate(out) out1 <- phreg(Surv(time,cause!=0)~strata(tcell,platelet), data=bmt) rm1 <- resmean_phreg(out1, times=30) summary(rm1) ### Years lost regression outl <- resmeanIPCW(Event(time,cause!=0)~-1+int, bmt, time=30, outcome="years-lost", cens.model=~strata(platelet, tcell), model="lin") estimate(outl) ## Competing risks years-lost for cause 1 out <- resmeanIPCW(Event(time,cause)~-1+int, bmt, time=30, cause=1, cens.model=~strata(platelet, tcell), model="lin") estimate(out) ## Same as integrated cumulative incidence rmc1 <- cif_yearslost(Event(time,cause)~strata(tcell,platelet), data=bmt, times=30) summary(rmc1)data(bmt); bmt$time <- bmt$time+runif(nrow(bmt))*0.001 # E( min(T;t) | X ) = exp( a+b X) with IPCW estimation out <- resmeanIPCW(Event(time,cause!=0)~tcell+platelet+age, bmt, time=50, cens.model=~strata(platelet), model="exp") summary(out) ## Reduce Ex.Timings ## Weighted GLM version (RMST) out2 <- logitIPCW(Event(time,cause!=0)~tcell+platelet+age, bmt, time=50, cens.model=~strata(platelet), model="exp", outcome="rmst") summary(out2) ### Time-lost (RMTL) outtl <- resmeanIPCW(Event(time,cause!=0)~tcell+platelet+age, bmt, time=50, cens.model=~strata(platelet), model="exp", outcome="rmtl") summary(outtl) ### Same as Kaplan-Meier for full censoring model bmt$int <- with(bmt, strata(tcell, platelet)) out <- resmeanIPCW(Event(time,cause!=0)~-1+int, bmt, time=30, cens.model=~strata(platelet, tcell), model="lin") estimate(out) out1 <- phreg(Surv(time,cause!=0)~strata(tcell,platelet), data=bmt) rm1 <- resmean_phreg(out1, times=30) summary(rm1) ### Years lost regression outl <- resmeanIPCW(Event(time,cause!=0)~-1+int, bmt, time=30, outcome="years-lost", cens.model=~strata(platelet, tcell), model="lin") estimate(outl) ## Competing risks years-lost for cause 1 out <- resmeanIPCW(Event(time,cause)~-1+int, bmt, time=30, cause=1, cens.model=~strata(platelet, tcell), model="lin") estimate(out) ## Same as integrated cumulative incidence rmc1 <- cif_yearslost(Event(time,cause)~strata(tcell,platelet), data=bmt, times=30) summary(rmc1)
Computes robust (sandwich) standard errors for the cumulative baseline
hazard from a phreg object.
Summarizes cumulative baseline hazard estimates from a phreg object, optionally with robust standard errors.
Computes confidence intervals using log or plain transformations, with optional restrictions to positive values or probability scale.
robust.basehaz.phreg(x, type = "robust", fixbeta = NULL, ...) summarybase.phreg(object, robust = FALSE, ...) conftype( x, std.err, conf.type = c("log", "plain"), restrict = c("positive", "prob", "none"), conf.int = 0.95 )robust.basehaz.phreg(x, type = "robust", fixbeta = NULL, ...) summarybase.phreg(object, robust = FALSE, ...) conftype( x, std.err, conf.type = c("log", "plain"), restrict = c("positive", "prob", "none"), conf.int = 0.95 )
x |
point estimate(s). |
type |
type of standard error (default |
fixbeta |
if non-NULL, fixes beta at given value. |
... |
additional arguments. |
object |
a |
robust |
logical; if TRUE, uses robust standard errors. |
std.err |
standard error(s). |
conf.type |
type of transformation: |
restrict |
restriction: |
conf.int |
confidence level (default 0.95). |
A list with cumhaz, se.cumhaz, and strata.
An object of class "summary.recurrent".
A list with upper, lower, conf.type, and conf.int.
Functions for computing and visualizing non-parametric cumulative incidence estimates, as well as dependence measures (odds ratio, relative risk) for bivariate competing risks data.
rr_cif( cif, data, cause = NULL, cif2 = NULL, times = NULL, cause1 = 1, cause2 = 1, cens.code = NULL, cens.model = "KM", Nit = 40, detail = 0, clusters = NULL, theta = NULL, theta.des = NULL, step = 1, sym = 0, weights = NULL, same.cens = FALSE, censoring.weights = NULL, silent = 1, par.func = NULL, dpar.func = NULL, dimpar = NULL, score.method = "nlminb", entry = NULL, estimator = 1, trunkp = 1, admin.cens = NULL, ... ) or_cif( cif, data, cause = NULL, cif2 = NULL, times = NULL, cause1 = 1, cause2 = 1, cens.code = NULL, cens.model = "KM", Nit = 40, detail = 0, clusters = NULL, theta = NULL, theta.des = NULL, step = 1, sym = 0, weights = NULL, same.cens = FALSE, censoring.weights = NULL, silent = 1, par.func = NULL, dpar.func = NULL, dimpar = NULL, score.method = "nlminb", entry = NULL, estimator = 1, trunkp = 1, admin.cens = NULL, ... ) random.cif(cif, ...) Grandom.cif(cif, ...) predictPairPlack(cif1, cif2, status1, status2, theta) npc(T, cause, same.cens = TRUE, sep = FALSE) nonparcuminc(t, status, cens = 0) plotcr( x, col, lty, legend = TRUE, which = 1:2, cause = 1:2, ask = prod(par("mfcol")) < length(which) && dev.interactive(), ... )rr_cif( cif, data, cause = NULL, cif2 = NULL, times = NULL, cause1 = 1, cause2 = 1, cens.code = NULL, cens.model = "KM", Nit = 40, detail = 0, clusters = NULL, theta = NULL, theta.des = NULL, step = 1, sym = 0, weights = NULL, same.cens = FALSE, censoring.weights = NULL, silent = 1, par.func = NULL, dpar.func = NULL, dimpar = NULL, score.method = "nlminb", entry = NULL, estimator = 1, trunkp = 1, admin.cens = NULL, ... ) or_cif( cif, data, cause = NULL, cif2 = NULL, times = NULL, cause1 = 1, cause2 = 1, cens.code = NULL, cens.model = "KM", Nit = 40, detail = 0, clusters = NULL, theta = NULL, theta.des = NULL, step = 1, sym = 0, weights = NULL, same.cens = FALSE, censoring.weights = NULL, silent = 1, par.func = NULL, dpar.func = NULL, dimpar = NULL, score.method = "nlminb", entry = NULL, estimator = 1, trunkp = 1, admin.cens = NULL, ... ) random.cif(cif, ...) Grandom.cif(cif, ...) predictPairPlack(cif1, cif2, status1, status2, theta) npc(T, cause, same.cens = TRUE, sep = FALSE) nonparcuminc(t, status, cens = 0) plotcr( x, col, lty, legend = TRUE, which = 1:2, cause = 1:2, ask = prod(par("mfcol")) < length(which) && dev.interactive(), ... )
cif |
a cumulative incidence model object (from timereg). |
data |
a data.frame with the variables. |
cause |
causes to plot. |
cif2 |
optional second CIF model if different from first. |
times |
time points for evaluation. |
cause1 |
cause for first coordinate. |
cause2 |
cause for second coordinate. |
cens.code |
censoring code value. |
cens.model |
censoring model type (default |
Nit |
maximum number of iterations. |
detail |
level of output detail. |
clusters |
cluster variable name or vector. |
theta |
dependence parameter(s). |
theta.des |
design matrix for theta. |
step |
step size for optimization. |
sym |
if 1, symmetric dependence structure. |
weights |
optional weights. |
same.cens |
logical; if TRUE, uses joint censoring weights. |
censoring.weights |
optional pre-computed censoring weights. |
silent |
verbosity level. |
par.func |
optional parameter function. |
dpar.func |
optional derivative of parameter function. |
dimpar |
dimension of parameter vector. |
score.method |
optimization method (default |
entry |
optional entry time variable. |
estimator |
estimator type. |
trunkp |
truncation probability. |
admin.cens |
administrative censoring time. |
... |
additional arguments. |
cif1 |
CIF values for subject 1 (for |
status1 |
status for subject 1. |
status2 |
status for subject 2. |
T |
matrix with columns: time1, time2, status1, status2 (for |
sep |
logical; if TRUE, uses separate censoring models for each subject. |
t |
vector of event/censoring times (for |
status |
vector of status codes (for |
cens |
censoring code (default 0). |
x |
data matrix or competing risks object. |
col |
colors for curves. |
lty |
line types for curves. |
legend |
logical; if TRUE, add legend. |
which |
which plots to show. |
ask |
logical; if TRUE, prompt before new page. |
npc computes bivariate non-parametric cumulative incidence using
inverse-probability-of-censoring weights.
nonparcuminc computes univariate non-parametric cumulative incidence
for multiple causes.
plotcr plots cumulative incidence curves for competing risks using
the prodlim package.
or_cif fits an odds-ratio model for bivariate cumulative incidence.
rr_cif fits a relative-risk model for bivariate cumulative incidence.
random.cif and Grandom.cif are aliases for random_cif
and Grandom_cif (random effects CIF models).
predictPairPlack predicts pairwise joint probabilities under a
Plackett (odds-ratio) dependence model.
matplot.mets.twostage produces matrix-plots of concordance over time
from a twostage object.
For npc: matrix with columns (time, cumulative incidence).
For nonparcuminc: matrix with time and cause-specific cumulative incidences.
Klaus K. Holst, Thomas Scheike
Simulate observations from the model with cumulative hazard given by
where is the
rate parameter and is the shape parameter.
rweibullcox(n, rate, shape)rweibullcox(n, rate, shape)
n |
(integer) number of observations |
rate |
(numeric) rate parameter (can be a vector of size n) |
shape |
(numeric) shape parameter (can be a vector of size n) |
[stats::rweibull()] uses a different parametrization with cumulative hazard given by
i.e., the shape is the same but the scale paramter
is related to rate paramter by
[stats::rweibull()]
Simulates data that looks like fit from a fitted cumulative incidence model (Fine-Gray or logistic).
sim_cif( cif, n, data = NULL, Z = NULL, rr = NULL, strata = NULL, drawZ = TRUE, cens = NULL, rrc = NULL, entry = NULL, Sentry = NULL, cumstart = c(0, 0), U = NULL, pU = NULL, type = NULL, extend = NULL, ... )sim_cif( cif, n, data = NULL, Z = NULL, rr = NULL, strata = NULL, drawZ = TRUE, cens = NULL, rrc = NULL, entry = NULL, Sentry = NULL, cumstart = c(0, 0), U = NULL, pU = NULL, type = NULL, extend = NULL, ... )
cif |
Output from |
n |
Number of simulations. |
data |
Data frame to extract covariates. |
Z |
Design matrix instead of data. |
rr |
Relative risks. |
strata |
Strata vector. |
drawZ |
Logical; randomly sample from Z. |
cens |
Censoring specification. |
rrc |
Relative risks for censoring. |
entry |
Delayed entry time. |
Sentry |
Survival related to delayed entry. |
cumstart |
Start cumulatives at time 0. |
U |
Uniforms for drawing timing. |
pU |
Uniforms for drawing event type. |
type |
Model type: |
extend |
Extend piecewise constant with constant rate. |
... |
Arguments for |
Data frame with simulated event times, status, and covariates.
Thomas Scheike
[simul_cifs()]
data(bmt) nsim <- 100 ## logit cumulative incidence regression model cif <- cifreg(Event(time,cause)~platelet+age,data=bmt,cause=1) estimate(cif) plot(cif,col=1) simbmt <- sim_cif(cif,nsim,data=bmt) dtable(simbmt,~cause) scif <- cifreg(Event(time,cause)~platelet+age,data=simbmt,cause=1) estimate(scif) plot(scif,add=TRUE,col=2) ## Fine-Gray cloglog cumulative incidence regression model cif <- cifregFG(Event(time,cause)~strata(tcell)+age,data=bmt,cause=1) estimate(cif) plot(cif,col=1) simbmt <- sim_cif(cif,nsim,data=bmt) scif <- cifregFG(Event(time,cause)~strata(tcell)+age,data=simbmt,cause=1) estimate(scif) plot(scif,add=TRUE,col=2) ################################################################ # simulating several causes with specific cumulatives ################################################################ cif1 <- cifreg(Event(time,cause)~strata(tcell)+age,data=bmt,cause=1) cif2 <- cifreg(Event(time,cause)~strata(platelet)+tcell+age,data=bmt,cause=2) cifss <- list(cif1,cif2) simbmt <- sim_cifs(list(cif1,cif2),nsim,data=bmt,extend=0.005) scif1 <- cifreg(Event(time,cause)~strata(tcell)+age,data=simbmt,cause=1) scif2 <- cifreg(Event(time,cause)~strata(platelet)+tcell+age,data=simbmt,cause=2) cbind(cif1$coef,scif1$coef) ## can be off due to restriction F1+F2<= 1 cbind(cif2$coef,scif2$coef) par(mfrow=c(1,2)) ## Cause 1 follows the model plot(cif1); plot(scif1,add=TRUE,col=1:2,lwd=2) # Cause 2:second cause is modified with restriction to satisfy F1+F2<= 1, so scaled down plot(cif2); plot(scif2,add=TRUE,col=1:2,lwd=2)data(bmt) nsim <- 100 ## logit cumulative incidence regression model cif <- cifreg(Event(time,cause)~platelet+age,data=bmt,cause=1) estimate(cif) plot(cif,col=1) simbmt <- sim_cif(cif,nsim,data=bmt) dtable(simbmt,~cause) scif <- cifreg(Event(time,cause)~platelet+age,data=simbmt,cause=1) estimate(scif) plot(scif,add=TRUE,col=2) ## Fine-Gray cloglog cumulative incidence regression model cif <- cifregFG(Event(time,cause)~strata(tcell)+age,data=bmt,cause=1) estimate(cif) plot(cif,col=1) simbmt <- sim_cif(cif,nsim,data=bmt) scif <- cifregFG(Event(time,cause)~strata(tcell)+age,data=simbmt,cause=1) estimate(scif) plot(scif,add=TRUE,col=2) ################################################################ # simulating several causes with specific cumulatives ################################################################ cif1 <- cifreg(Event(time,cause)~strata(tcell)+age,data=bmt,cause=1) cif2 <- cifreg(Event(time,cause)~strata(platelet)+tcell+age,data=bmt,cause=2) cifss <- list(cif1,cif2) simbmt <- sim_cifs(list(cif1,cif2),nsim,data=bmt,extend=0.005) scif1 <- cifreg(Event(time,cause)~strata(tcell)+age,data=simbmt,cause=1) scif2 <- cifreg(Event(time,cause)~strata(platelet)+tcell+age,data=simbmt,cause=2) cbind(cif1$coef,scif1$coef) ## can be off due to restriction F1+F2<= 1 cbind(cif2$coef,scif2$coef) par(mfrow=c(1,2)) ## Cause 1 follows the model plot(cif1); plot(scif1,add=TRUE,col=1:2,lwd=2) # Cause 2:second cause is modified with restriction to satisfy F1+F2<= 1, so scaled down plot(cif2); plot(scif2,add=TRUE,col=1:2,lwd=2)
Simulate observations from the Clayton-Oakes copula model with piecewise constant marginals.
sim_ClaytonOakes( K, n, eta, beta, stoptime, lam = 1, left = 0, pairleft = 0, trunc.prob = 0.5, same = 0 )sim_ClaytonOakes( K, n, eta, beta, stoptime, lam = 1, left = 0, pairleft = 0, trunc.prob = 0.5, same = 0 )
K |
Number of clusters |
n |
Number of observations in each cluster |
eta |
variance |
beta |
Effect (log hazard ratio) of covariate |
stoptime |
Stopping time |
lam |
constant hazard |
left |
Left truncation |
pairleft |
pairwise (1) left truncation or individual (0) |
trunc.prob |
Truncation probability |
same |
if 1 then left-truncation is same also for univariate truncation |
Thomas Scheike and Klaus K. Holst
Simulate observations from the Clayton-Oakes copula model with Weibull type baseline and Cox marginals.
sim_ClaytonOakesWei( K, n, eta, beta, stoptime, weiscale = 1, weishape = 2, left = 0, pairleft = 0 )sim_ClaytonOakesWei( K, n, eta, beta, stoptime, weiscale = 1, weishape = 2, left = 0, pairleft = 0 )
K |
Number of clusters |
n |
Number of observations in each cluster |
eta |
1/variance |
beta |
Effect (log hazard ratio) of covariate |
stoptime |
Stopping time |
weiscale |
weibull scale parameter |
weishape |
weibull shape parameter |
left |
Left truncation |
pairleft |
pairwise (1) left truncation or individual (0) |
Klaus K. Holst
Simulates data based on Cox/Cox or Cox/Ghosh-Lin structures for recurrent events with a terminal event.
sim_GLcox( n, base1, drcumhaz, var.z = 0, r1 = NULL, rd = NULL, rc = NULL, fz = NULL, fdz = NULL, model = c("twostage", "frailty", "shared"), type = NULL, share = 1, cens = NULL, nmin = 100, nmax = 1000 )sim_GLcox( n, base1, drcumhaz, var.z = 0, r1 = NULL, rd = NULL, rc = NULL, fz = NULL, fdz = NULL, model = c("twostage", "frailty", "shared"), type = NULL, share = 1, cens = NULL, nmin = 100, nmax = 1000 )
n |
Number of IDs. |
base1 |
Baseline cumulative hazard for recurrent events. |
drcumhaz |
Baseline cumulative hazard for the terminal event. |
var.z |
Variance of the gamma frailty. |
r1 |
Relative risk term for the recurrent event baseline. |
rd |
Relative risk term for the terminal event. |
rc |
Relative risk term for censoring. |
fz |
Function for transformation of the frailty term. |
fdz |
Function for transformation of the frailty term for death. |
model |
Model type: |
type |
Type of simulation (default depends on |
share |
Proportion of shared random effects (for partially shared models). |
cens |
Censoring rate for exponential censoring. |
nmin |
Minimum number of points in the time grid (default 100). |
nmax |
Maximum number of points in the time grid (default 1000). |
type=3: Generates data from a Cox/Cox two-stage model.
type=2: Generates data from a Ghosh-Lin/Cox model.
If var.z=0, data is generated without dependence or frailty.
If model="twostage", the default is to generate data from a Ghosh-Lin/Cox model.
If type=3, data is generated with marginal Cox models (Cox/Cox).
Simulation is based on a linear approximation of the hazard for two-stage models on a time grid. The grid must be sufficiently fine.
A data frame with simulated recurrent events and terminal events, including frailty terms.
Thomas Scheike
Scheike (2025), Two-stage recurrent events random effects models, LIDA, to appear.
Simulates data from a full illness-death model with reversible transitions and multiple causes of death. Supports various dependence structures via shared frailties.
sim_multistate( n, cumhaz, cumhaz2, death.cumhaz, death.cumhaz2, rr12 = NULL, rr21 = NULL, rd13 = NULL, rd23 = NULL, rrc = NULL, gap.time = FALSE, max.recurrent = 100, dependence = 0, var.z = 0.5, cor.mat = NULL, cens = NULL, extend = TRUE, ... )sim_multistate( n, cumhaz, cumhaz2, death.cumhaz, death.cumhaz2, rr12 = NULL, rr21 = NULL, rd13 = NULL, rd23 = NULL, rrc = NULL, gap.time = FALSE, max.recurrent = 100, dependence = 0, var.z = 0.5, cor.mat = NULL, cens = NULL, extend = TRUE, ... )
n |
Number of IDs. |
cumhaz |
Cumulative hazard from state 1 to 2. |
cumhaz2 |
Cumulative hazard from state 2 to 1. |
death.cumhaz |
Cumulative hazard of death from state 1. |
death.cumhaz2 |
Cumulative hazard of death from state 2. |
rr12 |
Relative risk for 1->2. |
rr21 |
Relative risk for 2->1. |
rd13 |
Relative risk for death 1->3. |
rd23 |
Relative risk for death 2->3. |
rrc |
Relative risk for censoring. |
gap.time |
Gap time indicator. If true simulates gap-times with specified cumulative hazard. |
max.recurrent |
Maximum recurrent events. |
dependence |
Dependence structure (0-3). |
var.z |
Variance of random effects. |
cor.mat |
Correlation matrix. |
cens |
Censoring rate. |
extend |
Extend hazards. |
... |
Additional arguments. |
Data frame with multi-state event history.
Thomas Scheike
######################################## ## getting some rates to mimick ######################################## data(CPH_HPN_CRBSI) dr <- CPH_HPN_CRBSI$terminal base1 <- CPH_HPN_CRBSI$crbsi base4 <- CPH_HPN_CRBSI$mechanical dr2 <- scalecumhaz(dr,1.5) cens <- rbind(c(0,0),c(2000,0.5),c(5110,3)) iddata <- sim_multistate(100,base1,base1,dr,dr2,cens=cens) dlist(iddata,.~id|id<3,n=0) ### estimating rates from simulated data c0 <- phreg(Surv(start,stop,status==0)~+1,iddata) c3 <- phreg(Surv(start,stop,status==3)~+strata(from),iddata) c1 <- phreg(Surv(start,stop,status==1)~+1,subset(iddata,from==2)) c2 <- phreg(Surv(start,stop,status==2)~+1,subset(iddata,from==1)) ### par(mfrow=c(2,3)) plot(c0) lines(cens,col=2) plot(c3,main="rates 1-> 3 , 2->3") lines(dr,col=1,lwd=2) lines(dr2,col=2,lwd=2) ### plot(c1,main="rate 1->2") lines(base1,lwd=2) ### plot(c2,main="rate 2->1") lines(base1,lwd=2)######################################## ## getting some rates to mimick ######################################## data(CPH_HPN_CRBSI) dr <- CPH_HPN_CRBSI$terminal base1 <- CPH_HPN_CRBSI$crbsi base4 <- CPH_HPN_CRBSI$mechanical dr2 <- scalecumhaz(dr,1.5) cens <- rbind(c(0,0),c(2000,0.5),c(5110,3)) iddata <- sim_multistate(100,base1,base1,dr,dr2,cens=cens) dlist(iddata,.~id|id<3,n=0) ### estimating rates from simulated data c0 <- phreg(Surv(start,stop,status==0)~+1,iddata) c3 <- phreg(Surv(start,stop,status==3)~+strata(from),iddata) c1 <- phreg(Surv(start,stop,status==1)~+1,subset(iddata,from==2)) c2 <- phreg(Surv(start,stop,status==2)~+1,subset(iddata,from==1)) ### par(mfrow=c(2,3)) plot(c0) lines(cens,col=2) plot(c3,main="rates 1-> 3 , 2->3") lines(dr,col=1,lwd=2) lines(dr2,col=2,lwd=2) ### plot(c1,main="rate 1->2") lines(base1,lwd=2) ### plot(c2,main="rate 2->1") lines(base1,lwd=2)
Simulates data from an illness-death model with two causes of death from the illness state. Covariate effects can be introduced via relative risk terms.
sim_multistateII( cumhaz, death.cumhaz, death.cumhaz2, n = NULL, rr = NULL, rd = NULL, rd2 = NULL, gamma23 = 0, gamma24 = 0, early2 = 10000, gap.time = FALSE, max.recurrent = 100, cens = NULL, rrc = NULL, extend = TRUE, ... )sim_multistateII( cumhaz, death.cumhaz, death.cumhaz2, n = NULL, rr = NULL, rd = NULL, rd2 = NULL, gamma23 = 0, gamma24 = 0, early2 = 10000, gap.time = FALSE, max.recurrent = 100, cens = NULL, rrc = NULL, extend = TRUE, ... )
cumhaz |
Cumulative hazard from state 1 to 2. |
death.cumhaz |
Cumulative hazard of death from state 1. |
death.cumhaz2 |
Cumulative hazard of death from state 2. |
n |
Number of simulations. |
rr |
Relative risks. |
rd |
Relative risks for death from state 1. |
rd2 |
Relative risks for death from state 2. |
gamma23 |
Early effect parameters for death causes. |
gamma24 |
Early effect parameters for death causes. |
early2 |
Time threshold for early effect. |
gap.time |
Gap time indicator. |
max.recurrent |
Maximum recurrent events. |
cens |
Censoring specification. |
rrc |
Censoring relative risks. |
extend |
Extend hazards. |
... |
Additional arguments. |
Data frame with multi-state event history.
Thomas Scheike
Simulates data that looks like fit from a Cox model. Automatically censors data for the highest value of the event times by using cumulative hazard.
sim_phreg( cox, n, data = NULL, Z = NULL, rr = NULL, strata = NULL, entry = NULL, extend = TRUE, cens = NULL, rrc = NULL, ... )sim_phreg( cox, n, data = NULL, Z = NULL, rr = NULL, strata = NULL, entry = NULL, extend = TRUE, cens = NULL, rrc = NULL, ... )
cox |
Output from |
n |
Number of simulations. |
data |
Data frame to extract covariates for simulations (draws from observed covariates). |
Z |
Design matrix instead of data. |
rr |
Vector of relative risks for Cox model. |
strata |
Vector of strata. |
entry |
Delayed entry variable for simulation. |
extend |
Extend possible stratified baselines to largest endpoint. |
cens |
Censoring specification (matrix = cumulative hazard, scalar = rate). |
rrc |
Relative risks for Cox-type censoring. |
... |
Arguments for |
Data frame with simulated event times, status, and covariates.
Thomas Scheike
data(sTRACE) nsim <- 100 coxs <- phreg(Surv(time,status==9)~strata(chf)+vf+wmi,data=sTRACE) set.seed(100) sim3 <- sim_phreg(coxs,nsim,data=sTRACE) head(sim3) cc <- phreg(Surv(time,status)~strata(chf)+vf+wmi,data=sim3) cbind(coxs$coef,cc$coef) plot(coxs,col=1); plot(cc,add=TRUE,col=2) Z <- sim3[,c("vf","chf","wmi")] strata <- sim3[,c("chf")] rr <- exp(as.matrix(Z[,-2]) %*% coef(coxs)) sim4 <- sim_phreg(coxs,nsim,data=NULL,rr=rr,strata=strata) sim4 <- cbind(sim4,Z) cc <- phreg(Surv(time,status)~strata(chf)+vf+wmi,data=sim4) cbind(coxs$coef,cc$coef) plot(coxs,col=1); plot(cc,add=TRUE,col=2)data(sTRACE) nsim <- 100 coxs <- phreg(Surv(time,status==9)~strata(chf)+vf+wmi,data=sTRACE) set.seed(100) sim3 <- sim_phreg(coxs,nsim,data=sTRACE) head(sim3) cc <- phreg(Surv(time,status)~strata(chf)+vf+wmi,data=sim3) cbind(coxs$coef,cc$coef) plot(coxs,col=1); plot(cc,add=TRUE,col=2) Z <- sim3[,c("vf","chf","wmi")] strata <- sim3[,c("chf")] rr <- exp(as.matrix(Z[,-2]) %*% coef(coxs)) sim4 <- sim_phreg(coxs,nsim,data=NULL,rr=rr,strata=strata) sim4 <- cbind(sim4,Z) cc <- phreg(Surv(time,status)~strata(chf)+vf+wmi,data=sim4) cbind(coxs$coef,cc$coef) plot(coxs,col=1); plot(cc,add=TRUE,col=2)
Simulates data that looks like fit from cause-specific Cox models. Censors data automatically. When censoring is given in the list of causes, this provides censoring that looks like the data.
sim_phregs( coxs, n, data = NULL, rr = NULL, strata = NULL, entry = NULL, extend = TRUE, cens = NULL, rrc = NULL, ... )sim_phregs( coxs, n, data = NULL, rr = NULL, strata = NULL, entry = NULL, extend = TRUE, cens = NULL, rrc = NULL, ... )
coxs |
List of Cox models. |
n |
Number of simulations. |
data |
Data frame to extract covariates. |
rr |
Relative risks. |
strata |
Strata vector. |
entry |
Delayed entry. |
extend |
Extend baselines to largest endpoint. |
cens |
Censoring specification. |
rrc |
Relative risks for censoring. |
... |
Arguments for |
Data frame with simulated event times, status, and covariates.
Thomas Scheike
data(bmt) nsim <- 100; cox1 <- phreg(Surv(time,cause==1)~strata(tcell)+platelet+age,data=bmt) cox2 <- phreg(Surv(time,cause==2)~tcell+strata(platelet),data=bmt) coxs <- list(cox1,cox2) ## just calls sim_phregs ! dd <- sim_phregs(coxs,nsim,data=bmt,extend=c(0.001)) scox1 <- phreg(Surv(time,cause==1)~strata(tcell)+platelet+age,data=dd) scox2 <- phreg(Surv(time,cause==2)~tcell+strata(platelet),data=dd) cbind(cox1$coef,scox1$coef) cbind(cox2$coef,scox2$coef) par(mfrow=c(1,2)) plot(cox1); plot(scox1,add=TRUE); plot(cox2); plot(scox2,add=TRUE);data(bmt) nsim <- 100; cox1 <- phreg(Surv(time,cause==1)~strata(tcell)+platelet+age,data=bmt) cox2 <- phreg(Surv(time,cause==2)~tcell+strata(platelet),data=bmt) coxs <- list(cox1,cox2) ## just calls sim_phregs ! dd <- sim_phregs(coxs,nsim,data=bmt,extend=c(0.001)) scox1 <- phreg(Surv(time,cause==1)~strata(tcell)+platelet+age,data=dd) scox2 <- phreg(Surv(time,cause==2)~tcell+strata(platelet),data=dd) cbind(cox1$coef,scox1$coef) cbind(cox2$coef,scox2$coef) par(mfrow=c(1,2)) plot(cox1); plot(scox1,add=TRUE); plot(cox2); plot(scox2,add=TRUE);
A convenience wrapper around sim_recurrentII for the common
case of a single recurrent event type. Frailty and censoring options are
passed through to sim_recurrent_list.
sim_recurrent( n, cumhaz, death.cumhaz = NULL, r1 = NULL, rd = NULL, rc = NULL, ... )sim_recurrent( n, cumhaz, death.cumhaz = NULL, r1 = NULL, rd = NULL, rc = NULL, ... )
n |
Number of subjects to simulate. |
cumhaz |
Two-column matrix |
death.cumhaz |
Two-column matrix |
r1 |
Optional numeric vector of length |
rd |
Optional numeric vector of length |
rc |
Optional numeric vector of length |
... |
Further arguments passed to |
Thomas Scheike
data(CPH_HPN_CRBSI) dr <- CPH_HPN_CRBSI$terminal base1 <- CPH_HPN_CRBSI$crbsi base4 <- CPH_HPN_CRBSI$mechanical ## Single recurrent event type, with and without terminal event rr <- sim_recurrent(5, base1) dlist(rr, . ~ id, n = 0) rr <- sim_recurrent(5, base1, death.cumhaz = dr) dlist(rr, . ~ id, n = 0) ## Verify that estimated rates recover the true baselines (increase n for precision) rr <- sim_recurrent(100, base1, death.cumhaz = dr) par(mfrow = c(1, 3)) mets:::showfitsim(causes = 1, rr, dr, base1, base1) ## Shared frailty across all processes rr <- sim_recurrent(100, base1, death.cumhaz = dr, dependence = 1, var.z = 0.4) dtable(rr, ~death + status) ## Two event types; second type uses the mechanical complication rate set.seed(100) rr <- sim_recurrentII(100, base1, base4, death.cumhaz = dr) dtable(rr, ~death + status) par(mfrow = c(2, 2)) mets:::showfitsim(causes = 2, rr, dr, base1, base4) ## Three event types and two causes of death via sim_recurrent_list set.seed(100) cumhaz <- list(base1, base1, base4) drl <- list(dr, base4) rr <- sim_recurrent_list(100, cumhaz, death.cumhaz = drl, dependence = 0) dtable(rr, ~death + status) mets:::showfitsimList(rr, cumhaz, drl)data(CPH_HPN_CRBSI) dr <- CPH_HPN_CRBSI$terminal base1 <- CPH_HPN_CRBSI$crbsi base4 <- CPH_HPN_CRBSI$mechanical ## Single recurrent event type, with and without terminal event rr <- sim_recurrent(5, base1) dlist(rr, . ~ id, n = 0) rr <- sim_recurrent(5, base1, death.cumhaz = dr) dlist(rr, . ~ id, n = 0) ## Verify that estimated rates recover the true baselines (increase n for precision) rr <- sim_recurrent(100, base1, death.cumhaz = dr) par(mfrow = c(1, 3)) mets:::showfitsim(causes = 1, rr, dr, base1, base1) ## Shared frailty across all processes rr <- sim_recurrent(100, base1, death.cumhaz = dr, dependence = 1, var.z = 0.4) dtable(rr, ~death + status) ## Two event types; second type uses the mechanical complication rate set.seed(100) rr <- sim_recurrentII(100, base1, base4, death.cumhaz = dr) dtable(rr, ~death + status) par(mfrow = c(2, 2)) mets:::showfitsim(causes = 2, rr, dr, base1, base4) ## Three event types and two causes of death via sim_recurrent_list set.seed(100) cumhaz <- list(base1, base1, base4) drl <- list(dr, base4) rr <- sim_recurrent_list(100, cumhaz, death.cumhaz = drl, dependence = 0) dtable(rr, ~death + status) mets:::showfitsimList(rr, cumhaz, drl)
Simulates recurrent event data from a fitted two-stage model, where the
recurrent event process and the terminal event are each described by a
separate fitted model. The recurrent event model may be either a Cox
proportional hazards model (phreg) or a Ghosh-Lin marginal rate model
(recreg); the terminal event model must be a Cox model (phreg).
sim_recurrent_ts( cox1, coxd = NULL, n = 1, data = NULL, type = c("default", "cox-cox", "gl-cox"), id = "id", varz = 1, share = 1, cens = 0.001, scale1 = 1, scaled = 1, dependence = NULL, r1 = NULL, rd = NULL, rc = NULL, strata1 = NULL, stratad = NULL, death.code = 3, ... )sim_recurrent_ts( cox1, coxd = NULL, n = 1, data = NULL, type = c("default", "cox-cox", "gl-cox"), id = "id", varz = 1, share = 1, cens = 0.001, scale1 = 1, scaled = 1, dependence = NULL, r1 = NULL, rd = NULL, rc = NULL, strata1 = NULL, stratad = NULL, death.code = 3, ... )
cox1 |
A fitted |
coxd |
A fitted |
n |
Number of subjects to simulate. Default is |
data |
The data frame on which |
type |
Simulation type: |
id |
Name of the subject identifier variable in |
varz |
Variance of the frailty distribution in the two-stage model.
Default is |
share |
Proportion of the shared frailty assigned to the recurrent event
process in the partial-sharing model. Default is |
cens |
Rate of exponential censoring. Default is |
scale1 |
Scalar multiplier for the baseline cumulative hazard of the
recurrent event process. Default is |
scaled |
Scalar multiplier for the baseline cumulative hazard of the
terminal event. Default is |
dependence |
If non- |
r1 |
Optional numeric vector of length |
rd |
Optional numeric vector of length |
rc |
Optional numeric vector of length |
strata1 |
Optional integer vector of length |
stratad |
Optional integer vector of length |
death.code |
Integer status code used for the terminal event in the
output |
... |
Further arguments passed to |
Covariates are drawn by bootstrap from data (if supplied), and
subject-specific relative risks and strata are derived from the fitted
model objects. Stratified baselines are fully supported. When
dependence is NULL (default), the simulation uses the
two-stage structure from sim_GLcox; setting dependence
to an integer falls back to sim_recurrent_list with the
corresponding frailty model.
A data frame in counting-process format with one row per event
interval per subject. Column names match those in the original model
formula (entry, exit, and status variables). Additional columns include
id and the covariates drawn from data (if supplied). The
terminal event is coded as death.code in the status variable;
recurrent events are coded as 1.
Thomas Scheike
Scheike, T. H. (2026). Two-stage recurrent events random effects models. Lifetime Data Analysis.
recurrent_marginal, sim_recurrent_list,
sim_GLcox
data(hfactioncpx12) hf <- hfactioncpx12 hf$x <- as.numeric(hf$treatment) n <- 100 ## Cox-Cox two-stage model xr <- phreg(Surv(entry, time, status == 1) ~ x + cluster(id), data = hf) dr <- phreg(Surv(entry, time, status == 2) ~ x + cluster(id), data = hf) simcoxcox <- sim_recurrent_ts(xr, dr, n = n, data = hf, death.code = 2) ## Ghosh-Lin/Cox two-stage model recGL <- recreg(Event(entry, time, status) ~ x + cluster(id), hf, death.code = 2) simglcox <- sim_recurrent_ts(recGL, dr, n = n, data = hf, death.code = 2)data(hfactioncpx12) hf <- hfactioncpx12 hf$x <- as.numeric(hf$treatment) n <- 100 ## Cox-Cox two-stage model xr <- phreg(Surv(entry, time, status == 1) ~ x + cluster(id), data = hf) dr <- phreg(Surv(entry, time, status == 2) ~ x + cluster(id), data = hf) simcoxcox <- sim_recurrent_ts(xr, dr, n = n, data = hf, death.code = 2) ## Ghosh-Lin/Cox two-stage model recGL <- recreg(Event(entry, time, status) ~ x + cluster(id), hf, death.code = 2) simglcox <- sim_recurrent_ts(recGL, dr, n = n, data = hf, death.code = 2)
Simulates recurrent event data with up to two distinct event types and an optional terminal event (death), based on user-supplied cumulative hazard functions. Dependence between processes can be introduced via shared or correlated gamma-distributed frailties.
sim_recurrentII( n, cumhaz, cumhaz2, death.cumhaz = NULL, r1 = NULL, r2 = NULL, rd = NULL, rc = NULL, dependence = 0, var.z = 1, cor.mat = NULL, cens = NULL, gap.time = FALSE, max.recurrent = 100, ... )sim_recurrentII( n, cumhaz, cumhaz2, death.cumhaz = NULL, r1 = NULL, r2 = NULL, rd = NULL, rc = NULL, dependence = 0, var.z = 1, cor.mat = NULL, cens = NULL, gap.time = FALSE, max.recurrent = 100, ... )
n |
Number of subjects to simulate. |
cumhaz |
Two-column matrix |
cumhaz2 |
Two-column matrix |
death.cumhaz |
Two-column matrix |
r1 |
Optional numeric vector of length |
r2 |
Optional numeric vector of length |
rd |
Optional numeric vector of length |
rc |
Optional numeric vector of length |
dependence |
Integer specifying the frailty structure. One of |
var.z |
Variance of the gamma-distributed frailty. Default is |
cor.mat |
Correlation matrix for the random effects. Used when
|
cens |
Rate of exponential censoring. If |
gap.time |
Logical. If |
max.recurrent |
Maximum number of recurrent events allowed per subject.
Default is |
... |
Further arguments passed to |
The simulation proceeds by sequentially drawing the next event time from the specified cumulative hazards, taking the minimum of the two recurrent event times, and stopping each subject at death or administrative censoring.
Dependence between processes is controlled by dependence:
0Independence: all subjects have frailty fixed at 1.
1Shared frailty: all processes share a single
gamma-distributed random effect with mean 1 and variance var.z.
4Recurrent-event frailty only: the two recurrent event processes share a gamma frailty but the terminal event is independent.
For more complex correlation structures across two event types and death, use
sim_recurrentTS.
A data frame in counting-process format (one row per event interval per subject) with columns:
id |
Subject identifier. |
start, entry
|
Interval start time. |
stop, time
|
Interval end time (event or censoring time). |
status |
Event type at |
death |
Indicator for a terminal event ( |
Attributes "cumhaz", "death.cumhaz", "rr", and
"rd" store the inputs used for simulation.
Thomas Scheike
sim_recurrent, sim_recurrent_list,
sim_recurrentTS
data(CPH_HPN_CRBSI) dr <- CPH_HPN_CRBSI$terminal base1 <- CPH_HPN_CRBSI$crbsi base4 <- CPH_HPN_CRBSI$mechanical ## Single recurrent event type, with and without terminal event rr <- sim_recurrent(5, base1) dlist(rr, . ~ id, n = 0) rr <- sim_recurrent(5, base1, death.cumhaz = dr) dlist(rr, . ~ id, n = 0) ## Verify that estimated rates recover the true baselines (increase n for precision) rr <- sim_recurrent(100, base1, death.cumhaz = dr) par(mfrow = c(1, 3)) mets:::showfitsim(causes = 1, rr, dr, base1, base1) ## Shared frailty across all processes rr <- sim_recurrent(100, base1, death.cumhaz = dr, dependence = 1, var.z = 0.4) dtable(rr, ~death + status) ## Two event types; second type uses the mechanical complication rate set.seed(100) rr <- sim_recurrentII(100, base1, base4, death.cumhaz = dr) dtable(rr, ~death + status) par(mfrow = c(2, 2)) mets:::showfitsim(causes = 2, rr, dr, base1, base4) ## Three event types and two causes of death via sim_recurrent_list set.seed(100) cumhaz <- list(base1, base1, base4) drl <- list(dr, base4) rr <- sim_recurrent_list(100, cumhaz, death.cumhaz = drl, dependence = 0) dtable(rr, ~death + status) mets:::showfitsimList(rr, cumhaz, drl)data(CPH_HPN_CRBSI) dr <- CPH_HPN_CRBSI$terminal base1 <- CPH_HPN_CRBSI$crbsi base4 <- CPH_HPN_CRBSI$mechanical ## Single recurrent event type, with and without terminal event rr <- sim_recurrent(5, base1) dlist(rr, . ~ id, n = 0) rr <- sim_recurrent(5, base1, death.cumhaz = dr) dlist(rr, . ~ id, n = 0) ## Verify that estimated rates recover the true baselines (increase n for precision) rr <- sim_recurrent(100, base1, death.cumhaz = dr) par(mfrow = c(1, 3)) mets:::showfitsim(causes = 1, rr, dr, base1, base1) ## Shared frailty across all processes rr <- sim_recurrent(100, base1, death.cumhaz = dr, dependence = 1, var.z = 0.4) dtable(rr, ~death + status) ## Two event types; second type uses the mechanical complication rate set.seed(100) rr <- sim_recurrentII(100, base1, base4, death.cumhaz = dr) dtable(rr, ~death + status) par(mfrow = c(2, 2)) mets:::showfitsim(causes = 2, rr, dr, base1, base4) ## Three event types and two causes of death via sim_recurrent_list set.seed(100) cumhaz <- list(base1, base1, base4) drl <- list(dr, base4) rr <- sim_recurrent_list(100, cumhaz, death.cumhaz = drl, dependence = 0) dtable(rr, ~death + status) mets:::showfitsimList(rr, cumhaz, drl)
Simulates recurrent event data with two event types and a terminal event,
using a parametric two-stage frailty model. The construction ensures that the
marginal rates are approximately correct: conditional on survival,
cumhazj, and the hazard of death
equals death.cumhaz.
sim_recurrentTS( n, cumhaz, cumhaz2, death.cumhaz = NULL, nu = rep(1, 3), share1 = 0.3, vargamD = 2, vargam12 = 0.5, gap.time = FALSE, max.recurrent = 100, cens = NULL, ... )sim_recurrentTS( n, cumhaz, cumhaz2, death.cumhaz = NULL, nu = rep(1, 3), share1 = 0.3, vargamD = 2, vargam12 = 0.5, gap.time = FALSE, max.recurrent = 100, cens = NULL, ... )
n |
Number of subjects to simulate. |
cumhaz |
Two-column matrix |
cumhaz2 |
Two-column matrix |
death.cumhaz |
Two-column matrix |
nu |
Numeric vector of length 3: the powers |
share1 |
Proportion of the total death frailty variance assigned to the
first component |
vargamD |
Total variance of the death frailty |
vargam12 |
Variance of the shared recurrent-event frailty |
gap.time |
Logical. If |
max.recurrent |
Maximum number of recurrent events per subject. Default
is |
cens |
Rate of exponential censoring. If |
... |
Further arguments passed to lower-level simulation functions. |
The frailty structure uses three gamma random variables ,
, to induce dependence:
The parameters share1 and vargamD control how the death frailty
splits between the two components, and vargam12 controls the shared
recurrent-event frailty. Setting with share1 = 0.5
gives a symmetric structure; varying allows asymmetric dependence.
A data frame in counting-process format (one row per event interval
per subject) with columns id, start, stop,
entry, time, status, and death. Attributes
store the (possibly adjusted) cumulative hazards used in simulation and
the frailty parameters.
Thomas Scheike
sim_recurrentII, sim_recurrent_ts
data(CPH_HPN_CRBSI) dr <- CPH_HPN_CRBSI$terminal base1 <- CPH_HPN_CRBSI$crbsi base4 <- CPH_HPN_CRBSI$mechanical rr <- sim_recurrentTS(1000, base1, base4, death.cumhaz = dr) dtable(rr, ~death + status) mets:::showfitsim(causes = 2, rr, dr, base1, base4)data(CPH_HPN_CRBSI) dr <- CPH_HPN_CRBSI$terminal base1 <- CPH_HPN_CRBSI$crbsi base4 <- CPH_HPN_CRBSI$mechanical rr <- sim_recurrentTS(1000, base1, base4, death.cumhaz = dr) dtable(rr, ~death + status) mets:::showfitsim(causes = 2, rr, dr, base1, base4)
Low-level helper functions for computing sums, cumulative sums, and reverse cumulative sums within strata groups, as well as matrix double-indexing.
sumstrata(x, strata, nstrata) cumsumstrata(x, strata, nstrata) revcumsumstrata(x, strata, nstrata) revcumsum(x) matdoubleindex(x, rows, cols, xvec = NULL) mdi(x, ...)sumstrata(x, strata, nstrata) cumsumstrata(x, strata, nstrata) revcumsumstrata(x, strata, nstrata) revcumsum(x) matdoubleindex(x, rows, cols, xvec = NULL) mdi(x, ...)
x |
numeric vector (or matrix for |
strata |
integer vector of strata indices (0-based). |
nstrata |
number of distinct strata. |
rows |
row indices for |
cols |
column indices for |
xvec |
optional values to assign at indexed positions. |
... |
additional arguments (for |
Numeric vector of the same length as x (or modified matrix).
Klaus K. Holst, Thomas Scheike
Computes concordance and casewise concordance for dependence models for competing risks models of the type cor_cif, rr_cif or or_cif for the given cumulative incidences and the different dependence measures in the object.
## S3 method for class 'cor' summary(object, marg.cif = NULL, marg.cif2 = NULL, digits = 3, ...)## S3 method for class 'cor' summary(object, marg.cif = NULL, marg.cif2 = NULL, digits = 3, ...)
object |
object from cor_cif rr_cif or or_cif for dependence between competing risks data for two causes. |
marg.cif |
a number that gives the cumulative incidence in one time point for which concordance and casewise concordance are computed. |
marg.cif2 |
the cumulative incidence for cause 2 for concordance and casewise concordance are computed. Default is that it is the same as marg.cif. |
digits |
digits in output. |
... |
Additional arguments. |
prints summary for dependence model.
casewise |
gives casewise concordance that is, probability of cause 2 (related to cif2) given that cause 1 (related to cif1) has occured. |
concordance |
gives concordance that is, probability of cause 2 (related to cif2) and cause 1 (related to cif1). |
cif1 |
cumulative incidence for cause1. |
cif2 |
cumulative incidence for cause1. |
Thomas Scheike
Cross odds ratio Modelling of dependence for Multivariate Competing Risks Data, Scheike and Sun (2012), Biostatistics.
A Semiparametric Random Effects Model for Multivariate Competing Risks Data, Scheike, Zhang, Sun, Jensen (2010), Biometrika.
## library("timereg") ## data("multcif",package="mets") # simulated data ## multcif$cause[multcif$cause==0] <- 2 ## ## times=seq(0.1,3,by=0.1) # to speed up computations use only these time-points ## add <- timereg::comp.risk(Event(time,cause)~+1+cluster(id), ## data=multcif,n.sim=0,times=times,cause=1) ### ## out1<- cor_cif(add,data=multcif,cause1=1,cause2=1,theta=log(2+1)) ## summary(out1) ## ## pad <- predict(add,X=1,se=0,uniform=0) ## summary(out1,marg.cif=pad)## library("timereg") ## data("multcif",package="mets") # simulated data ## multcif$cause[multcif$cause==0] <- 2 ## ## times=seq(0.1,3,by=0.1) # to speed up computations use only these time-points ## add <- timereg::comp.risk(Event(time,cause)~+1+cluster(id), ## data=multcif,n.sim=0,times=times,cause=1) ### ## out1<- cor_cif(add,data=multcif,cause1=1,cause2=1,theta=log(2+1)) ## summary(out1) ## ## pad <- predict(add,X=1,se=0,uniform=0) ## summary(out1,marg.cif=pad)
Reporting OR from glm with binomial link and glm predictions
summaryGLM(object, id = NULL, fun = NULL, ...)summaryGLM(object, id = NULL, fun = NULL, ...)
object |
glm output |
id |
possible id for cluster corrected standard errors |
fun |
possible function for non-standard predictions based on object |
... |
arguments of estimate of lava for example level=0.95 |
Thomas Scheike
data(sTRACE) sTRACE$id <- sample(1:100,nrow(sTRACE),replace=TRUE) model <- glm(I(status==9)~sex+factor(diabetes)+age,data=sTRACE,family=binomial) summaryGLM(model) summaryGLM(model,id=sTRACE$id) nd <- data.frame(sex=c(0,1),age=67,diabetes=1) predictGLM(model,nd)data(sTRACE) sTRACE$id <- sample(1:100,nrow(sTRACE),replace=TRUE) model <- glm(I(status==9)~sex+factor(diabetes)+age,data=sTRACE,family=binomial) summaryGLM(model) summaryGLM(model,id=sTRACE$id) nd <- data.frame(sex=c(0,1),age=67,diabetes=1) predictGLM(model,nd)
Extracts estimates at specified time points with confidence intervals.
summaryTimeobject( mutimes, mu, se.mu = NULL, times = NULL, type = "log", level = 0.95, ... )summaryTimeobject( mutimes, mu, se.mu = NULL, times = NULL, type = "log", level = 0.95, ... )
mutimes |
vector of estimation time points. |
mu |
vector or matrix of estimates. |
se.mu |
vector or matrix of standard errors (optional). |
times |
time points at which to evaluate (default: |
type |
confidence interval type: |
level |
confidence level (default 0.95). |
... |
additional arguments. |
A data.frame with columns: times, mean, se-mean, CI-2.5%, CI-97.5%.
Restricts bivariate survival data to a rectangular time region defined by left-truncation and right-censoring boundaries, for use with piecewise twostage models.
surv_boxarea( left.trunc, right.cens, data, timevar = "time", status = "status", id = "id", covars = NULL, covars.pairs = NULL, num = NULL, silent = 1, boxtimevar = "boxtime" )surv_boxarea( left.trunc, right.cens, data, timevar = "time", status = "status", id = "id", covars = NULL, covars.pairs = NULL, num = NULL, silent = 1, boxtimevar = "boxtime" )
left.trunc |
vector of length 2 giving left truncation times. |
right.cens |
vector of length 2 giving right censoring times. |
data |
a data.frame with the survival data. |
timevar |
name of the time variable. |
status |
name of the status variable. |
id |
name of the cluster identifier. |
covars |
optional covariate names. |
covars.pairs |
optional pair-level covariate names. |
num |
within-cluster numbering variable name. |
silent |
verbosity level (1=silent). |
boxtimevar |
name for the created box-time variable. |
A data.frame in long format restricted to the specified region.
Fits Clayton-Oakes or bivariate Plackett models for bivariate survival data using marginals that are on Cox form. The dependence can be modelled via:
Regression design on dependence parameter.
Random effects, additive gamma model.
survival_twostage( margsurv, data = NULL, method = "nr", detail = 0, clusters = NULL, silent = 1, weights = NULL, theta = NULL, theta.des = NULL, var.link = 1, baseline.iid = 1, model = "clayton.oakes", marginal.trunc = NULL, marginal.survival = NULL, strata = NULL, se.clusters = NULL, numDeriv = 1, random.design = NULL, pairs = NULL, dim.theta = NULL, numDeriv.method = "simple", additive.gamma.sum = NULL, var.par = 1, no.opt = FALSE, ... )survival_twostage( margsurv, data = NULL, method = "nr", detail = 0, clusters = NULL, silent = 1, weights = NULL, theta = NULL, theta.des = NULL, var.link = 1, baseline.iid = 1, model = "clayton.oakes", marginal.trunc = NULL, marginal.survival = NULL, strata = NULL, se.clusters = NULL, numDeriv = 1, random.design = NULL, pairs = NULL, dim.theta = NULL, numDeriv.method = "simple", additive.gamma.sum = NULL, var.par = 1, no.opt = FALSE, ... )
margsurv |
Marginal model object. |
data |
Data frame (must be given). |
method |
Scoring method: |
detail |
Detail level for output. |
clusters |
Cluster variable. |
silent |
Debug information level. |
weights |
Weights for score equations. |
theta |
Starting values for variance components. |
theta.des |
Design for dependence parameters; when pairs are given, the indices of the theta-design for this pair are given in pairs as column 5. |
var.link |
Link function for variance: 1 for exponential link. |
baseline.iid |
To adjust for baseline estimation, using |
model |
Model type: |
marginal.trunc |
Marginal left truncation probabilities. |
marginal.survival |
Optional vector of marginal survival probabilities. |
strata |
Strata for fitting (see examples). |
se.clusters |
Clusters for SE calculation with IID. |
numDeriv |
To get numDeriv version of second derivative; otherwise uses sum of squared scores for each pair. |
random.design |
Random effect design for additive gamma model; when pairs are given, the indices of the pairs' random.design rows are given as columns 3:4. |
pairs |
Matrix with rows of indices (two-columns) for the pairs considered in the pairwise composite score; useful for case-control sampling when marginal is known. |
dim.theta |
Dimension of the theta parameter for pairs situation. |
numDeriv.method |
Method for numerical derivative (e.g., |
additive.gamma.sum |
For |
var.par |
Is 1 for the default parametrization with the variances of the random effects;
|
no.opt |
For not optimizing. |
... |
Additional arguments to maximizer NR of lava. |
If clusters contain more than two subjects, a composite likelihood based on
pairwise bivariate models is used (for full MLE see twostageMLE).
The two-stage model is constructed such that, given gamma distributed random effects, the survival functions are assumed independent, and the marginal survival functions are on Cox form:
One possibility is to model the variance within clusters via a regression design, specifying a regression structure for the independent gamma distributed random effect for each cluster, such that the variance is given by:
where is specified by theta.des, and a possible link function
var.link=1 will use the exponential link , and
var.link=0 the identity link .
The reported standard errors are based on the estimated information from the
likelihood assuming that the marginals are known (unlike twostageMLE
and for the additive gamma model below).
Can also fit a structured additive gamma random effects model, such as the
ACE, ADE model for survival data. In this case, the random.design
specifies the random effects for each subject within a cluster. This is a
matrix of 1's and 0's with dimension (for random effects).
For a cluster with two subjects, the random.design rows are
and . Such that the random effect for subject 1 is
, for random effects. Each random effect has an
associated parameter . By construction,
subject 1's random effect is Gamma distributed with mean
and variance . Note that the random effect has mean
1 and variance . It is assumed that is fixed within clusters as it would be for the ACE model.
Based on these parameters, the relative contribution (the heritability, ) is
equivalent to the expected values of the random effects: .
The DEFAULT parametrization (var.par=1) uses the variances of the random effects:
For alternative parametrizations, specify how the parameters relate to
with the argument var.par=0.
For both types of models, the basic model assumptions are that given the random effects of the clusters, the survival distributions within a cluster are independent and on the form:
with the inverse Laplace of
the gamma distribution with mean 1 and variance .
The parameters are related to the parameters of the model
by a regression construction pard (), that links the
parameters with the underlying parameters:
here using theta.des to specify these low-dimension associations. Default is a diagonal matrix.
This can be used to make structural assumptions about the variances of the random-effects
as is needed for the ACE model for example.
The case.control option can be used with the pair specification of the pairwise parts
of the estimating equations. Here it is assumed that the second subject of each pair is the proband.
An object of class "mets.twostage" containing:
theta |
Estimated dependence parameters. |
coef |
Coefficients. |
score |
Score vector. |
hess |
Hessian matrix. |
hessi |
Inverse Hessian matrix. |
var.theta |
Variance of theta parameters. |
robvar.theta |
Robust variance of theta parameters. |
loglike |
Log-likelihood value. |
theta.iid |
Influence functions for theta. |
model |
Model type used. |
Thomas Scheike
Twostage estimation of additive gamma frailty models for survival data. Scheike (2019), work in progress.
Shih and Louis (1995) Inference on the association parameter in copula models for bivariate survival data, Biometrics.
Glidden (2000), A Two-Stage estimator of the dependence parameter for the Clayton Oakes model, LIDA.
Measuring early or late dependence for bivariate twin data. Scheike, Holst, Hjelmborg (2015), LIDA.
Estimating heritability for cause specific mortality based on twins studies. Scheike, Holst, Hjelmborg (2014), LIDA.
Additive Gamma frailty models for competing risks data. Eriksson and Scheike (2015), Biometrics.
data(diabetes) # Marginal Cox model with treat as covariate margph <- phreg(Surv(time,status)~treat+cluster(id),data=diabetes) ### Clayton-Oakes, MLE fitco1<-twostageMLE(margph,data=diabetes,theta=1.0) summary(fitco1) ## Reduce Ex.Timings ### Plackett model mph <- phreg(Surv(time,status)~treat+cluster(id),data=diabetes) fitp <- survival_twostage(mph,data=diabetes,theta=3.0,Nit=40, clusters=diabetes$id,var.link=1,model="plackett") summary(fitp) ### Clayton-Oakes fitco2 <- survival_twostage(mph,data=diabetes,theta=0.0,detail=0, clusters=diabetes$id,var.link=1,model="clayton.oakes") summary(fitco2) fitco3 <- survival_twostage(margph,data=diabetes,theta=1.0,detail=0, clusters=diabetes$id,var.link=0,model="clayton.oakes") summary(fitco3) ### without covariates but with stratafied marg <- phreg(Surv(time,status)~+strata(treat)+cluster(id),data=diabetes) fitpa <- survival_twostage(marg,data=diabetes,theta=1.0, clusters=diabetes$id,model="clayton.oakes") summary(fitpa) ### Piecewise constant cross hazards ratio modelling ######################################################## d <- subset(sim_ClaytonOakes(1000,2,0.5,0,stoptime=2,left=0),!truncated) udp <- piecewise_twostage(c(0,0.5,2),data=d,method="optimize", id="cluster",timevar="time", status="status",model="clayton.oakes",silent=0) summary(udp) ## Reduce Ex.Timings ### Same model using the strata option, a bit slower ######################################################## ## makes the survival pieces for different areas in the plane ##ud1=surv_boxarea(c(0,0),c(0.5,0.5),data=d,id="cluster",timevar="time",status="status") ##ud2=surv_boxarea(c(0,0.5),c(0.5,2),data=d,id="cluster",timevar="time",status="status") ##ud3=surv_boxarea(c(0.5,0),c(2,0.5),data=d,id="cluster",timevar="time",status="status") ##ud4=surv_boxarea(c(0.5,0.5),c(2,2),data=d,id="cluster",timevar="time",status="status") ## everything done in one call ud <- piecewise_data(c(0,0.5,2),data=d,timevar="time",status="status",id="cluster") ud$strata <- factor(ud$strata); ud$intstrata <- factor(ud$intstrata) ## makes strata specific id variable to identify pairs within strata ## se's computed based on the id variable across strata "cluster" ud$idstrata <- ud$id+(as.numeric(ud$strata)-1)*2000 marg2 <- timereg::aalen(Surv(boxtime,status)~-1+factor(num):factor(intstrata), data=ud,n.sim=0,robust=0) tdes <- model.matrix(~-1+factor(strata),data=ud) fitp2 <- survival_twostage(marg2,data=ud,se.clusters=ud$cluster,clusters=ud$idstrata, model="clayton.oakes",theta.des=tdes,step=0.5) summary(fitp2) ### now fitting the model with symmetry, i.e. strata 2 and 3 same effect ud$stratas <- ud$strata; ud$stratas[ud$strata=="0.5-2,0-0.5"] <- "0-0.5,0.5-2" tdes2 <- model.matrix(~-1+factor(stratas),data=ud) fitp3 <- survival_twostage(marg2,data=ud,clusters=ud$idstrata,se.cluster=ud$cluster, model="clayton.oakes",theta.des=tdes2,step=0.5) summary(fitp3) ### same model using strata option, a bit slower fitp4 <- survival_twostage(marg2,data=ud,clusters=ud$cluster,se.cluster=ud$cluster, model="clayton.oakes",theta.des=tdes2,step=0.5,strata=ud$strata) summary(fitp4) ## Reduce Ex.Timings ### structured random effects model additive gamma ACE ### simulate structured two-stage additive gamma ACE model data <- sim_ClaytonOakes_twin_ace(2000,2,1,0,3) out <- twin_polygen_design(data,id="cluster") pardes <- out$pardes pardes des.rv <- out$des.rv head(des.rv) aa <- phreg(Surv(time,status)~x+cluster(cluster),data=data,robust=0) ts <- survival_twostage(aa,data=data,clusters=data$cluster,detail=0, theta=c(2,1),var.link=0,step=0.5, random.design=des.rv,theta.des=pardes) summary(ts)data(diabetes) # Marginal Cox model with treat as covariate margph <- phreg(Surv(time,status)~treat+cluster(id),data=diabetes) ### Clayton-Oakes, MLE fitco1<-twostageMLE(margph,data=diabetes,theta=1.0) summary(fitco1) ## Reduce Ex.Timings ### Plackett model mph <- phreg(Surv(time,status)~treat+cluster(id),data=diabetes) fitp <- survival_twostage(mph,data=diabetes,theta=3.0,Nit=40, clusters=diabetes$id,var.link=1,model="plackett") summary(fitp) ### Clayton-Oakes fitco2 <- survival_twostage(mph,data=diabetes,theta=0.0,detail=0, clusters=diabetes$id,var.link=1,model="clayton.oakes") summary(fitco2) fitco3 <- survival_twostage(margph,data=diabetes,theta=1.0,detail=0, clusters=diabetes$id,var.link=0,model="clayton.oakes") summary(fitco3) ### without covariates but with stratafied marg <- phreg(Surv(time,status)~+strata(treat)+cluster(id),data=diabetes) fitpa <- survival_twostage(marg,data=diabetes,theta=1.0, clusters=diabetes$id,model="clayton.oakes") summary(fitpa) ### Piecewise constant cross hazards ratio modelling ######################################################## d <- subset(sim_ClaytonOakes(1000,2,0.5,0,stoptime=2,left=0),!truncated) udp <- piecewise_twostage(c(0,0.5,2),data=d,method="optimize", id="cluster",timevar="time", status="status",model="clayton.oakes",silent=0) summary(udp) ## Reduce Ex.Timings ### Same model using the strata option, a bit slower ######################################################## ## makes the survival pieces for different areas in the plane ##ud1=surv_boxarea(c(0,0),c(0.5,0.5),data=d,id="cluster",timevar="time",status="status") ##ud2=surv_boxarea(c(0,0.5),c(0.5,2),data=d,id="cluster",timevar="time",status="status") ##ud3=surv_boxarea(c(0.5,0),c(2,0.5),data=d,id="cluster",timevar="time",status="status") ##ud4=surv_boxarea(c(0.5,0.5),c(2,2),data=d,id="cluster",timevar="time",status="status") ## everything done in one call ud <- piecewise_data(c(0,0.5,2),data=d,timevar="time",status="status",id="cluster") ud$strata <- factor(ud$strata); ud$intstrata <- factor(ud$intstrata) ## makes strata specific id variable to identify pairs within strata ## se's computed based on the id variable across strata "cluster" ud$idstrata <- ud$id+(as.numeric(ud$strata)-1)*2000 marg2 <- timereg::aalen(Surv(boxtime,status)~-1+factor(num):factor(intstrata), data=ud,n.sim=0,robust=0) tdes <- model.matrix(~-1+factor(strata),data=ud) fitp2 <- survival_twostage(marg2,data=ud,se.clusters=ud$cluster,clusters=ud$idstrata, model="clayton.oakes",theta.des=tdes,step=0.5) summary(fitp2) ### now fitting the model with symmetry, i.e. strata 2 and 3 same effect ud$stratas <- ud$strata; ud$stratas[ud$strata=="0.5-2,0-0.5"] <- "0-0.5,0.5-2" tdes2 <- model.matrix(~-1+factor(stratas),data=ud) fitp3 <- survival_twostage(marg2,data=ud,clusters=ud$idstrata,se.cluster=ud$cluster, model="clayton.oakes",theta.des=tdes2,step=0.5) summary(fitp3) ### same model using strata option, a bit slower fitp4 <- survival_twostage(marg2,data=ud,clusters=ud$cluster,se.cluster=ud$cluster, model="clayton.oakes",theta.des=tdes2,step=0.5,strata=ud$strata) summary(fitp4) ## Reduce Ex.Timings ### structured random effects model additive gamma ACE ### simulate structured two-stage additive gamma ACE model data <- sim_ClaytonOakes_twin_ace(2000,2,1,0,3) out <- twin_polygen_design(data,id="cluster") pardes <- out$pardes pardes des.rv <- out$des.rv head(des.rv) aa <- phreg(Surv(time,status)~x+cluster(cluster),data=data,robust=0) ts <- survival_twostage(aa,data=data,clusters=data$cluster,detail=0, theta=c(2,1),var.link=0,step=0.5, random.design=des.rv,theta.des=pardes) summary(ts)
Helper functions for the twostage survival dependence models.
survival.twostage(x, ...) matplot.mets.twostage(object, ...) alpha2spear(theta, link = 1) alpha2kendall(theta, link = 0) piecewise_twostage( cut1, cut2, data = parent.frame(), timevar = "time", status = "status", id = "id", covars = NULL, covars.pairs = NULL, num = NULL, method = "optimize", Nit = 100, detail = 0, silent = 1, weights = NULL, control = list(), theta = NULL, theta.des = NULL, var.link = 1, step = 0.5, model = "plackett", data.return = 0 ) piecewise_data( cut1, cut2, data = parent.frame(), timevar = "time", status = "status", id = "id", covars = NULL, covars.pairs = NULL, num = NULL, silent = 1 )survival.twostage(x, ...) matplot.mets.twostage(object, ...) alpha2spear(theta, link = 1) alpha2kendall(theta, link = 0) piecewise_twostage( cut1, cut2, data = parent.frame(), timevar = "time", status = "status", id = "id", covars = NULL, covars.pairs = NULL, num = NULL, method = "optimize", Nit = 100, detail = 0, silent = 1, weights = NULL, control = list(), theta = NULL, theta.des = NULL, var.link = 1, step = 0.5, model = "plackett", data.return = 0 ) piecewise_data( cut1, cut2, data = parent.frame(), timevar = "time", status = "status", id = "id", covars = NULL, covars.pairs = NULL, num = NULL, silent = 1 )
x |
a marginal model object (for |
... |
additional arguments. |
object |
a twostage model object (for matplot method). |
theta |
initial dependence parameter values. |
link |
if 1, parameters are on log scale (for alpha2kendall/alpha2spear). |
cut1 |
vector of cut points for the first time axis. |
cut2 |
vector of cut points for the second time axis. |
data |
a data.frame with the survival data. |
timevar |
character name of the time variable. |
status |
character name of the status variable. |
id |
name of the cluster identifier column. |
covars |
optional character vector of covariate names. |
covars.pairs |
optional covariates at the pair level. |
num |
character name of the within-cluster number variable. |
method |
optimization method. |
Nit |
maximum number of iterations. |
detail |
level of detail in output. |
silent |
level of verbosity (1=silent). |
weights |
optional weights. |
control |
optimization control list. |
theta.des |
theta design matrix. |
var.link |
if 1, log-link for variance parameters. |
step |
step size for optimization. |
model |
dependence model: |
data.return |
if 1, return data with model fits. |
survival.twostage is an alias for survival_twostage.
alpha2kendall converts the Clayton-Oakes alpha parameter to Kendall's tau.
alpha2spear converts the Clayton-Oakes alpha parameter to Spearman's rho.
piecewise_twostage fits twostage models on piecewise time intervals.
piecewise_data prepares data for piecewise twostage analysis.
matplot.mets.twostage produces matplot of twostage baseline estimates.
Klaus K. Holst, Thomas Scheike
Computes the G-estimator (G-formula) for standardized survival or cumulative incidence estimates:
survivalG( x, data, time = NULL, Avalues = NULL, varname = NULL, same.data = TRUE, First = FALSE )survivalG( x, data, time = NULL, Avalues = NULL, varname = NULL, same.data = TRUE, First = FALSE )
x |
Object of class |
data |
Data frame for risk averaging. Must be part of the data used for fitting unless |
time |
Time point for estimation. |
Avalues |
Values to compare for the first covariate |
varname |
Name of the variable to be treated as the treatment/exposure variable (default is the first variable). |
same.data |
Logical; assumes the same data is used for fitting and averaging. |
First |
Logical; if |
Based on a phreg or cifreg object. Provides influence functions for these risk estimates,
allowing for standard error computation.
If the first covariate is a factor, contrasts between all levels are computed automatically.
If it is continuous, specific values must be provided via Avalues.
An object of class "survivalG" containing:
risk |
Standardized risk estimates. |
risk.iid |
Influence functions for the risk estimates. |
difference |
Pairwise differences in risks. |
ratio |
Risk ratios. |
survival.ratio |
Survival ratios (for |
survival.difference |
Survival differences (for |
Thomas Scheike
data(bmt) bmt$time <- bmt$time + runif(408) * 0.001 bmt$event <- (bmt$cause != 0) * 1 bmt$id <- 1:408 dfactor(bmt) <- tcell.f ~ tcell # Fine-Gray model fg1 <- cifreg(Event(time, cause) ~ tcell.f + platelet + age, bmt, cause = 1, cox.prep = TRUE, propodds = NULL) summary(survivalG(fg1, bmt, 50)) ## Reduce Ex.Timings # Cox model ss <- phreg(Surv(time, event) ~ tcell.f + platelet + age, bmt) summary(survivalG(ss, bmt, 50)) # Stratified Cox model ss <- phreg(Surv(time, event) ~ strata(tcell.f) + platelet + age, bmt) summary(survivalG(ss, bmt, 50)) # Time-varying G-estimates sst <- survivalGtime(ss, bmt, n = 50) plot(sst) # Among treated (specify id to link influence functions) ss <- phreg(Surv(time, event) ~ tcell.f + platelet + age + cluster(id), bmt) summary(survivalG(ss, subset(bmt, tcell == 1), 50))data(bmt) bmt$time <- bmt$time + runif(408) * 0.001 bmt$event <- (bmt$cause != 0) * 1 bmt$id <- 1:408 dfactor(bmt) <- tcell.f ~ tcell # Fine-Gray model fg1 <- cifreg(Event(time, cause) ~ tcell.f + platelet + age, bmt, cause = 1, cox.prep = TRUE, propodds = NULL) summary(survivalG(fg1, bmt, 50)) ## Reduce Ex.Timings # Cox model ss <- phreg(Surv(time, event) ~ tcell.f + platelet + age, bmt) summary(survivalG(ss, bmt, 50)) # Stratified Cox model ss <- phreg(Surv(time, event) ~ strata(tcell.f) + platelet + age, bmt) summary(survivalG(ss, bmt, 50)) # Time-varying G-estimates sst <- survivalGtime(ss, bmt, n = 50) plot(sst) # Among treated (specify id to link influence functions) ss <- phreg(Surv(time, event) ~ tcell.f + platelet + age + cluster(id), bmt) summary(survivalG(ss, subset(bmt, tcell == 1), 50))
Estimates the casewise concordance based on concordance and marginal estimates, and performs tests for the independence assumption. This is particularly useful in twin studies to assess genetic vs. environmental contributions to disease risk.
test_casewise(conc, marg, test = "no-test", p = 0.01)test_casewise(conc, marg, test = "no-test", p = 0.01)
conc |
An object containing concordance estimates (e.g., from |
marg |
An object containing marginal cumulative incidence estimates (e.g., from |
test |
Type of test: |
p |
Threshold for checking that marginal probability is greater than this value at some point (default 0.01). Used to avoid division by near-zero probabilities. |
The casewise concordance is defined as:
where the numerator is the joint probability of both twins having the event by time ,
and the denominator is the marginal probability.
The function supports two types of tests:
"case": Tests on the casewise concordance scale (difference between observed and expected under independence).
"conc": Tests on the concordance scale (difference between observed concordance and squared marginal).
Standard errors are computed using cluster-based conservative estimates or influence functions (IID) if available from the input objects.
An object of class "casewise" containing:
casewise |
Matrix with time, casewise concordance, and standard errors. |
marg |
Matrix with time, marginal CIF, and standard errors. |
conc |
Matrix with time, concordance, and standard errors. |
casewise.iid |
Influence function decomposition for casewise concordance. |
test |
Test results (if |
mintime, maxtime
|
Time range used for the analysis. |
same.cluster |
Logical indicating if clusters were assumed identical. |
testtype |
Type of test performed. |
Thomas Scheike
Scheike, T. H. (2024). Casewise concordance estimation and testing. mets package documentation.
bicomprisk, casewise, test_conc
## Reduce Ex.Timings library("timereg") data("prt",package="mets"); prt <- force_same_cens(prt,cause="status") prt <- prt[which(prt$id %in% sample(unique(prt$id),7500)),] ### marginal cumulative incidence of prostate cancer times <- seq(60,100,by=2) outm <- timereg::comp.risk(Event(time,status)~+1,data=prt,cause=2,times=times) cifmz <- predict(outm,X=1,uniform=0,resample.iid=1) cifdz <- predict(outm,X=1,uniform=0,resample.iid=1) ### concordance for MZ and DZ twins cc <- bicomprisk(Event(time,status)~strata(zyg)+id(id), data=prt,cause=c(2,2)) cdz <- cc$model$"DZ" cmz <- cc$model$"MZ" ### To compute casewise cluster argument must be passed on, ### here with a max of 100 to limit comp-time outm <- timereg::comp.risk(Event(time,status)~+1,data=prt, cause=2,times=times,max.clust=100) cifmz <- predict(outm,X=1,uniform=0,resample.iid=1) cc <- bicomprisk(Event(time,status)~strata(zyg)+id(id),data=prt, cause=c(2,2),se.clusters=outm$clusters) cdz <- cc$model$"DZ" cmz <- cc$model$"MZ" cdz <- test_casewise(cdz,cifmz,test="case") ## test based on casewise cmz <- test_casewise(cmz,cifmz,test="conc") ## based on concordance plot(cmz,ylim=c(0,0.7),xlim=c(60,100)) par(new=TRUE) plot(cdz,ylim=c(0,0.7),xlim=c(60,100))## Reduce Ex.Timings library("timereg") data("prt",package="mets"); prt <- force_same_cens(prt,cause="status") prt <- prt[which(prt$id %in% sample(unique(prt$id),7500)),] ### marginal cumulative incidence of prostate cancer times <- seq(60,100,by=2) outm <- timereg::comp.risk(Event(time,status)~+1,data=prt,cause=2,times=times) cifmz <- predict(outm,X=1,uniform=0,resample.iid=1) cifdz <- predict(outm,X=1,uniform=0,resample.iid=1) ### concordance for MZ and DZ twins cc <- bicomprisk(Event(time,status)~strata(zyg)+id(id), data=prt,cause=c(2,2)) cdz <- cc$model$"DZ" cmz <- cc$model$"MZ" ### To compute casewise cluster argument must be passed on, ### here with a max of 100 to limit comp-time outm <- timereg::comp.risk(Event(time,status)~+1,data=prt, cause=2,times=times,max.clust=100) cifmz <- predict(outm,X=1,uniform=0,resample.iid=1) cc <- bicomprisk(Event(time,status)~strata(zyg)+id(id),data=prt, cause=c(2,2),se.clusters=outm$clusters) cdz <- cc$model$"DZ" cmz <- cc$model$"MZ" cdz <- test_casewise(cdz,cifmz,test="case") ## test based on casewise cmz <- test_casewise(cmz,cifmz,test="conc") ## based on concordance plot(cmz,ylim=c(0,0.7),xlim=c(60,100)) par(new=TRUE) plot(cdz,ylim=c(0,0.7),xlim=c(60,100))
Performs a Pepe-Mori type test to compare two concordance estimates (e.g., MZ vs DZ twins). The test evaluates whether the concordance functions differ significantly over time.
test_conc(conc1, conc2, same.cluster = FALSE)test_conc(conc1, conc2, same.cluster = FALSE)
conc1 |
Concordance estimate of group 1. |
conc2 |
Concordance estimate of group 2. |
same.cluster |
Logical; if FALSE, groups are assumed independent. If TRUE, estimates are based on the same data (e.g., paired twins). |
An object of class "testconc" containing:
test |
Matrix with cumulative difference, standard error, z-value, and p-value. |
mintime, maxtime
|
Time range used. |
same.cluster |
Logical flag. |
Thomas Scheike
Tests whether the marginal mean number of recurrent events differs across groups (strata), extending the classical logrank test to the setting of recurrent events with a competing terminal event. The test statistic is
where is a weight function and is the estimated
marginal mean for group . Variance is estimated robustly via the
influence functions of Ghosh and Lin (2000).
test_logrankRecurrent( recurrent, death, weight = c("I", "II", "III"), km = TRUE, start = 0, stop = NULL, at.risk = 5, cluster.id = NULL, ... )test_logrankRecurrent( recurrent, death, weight = c("I", "II", "III"), km = TRUE, start = 0, stop = NULL, at.risk = 5, cluster.id = NULL, ... )
recurrent |
Either a |
death |
A |
weight |
Character string specifying the weight scheme: |
km |
Logical. If |
start |
Left truncation time for the integration. Default is |
stop |
Right truncation time for the integration. Defaults to the last observed jump time. |
at.risk |
Minimum combined risk-set size below which the weight is set
to zero. Default is |
cluster.id |
Optional vector of cluster identifiers for aggregating influence functions across clusters before forming the test statistic. |
... |
Currently unused. |
Three weight schemes are available:
"I"(Default) ,
where . Analogous to the standard
logrank weight.
"II", the raw risk-set size. Equivalent
to using observed counts without survival adjustment.
"III"A modified weight incorporating the cumulative incidence, analogous to Gray's test for competing risks.
An object of class "estimate" (from the lava package)
with the following components:
coef |
The weighted difference in marginal means between groups. |
se |
Robust standard error of the test statistic. |
lower, upper
|
95% confidence interval bounds. |
p.value |
Two-sided p-value for the null hypothesis of no difference. |
The object also carries an iid attribute containing the
subject-level influence function decomposition of the test statistic,
which can be used for further inference or combination with other estimators.
Thomas Scheike
Ghosh, D. and Lin, D. Y. (2000). Nonparametric analysis of recurrent events and death. Biometrics, 56, 554–562.
recurrent_marginal, logrankRecurrentBase
data(hfactioncpx12) hf <- hfactioncpx12 ## Test using two separate phreg models xr <- phreg(Surv(entry, time, status == 1) ~ strata(treatment) + cluster(id), data = hf) dr <- phreg(Surv(entry, time, status == 2) ~ strata(treatment) + cluster(id), data = hf) out <- test_logrankRecurrent(xr, dr, stop = 5) summary(out) ## Equivalently, using a recurrent_marginal object directly outN <- recurrent_marginal(Event(entry, time, status) ~ strata(treatment) + cluster(id), data = hf, cause = 1, death.code = 2) test_logrankRecurrent(outN)data(hfactioncpx12) hf <- hfactioncpx12 ## Test using two separate phreg models xr <- phreg(Surv(entry, time, status == 1) ~ strata(treatment) + cluster(id), data = hf) dr <- phreg(Surv(entry, time, status == 2) ~ strata(treatment) + cluster(id), data = hf) out <- test_logrankRecurrent(xr, dr, stop = 5) summary(out) ## Equivalently, using a recurrent_marginal object directly outN <- recurrent_marginal(Event(entry, time, status) ~ strata(treatment) + cluster(id), data = hf, cause = 1, death.code = 2) test_logrankRecurrent(outN)
Performs score-test type tests for proportionality of marginal means in competing risks data and recurrent events, as presented in Ghosh and Lin (2000). The test is based on an IPCW (Inverse Probability of Censoring Weighting) formulation.
test_marginalMean( formula, data, cause = 1, cens.code = 0, ..., death.code = 2, death.code.prop = NULL, time = NULL, beta = NULL )test_marginalMean( formula, data, cause = 1, cens.code = 0, ..., death.code = 2, death.code.prop = NULL, time = NULL, beta = NULL )
formula |
Formula with an |
data |
Data frame containing all variables referenced in the formula. |
cause |
Cause of interest (default 1). |
cens.code |
Censoring code (default 0). |
... |
Additional arguments passed to lower-level functions. |
death.code |
Code for death (terminating event, default 2). |
death.code.prop |
Code for other causes of death for Fine-Gray regression model. |
time |
Upper limit for Pepe-Mori and AUC integrals. If NULL, defaults to the maximum event time for the cause of interest. |
beta |
Starting values for the score test (default NULL, uses zeros). |
The function computes several tests:
Pepe-Mori Test: Tests for equality of marginal mean functions between groups.
Ratio of AUC: Compares the area under the curve of marginal means.
Difference of AUC: Tests for difference in areas under the curve.
Score Test: Tests for proportionality (equivalent to Gray's test for CIF).
Proportionality Test: Tests the proportional hazards assumption.
The Pepe-Mori test uses weights based on the number at risk in each group to construct a weighted integral of the difference in marginal means.
An object of class "marginalTest" containing:
pepe.mori |
Pepe-Mori test results with compare p-value. |
RatioAUC |
Ratio of AUC test results with compare p-value. |
difAUC |
Difference of AUC test results with compare p-value. |
prop.test |
Proportionality test results. |
score.test |
Score test results (equivalent to Gray's test). |
score.iid |
Influence function for the score test. |
time |
Upper time limit used. |
RAUCl, RAUCe
|
Raw and transformed AUC estimates. |
Thomas Scheike
Ghosh, D. and Lin, D. Y. (2000). Nonparametric Analysis of Recurrent Events and Death. Biometrics, 56, 554–562.
test_logrankRecurrent, test_conc
data(bmt,package="mets") bmt$time <- bmt$time+runif(nrow(bmt))*0.01 bmt$id <- 1:nrow(bmt) dcut(bmt) <- age.f~age fg=cifregFG(Event(time,cause)~tcell,data=bmt,cause=1) ## computing tests for difference for CIF pmt <- test_marginalMean(Event(time,cause)~strata(tcell)+cluster(id),data=bmt,cause=1, death.code=1:2,death.code.prop=2,cens.code=0,time=40) summary(pmt) pmt$pepe.mori pmt$RatioAUC pmt$prop.test ## score test equialent to Gray's test but variance estimated differently pmt$score.test ### age-groups pmt <- test_marginalMean(Event(time,cause)~strata(age.f)+cluster(id),data=bmt,cause=1, death.code=1:2,death.code.prop=2,cens.code=0) summary(pmt) ## having a look at the cumulative incidences cifs <- cif(Event(time,cause)~strata(age.f)+cluster(id),data=bmt,cause=1) plot(cifs) ## recurrent events data(hfactioncpx12) hf <- hfactioncpx12 pmt <- test_marginalMean(Event(entry,time,status)~strata(treatment)+cluster(id),data=hf, cause=1,death.code=2,cens.code=0) summary(pmt)data(bmt,package="mets") bmt$time <- bmt$time+runif(nrow(bmt))*0.01 bmt$id <- 1:nrow(bmt) dcut(bmt) <- age.f~age fg=cifregFG(Event(time,cause)~tcell,data=bmt,cause=1) ## computing tests for difference for CIF pmt <- test_marginalMean(Event(time,cause)~strata(tcell)+cluster(id),data=bmt,cause=1, death.code=1:2,death.code.prop=2,cens.code=0,time=40) summary(pmt) pmt$pepe.mori pmt$RatioAUC pmt$prop.test ## score test equialent to Gray's test but variance estimated differently pmt$score.test ### age-groups pmt <- test_marginalMean(Event(time,cause)~strata(age.f)+cluster(id),data=bmt,cause=1, death.code=1:2,death.code.prop=2,cens.code=0) summary(pmt) ## having a look at the cumulative incidences cifs <- cif(Event(time,cause)~strata(age.f)+cluster(id),data=bmt,cause=1) plot(cifs) ## recurrent events data(hfactioncpx12) hf <- hfactioncpx12 pmt <- test_marginalMean(Event(entry,time,status)~strata(treatment)+cluster(id),data=hf, cause=1,death.code=2,cens.code=0) summary(pmt)
Calculate tetrachoric correlation of probabilities from odds-ratio
tetrachoric(P, OR, approx = 0, ...)tetrachoric(P, OR, approx = 0, ...)
P |
Joint probabilities or marginals (if OR is given) |
OR |
Odds-ratio |
approx |
If TRUE an approximation of the tetrachoric correlation is used |
... |
Additional arguments |
tetrachoric(0.3,1.25) # Marginal p1=p2=0.3, OR=2 P <- matrix(c(0.1,0.2,0.2,0.5),2) prod(diag(P))/prod(lava::revdiag(P)) ##mets:::assoc(P) tetrachoric(P) or2prob(2,0.1) or2prob(2,c(0.1,0.2))tetrachoric(0.3,1.25) # Marginal p1=p2=0.3, OR=2 P <- matrix(c(0.1,0.2,0.2,0.5),2) prod(diag(P))/prod(lava::revdiag(P)) ##mets:::assoc(P) tetrachoric(P) or2prob(2,0.1) or2prob(2,c(0.1,0.2))
Resolves tied event times in a counting-process dataset by adding a small
random perturbation to duplicated exit times of event rows. This is a
preprocessing step required by recurrent_marginal and related
functions, which assume unique jump times within each stratum.
tie_breaker( data, stop = "time", start = "entry", status = "status", id = NULL, cause = NULL, cens.code = 0, exit.unique = TRUE, ddt = NULL, seed = NULL )tie_breaker( data, stop = "time", start = "entry", status = "status", id = NULL, cause = NULL, cens.code = 0, exit.unique = TRUE, ddt = NULL, seed = NULL )
data |
A data frame in counting-process format, sorted by subject and time. |
stop |
Name of the column containing interval exit (stop) times.
Default is |
start |
Name of the column containing interval entry (start) times,
used to update the following row when |
status |
Name of the column containing event status codes. Default is
|
id |
Name of the column containing subject identifiers. If supplied,
the start time of the next interval for the same subject is adjusted to
match the perturbed stop time, preserving interval continuity. Default is
|
cause |
Integer vector of status codes that identify events (non-censored
rows). If |
cens.code |
Integer code(s) for censoring. Rows with this status are
never perturbed. Default is |
exit.unique |
Logical. If |
ddt |
Maximum perturbation size. Tied event times are shifted by a
uniform draw on |
seed |
Optional integer passed to |
A tie is defined as an event exit time (rows where status is a cause
code) that coincides with another exit time in the dataset. When
exit.unique = TRUE (default), a tie is flagged whenever an event time
also appears among any other exit times (censored or event). When
exit.unique = FALSE, only exact ties between two event rows are
resolved.
Tied event times are perturbed by adding where
and is ddt (defaulting
to half the smallest observed positive gap between consecutive exit times).
When subject IDs are provided via id, the corresponding interval start
time of the immediately following row for the same subject is updated to
maintain a valid counting-process structure, and a logical tiebreaker
column is added to flag affected rows.
The input data frame data with tied event exit times
perturbed. If id is supplied, an additional logical column
tiebreaker marks rows whose start time was adjusted as a consequence
of a perturbation in the preceding row.
Thomas Scheike
recurrent_marginal, test_logrankRecurrent
data(hfactioncpx12) hf <- hfactioncpx12 ## Check for ties in event exit times ev <- hf[hf$status == 1, ] any(duplicated(ev$time)) ## Resolve ties before fitting the marginal mean model hf_clean <- tie_breaker(hf, stop = "time", start = "entry", status = "status", id = "id", cause = 1, cens.code = 0) out <- recurrent_marginal(Event(entry, time, status) ~ cluster(id), data = hf_clean, cause = 1, death.code = 2) summary(out, times = 1:5)data(hfactioncpx12) hf <- hfactioncpx12 ## Check for ties in event exit times ev <- hf[hf$status == 1, ] any(duplicated(ev$time)) ## Resolve ties before fitting the marginal mean model hf_clean <- tie_breaker(hf, stop = "time", start = "entry", status = "status", id = "id", cause = 1, cens.code = 0) out <- recurrent_marginal(Event(entry, time, status) ~ cluster(id), data = hf_clean, cause = 1, death.code = 2) summary(out, times = 1:5)
The TRACE data frame contains 1877 patients and is a subset of a data set consisting of approximately 6000 patients. It contains data relating survival of patients after myocardial infarction to various risk factors.
This data frame contains the following columns:
a numeric vector. Patient code.
a numeric vector code. Survival status. 9: dead from myocardial infarction, 0: alive, 7: dead from other causes.
a numeric vector. Survival time in years.
a numeric vector code. Clinical heart pump failure, 1: present, 0: absent.
a numeric vector code. Diabetes, 1: present, 0: absent.
a numeric vector code. Ventricular fibrillation, 1: present, 0: absent.
a numeric vector. Measure of heart pumping effect based on ultrasound measurements where 2 is normal and 0 is worst.
a numeric vector code. 1: female, 0: male.
a numeric vector code. Age of patient.
sTRACE is a subsample consisting of 300 patients.
tTRACE is a subsample consisting of 1000 patients.
The TRACE study group.
Jensen, G.V., Torp-Pedersen, C., Hildebrandt, P., Kober, L., F. E. Nielsen, Melchior, T., Joen, T. and P. K. Andersen (1997), Does in-hospital ventricular fibrillation affect prognosis after myocardial infarction?, European Heart Journal 18, 919–924.
data(TRACE) names(TRACE)data(TRACE) names(TRACE)
ttpd discrete survival data on interval form
Simulated data
BMI data set
Self-reported BMI-values on 11,411 subjects
tvparnr: twin id bmi: BMI (m/kg^2) age: Age gender: (male/female) zyg: zygosity, MZ:=mz, DZ(same sex):=dz, DZ(opposite sex):=os
Fits a classical twin model for quantitative traits.
twinlm( formula, data, id, zyg, DZ, group = NULL, group.equal = FALSE, strata = NULL, weights = NULL, type = c("ace"), twinnum = "twinnum", binary = FALSE, ordinal = 0, keep = weights, estimator = NULL, constrain = TRUE, control = list(), messages = 1, ... )twinlm( formula, data, id, zyg, DZ, group = NULL, group.equal = FALSE, strata = NULL, weights = NULL, type = c("ace"), twinnum = "twinnum", binary = FALSE, ordinal = 0, keep = weights, estimator = NULL, constrain = TRUE, control = list(), messages = 1, ... )
formula |
Formula specifying effects of covariates on the response |
data |
|
id |
The name of the column in the dataset containing the twin-id variable. |
zyg |
The name of the column in the dataset containing the zygosity variable |
DZ |
Character defining the level in the zyg variable corresponding to the dyzogitic twins. If this argument is missing, the reference level (i.e. the first level) will be interpreted as the dyzogitic twins |
group |
Optional. Variable name defining group for interaction analysis (e.g., gender) |
group.equal |
If TRUE marginals of groups are asummed to be the same |
strata |
Strata variable name |
weights |
Weights matrix if needed by the chosen estimator. For use with Inverse Probability Weights |
type |
Character defining the type of analysis to be performed. Can be a subset of "aced" (additive genetic factors, common environmental factors, unique environmental factors, dominant genetic factors). Other choices are:
The default value is an additive polygenic model |
twinnum |
The name of the column in the dataset numbering the
twins (1,2). If it does not exist in |
binary |
If |
ordinal |
If non-zero (number of bins) a liability model is fitted. |
keep |
Vector of variables from |
estimator |
Choice of estimator/model |
constrain |
Development argument |
control |
Control argument parsed on to the optimization routine |
messages |
Control amount of messages shown |
... |
Additional arguments parsed on to lower-level functions |
Returns an object of class twinlm.
Klaus K. Holst
bptwin, twinlm.time, twinlm.strata, twinsim
## Simulate data set.seed(1) d <- twinsim(1000,b1=c(1,-1),b2=c(),acde=c(1,1,0,1)) ## E(y|z1,z2) = z1 - z2. var(A) = var(C) = var(E) = 1 ## E.g to fit the data to an ACE-model without any confounders we simply write ace <- twinlm(y ~ 1, data=d, DZ="DZ", zyg="zyg", id="id") ace ## An AE-model could be fitted as ae <- twinlm(y ~ 1, data=d, DZ="DZ", zyg="zyg", id="id", type="ae") ## LRT: lava::compare(ae,ace) ## AIC AIC(ae)-AIC(ace) ## To adjust for the covariates we simply alter the formula statement ace2 <- twinlm(y ~ x1+x2, data=d, DZ="DZ", zyg="zyg", id="id", type="ace") ## Summary/GOF summary(ace2) ## Reduce Ex.Timings ## An interaction could be analyzed as: ace3 <- twinlm(y ~ x1+x2 + x1:I(x2<0), data=d, DZ="DZ", zyg="zyg", id="id", type="ace") ace3 ## Categorical variables are also supported d2 <- transform(d,x2cat=cut(x2,3,labels=c("Low","Med","High"))) ace4 <- twinlm(y ~ x1+x2cat, data=d2, DZ="DZ", zyg="zyg", id="id", type="ace")## Simulate data set.seed(1) d <- twinsim(1000,b1=c(1,-1),b2=c(),acde=c(1,1,0,1)) ## E(y|z1,z2) = z1 - z2. var(A) = var(C) = var(E) = 1 ## E.g to fit the data to an ACE-model without any confounders we simply write ace <- twinlm(y ~ 1, data=d, DZ="DZ", zyg="zyg", id="id") ace ## An AE-model could be fitted as ae <- twinlm(y ~ 1, data=d, DZ="DZ", zyg="zyg", id="id", type="ae") ## LRT: lava::compare(ae,ace) ## AIC AIC(ae)-AIC(ace) ## To adjust for the covariates we simply alter the formula statement ace2 <- twinlm(y ~ x1+x2, data=d, DZ="DZ", zyg="zyg", id="id", type="ace") ## Summary/GOF summary(ace2) ## Reduce Ex.Timings ## An interaction could be analyzed as: ace3 <- twinlm(y ~ x1+x2 + x1:I(x2<0), data=d, DZ="DZ", zyg="zyg", id="id", type="ace") ace3 ## Categorical variables are also supported d2 <- transform(d,x2cat=cut(x2,3,labels=c("Low","Med","High"))) ace4 <- twinlm(y ~ x1+x2cat, data=d2, DZ="DZ", zyg="zyg", id="id", type="ace")
Simulate twin data from a linear normal ACE/ADE/AE model.
twinsim( nMZ = 100, nDZ = nMZ, b1 = c(), b2 = c(), mu = 0, acde = c(1, 1, 0, 1), randomslope = NULL, threshold = 0, cens = FALSE, wide = FALSE, ... )twinsim( nMZ = 100, nDZ = nMZ, b1 = c(), b2 = c(), mu = 0, acde = c(1, 1, 0, 1), randomslope = NULL, threshold = 0, cens = FALSE, wide = FALSE, ... )
nMZ |
Number of monozygotic twin pairs |
nDZ |
Number of dizygotic twin pairs |
b1 |
Effect of covariates (labelled x1,x2,...) of type 1. One distinct covariate value for each twin/individual. |
b2 |
Effect of covariates (labelled g1,g2,...) of type 2. One covariate value for each twin pair. |
mu |
Intercept parameter. |
acde |
Variance of random effects (in the order A,C,D,E) |
randomslope |
Logical indicating wether to include random slopes of the variance components w.r.t. x1,x2,... |
threshold |
Treshold used to define binary outcome y0 |
cens |
Logical variable indicating whether to censor outcome |
wide |
Logical indicating if wide data format should be returned |
... |
Additional arguments parsed on to lower-level functions |
Klaus K. Holst
Based on nation-wide questionnaire answers from 33,317 Danish twins
tvparnr: twin-pair id zyg: zygosity, MZ:=mz, DZ(same sex):=dz, DZ(opposite sex):=os stutter: stutter status (yes/no) age: age nr: number within twin-pair
Fits Clayton-Oakes clustered survival data using marginals that are on Cox form in the likelihood for the dependence parameter as in Glidden (2000).
twostageMLE( margsurv, data = parent.frame(), theta = NULL, theta.des = NULL, var.link = 0, method = "NR", no.opt = FALSE, weights = NULL, se.cluster = NULL, ... )twostageMLE( margsurv, data = parent.frame(), theta = NULL, theta.des = NULL, var.link = 0, method = "NR", no.opt = FALSE, weights = NULL, se.cluster = NULL, ... )
margsurv |
Marginal model from |
data |
Data frame. |
theta |
Starting values for variance components. |
theta.des |
Design for dependence parameters; when pairs are given, this could be a
(pairs) |
var.link |
Link function for variance; if 1 then uses exponential link. |
method |
Type of optimizer; default is Newton-Raphson |
no.opt |
To not optimize, for example to get score and IID for specific theta. |
weights |
Cluster-specific weights, but given with length equivalent to data-set; weights for score equations. |
se.cluster |
Specifies how the influence functions are summed before squared when computing the variance.
Note that the id from the marginal model is used to construct MLE, and then these scores can be summed with the |
... |
Arguments to be passed to optimizer. |
The dependence can be modelled via a regression structure for the independent gamma distributed random effects and their variances that may depend on cluster covariates. So:
where is specified by theta.des. The link function can be the exponential
when var.link=1.
An object of class "mets.twostage" containing:
theta |
Estimated dependence parameters. |
coef |
Coefficients. |
var.theta |
Variance of theta parameters. |
robvar.theta |
Robust variance of theta parameters. |
theta.iid |
Influence functions for theta. |
theta.iid.naive |
Naive influence functions for theta. |
loglike |
Log-likelihood value. |
Thomas Scheike
Measuring early or late dependence for bivariate twin data. Scheike, Holst, Hjelmborg (2015), LIDA.
Twostage modelling of additive gamma frailty models for survival data. Scheike and Holst, working paper.
Shih and Louis (1995) Inference on the association parameter in copula models for bivariate survival data, Biometrics.
Glidden (2000), A Two-Stage estimator of the dependence parameter for the Clayton Oakes model, LIDA.
data(diabetes) dd <- phreg(Surv(time, status == 1) ~ treat + cluster(id), diabetes) oo <- twostageMLE(dd, data = diabetes) summary(oo) theta.des <- model.matrix(~ -1 + factor(adult), diabetes) oo <- twostageMLE(dd, data = diabetes, theta.des = theta.des) summary(oo)data(diabetes) dd <- phreg(Surv(time, status == 1) ~ treat + cluster(id), diabetes) oo <- twostageMLE(dd, data = diabetes) summary(oo) theta.des <- model.matrix(~ -1 + factor(adult), diabetes) oo <- twostageMLE(dd, data = diabetes, theta.des = theta.des) summary(oo)
Fits a two-stage random effects model for recurrent events with a terminal event. Marginal models (Cox or Ghosh-Lin) are fitted first and passed to this function.
twostageREC( margsurv, recurrent, data = parent.frame(), theta = NULL, model = c("full", "shared", "non-shared"), ghosh.lin = NULL, theta.des = NULL, var.link = 0, method = "NR", no.opt = FALSE, weights = NULL, se.cluster = NULL, fnu = NULL, nufix = 0, nu = NULL, numderiv = 1, derivmethod = c("simple", "Richardson"), ... )twostageREC( margsurv, recurrent, data = parent.frame(), theta = NULL, model = c("full", "shared", "non-shared"), ghosh.lin = NULL, theta.des = NULL, var.link = 0, method = "NR", no.opt = FALSE, weights = NULL, se.cluster = NULL, fnu = NULL, nufix = 0, nu = NULL, numderiv = 1, derivmethod = c("simple", "Richardson"), ... )
margsurv |
Marginal model for the terminal event (object of class |
recurrent |
Marginal model for recurrent events (object of class |
data |
Data frame used for fitting. |
theta |
Starting value for total variance of gamma frailty. |
model |
Model type: |
ghosh.lin |
Logical; if |
theta.des |
Regression design for variance parameters. |
var.link |
Link function for variance (1 for exponential). |
method |
Optimization method (default "NR"). |
no.opt |
Logical; if |
weights |
Weights. |
se.cluster |
Clusters for SE calculation (GEE style). |
fnu |
Function to transform |
nufix |
Logical; if |
nu |
Starting value for the amount shared. |
numderiv |
Logical; if |
derivmethod |
Method for numerical derivative. |
... |
Arguments for the optimizer. |
Supports:
Cox/Cox marginals.
Cox/Ghosh-Lin marginals.
Fully shared, partly shared, or non-shared random effects.
An object of class "twostageREC" containing:
coef |
Estimated coefficients. |
var |
Variance-covariance matrix. |
theta |
Dependence parameters. |
model |
Model type. |
Thomas Scheike
Scheike (2026), Two-stage recurrent events random effects models, LIDA, to appear.
Computes the "While-Alive" estimands for recurrent events in the presence of a terminal event (death). These estimands address the challenge of defining meaningful treatment effects when death prevents further observation of recurrent events.
WA_recurrent( formula, data, time = NULL, cens.code = 0, cause = 1, death.code = 2, trans = NULL, cens.formula = NULL, augmentR = NULL, augmentC = NULL, type = NULL, marks = NULL, ... )WA_recurrent( formula, data, time = NULL, cens.code = 0, cause = 1, death.code = 2, trans = NULL, cens.formula = NULL, augmentR = NULL, augmentC = NULL, type = NULL, marks = NULL, ... )
formula |
Formula with an |
data |
Data frame containing all variables referenced in the formula. |
time |
Time point |
cens.code |
Numeric code for censoring (default 0). |
cause |
Numeric code for the recurrent event of interest (default 1). |
death.code |
Numeric code for the terminal event/death (default 2). |
trans |
Power transformation for the mean of events per time-unit (default NULL, i.e., linear). |
cens.formula |
Formula for the censoring model. Default is |
augmentR |
Formula for covariate augmentation in the randomization model
(e.g., |
augmentC |
Formula for covariate augmentation in the censoring model. Enables double robustness. |
type |
Type of augmentation for the binomial regression call. Default is "I"
if |
marks |
Optional marks for composite outcome situations (e.g., distinguishing event types in a composite endpoint). |
... |
Additional arguments passed to |
The function estimates two primary quantities:
Ratio of Means:
The expected number of events up to time (censored by death )
divided by the expected time alive up to .
Mean of Events per Time Unit:
The expected rate of events per unit of time alive.
Estimation is based on Inverse Probability of Censoring Weighting (IPCW) to handle administrative censoring and death. The method can be augmented with covariates (double robust estimation) to improve efficiency and robustness.
An object of class "WA" containing:
RAW |
List of raw estimates: RMST, mean number of events, ratio of means, and their log-transformed versions with standard errors. |
ET |
List of estimated treatment effects: risk difference for the mean rate
( |
time |
The time point used for estimation. |
cause, death.code, cens.code
|
Codes used. |
augmentR, augmentC
|
Formulas used for augmentation. |
The object includes influence functions (IID) for all estimators, allowing for further variance calculations or combination with other estimators.
Thomas Scheike
Ragni, A., Martinussen, T., & Scheike, T. H. (2023). Nonparametric estimation of the Patient Weighted While-Alive Estimand. arXiv preprint.
Mao, L. (2023). Nonparametric inference of general while-alive estimands for recurrent events. Biometrics, 79(3), 1749–1760.
Schmidli, H., Roger, J. H., & Akacha, M. (2023). Estimands for recurrent event endpoints in the presence of a terminal event. Statistics in Biopharmaceutical Research, 15(2), 238–248.
data(hfactioncpx12) dtable(hfactioncpx12,~status) dd <- WA_recurrent(Event(entry,time,status)~treatment+cluster(id),data=hfactioncpx12, time=2,death.code=2) summary(dd)data(hfactioncpx12) dtable(hfactioncpx12,~status) dd <- WA_recurrent(Event(entry,time,status)~treatment+cluster(id),data=hfactioncpx12, time=2,death.code=2) summary(dd)
Performs regression analysis for the "While-Alive" mean of events per time unit,
defined as . This function models how covariates
affect the rate of recurrent events per unit of time alive.
WA_reg( formula, data, time = NULL, cens.code = 0, cause = 1, death.code = 2, marks = NULL, ..., trans = 1 )WA_reg( formula, data, time = NULL, cens.code = 0, cause = 1, death.code = 2, marks = NULL, ..., trans = 1 )
formula |
Formula with regression design. The first covariate on the RHS must
be the treatment factor. Can include other covariates and |
data |
Data frame. |
time |
Time point |
cens.code |
Censoring code. |
cause |
Event cause code. |
death.code |
Death code. |
marks |
Marks for composite outcomes. |
... |
Additional arguments passed to |
trans |
Power transformation for the outcome (default 1). |
The estimation is based on IPCW (Inverse Probability of Censoring Weighting) and
calls binreg after constructing the outcome variable. It supports double
robust estimation if covariate augmentation is specified.
An object of class "binreg" containing coefficient estimates,
standard errors, confidence intervals, and influence functions for the
regression of the event rate per time alive.
Thomas Scheike
Ragni, A., Martinussen, T., & Scheike, T. H. (2023). Nonparametric estimation of the Patient Weighted While-Alive Estimand. arXiv preprint.
data(hfactioncpx12) hfactioncpx12$age <- rnorm(741)[hfactioncpx12$id] dtable(hfactioncpx12,~status) ## exp-link regression dd <- WA_reg(Event(entry,time,status)~treatment+age+cluster(id),data=hfactioncpx12, time=2,death.code=2) summary(dd)data(hfactioncpx12) hfactioncpx12$age <- rnorm(741)[hfactioncpx12$id] dtable(hfactioncpx12,~status) ## exp-link regression dd <- WA_reg(Event(entry,time,status)~treatment+age+cluster(id),data=hfactioncpx12, time=2,death.code=2) summary(dd)