Abstract:
[en] Interval-censored time-to-event data arise frequently in clinical trials and longitudinal studies, where the event of interest is only known to have occurred between two consecutive visits. Interval censoring is a natural generalization of right censoring. For right-censored data, an extensive range of statistical techniques is available to tackle most research questions under a variety of assumptions. For interval-censored data, however, the available procedures are less well developed. The limited support for this type of censoring in statistical software has driven many researchers to use imputation techniques, especially right-point or mid-point imputation. However, such imputation strategies can lead to misleading inferences. Our thesis proposes innovative methods to analyze such data and studies their properties.
In the first part of the text, we extend a Bayesian density estimation procedure for grouped data to estimate hazard ratios and survival functions from interval-censored data. If one further assumes proportionality of the hazards, the proposed strategy also provides estimates of global covariate effects. The proposed method provides very good estimates of the regression coefficients and successfully approximates the baseline survival function when the mean interval width is smaller than a threshold defined from the standard deviation of the data.
In a Cox proportional hazards (PH) model, the observations are assumed to be independent. However, this assumption may not hold when the observed units are clustered or subject to multiple measurements. A number of approaches generalizing Cox's PH model to handle correlated interval-censored data have been proposed in the literature. The shared frailty model is a popular tool for analyzing correlated right-censored time-to-event data, and frailty models have also been adapted to handle interval-censored data. With interval-censored time-to-event data, however, the inclusion of frailties results in complicated, intractable likelihoods. In the second part of this thesis, we propose flexible frailty models for analyzing such data by assuming a smooth, flexible form for the conditional time-to-event distribution and either a parametric or a flexible form for the frailty distribution.
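To illustrate why frailties complicate the likelihood, the following is a sketch under assumed notation; none of these symbols appear in the text above. Suppose subject j in cluster i has an event time known only to lie in the interval (L_ij, R_ij], with covariates x_ij, a shared frailty z_i with density g, and baseline survival function S_0. Under a shared frailty PH model, the conditional survival function and the marginal likelihood contribution of cluster i would typically be written as:

```latex
\[
S(t \mid z_i, x_{ij}) = S_0(t)^{\, z_i \exp(x_{ij}^\top \beta)},
\qquad
L_i(\beta, S_0, g) = \int_0^\infty \prod_{j=1}^{n_i}
  \bigl[ S(L_{ij} \mid z, x_{ij}) - S(R_{ij} \mid z, x_{ij}) \bigr] \, g(z) \, dz .
\]
```

Each subject contributes a difference of two survival terms rather than a single term, so the product inside the integral does not collapse to one power of S_0. For most choices of g the integral over z therefore has no convenient closed form.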
The literature indicates, in various contexts, that misspecifying the random effect distribution can influence the estimation of quantities of primary interest, such as the fixed effects. To circumvent such misspecification, we suggest modeling the frailty distribution flexibly, using P-splines or a gamma shape mixture (GSM) distribution. The main advantage of a flexible specification for the frailty density arises when its shape is of specific interest. If the frailty density is considered a nuisance, assuming a simpler lognormal or gamma frailty is adequate for drawing conclusions about other model parameters, such as the regression coefficients and the variance of the frailties. Indeed, the simulation study showed that the regression parameter estimates in a shared frailty PH model are robust to misspecification of the frailty density. Moreover, using a flexible form for the frailty does not cause any loss of precision in the estimation of the regression parameters compared with the simpler parametric frailty model. Both models make it possible to visualize the baseline density and survival functions. Given sufficiently large sample sizes, the flexible approach produces smooth and accurate posterior estimates of the baseline survival function and of the frailty density, and can correctly detect and identify unusual frailty density forms.