Grouped data; Bivariate density estimation; Bayesian P-splines
Abstract :
Penalized B-splines combined with the composite link model are used to estimate a bivariate density from a histogram with wide bins. The goals are multiple: they include the visualization of the dependence between the two variates, but also the estimation of derived quantities like Kendall’s tau, conditional moments and quantiles. Two strategies are proposed: the first one is semiparametric with flexible margins modeled using B-splines and a parametric copula for the dependence structure; the second one is nonparametric and is based on Kronecker products of the marginal B-spline bases. Frequentist and Bayesian estimations are described. A large simulation study quantifies the performances of the two methods under different dependence structures and for varying strengths of dependence, sample sizes and amounts of grouping. It suggests that Schwarz’s BIC is a good tool for classifying the competing models. The density estimates are used to evaluate conditional quantiles in two applications in social and in medical sciences.
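As a rough illustration of the nonparametric strategy mentioned in the abstract, the sketch below builds a tensor-product basis as the Kronecker product of two marginal B-spline bases and uses a composite link matrix to aggregate a fine-grid density into the expected counts of wide bins. It is only a minimal sketch under stated assumptions, not the paper's implementation: the unit-square support, the grid sizes `m` and `w`, the number of basis functions, and the helper names `bspline_basis` and `wide_bin_means` are all hypothetical, and the roughness penalty and the frequentist or Bayesian estimation steps are omitted.

```python
import numpy as np

def bspline_basis(x, n_basis, degree=3, lo=0.0, hi=1.0):
    """Equally spaced B-spline basis on [lo, hi) via the Cox-de Boor recursion."""
    n_seg = n_basis - degree                      # number of knot intervals in [lo, hi]
    step = (hi - lo) / n_seg
    knots = lo + step * np.arange(-degree, n_seg + degree + 1)
    xc = x[:, None]
    # Degree-0 splines: indicators of the knot intervals.
    B = ((xc >= knots[None, :-1]) & (xc < knots[None, 1:])).astype(float)
    for d in range(1, degree + 1):
        left = (xc - knots[None, :-d - 1]) / (d * step)
        right = (knots[None, d + 1:] - xc) / (d * step)
        B = left * B[:, :-1] + right * B[:, 1:]
    return B                                      # shape (len(x), n_basis)

m, w, n_basis = 20, 4, 6                          # assumed: fine cells, wide bins, basis size per axis
mid = (np.arange(m) + 0.5) / m                    # midpoints of the fine cells
B1 = bspline_basis(mid, n_basis)
B2 = bspline_basis(mid, n_basis)
B = np.kron(B1, B2)                               # tensor-product basis, row index i * m + j

# Composite link matrix: each wide bin sums a block of (m // w) x (m // w) fine cells.
C1 = np.kron(np.eye(w), np.ones((1, m // w)))     # shape (w, m)
C = np.kron(C1, C1)                               # shape (w * w, m * m)

def wide_bin_means(theta, n):
    """Expected wide-bin counts for spline coefficients theta and sample size n."""
    eta = B @ theta
    pi = np.exp(eta - eta.max())
    pi /= pi.sum()                                # fine-grid cell probabilities
    return n * (C @ pi)

mu = wide_bin_means(np.zeros(n_basis ** 2), n=160)
```

With all coefficients set to zero the fine-grid density is uniform, so every wide bin receives the same expected count. In an actual fit, `theta` would be chosen to match the observed wide-bin frequencies subject to a smoothness penalty.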
Disciplines :
Mathematics
Author, co-author :
Lambert, Philippe ; Université de Liège - ULiège > Institut des sciences humaines et sociales > Méthodes quantitatives en sciences sociales
Language :
English
Title :
Smooth semiparametric and nonparametric Bayesian estimation of bivariate densities from bivariate histogram data
Publication date :
2011
Journal title :
Computational Statistics and Data Analysis
ISSN :
0167-9473
eISSN :
1872-7352
Publisher :
Elsevier Science, Amsterdam, Netherlands
Volume :
55
Pages :
429-445
Peer reviewed :
Peer Reviewed verified by ORBi
Name of the research project :
CREATION D’OUTILS STATISTIQUES POUR L’ANALYSE DE DONNEES D’ENQUETES CENSUREES PAR INTERVALLE
Funders :
FSR research grant No. FSRC-08/42 from the University of Liège ; IAP research network No. P6/03 of the Belgian government (Belgian Science Policy)
Bibliography
Akaike, H., 1974. A new look at the statistical model identification. IEEE Transactions on Automatic Control 19, 716-723.
Betensky, R., Finkelstein, D., 1999. A non-parametric maximum likelihood estimator for bivariate interval censored data. Statistics in Medicine 18 (22), 3089-3100.
Braun, J., Duchesne, T., Stafford, J.E., 2005. Local likelihood density estimation for interval censored data. The Canadian Journal of Statistics 33, 39-60.
Eilers, P.H.C., 1992. Nonparametric density estimation with grouped observations. Statistica Neerlandica 45, 255-269.
Eilers, P.H.C., 2007. Ill-posed problems with counts, the composite link model and penalized likelihood. Statistical Modelling 7, 239-254.
Eilers, P.H.C., Marx, B.D., 1996. Flexible smoothing with B-splines and penalties (with discussion). Statistical Science 11, 89-121.
Goggins, W.B., Finkelstein, D.M., 2000. A proportional hazards model for multivariate interval-censored failure time data. Biometrics 56, 940-943.
Gomez, G., Calle, M.L., Oller, R., 2004. Frequentist and Bayesian approaches for interval-censored data. Statistical Papers 45, 139-173.
Haario, H., Saksman, E., Tamminen, J., 2001. An adaptive Metropolis algorithm. Bernoulli 7, 223-242.
Hanson, T.E., 2006. Inference for mixtures of finite Polya tree models. Journal of the American Statistical Association 101, 1548-1565.
Härkänen, T., Virtanen, J.I., Arjas, E., 2000. Caries on permanent teeth: a non-parametric Bayesian analysis. Scandinavian Journal of Statistics 27, 577-588.
Jullion, A., Lambert, P., 2007. Robust specification of the roughness penalty prior distribution in spatially adaptive Bayesian P-splines models. Computational Statistics and Data Analysis 51, 2542-2558.
Komárek, A., Lesaffre, E., 2006. Bayesian semi-parametric accelerated failure time model for paired doubly interval-censored data. Statistical Modelling 6 (1), 3-22.
Komárek, A., Lesaffre, E., Härkänen, T., Declerck, D., Virtanen, J.I., 2005. A Bayesian analysis of multivariate doubly-interval-censored dental data. Biostatistics 6, 145-155.
Koo, J.-Y., Kooperberg, C., 2000. Logspline density estimation for binned data. Statistics and Probability Letters 46, 133-147.
Kooperberg, C., Stone, C.J., 1992. Logspline density estimation for censored data. Journal of Computational and Graphical Statistics 1, 301-328.
Lambert, P., Eilers, P.H., 2009. Bayesian density estimation from grouped continuous data. Computational Statistics and Data Analysis 53, 1388-1399.
Lang, S., Brezger, A., 2004. Bayesian P-splines. Journal of Computational and Graphical Statistics 13, 183-212.
Lavine, M., 1992. Some aspects of Polya tree distributions for statistical modelling. The Annals of Statistics 20, 1222-1235.
Law, C.G., Brookmeyer, R., 1992. Effects of mid-point imputation on the analysis of doubly censored data. Statistics in Medicine 11, 1569-1578.
Minnotte, M., 1998. Achieving higher-order convergence rates for density estimation with binned data. Journal of the American Statistical Association 93, 663-672.
Peto, R., 1973. Experimental survival curves for interval-censored data. Journal of the Royal Statistical Society. Series C (Applied Statistics) 22, 86-91.
Schwarz, G., 1978. Estimating the dimension of a model. The Annals of Statistics 6, 461-464.
Smith, M., Kohn, R., 1997. A Bayesian approach to nonparametric bivariate regression. Journal of the American Statistical Association 92, 1522-1535.
Sun, J., 2006. The Statistical Analysis of Interval-censored Failure Time Data. Springer.
Thompson, R., Baker, R.J., 1981. Composite link functions in generalized linear models. Applied Statistics 30, 125-131.
Turnbull, B., 1976. The empirical distribution function with arbitrarily grouped, censored and truncated data. Journal of the Royal Statistical Society. Series B (Methodological) 38, 290-295.
Versonnen, A., 2009. Population et ménages, mariages et divorces. Technical Report. Direction Générale Statistique et Information Economique, 44 rue de Louvain, 1000 Bruxelles (Belgique).
Wong, M.C.M., Lam, K.F., Lo, E.C.M., 2005. Bayesian analysis of clustered interval-censored data. Journal of Dental Research 84 (9), 817-821.
Yang, M., Hanson, T., Christensen, R., 2008. Nonparametric Bayesian estimation of a bivariate density with interval censored data. Computational Statistics and Data Analysis 52, 5202-5214.
Zhang, M., Davidian, M., 2008. Smooth semiparametric regression analysis for arbitrarily censored time-to-event data. Biometrics 64 (2), 567-576.