Keywords :
Iterative Proportional Fitting (IPF); Hidden Markov Model (HMM); Hierarchical model (HM); Multi-source information fusion
Abstract :
[en] In urban and transportation research, important information is often scattered across a wide variety of independent datasets that differ in the variables they describe and in their sampling rates. Because people's activity-travel behavior depends strongly on socio-demographic and transport/urban-related variables, there is an increasing need for advanced methods to merge the information provided by multiple urban/transport household surveys. In this paper, we propose a hierarchical algorithm based on a Hidden Markov Model (HMM) and an Iterative Proportional Fitting (IPF) procedure to obtain quasi-perfect marginal distributions and accurate multivariate joint distributions. The model allows for the combination of an unlimited number of datasets. The model is validated on a synthetic dataset with 1,000,000 observations and 8 categorical variables. The results reveal that the hierarchical model is particularly robust: the deviation between the simulated and observed multivariate joint distributions is extremely small and constant, regardless of the sampling rates and of which variables each dataset includes. In addition, the presented methodological framework allows for an intelligent merging of multiple data sources. Furthermore, heterogeneity is smoothly incorporated into micro-samples with small sampling rates that are subject to potential sampling bias. These aspects are handled simultaneously to build a generalized probabilistic structure from which new observations can be inferred. A major impact in terms of expert systems is that the outputs of the hierarchical model (HM) serve as a basis for qualitative and quantitative analyses of integrated datasets.
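For readers unfamiliar with the IPF step named in the abstract, the following is a minimal sketch of textbook two-dimensional IPF, not the authors' hierarchical HMM/IPF algorithm. The function name `ipf_2d`, its parameters, and the toy numbers are illustrative assumptions; the sketch also assumes a strictly positive seed table and target margins with equal totals.

```python
import numpy as np


def ipf_2d(seed, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Generic 2-D IPF sketch (hypothetical helper, not from the paper).

    Assumes every cell of `seed` is strictly positive and that the row
    and column targets sum to the same total.
    """
    table = np.asarray(seed, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)

    for _ in range(max_iter):
        # Scale each row so that the row sums match the row targets.
        table *= (row_targets / table.sum(axis=1))[:, None]
        # Scale each column so that the column sums match the column targets.
        table *= (col_targets / table.sum(axis=0))[None, :]
        # Columns are exact after the last step, so only rows need checking.
        if np.max(np.abs(table.sum(axis=1) - row_targets)) < tol:
            break
    return table


# Toy example with made-up numbers: a 2x2 seed fitted to new margins.
fitted = ipf_2d([[1.0, 2.0], [3.0, 4.0]], [40.0, 60.0], [30.0, 70.0])
print(fitted.round(3))
```

Because each step only multiplies whole rows or whole columns, the fitted table matches the target margins while keeping the association structure (the cross-product ratios) of the seed, which is why IPF is commonly used to align a sample with known marginal distributions.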
Research Center/Unit :
Lepur : Centre de Recherche sur la Ville, le Territoire et le Milieu rural - ULiège
LEMA : Local Environment Management & Analysis
UEE - Urban and Environmental Engineering - ULiège
Disciplines :
Engineering, computing & technology: Multidisciplinary, general & others
Author, co-author :
Saadi, Ismaïl ; Université de Liège - ULiège > Département ArGEnCo > Transports et mobilité
Farooq, Bilal ; Ryerson University
Mustafa, Ahmed Mohamed El Saeid ; Université de Liège - ULiège > Département ArGEnCo > LEMA (Local environment management and analysis)
Teller, Jacques ; Université de Liège - ULiège > Département ArGEnCo > Urbanisme et aménagement du territoire
Cools, Mario ; Université de Liège - ULiège > Département ArGEnCo > Transports et mobilité
Language :
English
Title :
An efficient hierarchical model for multi-source information fusion
Publication date :
15 November 2018
Journal title :
Expert Systems with Applications
ISSN :
0957-4174
eISSN :
1873-6793
Publisher :
Elsevier, United Kingdom
Volume :
110
Pages :
352-362
Peer reviewed :
Peer Reviewed verified by ORBi
Name of the research project :
FloodLand
Funders :
ARC grant for Concerted Research Actions, financed by the Wallonia-Brussels Federation