Combining the Deep Neural Network with the K-Means for Traffic Accident Prediction

Authors: Celso L. Fernando, Toshio Yoshii, Takahiro Tsubota


Understanding the causes of road accidents and predicting their occurrence is key to preventing the deaths and serious injuries they cause. Traditional statistical methods such as Poisson and logistic regression have been used to find associations between traffic environment factors and accident occurrence; recently, the artificial neural network (ANN), a computational technique that learns from historical data to make more accurate predictions, has emerged. Despite its ability to make accurate predictions, the ANN has difficulty dealing with a highly unbalanced distribution of attribute patterns in the training dataset; in such circumstances, the ANN treats the minority group as noise. In real-world data, however, the minority group is often the group of interest; e.g., in road traffic accident data, the accident events are the group of interest. This study proposes combining k-means with the ANN to improve the predictive ability of the neural network model by alleviating the effect of the unbalanced distribution of attribute patterns in the training dataset. The results show that the proposed method improves the ability of the neural network to make predictions on a dataset with a highly unbalanced distribution of attribute patterns; on a dataset with evenly distributed attribute patterns, however, the proposed method performs almost like a standard neural network.
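
The abstract does not say how k-means and the ANN are combined, so the following is only a minimal sketch of one plausible arrangement, not the authors' method: the majority (non-accident) records are clustered with k-means, and each cluster is paired with all minority (accident) records to form a less unbalanced training subset on which a separate network could then be trained. The function names, the two synthetic features, the class sizes, and k = 3 are all hypothetical choices made for illustration.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    # Initialise centroids from k randomly chosen points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def balanced_subsets(majority, minority, k, seed=0):
    """Pair each k-means cluster of the majority class with every minority sample.

    Assumption: one classifier per subset would be trained afterwards and the
    per-subset predictions combined; that step is not shown here.
    """
    labels = kmeans(majority, k, seed=seed)
    subsets = []
    for c in range(k):
        cluster = [p for p, lab in zip(majority, labels) if lab == c]
        if cluster:
            X = cluster + minority
            y = [0] * len(cluster) + [1] * len(minority)  # 1 = accident
            subsets.append((X, y))
    return subsets


# Synthetic, hypothetical data: 300 non-accident and 15 accident records,
# each described by two standardised features.
rng = random.Random(42)
majority = [[rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)]
            for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(100)]
minority = [[rng.gauss(2.5, 1.0), rng.gauss(2.5, 1.0)] for _ in range(15)]

subsets = balanced_subsets(majority, minority, k=3)
print(f"overall majority:minority ratio = {len(majority) / len(minority):.1f}:1")
for i, (X, y) in enumerate(subsets):
    n_min = sum(y)
    n_maj = len(y) - n_min
    print(f"subset {i}: {n_maj} majority, {n_min} minority "
          f"({n_maj / n_min:.1f}:1)")
```

Every subset keeps all accident records but only part of the non-accident records, so no subset can be more unbalanced than the full dataset. How much the balance improves depends on how evenly k-means splits the majority class.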

Keywords: Accident risk estimation, artificial neural network, deep learning, k-means, road safety.


