A Comparative Analysis of Machine Learning Techniques for PM10 Forecasting in Vilnius
Authors: M. A. S. Fahim, J. Sužiedelytė Visockienė
Abstract:
Concern over air pollution (AP) is more prominent than ever. Public awareness has increased, and those who understand the problem have a duty to share that knowledge with others. This awareness often follows an understanding of how poor air quality, as reflected in the air quality index (AQI), damages human health. The study assesses air pollution prediction models specifically for Lithuania, addressing a substantial need for empirical research within the region. Concentrating on Vilnius, it examines concentrations of particulate matter 10 micrometers or less in diameter (PM10). Using Gaussian Process Regression (GPR), Regression Tree Ensemble, and Regression Tree methods, forecasting models are validated and tested on hourly data from January 2020 to December 2022. The study also explores the classification of AP data into anthropogenic and natural sources, the impact of AP on human health, and its connection to cardiovascular diseases. Accuracy varied among the models, with GPR achieving the highest accuracy, indicated by a root mean squared error (RMSE) of 4.14 in validation and 3.89 in testing.
Keywords: Air pollution, anthropogenic and natural sources, machine learning, Gaussian process regression, tree ensemble, forecasting models, particulate matter.
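The abstract compares the models by RMSE, so a minimal sketch of how that metric is computed may help when reading the reported values. The function name `rmse` and the sample readings below are hypothetical illustrations, not the authors' code or data, and the unit (µg/m³) is an assumption because the abstract does not state one.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one observation is required")
    # Average the squared residuals, then take the square root so the
    # result is in the same unit as the input values.
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical hourly PM10 readings (assumed µg/m³), not data from the study.
observed = [20.0, 24.0, 18.0, 30.0]
predicted = [22.0, 21.0, 18.0, 34.0]

score = rmse(observed, predicted)
print(f"RMSE: {score:.2f}")
```

A lower RMSE means predictions sit closer to the measured values on average, which is the sense in which GPR is reported as the most accurate of the three models.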