Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 32
Search results for: overfitting
2 A Data-Driven Compartmental Model for Dengue Forecasting and Covariate Inference
Authors: Yichao Liu, Peter Fransson, Julian Heidecke, Jonas Wallin, Joacim Rockloev
Abstract:
Dengue, a mosquito-borne viral disease, poses a significant public health challenge in endemic tropical and subtropical countries, including Sri Lanka. Understanding the complex dynamics of this disease and its drivers requires a comprehensive model capable of both robust forecasting and insightful inference of drivers while capturing the co-circulation of several virus strains. However, existing studies mostly focus on only one aspect at a time and do not carry insights across these siloed approaches. While mechanistic models are developed to capture immunity dynamics, they are often oversimplified and do not integrate the diverse drivers of disease transmission. On the other hand, purely data-driven methods lack the constraints imposed by immuno-epidemiological processes, making them prone to overfitting and inference bias. This research presents a hybrid model that combines machine learning techniques with mechanistic modelling to overcome the limitations of existing approaches. Leveraging eight years of newly reported dengue case data, along with socioeconomic factors such as human mobility, weekly climate data from 2011 to 2018, genetic data detecting the introduction and presence of new strains, and estimates of seropositivity for different districts in Sri Lanka, we derive a data-driven vector (SEI) to human (SEIR) model across 16 regions in Sri Lanka at the weekly time scale. Ablation studies were used to determine the lag effects of time-varying climate factors, allowing delays of up to 12 weeks. The model demonstrates superior predictive performance over a pure machine learning approach at lead times of 5 and 10 weeks on data withheld from model fitting. It further reveals several interpretable findings about drivers while adjusting for the dynamics and influences of immunity and the introduction of a new strain.
The study uncovers strong influences of socioeconomic variables: population density, mobility, household income, and rural vs. urban population. It reveals substantial sensitivity to the diurnal temperature range and precipitation, while mean temperature and humidity appear less important in the study location. Additionally, the model indicated sensitivity to vegetation index, both maximum and average. Predictions on testing data reveal high model accuracy. Overall, this study advances the knowledge of dengue transmission in Sri Lanka and demonstrates the importance of hybrid modelling techniques that combine biologically informed model structures with flexible data-driven estimates of model parameters. The findings show the potential for both inference of drivers in situations of complex disease dynamics and robust forecasting.
Keywords: compartmental model, climate, dengue, machine learning, social-economic
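As an illustration of the vector (SEI) to human (SEIR) structure with a lagged climate covariate described in the abstract above, the following is a minimal sketch and not the authors' model. It assumes a single region, compartments expressed as population fractions, a forward-Euler weekly step, and a hypothetical log-linear link between a lagged climate anomaly and the vector-to-human transmission rate. All names (`State`, `lagged`, `step`, `simulate`) and all rate values are assumptions chosen for illustration; in the study the parameters are estimated from data.

```python
import math
from dataclasses import dataclass


@dataclass
class State:
    # Human compartments as fractions of the human population (SEIR).
    s_h: float
    e_h: float
    i_h: float
    r_h: float
    # Vector compartments as fractions of the mosquito population (SEI).
    s_v: float
    e_v: float
    i_v: float


def lagged(series, t, lag):
    """Covariate value `lag` weeks before week t, clamped at week 0."""
    return series[max(t - lag, 0)]


def step(s, beta_vh, beta_hv=0.4, sigma_h=0.9, gamma_h=0.7,
         sigma_v=0.6, mu_v=0.3):
    """Advance one week. All rates are hypothetical per-week values."""
    new_h = beta_vh * s.s_h * s.i_v  # new human exposures from infectious vectors
    new_v = beta_hv * s.s_v * s.i_h  # new vector exposures from infectious humans
    nxt = State(
        s_h=s.s_h - new_h,
        e_h=s.e_h + new_h - sigma_h * s.e_h,
        i_h=s.i_h + sigma_h * s.e_h - gamma_h * s.i_h,
        r_h=s.r_h + gamma_h * s.i_h,
        # Vector births balance deaths, so the vector population stays constant.
        s_v=s.s_v + mu_v * (1.0 - s.s_v) - new_v,
        e_v=s.e_v + new_v - (sigma_v + mu_v) * s.e_v,
        i_v=s.i_v + sigma_v * s.e_v - mu_v * s.i_v,
    )
    return nxt, new_h


def simulate(state, climate, lag, base=0.5, coef=0.1):
    """Run the model over the climate series; return final state and weekly new exposures."""
    cases = []
    for t in range(len(climate)):
        # Assumed log-linear link from the lagged climate anomaly to transmission.
        beta_vh = base * math.exp(coef * lagged(climate, t, lag))
        state, new_h = step(state, beta_vh)
        cases.append(new_h)
    return state, cases
```

Calling `simulate(State(0.99, 0.0, 0.01, 0.0, 0.95, 0.0, 0.05), climate, lag=2)` with a weekly list of climate anomalies returns the final compartments and the weekly new human exposures. Comparing forecasts across different values of `lag` is one simple way to mimic the kind of lag ablation the abstract mentions.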
Procedia PDF Downloads 84
1 Rapid Building Detection in Population-Dense Regions with Overfitted Machine Learning Models
Authors: V. Mantey, N. Findlay, I. Maddox
Abstract:
The quality and quantity of global satellite data have been increasing exponentially in recent years as spaceborne systems become more affordable and the sensors themselves become more sophisticated. This is a valuable resource for many applications, including disaster management and relief. However, while more information can be valuable, the volume of data available is impossible to examine manually. Therefore, the question becomes how to extract as much information as possible from the data with limited manpower. Buildings are a key feature of interest in satellite imagery with applications including telecommunications, population models, and disaster relief. Machine learning tools are fast becoming one of the key resources to solve this problem, and models have been developed to detect buildings in optical satellite imagery. However, most models focus on affluent regions where buildings are generally larger and constructed farther apart. This work focuses on the more difficult problem of detection in densely populated regions. The primary challenge with detecting small buildings in densely populated regions is the limited spatial and spectral resolution of the optical sensor. Densely packed buildings with similar construction materials are difficult to separate because they are similar in color and because the physical separation between structures is either non-existent or smaller than the spatial resolution. This study finds that models trained until they overfit the input sample can perform better in these areas than a more robust, generalized model. An overfitted model takes less time to fine-tune from a generalized pre-trained model and requires less input data. The model developed for this study has also been fine-tuned using existing, open-source building vector datasets. This is particularly valuable in the context of disaster relief, where information is required in a very short time span.
Leveraging existing datasets means that little to no manpower or time is required to collect data in the region of interest. The training period itself is also shorter for smaller datasets. Requiring less data means that only a few quality areas are necessary, and so any weaknesses or underpopulated regions in the data can be skipped over in favor of areas with higher quality vectors. In this study, a landcover classification model was developed in conjunction with the building detection tool to provide a secondary source to quality check the detected buildings. This has greatly reduced the false positive rate. The proposed methodologies have been implemented and integrated into a configurable production environment and have been employed for a number of large-scale commercial projects, including continent-wide DEM production, where the extracted building footprints are being used to enhance digital elevation models. Overfitted machine learning models are often considered too specific to have any predictive capacity. However, this study demonstrates that, in cases where input data is scarce, overfitted models can be judiciously applied to solve time-sensitive problems.
Keywords: building detection, disaster relief, mask-RCNN, satellite mapping
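The landcover quality check mentioned in the abstract above could look roughly like the following. This is a hedged sketch, not the authors' pipeline: it assumes detections arrive as axis-aligned pixel boxes, that the landcover model outputs a 2D grid of class codes, and that a detection is kept when enough of its pixels fall in a built-up class. The class code `BUILT_UP`, the function names, and the 0.5 threshold are all hypothetical.

```python
BUILT_UP = 1  # hypothetical landcover class code for built-up areas


def built_up_fraction(box, landcover):
    """Share of pixels inside `box` that the landcover grid labels as built-up.

    `box` is (row_start, col_start, row_end, col_end) with half-open bounds.
    """
    r0, c0, r1, c1 = box
    cells = [landcover[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    if not cells:
        return 0.0
    return sum(1 for v in cells if v == BUILT_UP) / len(cells)


def filter_detections(boxes, landcover, min_fraction=0.5):
    """Keep detections that the landcover grid supports; drop likely false positives."""
    return [b for b in boxes if built_up_fraction(b, landcover) >= min_fraction]
```

For example, with a small grid where only the top-left block is built-up, a box covering that block is kept while a box over vegetation or water is discarded. Raising `min_fraction` trades recall for a lower false positive rate.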
Procedia PDF Downloads 169