Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 25713

Search results for: meteorological prediction data

25293 Physically Informed Kernels for Wave Loading Prediction

Authors: Daniel James Pitchforth, Timothy James Rogers, Ulf Tyge Tygesen, Elizabeth Jane Cross

Abstract:

Wave loading is a primary cause of fatigue within offshore structures and its quantification presents a challenging and important subtask within the SHM framework. The accurate representation of physics in such environments is difficult, however, driving the development of data-driven techniques in recent years. Within many industrial applications, empirical laws remain the preferred method of wave loading prediction due to their low computational cost and ease of implementation. This paper aims to develop an approach that combines data-driven Gaussian process models with physical empirical solutions for wave loading, including Morison’s Equation. The aim here is to incorporate physics directly into the covariance function (kernel) of the Gaussian process, enforcing derived behaviors whilst still allowing enough flexibility to account for phenomena such as vortex shedding, which may not be represented within the empirical laws. The combined approach has a number of advantages, including improved performance over either component used independently and interpretable hyperparameters.

Keywords: offshore structures, Gaussian processes, Physics informed machine learning, Kernel design

Procedia PDF Downloads 177

25292 Investigation on Remote Sense Surface Latent Heat Temperature Associated with Pre-Seismic Activities in Indian Region

Authors: Vijay S. Katta, Vinod Kushwah, Rudraksh Tiwari, Mulayam Singh Gaur, Priti Dimri, Ashok Kumar Sharma

Abstract:

The formation process of seismic activities because of abrupt slip on faults, tectonic plate moments due to accumulated stress in the Earth’s crust. The prediction of seismic activity is a very challenging task. We have studied the changes in surface latent heat temperatures which are observed prior to significant earthquakes have been investigated and could be considered for short term earthquake prediction. We analyzed the surface latent heat temperature (SLHT) variation for inland earthquakes occurred in Chamba, Himachal Pradesh (32.5 N, 76.1E, M-4.5, depth-5km) nearby the main boundary fault region, the data of SLHT have been taken from National Center for Environmental Prediction (NCEP). In this analysis, we have calculated daily variations with surface latent heat temperature (0C) in the range area 1⁰x1⁰ (~120/KM²) with the pixel covering epicenter of earthquake at the center for a three months period prior to and after the seismic activities. The mean value during that period has been considered in order to take account of the seasonal effect. The monthly mean has been subtracted from daily value to study anomalous behavior (∆SLHT) of SLHT during the earthquakes. The results found that the SLHTs adjacent the epicenters all are anomalous high value 3-5 days before the seismic activities. The abundant surface water and groundwater in the epicenter and its adjacent region can provide the necessary condition for the change of SLHT. To further confirm the reliability of SLHT anomaly, it is necessary to explore its physical mechanism in depth by more earthquakes cases.

Keywords: surface latent heat temperature, satellite data, earthquake, magnetic storm

Procedia PDF Downloads 123

25291 Time and Cost Prediction Models for Language Classification Over a Large Corpus on Spark

Authors: Jairson Barbosa Rodrigues, Paulo Romero Martins Maciel, Germano Crispim Vasconcelos

Abstract:

This paper presents an investigation of the performance impacts regarding the variation of five factors (input data size, node number, cores, memory, and disks) when applying a distributed implementation of Naïve Bayes for text classification of a large Corpus on the Spark big data processing framework. Problem: The algorithm's performance depends on multiple factors, and knowing before-hand the effects of each factor becomes especially critical as hardware is priced by time slice in cloud environments. Objectives: To explain the functional relationship between factors and performance and to develop linear predictor models for time and cost. Methods: the solid statistical principles of Design of Experiments (DoE), particularly the randomized two-level fractional factorial design with replications. This research involved 48 real clusters with different hardware arrangements. The metrics were analyzed using linear models for screening, ranking, and measurement of each factor's impact. Results: Our findings include prediction models and show some non-intuitive results about the small influence of cores and the neutrality of memory and disks on total execution time, and the non-significant impact of data input scale on costs, although notably impacts the execution time.

Keywords: big data, design of experiments, distributed machine learning, natural language processing, spark

Procedia PDF Downloads 102

25290 Spatial Variation of WRF Model Rainfall Prediction over Uganda

Authors: Isaac Mugume, Charles Basalirwa, Daniel Waiswa, Triphonia Ngailo

Abstract:

Rainfall is a major climatic parameter affecting many sectors such as health, agriculture and water resources. Its quantitative prediction remains a challenge to weather forecasters although numerical weather prediction models are increasingly being used for rainfall prediction. The performance of six convective parameterization schemes, namely the Kain-Fritsch scheme, the Betts-Miller-Janjic scheme, the Grell-Deveny scheme, the Grell-3D scheme, the Grell-Fretas scheme, the New Tiedke scheme of the weather research and forecast (WRF) model regarding quantitative rainfall prediction over Uganda is investigated using the root mean square error for the March-May (MAM) 2013 season. The MAM 2013 seasonal rainfall amount ranged from 200 mm to 900 mm over Uganda with northern region receiving comparatively lower rainfall amount (200–500 mm); western Uganda (270–550 mm); eastern Uganda (400–900 mm) and the lake Victoria basin (400–650 mm). A spatial variation in simulated rainfall amount by different convective parameterization schemes was noted with the Kain-Fritsch scheme over estimating the rainfall amount over northern Uganda (300–750 mm) but also presented comparable rainfall amounts over the eastern Uganda (400–900 mm). The Betts-Miller-Janjic, the Grell-Deveny, and the Grell-3D underestimated the rainfall amount over most parts of the country especially the eastern region (300–600 mm). The Grell-Fretas captured rainfall amount over the northern region (250–450 mm) but also underestimated rainfall over the lake Victoria Basin (150–300 mm) while the New Tiedke generally underestimated rainfall amount over many areas of Uganda. For deterministic rainfall prediction, the Grell-Fretas is recommended for rainfall prediction over northern Uganda while the Kain-Fritsch scheme is recommended over eastern region.

Keywords: convective parameterization schemes, March-May 2013 rainfall season, spatial variation of parameterization schemes over Uganda, WRF model

Procedia PDF Downloads 302

25289 Artificial Neural Networks and Geographic Information Systems for Coastal Erosion Prediction

Authors: Angeliki Peponi, Paulo Morgado, Jorge Trindade

Abstract:

Artificial Neural Networks (ANNs) and Geographic Information Systems (GIS) are applied as a robust tool for modeling and forecasting the erosion changes in Costa Caparica, Lisbon, Portugal, for 2021. ANNs present noteworthy advantages compared with other methods used for prediction and decision making in urban coastal areas. Multilayer perceptron type of ANNs was used. Sensitivity analysis was conducted on natural and social forces and dynamic relations in the dune-beach system of the study area. Variations in network’s parameters were performed in order to select the optimum topology of the network. The developed methodology appears fitted to reality; however further steps would make it better suited.

Keywords: artificial neural networks, backpropagation, coastal urban zones, erosion prediction

Procedia PDF Downloads 374

25288 A Machine Learning Approach for Performance Prediction Based on User Behavioral Factors in E-Learning Environments

Authors: Naduni Ranasinghe

Abstract:

E-learning environments are getting more popular than any other due to the impact of COVID19. Even though e-learning is one of the best solutions for the teaching-learning process in the academic process, it’s not without major challenges. Nowadays, machine learning approaches are utilized in the analysis of how behavioral factors lead to better adoption and how they related to better performance of the students in eLearning environments. During the pandemic, we realized the academic process in the eLearning approach had a major issue, especially for the performance of the students. Therefore, an approach that investigates student behaviors in eLearning environments using a data-intensive machine learning approach is appreciated. A hybrid approach was used to understand how each previously told variables are related to the other. A more quantitative approach was used referred to literature to understand the weights of each factor for adoption and in terms of performance. The data set was collected from previously done research to help the training and testing process in ML. Special attention was made to incorporating different dimensionality of the data to understand the dependency levels of each. Five independent variables out of twelve variables were chosen based on their impact on the dependent variable, and by considering the descriptive statistics, out of three models developed (Random Forest classifier, SVM, and Decision tree classifier), random forest Classifier (Accuracy – 0.8542) gave the highest value for accuracy. Overall, this work met its goals of improving student performance by identifying students who are at-risk and dropout, emphasizing the necessity of using both static and dynamic data.

Keywords: academic performance prediction, e learning, learning analytics, machine learning, predictive model

Procedia PDF Downloads 140

25287 Multimedia Data Fusion for Event Detection in Twitter by Using Dempster-Shafer Evidence Theory

Authors: Samar M. Alqhtani, Suhuai Luo, Brian Regan

Abstract:

Data fusion technology can be the best way to extract useful information from multiple sources of data. It has been widely applied in various applications. This paper presents a data fusion approach in multimedia data for event detection in twitter by using Dempster-Shafer evidence theory. The methodology applies a mining algorithm to detect the event. There are two types of data in the fusion. The first is features extracted from text by using the bag-ofwords method which is calculated using the term frequency-inverse document frequency (TF-IDF). The second is the visual features extracted by applying scale-invariant feature transform (SIFT). The Dempster - Shafer theory of evidence is applied in order to fuse the information from these two sources. Our experiments have indicated that comparing to the approaches using individual data source, the proposed data fusion approach can increase the prediction accuracy for event detection. The experimental result showed that the proposed method achieved a high accuracy of 0.97, comparing with 0.93 with texts only, and 0.86 with images only.

Keywords: data fusion, Dempster-Shafer theory, data mining, event detection

Procedia PDF Downloads 397

25286 An Implementation of Fuzzy Logic Technique for Prediction of the Power Transformer Faults

Authors: Omar M. Elmabrouk., Roaa Y. Taha., Najat M. Ebrahim, Sabbreen A. Mohammed

Abstract:

Power transformers are the most crucial part of power electrical system, distribution and transmission grid. This part is maintained using predictive or condition-based maintenance approach. The diagnosis of power transformer condition is performed based on Dissolved Gas Analysis (DGA). There are five main methods utilized for analyzing these gases. These methods are International Electrotechnical Commission (IEC) gas ratio, Key Gas, Roger gas ratio, Doernenburg, and Duval Triangle. Moreover, due to the importance of the transformers, there is a need for an accurate technique to diagnose and hence predict the transformer condition. The main objective of this technique is to avoid the transformer faults and hence to maintain the power electrical system, distribution and transmission grid. In this paper, the DGA was utilized based on the data collected from the transformer records available in the General Electricity Company of Libya (GECOL) which is located in Benghazi-Libya. The Fuzzy Logic (FL) technique was implemented as a diagnostic approach based on IEC gas ratio method. The FL technique gave better results and approved to be used as an accurate prediction technique for power transformer faults. Also, this technique is approved to be a quite interesting for the readers and the concern researchers in the area of FL mathematics and power transformer.

Keywords: dissolved gas-in-oil analysis, fuzzy logic, power transformer, prediction

Procedia PDF Downloads 131

25285 Analysis and Rule Extraction of Coronary Artery Disease Data Using Data Mining

Authors: Rezaei Hachesu Peyman, Oliyaee Azadeh, Salahzadeh Zahra, Alizadeh Somayyeh, Safaei Naser

Abstract:

Coronary Artery Disease (CAD) is one major cause of disability in adults and one main cause of death in developed. In this study, data mining techniques including Decision Trees, Artificial neural networks (ANNs), and Support Vector Machine (SVM) analyze CAD data. Data of 4948 patients who had suffered from heart diseases were included in the analysis. CAD is the target variable, and 24 inputs or predictor variables are used for the classification. The performance of these techniques is compared in terms of sensitivity, specificity, and accuracy. The most significant factor influencing CAD is chest pain. Elderly males (age > 53) have a high probability to be diagnosed with CAD. SVM algorithm is the most useful way for evaluation and prediction of CAD patients as compared to non-CAD ones. Application of data mining techniques in analyzing coronary artery diseases is a good method for investigating the existing relationships between variables.

Keywords: classification, coronary artery disease, data-mining, knowledge discovery, extract

Procedia PDF Downloads 643

25284 ARIMA-GARCH, A Statistical Modeling for Epileptic Seizure Prediction

Authors: Salman Mohamadi, Seyed Mohammad Ali Tayaranian Hosseini, Hamidreza Amindavar

Abstract:

In this paper, we provide a procedure to analyze and model EEG (electroencephalogram) signal as a time series using ARIMA-GARCH to predict an epileptic attack. The heteroskedasticity of EEG signal is examined through the ARCH or GARCH, (Autore- gressive conditional heteroskedasticity, Generalized autoregressive conditional heteroskedasticity) test. The best ARIMA-GARCH model in AIC sense is utilized to measure the volatility of the EEG from epileptic canine subjects, to forecast the future values of EEG. ARIMA-only model can perform prediction, but the ARCH or GARCH model acting on the residuals of ARIMA attains a con- siderable improved forecast horizon. First, we estimate the best ARIMA model, then different orders of ARCH and GARCH modelings are surveyed to determine the best heteroskedastic model of the residuals of the mentioned ARIMA. Using the simulated conditional variance of selected ARCH or GARCH model, we suggest the procedure to predict the oncoming seizures. The results indicate that GARCH modeling determines the dynamic changes of variance well before the onset of seizure. It can be inferred that the prediction capability comes from the ability of the combined ARIMA-GARCH modeling to cover the heteroskedastic nature of EEG signal changes.

Keywords: epileptic seizure prediction , ARIMA, ARCH and GARCH modeling, heteroskedasticity, EEG

Procedia PDF Downloads 397

25283 Renewable Energy System Eolic-Photovoltaic for the Touristic Center La Tranca-Chordeleg in Ecuador

Authors: Christian Castro Samaniego, Daniel Icaza Alvarez, Juan Portoviejo Brito

Abstract:

For this research work, hybrid wind-photovoltaic (SHEF) systems were considered as renewable energy sources that take advantage of wind energy and solar radiation to transform into electrical energy. In the present research work, the feasibility of a wind-photovoltaic hybrid generation system was analyzed for the La Tranca tourist viewpoint of the Chordeleg canton in Ecuador. The research process consisted of the collection of data on solar radiation, temperature, wind speed among others by means of a meteorological station. Simulations were carried out in MATLAB/Simulink based on a mathematical model. In the end, we compared the theoretical radiation-power curves and the measurements made at the site.

Keywords: hybrid system, wind turbine, modeling, simulation, validation, experimental data, panel, Ecuador

Procedia PDF Downloads 232

25282 The Inequality Effects of Natural Disasters: Evidence from Thailand

Authors: Annop Jaewisorn

Abstract:

This study explores the relationship between natural disasters and inequalities -both income and expenditure inequality- at a micro-level of Thailand as the first study of this nature for this country. The analysis uses a unique panel and remote-sensing dataset constructed for the purpose of this research. It contains provincial inequality measures and other economic and social indicators based on the Thailand Household Survey during the period between 1992 and 2019. Meanwhile, the data on natural disasters, which are remote-sensing data, are received from several official geophysical or meteorological databases. Employing a panel fixed effects, the results show that natural disasters significantly reduce household income and expenditure inequality as measured by the Gini index, implying that rich people in Thailand bear a higher cost of natural disasters when compared to poor people. The effect on income inequality is mainly driven by droughts, while the effect on expenditure inequality is mainly driven by flood events. The results are robust across heterogeneity of the samples, lagged effects, outliers, and an alternative inequality measure.

Keywords: inequality, natural disasters, remote-sensing data, Thailand

Procedia PDF Downloads 110

25281 Effective Stacking of Deep Neural Models for Automated Object Recognition in Retail Stores

Authors: Ankit Sinha, Soham Banerjee, Pratik Chattopadhyay

Abstract:

Automated product recognition in retail stores is an important real-world application in the domain of Computer Vision and Pattern Recognition. In this paper, we consider the problem of automatically identifying the classes of the products placed on racks in retail stores from an image of the rack and information about the query/product images. We improve upon the existing approaches in terms of effectiveness and memory requirement by developing a two-stage object detection and recognition pipeline comprising of a Faster-RCNN-based object localizer that detects the object regions in the rack image and a ResNet-18-based image encoder that classifies the detected regions into the appropriate classes. Each of the models is fine-tuned using appropriate data sets for better prediction and data augmentation is performed on each query image to prepare an extensive gallery set for fine-tuning the ResNet-18-based product recognition model. This encoder is trained using a triplet loss function following the strategy of online-hard-negative-mining for improved prediction. The proposed models are lightweight and can be connected in an end-to-end manner during deployment to automatically identify each product object placed in a rack image. Extensive experiments using Grozi-32k and GP-180 data sets verify the effectiveness of the proposed model.

Keywords: retail stores, faster-RCNN, object localization, ResNet-18, triplet loss, data augmentation, product recognition

Procedia PDF Downloads 139

25280 A Multilayer Perceptron Neural Network Model Optimized by Genetic Algorithm for Significant Wave Height Prediction

Authors: Luis C. Parra

Abstract:

The significant wave height prediction is an issue of great interest in the field of coastal activities because of the non-linear behavior of the wave height and its complexity of prediction. This study aims to present a machine learning model to forecast the significant wave height of the oceanographic wave measuring buoys anchored at Mooloolaba of the Queensland Government Data. Modeling was performed by a multilayer perceptron neural network-genetic algorithm (GA-MLP), considering Relu(x) as the activation function of the MLPNN. The GA is in charge of optimized the MLPNN hyperparameters (learning rate, hidden layers, neurons, and activation functions) and wrapper feature selection for the window width size. Results are assessed using Mean Square Error (MSE), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE). The GAMLPNN algorithm was performed with a population size of thirty individuals for eight generations for the prediction optimization of 5 steps forward, obtaining a performance evaluation of 0.00104 MSE, 0.03222 RMSE, 0.02338 MAE, and 0.71163% of MAPE. The results of the analysis suggest that the MLPNNGA model is effective in predicting significant wave height in a one-step forecast with distant time windows, presenting 0.00014 MSE, 0.01180 RMSE, 0.00912 MAE, and 0.52500% of MAPE with 0.99940 of correlation factor. The GA-MLP algorithm was compared with the ARIMA forecasting model, presenting better performance criteria in all performance criteria, validating the potential of this algorithm.

Keywords: significant wave height, machine learning optimization, multilayer perceptron neural networks, evolutionary algorithms

Procedia PDF Downloads 93

25279 Deepnic, A Method to Transform Each Variable into Image for Deep Learning

Authors: Nguyen J. M., Lucas G., Brunner M., Ruan S., Antonioli D.

Abstract:

Deep learning based on convolutional neural networks (CNN) is a very powerful technique for classifying information from an image. We propose a new method, DeepNic, to transform each variable of a tabular dataset into an image where each pixel represents a set of conditions that allow the variable to make an error-free prediction. The contrast of each pixel is proportional to its prediction performance and the color of each pixel corresponds to a sub-family of NICs. NICs are probabilities that depend on the number of inputs to each neuron and the range of coefficients of the inputs. Each variable can therefore be expressed as a function of a matrix of 2 vectors corresponding to an image whose pixels express predictive capabilities. Our objective is to transform each variable of tabular data into images into an image that can be analysed by CNNs, unlike other methods which use all the variables to construct an image. We analyse the NIC information of each variable and express it as a function of the number of neurons and the range of coefficients used. The predictive value and the category of the NIC are expressed by the contrast and the color of the pixel. We have developed a pipeline to implement this technology and have successfully applied it to genomic expressions on an Affymetrix chip.

Keywords: tabular data, deep learning, perfect trees, NICS

Procedia PDF Downloads 78

25278 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Mpho Mokoatle, Darlington Mapiye, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on $k$-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isoloates. Cluster analysis showed that k-mers maybe used to discriminate phenotypes and the discrimination becomes more concise as the size of k-mers increase. The best performing classification model had a k-mer size of 10 (longest k-mer) an accuracy, recall, precision, specificity, and Matthews Correlation coeffient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4 respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. The analysis also highlights the importance of increasing the k-mer size to produce more biological explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms.

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 156

25277 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Darlington Mapiye, Mpho Mokoatle, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isoloates. Cluster analysis showed that k-mers maybe used to discriminate phenotypes and the discrimination becomes more concise as the size of k-mers increase. The best performing classification model had a k-mer size of 10 (longest k-mer) an accuracy, recall, precision, specificity, and Matthews Correlation coeffient of 72.0 %, 80.5 %, 80.5 %, 63.6 %, and 0.4 respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. The analysis also highlights the importance of increasing the k-mer size to produce more biological explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 142

25276 Assessing the Influence of Station Density on Geostatistical Prediction of Groundwater Levels in a Semi-arid Watershed of Karnataka

Authors: Sakshi Dhumale, Madhushree C., Amba Shetty

Abstract:

The effect of station density on the geostatistical prediction of groundwater levels is of critical importance to ensure accurate and reliable predictions. Monitoring station density directly impacts the accuracy and reliability of geostatistical predictions by influencing the model's ability to capture localized variations and small-scale features in groundwater levels. This is particularly crucial in regions with complex hydrogeological conditions and significant spatial heterogeneity. Insufficient station density can result in larger prediction uncertainties, as the model may struggle to adequately represent the spatial variability and correlation patterns of the data. On the other hand, an optimal distribution of monitoring stations enables effective coverage of the study area and captures the spatial variability of groundwater levels more comprehensively. In this study, we investigate the effect of station density on the predictive performance of groundwater levels using the geostatistical technique of Ordinary Kriging. The research utilizes groundwater level data collected from 121 observation wells within the semi-arid Berambadi watershed, gathered over a six-year period (2010-2015) from the Indian Institute of Science (IISc), Bengaluru. The dataset is partitioned into seven subsets representing varying sampling densities, ranging from 15% (12 wells) to 100% (121 wells) of the total well network. The results obtained from different monitoring networks are compared against the existing groundwater monitoring network established by the Central Ground Water Board (CGWB). The findings of this study demonstrate that higher station densities significantly enhance the accuracy of geostatistical predictions for groundwater levels. The increased number of monitoring stations enables improved interpolation accuracy and captures finer-scale variations in groundwater levels. These results shed light on the relationship between station density and the geostatistical prediction of groundwater levels, emphasizing the importance of appropriate station densities to ensure accurate and reliable predictions. The insights gained from this study have practical implications for designing and optimizing monitoring networks, facilitating effective groundwater level assessments, and enabling sustainable management of groundwater resources.

Keywords: station density, geostatistical prediction, groundwater levels, monitoring networks, interpolation accuracy, spatial variability

Procedia PDF Downloads 41

25275 Methods of Interpolating Temperature and Rainfall Distribution in Northern Vietnam

Authors: Thanh Van Hoang, Tien Yin Chou, Yao Min Fang, Yi Min Huang, Xuan Linh Nguyen

Abstract:

Reliable information on the spatial distribution of annual rainfall and temperature is essential in research projects relating to urban and regional planning. This research presents results of a classification of temperature and rainfall in the Red River Delta of northern Vietnam based on measurements from seven meteorological stations (Ha Nam, Hung Yen, Lang, Nam Dinh, Ninh Binh, Phu Lien, Thai Binh) in the river basin over a thirty-years period from 1982-2011. The average accumulated rainfall trends in the delta are analysed and form the basis of research essential to weather and climate forecasting. This study employs interpolation based on the Kriging Method for daily rainfall (min and max) and daily temperature (min and max) in order to improve the understanding of sources of variation and uncertainly in these important meteorological parameters. To the Kriging method, the results will show the different models and the different parameters based on the various precipitation series. The results provide a useful reference to assist decision makers in developing smart agriculture strategies for the Red River Delta in Vietnam.

Keywords: spatial interpolation method, ArcGIS, temperature variability, rainfall variability, Red River Delta, Vietnam

Procedia PDF Downloads 318

25274 Prediction of Energy Storage Areas for Static Photovoltaic System Using Irradiation and Regression Modelling

Authors: Kisan Sarda, Bhavika Shingote

Abstract:

This paper aims to evaluate regression modelling for prediction of Energy storage of solar photovoltaic (PV) system using Semi parametric regression techniques because there are some parameters which are known while there are some unknown parameters like humidity, dust etc. Here irradiation of solar energy is different for different places on the basis of Latitudes, so by finding out areas which give more storage we can implement PV systems at those places and our need of energy will be fulfilled. This regression modelling is done for daily, monthly and seasonal prediction of solar energy storage. In this, we have used R modules for designing the algorithm. This algorithm will give the best comparative results than other regression models for the solar PV cell energy storage.

Keywords: semi parametric regression, photovoltaic (PV) system, regression modelling, irradiation

Procedia PDF Downloads 368

25273 Impact of Climate Variation on Natural Vegetations and Human Lives in Thar Desert, Pakistan

Authors: Sujo Meghwar, Zulfqar Ali laghari, Kanji Harijan, Muhib Ali Lagari, G. M. Mastoi, Ali Mohammad Rind

Abstract:

Thar Desert is the most populous Desert of the world. Climate variation in Thar Desert has induced an increase in the magnitude of drought. The variation in climate variation has caused a decrease in natural vegetations. Some plant species are eliminated forever. We have applied the SPI (standardized precipitation index) climate model to investigate the drought induced by climate change. We have gathered the anthropogenic response through a developed questionnaire. The data was analyzed in SPSS version 18. The met-data of two meteorological station elaborated by the time series has suggested an increase in temperature from 1-2.5 centigrade, the decrease in rain fall rainfall from 5-25% and reduction in humidity from 5-12 mm in the 20th century. The anthropogenic responses indicate high impact of climate change on human life and vegetations. Triangle data, we have collected, gives a new insight into the understanding of an association between climate change, drought and human activities.

Keywords: Thar desert, human impact, vegetations, temperature, rainfall, humidity

Procedia PDF Downloads 393

25272 Predictability of Kiremt Rainfall Variability over the Northern Highlands of Ethiopia on Dekadal and Monthly Time Scales Using Global Sea Surface Temperature

Authors: Kibrom Hadush

Abstract:

Countries like Ethiopia, whose economy is mainly rain-fed dependent agriculture, are highly vulnerable to climate variability and weather extremes. Sub-seasonal (monthly) and dekadal forecasts are hence critical for crop production and water resource management. Therefore, this paper was conducted to study the predictability and variability of Kiremt rainfall over the northern half of Ethiopia on monthly and dekadal time scales in association with global Sea Surface Temperature (SST) at different lag time. Trends in rainfall have been analyzed on annual, seasonal (Kiremt), monthly, and dekadal (June–September) time scales based on rainfall records of 36 meteorological stations distributed across four homogenous zones of the northern half of Ethiopia for the period 1992–2017. The results from the progressive Mann–Kendall trend test and the Sen’s slope method shows that there is no significant trend in the annual, Kiremt, monthly and dekadal rainfall total at most of the station's studies. Moreover, the rainfall in the study area varies spatially and temporally, and the distribution of the rainfall pattern increases from the northeast rift valley to northwest highlands. Methods of analysis include graphical correlation and multiple linear regression model are employed to investigate the association between the global SSTs and Kiremt rainfall over the homogeneous rainfall zones and to predict monthly and dekadal (June-September) rainfall using SST predictors. The results of this study show that in general, SST in the equatorial Pacific Ocean is the main source of the predictive skill of the Kiremt rainfall variability over the northern half of Ethiopia. The regional SSTs in the Atlantic and the Indian Ocean as well contribute to the Kiremt rainfall variability over the study area. Moreover, the result of the correlation analysis showed that the decline of monthly and dekadal Kiremt rainfall over most of the homogeneous zones of the study area are caused by the corresponding persistent warming of the SST in the eastern and central equatorial Pacific Ocean during the period 1992 - 2017. It is also found that the monthly and dekadal Kiremt rainfall over the northern, northwestern highlands and northeastern lowlands of Ethiopia are positively correlated with the SST in the western equatorial Pacific, eastern and tropical northern the Atlantic Ocean. Furthermore, the SSTs in the western equatorial Pacific and Indian Oceans are positively correlated to the Kiremt season rainfall in the northeastern highlands. Overall, the results showed that the prediction models using combined SSTs at various ocean regions (equatorial and tropical) performed reasonably well in the prediction (With R2 ranging from 30% to 65%) of monthly and dekadal rainfall and recommends it can be used for efficient prediction of Kiremt rainfall over the study area to aid with systematic and informed decision making within the agricultural sector.

Keywords: dekadal, Kiremt rainfall, monthly, Northern Ethiopia, sea surface temperature

Procedia PDF Downloads 134

25271 Influence of Parameters of Modeling and Data Distribution for Optimal Condition on Locally Weighted Projection Regression Method

Authors: Farhad Asadi, Mohammad Javad Mollakazemi, Aref Ghafouri

Abstract:

Recent research in neural networks science and neuroscience for modeling complex time series data and statistical learning has focused mostly on learning from high input space and signals. Local linear models are a strong choice for modeling local nonlinearity in data series. Locally weighted projection regression is a flexible and powerful algorithm for nonlinear approximation in high dimensional signal spaces. In this paper, different learning scenario of one and two dimensional data series with different distributions are investigated for simulation and further noise is inputted to data distribution for making different disordered distribution in time series data and for evaluation of algorithm in locality prediction of nonlinearity. Then, the performance of this algorithm is simulated and also when the distribution of data is high or when the number of data is less the sensitivity of this approach to data distribution and influence of important parameter of local validity in this algorithm with different data distribution is explained.

Keywords: local nonlinear estimation, LWPR algorithm, online training method, locally weighted projection regression method

Procedia PDF Downloads 487

25270 Prediction of Sepsis Illness from Patients Vital Signs Using Long Short-Term Memory Network and Dynamic Analysis

Authors: Marcio Freire Cruz, Naoaki Ono, Shigehiko Kanaya, Carlos Arthur Mattos Teixeira Cavalcante

Abstract:

The systems that record patient care information, known as Electronic Medical Record (EMR) and those that monitor vital signs of patients, such as heart rate, body temperature, and blood pressure have been extremely valuable for the effectiveness of the patient’s treatment. Several kinds of research have been using data from EMRs and vital signs of patients to predict illnesses. Among them, we highlight those that intend to predict, classify, or, at least identify patterns, of sepsis illness in patients under vital signs monitoring. Sepsis is an organic dysfunction caused by a dysregulated patient's response to an infection that affects millions of people worldwide. Early detection of sepsis is expected to provide a significant improvement in its treatment. Preceding works usually combined medical, statistical, mathematical and computational models to develop detection methods for early prediction, getting higher accuracies, and using the smallest number of variables. Among other techniques, we could find researches using survival analysis, specialist systems, machine learning and deep learning that reached great results. In our research, patients are modeled as points moving each hour in an n-dimensional space where n is the number of vital signs (variables). These points can reach a sepsis target point after some time. For now, the sepsis target point was calculated using the median of all patients’ variables on the sepsis onset. From these points, we calculate for each hour the position vector, the first derivative (velocity vector) and the second derivative (acceleration vector) of the variables to evaluate their behavior. And we construct a prediction model based on a Long Short-Term Memory (LSTM) Network, including these derivatives as explanatory variables. The accuracy of the prediction 6 hours before the time of sepsis, considering only the vital signs reached 83.24% and by including the vectors position, speed, and acceleration, we obtained 94.96%. The data are being collected from Medical Information Mart for Intensive Care (MIMIC) Database, a public database that contains vital signs, laboratory test results, observations, notes, and so on, from more than 60.000 patients.

Keywords: dynamic analysis, long short-term memory, prediction, sepsis

Procedia PDF Downloads 111

25269 Modeling of Wind Loads on Heliostats Installed in South Algeria of Various Pylon Height

Authors: Hakim Merarda, Mounir Aksas, Toufik Arrif, Abd Elfateh Belaid, Amor Gama, Reski Khelifi

Abstract:

Knowledge of wind loads is important to develop a heliostat with good performance. These loads can be calculated by mathematical equations based on several parameters: the density, wind velocity, the aspect ratio of the mirror (height/width) and the coefficient of the height of the tower. Measurement data of the wind velocity and the density of the air are used in a numerical simulation of wind profile that was performed on heliostats with different pylon heights, with 1m^2 mirror areas and with aspect ratio of mirror equal to 1. These measurement data are taken from the meteorological station installed in Ghardaia, Algeria. The main aim of this work is to find a mathematical correlation between the wind loads and the height of the tower.

Keywords: heliostat, solar tower power, wind loads simulation, South Algeria

Procedia PDF Downloads 547

25268 Prediction Compressive Strength of Self-Compacting Concrete Containing Fly Ash Using Fuzzy Logic Inference System

Authors: Belalia Douma Omar, Bakhta Boukhatem, Mohamed Ghrici

Abstract:

Self-compacting concrete (SCC) developed in Japan in the late 80s has enabled the construction industry to reduce demand on the resources, improve the work condition and also reduce the impact of environment by elimination of the need for compaction. Fuzzy logic (FL) approaches has recently been used to model some of the human activities in many areas of civil engineering applications. Especially from these systems in the model experimental studies, very good results have been obtained. In the present study, a model for predicting compressive strength of SCC containing various proportions of fly ash, as partial replacement of cement has been developed by using Adaptive Neuro-Fuzzy Inference System (ANFIS). For the purpose of building this model, a database of experimental data were gathered from the literature and used for training and testing the model. The used data as the inputs of fuzzy logic models are arranged in a format of five parameters that cover the total binder content, fly ash replacement percentage, water content, super plasticizer and age of specimens. The training and testing results in the fuzzy logic model have shown a strong potential for predicting the compressive strength of SCC containing fly ash in the considered range.

Keywords: self-compacting concrete, fly ash, strength prediction, fuzzy logic

Procedia PDF Downloads 321

25267 Stock Prediction and Portfolio Optimization Thesis

Authors: Deniz Peksen

Abstract:

This thesis aims to predict trend movement of closing price of stock and to maximize portfolio by utilizing the predictions. In this context, the study aims to define a stock portfolio strategy from models created by using Logistic Regression, Gradient Boosting and Random Forest. Recently, predicting the trend of stock price has gained a significance role in making buy and sell decisions and generating returns with investment strategies formed by machine learning basis decisions. There are plenty of studies in the literature on the prediction of stock prices in capital markets using machine learning methods but most of them focus on closing prices instead of the direction of price trend. Our study differs from literature in terms of target definition. Ours is a classification problem which is focusing on the market trend in next 20 trading days. To predict trend direction, fourteen years of data were used for training. Following three years were used for validation. Finally, last three years were used for testing. Training data are between 2002-06-18 and 2016-12-30 Validation data are between 2017-01-02 and 2019-12-31 Testing data are between 2020-01-02 and 2022-03-17 We determine Hold Stock Portfolio, Best Stock Portfolio and USD-TRY Exchange rate as benchmarks which we should outperform. We compared our machine learning basis portfolio return on test data with return of Hold Stock Portfolio, Best Stock Portfolio and USD-TRY Exchange rate. We assessed our model performance with the help of roc-auc score and lift charts. We use logistic regression, Gradient Boosting and Random Forest with grid search approach to fine-tune hyper-parameters. As a result of the empirical study, the existence of uptrend and downtrend of five stocks could not be predicted by the models. When we use these predictions to define buy and sell decisions in order to generate model-based-portfolio, model-based-portfolio fails in test dataset. It was found that Model-based buy and sell decisions generated a stock portfolio strategy whose returns can not outperform non-model portfolio strategies on test dataset. We found that any effort for predicting the trend which is formulated on stock price is a challenge. We found same results as Random Walk Theory claims which says that stock price or price changes are unpredictable. Our model iterations failed on test dataset. Although, we built up several good models on validation dataset, we failed on test dataset. We implemented Random Forest, Gradient Boosting and Logistic Regression. We discovered that complex models did not provide advantage or additional performance while comparing them with Logistic Regression. More complexity did not lead us to reach better performance. Using a complex model is not an answer to figure out the stock-related prediction problem. Our approach was to predict the trend instead of the price. This approach converted our problem into classification. However, this label approach does not lead us to solve the stock prediction problem and deny or refute the accuracy of the Random Walk Theory for the stock price.

Keywords: stock prediction, portfolio optimization, data science, machine learning

Procedia PDF Downloads 68

25266 Artificial Neural Network-Based Prediction of Effluent Quality of Wastewater Treatment Plant Employing Data Preprocessing Approaches

Authors: Vahid Nourani, Atefeh Ashrafi

Abstract:

Prediction of treated wastewater quality is a matter of growing importance in water treatment procedure. In this way artificial neural network (ANN), as a robust data-driven approach, has been widely used for forecasting the effluent quality of wastewater treatment. However, developing ANN model based on appropriate input variables is a major concern due to the numerous parameters which are collected from treatment process and the number of them are increasing in the light of electronic sensors development. Various studies have been conducted, using different clustering methods, in order to classify most related and effective input variables. This issue has been overlooked in the selecting dominant input variables among wastewater treatment parameters which could effectively lead to more accurate prediction of water quality. In the presented study two ANN models were developed with the aim of forecasting effluent quality of Tabriz city’s wastewater treatment plant. Biochemical oxygen demand (BOD) was utilized to determine water quality as a target parameter. Model A used Principal Component Analysis (PCA) for input selection as a linear variance-based clustering method. Model B used those variables identified by the mutual information (MI) measure. Therefore, the optimal ANN structure when the result of model B compared with model A showed up to 15% percent increment in Determination Coefficient (DC). Thus, this study highlights the advantage of PCA method in selecting dominant input variables for ANN modeling of wastewater plant efficiency performance.

Keywords: Artificial Neural Networks, biochemical oxygen demand, principal component analysis, mutual information, Tabriz wastewater treatment plant, wastewater treatment plant

Procedia PDF Downloads 118

25265 Prediction of Rotating Machines with Rolling Element Bearings and Its Components Deterioration

Authors: Marimuthu Gurusamy

Abstract:

In vibration analysis (with accelerometers) of rotating machines with rolling element bearing, the customers are interested to know the failure of the machine well in advance to plan the spare inventory and maintenance. But in real world most of the machines fails before the prediction of vibration analyst or Expert analysis software. Presently the prediction of failure is based on ISO 10816 vibration limits only. But this is not enough to monitor the failure of machines well in advance. Because more than 50% of the machines will fail even the vibration readings are within acceptable zone as per ISO 10816.Hence it requires further detail analysis and different techniques to predict the failure well in advance. In vibration Analysis, the velocity spectrum is used to analyse the root cause of the mechanical problems like unbalance, misalignment and looseness etc. The envelope spectrum are used to analyse the bearing frequency components, hence the failure in inner race, outer race and rolling elements are identified. But so far there is no correlation made between these two concepts. The author used both velocity spectrum and Envelope spectrum to analyse the machine behaviour and bearing condition to correlated the changes in dynamic load (by unbalance, misalignment and looseness etc.) and effect of impact on the bearing. Hence we could able to predict the expected life of the machine and bearings in the rotating equipment (with rolling element bearings). Also we used process parameters like temperature, flow and pressure to correlate with flow induced vibration and load variations, when abnormal vibration occurs due to changes in process parameters. Hence by correlation of velocity spectrum, envelope spectrum and process data with 20 years of experience in vibration analysis, the author could able to predict the rotating Equipment and its component’s deterioration and expected duration for maintenance.

Keywords: vibration analysis, velocity spectrum, envelope spectrum, prediction of deterioration

Procedia PDF Downloads 437

25264 Use of Multistage Transition Regression Models for Credit Card Income Prediction

Authors: Denys Osipenko, Jonathan Crook

Abstract:

Because of the variety of the card holders’ behaviour types and income sources each consumer account can be transferred to a variety of states. Each consumer account can be inactive, transactor, revolver, delinquent, defaulted and requires an individual model for the income prediction. The estimation of transition probabilities between statuses at the account level helps to avoid the memorylessness of the Markov Chains approach. This paper investigates the transition probabilities estimation approaches to credit cards income prediction at the account level. The key question of empirical research is which approach gives more accurate results: multinomial logistic regression or multistage conditional logistic regression with binary target. Both models have shown moderate predictive power. Prediction accuracy for conditional logistic regression depends on the order of stages for the conditional binary logistic regression. On the other hand, multinomial logistic regression is easier for usage and gives integrate estimations for all states without priorities. Thus further investigations can be concentrated on alternative modeling approaches such as discrete choice models.

Keywords: multinomial regression, conditional logistic regression, credit account state, transition probability

Procedia PDF Downloads 472