Search results for: meteorological prediction data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25715

24575 The Relationship Between Artificial Intelligence, Data Science, and Privacy

Authors: M. Naidoo

Abstract:

Artificial intelligence often requires large amounts of good-quality data. Within important fields, such as healthcare, the training of AI systems predominantly relies on health and personal data; however, the usage of this data is complicated by various layers of law and ethics that seek to protect individuals’ privacy rights. This research seeks to establish the challenges AI and data science pose to (i) informational rights, (ii) privacy rights, and (iii) data protection. To solve some of the issues presented, various methods are suggested, such as embedding values in technological development, proper balancing of rights and interests, and others.

Keywords: artificial intelligence, data science, law, policy

Procedia PDF Downloads 95
24574 Non-Destructive Evaluation for Physical State Monitoring of an Angle Section Thin-Walled Curved Beam

Authors: Palash Dey, Sudip Talukdar

Abstract:

In this work, a hybrid approach is presented for obtaining both the intensity and the location of damage existing in thin-walled members. This hybrid approach is developed based on response surface methodology (RSM) and a genetic algorithm (GA). A theoretical finite element (FE) model of a cracked angle section thin-walled curved beam has been linked to the developed approach to carry out trial experiments to generate response surface functions (RSFs) of free, forced and heterogeneous dynamic response data. Subsequently, the error between the computed response surface functions and the measured dynamic response data has been minimized using the GA to find the optimum damage parameters (damage intensity and location). A single crack of varying location and depth has been considered in this study. The presented approach has been found to predict crack parameters with good accuracy and to possess great potential in crack detection, as it requires only the current response of a cracked beam.

Keywords: damage parameters, finite element, genetic algorithm, response surface methodology, thin walled curved beam

Procedia PDF Downloads 238
24573 Simulation Data Summarization Based on Spatial Histograms

Authors: Jing Zhao, Yoshiharu Ishikawa, Chuan Xiao, Kento Sugiura

Abstract:

In order to analyze large-scale scientific data, research on data exploration and visualization has gained popularity. In this paper, we focus on the exploration and visualization of scientific simulation data, and define a spatial V-Optimal histogram for data summarization. We propose histogram construction algorithms based on a general binary hierarchical partitioning as well as a more specific one, the l-grid partitioning. For effective data summarization and efficient data visualization in scientific data analysis, we propose an optimal algorithm as well as a heuristic algorithm for histogram construction. To verify the effectiveness and efficiency of the proposed methods, we conduct experiments on the massive evacuation simulation data.
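As a rough illustration of the V-Optimal idea named in the abstract above, the following sketch builds a one-dimensional V-Optimal histogram by dynamic programming: it splits a sequence into contiguous buckets so that the total within-bucket sum of squared errors is minimal. This is a simplified stand-in, not the spatial or l-grid algorithm the authors propose; the function name `v_optimal_histogram` and the sample values are assumptions made for illustration.

```python
def v_optimal_histogram(values, n_buckets):
    """Return (total_sse, bucket_start_indices) for contiguous buckets."""
    n = len(values)
    if not 1 <= n_buckets <= n:
        raise ValueError("n_buckets must be between 1 and len(values)")

    # Prefix sums give the SSE of any range values[i:j] in O(1).
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for i, v in enumerate(values):
        prefix[i + 1] = prefix[i] + v
        prefix_sq[i + 1] = prefix_sq[i] + v * v

    def sse(i, j):
        s = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - s * s / (j - i)

    inf = float("inf")
    # best[b][j]: minimum SSE covering values[:j] with exactly b buckets.
    best = [[inf] * (n + 1) for _ in range(n_buckets + 1)]
    cut = [[0] * (n + 1) for _ in range(n_buckets + 1)]
    best[0][0] = 0.0
    for b in range(1, n_buckets + 1):
        for j in range(b, n + 1):
            for i in range(b - 1, j):
                cost = best[b - 1][i] + sse(i, j)
                if cost < best[b][j]:
                    best[b][j] = cost
                    cut[b][j] = i

    # Walk back through the stored cuts to recover the bucket starts.
    starts, j = [], n
    for b in range(n_buckets, 0, -1):
        j = cut[b][j]
        starts.append(j)
    return best[n_buckets][n], starts[::-1]


# Hypothetical cell counts along one axis of a simulation grid.
cells = [1, 1, 1, 10, 10, 10]
total_sse, starts = v_optimal_histogram(cells, 2)
print(total_sse, starts)
```

The triple loop costs O(B·n²) time, which is one reason heuristic constructions become attractive for large simulation outputs.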

Keywords: simulation data, data summarization, spatial histograms, exploration, visualization

Procedia PDF Downloads 166
24572 Investigating the Performance of Machine Learning Models on PM2.5 Forecasts: A Case Study in the City of Thessaloniki

Authors: Alexandros Pournaras, Anastasia Papadopoulou, Serafim Kontos, Anastasios Karakostas

Abstract:

The air quality of modern cities is an important concern, as poor air quality contributes to human health and environmental issues. Reliable air quality forecasting has thus gained scientific and governmental attention as an essential tool that enables authorities to take proactive measures for public safety. In this study, the potential of Machine Learning (ML) models to forecast PM2.5 at local scale is investigated in the city of Thessaloniki, the second largest city in Greece, which has been struggling with the persistent issue of air pollution. ML models with proven ability in time series forecasting are employed to predict the PM2.5 concentrations and the respective Air Quality Index (AQI) 5 days ahead by learning from daily historical air quality and meteorological data from 2014 to 2016, gathered from two stations with different land use characteristics in the urban fabric of Thessaloniki. The performance of the ML models on PM2.5 concentrations is evaluated with common statistical measures, such as R squared (R²) and Root Mean Squared Error (RMSE), using a portion of the stations’ measurements as a test set. A multi-categorical evaluation is used to assess their performance on the respective AQIs. Several conclusions were drawn from the experiments conducted. Experimenting with the ML models’ configuration revealed a moderate effect of various parameters and training schemas on the models’ predictions. All of these models were found to produce satisfactory results on PM2.5 concentrations. In addition, their application to stations not used in training showed that these models can perform well, indicating generalized behavior. Moreover, their performance on AQI was even better, showing that the ML models can be used as predictors of AQI, which is the information provided directly to the general public.
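The two evaluation measures named in the abstract above can be sketched in a few lines. This is a minimal illustration using the standard definitions of RMSE and R²; the function names and the PM2.5 values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R squared is undefined for constant observations")
    return 1.0 - ss_res / ss_tot


# Hypothetical daily PM2.5 concentrations in µg/m³ (illustrative only).
observed = [12.0, 18.0, 25.0, 30.0, 15.0]
predicted = [14.0, 17.0, 22.0, 33.0, 15.0]
print(rmse(observed, predicted), r_squared(observed, predicted))
```

RMSE is expressed in the units of the measurement, while R² is unitless, which is why the two are commonly reported together.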

Keywords: Air Quality, AQ Forecasting, AQI, Machine Learning, PM2.5

Procedia PDF Downloads 65
24571 Crack Growth Life Prediction of a Fighter Aircraft Wing Splice Joint Under Spectrum Loading Using Random Forest Regression and Artificial Neural Networks with Hyperparameter Optimization

Authors: Zafer Yüce, Paşa Yayla, Alev Taşkın

Abstract:

There are many analytical methods to estimate the crack growth life of a component. Soft computing methods are increasingly used to predict fatigue life. Their ability to build complex relationships and their capability to handle huge amounts of data are motivating researchers and industry professionals to employ them for challenging problems. This study focuses on soft computing methods, especially random forest regressors and artificial neural networks with hyperparameter optimization algorithms such as grid search and random grid search, to estimate the crack growth life of an aircraft wing splice joint under variable amplitude loading. The TensorFlow and Scikit-learn libraries of Python are used to build the machine learning models for this study. The material considered in this work is 7050-T7451 aluminum, which is commonly preferred as a structural element in the aerospace industry, and the crack type considered is a corner crack. A finite element model is built for the joint to calculate fastener loads and stresses on the structure. Since the finite element model results are validated with analytical calculations, the findings of the finite element model are fed to AFGROW software to calculate analytical crack growth lives. Based on the Fighter Aircraft Loading Standard for Fatigue (FALSTAFF), 90 unique fatigue loading spectra are developed for various load levels, and these spectra are then used as inputs to the artificial neural network and random forest regression models for predicting crack growth life. Finally, the crack growth life predictions of the machine learning models are compared with analytical calculations. According to the findings, a good correlation is observed between analytical and predicted crack growth lives.

Keywords: aircraft, fatigue, joint, life, optimization, prediction

Procedia PDF Downloads 157
24570 Identification of Potential Predictive Biomarkers for Early Diagnosis of Preeclampsia: Growth Factors to microRNAs

Authors: Sadia Munir

Abstract:

Preeclampsia is a contributor to worldwide maternal mortality, accounting for approximately 100,000 deaths a year. It complicates about 10% of all pregnancies and is the leading cause of maternal admission to intensive care units. Predicting preeclampsia is a major challenge in obstetrics. More importantly, no major progress has been achieved in the treatment of preeclampsia. As the placenta is the main cause of the disease, the only way to treat the disease is to remove the placenta and deliver the baby. In developed countries, the cost of an average case of preeclampsia is estimated at £9000. Notably, preeclampsia may have an impact on the health of the mother or infant beyond the pregnancy. We performed a systematic search of PubMed using combinations of terms such as preeclampsia, biomarkers, treatment, hypoxia, inflammation, oxidative stress, vascular endothelial growth factor A, activin A, inhibin A, placental growth factor, transforming growth factor β-1, Nodal, placenta, trophoblast cells, and microRNAs. In this review, we have summarized current knowledge on the identification of potential biomarkers for the diagnosis of preeclampsia. Although these studies show promising data in the early diagnosis of preeclampsia, the current value of these factors as biomarkers for the precise prediction of preeclampsia has its limitations. Therefore, future studies are needed to support some of the very promising and interesting data and to develop affordable and widely available tests for the early detection and treatment of preeclampsia.

Keywords: activin, biomarkers, growth factors, microRNA

Procedia PDF Downloads 431
24569 Cross-Validation of the Data Obtained for ω-6 Linoleic and ω-3 α-Linolenic Acids Concentration of Hemp Oil Using Jackknife and Bootstrap Resampling

Authors: Vibha Devi, Shabina Khanam

Abstract:

Hemp (Cannabis sativa) possesses a rich content of ω-6 linoleic and ω-3 α-linolenic essential fatty acids in the ratio of 3:1, a rare and highly desired ratio that enhances the quality of hemp oil. These components are beneficial for cell and body growth, strengthen the immune system, possess anti-inflammatory action, lower the risk of heart problems owing to their anti-clotting property, and are a remedy for arthritis and various disorders. The present study applies supercritical fluid extraction (SFE) to hemp seed under various conditions of the following parameters: temperature (40 - 80) °C, pressure (200 - 350) bar, flow rate (5 - 15) g/min, particle size (0.430 - 1.015) mm and amount of co-solvent (0 - 10) % of solvent flow rate, using a central composite design (CCD). The CCD suggested 32 sets of experiments, which were carried out. As the SFE process includes a large number of variables, the present study recommends the application of resampling techniques for cross-validation of the obtained data. Cross-validation refits the model on each resampled dataset to obtain information regarding the error, variability, deviation, etc. Bootstrap and jackknife are the most popular resampling techniques; they create a large number of datasets by resampling from the original dataset and analyze these to check the validity of the obtained data. Jackknife resampling is based on eliminating one observation from the original sample of size N without replacement. For jackknife resampling, the sample size is 31 (eliminating one observation), and the procedure is repeated 32 times. Bootstrap is a frequently used statistical approach for estimating the sampling distribution of an estimator by resampling with replacement from the original sample. For bootstrap resampling, the sample size is 32, and the procedure was repeated 100 times. The estimands considered for these resampling techniques are the mean, standard deviation, variation coefficient and standard error of the mean.
For ω-6 linoleic acid concentration, the mean value was approx. 58.5 for both resampling methods, which is the average (central value) of the sample means of all data points. Similarly, for ω-3 α-linolenic acid concentration, the mean was observed as 22.5 with both resampling methods. Variance expresses the spread of the data about its mean. A greater value of variance reflects a larger range of output data: it is 18 for ω-6 linoleic acid (ranging from 48.85 to 63.66 %) and 6 for ω-3 α-linolenic acid (ranging from 16.71 to 26.2 %). Further, the low value of standard deviation (approx. 1 %), low standard error of the mean (< 0.8) and low variation coefficient (< 0.2) reflect the accuracy of the sample for prediction. All the estimated values of the variation coefficient, standard deviation and standard error of the mean are found within the 95 % confidence interval.
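The two resampling schemes described in the abstract above can be sketched as follows: the jackknife drops one observation at a time (32 replicates of size 31), and the bootstrap draws 100 resamples of size 32 with replacement. The data here are synthetic values drawn uniformly over the ω-6 range quoted in the abstract, purely for illustration; the function names and the random seed are assumptions, and the summary statistics will not match the published ones.

```python
import random
import statistics


def jackknife_means(sample):
    """Leave-one-out means: N replicates, each computed from N - 1 values."""
    total = sum(sample)
    n = len(sample)
    return [(total - x) / (n - 1) for x in sample]


def bootstrap_means(sample, n_resamples, seed=0):
    """Means of resamples of size N drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)]


def summarize(values):
    """Mean, standard deviation, variation coefficient and standard error."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return {"mean": mean, "sd": sd, "cv": sd / mean, "sem": sd / len(values) ** 0.5}


# Synthetic stand-in for 32 measured concentrations (not the study's data).
rng = random.Random(42)
sample = [rng.uniform(48.85, 63.66) for _ in range(32)]

jack = jackknife_means(sample)        # 32 replicates of size 31
boot = bootstrap_means(sample, 100)   # 100 replicates of size 32
print(summarize(jack))
print(summarize(boot))
```

The average of the jackknife means always equals the original sample mean, which makes a convenient sanity check on an implementation.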

Keywords: resampling, supercritical fluid extraction, hemp oil, cross-validation

Procedia PDF Downloads 131
24568 Prediction of Remaining Life of Industrial Cutting Tools with Deep Learning-Assisted Image Processing Techniques

Authors: Gizem Eser Erdek

Abstract:

This study investigates predicting the remaining life of industrial cutting tools used in the industrial production process with deep learning methods. As the life of cutting tools decreases, they damage the raw material they are processing. This study aims to predict the remaining life of the cutting tool based on the damage caused by the cutting tools to the raw material. For this, hole photos were collected from the hole-drilling machine for 8 months. Photos were labeled in 5 classes according to hole quality. In this way, the problem was transformed into a classification problem. Using the prepared data set, a model was created with convolutional neural networks, which is a deep learning method. In addition, the VGGNet and ResNet architectures, which have been successful in the literature, were tested on the data set. A hybrid model using convolutional neural networks and support vector machines was also used for comparison. When all models are compared, the model using convolutional neural networks gives successful results, with an accuracy of 74%. In the preliminary studies, the data set was arranged to include only the best and worst classes, and the study gave ~93% accuracy when the binary classification model was applied. The results of this study showed that the remaining life of the cutting tools could be predicted by deep learning methods based on the damage to the raw material. The experiments indicate that deep learning methods can be used as an alternative for cutting tool life estimation.

Keywords: classification, convolutional neural network, deep learning, remaining life of industrial cutting tools, ResNet, support vector machine, VGGNet

Procedia PDF Downloads 60
24567 Soybean Seed Composition Prediction From Standing Crops Using PlanetScope Satellite Imagery and Machine Learning

Authors: Supria Sarkar, Vasit Sagan, Sourav Bhadra, Meghnath Pokharel, Felix B. Fritschi

Abstract:

Soybeans and their derivatives are very important agricultural commodities around the world because of their wide applicability in human food, animal feed, biofuel, and industry. However, the significance of soybean production depends on the quality of the soybean seeds rather than the yield alone. Seed composition is widely dependent on plant physiological properties, aerobic and anaerobic environmental conditions, nutrient content, and plant phenological characteristics, which can be captured by high temporal resolution remote sensing datasets. PlanetScope (PS) satellite images have high potential for providing sequential information on crop growth due to their frequent revisits throughout the world. In this study, we estimate soybean seed composition while the plants are in the field by using PS satellite images and different machine learning algorithms. Several experimental fields were established with varying genotypes, and different seed compositions were measured from the samples as ground truth data. The PS images were processed to extract 462 hand-crafted vegetative and textural features. Four machine learning algorithms, i.e., partial least squares (PLSR), random forest (RFR), gradient boosting machine (GBM), and support vector machine (SVM), and two recurrent neural network architectures, i.e., long short-term memory (LSTM) and gated recurrent unit (GRU), were used in this study to predict the oil, protein, sucrose, ash, starch, and fiber of soybean seed samples. The GRU and LSTM architectures had two separate branches, one for vegetative features and the other for texture features, which were later concatenated to predict seed composition. The results show that sucrose, ash, protein, and oil yielded comparable prediction results. The machine learning algorithm that performed best differed across the six seed composition traits.
GRU worked well for oil (R-squared: 0.53) and protein (R-squared: 0.36), whereas SVR and PLSR showed the best results for sucrose (R-squared: 0.74) and ash (R-squared: 0.60), respectively. Although RFR and GBM provided comparable performance, these models tended to overfit severely. Among the features, vegetative features were found to be more important than texture features. It is suggested to use many vegetation indices for machine learning training and to select the best ones by using feature selection methods. Overall, the study reveals the feasibility and efficiency of PS images and machine learning for plot-level seed composition estimation. However, special care should be taken when designing the plot size in the experiments to avoid mixed-pixel issues.

Keywords: agriculture, computer vision, data science, geospatial technology

Procedia PDF Downloads 125
24566 Gas Holdups in a Gas-Liquid Upflow Bubble Column With Internal

Authors: C. Milind Caspar, Valtonia Octavio Massingue, K. Maneesh Reddy, K. V. Ramesh

Abstract:

Gas holdup data were obtained from measured pressure drop values in a gas-liquid upflow bubble column in the presence of a string-of-hemispheres promoter internal. The parameters that influenced the gas holdup are gas velocity, liquid velocity, promoter rod diameter, pitch and base diameter of the hemisphere. Tap water was used as the liquid phase and nitrogen as the gas phase. About 26 percent in gas holdup was obtained due to the insertion of the promoter in the present study in comparison with the empty conduit. Pitch and rod diameter did not show any influence on gas holdup, whereas gas holdup was strongly influenced by gas velocity, liquid velocity and hemisphere base diameter. A correlation equation was obtained for the prediction of gas holdup by least squares regression analysis.

Keywords: bubble column, gas-holdup, two-phase flow, turbulent promoter

Procedia PDF Downloads 96
24565 Integrating Artificial Neural Network and Taguchi Method on Constructing the Real Estate Appraisal Model

Authors: Mu-Yen Chen, Min-Hsuan Fan, Chia-Chen Chen, Siang-Yu Jhong

Abstract:

In recent years, real estate prediction or valuation has been a topic of discussion in many developed countries. Improper hype created by investors leads to fluctuating real estate prices, affecting many consumers' ability to purchase their own homes. Therefore, scholars from various countries have conducted research in real estate valuation and prediction. Using the back-propagation neural network, which has been popular in recent years, and the orthogonal array of the Taguchi method, this study aimed to find the optimal parameter combination at different levels of the orthogonal array after the system presented different parameter combinations, so that the artificial neural network obtained the most accurate results. The experimental results demonstrated that the method presented in the study gave better results than traditional machine learning. They also showed that the model proposed in this study had the best predictive performance and could significantly reduce the time cost of simulation. The best predictive results could be found more efficiently, with fewer experiments. Thus, users could predict a real estate transaction price that is not far from current actual prices.

Keywords: artificial neural network, Taguchi method, real estate valuation model, investors

Procedia PDF Downloads 472
24564 Factors Impacting Geostatistical Modeling Accuracy and Modeling Strategy of Fluvial Facies Models

Authors: Benbiao Song, Yan Gao, Zhuo Liu

Abstract:

Geostatistical modeling is the key technique for reservoir characterization, and the quality of geological models greatly influences the prediction of reservoir performance, but few studies have been done to quantify the factors impacting geostatistical reservoir modeling accuracy. In this study, 16 fluvial prototype models have been established to represent different geological complexity, and 6 cases ranging from 16 to 361 wells were defined to reproduce all 16 prototype models by different methodologies, including SIS, object-based and MPFS algorithms, together with different constraint parameters. A modeling accuracy ratio was defined to quantify the influence of each factor, and ten realizations were averaged to represent each accuracy ratio under the same modeling conditions and parameter association. In total, 5760 simulations were done to quantify the relative contribution of each factor to the simulation accuracy, and the results can be used as a strategy guide for facies modeling under similar conditions. It is found that data density, geological trend and geological complexity have a great impact on modeling accuracy. Modeling accuracy may reach up to 90% when channel sand width reaches 1.5 times the well spacing, under any condition, with the SIS and MPFS methods. When well density is low, the contribution of the geological trend may increase the modeling accuracy from 40% to 70%, while the use of a proper variogram may make a very limited contribution for the SIS method. This implies that when well data are dense enough to cover simple geobodies, little effort is needed to construct an acceptable model; when geobodies are complex and data are insufficient, it is better to construct a set of robust geological trends than to rely on a reliable variogram function. For the object-based method, the modeling accuracy does not increase as clearly with data density as it does for the SIS method, but the models keep a rational appearance when data density is low.
MPFS methods show a trend similar to that of the SIS method, but the use of a proper geological trend together with a rational variogram may give better modeling accuracy than the MPFS method. This implies that the geological modeling strategy for a real reservoir case needs to be optimized by evaluating the dataset, geological complexity, geological constraint information and the modeling objective.

Keywords: fluvial facies, geostatistics, geological trend, modeling strategy, modeling accuracy, variogram

Procedia PDF Downloads 252
24563 Diagnosis of Diabetes Using Computer Methods: Soft Computing Methods for Diabetes Detection Using Iris

Authors: Piyush Samant, Ravinder Agarwal

Abstract:

Complementary and Alternative Medicine (CAM) techniques are quite popular and effective for chronic diseases. Iridology is a CAM technique more than 150 years old that analyzes the patterns, tissue weakness, color, shape, structure, etc. of the iris for disease diagnosis. The objective of this paper is to validate the use of iridology for the diagnosis of diabetes. The suggested model was applied to a systemic disease with ocular effects. Data from 200 subjects, 100 diabetic and 100 non-diabetic, were evaluated. The complete procedure was kept very simple and free from the involvement of any iridologist. From the normalized iris, the region of interest was cropped. In all, 63 features were extracted using statistical analysis, texture analysis, and the two-dimensional discrete wavelet transformation. A comparison of the accuracies of six different classifiers is presented. The results show 89.66% accuracy for the random forest classifier.

Keywords: complementary and alternative medicine, classification, iridology, iris, feature extraction, disease prediction

Procedia PDF Downloads 389
24562 Development of Web-Based Iceberg Detection Using Deep Learning

Authors: A. Kavya Sri, K. Sai Vineela, R. Vanitha, S. Rohith

Abstract:

Large pieces of ice that break off from glaciers are known as icebergs. The threat that icebergs pose to navigation, offshore oil and gas production, and underwater pipelines makes their detection crucial. In this project, an automated iceberg tracking method using deep learning techniques and satellite images of icebergs is to be developed. With a temporal resolution of 12 days and a spatial resolution of 20 m, Sentinel-1 (SAR) images can be used to track iceberg drift over the Southern Ocean. In contrast to multispectral images, SAR images can be used for analysis regardless of meteorological conditions. This project develops a web-based graphical user interface to detect and track icebergs using Sentinel-1 images. The movement of the icebergs is tracked using temporal images, based on their latitude and longitude values and by comparing the center and area of all detected icebergs. Accuracy is tested using precision and recall measures.

Keywords: synthetic aperture radar (SAR), icebergs, deep learning, spatial resolution, temporal resolution

Procedia PDF Downloads 78
24561 A Long Short-Term Memory Based Deep Learning Model for Corporate Bond Price Predictions

Authors: Vikrant Gupta, Amrit Goswami

Abstract:

The fixed income market forms the basis of the modern financial market. All other assets in financial markets derive their value from the bond market. Owing to their over-the-counter nature, corporate bonds have relatively little data publicly available and are thus researched far less than equities. Bond price prediction is a complex financial time series forecasting problem and is considered crucial in the domain of finance. Bond prices are highly volatile and noisy, which makes it very difficult for traditional statistical time series models to capture the complexity of the series patterns and leads to inefficient forecasts. To overcome the inefficiencies of statistical models, various machine learning techniques were initially used in the literature for more accurate forecasting of time series. However, simple machine learning methods such as linear regression, support vector machines and random forests fail to provide efficient results when tested on highly complex sequences such as stock prices and bond prices. Hence, to capture these intricate sequence patterns, various deep learning-based methodologies have been discussed in the literature. In this study, a recurrent neural network-based deep learning model using long short-term memory (LSTM) networks for the prediction of corporate bond prices is discussed. LSTM networks have been widely used in the literature for sequence learning tasks in domains such as machine translation, speech recognition, etc. In recent years, various studies have discussed the effectiveness of LSTMs in forecasting complex time series and have shown promising results compared to other methodologies. LSTMs are a special kind of recurrent neural network capable of learning long-term dependencies, owing to a memory function that traditional neural networks lack.
In this study, a simple LSTM, a Stacked LSTM and a Masked LSTM based model are discussed with respect to varying input sequences (three days, seven days and 14 days). In order to facilitate faster learning and to gradually decompose the complexity of the bond price sequence, Empirical Mode Decomposition (EMD) has been used, which resulted in an accuracy improvement over the standalone LSTM model. With a variety of technical indicators and EMD-decomposed time series, the Masked LSTM outperformed its two counterparts in terms of prediction accuracy. To benchmark the proposed model, the results have been compared with traditional time series models (ARIMA), shallow neural networks and the three LSTM models discussed above. In summary, our results show that the use of LSTM models provides more accurate results and should be explored further within the asset management industry.
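The "varying input sequences" mentioned in the abstract above (three, seven and 14 days) correspond to the usual sliding-window framing of a series for sequence models. The sketch below shows only that data-preparation step, not the LSTM, EMD or masking used by the authors; `make_windows` and the price series are hypothetical.

```python
def make_windows(series, window):
    """Pair each run of `window` past values with the value that follows it."""
    if window < 1 or window >= len(series):
        raise ValueError("window must be in [1, len(series) - 1]")
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets


# Hypothetical daily bond prices (illustrative only).
prices = [100.0 + 0.5 * day for day in range(20)]

for window in (3, 7, 14):
    X, y = make_windows(prices, window)
    print(window, len(X), len(y))
```

Longer windows give the model more history per example but leave fewer training examples from the same series, a trade-off that is particularly visible on sparse corporate bond data.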

Keywords: bond prices, long short-term memory, time series forecasting, empirical mode decomposition

Procedia PDF Downloads 126
24560 Fat-Tail Test of Regulatory DNA Sequences

Authors: Jian-Jun Shu

Abstract:

The statistical properties of cis-regulatory modules (CRMs) are explored by estimating the similar-word set occurrence distribution. It is observed that CRMs tend to have a fat-tail distribution for similar-word set occurrence. Thus, the fat-tail test with two fatness coefficients is proposed to distinguish CRMs from non-CRMs, especially from exons. For the first fatness coefficient, the separation accuracy between CRMs and exons is increased compared with the existing content-based CRM prediction method, the fluffy-tail test. For the second fatness coefficient, the computing time is reduced compared with the fluffy-tail test, making it very suitable for long sequences and large database analysis in the post-genome era. Moreover, these indexes may be used to predict CRMs that have not yet been observed experimentally. This can serve as a valuable filtering process for experiments.

Keywords: statistical approach, transcription factor binding sites, cis-regulatory modules, DNA sequences

Procedia PDF Downloads 278
24559 Statistical Modelling of Maximum Temperature in Rwanda Using Extreme Value Analysis

Authors: Emmanuel Iyamuremye, Edouard Singirankabo, Alexis Habineza, Yunvirusaba Nelson

Abstract:

Temperature is one of the most important climatic factors for crop production. However, severe temperatures cause drought, heat waves and cold spells that have various consequences for human life, agriculture, and the environment in general. It is necessary to provide reliable information on the incidence and probability of such extreme events. In the 21st century, the world faces a huge number of threats, especially from climate change, due to global warming and environmental degradation. The rise in temperature has a direct effect on the decrease in rainfall. This has an impact on crop growth and development, which in turn decreases crop yield and quality. Countries that are heavily dependent on agriculture tend to suffer greatly and need to take preventive steps to overcome these challenges. The main objective of this study is to model the statistical behaviour of extreme maximum temperature values in Rwanda. To achieve this objective, daily temperature data spanning the period from January 2000 to December 2017, recorded at nine weather stations and collected from the Rwanda Meteorological Agency, were used. Two methods, namely the block maxima (BM) method and the Peaks Over Threshold (POT) method, were applied to model and analyse extreme temperature. Model parameters were estimated, and the extreme temperature return periods and confidence intervals were predicted. The model fit suggests the Gumbel and Beta distributions to be the most appropriate models for the annual maximum of daily temperature. The results show that the temperature will continue to increase, as shown by the estimated return levels.
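The block maxima approach in the abstract above can be illustrated with a small sketch: take the maximum of each yearly block, fit a Gumbel distribution, and read off a return level. The abstract does not say how the parameters were estimated, so a method-of-moments fit is used here as a simple assumption (maximum likelihood is the more common choice in practice). All function names and temperatures are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def block_maxima(daily_by_year):
    """Annual maxima from a {year: [daily values]} mapping, in year order."""
    return [max(values) for _, values in sorted(daily_by_year.items())]


def fit_gumbel_moments(maxima):
    """Method-of-moments Gumbel fit: returns (location, scale)."""
    scale = statistics.stdev(maxima) * math.sqrt(6) / math.pi
    location = statistics.fmean(maxima) - EULER_GAMMA * scale
    return location, scale


def return_level(location, scale, period_years):
    """Level exceeded on average once every `period_years` blocks."""
    return location - scale * math.log(-math.log(1.0 - 1.0 / period_years))


# Hypothetical annual maximum temperatures in °C (illustrative only).
annual_max = [31.2, 32.0, 30.8, 33.1, 32.4, 31.9, 33.5, 32.7]
loc, scale = fit_gumbel_moments(annual_max)
for period in (10, 50, 100):
    print(period, round(return_level(loc, scale, period), 2))
```

The return level grows with the logarithm of the return period under a Gumbel model, so doubling the period adds a roughly constant increment rather than doubling the temperature.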

Keywords: climate change, global warming, extreme value theory, Rwanda, temperature, generalised extreme value distribution, generalised pareto distribution

Procedia PDF Downloads 162
24558 Algorithms Used in Spatial Data Mining GIS

Authors: Vahid Bairami Rad

Abstract:

Extracting knowledge from spatial data such as GIS data is important for reducing the data and extracting information. Therefore, the development of new techniques and tools that support humans in transforming data into useful knowledge has been the focus of the relatively new and interdisciplinary research area ‘knowledge discovery in databases’. Thus, we introduce a set of database primitives, or basic operations, for spatial data mining that are sufficient to express most of the spatial data mining algorithms from the literature. This approach has several advantages. As with the relational standard language SQL, the use of standard primitives will speed up the development of new data mining algorithms and will also make them more portable. We introduce a database-oriented framework for spatial data mining that is based on the concepts of neighborhood graphs and paths. A small set of basic operations on these graphs and paths is defined as database primitives for spatial data mining. Furthermore, techniques to efficiently support the database primitives in a commercial DBMS are presented.

Keywords: spatial data base, knowledge discovery database, data mining, spatial relationship, predictive data mining

Procedia PDF Downloads 444
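The neighborhood graphs and paths mentioned in the abstract above can be illustrated with a small in-memory sketch. It assumes a single neighborhood relation (Euclidean distance at most `max_dist` between point objects); the abstract does not specify the relations or the primitive names, so `neighborhood_graph`, `neighbors`, and `neighborhood_paths` are hypothetical.

```python
import math
from collections import defaultdict


def neighborhood_graph(objects, max_dist):
    """Build an undirected neighborhood graph.

    objects: dict mapping an object name to its (x, y) position.
    Two objects are neighbors when their Euclidean distance is <= max_dist.
    """
    graph = defaultdict(set)
    names = sorted(objects)
    for i, a in enumerate(names):
        graph[a]  # touch the key so isolated objects still appear as nodes
        for b in names[i + 1:]:
            if math.dist(objects[a], objects[b]) <= max_dist:
                graph[a].add(b)
                graph[b].add(a)
    return dict(graph)


def neighbors(graph, node):
    """Primitive 1: all direct neighbors of a node, in sorted order."""
    return sorted(graph.get(node, ()))


def neighborhood_paths(graph, start, length):
    """Primitive 2: all simple paths with `length` edges starting at `start`."""
    paths = [[start]]
    for _ in range(length):
        paths = [p + [n]
                 for p in paths
                 for n in sorted(graph[p[-1]])
                 if n not in p]
    return paths
```

A mining algorithm would then be written against `neighbors` and `neighborhood_paths` only, which is the portability argument the abstract makes by analogy with SQL.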
24557 Predicting Radioactive Waste Glass Viscosity, Density and Dissolution with Machine Learning

Authors: Joseph Lillington, Tom Gout, Mike Harrison, Ian Farnan

Abstract:

The vitrification of high-level nuclear waste within borosilicate glass and its incorporation within a multi-barrier repository deep underground is widely accepted as the preferred disposal method. However, for this to happen, any safety case will require validation that the initially localized radionuclides will not be considerably released into the near/far-field. Therefore, accurate mechanistic models are necessary to predict glass dissolution, and these should be robust to a variety of incorporated waste species and leaching test conditions, particularly given substantial variations across international waste-streams. Here, machine learning is used to predict glass material properties (viscosity, density) and glass leaching model parameters from large-scale industrial data. A variety of machine learning algorithms have been compared to assess performance. Density was predicted solely from composition, whereas viscosity additionally considered temperature. To predict suitable glass leaching model parameters, a large simulated dataset was created by coupling MATLAB and the chemical reactive-transport code HYTEC, considering the state-of-the-art GRAAL model (glass reactivity in allowance of the alteration layer). The trained models were then applied to the large-scale industrial, experimental data to identify potentially appropriate model parameters. Results indicate that ensemble methods can accurately predict viscosity as a function of temperature and composition across all three industrial datasets. Glass density prediction shows reliable learning performance, with predictions primarily being within the experimental uncertainty of the test data. Furthermore, machine learning can predict the behavior of glass dissolution model parameters, demonstrating potential value in GRAAL model development and in assessing suitable model parameters for large-scale industrial glass dissolution data.

Keywords: machine learning, predictive modelling, pattern recognition, radioactive waste glass

Procedia PDF Downloads 106
24556 The Effect of Land Cover on Movement of Vehicles in the Terrain

Authors: Krisstalova Dana, Mazal Jan

Abstract:

This article deals with geographical conditions in terrain and their effect on the movement of vehicles, in particular on the speed and safety of movement of people and vehicles. Finding optimal routes off the road network is studied in the military environment, but it occurs in the civilian sector as well, primarily in crisis situations or when providing assistance after natural disasters such as floods, fires, and storms. These movements require route optimization that accounts for the effects of geographical factors. The most important factor is the surface of the terrain. It depends on several geographical factors, such as slopes, soil conditions, micro-relief, type of surface, and meteorological conditions. Their combined impact is expressed by a coefficient of deceleration. This coefficient can be used to support the commander's decision. New approaches and methods of terrain testing, mathematical computing, mathematical statistics, and cartometric investigation are necessary parts of this evaluation.

Keywords: movement in a terrain, geographical factors, surface of a field, mathematical evaluation, optimization and searching paths

Procedia PDF Downloads 413
24555 A Semantic and Concise Structure to Represent Human Actions

Authors: Tobias Strübing, Fatemeh Ziaeetabar

Abstract:

Humans usually manipulate objects with their hands. To represent these actions in a simple and understandable way, we need to use a semantic framework. For this purpose, the Semantic Event Chain (SEC) method has already been presented, which considers touching and non-touching relations between manipulated objects in a scene. This method was improved by a computational model, the so-called enriched Semantic Event Chain (eSEC), which incorporates the information of static (e.g. top, bottom) and dynamic spatial relations (e.g. moving apart, getting closer) between objects in an action scene. This leads to better action prediction as well as the ability to distinguish between more actions. Each eSEC manipulation descriptor is a huge matrix with thirty rows and a massive set of the spatial relations between each pair of manipulated objects. The current eSEC framework has so far only been used in the category of manipulation actions, which eventually involve two hands. Here, we would like to extend this approach to a whole body action descriptor and make a conjoint activity representation structure. For this purpose, we need to do a statistical analysis to modify the current eSEC by summarizing it while preserving its features, and introduce a new version called Enhanced eSEC (e2SEC). This summarization can be done from two points of view: 1) reducing the number of rows in an eSEC matrix, 2) shrinking the set of possible semantic spatial relations. To achieve these, we computed the importance of each matrix row in a statistical way, to see if it is possible to remove a particular one while all manipulations are still distinguishable from each other. On the other hand, we examined which semantic spatial relations can be merged without compromising the unity of the predefined manipulation actions.
Therefore, by performing the above analyses, we made the new e2SEC framework, which has 20% fewer rows, 16.7% fewer static spatial relations, and 11.1% fewer dynamic spatial relations. This simplification, while preserving the salient features of a semantic structure in representing actions, has a tremendous impact on the recognition and prediction of complex actions, as well as the interactions between humans and robots. It also creates a comprehensive platform to integrate with the body limbs descriptors and dramatically increases system performance, especially in complex real time applications such as human-robot interaction prediction.

Keywords: enriched semantic event chain, semantic action representation, spatial relations, statistical analysis

Procedia PDF Downloads 115
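The row-reduction idea in the abstract above (drop a matrix row only if every manipulation remains distinguishable) can be sketched with a simple greedy check. This is an assumed stand-in for the authors' statistical importance analysis, which the abstract does not detail; the descriptor layout (one hashable value per row) and the names `distinguishable` and `prune_rows` are hypothetical.

```python
def distinguishable(descriptors, kept_rows):
    """True when all descriptors still differ using only the rows in kept_rows.

    descriptors: dict mapping a manipulation name to a list of row values,
    where every list has the same length and each row value is hashable.
    """
    reduced = [tuple(rows[r] for r in kept_rows)
               for rows in descriptors.values()]
    return len(set(reduced)) == len(reduced)


def prune_rows(descriptors):
    """Greedily drop rows, first to last, whenever the drop keeps
    every manipulation distinguishable. Returns the kept row indices."""
    n_rows = len(next(iter(descriptors.values())))
    kept = list(range(n_rows))
    for r in range(n_rows):
        trial = [k for k in kept if k != r]
        if trial and distinguishable(descriptors, trial):
            kept = trial
    return kept
```

The greedy order matters: a different visiting order can keep a different (equally valid) subset of rows, which is one reason a statistical ranking of row importance, as in the abstract, is the more principled choice.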
24554 Effect of Climate Changing Pattern on Aquatic Biodiversity of Bhimtal Lake at Kumaun Himalaya (India)

Authors: Davendra S. Malik

Abstract:

Bhimtal lake is located at 29° 21′ N latitude and 79° 24′ E longitude, at an elevation of 1332 m above mean sea level, in the Kumaun region of Uttarakhand on the Indian subcontinent. The lake's surface area and depth are decreasing, with consequences for its ecological and biological characteristics, due to climatic variations, invasive land use patterns, degraded forest zones, and changed agricultural patterns in the lake catchment basin. The present study focuses on the long- and short-term effects of climate change on the aquatic biodiversity and productivity of Bhimtal lake. Meteorological data for the last fifteen years from the Bhimtal lake catchment basin revealed that air temperature has increased by 1.5 to 2.1 °C in summer and 0.2 to 0.8 °C in winter, relative humidity has increased by 4 to 6% in summer, and the rainfall pattern has changed erratically in rainy seasons. The surface water temperature of Bhimtal lake showed an increase of 0.8 to 2.6 °C; the pH value decreased by 0.5 to 0.2 in winter and increased by 0.4 to 0.6 in summer. The dissolved oxygen level in the lake showed a decreasing trend of 0.7 to 0.4 mg/l in winter months. The mesotrophic nature of Bhimtal lake is changing towards eutrophic conditions, which contributes to decreasing biodiversity. The aquatic biodiversity of Bhimtal lake consists mainly of phytoplankton, zooplankton, benthos, and fish species. In the present study, a total of 5 groups of phytoplankton, 3 groups of zooplankton, 11 groups of benthos, and 15 fish species were recorded from Bhimtal lake. Comparative biodiversity data for Bhimtal lake since January 2000 indicated that phytoplankton biomass decreased by 1.99% and 1.08% for Chlorophyceae and Bacillariophyceae, respectively. The biomass of Cyanophyceae increased by 0.45%, contributing to algal blooms in the lake during the summer season. The biomass of zooplankton and benthos was found to decrease in the winter season and increase during the summer season. Eighteen endemic fish species were found in 2000-05, whereas 15 fish species were recorded in the present study. The relative fecundity of the major fish species showed decreasing trends during their breeding periods in the lake. Natural and anthropogenic factors were identified as ecological threats to the existing aquatic biodiversity of Bhimtal lake. The present paper emphasizes the effect of changing patterns of different climatic variables on the species composition and biomass of phytoplankton, zooplankton, benthos, and fishes in Bhimtal lake in the Kumaun region. The data will contribute significantly to assessing the changing pattern of aquatic biodiversity and productivity of Bhimtal lake over different time scales.

Keywords: aquatic biodiversity, Bhimtal lake, climate change, lake ecology

Procedia PDF Downloads 209
24553 Stress Concentration and Strength Prediction of Carbon/Epoxy Composites

Authors: Emre Ozaslan, Bulent Acar, Mehmet Ali Guler

Abstract:

Unidirectional composites are very popular structural materials used in the aerospace, marine, energy, and automotive industries thanks to their superior material properties. However, the mechanical behavior of composite materials is more complicated than that of isotropic materials because of their anisotropic nature. The presence of a stress concentration on the structure, such as a hole, complicates the problem further. Therefore, an enormous number of tests is required to understand the mechanical behavior and strength of composites which contain a stress concentration. Accurate finite element analysis and analytical models make it possible to understand the mechanical behavior and predict the strength of composites without an enormous number of tests, which cost considerable time and money. In this study, unidirectional Carbon/Epoxy composite specimens with a central circular hole were investigated in terms of stress concentration factor and strength prediction. Composite specimens with different specimen width (W) to hole diameter (D) ratios were tested to investigate the effect of hole size on the stress concentration and strength. Also, specimens with the same specimen width to hole diameter ratio but varied sizes were tested to investigate the size effect. Finite element analysis was performed to determine the stress concentration factor for all specimen configurations. For the quasi-isotropic laminate, it was found that the stress concentration factor increased by approximately 15% as the W/D ratio decreased from 6 to 3. Point stress criteria (PSC), the inherent flaw method, and progressive failure analysis were compared in terms of predicting the strength of the specimens. All methods could predict the strength of the specimens with a maximum error of 8%. PSC was better than the other methods for high values of the W/D ratio; however, the inherent flaw method was successful for low values of W/D. Also, increasing the W/D ratio fourfold raises the failure strength of the composite specimen by 62.4%.
For constant W/D ratio specimens, all the strength prediction methods were more successful for smaller specimens than for larger ones. Doubling the specimen width and hole diameter together reduces the specimen failure strength by 13.2%.

Keywords: failure, strength, stress concentration, unidirectional composites

Procedia PDF Downloads 146
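The point stress criterion named in the abstract above can be sketched in its textbook form. This is a hedged illustration, not the authors' model: it assumes the isotropic (quasi-isotropic) infinite-plate stress distribution around a circular hole and the Heywood finite-width correction, and the characteristic distance `d0` is a fitted material parameter whose value the abstract does not give. All function names are hypothetical.

```python
def psc_strength_ratio(hole_radius, d0):
    """Notched / unnotched strength for an infinite isotropic plate.

    Point stress criterion: failure occurs when the stress at distance d0
    ahead of the hole edge reaches the unnotched strength. With
    xi = R / (R + d0), the ratio is 2 / (2 + xi^2 + 3 * xi^4).
    """
    xi = hole_radius / (hole_radius + d0)
    return 2.0 / (2.0 + xi ** 2 + 3.0 * xi ** 4)


def finite_width_factor(w_over_d):
    """Heywood correction K_T / K_T_infinite for a plate of finite width.

    Valid for w_over_d > 1 (hole smaller than the plate width).
    """
    r = 1.0 - 1.0 / w_over_d
    return (2.0 + r ** 3) / (3.0 * r)


def notched_strength(unnotched_strength, hole_diameter, w_over_d, d0):
    """Estimated gross failure stress of a finite-width specimen with a hole."""
    infinite_plate = unnotched_strength * psc_strength_ratio(
        hole_diameter / 2.0, d0)
    return infinite_plate / finite_width_factor(w_over_d)
```

Because `xi` depends on the absolute hole radius, this form reproduces the qualitative size effect reported in the abstract: at a fixed W/D ratio, a larger hole gives a lower predicted strength.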
24552 Data Stream Association Rule Mining with Cloud Computing

Authors: B. Suraj Aravind, M. H. M. Krishna Prasad

Abstract:

There are emerging applications of data streams that require association rule mining, such as network traffic monitoring, web click stream analysis, sensor data, and satellite data. Data streams typically arrive continuously at high speed, in huge volumes, and with a changing data distribution. This raises new issues that need to be considered when developing association rule mining techniques for stream data. This paper proposes an improved data stream association rule mining algorithm that eliminates the limitation of resources. For this, the concept of cloud computing is used. Including it may lead to additional unknown problems, which need further research.

Keywords: data stream, association rule mining, cloud computing, frequent itemsets

Procedia PDF Downloads 490
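The stream setting described in the abstract above (continuous arrival, changing distribution, bounded resources) is often handled with a sliding window. The sketch below is a generic illustration of that idea, not the algorithm proposed in the paper, and it does not model the cloud component; the class name `SlidingWindowItemsets` and its parameters are hypothetical.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowItemsets:
    """Itemset counts over the most recent `window_size` transactions."""

    def __init__(self, window_size, max_len=2):
        self.window_size = window_size
        self.max_len = max_len      # largest itemset size that is counted
        self.window = deque()
        self.counts = Counter()

    def _itemsets(self, transaction):
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            for combo in combinations(items, k):
                yield frozenset(combo)

    def add(self, transaction):
        """Count a new transaction and evict the oldest one if the window is full."""
        self.window.append(transaction)
        self.counts.update(self._itemsets(transaction))
        if len(self.window) > self.window_size:
            old = self.window.popleft()
            self.counts.subtract(self._itemsets(old))

    def support(self, itemset):
        """Fraction of windowed transactions containing the itemset."""
        if not self.window:
            return 0.0
        return self.counts[frozenset(itemset)] / len(self.window)

    def confidence(self, antecedent, consequent):
        """support(antecedent | consequent) / support(antecedent).

        Only meaningful when the combined itemset has at most max_len items.
        """
        base = self.counts[frozenset(antecedent)]
        if base == 0:
            return 0.0
        both = frozenset(antecedent) | frozenset(consequent)
        return self.counts[both] / base
```

Memory grows with the window size and with the number of distinct itemsets up to `max_len`, which is the resource limit the abstract proposes to relax with cloud resources.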
24551 Prediction of Finned Projectile Aerodynamics Using a Lattice-Boltzmann Method CFD Solution

Authors: Zaki Abiza, Miguel Chavez, David M. Holman, Ruddy Brionnaud

Abstract:

In this paper, the prediction of the aerodynamic behavior of the flow around a Finned Projectile will be validated using a Computational Fluid Dynamics (CFD) solution, XFlow, based on the Lattice-Boltzmann Method (LBM). XFlow is an innovative CFD software developed by Next Limit Dynamics. It is based on a state-of-the-art Lattice-Boltzmann Method which uses a proprietary particle-based kinetic solver and a LES turbulence model coupled with the generalized law of the wall (WMLES). The Lattice-Boltzmann method discretizes the continuous Boltzmann equation, a transport equation for the particle probability distribution function. From the Boltzmann transport equation, and by means of the Chapman-Enskog expansion, the compressible Navier-Stokes equations can be recovered. However, to simulate compressible flows, this method has a Mach number limitation because of the lattice discretization. Thanks to this flexible particle-based approach, the traditional meshing process is avoided, the discretization stage is strongly accelerated, reducing engineering costs, and computations on complex geometries are affordable in a straightforward way. The projectile that will be used in this work is the Army-Navy Basic Finned Missile (ANF) with a caliber of 0.03 m. The analysis will consist of varying the Mach number from M=0.5 and comparing the axial force coefficient, normal force slope coefficient, and pitch moment slope coefficient of the Finned Projectile obtained by XFlow with the experimental data. The slope coefficients will be obtained using finite difference techniques in the linear range of the polar curve. The aim of such an analysis is to find the limiting Mach number value starting from which the effects of high fluid compressibility (related to the transonic flow regime) lead the XFlow simulations to differ from the experimental results.
This will allow identifying the critical Mach number which limits the validity of the isothermal formulation of XFlow and beyond which a fully compressible solver implementing coupled momentum-energy equations would be required.

Keywords: CFD, computational fluid dynamics, drag, finned projectile, Lattice-Boltzmann method, LBM, lift, Mach, pitch

Procedia PDF Downloads 406
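The abstract above obtains slope coefficients by finite differences in the linear range of the polar curve. A minimal sketch of that step follows; the choice of a central difference about zero angle of attack, the per-radian convention, and the function names are assumptions, since the abstract does not state which scheme or units were used.

```python
import math


def central_difference_slope(coeff_minus, coeff_plus, delta_alpha_deg):
    """Slope d(coefficient)/d(alpha) at alpha = 0, per radian.

    coeff_minus and coeff_plus are the coefficient values computed at
    -delta_alpha_deg and +delta_alpha_deg, both inside the linear range.
    """
    return (coeff_plus - coeff_minus) / (2.0 * math.radians(delta_alpha_deg))


def forward_difference_slope(coeff_zero, coeff_plus, delta_alpha_deg):
    """One-sided alternative when only alpha = 0 and +delta were simulated."""
    return (coeff_plus - coeff_zero) / math.radians(delta_alpha_deg)
```

For a symmetric projectile, the normal force and pitch moment coefficients are odd functions of the angle of attack, so the two estimates should agree closely as long as the step stays within the linear range.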
24550 Predicting Stack Overflow Accepted Answers Using Features and Models with Varying Degrees of Complexity

Authors: Osayande Pascal Omondiagbe, Sherlock A. Licorish

Abstract:

Stack Overflow is a popular community question and answer portal which is used by practitioners to solve technology-related challenges during software development. Previous studies have shown that this forum is becoming a substitute for official software programming languages documentation. While tools have looked to aid developers by presenting interfaces to explore Stack Overflow, developers often face challenges searching through many possible answers to their questions, and this extends the development time. To this end, researchers have provided ways of predicting acceptable Stack Overflow answers by using various modeling techniques. However, less interest is dedicated to examining the performance and quality of typically used modeling methods, and especially in relation to models’ and features’ complexity. Such insights could be of practical significance to the many practitioners that use Stack Overflow. This study examines the performance and quality of various modeling methods that are used for predicting acceptable answers on Stack Overflow, drawn from 2014, 2015 and 2016. Our findings reveal significant differences in models’ performance and quality given the type of features and complexity of models used. Researchers examining classifiers’ performance and quality and features’ complexity may leverage these findings in selecting suitable techniques when developing prediction models.

Keywords: feature selection, modeling and prediction, neural network, random forest, stack overflow

Procedia PDF Downloads 123
24549 Crack Opening Investigation in Fiberconcrete

Authors: Arturs Macanovskis, Vitalijs Lusis, Andrejs Krasnikovs

Abstract:

The work has three stages. In the first stage, the pull-out process was examined for a steel fiber embedded in concrete by one end and pulled out at an angle to the direction of the pull-out force. The angle was varied. The steel fiber was 26 mm long and 0.5 mm in diameter. Jumps were observed on the obtained force-displacement diagrams. To explain this mechanical behavior, the surface of the fiber channel in the concrete was investigated experimentally using a KEYENCE VHX2000 microscope. The surface of the fiber channel in the concrete matrix was examined after the pull-out test (fiber angle to the pull-out force direction of 70°). In the second stage, load versus crack opening displacement diagrams were obtained for the fracture of homogeneously reinforced and layered fiber concrete prisms (with dimensions 10x10x40 cm) subjected to 4-point bending. After testing, the main crack was analyzed. On both surfaces of the main crack, all pulled-out fibers were identified, along with their locations, their angles to the crack surface, and the lengths of the pulled-out fiber parts. In the third stage, a prediction model was elaborated for the failure of the fiber-concrete beam under bending, using the following data: a) diagrams for fibers pulled out at different angles; b) experimental data on the locations of straight steel fibers in the main crack.

Keywords: fiberconcrete, pull-out, fiber channel, layered fiberconcrete

Procedia PDF Downloads 435
24548 Suitable Site Selection of Small Dams Using Geo-Spatial Technique: A Case Study of Dadu Tehsil, Sindh

Authors: Zahid Khalil, Saad Ul Haque, Asif Khan

Abstract:

Identifying suitable sites for a project while considering many different parameters is a difficult decision-making problem. GIS and Multi-Criteria Analysis (MCA) can simplify it. This technology has proved to be efficient and adequate for acquiring the desired information. In this study, GIS and MCA were employed to identify suitable sites for small dams in Dadu Tehsil, Sindh. The GIS software is used to create all the spatial parameters for the analysis. The derived parameters are slope, drainage density, rainfall, land use/land cover, soil groups, Curve Number (CN), and runoff index, with a spatial resolution of 30 m. The data used for deriving these layers include the 30-meter resolution SRTM DEM, Landsat 8 imagery, rainfall from the National Centers for Environmental Prediction (NCEP), and soil data from the Harmonized World Soil Database (HWSD). The land use/land cover map is derived from Landsat 8 using supervised classification. Slope, drainage network, and watershed are delineated by terrain processing of the DEM. The Soil Conservation Service (SCS) method is implemented to estimate the surface runoff from the rainfall. Prior to this, the SCS-CN grid is developed by integrating the soil and land use/land cover rasters. These layers, together with some technical and ecological constraints, are assigned weights on the basis of suitability criteria. The pairwise comparison method, also known as the Analytic Hierarchy Process (AHP), is used as the MCA for assigning weights to each decision element. All the parameters and groups of parameters are integrated using weighted overlay in the GIS environment to produce suitable sites for the dams. The resultant layer is then classified into four classes, namely best suitable, suitable, moderate, and less suitable. This study contributes to decision-making about suitable site analysis for small dams using geospatial data with a minimal amount of ground data.
These suitability maps can be helpful for water resource management organizations in determining feasible rainwater harvesting (RWH) structures.

Keywords: Remote sensing, GIS, AHP, RWH

Procedia PDF Downloads 374
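Three of the steps in the abstract above (SCS-CN runoff, AHP weighting, weighted overlay) can be sketched on plain Python lists. This is an illustration under stated assumptions, not the study's workflow: it uses the standard metric SCS-CN form with an initial abstraction of 0.2 S, approximates AHP weights by the row geometric mean rather than the principal eigenvector, and all function names and the example matrix are hypothetical.

```python
import math


def scs_runoff_mm(rainfall_mm, curve_number):
    """SCS-CN direct runoff depth in mm (initial abstraction Ia = 0.2 * S)."""
    s = 25400.0 / curve_number - 254.0      # potential maximum retention, mm
    ia = 0.2 * s
    if rainfall_mm <= ia:
        return 0.0
    return (rainfall_mm - ia) ** 2 / (rainfall_mm - ia + s)


def ahp_weights(pairwise):
    """Approximate AHP priority weights from a reciprocal pairwise matrix.

    Uses the normalised geometric mean of each row, which is exact for a
    perfectly consistent matrix.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def weighted_overlay(layers, weights):
    """Cell-by-cell weighted sum of equally sized 2D score grids."""
    rows, cols = len(layers[0]), len(layers[0][0])
    return [[sum(w * layer[r][c] for w, layer in zip(weights, layers))
             for c in range(cols)]
            for r in range(rows)]
```

In a real workflow the grids would be rasters reclassified to a common suitability scale before the overlay, and the pairwise matrix would be checked with a consistency ratio before its weights are used.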
24547 A Comprehensive Survey and Improvement to Existing Privacy Preserving Data Mining Techniques

Authors: Tosin Ige

Abstract:

Ethics must be a condition of the world, like logic. (Ludwig Wittgenstein, 1889-1951). As important as data mining is, it poses a significant threat to ethics, privacy, and legality, since data mining makes it difficult for an individual or consumer (in the case of a company) to control the accessibility and usage of his data. This research focuses on current issues and the latest research and development on privacy preserving data mining methods as of 2022. It also discusses some advances in those techniques while at the same time highlighting and providing a new technique as a solution to an existing technique of privacy preserving data mining methods. This paper also bridges the wide gap between data mining and the Web Application Programming Interface (web API), where research is urgently needed for an added layer of security in data mining, while at the same time introducing a seamless and more efficient way of data mining.

Keywords: data, privacy, data mining, association rule, privacy preserving, mining technique

Procedia PDF Downloads 153
24546 Big Data: Concepts, Technologies and Applications in the Public Sector

Authors: A. Alexandru, C. A. Alexandru, D. Coardos, E. Tudora

Abstract:

Big Data (BD) is associated with a new generation of technologies and architectures which can harness the value of extremely large volumes of very varied data through real time processing and analysis. It involves changes in (1) data types, (2) accumulation speed, and (3) data volume. This paper presents the main concepts related to the BD paradigm, and introduces architectures and technologies for BD and BD sets. The integration of BD with the Hadoop Framework is also underlined. BD has attracted a lot of attention in the public sector due to the newly emerging technologies that allow the availability of network access. The volume of different types of data has exponentially increased. Some applications of BD in the public sector in Romania are briefly presented.

Keywords: big data, big data analytics, Hadoop, cloud

Procedia PDF Downloads 295