Search results for: bagging ensemble methods
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 15407

Search results for: bagging ensemble methods

15317 Scalable Learning of Tree-Based Models on Sparsely Representable Data

Authors: Fares Hedayatit, Arnauld Joly, Panagiotis Papadimitriou

Abstract:

Many machine learning tasks such as text annotation usually require training over very big datasets, e.g., millions of web documents, that can be represented in a sparse input space. State-of the-art tree-based ensemble algorithms cannot scale to such datasets, since they include operations whose running time is a function of the input space size rather than a function of the non-zero input elements. In this paper, we propose an efficient splitting algorithm to leverage input sparsity within decision tree methods. Our algorithm improves training time over sparse datasets by more than two orders of magnitude and it has been incorporated in the current version of scikit-learn.org, the most popular open source Python machine learning library.

Keywords: big data, sparsely representable data, tree-based models, scalable learning

Procedia PDF Downloads 263
15316 Accelerating Quantum Chemistry Calculations: Machine Learning for Efficient Evaluation of Electron-Repulsion Integrals

Authors: Nishant Rodrigues, Nicole Spanedda, Chilukuri K. Mohan, Arindam Chakraborty

Abstract:

A crucial objective in quantum chemistry is the computation of the energy levels of chemical systems. This task requires electron-repulsion integrals as inputs, and the steep computational cost of evaluating these integrals poses a major numerical challenge in efficient implementation of quantum chemical software. This work presents a moment-based machine-learning approach for the efficient evaluation of electron-repulsion integrals. These integrals were approximated using linear combinations of a small number of moments. Machine learning algorithms were applied to estimate the coefficients in the linear combination. A random forest approach was used to identify promising features using a recursive feature elimination approach, which performed best for learning the sign of each coefficient but not the magnitude. A neural network with two hidden layers were then used to learn the coefficient magnitudes along with an iterative feature masking approach to perform input vector compression, identifying a small subset of orbitals whose coefficients are sufficient for the quantum state energy computation. Finally, a small ensemble of neural networks (with a median rule for decision fusion) was shown to improve results when compared to a single network.

Keywords: quantum energy calculations, atomic orbitals, electron-repulsion integrals, ensemble machine learning, random forests, neural networks, feature extraction

Procedia PDF Downloads 113
15315 Production and Mechanical Characterization of Ballistic Thermoplastic Composite Materials

Authors: D. Korsacilar, C. Atas

Abstract:

In this study, first thermoplastic composite materials/plates that have high ballistic impact resistance were produced. For this purpose, the thermoplastic prepreg and the vacuum bagging technique were used to produce a composite material. Thermoplastic prepregs (resin-impregnated fiber) that are supplied ready to be used, namely high-density polyethylene (HDPE) was chosen as matrix and unidirectional glass fiber was used as reinforcement. In order to compare the fiber configuration effect on mechanical properties, unidirectional and biaxial prepregs were used. Then the microstructural properties of the composites were investigated with scanning electron microscopy (SEM) analysis. Impact properties of the composites were examined by Charpy impact test and tensile mechanical tests and then the effects of ultraviolet irradiation were investigated on mechanical performance.

Keywords: ballistic, composite, thermoplastic, prepreg

Procedia PDF Downloads 442
15314 Dynamic Gabor Filter Facial Features-Based Recognition of Emotion in Video Sequences

Authors: T. Hari Prasath, P. Ithaya Rani

Abstract:

In the world of visual technology, recognizing emotions from the face images is a challenging task. Several related methods have not utilized the dynamic facial features effectively for high performance. This paper proposes a method for emotions recognition using dynamic facial features with high performance. Initially, local features are captured by Gabor filter with different scale and orientations in each frame for finding the position and scale of face part from different backgrounds. The Gabor features are sent to the ensemble classifier for detecting Gabor facial features. The region of dynamic features is captured from the Gabor facial features in the consecutive frames which represent the dynamic variations of facial appearances. In each region of dynamic features is normalized using Z-score normalization method which is further encoded into binary pattern features with the help of threshold values. The binary features are passed to Multi-class AdaBoost classifier algorithm with the well-trained database contain happiness, sadness, surprise, fear, anger, disgust, and neutral expressions to classify the discriminative dynamic features for emotions recognition. The developed method is deployed on the Ryerson Multimedia Research Lab and Cohn-Kanade databases and they show significant performance improvement owing to their dynamic features when compared with the existing methods.

Keywords: detecting face, Gabor filter, multi-class AdaBoost classifier, Z-score normalization

Procedia PDF Downloads 278
15313 Spectrogram Pre-Processing to Improve Isotopic Identification to Discriminate Gamma and Neutrons Sources

Authors: Mustafa Alhamdi

Abstract:

Industrial application to classify gamma rays and neutron events is investigated in this study using deep machine learning. The identification using a convolutional neural network and recursive neural network showed a significant improvement in predication accuracy in a variety of applications. The ability to identify the isotope type and activity from spectral information depends on feature extraction methods, followed by classification. The features extracted from the spectrum profiles try to find patterns and relationships to present the actual spectrum energy in low dimensional space. Increasing the level of separation between classes in feature space improves the possibility to enhance classification accuracy. The nonlinear nature to extract features by neural network contains a variety of transformation and mathematical optimization, while principal component analysis depends on linear transformations to extract features and subsequently improve the classification accuracy. In this paper, the isotope spectrum information has been preprocessed by finding the frequencies components relative to time and using them as a training dataset. Fourier transform implementation to extract frequencies component has been optimized by a suitable windowing function. Training and validation samples of different isotope profiles interacted with CdTe crystal have been simulated using Geant4. The readout electronic noise has been simulated by optimizing the mean and variance of normal distribution. Ensemble learning by combing voting of many models managed to improve the classification accuracy of neural networks. The ability to discriminate gamma and neutron events in a single predication approach using deep machine learning has shown high accuracy using deep learning. The paper findings show the ability to improve the classification accuracy by applying the spectrogram preprocessing stage to the gamma and neutron spectrums of different isotopes. Tuning deep machine learning models by hyperparameter optimization of neural network models enhanced the separation in the latent space and provided the ability to extend the number of detected isotopes in the training database. Ensemble learning contributed significantly to improve the final prediction.

Keywords: machine learning, nuclear physics, Monte Carlo simulation, noise estimation, feature extraction, classification

Procedia PDF Downloads 150
15312 Credit Card Fraud Detection with Ensemble Model: A Meta-Heuristic Approach

Authors: Gong Zhilin, Jing Yang, Jian Yin

Abstract:

The purpose of this paper is to develop a novel system for credit card fraud detection based on sequential modeling of data using hybrid deep learning models. The projected model encapsulates five major phases are pre-processing, imbalance-data handling, feature extraction, optimal feature selection, and fraud detection with an ensemble classifier. The collected raw data (input) is pre-processed to enhance the quality of the data through alleviation of the missing data, noisy data as well as null values. The pre-processed data are class imbalanced in nature, and therefore they are handled effectively with the K-means clustering-based SMOTE model. From the balanced class data, the most relevant features like improved Principal Component Analysis (PCA), statistical features (mean, median, standard deviation) and higher-order statistical features (skewness and kurtosis). Among the extracted features, the most optimal features are selected with the Self-improved Arithmetic Optimization Algorithm (SI-AOA). This SI-AOA model is the conceptual improvement of the standard Arithmetic Optimization Algorithm. The deep learning models like Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and optimized Quantum Deep Neural Network (QDNN). The LSTM and CNN are trained with the extracted optimal features. The outcomes from LSTM and CNN will enter as input to optimized QDNN that provides the final detection outcome. Since the QDNN is the ultimate detector, its weight function is fine-tuned with the Self-improved Arithmetic Optimization Algorithm (SI-AOA).

Keywords: credit card, data mining, fraud detection, money transactions

Procedia PDF Downloads 130
15311 Investigation of Dynamic Mechanical Properties of Jute/Carbon Reinforced Composites

Authors: H. Sezgin, O. B. Berkalp, R. Mishra, J. Militky

Abstract:

In the last few decades, due to their advanced properties, there has been an increasing interest in hybrid composite materials. In this study, the effect of different stacking sequences of jute and carbon fabric plies on dynamic mechanical properties of composite laminates were investigated. Vacuum bagging system was used to fabricate the composite samples. Each composite laminate was reinforced with two plies of jute fabric and two plies of carbon fabric by varying the position of layers. Dynamic mechanical analyzer (DMA) was used to examine the dynamic mechanical properties of composite laminates with increasing temperature. Results showed that the composite sample, which has carbon fabric at the outer layers, has the highest storage and loss modulus. Besides, it was observed that glass transition temperature (Tg) of samples are close to each other and at about 75 °C.

Keywords: differential scanning calorimetry dynamic mechanical analysis, textile reinforced composites, thermogravimetric analysis

Procedia PDF Downloads 303
15310 Short-Term Forecast of Wind Turbine Production with Machine Learning Methods: Direct Approach and Indirect Approach

Authors: Mamadou Dione, Eric Matzner-lober, Philippe Alexandre

Abstract:

The Energy Transition Act defined by the French State has precise implications on Renewable Energies, in particular on its remuneration mechanism. Until then, a purchase obligation contract permitted the sale of wind-generated electricity at a fixed rate. Tomorrow, it will be necessary to sell this electricity on the Market (at variable rates) before obtaining additional compensation intended to reduce the risk. This sale on the market requires to announce in advance (about 48 hours before) the production that will be delivered on the network, so to be able to predict (in the short term) this production. The fundamental problem remains the variability of the Wind accentuated by the geographical situation. The objective of the project is to provide, every day, short-term forecasts (48-hour horizon) of wind production using weather data. The predictions of the GFS model and those of the ECMWF model are used as explanatory variables. The variable to be predicted is the production of a wind farm. We do two approaches: a direct approach that predicts wind generation directly from weather data, and an integrated approach that estimâtes wind from weather data and converts it into wind power by power curves. We used machine learning techniques to predict this production. The models tested are random forests, CART + Bagging, CART + Boosting, SVM (Support Vector Machine). The application is made on a wind farm of 22MW (11 wind turbines) of the Compagnie du Vent (that became Engie Green France). Our results are very conclusive compared to the literature.

Keywords: forecast aggregation, machine learning, spatio-temporal dynamics modeling, wind power forcast

Procedia PDF Downloads 217
15309 Tectonics Theory and an Example of Its Application in Sustainable Contemporary Architecture

Authors: Mafalda Fabiene Ferreira Pantoja, Luciana Lins Do Amaral Sobreira

Abstract:

The present work seeks to exemplify the precepts of the sustainable tectonics, through a built work. This is the architectural group ‘Chácara dos Professores’ in Brasília, Federal District, Brazil. Such an architectural ensemble allows elucidating vernacular constructive techniques that can still be used in contemporary times, on the way to more sustainable constructions. It seeks to show the principles of sustainable tectonics as a design tool, seeking an architecture that can give answers to contemporary issues and that concern caring for the planet. Firstly, a brief investigation will be made about sustainable tectonics, its application in architecture, and later, to correlate such precepts as a tool for reflections and sustainable projects in the contemporary world. For this, will be exemplified, by means of a visit in loco, the construction methods used in the ‘Chácara dos Professores’ project and its results in the constructed work. The goal is to draw a theoretical reflection on the precepts of tectonics, and to show if such precepts are able to enhance contemporary architecture and help it move towards sustainability, seeking projects increasingly consistent with the reality of today.

Keywords: contemporary architecture, constructive techniques, sustainability, tectonics

Procedia PDF Downloads 249
15308 Evaluation of the Effect of Learning Disabilities and Accommodations on the Prediction of the Exam Performance: Ordinal Decision-Tree Algorithm

Authors: G. Singer, M. Golan

Abstract:

Providing students with learning disabilities (LD) with extra time to grant them equal access to the exam is a necessary but insufficient condition to compensate for their LD; there should also be a clear indication that the additional time was actually used. For example, if students with LD use more time than students without LD and yet receive lower grades, this may indicate that a different accommodation is required. If they achieve higher grades but use the same amount of time, then the effectiveness of the accommodation has not been demonstrated. The main goal of this study is to evaluate the effect of including parameters related to LD and extended exam time, along with other commonly-used characteristics (e.g., student background and ability measures such as high-school grades), on the ability of ordinal decision-tree algorithms to predict exam performance. We use naturally-occurring data collected from hundreds of undergraduate engineering students. The sub-goals are i) to examine the improvement in prediction accuracy when the indicator of exam performance includes 'actual time used' in addition to the conventional indicator (exam grade) employed in most research; ii) to explore the effectiveness of extended exam time on exam performance for different courses and for LD students with different profiles (i.e., sets of characteristics). This is achieved by using the patterns (i.e., subgroups) generated by the algorithms to identify pairs of subgroups that differ in just one characteristic (e.g., course or type of LD) but have different outcomes in terms of exam performance (grade and time used). Since grade and time used to exhibit an ordering form, we propose a method based on ordinal decision-trees, which applies a weighted information-gain ratio (WIGR) measure for selecting the classifying attributes. Unlike other known ordinal algorithms, our method does not assume monotonicity in the data. The proposed WIGR is an extension of an information-theoretic measure, in the sense that it adjusts to the case of an ordinal target and takes into account the error severity between two different target classes. Specifically, we use ordinal C4.5, random-forest, and AdaBoost algorithms, as well as an ensemble technique composed of ordinal and non-ordinal classifiers. Firstly, we find that the inclusion of LD and extended exam-time parameters improves prediction of exam performance (compared to specifications of the algorithms that do not include these variables). Secondly, when the indicator of exam performance includes 'actual time used' together with grade (as opposed to grade only), the prediction accuracy improves. Thirdly, our subgroup analyses show clear differences in the effect of extended exam time on exam performance among different courses and different student profiles. From a methodological perspective, we find that the ordinal decision-tree based algorithms outperform their conventional, non-ordinal counterparts. Further, we demonstrate that the ensemble-based approach leverages the strengths of each type of classifier (ordinal and non-ordinal) and yields better performance than each classifier individually.

Keywords: actual exam time usage, ensemble learning, learning disabilities, ordinal classification, time extension

Procedia PDF Downloads 100
15307 Climate Change Effects in a Mediterranean Island and Streamflow Changes for a Small Basin Using Euro-Cordex Regional Climate Simulations Combined with the SWAT Model

Authors: Pier Andrea Marras, Daniela Lima, Pedro Matos Soares, Rita Maria Cardoso, Daniela Medas, Elisabetta Dore, Giovanni De Giudici

Abstract:

Climate change effects on the hydrologic cycle are the main concern for the evaluation of water management strategies. Climate models project scenarios of precipitation changes in the future, considering greenhouse emissions. In this study, the EURO-CORDEX (European Coordinated Regional Downscaling Experiment) climate models were first evaluated in a Mediterranean island (Sardinia) against observed precipitation for a historical reference period (1976-2005). A weighted multi-model ensemble (ENS) was built, weighting the single models based on their ability to reproduce observed rainfall. Future projections (2071-2100) were carried out using the 8.5 RCP emissions scenario to evaluate changes in precipitations. ENS was then used as climate forcing for the SWAT model (Soil and Water Assessment Tool), with the aim to assess the consequences of such projected changes on streamflow and runoff of two small catchments located in the South-West Sardinia. Results showed that a decrease of mean rainfall values, up to -25 % at yearly scale, is expected for the future, along with an increase of extreme precipitation events. Particularly in the eastern and southern areas, extreme events are projected to increase by 30%. Such changes reflect on the hydrologic cycle with a decrease of mean streamflow and runoff, except in spring, when runoff is projected to increase by 20-30%. These results stress that the Mediterranean is a hotspot for climate change, and the use of model tools can provide very useful information to adopt water and land management strategies to deal with such changes.

Keywords: EURO-CORDEX, climate change, hydrology, SWAT model, Sardinia, multi-model ensemble

Procedia PDF Downloads 213
15306 Molecular Dynamic Simulation of CO2 Absorption into Mixed Aqueous Solutions MDEA/PZ

Authors: N. Harun, E. E. Masiren, W. H. W. Ibrahim, F. Adam

Abstract:

Amine absorption process is an approach for mitigation of CO2 from flue gas that produces from power plant. This process is the most common system used in chemical and oil industries for gas purification to remove acid gases. On the challenges of this process is high energy requirement for solvent regeneration to release CO2. In the past few years, mixed alkanolamines have received increasing attention. In most cases, the mixtures contain N-methyldiethanolamine (MDEA) as the base amine with the addition of one or two more reactive amines such as PZ. The reason for the application of such blend amine is to take advantage of high reaction rate of CO2 with the activator combined with the advantages of the low heat of regeneration of MDEA. Several experimental and simulation studies have been undertaken to understand this process using blend MDEA/PZ solvent. Despite those studies, the mechanism of CO2 absorption into the aqueous MDEA is not well understood and available knowledge within the open literature is limited. The aim of this study is to investigate the intermolecular interaction of the blend MDEA/PZ using Molecular Dynamics (MD) simulation. MD simulation was run under condition 313K and 1 atm using NVE ensemble at 200ps and NVT ensemble at 1ns. The results were interpreted in term of Radial Distribution Function (RDF) analysis through two system of interest i.e binary and tertiary. The binary system will explain the interaction between amine and water molecule while tertiary system used to determine the interaction between the amine and CO2 molecule. For the binary system, it was observed that the –OH group of MDEA is more attracted to water molecule compared to –NH group of MDEA. The –OH group of MDEA can form the hydrogen bond with water that will assist the solubility of MDEA in water. The intermolecular interaction probability of –OH and –NH group of MDEA with CO2 in blended MDEA/PZ is higher than using single MDEA. This findings show that PZ molecule act as an activator to promote the intermolecular interaction between MDEA and CO2.Thus, blend of MDEA with PZ is expecting to increase the absorption rate of CO2 and reduce the heat regeneration requirement.

Keywords: amine absorption process, blend MDEA/PZ, CO2 capture, molecular dynamic simulation, radial distribution function

Procedia PDF Downloads 295
15305 Cryptography Based Authentication Methods

Authors: Mohammad A. Alia, Abdelfatah Aref Tamimi, Omaima N. A. Al-Allaf

Abstract:

This paper reviews a comparison study on the most common used authentication methods. Some of these methods are actually based on cryptography. In this study, we show the main cryptographic services. Also, this study presents a specific discussion about authentication service, since the authentication service is classified into several categorizes according to their methods. However, this study gives more about the real life example for each of the authentication methods. It talks about the simplest authentication methods as well about the available biometric authentication methods such as voice, iris, fingerprint, and face authentication.

Keywords: information security, cryptography, system access control, authentication, network security

Procedia PDF Downloads 470
15304 Multimodal Biometric Cryptography Based Authentication in Cloud Environment to Enhance Information Security

Authors: D. Pugazhenthi, B. Sree Vidya

Abstract:

Cloud computing is one of the emerging technologies that enables end users to use the services of cloud on ‘pay per usage’ strategy. This technology grows in a fast pace and so is its security threat. One among the various services provided by cloud is storage. In this service, security plays a vital factor for both authenticating legitimate users and protection of information. This paper brings in efficient ways of authenticating users as well as securing information on the cloud. Initial phase proposed in this paper deals with an authentication technique using multi-factor and multi-dimensional authentication system with multi-level security. Unique identification and slow intrusive formulates an advanced reliability on user-behaviour based biometrics than conventional means of password authentication. By biometric systems, the accounts are accessed only by a legitimate user and not by a nonentity. The biometric templates employed here do not include single trait but multiple, viz., iris and finger prints. The coordinating stage of the authentication system functions on Ensemble Support Vector Machine (SVM) and optimization by assembling weights of base SVMs for SVM ensemble after individual SVM of ensemble is trained by the Artificial Fish Swarm Algorithm (AFSA). Thus it helps in generating a user-specific secure cryptographic key of the multimodal biometric template by fusion process. Data security problem is averted and enhanced security architecture is proposed using encryption and decryption system with double key cryptography based on Fuzzy Neural Network (FNN) for data storing and retrieval in cloud computing . The proposing scheme aims to protect the records from hackers by arresting the breaking of cipher text to original text. This improves the authentication performance that the proposed double cryptographic key scheme is capable of providing better user authentication and better security which distinguish between the genuine and fake users. Thus, there are three important modules in this proposed work such as 1) Feature extraction, 2) Multimodal biometric template generation and 3) Cryptographic key generation. The extraction of the feature and texture properties from the respective fingerprint and iris images has been done initially. Finally, with the help of fuzzy neural network and symmetric cryptography algorithm, the technique of double key encryption technique has been developed. As the proposed approach is based on neural networks, it has the advantage of not being decrypted by the hacker even though the data were hacked already. The results prove that authentication process is optimal and stored information is secured.

Keywords: artificial fish swarm algorithm (AFSA), biometric authentication, decryption, encryption, fingerprint, fusion, fuzzy neural network (FNN), iris, multi-modal, support vector machine classification

Procedia PDF Downloads 259
15303 An Adaptive Hybrid Surrogate-Assisted Particle Swarm Optimization Algorithm for Expensive Structural Optimization

Authors: Xiongxiong You, Zhanwen Niu

Abstract:

Choosing an appropriate surrogate model plays an important role in surrogates-assisted evolutionary algorithms (SAEAs) since there are many types and different kernel functions in the surrogate model. In this paper, an adaptive selection of the best suitable surrogate model method is proposed to solve different kinds of expensive optimization problems. Firstly, according to the prediction residual error sum of square (PRESS) and different model selection strategies, the excellent individual surrogate models are integrated into multiple ensemble models in each generation. Then, based on the minimum root of mean square error (RMSE), the best suitable surrogate model is selected dynamically. Secondly, two methods with dynamic number of models and selection strategies are designed, which are used to show the influence of the number of individual models and selection strategy. Finally, some compared studies are made to deal with several commonly used benchmark problems, as well as a rotor system optimization problem. The results demonstrate the accuracy and robustness of the proposed method.

Keywords: adaptive selection, expensive optimization, rotor system, surrogates assisted evolutionary algorithms

Procedia PDF Downloads 141
15302 Gradient Boosted Trees on Spark Platform for Supervised Learning in Health Care Big Data

Authors: Gayathri Nagarajan, L. D. Dhinesh Babu

Abstract:

Health care is one of the prominent industries that generate voluminous data thereby finding the need of machine learning techniques with big data solutions for efficient processing and prediction. Missing data, incomplete data, real time streaming data, sensitive data, privacy, heterogeneity are few of the common challenges to be addressed for efficient processing and mining of health care data. In comparison with other applications, accuracy and fast processing are of higher importance for health care applications as they are related to the human life directly. Though there are many machine learning techniques and big data solutions used for efficient processing and prediction in health care data, different techniques and different frameworks are proved to be effective for different applications largely depending on the characteristics of the datasets. In this paper, we present a framework that uses ensemble machine learning technique gradient boosted trees for data classification in health care big data. The framework is built on Spark platform which is fast in comparison with other traditional frameworks. Unlike other works that focus on a single technique, our work presents a comparison of six different machine learning techniques along with gradient boosted trees on datasets of different characteristics. Five benchmark health care datasets are considered for experimentation, and the results of different machine learning techniques are discussed in comparison with gradient boosted trees. The metric chosen for comparison is misclassification error rate and the run time of the algorithms. The goal of this paper is to i) Compare the performance of gradient boosted trees with other machine learning techniques in Spark platform specifically for health care big data and ii) Discuss the results from the experiments conducted on datasets of different characteristics thereby drawing inference and conclusion. The experimental results show that the accuracy is largely dependent on the characteristics of the datasets for other machine learning techniques whereas gradient boosting trees yields reasonably stable results in terms of accuracy without largely depending on the dataset characteristics.

Keywords: big data analytics, ensemble machine learning, gradient boosted trees, Spark platform

Procedia PDF Downloads 240
15301 Experimental Modeling of Spray and Water Sheet Formation Due to Wave Interactions with Vertical and Slant Bow-Shaped Model

Authors: Armin Bodaghkhani, Bruce Colbourne, Yuri S. Muzychka

Abstract:

The process of spray-cloud formation and flow kinematics produced from breaking wave impact on vertical and slant lab-scale bow-shaped models were experimentally investigated. Bubble Image Velocimetry (BIV) and Image Processing (IP) techniques were applied to study the various types of wave-model impacts. Different wave characteristics were generated in a tow tank to investigate the effects of wave characteristics, such as wave phase velocity, wave steepness on droplet velocities, and behavior of the process of spray cloud formation. The phase ensemble-averaged vertical velocity and turbulent intensity were computed. A high-speed camera and diffused LED backlights were utilized to capture images for further post processing. Various pressure sensors and capacitive wave probes were used to measure the wave impact pressure and the free surface profile at different locations of the model and wave-tank, respectively. Droplet sizes and velocities were measured using BIV and IP techniques to trace bubbles and droplets in order to measure their velocities and sizes by correlating the texture in these images. The impact pressure and droplet size distributions were compared to several previously experimental models, and satisfactory agreements were achieved. The distribution of droplets in front of both models are demonstrated. Due to the highly transient process of spray formation, the drag coefficient for several stages of this transient displacement for various droplet size ranges and different Reynolds number were calculated based on the ensemble average method. From the experimental results, the slant model produces less spray in comparison with the vertical model, and the droplet velocities generated from the wave impact with the slant model have a lower velocity as compared with the vertical model.

Keywords: spray charachteristics, droplet size and velocity, wave-body interactions, bubble image velocimetry, image processing

Procedia PDF Downloads 300
15300 Solar Power Forecasting for the Bidding Zones of the Italian Electricity Market with an Analog Ensemble Approach

Authors: Elena Collino, Dario A. Ronzio, Goffredo Decimi, Maurizio Riva

Abstract:

The rapid increase of renewable energy in Italy is led by wind and solar installations. The 2017 Italian energy strategy foresees a further development of these sustainable technologies, especially solar. This fact has resulted in new opportunities, challenges, and different problems to deal with. The growth of renewables allows to meet the European requirements regarding energy and environmental policy, but these types of sources are difficult to manage because they are intermittent and non-programmable. Operationally, these characteristics can lead to instability on the voltage profile and increasing uncertainty on energy reserve scheduling. The increasing renewable production must be considered with more and more attention especially by the Transmission System Operator (TSO). The TSO, in fact, every day provides orders on energy dispatch, once the market outcome has been determined, on extended areas, defined mainly on the basis of power transmission limitations. In Italy, six market zone are defined: Northern-Italy, Central-Northern Italy, Central-Southern Italy, Southern Italy, Sardinia, and Sicily. An accurate hourly renewable power forecasting for the day-ahead on these extended areas brings an improvement both in terms of dispatching and reserve management. In this study, an operational forecasting tool of the hourly solar output for the six Italian market zones is presented, and the performance is analysed. The implementation is carried out by means of a numerical weather prediction model, coupled with a statistical post-processing in order to derive the power forecast on the basis of the meteorological projection. The weather forecast is obtained from the limited area model RAMS on the Italian territory, initialized with IFS-ECMWF boundary conditions. The post-processing calculates the solar power production with the Analog Ensemble technique (AN). This statistical approach forecasts the production using a probability distribution of the measured production registered in the past when the weather scenario looked very similar to the forecasted one. The similarity is evaluated for the components of the solar radiation: global (GHI), diffuse (DIF) and direct normal (DNI) irradiation, together with the corresponding azimuth and zenith solar angles. These are, in fact, the main factors that affect the solar production. Considering that the AN performance is strictly related to the length and quality of the historical data a training period of more than one year has been used. The training set is made by historical Numerical Weather Prediction (NWP) forecasts at 12 UTC for the GHI, DIF and DNI variables over the Italian territory together with corresponding hourly measured production for each of the six zones. The AN technique makes it possible to estimate the aggregate solar production in the area, without information about the technologic characteristics of the all solar parks present in each area. Besides, this information is often only partially available. Every day, the hourly solar power forecast for the six Italian market zones is made publicly available through a website.

Keywords: analog ensemble, electricity market, PV forecast, solar energy

Procedia PDF Downloads 158
15299 FracXpert: Ensemble Machine Learning Approach for Localization and Classification of Bone Fractures in Cricket Athletes

Authors: Madushani Rodrigo, Banuka Athuraliya

Abstract:

In today's world of medical diagnosis and prediction, machine learning stands out as a strong tool, transforming old ways of caring for health. This study analyzes the use of machine learning in the specialized domain of sports medicine, with a focus on the timely and accurate detection of bone fractures in cricket athletes. Failure to identify bone fractures in real time can result in malunion or non-union conditions. To ensure proper treatment and enhance the bone healing process, accurately identifying fracture locations and types is necessary. When interpreting X-ray images, it relies on the expertise and experience of medical professionals in the identification process. Sometimes, radiographic images are of low quality, leading to potential issues. Therefore, it is necessary to have a proper approach to accurately localize and classify fractures in real time. The research has revealed that the optimal approach needs to address the stated problem and employ appropriate radiographic image processing techniques and object detection algorithms. These algorithms should effectively localize and accurately classify all types of fractures with high precision and in a timely manner. In order to overcome the challenges of misidentifying fractures, a distinct model for fracture localization and classification has been implemented. The research also incorporates radiographic image enhancement and preprocessing techniques to overcome the limitations posed by low-quality images. A classification ensemble model has been implemented using ResNet18 and VGG16. In parallel, a fracture segmentation model has been implemented using the enhanced U-Net architecture. Combining the results of these two implemented models, the FracXpert system can accurately localize exact fracture locations along with fracture types from the available 12 different types of fracture patterns, which include avulsion, comminuted, compressed, dislocation, greenstick, hairline, impacted, intraarticular, longitudinal, oblique, pathological, and spiral. This system will generate a confidence score level indicating the degree of confidence in the predicted result. Using ResNet18 and VGG16 architectures, the implemented fracture segmentation model, based on the U-Net architecture, achieved a high accuracy level of 99.94%, demonstrating its precision in identifying fracture locations. Simultaneously, the classification ensemble model achieved an accuracy of 81.0%, showcasing its ability to categorize various fracture patterns, which is instrumental in the fracture treatment process. In conclusion, FracXpert has become a promising ML application in sports medicine, demonstrating its potential to revolutionize fracture detection processes. By leveraging the power of ML algorithms, this study contributes to the advancement of diagnostic capabilities in cricket athlete healthcare, ensuring timely and accurate identification of bone fractures for the best treatment outcomes.

Keywords: multiclass classification, object detection, ResNet18, U-Net, VGG16

Procedia PDF Downloads 118
15298 Optimizing E-commerce Retention: A Detailed Study of Machine Learning Techniques for Churn Prediction

Authors: Saurabh Kumar

Abstract:

In the fiercely competitive landscape of e-commerce, understanding and mitigating customer churn has become paramount for sustainable business growth. This paper presents a thorough investigation into the application of machine learning techniques for churn prediction in e-commerce, aiming to provide actionable insights for businesses seeking to enhance customer retention strategies. We conduct a comparative study of various machine learning algorithms, including traditional statistical methods and ensemble techniques, leveraging a rich dataset sourced from Kaggle. Through rigorous evaluation, we assess the predictive performance, interpretability, and scalability of each method, elucidating their respective strengths and limitations in capturing the intricate dynamics of customer churn. We identified the XGBoost classifier to be the best performing. Our findings not only offer practical guidelines for selecting suitable modeling approaches but also contribute to the broader understanding of customer behavior in the e-commerce domain. Ultimately, this research equips businesses with the knowledge and tools necessary to proactively identify and address churn, thereby fostering long-term customer relationships and sustaining competitive advantage.

Keywords: customer churn, e-commerce, machine learning techniques, predictive performance, sustainable business growth

Procedia PDF Downloads 27
15297 A Single-Channel BSS-Based Method for Structural Health Monitoring of Civil Infrastructure under Environmental Variations

Authors: Yanjie Zhu, André Jesus, Irwanda Laory

Abstract:

Structural Health Monitoring (SHM), involving data acquisition, data interpretation and decision-making system aim to continuously monitor the structural performance of civil infrastructures under various in-service circumstances. The main value and purpose of SHM is identifying damages through data interpretation system. Research on SHM has been expanded in the last decades and a large volume of data is recorded every day owing to the dramatic development in sensor techniques and certain progress in signal processing techniques. However, efficient and reliable data interpretation for damage detection under environmental variations is still a big challenge. Structural damages might be masked because variations in measured data can be the result of environmental variations. This research reports a novel method based on single-channel Blind Signal Separation (BSS), which extracts environmental effects from measured data directly without any prior knowledge of the structure loading and environmental conditions. Despite the successful application in audio processing and bio-medical research fields, BSS has never been used to detect damage under varying environmental conditions. This proposed method optimizes and combines Ensemble Empirical Mode Decomposition (EEMD), Principal Component Analysis (PCA) and Independent Component Analysis (ICA) together to separate structural responses due to different loading conditions respectively from a single channel input signal. The ICA is applying on dimension-reduced output of EEMD. Numerical simulation of a truss bridge, inspired from New Joban Line Arakawa Railway Bridge, is used to validate this method. All results demonstrate that the single-channel BSS-based method can recover temperature effects from mixed structural response recorded by a single sensor with a convincing accuracy. This will be the foundation of further research on direct damage detection under varying environment.

Keywords: damage detection, ensemble empirical mode decomposition (EEMD), environmental variations, independent component analysis (ICA), principal component analysis (PCA), structural health monitoring (SHM)

Procedia PDF Downloads 304
15296 An Adaptive Oversampling Technique for Imbalanced Datasets

Authors: Shaukat Ali Shahee, Usha Ananthakumar

Abstract:

A data set exhibits class imbalance problem when one class has very few examples compared to the other class, and this is also referred to as between class imbalance. The traditional classifiers fail to classify the minority class examples correctly due to its bias towards the majority class. Apart from between-class imbalance, imbalance within classes where classes are composed of a different number of sub-clusters with these sub-clusters containing different number of examples also deteriorates the performance of the classifier. Previously, many methods have been proposed for handling imbalanced dataset problem. These methods can be classified into four categories: data preprocessing, algorithmic based, cost-based methods and ensemble of classifier. Data preprocessing techniques have shown great potential as they attempt to improve data distribution rather than the classifier. Data preprocessing technique handles class imbalance either by increasing the minority class examples or by decreasing the majority class examples. Decreasing the majority class examples lead to loss of information and also when minority class has an absolute rarity, removing the majority class examples is generally not recommended. Existing methods available for handling class imbalance do not address both between-class imbalance and within-class imbalance simultaneously. In this paper, we propose a method that handles between class imbalance and within class imbalance simultaneously for binary classification problem. Removing between class imbalance and within class imbalance simultaneously eliminates the biases of the classifier towards bigger sub-clusters by minimizing the error domination of bigger sub-clusters in total error. The proposed method uses model-based clustering to find the presence of sub-clusters or sub-concepts in the dataset. The number of examples oversampled among the sub-clusters is determined based on the complexity of sub-clusters. The method also takes into consideration the scatter of the data in the feature space and also adaptively copes up with unseen test data using Lowner-John ellipsoid for increasing the accuracy of the classifier. In this study, neural network is being used as this is one such classifier where the total error is minimized and removing the between-class imbalance and within class imbalance simultaneously help the classifier in giving equal weight to all the sub-clusters irrespective of the classes. The proposed method is validated on 9 publicly available data sets and compared with three existing oversampling techniques that rely on the spatial location of minority class examples in the euclidean feature space. The experimental results show the proposed method to be statistically significantly superior to other methods in terms of various accuracy measures. Thus the proposed method can serve as a good alternative to handle various problem domains like credit scoring, customer churn prediction, financial distress, etc., that typically involve imbalanced data sets.

Keywords: classification, imbalanced dataset, Lowner-John ellipsoid, model based clustering, oversampling

Procedia PDF Downloads 418
15295 A Hierarchical Bayesian Calibration of Data-Driven Models for Composite Laminate Consolidation

Authors: Nikolaos Papadimas, Joanna Bennett, Amir Sakhaei, Timothy Dodwell

Abstract:

Composite modeling of consolidation processes is playing an important role in the process and part design by indicating the formation of possible unwanted prior to expensive experimental iterative trial and development programs. Composite materials in their uncured state display complex constitutive behavior, which has received much academic interest, and this with different models proposed. Errors from modeling and statistical which arise from this fitting will propagate through any simulation in which the material model is used. A general hyperelastic polynomial representation was proposed, which can be readily implemented in various nonlinear finite element packages. In our case, FEniCS was chosen. The coefficients are assumed uncertain, and therefore the distribution of parameters learned using Markov Chain Monte Carlo (MCMC) methods. In engineering, the approach often followed is to select a single set of model parameters, which on average, best fits a set of experiments. There are good statistical reasons why this is not a rigorous approach to take. To overcome these challenges, A hierarchical Bayesian framework was proposed in which population distribution of model parameters is inferred from an ensemble of experiments tests. The resulting sampled distribution of hyperparameters is approximated using Maximum Entropy methods so that the distribution of samples can be readily sampled when embedded within a stochastic finite element simulation. The methodology is validated and demonstrated on a set of consolidation experiments of AS4/8852 with various stacking sequences. The resulting distributions are then applied to stochastic finite element simulations of the consolidation of curved parts, leading to a distribution of possible model outputs. With this, the paper, as far as the authors are aware, represents the first stochastic finite element implementation in composite process modelling.

Keywords: data-driven , material consolidation, stochastic finite elements, surrogate models

Procedia PDF Downloads 145
15294 Towards Human-Interpretable, Automated Learning of Feedback Control for the Mixing Layer

Authors: Hao Li, Guy Y. Cornejo Maceda, Yiqing Li, Jianguo Tan, Marek Morzynski, Bernd R. Noack

Abstract:

We propose an automated analysis of the flow control behaviour from an ensemble of control laws and associated time-resolved flow snapshots. The input may be the rich database of machine learning control (MLC) optimizing a feedback law for a cost function in the plant. The proposed methodology provides (1) insights into the control landscape, which maps control laws to performance, including extrema and ridge-lines, (2) a catalogue of representative flow states and their contribution to cost function for investigated control laws and (3) visualization of the dynamics. Key enablers are classification and feature extraction methods of machine learning. The analysis is successfully applied to the stabilization of a mixing layer with sensor-based feedback driving an upstream actuator. The fluctuation energy is reduced by 26%. The control replaces unforced Kelvin-Helmholtz vortices with subsequent vortex pairing by higher-frequency Kelvin-Helmholtz structures of lower energy. These efforts target a human interpretable, fully automated analysis of MLC identifying qualitatively different actuation regimes, distilling corresponding coherent structures, and developing a digital twin of the plant.

Keywords: machine learning control, mixing layer, feedback control, model-free control

Procedia PDF Downloads 223
15293 Transportation Mode Classification Using GPS Coordinates and Recurrent Neural Networks

Authors: Taylor Kolody, Farkhund Iqbal, Rabia Batool, Benjamin Fung, Mohammed Hussaeni, Saiqa Aleem

Abstract:

The rising threat of climate change has led to an increase in public awareness and care about our collective and individual environmental impact. A key component of this impact is our use of cars and other polluting forms of transportation, but it is often difficult for an individual to know how severe this impact is. While there are applications that offer this feedback, they require manual entry of what transportation mode was used for a given trip, which can be burdensome. In order to alleviate this shortcoming, a data from the 2016 TRIPlab datasets has been used to train a variety of machine learning models to automatically recognize the mode of transportation. The accuracy of 89.6% is achieved using single deep neural network model with Gated Recurrent Unit (GRU) architecture applied directly to trip data points over 4 primary classes, namely walking, public transit, car, and bike. These results are comparable in accuracy to results achieved by others using ensemble methods and require far less computation when classifying new trips. The lack of trip context data, e.g., bus routes, bike paths, etc., and the need for only a single set of weights make this an appropriate methodology for applications hoping to reach a broad demographic and have responsive feedback.

Keywords: classification, gated recurrent unit, recurrent neural network, transportation

Procedia PDF Downloads 137
15292 A Survey of Response Generation of Dialogue Systems

Authors: Yifan Fan, Xudong Luo, Pingping Lin

Abstract:

An essential task in the field of artificial intelligence is to allow computers to interact with people through natural language. Therefore, researches such as virtual assistants and dialogue systems have received widespread attention from industry and academia. The response generation plays a crucial role in dialogue systems, so to push forward the research on this topic, this paper surveys various methods for response generation. We sort out these methods into three categories. First one includes finite state machine methods, framework methods, and instance methods. The second contains full-text indexing methods, ontology methods, vast knowledge base method, and some other methods. The third covers retrieval methods and generative methods. We also discuss some hybrid methods based knowledge and deep learning. We compare their disadvantages and advantages and point out in which ways these studies can be improved further. Our discussion covers some studies published in leading conferences such as IJCAI and AAAI in recent years.

Keywords: deep learning, generative, knowledge, response generation, retrieval

Procedia PDF Downloads 134
15291 Teaching and Learning Jazz Improvisation Using Bloom's Taxonomy of Learning Domains

Authors: Graham Wood

Abstract:

The 20th Century saw the introduction of many new approaches to music making, including the structured and academic study of jazz improvisation. The rise of many school and tertiary jazz programs was rapid and quickly spread around the globe in a matter of decades. It could be said that the curriculum taught in these new programs was often developed in an ad-hoc manner due to the lack of written literature in this new and rapidly expanding area and the vastly different pedagogical principles when compared to classical music education that was prevalent in school and tertiary programs. There is widespread information regarding the theory and techniques used by jazz improvisers, but methods to practice these concepts in order to achieve the best outcomes for students and teachers is much harder to find. This research project explores the authors’ experiences as a studio jazz piano teacher, ensemble teacher and classroom improvisation lecturer over fifteen years and suggests an alignment with Bloom’s taxonomy of learning domains. This alignment categorizes the different tasks that need to be taught and practiced in order for the teacher and the student to devise a well balanced and effective practice routine and for the teacher to develop an effective teaching program. These techniques have been very useful to the teacher and the student to ensure that a good balance of cognitive, psychomotor and affective skills are taught to the students in a range of learning contexts.

Keywords: bloom, education, jazz, learning, music, teaching

Procedia PDF Downloads 256
15290 Development of a Biomechanical Method for Ergonomic Evaluation: Comparison with Observational Methods

Authors: M. Zare, S. Biau, M. Corq, Y. Roquelaure

Abstract:

A wide variety of observational methods have been developed to evaluate the ergonomic workloads in manufacturing. However, the precision and accuracy of these methods remain a subject of debate. The aims of this study were to develop biomechanical methods to evaluate ergonomic workloads and to compare them with observational methods. Two observational methods, i.e. SCANIA Ergonomic Standard (SES) and Rapid Upper Limb Assessment (RULA), were used to assess ergonomic workloads at two simulated workstations. They included four tasks such as tightening & loosening, attachment of tubes and strapping as well as other actions. Sensors were also used to measure biomechanical data (Inclinometers, Accelerometers, and Goniometers). Our findings showed that in assessment of some risk factors both RULA & SES were in agreement with the results of biomechanical methods. However, there was disagreement on neck and wrist postures. In conclusion, the biomechanical approach was more precise than observational methods, but some risk factors evaluated with observational methods were not measurable with the biomechanical techniques developed.

Keywords: ergonomic, observational method, biomechanical methods, workload

Procedia PDF Downloads 388
15289 A New Conjugate Gradient Method with Guaranteed Descent

Authors: B. Sellami, M. Belloufi

Abstract:

Conjugate gradient methods are an important class of methods for unconstrained optimization, especially for large-scale problems. Recently, they have been much studied. In this paper, we propose a new two-parameter family of conjugate gradient methods for unconstrained optimization. The two-parameter family of methods not only includes the already existing three practical nonlinear conjugate gradient methods, but also has other family of conjugate gradient methods as subfamily. The two-parameter family of methods with the Wolfe line search is shown to ensure the descent property of each search direction. Some general convergence results are also established for the two-parameter family of methods. The numerical results show that this method is efficient for the given test problems. In addition, the methods related to this family are uniformly discussed.

Keywords: unconstrained optimization, conjugate gradient method, line search, global convergence

Procedia PDF Downloads 452
15288 Predicting Aggregation Propensity from Low-Temperature Conformational Fluctuations

Authors: Hamza Javar Magnier, Robin Curtis

Abstract:

There have been rapid advances in the upstream processing of protein therapeutics, which has shifted the bottleneck to downstream purification and formulation. Finding liquid formulations with shelf lives of up to two years is increasingly difficult for some of the newer therapeutics, which have been engineered for activity, but their formulations are often viscous, can phase separate, and have a high propensity for irreversible aggregation1. We explore means to develop improved predictive ability from a better understanding of how protein-protein interactions on formulation conditions (pH, ionic strength, buffer type, presence of excipients) and how these impact upon the initial steps in protein self-association and aggregation. In this work, we study the initial steps in the aggregation pathways using a minimal protein model based on square-well potentials and discontinuous molecular dynamics. The effect of model parameters, including range of interaction, stiffness, chain length, and chain sequence, implies that protein models fold according to various pathways. By reducing the range of interactions, the folding- and collapse- transition come together, and follow a single-step folding pathway from the denatured to the native state2. After parameterizing the model interaction-parameters, we developed an understanding of low-temperature conformational properties and fluctuations, and the correlation to the folding transition of proteins in isolation. The model fluctuations increase with temperature. We observe a low-temperature point, below which large fluctuations are frozen out. This implies that fluctuations at low-temperature can be correlated to the folding transition at the melting temperature. Because proteins “breath” at low temperatures, defining a native-state as a single structure with conserved contacts and a fixed three-dimensional structure is misleading. Rather, we introduce a new definition of a native-state ensemble based on our understanding of the core conservation, which takes into account the native fluctuations at low temperatures. This approach permits the study of a large range of length and time scales needed to link the molecular interactions to the macroscopically observed behaviour. In addition, these models studied are parameterized by fitting to experimentally observed protein-protein interactions characterized in terms of osmotic second virial coefficients.

Keywords: protein folding, native-ensemble, conformational fluctuation, aggregation

Procedia PDF Downloads 361