Search results for: clustering algorithms
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2388

Search results for: clustering algorithms

2118 Optimal Feature Extraction Dimension in Finger Vein Recognition Using Kernel Principal Component Analysis

Authors: Amir Hajian, Sepehr Damavandinejadmonfared

Abstract:

In this paper the issue of dimensionality reduction is investigated in finger vein recognition systems using kernel Principal Component Analysis (KPCA). One aspect of KPCA is to find the most appropriate kernel function on finger vein recognition as there are several kernel functions which can be used within PCA-based algorithms. In this paper, however, another side of PCA-based algorithms -particularly KPCA- is investigated. The aspect of dimension of feature vector in PCA-based algorithms is of importance especially when it comes to the real-world applications and usage of such algorithms. It means that a fixed dimension of feature vector has to be set to reduce the dimension of the input and output data and extract the features from them. Then a classifier is performed to classify the data and make the final decision. We analyze KPCA (Polynomial, Gaussian, and Laplacian) in details in this paper and investigate the optimal feature extraction dimension in finger vein recognition using KPCA.

Keywords: biometrics, finger vein recognition, principal component analysis (PCA), kernel principal component analysis (KPCA)

Procedia PDF Downloads 326
2117 Understanding Farmers’ Perceptions Towards Agrivoltaics Using Decision Tree Algorithms

Authors: Mayuri Roy Choudhury

Abstract:

In recent times the concept of agrivoltaics has gained popularity due to the dual use of land and the added value provided by photovoltaics in terms of renewable energy and crop production on farms. However, the transition towards agrivoltaics has been slow, and our research tries to investigate the obstacles leading towards the slow progress of agrivoltaics. We applied data science decision tree algorithms to quantify qualitative perceptions of farmers in the United States for agrivoltaics. To date, there has not been much research that mentions farmers' perceptions, as most of the research focuses on the benefits of agrivoltaics. Our study adds value by putting forward the voices of farmers, which play a crucial towards the transition to agrivoltaics in the future. Our results show a mixture of responses in favor of agrivoltaics. Furthermore, it also portrays significant concerns of farmers, which is useful for decision-makers when it comes to formulating policies for agrivoltaics.

Keywords: agrivoltaics, decision-tree algorithms, farmers perception, transition

Procedia PDF Downloads 150
2116 Life Prediction of Condenser Tubes Applying Fuzzy Logic and Neural Network Algorithms

Authors: A. Majidian

Abstract:

The life prediction of thermal power plant components is necessary to prevent the unexpected outages, optimize maintenance tasks in periodic overhauls and plan inspection tasks with their schedules. One of the main critical components in a power plant is condenser because its failure can affect many other components which are positioned in downstream of condenser. This paper deals with factors affecting life of condenser. Failure rates dependency vs. these factors has been investigated using Artificial Neural Network (ANN) and fuzzy logic algorithms. These algorithms have shown their capabilities as dynamic tools to evaluate life prediction of power plant equipments.

Keywords: life prediction, condenser tube, neural network, fuzzy logic

Procedia PDF Downloads 315
2115 Prediction of MicroRNA-Target Gene by Machine Learning Algorithms in Lung Cancer Study

Authors: Nilubon Kurubanjerdjit, Nattakarn Iam-On, Ka-Lok Ng

Abstract:

MicroRNAs are small non-coding RNA found in many different species. They play crucial roles in cancer such as biological processes of apoptosis and proliferation. The identification of microRNA-target genes can be an essential first step towards to reveal the role of microRNA in various cancer types. In this paper, we predict miRNA-target genes for lung cancer by integrating prediction scores from miRanda and PITA algorithms used as a feature vector of miRNA-target interaction. Then, machine-learning algorithms were implemented for making a final prediction. The approach developed in this study should be of value for future studies into understanding the role of miRNAs in molecular mechanisms enabling lung cancer formation.

Keywords: microRNA, miRNAs, lung cancer, machine learning, Naïve Bayes, SVM

Procedia PDF Downloads 356
2114 Cotton Crops Vegetative Indices Based Assessment Using Multispectral Images

Authors: Muhammad Shahzad Shifa, Amna Shifa, Muhammad Omar, Aamir Shahzad, Rahmat Ali Khan

Abstract:

Many applications of remote sensing to vegetation and crop response depend on spectral properties of individual leaves and plants. Vegetation indices are usually determined to estimate crop biophysical parameters like crop canopies and crop leaf area indices with the help of remote sensing. Cotton crops assessment is performed with the help of vegetative indices. Remotely sensed images from an optical multispectral radiometer MSR5 are used in this study. The interpretation is based on the fact that different materials reflect and absorb light differently at different wavelengths. Non-normalized and normalized forms of these datasets are analyzed using two complementary data mining algorithms; K-means and K-nearest neighbor (KNN). Our analysis shows that the use of normalized reflectance data and vegetative indices are suitable for an automated assessment and decision making.

Keywords: cotton, condition assessment, KNN algorithm, clustering, MSR5, vegetation indices

Procedia PDF Downloads 283
2113 Algorithms Inspired from Human Behavior Applied to Optimization of a Complex Process

Authors: S. Curteanu, F. Leon, M. Gavrilescu, S. A. Floria

Abstract:

Optimization algorithms inspired from human behavior were applied in this approach, associated with neural networks models. The algorithms belong to human behaviors of learning and cooperation and human competitive behavior classes. For the first class, the main strategies include: random learning, individual learning, and social learning, and the selected algorithms are: simplified human learning optimization (SHLO), social learning optimization (SLO), and teaching-learning based optimization (TLBO). For the second class, the concept of learning is associated with competitiveness, and the selected algorithms are sports-inspired algorithms (with Football Game Algorithm, FGA and Volleyball Premier League, VPL) and Imperialist Competitive Algorithm (ICA). A real process, the synthesis of polyacrylamide-based multicomponent hydrogels, where some parameters are difficult to obtain experimentally, is considered as a case study. Reaction yield and swelling degree are predicted as a function of reaction conditions (acrylamide concentration, initiator concentration, crosslinking agent concentration, temperature, reaction time, and amount of inclusion polymer, which could be starch, poly(vinyl alcohol) or gelatin). The experimental results contain 175 data. Artificial neural networks are obtained in optimal form with biologically inspired algorithm; the optimization being perform at two level: structural and parametric. Feedforward neural networks with one or two hidden layers and no more than 25 neurons in intermediate layers were obtained with values of correlation coefficient in the validation phase over 0.90. The best results were obtained with TLBO algorithm, correlation coefficient being 0.94 for an MLP(6:9:20:2) – a feedforward neural network with two hidden layers and 9 and 20, respectively, intermediate neurons. Good results obtained prove the efficiency of the optimization algorithms. More than the good results, what is important in this approach is the simulation methodology, including neural networks and optimization biologically inspired algorithms, which provide satisfactory results. In addition, the methodology developed in this approach is general and has flexibility so that it can be easily adapted to other processes in association with different types of models.

Keywords: artificial neural networks, human behaviors of learning and cooperation, human competitive behavior, optimization algorithms

Procedia PDF Downloads 73
2112 Hexagonal Honeycomb Sandwich Plate Optimization Using Gravitational Search Algorithm

Authors: A. Boudjemai, A. Zafrane, R. Hocine

Abstract:

Honeycomb sandwich panels are increasingly used in the construction of space vehicles because of their outstanding strength, stiffness and light weight properties. However, the use of honeycomb sandwich plates comes with difficulties in the design process as a result of the large number of design variables involved, including composite material design, shape and geometry. Hence, this work deals with the presentation of an optimal design of hexagonal honeycomb sandwich structures subjected to space environment. The optimization process is performed using a set of algorithms including the gravitational search algorithm (GSA). Numerical results are obtained and presented for a set of algorithms. The results obtained by the GSA algorithm are much better compared to other algorithms used in this study.

Keywords: optimization, gravitational search algorithm, genetic algorithm, honeycomb plate

Procedia PDF Downloads 345
2111 Performance Evaluation of Various Segmentation Techniques on MRI of Brain Tissue

Authors: U.V. Suryawanshi, S.S. Chowhan, U.V Kulkarni

Abstract:

Accuracy of segmentation methods is of great importance in brain image analysis. Tissue classification in Magnetic Resonance brain images (MRI) is an important issue in the analysis of several brain dementias. This paper portraits performance of segmentation techniques that are used on Brain MRI. A large variety of algorithms for segmentation of Brain MRI has been developed. The objective of this paper is to perform a segmentation process on MR images of the human brain, using Fuzzy c-means (FCM), Kernel based Fuzzy c-means clustering (KFCM), Spatial Fuzzy c-means (SFCM) and Improved Fuzzy c-means (IFCM). The review covers imaging modalities, MRI and methods for noise reduction and segmentation approaches. All methods are applied on MRI brain images which are degraded by salt-pepper noise demonstrate that the IFCM algorithm performs more robust to noise than the standard FCM algorithm. We conclude with a discussion on the trend of future research in brain segmentation and changing norms in IFCM for better results.

Keywords: image segmentation, preprocessing, MRI, FCM, KFCM, SFCM, IFCM

Procedia PDF Downloads 290
2110 Comparison of Back-Projection with Non-Uniform Fast Fourier Transform for Real-Time Photoacoustic Tomography

Authors: Moung Young Lee, Chul Gyu Song

Abstract:

Photoacoustic imaging is the imaging technology that combines the optical imaging and ultrasound. This provides the high contrast and resolution due to optical imaging and ultrasound imaging, respectively. We developed the real-time photoacoustic tomography (PAT) system using linear-ultrasound transducer and digital acquisition (DAQ) board. There are two types of algorithm for reconstructing the photoacoustic signal. One is back-projection algorithm, the other is FFT algorithm. Especially, we used the non-uniform FFT algorithm. To evaluate the performance of our system and algorithms, we monitored two wires that stands at interval of 2.89 mm and 0.87 mm. Then, we compared the images reconstructed by algorithms. Finally, we monitored the two hairs crossed and compared between these algorithms.

Keywords: back-projection, image comparison, non-uniform FFT, photoacoustic tomography

Procedia PDF Downloads 396
2109 Automatic Landmark Selection Based on Feature Clustering for Visual Autonomous Unmanned Aerial Vehicle Navigation

Authors: Paulo Fernando Silva Filho, Elcio Hideiti Shiguemori

Abstract:

The selection of specific landmarks for an Unmanned Aerial Vehicles’ Visual Navigation systems based on Automatic Landmark Recognition has significant influence on the precision of the system’s estimated position. At the same time, manual selection of the landmarks does not guarantee a high recognition rate, which would also result on a poor precision. This work aims to develop an automatic landmark selection that will take the image of the flight area and identify the best landmarks to be recognized by the Visual Navigation Landmark Recognition System. The criterion to select a landmark is based on features detected by ORB or AKAZE and edges information on each possible landmark. Results have shown that disposition of possible landmarks is quite different from the human perception.

Keywords: clustering, edges, feature points, landmark selection, X-means

Procedia PDF Downloads 239
2108 Security of Database Using Chaotic Systems

Authors: Eman W. Boghdady, A. R. Shehata, M. A. Azem

Abstract:

Database (DB) security demands permitting authorized users and prohibiting non-authorized users and intruders actions on the DB and the objects inside it. Organizations that are running successfully demand the confidentiality of their DBs. They do not allow the unauthorized access to their data/information. They also demand the assurance that their data is protected against any malicious or accidental modification. DB protection and confidentiality are the security concerns. There are four types of controls to obtain the DB protection, those include: access control, information flow control, inference control, and cryptographic. The cryptographic control is considered as the backbone for DB security, it secures the DB by encryption during storage and communications. Current cryptographic techniques are classified into two types: traditional classical cryptography using standard algorithms (DES, AES, IDEA, etc.) and chaos cryptography using continuous (Chau, Rossler, Lorenz, etc.) or discreet (Logistics, Henon, etc.) algorithms. The important characteristics of chaos are its extreme sensitivity to initial conditions of the system. In this paper, DB-security systems based on chaotic algorithms are described. The Pseudo Random Numbers Generators (PRNGs) from the different chaotic algorithms are implemented using Matlab and their statistical properties are evaluated using NIST and other statistical test-suits. Then, these algorithms are used to secure conventional DB (plaintext), where the statistical properties of the ciphertext are also tested. To increase the complexity of the PRNGs and to let pass all the NIST statistical tests, we propose two hybrid PRNGs: one based on two chaotic Logistic maps and another based on two chaotic Henon maps, where each chaotic algorithm is running side-by-side and starting from random independent initial conditions and parameters (encryption keys). The resulted hybrid PRNGs passed the NIST statistical test suit.

Keywords: algorithms and data structure, DB security, encryption, chaotic algorithms, Matlab, NIST

Procedia PDF Downloads 235
2107 An Ensemble Learning Method for Applying Particle Swarm Optimization Algorithms to Systems Engineering Problems

Authors: Ken Hampshire, Thomas Mazzuchi, Shahram Sarkani

Abstract:

As a subset of metaheuristics, nature-inspired optimization algorithms such as particle swarm optimization (PSO) have shown promise both in solving intractable problems and in their extensibility to novel problem formulations due to their general approach requiring few assumptions. Unfortunately, single instantiations of algorithms require detailed tuning of parameters and cannot be proven to be best suited to a particular illustrative problem on account of the “no free lunch” (NFL) theorem. Using these algorithms in real-world problems requires exquisite knowledge of the many techniques and is not conducive to reconciling the various approaches to given classes of problems. This research aims to present a unified view of PSO-based approaches from the perspective of relevant systems engineering problems, with the express purpose of then eliciting the best solution for any problem formulation in an ensemble learning bucket of models approach. The central hypothesis of the research is that extending the PSO algorithms found in the literature to real-world optimization problems requires a general ensemble-based method for all problem formulations but a specific implementation and solution for any instance. The main results are a problem-based literature survey and a general method to find more globally optimal solutions for any systems engineering optimization problem.

Keywords: particle swarm optimization, nature-inspired optimization, metaheuristics, systems engineering, ensemble learning

Procedia PDF Downloads 45
2106 An Efficient Machine Learning Model to Detect Metastatic Cancer in Pathology Scans Using Principal Component Analysis Algorithm, Genetic Algorithm, and Classification Algorithms

Authors: Bliss Singhal

Abstract:

Machine learning (ML) is a branch of Artificial Intelligence (AI) where computers analyze data and find patterns in the data. The study focuses on the detection of metastatic cancer using ML. Metastatic cancer is the stage where cancer has spread to other parts of the body and is the cause of approximately 90% of cancer-related deaths. Normally, pathologists spend hours each day to manually classifying whether tumors are benign or malignant. This tedious task contributes to mislabeling metastasis being over 60% of the time and emphasizes the importance of being aware of human error and other inefficiencies. ML is a good candidate to improve the correct identification of metastatic cancer, saving thousands of lives and can also improve the speed and efficiency of the process, thereby taking fewer resources and time. So far, the deep learning methodology of AI has been used in research to detect cancer. This study is a novel approach to determining the potential of using preprocessing algorithms combined with classification algorithms in detecting metastatic cancer. The study used two preprocessing algorithms: principal component analysis (PCA) and the genetic algorithm, to reduce the dimensionality of the dataset and then used three classification algorithms: logistic regression, decision tree classifier, and k-nearest neighbors to detect metastatic cancer in the pathology scans. The highest accuracy of 71.14% was produced by the ML pipeline comprising of PCA, the genetic algorithm, and the k-nearest neighbor algorithm, suggesting that preprocessing and classification algorithms have great potential for detecting metastatic cancer.

Keywords: breast cancer, principal component analysis, genetic algorithm, k-nearest neighbors, decision tree classifier, logistic regression

Procedia PDF Downloads 43
2105 Evolutionary Methods in Cryptography

Authors: Wafa Slaibi Alsharafat

Abstract:

Genetic algorithms (GA) are random algorithms as random numbers that are generated during the operation of the algorithm determine what happens. This means that if GA is applied twice to optimize exactly the same problem it might produces two different answers. In this project, we propose an evolutionary algorithm and Genetic Algorithm (GA) to be implemented in symmetric encryption and decryption. Here, user's message and user secret information (key) which represent plain text to be transferred into cipher text.

Keywords: GA, encryption, decryption, crossover

Procedia PDF Downloads 403
2104 Predicting Groundwater Areas Using Data Mining Techniques: Groundwater in Jordan as Case Study

Authors: Faisal Aburub, Wael Hadi

Abstract:

Data mining is the process of extracting useful or hidden information from a large database. Extracted information can be used to discover relationships among features, where data objects are grouped according to logical relationships; or to predict unseen objects to one of the predefined groups. In this paper, we aim to investigate four well-known data mining algorithms in order to predict groundwater areas in Jordan. These algorithms are Support Vector Machines (SVMs), Naïve Bayes (NB), K-Nearest Neighbor (kNN) and Classification Based on Association Rule (CBA). The experimental results indicate that the SVMs algorithm outperformed other algorithms in terms of classification accuracy, precision and F1 evaluation measures using the datasets of groundwater areas that were collected from Jordanian Ministry of Water and Irrigation.

Keywords: classification, data mining, evaluation measures, groundwater

Procedia PDF Downloads 242
2103 Clustering Based and Centralized Routing Table Topology of Control Protocol in Mobile Wireless Sensor Networks

Authors: Mbida Mohamed, Ezzati Abdellah

Abstract:

A strong challenge in the wireless sensor networks (WSN) is to save the energy and have a long life time in the network without having a high rate of loss information. However, topology control (TC) protocols are designed in a way that the network is divided and having a standard system of exchange packets between nodes. In this article, we will propose a clustering based and centralized routing table protocol of TC (CBCRT) which delegates a leader node that will encapsulate a single routing table in every cluster nodes. Hence, if a node wants to send packets to the sink, it requests the information's routing table of the current cluster from the node leader in order to root the packet.

Keywords: mobile wireless sensor networks, routing, topology of control, protocols

Procedia PDF Downloads 237
2102 Evaluation of Photovoltaic System with Different Research Methods of Maximum Power Point Tracking

Authors: Mehdi Ameur, Ahmed Essadki, Tamou Nasser

Abstract:

The purpose of this paper is the evaluation of photovoltaic system with MPPT techniques. This system is developed by combining the models of established solar module and DC-DC converter with the algorithms of perturbing and observing (P&O), incremental conductance (INC) and fuzzy logic controller (FLC). The system is simulated under different climate conditions and MPPT algorithms to determine the influence of these conditions on characteristic power-voltage of PV system. According to the comparisons of the simulation results, the photovoltaic system can extract the maximum power with precision and rapidity using the MPPT algorithms discussed in this paper.

Keywords: fuzzy logic controller, FLC, hill climbing, HC, incremental conductance (INC), perturb and observe (P&O), maximum power point, MPP, maximum power point tracking, MPPT

Procedia PDF Downloads 475
2101 Identification of Hepatocellular Carcinoma Using Supervised Learning Algorithms

Authors: Sagri Sharma

Abstract:

Analysis of diseases integrating multi-factors increases the complexity of the problem and therefore, development of frameworks for the analysis of diseases is an issue that is currently a topic of intense research. Due to the inter-dependence of the various parameters, the use of traditional methodologies has not been very effective. Consequently, newer methodologies are being sought to deal with the problem. Supervised Learning Algorithms are commonly used for performing the prediction on previously unseen data. These algorithms are commonly used for applications in fields ranging from image analysis to protein structure and function prediction and they get trained using a known dataset to come up with a predictor model that generates reasonable predictions for the response to new data. Gene expression profiles generated by DNA analysis experiments can be quite complex since these experiments can involve hypotheses involving entire genomes. The application of well-known machine learning algorithm - Support Vector Machine - to analyze the expression levels of thousands of genes simultaneously in a timely, automated and cost effective way is thus used. The objectives to undertake the presented work are development of a methodology to identify genes relevant to Hepatocellular Carcinoma (HCC) from gene expression dataset utilizing supervised learning algorithms and statistical evaluations along with development of a predictive framework that can perform classification tasks on new, unseen data.

Keywords: artificial intelligence, biomarker, gene expression datasets, hepatocellular carcinoma, machine learning, supervised learning algorithms, support vector machine

Procedia PDF Downloads 390
2100 Pruning Algorithm for the Minimum Rule Reduct Generation

Authors: Sahin Emrah Amrahov, Fatih Aybar, Serhat Dogan

Abstract:

In this paper we consider the rule reduct generation problem. Rule Reduct Generation (RG) and Modified Rule Generation (MRG) algorithms, that are used to solve this problem, are well-known. Alternative to these algorithms, we develop Pruning Rule Generation (PRG) algorithm. We compare the PRG algorithm with RG and MRG.

Keywords: rough sets, decision rules, rule induction, classification

Procedia PDF Downloads 493
2099 Evaluation of Security and Performance of Master Node Protocol in the Bitcoin Peer-To-Peer Network

Authors: Muntadher Sallal, Gareth Owenson, Mo Adda, Safa Shubbar

Abstract:

Bitcoin is a digital currency based on a peer-to-peer network to propagate and verify transactions. Bitcoin is gaining wider adoption than any previous crypto-currency. However, the mechanism of peers randomly choosing logical neighbors without any knowledge about underlying physical topology can cause a delay overhead in information propagation, which makes the system vulnerable to double-spend attacks. Aiming at alleviating the propagation delay problem, this paper introduces proximity-aware extensions to the current Bitcoin protocol, named Master Node Based Clustering (MNBC). The ultimate purpose of the proposed protocol, that are based on how clusters are formulated and how nodes can define their membership, is to improve the information propagation delay in the Bitcoin network. In MNBC protocol, physical internet connectivity increases, as well as the number of hops between nodes, decreases through assigning nodes to be responsible for maintaining clusters based on physical internet proximity. We show, through simulations, that the proposed protocol defines better clustering structures that optimize the performance of the transaction propagation over the Bitcoin protocol. The evaluation of partition attacks in the MNBC protocol, as well as the Bitcoin network, was done in this paper. Evaluation results prove that even though the Bitcoin network is more resistant against the partitioning attack than the MNBC protocol, more resources are needed to be spent to split the network in the MNBC protocol, especially with a higher number of nodes.

Keywords: Bitcoin network, propagation delay, clustering, scalability

Procedia PDF Downloads 82
2098 Event Driven Dynamic Clustering and Data Aggregation in Wireless Sensor Network

Authors: Ashok V. Sutagundar, Sunilkumar S. Manvi

Abstract:

Energy, delay and bandwidth are the prime issues of wireless sensor network (WSN). Energy usage optimization and efficient bandwidth utilization are important issues in WSN. Event triggered data aggregation facilitates such optimal tasks for event affected area in WSN. Reliable delivery of the critical information to sink node is also a major challenge of WSN. To tackle these issues, we propose an event driven dynamic clustering and data aggregation scheme for WSN that enhances the life time of the network by minimizing redundant data transmission. The proposed scheme operates as follows: (1) Whenever the event is triggered, event triggered node selects the cluster head. (2) Cluster head gathers data from sensor nodes within the cluster. (3) Cluster head node identifies and classifies the events out of the collected data using Bayesian classifier. (4) Aggregation of data is done using statistical method. (5) Cluster head discovers the paths to the sink node using residual energy, path distance and bandwidth. (6) If the aggregated data is critical, cluster head sends the aggregated data over the multipath for reliable data communication. (7) Otherwise aggregated data is transmitted towards sink node over the single path which is having the more bandwidth and residual energy. The performance of the scheme is validated for various WSN scenarios to evaluate the effectiveness of the proposed approach in terms of aggregation time, cluster formation time and energy consumed for aggregation.

Keywords: wireless sensor network, dynamic clustering, data aggregation, wireless communication

Procedia PDF Downloads 402
2097 The Effect of Feature Selection on Pattern Classification

Authors: Chih-Fong Tsai, Ya-Han Hu

Abstract:

The aim of feature selection (or dimensionality reduction) is to filter out unrepresentative features (or variables) making the classifier perform better than the one without feature selection. Since there are many well-known feature selection algorithms, and different classifiers based on different selection results may perform differently, very few studies consider examining the effect of performing different feature selection algorithms on the classification performances by different classifiers over different types of datasets. In this paper, two widely used algorithms, which are the genetic algorithm (GA) and information gain (IG), are used to perform feature selection. On the other hand, three well-known classifiers are constructed, which are the CART decision tree (DT), multi-layer perceptron (MLP) neural network, and support vector machine (SVM). Based on 14 different types of datasets, the experimental results show that in most cases IG is a better feature selection algorithm than GA. In addition, the combinations of IG with DT and IG with SVM perform best and second best for small and large scale datasets.

Keywords: data mining, feature selection, pattern classification, dimensionality reduction

Procedia PDF Downloads 627
2096 Efficient Credit Card Fraud Detection Based on Multiple ML Algorithms

Authors: Neha Ahirwar

Abstract:

In the contemporary digital era, the rise of credit card fraud poses a significant threat to both financial institutions and consumers. As fraudulent activities become more sophisticated, there is an escalating demand for robust and effective fraud detection mechanisms. Advanced machine learning algorithms have become crucial tools in addressing this challenge. This paper conducts a thorough examination of the design and evaluation of a credit card fraud detection system, utilizing four prominent machine learning algorithms: random forest, logistic regression, decision tree, and XGBoost. The surge in digital transactions has opened avenues for fraudsters to exploit vulnerabilities within payment systems. Consequently, there is an urgent need for proactive and adaptable fraud detection systems. This study addresses this imperative by exploring the efficacy of machine learning algorithms in identifying fraudulent credit card transactions. The selection of random forest, logistic regression, decision tree, and XGBoost for scrutiny in this study is based on their documented effectiveness in diverse domains, particularly in credit card fraud detection. These algorithms are renowned for their capability to model intricate patterns and provide accurate predictions. Each algorithm is implemented and evaluated for its performance in a controlled environment, utilizing a diverse dataset comprising both genuine and fraudulent credit card transactions.

Keywords: efficient credit card fraud detection, random forest, logistic regression, XGBoost, decision tree

Procedia PDF Downloads 19
2095 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Application of five implementations of three data mining classification techniques was experimented for extracting important insights from tourism data. The aim was to find out the best performing algorithm among the compared ones for tourism knowledge discovery. Knowledge discovery process from data was used as a process model. 10-fold cross validation method is used for testing purpose. Various data preprocessing activities were performed to get the final dataset for model building. Classification models of the selected algorithms were built with different scenarios on the preprocessed dataset. The outperformed algorithm tourism dataset was Random Forest (76%) before applying information gain based attribute selection and J48 (C4.5) (75%) after selection of top relevant attributes to the class (target) attribute. In terms of time for model building, attribute selection improves the efficiency of all algorithms. Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented, which showed intricate, non-trivial knowledge/insight that would otherwise not be discovered by simple statistical analysis with mediocre accuracy of the machine using classification algorithms.

Keywords: classification algorithms, data mining, knowledge discovery, tourism

Procedia PDF Downloads 259
2094 An Experimental Study for Assessing Email Classification Attributes Using Feature Selection Methods

Authors: Issa Qabaja, Fadi Thabtah

Abstract:

Email phishing classification is one of the vital problems in the online security research domain that have attracted several scholars due to its impact on the users payments performed daily online. One aspect to reach a good performance by the detection algorithms in the email phishing problem is to identify the minimal set of features that significantly have an impact on raising the phishing detection rate. This paper investigate three known feature selection methods named Information Gain (IG), Chi-square and Correlation Features Set (CFS) on the email phishing problem to separate high influential features from low influential ones in phishing detection. We measure the degree of influentially by applying four data mining algorithms on a large set of features. We compare the accuracy of these algorithms on the complete features set before feature selection has been applied and after feature selection has been applied. After conducting experiments, the results show 12 common significant features have been chosen among the considered features by the feature selection methods. Further, the average detection accuracy derived by the data mining algorithms on the reduced 12-features set was very slight affected when compared with the one derived from the 47-features set.

Keywords: data mining, email classification, phishing, online security

Procedia PDF Downloads 394
2093 An Optimal Steganalysis Based Approach for Embedding Information in Image Cover Media with Security

Authors: Ahlem Fatnassi, Hamza Gharsellaoui, Sadok Bouamama

Abstract:

This paper deals with the study of interest in the fields of Steganography and Steganalysis. Steganography involves hiding information in a cover media to obtain the stego media in such a way that the cover media is perceived not to have any embedded message for its unintended recipients. Steganalysis is the mechanism of detecting the presence of hidden information in the stego media and it can lead to the prevention of disastrous security incidents. In this paper, we provide a critical review of the steganalysis algorithms available to analyze the characteristics of an image stego media against the corresponding cover media and understand the process of embedding the information and its detection. We anticipate that this paper can also give a clear picture of the current trends in steganography so that we can develop and improvise appropriate steganalysis algorithms.

Keywords: optimization, heuristics and metaheuristics algorithms, embedded systems, low-power consumption, steganalysis heuristic approach

Procedia PDF Downloads 258
2092 Identification of Nonlinear Systems Using Radial Basis Function Neural Network

Authors: C. Pislaru, A. Shebani

Abstract:

This paper uses the radial basis function neural network (RBFNN) for system identification of nonlinear systems. Five nonlinear systems are used to examine the activity of RBFNN in system modeling of nonlinear systems; the five nonlinear systems are dual tank system, single tank system, DC motor system, and two academic models. The feed forward method is considered in this work for modelling the non-linear dynamic models, where the K-Means clustering algorithm used in this paper to select the centers of radial basis function network, because it is reliable, offers fast convergence and can handle large data sets. The least mean square method is used to adjust the weights to the output layer, and Euclidean distance method used to measure the width of the Gaussian function.

Keywords: system identification, nonlinear systems, neural networks, radial basis function, K-means clustering algorithm

Procedia PDF Downloads 431
2091 Discriminating Between Energy Drinks and Sports Drinks Based on Their Chemical Properties Using Chemometric Methods

Authors: Robert Cazar, Nathaly Maza

Abstract:

Energy drinks and sports drinks are quite popular among young adults and teenagers worldwide. Some concerns regarding their health effects – particularly those of the energy drinks - have been raised based on scientific findings. Differentiating between these two types of drinks by means of their chemical properties seems to be an instructive task. Chemometrics provides the most appropriate strategy to do so. In this study, a discrimination analysis of the energy and sports drinks has been carried out applying chemometric methods. A set of eleven samples of available commercial brands of drinks – seven energy drinks and four sports drinks – were collected. Each sample was characterized by eight chemical variables (carbohydrates, energy, sugar, sodium, pH, degrees Brix, density, and citric acid). The data set was standardized and examined by exploratory chemometric techniques such as clustering and principal component analysis. As a preliminary step, a variable selection was carried out by inspecting the variable correlation matrix. It was detected that some variables are redundant, so they can be safely removed, leaving only five variables that are sufficient for this analysis. They are sugar, sodium, pH, density, and citric acid. Then, a hierarchical clustering `employing the average – linkage criterion and using the Euclidian distance metrics was performed. It perfectly separates the two types of drinks since the resultant dendogram, cut at the 25% similarity level, assorts the samples in two well defined groups, one of them containing the energy drinks and the other one the sports drinks. Further assurance of the complete discrimination is provided by the principal component analysis. The projection of the data set on the first two principal components – which retain the 71% of the data information – permits to visualize the distribution of the samples in the two groups identified in the clustering stage. Since the first principal component is the discriminating one, the inspection of its loadings consents to characterize such groups. The energy drinks group possesses medium to high values of density, citric acid, and sugar. The sports drinks group, on the other hand, exhibits low values of those variables. In conclusion, the application of chemometric methods on a data set that features some chemical properties of a number of energy and sports drinks provides an accurate, dependable way to discriminate between these two types of beverages.

Keywords: chemometrics, clustering, energy drinks, principal component analysis, sports drinks

Procedia PDF Downloads 66
2090 The Impact of Temporal Impairment on Quality of Experience (QoE) in Video Streaming: A No Reference (NR) Subjective and Objective Study

Authors: Muhammad Arslan Usman, Muhammad Rehan Usman, Soo Young Shin

Abstract:

Live video streaming is one of the most widely used service among end users, yet it is a big challenge for the network operators in terms of quality. The only way to provide excellent Quality of Experience (QoE) to the end users is continuous monitoring of live video streaming. For this purpose, there are several objective algorithms available that monitor the quality of the video in a live stream. Subjective tests play a very important role in fine tuning the results of objective algorithms. As human perception is considered to be the most reliable source for assessing the quality of a video stream, subjective tests are conducted in order to develop more reliable objective algorithms. Temporal impairments in a live video stream can have a negative impact on the end users. In this paper we have conducted subjective evaluation tests on a set of video sequences containing temporal impairment known as frame freezing. Frame Freezing is considered as a transmission error as well as a hardware error which can result in loss of video frames on the reception side of a transmission system. In our subjective tests, we have performed tests on videos that contain a single freezing event and also for videos that contain multiple freezing events. We have recorded our subjective test results for all the videos in order to give a comparison on the available No Reference (NR) objective algorithms. Finally, we have shown the performance of no reference algorithms used for objective evaluation of videos and suggested the algorithm that works better. The outcome of this study shows the importance of QoE and its effect on human perception. The results for the subjective evaluation can serve the purpose for validating objective algorithms.

Keywords: objective evaluation, subjective evaluation, quality of experience (QoE), video quality assessment (VQA)

Procedia PDF Downloads 568
2089 Neuroevolution Based on Adaptive Ensembles of Biologically Inspired Optimization Algorithms Applied for Modeling a Chemical Engineering Process

Authors: Sabina-Adriana Floria, Marius Gavrilescu, Florin Leon, Silvia Curteanu, Costel Anton

Abstract:

Neuroevolution is a subfield of artificial intelligence used to solve various problems in different application areas. Specifically, neuroevolution is a technique that applies biologically inspired methods to generate neural network architectures and optimize their parameters automatically. In this paper, we use different biologically inspired optimization algorithms in an ensemble strategy with the aim of training multilayer perceptron neural networks, resulting in regression models used to simulate the industrial chemical process of obtaining bricks from silicone-based materials. Installations in the raw ceramics industry, i.e., bricks, are characterized by significant energy consumption and large quantities of emissions. In addition, the initial conditions that were taken into account during the design and commissioning of the installation can change over time, which leads to the need to add new mixes to adjust the operating conditions for the desired purpose, e.g., material properties and energy saving. The present approach follows the study by simulation of a process of obtaining bricks from silicone-based materials, i.e., the modeling and optimization of the process. Optimization aims to determine the working conditions that minimize the emissions represented by nitrogen monoxide. We first use a search procedure to find the best values for the parameters of various biologically inspired optimization algorithms. Then, we propose an adaptive ensemble strategy that uses only a subset of the best algorithms identified in the search stage. The adaptive ensemble strategy combines the results of selected algorithms and automatically assigns more processing capacity to the more efficient algorithms. Their efficiency may also vary at different stages of the optimization process. In a given ensemble iteration, the most efficient algorithms aim to maintain good convergence, while the less efficient algorithms can improve population diversity. The proposed adaptive ensemble strategy outperforms the individual optimizers and the non-adaptive ensemble strategy in convergence speed, and the obtained results provide lower error values.

Keywords: optimization, biologically inspired algorithm, neuroevolution, ensembles, bricks, emission minimization

Procedia PDF Downloads 65