Search results for: sampling algorithms
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4988

Search results for: sampling algorithms

4868 Prediction of MicroRNA-Target Gene by Machine Learning Algorithms in Lung Cancer Study

Authors: Nilubon Kurubanjerdjit, Nattakarn Iam-On, Ka-Lok Ng

Abstract:

MicroRNAs are small non-coding RNA found in many different species. They play crucial roles in cancer such as biological processes of apoptosis and proliferation. The identification of microRNA-target genes can be an essential first step towards to reveal the role of microRNA in various cancer types. In this paper, we predict miRNA-target genes for lung cancer by integrating prediction scores from miRanda and PITA algorithms used as a feature vector of miRNA-target interaction. Then, machine-learning algorithms were implemented for making a final prediction. The approach developed in this study should be of value for future studies into understanding the role of miRNAs in molecular mechanisms enabling lung cancer formation.

Keywords: microRNA, miRNAs, lung cancer, machine learning, Naïve Bayes, SVM

Procedia PDF Downloads 401
4867 A Fuzzy Kernel K-Medoids Algorithm for Clustering Uncertain Data Objects

Authors: Behnam Tavakkol

Abstract:

Uncertain data mining algorithms use different ways to consider uncertainty in data such as by representing a data object as a sample of points or a probability distribution. Fuzzy methods have long been used for clustering traditional (certain) data objects. They are used to produce non-crisp cluster labels. For uncertain data, however, besides some uncertain fuzzy k-medoids algorithms, not many other fuzzy clustering methods have been developed. In this work, we develop a fuzzy kernel k-medoids algorithm for clustering uncertain data objects. The developed fuzzy kernel k-medoids algorithm is superior to existing fuzzy k-medoids algorithms in clustering data sets with non-linearly separable clusters.

Keywords: clustering algorithm, fuzzy methods, kernel k-medoids, uncertain data

Procedia PDF Downloads 216
4866 Algorithms Inspired from Human Behavior Applied to Optimization of a Complex Process

Authors: S. Curteanu, F. Leon, M. Gavrilescu, S. A. Floria

Abstract:

Optimization algorithms inspired from human behavior were applied in this approach, associated with neural networks models. The algorithms belong to human behaviors of learning and cooperation and human competitive behavior classes. For the first class, the main strategies include: random learning, individual learning, and social learning, and the selected algorithms are: simplified human learning optimization (SHLO), social learning optimization (SLO), and teaching-learning based optimization (TLBO). For the second class, the concept of learning is associated with competitiveness, and the selected algorithms are sports-inspired algorithms (with Football Game Algorithm, FGA and Volleyball Premier League, VPL) and Imperialist Competitive Algorithm (ICA). A real process, the synthesis of polyacrylamide-based multicomponent hydrogels, where some parameters are difficult to obtain experimentally, is considered as a case study. Reaction yield and swelling degree are predicted as a function of reaction conditions (acrylamide concentration, initiator concentration, crosslinking agent concentration, temperature, reaction time, and amount of inclusion polymer, which could be starch, poly(vinyl alcohol) or gelatin). The experimental results contain 175 data. Artificial neural networks are obtained in optimal form with biologically inspired algorithm; the optimization being perform at two level: structural and parametric. Feedforward neural networks with one or two hidden layers and no more than 25 neurons in intermediate layers were obtained with values of correlation coefficient in the validation phase over 0.90. The best results were obtained with TLBO algorithm, correlation coefficient being 0.94 for an MLP(6:9:20:2) – a feedforward neural network with two hidden layers and 9 and 20, respectively, intermediate neurons. Good results obtained prove the efficiency of the optimization algorithms. More than the good results, what is important in this approach is the simulation methodology, including neural networks and optimization biologically inspired algorithms, which provide satisfactory results. In addition, the methodology developed in this approach is general and has flexibility so that it can be easily adapted to other processes in association with different types of models.

Keywords: artificial neural networks, human behaviors of learning and cooperation, human competitive behavior, optimization algorithms

Procedia PDF Downloads 109
4865 Hexagonal Honeycomb Sandwich Plate Optimization Using Gravitational Search Algorithm

Authors: A. Boudjemai, A. Zafrane, R. Hocine

Abstract:

Honeycomb sandwich panels are increasingly used in the construction of space vehicles because of their outstanding strength, stiffness and light weight properties. However, the use of honeycomb sandwich plates comes with difficulties in the design process as a result of the large number of design variables involved, including composite material design, shape and geometry. Hence, this work deals with the presentation of an optimal design of hexagonal honeycomb sandwich structures subjected to space environment. The optimization process is performed using a set of algorithms including the gravitational search algorithm (GSA). Numerical results are obtained and presented for a set of algorithms. The results obtained by the GSA algorithm are much better compared to other algorithms used in this study.

Keywords: optimization, gravitational search algorithm, genetic algorithm, honeycomb plate

Procedia PDF Downloads 378
4864 Comparison of Back-Projection with Non-Uniform Fast Fourier Transform for Real-Time Photoacoustic Tomography

Authors: Moung Young Lee, Chul Gyu Song

Abstract:

Photoacoustic imaging is the imaging technology that combines the optical imaging and ultrasound. This provides the high contrast and resolution due to optical imaging and ultrasound imaging, respectively. We developed the real-time photoacoustic tomography (PAT) system using linear-ultrasound transducer and digital acquisition (DAQ) board. There are two types of algorithm for reconstructing the photoacoustic signal. One is back-projection algorithm, the other is FFT algorithm. Especially, we used the non-uniform FFT algorithm. To evaluate the performance of our system and algorithms, we monitored two wires that stands at interval of 2.89 mm and 0.87 mm. Then, we compared the images reconstructed by algorithms. Finally, we monitored the two hairs crossed and compared between these algorithms.

Keywords: back-projection, image comparison, non-uniform FFT, photoacoustic tomography

Procedia PDF Downloads 434
4863 Security of Database Using Chaotic Systems

Authors: Eman W. Boghdady, A. R. Shehata, M. A. Azem

Abstract:

Database (DB) security demands permitting authorized users and prohibiting non-authorized users and intruders actions on the DB and the objects inside it. Organizations that are running successfully demand the confidentiality of their DBs. They do not allow the unauthorized access to their data/information. They also demand the assurance that their data is protected against any malicious or accidental modification. DB protection and confidentiality are the security concerns. There are four types of controls to obtain the DB protection, those include: access control, information flow control, inference control, and cryptographic. The cryptographic control is considered as the backbone for DB security, it secures the DB by encryption during storage and communications. Current cryptographic techniques are classified into two types: traditional classical cryptography using standard algorithms (DES, AES, IDEA, etc.) and chaos cryptography using continuous (Chau, Rossler, Lorenz, etc.) or discreet (Logistics, Henon, etc.) algorithms. The important characteristics of chaos are its extreme sensitivity to initial conditions of the system. In this paper, DB-security systems based on chaotic algorithms are described. The Pseudo Random Numbers Generators (PRNGs) from the different chaotic algorithms are implemented using Matlab and their statistical properties are evaluated using NIST and other statistical test-suits. Then, these algorithms are used to secure conventional DB (plaintext), where the statistical properties of the ciphertext are also tested. To increase the complexity of the PRNGs and to let pass all the NIST statistical tests, we propose two hybrid PRNGs: one based on two chaotic Logistic maps and another based on two chaotic Henon maps, where each chaotic algorithm is running side-by-side and starting from random independent initial conditions and parameters (encryption keys). The resulted hybrid PRNGs passed the NIST statistical test suit.

Keywords: algorithms and data structure, DB security, encryption, chaotic algorithms, Matlab, NIST

Procedia PDF Downloads 265
4862 An Ensemble Learning Method for Applying Particle Swarm Optimization Algorithms to Systems Engineering Problems

Authors: Ken Hampshire, Thomas Mazzuchi, Shahram Sarkani

Abstract:

As a subset of metaheuristics, nature-inspired optimization algorithms such as particle swarm optimization (PSO) have shown promise both in solving intractable problems and in their extensibility to novel problem formulations due to their general approach requiring few assumptions. Unfortunately, single instantiations of algorithms require detailed tuning of parameters and cannot be proven to be best suited to a particular illustrative problem on account of the “no free lunch” (NFL) theorem. Using these algorithms in real-world problems requires exquisite knowledge of the many techniques and is not conducive to reconciling the various approaches to given classes of problems. This research aims to present a unified view of PSO-based approaches from the perspective of relevant systems engineering problems, with the express purpose of then eliciting the best solution for any problem formulation in an ensemble learning bucket of models approach. The central hypothesis of the research is that extending the PSO algorithms found in the literature to real-world optimization problems requires a general ensemble-based method for all problem formulations but a specific implementation and solution for any instance. The main results are a problem-based literature survey and a general method to find more globally optimal solutions for any systems engineering optimization problem.

Keywords: particle swarm optimization, nature-inspired optimization, metaheuristics, systems engineering, ensemble learning

Procedia PDF Downloads 99
4861 An Efficient Machine Learning Model to Detect Metastatic Cancer in Pathology Scans Using Principal Component Analysis Algorithm, Genetic Algorithm, and Classification Algorithms

Authors: Bliss Singhal

Abstract:

Machine learning (ML) is a branch of Artificial Intelligence (AI) where computers analyze data and find patterns in the data. The study focuses on the detection of metastatic cancer using ML. Metastatic cancer is the stage where cancer has spread to other parts of the body and is the cause of approximately 90% of cancer-related deaths. Normally, pathologists spend hours each day to manually classifying whether tumors are benign or malignant. This tedious task contributes to mislabeling metastasis being over 60% of the time and emphasizes the importance of being aware of human error and other inefficiencies. ML is a good candidate to improve the correct identification of metastatic cancer, saving thousands of lives and can also improve the speed and efficiency of the process, thereby taking fewer resources and time. So far, the deep learning methodology of AI has been used in research to detect cancer. This study is a novel approach to determining the potential of using preprocessing algorithms combined with classification algorithms in detecting metastatic cancer. The study used two preprocessing algorithms: principal component analysis (PCA) and the genetic algorithm, to reduce the dimensionality of the dataset and then used three classification algorithms: logistic regression, decision tree classifier, and k-nearest neighbors to detect metastatic cancer in the pathology scans. The highest accuracy of 71.14% was produced by the ML pipeline comprising of PCA, the genetic algorithm, and the k-nearest neighbor algorithm, suggesting that preprocessing and classification algorithms have great potential for detecting metastatic cancer.

Keywords: breast cancer, principal component analysis, genetic algorithm, k-nearest neighbors, decision tree classifier, logistic regression

Procedia PDF Downloads 83
4860 Semi-Supervised Hierarchical Clustering Given a Reference Tree of Labeled Documents

Authors: Ying Zhao, Xingyan Bin

Abstract:

Semi-supervised clustering algorithms have been shown effective to improve clustering process with even limited supervision. However, semi-supervised hierarchical clustering remains challenging due to the complexities of expressing constraints for agglomerative clustering algorithms. This paper proposes novel semi-supervised agglomerative clustering algorithms to build a hierarchy based on a known reference tree. We prove that by enforcing distance constraints defined by a reference tree during the process of hierarchical clustering, the resultant tree is guaranteed to be consistent with the reference tree. We also propose a framework that allows the hierarchical tree generation be aware of levels of levels of the agglomerative tree under creation, so that metric weights can be learned and adopted at each level in a recursive fashion. The experimental evaluation shows that the additional cost of our contraint-based semi-supervised hierarchical clustering algorithm (HAC) is negligible, and our combined semi-supervised HAC algorithm outperforms the state-of-the-art algorithms on real-world datasets. The experiments also show that our proposed methods can improve clustering performance even with a small number of unevenly distributed labeled data.

Keywords: semi-supervised clustering, hierarchical agglomerative clustering, reference trees, distance constraints

Procedia PDF Downloads 548
4859 Evolutionary Methods in Cryptography

Authors: Wafa Slaibi Alsharafat

Abstract:

Genetic algorithms (GA) are random algorithms as random numbers that are generated during the operation of the algorithm determine what happens. This means that if GA is applied twice to optimize exactly the same problem it might produces two different answers. In this project, we propose an evolutionary algorithm and Genetic Algorithm (GA) to be implemented in symmetric encryption and decryption. Here, user's message and user secret information (key) which represent plain text to be transferred into cipher text.

Keywords: GA, encryption, decryption, crossover

Procedia PDF Downloads 446
4858 Predicting Groundwater Areas Using Data Mining Techniques: Groundwater in Jordan as Case Study

Authors: Faisal Aburub, Wael Hadi

Abstract:

Data mining is the process of extracting useful or hidden information from a large database. Extracted information can be used to discover relationships among features, where data objects are grouped according to logical relationships; or to predict unseen objects to one of the predefined groups. In this paper, we aim to investigate four well-known data mining algorithms in order to predict groundwater areas in Jordan. These algorithms are Support Vector Machines (SVMs), Naïve Bayes (NB), K-Nearest Neighbor (kNN) and Classification Based on Association Rule (CBA). The experimental results indicate that the SVMs algorithm outperformed other algorithms in terms of classification accuracy, precision and F1 evaluation measures using the datasets of groundwater areas that were collected from Jordanian Ministry of Water and Irrigation.

Keywords: classification, data mining, evaluation measures, groundwater

Procedia PDF Downloads 281
4857 Evaluation of Photovoltaic System with Different Research Methods of Maximum Power Point Tracking

Authors: Mehdi Ameur, Ahmed Essadki, Tamou Nasser

Abstract:

The purpose of this paper is the evaluation of photovoltaic system with MPPT techniques. This system is developed by combining the models of established solar module and DC-DC converter with the algorithms of perturbing and observing (P&O), incremental conductance (INC) and fuzzy logic controller (FLC). The system is simulated under different climate conditions and MPPT algorithms to determine the influence of these conditions on characteristic power-voltage of PV system. According to the comparisons of the simulation results, the photovoltaic system can extract the maximum power with precision and rapidity using the MPPT algorithms discussed in this paper.

Keywords: fuzzy logic controller, FLC, hill climbing, HC, incremental conductance (INC), perturb and observe (P&O), maximum power point, MPP, maximum power point tracking, MPPT

Procedia PDF Downloads 511
4856 Time Truncated Group Acceptance Sampling Plans for Exponentiated Half Logistic Distribution

Authors: Srinivasa Rao Gadde

Abstract:

In this article, we considered a group acceptance sampling plans for exponentiated half logistic distribution when the life-test is truncated at a pre-specified time. It is assumed that the index parameter of the exponentiated half logistic distribution is known. The design parameters such as the number of groups and the acceptance number are obtained by satisfying the producer’s and consumer’s risks at the specified quality levels in terms of medians and 10th percentiles under the assumption that the termination time and the number of items in each group are pre-fixed. Finally, an example is given to illustration the methodology.

Keywords: group acceptance sampling plan, operating characteristic, consumer and producer’s risks, truncated life-test

Procedia PDF Downloads 341
4855 Identification of Hepatocellular Carcinoma Using Supervised Learning Algorithms

Authors: Sagri Sharma

Abstract:

Analysis of diseases integrating multi-factors increases the complexity of the problem and therefore, development of frameworks for the analysis of diseases is an issue that is currently a topic of intense research. Due to the inter-dependence of the various parameters, the use of traditional methodologies has not been very effective. Consequently, newer methodologies are being sought to deal with the problem. Supervised Learning Algorithms are commonly used for performing the prediction on previously unseen data. These algorithms are commonly used for applications in fields ranging from image analysis to protein structure and function prediction and they get trained using a known dataset to come up with a predictor model that generates reasonable predictions for the response to new data. Gene expression profiles generated by DNA analysis experiments can be quite complex since these experiments can involve hypotheses involving entire genomes. The application of well-known machine learning algorithm - Support Vector Machine - to analyze the expression levels of thousands of genes simultaneously in a timely, automated and cost effective way is thus used. The objectives to undertake the presented work are development of a methodology to identify genes relevant to Hepatocellular Carcinoma (HCC) from gene expression dataset utilizing supervised learning algorithms and statistical evaluations along with development of a predictive framework that can perform classification tasks on new, unseen data.

Keywords: artificial intelligence, biomarker, gene expression datasets, hepatocellular carcinoma, machine learning, supervised learning algorithms, support vector machine

Procedia PDF Downloads 429
4854 Pruning Algorithm for the Minimum Rule Reduct Generation

Authors: Sahin Emrah Amrahov, Fatih Aybar, Serhat Dogan

Abstract:

In this paper we consider the rule reduct generation problem. Rule Reduct Generation (RG) and Modified Rule Generation (MRG) algorithms, that are used to solve this problem, are well-known. Alternative to these algorithms, we develop Pruning Rule Generation (PRG) algorithm. We compare the PRG algorithm with RG and MRG.

Keywords: rough sets, decision rules, rule induction, classification

Procedia PDF Downloads 529
4853 AI/ML Atmospheric Parameters Retrieval Using the “Atmospheric Retrievals conditional Generative Adversarial Network (ARcGAN)”

Authors: Thomas Monahan, Nicolas Gorius, Thanh Nguyen

Abstract:

Exoplanet atmospheric parameters retrieval is a complex, computationally intensive, inverse modeling problem in which an exoplanet’s atmospheric composition is extracted from an observed spectrum. Traditional Bayesian sampling methods require extensive time and computation, involving algorithms that compare large numbers of known atmospheric models to the input spectral data. Runtimes are directly proportional to the number of parameters under consideration. These increased power and runtime requirements are difficult to accommodate in space missions where model size, speed, and power consumption are of particular importance. The use of traditional Bayesian sampling methods, therefore, compromise model complexity or sampling accuracy. The Atmospheric Retrievals conditional Generative Adversarial Network (ARcGAN) is a deep convolutional generative adversarial network that improves on the previous model’s speed and accuracy. We demonstrate the efficacy of artificial intelligence to quickly and reliably predict atmospheric parameters and present it as a viable alternative to slow and computationally heavy Bayesian methods. In addition to its broad applicability across instruments and planetary types, ARcGAN has been designed to function on low power application-specific integrated circuits. The application of edge computing to atmospheric retrievals allows for real or near-real-time quantification of atmospheric constituents at the instrument level. Additionally, edge computing provides both high-performance and power-efficient computing for AI applications, both of which are critical for space missions. With the edge computing chip implementation, ArcGAN serves as a strong basis for the development of a similar machine-learning algorithm to reduce the downlinked data volume from the Compact Ultraviolet to Visible Imaging Spectrometer (CUVIS) onboard the DAVINCI mission to Venus.

Keywords: deep learning, generative adversarial network, edge computing, atmospheric parameters retrieval

Procedia PDF Downloads 171
4852 Using Genetic Algorithms and Rough Set Based Fuzzy K-Modes to Improve Centroid Model Clustering Performance on Categorical Data

Authors: Rishabh Srivastav, Divyam Sharma

Abstract:

We propose an algorithm to cluster categorical data named as ‘Genetic algorithm initialized rough set based fuzzy K-Modes for categorical data’. We propose an amalgamation of the simple K-modes algorithm, the Rough and Fuzzy set based K-modes and the Genetic Algorithm to form a new algorithm,which we hypothesise, will provide better Centroid Model clustering results, than existing standard algorithms. In the proposed algorithm, the initialization and updation of modes is done by the use of genetic algorithms while the membership values are calculated using the rough set and fuzzy logic.

Keywords: categorical data, fuzzy logic, genetic algorithm, K modes clustering, rough sets

Procedia PDF Downloads 250
4851 Utilizing Spatial Uncertainty of On-The-Go Measurements to Design Adaptive Sampling of Soil Electrical Conductivity in a Rice Field

Authors: Ismaila Olabisi Ogundiji, Hakeem Mayowa Olujide, Qasim Usamot

Abstract:

The main reasons for site-specific management for agricultural inputs are to increase the profitability of crop production, to protect the environment and to improve products’ quality. Information about the variability of different soil attributes within a field is highly essential for the decision-making process. Lack of fast and accurate acquisition of soil characteristics remains one of the biggest limitations of precision agriculture due to being expensive and time-consuming. Adaptive sampling has been proven as an accurate and affordable sampling technique for planning within a field for site-specific management of agricultural inputs. This study employed spatial uncertainty of soil apparent electrical conductivity (ECa) estimates to identify adaptive re-survey areas in the field. The original dataset was grouped into validation and calibration groups where the calibration group was sub-grouped into three sets of different measurements pass intervals. A conditional simulation was performed on the field ECa to evaluate the ECa spatial uncertainty estimates by the use of the geostatistical technique. The grouping of high-uncertainty areas for each set was done using image segmentation in MATLAB, then, high and low area value-separate was identified. Finally, an adaptive re-survey was carried out on those areas of high-uncertainty. Adding adaptive re-surveying significantly minimized the time required for resampling whole field and resulted in ECa with minimal error. For the most spacious transect, the root mean square error (RMSE) yielded from an initial crude sampling survey was minimized after an adaptive re-survey, which was close to that value of the ECa yielded with an all-field re-survey. The estimated sampling time for the adaptive re-survey was found to be 45% lesser than that of all-field re-survey. The results indicate that designing adaptive sampling through spatial uncertainty models significantly mitigates sampling cost, and there was still conformity in the accuracy of the observations.

Keywords: soil electrical conductivity, adaptive sampling, conditional simulation, spatial uncertainty, site-specific management

Procedia PDF Downloads 134
4850 The Effect of Feature Selection on Pattern Classification

Authors: Chih-Fong Tsai, Ya-Han Hu

Abstract:

The aim of feature selection (or dimensionality reduction) is to filter out unrepresentative features (or variables) making the classifier perform better than the one without feature selection. Since there are many well-known feature selection algorithms, and different classifiers based on different selection results may perform differently, very few studies consider examining the effect of performing different feature selection algorithms on the classification performances by different classifiers over different types of datasets. In this paper, two widely used algorithms, which are the genetic algorithm (GA) and information gain (IG), are used to perform feature selection. On the other hand, three well-known classifiers are constructed, which are the CART decision tree (DT), multi-layer perceptron (MLP) neural network, and support vector machine (SVM). Based on 14 different types of datasets, the experimental results show that in most cases IG is a better feature selection algorithm than GA. In addition, the combinations of IG with DT and IG with SVM perform best and second best for small and large scale datasets.

Keywords: data mining, feature selection, pattern classification, dimensionality reduction

Procedia PDF Downloads 669
4849 Efficient Credit Card Fraud Detection Based on Multiple ML Algorithms

Authors: Neha Ahirwar

Abstract:

In the contemporary digital era, the rise of credit card fraud poses a significant threat to both financial institutions and consumers. As fraudulent activities become more sophisticated, there is an escalating demand for robust and effective fraud detection mechanisms. Advanced machine learning algorithms have become crucial tools in addressing this challenge. This paper conducts a thorough examination of the design and evaluation of a credit card fraud detection system, utilizing four prominent machine learning algorithms: random forest, logistic regression, decision tree, and XGBoost. The surge in digital transactions has opened avenues for fraudsters to exploit vulnerabilities within payment systems. Consequently, there is an urgent need for proactive and adaptable fraud detection systems. This study addresses this imperative by exploring the efficacy of machine learning algorithms in identifying fraudulent credit card transactions. The selection of random forest, logistic regression, decision tree, and XGBoost for scrutiny in this study is based on their documented effectiveness in diverse domains, particularly in credit card fraud detection. These algorithms are renowned for their capability to model intricate patterns and provide accurate predictions. Each algorithm is implemented and evaluated for its performance in a controlled environment, utilizing a diverse dataset comprising both genuine and fraudulent credit card transactions.

Keywords: efficient credit card fraud detection, random forest, logistic regression, XGBoost, decision tree

Procedia PDF Downloads 68
4848 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Application of five implementations of three data mining classification techniques was experimented for extracting important insights from tourism data. The aim was to find out the best performing algorithm among the compared ones for tourism knowledge discovery. Knowledge discovery process from data was used as a process model. 10-fold cross validation method is used for testing purpose. Various data preprocessing activities were performed to get the final dataset for model building. Classification models of the selected algorithms were built with different scenarios on the preprocessed dataset. The outperformed algorithm tourism dataset was Random Forest (76%) before applying information gain based attribute selection and J48 (C4.5) (75%) after selection of top relevant attributes to the class (target) attribute. In terms of time for model building, attribute selection improves the efficiency of all algorithms. Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented, which showed intricate, non-trivial knowledge/insight that would otherwise not be discovered by simple statistical analysis with mediocre accuracy of the machine using classification algorithms.

Keywords: classification algorithms, data mining, knowledge discovery, tourism

Procedia PDF Downloads 295
4847 An Experimental Study for Assessing Email Classification Attributes Using Feature Selection Methods

Authors: Issa Qabaja, Fadi Thabtah

Abstract:

Email phishing classification is one of the vital problems in the online security research domain that have attracted several scholars due to its impact on the users payments performed daily online. One aspect to reach a good performance by the detection algorithms in the email phishing problem is to identify the minimal set of features that significantly have an impact on raising the phishing detection rate. This paper investigate three known feature selection methods named Information Gain (IG), Chi-square and Correlation Features Set (CFS) on the email phishing problem to separate high influential features from low influential ones in phishing detection. We measure the degree of influentially by applying four data mining algorithms on a large set of features. We compare the accuracy of these algorithms on the complete features set before feature selection has been applied and after feature selection has been applied. After conducting experiments, the results show 12 common significant features have been chosen among the considered features by the feature selection methods. Further, the average detection accuracy derived by the data mining algorithms on the reduced 12-features set was very slight affected when compared with the one derived from the 47-features set.

Keywords: data mining, email classification, phishing, online security

Procedia PDF Downloads 433
4846 An Optimal Steganalysis Based Approach for Embedding Information in Image Cover Media with Security

Authors: Ahlem Fatnassi, Hamza Gharsellaoui, Sadok Bouamama

Abstract:

This paper deals with the study of interest in the fields of Steganography and Steganalysis. Steganography involves hiding information in a cover media to obtain the stego media in such a way that the cover media is perceived not to have any embedded message for its unintended recipients. Steganalysis is the mechanism of detecting the presence of hidden information in the stego media and it can lead to the prevention of disastrous security incidents. In this paper, we provide a critical review of the steganalysis algorithms available to analyze the characteristics of an image stego media against the corresponding cover media and understand the process of embedding the information and its detection. We anticipate that this paper can also give a clear picture of the current trends in steganography so that we can develop and improvise appropriate steganalysis algorithms.

Keywords: optimization, heuristics and metaheuristics algorithms, embedded systems, low-power consumption, steganalysis heuristic approach

Procedia PDF Downloads 292
4845 Cyber Attacks Management in IoT Networks Using Deep Learning and Edge Computing

Authors: Asmaa El Harat, Toumi Hicham, Youssef Baddi

Abstract:

This survey delves into the complex realm of Internet of Things (IoT) security, highlighting the urgent need for effective cybersecurity measures as IoT devices become increasingly common. It explores a wide array of cyber threats targeting IoT devices and focuses on mitigating these attacks through the combined use of deep learning and machine learning algorithms, as well as edge and cloud computing paradigms. The survey starts with an overview of the IoT landscape and the various types of attacks that IoT devices face. It then reviews key machine learning and deep learning algorithms employed in IoT cybersecurity, providing a detailed comparison to assist in selecting the most suitable algorithms. Finally, the survey provides valuable insights for cybersecurity professionals and researchers aiming to enhance security in the intricate world of IoT.

Keywords: internet of things (IoT), cybersecurity, machine learning, deep learning

Procedia PDF Downloads 34
4844 The Effects of Seasonal Variation on the Microbial-N Flow to the Small Intestine and Prediction of Feed Intake in Grazing Karayaka Sheep

Authors: Mustafa Salman, Nurcan Cetinkaya, Zehra Selcuk, Bugra Genc

Abstract:

The objectives of the present study were to estimate the microbial-N flow to the small intestine and to predict the digestible organic matter intake (DOMI) in grazing Karayaka sheep based on urinary excretion of purine derivatives (xanthine, hypoxanthine, uric acid, and allantoin) by the use of spot urine sampling under field conditions. In the trial, 10 Karayaka sheep from 2 to 3 years of age were used. The animals were grazed in a pasture for ten months and fed with concentrate and vetch plus oat hay for the other two months (January and February) indoors. Highly significant linear and cubic relationships (P<0.001) were found among months for purine derivatives index, purine derivatives excretion, purine derivatives absorption, microbial-N and DOMI. Through urine sampling and the determination of levels of excreted urinary PD and Purine Derivatives / Creatinine ratio (PDC index), microbial-N values were estimated and they indicated that the protein nutrition of the sheep was insufficient. In conclusion, the prediction of protein nutrition of sheep under the field conditions may be possible with the use of spot urine sampling, urinary excreted PD and PDC index. The mean purine derivative levels in spot urine samples from sheep were highest in June, July and October. Protein nutrition of pastured sheep may be affected by weather changes, including rainfall. Spot urine sampling may useful in modeling the feed consumption of pasturing sheep. However, further studies are required under different field conditions with different breeds of sheep to develop spot urine sampling as a model.

Keywords: Karayaka sheep, spot sampling, urinary purine derivatives, PDC index, microbial-N, feed intake

Procedia PDF Downloads 529
4843 The Impact of Temporal Impairment on Quality of Experience (QoE) in Video Streaming: A No Reference (NR) Subjective and Objective Study

Authors: Muhammad Arslan Usman, Muhammad Rehan Usman, Soo Young Shin

Abstract:

Live video streaming is one of the most widely used service among end users, yet it is a big challenge for the network operators in terms of quality. The only way to provide excellent Quality of Experience (QoE) to the end users is continuous monitoring of live video streaming. For this purpose, there are several objective algorithms available that monitor the quality of the video in a live stream. Subjective tests play a very important role in fine tuning the results of objective algorithms. As human perception is considered to be the most reliable source for assessing the quality of a video stream, subjective tests are conducted in order to develop more reliable objective algorithms. Temporal impairments in a live video stream can have a negative impact on the end users. In this paper we have conducted subjective evaluation tests on a set of video sequences containing temporal impairment known as frame freezing. Frame Freezing is considered as a transmission error as well as a hardware error which can result in loss of video frames on the reception side of a transmission system. In our subjective tests, we have performed tests on videos that contain a single freezing event and also for videos that contain multiple freezing events. We have recorded our subjective test results for all the videos in order to give a comparison on the available No Reference (NR) objective algorithms. Finally, we have shown the performance of no reference algorithms used for objective evaluation of videos and suggested the algorithm that works better. The outcome of this study shows the importance of QoE and its effect on human perception. The results for the subjective evaluation can serve the purpose for validating objective algorithms.

Keywords: objective evaluation, subjective evaluation, quality of experience (QoE), video quality assessment (VQA)

Procedia PDF Downloads 602
4842 Neuroevolution Based on Adaptive Ensembles of Biologically Inspired Optimization Algorithms Applied for Modeling a Chemical Engineering Process

Authors: Sabina-Adriana Floria, Marius Gavrilescu, Florin Leon, Silvia Curteanu, Costel Anton

Abstract:

Neuroevolution is a subfield of artificial intelligence used to solve various problems in different application areas. Specifically, neuroevolution is a technique that applies biologically inspired methods to generate neural network architectures and optimize their parameters automatically. In this paper, we use different biologically inspired optimization algorithms in an ensemble strategy with the aim of training multilayer perceptron neural networks, resulting in regression models used to simulate the industrial chemical process of obtaining bricks from silicone-based materials. Installations in the raw ceramics industry, i.e., bricks, are characterized by significant energy consumption and large quantities of emissions. In addition, the initial conditions that were taken into account during the design and commissioning of the installation can change over time, which leads to the need to add new mixes to adjust the operating conditions for the desired purpose, e.g., material properties and energy saving. The present approach follows the study by simulation of a process of obtaining bricks from silicone-based materials, i.e., the modeling and optimization of the process. Optimization aims to determine the working conditions that minimize the emissions represented by nitrogen monoxide. We first use a search procedure to find the best values for the parameters of various biologically inspired optimization algorithms. Then, we propose an adaptive ensemble strategy that uses only a subset of the best algorithms identified in the search stage. The adaptive ensemble strategy combines the results of selected algorithms and automatically assigns more processing capacity to the more efficient algorithms. Their efficiency may also vary at different stages of the optimization process. In a given ensemble iteration, the most efficient algorithms aim to maintain good convergence, while the less efficient algorithms can improve population diversity. The proposed adaptive ensemble strategy outperforms the individual optimizers and the non-adaptive ensemble strategy in convergence speed, and the obtained results provide lower error values.

Keywords: optimization, biologically inspired algorithm, neuroevolution, ensembles, bricks, emission minimization

Procedia PDF Downloads 118
4841 A General Framework for Knowledge Discovery from Echocardiographic and Natural Images

Authors: S. Nandagopalan, N. Pradeep

Abstract:

The aim of this paper is to propose a general framework for storing, analyzing, and extracting knowledge from two-dimensional echocardiographic images, color Doppler images, non-medical images, and general data sets. A number of high performance data mining algorithms have been used to carry out this task. Our framework encompasses four layers namely physical storage, object identification, knowledge discovery, user level. Techniques such as active contour model to identify the cardiac chambers, pixel classification to segment the color Doppler echo image, universal model for image retrieval, Bayesian method for classification, parallel algorithms for image segmentation, etc., were employed. Using the feature vector database that have been efficiently constructed, one can perform various data mining tasks like clustering, classification, etc. with efficient algorithms along with image mining given a query image. All these facilities are included in the framework that is supported by state-of-the-art user interface (UI). The algorithms were tested with actual patient data and Coral image database and the results show that their performance is better than the results reported already.

Keywords: active contour, Bayesian, echocardiographic image, feature vector

Procedia PDF Downloads 445
4840 Analysis and Modeling of Photovoltaic System with Different Research Methods of Maximum Power Point Tracking

Authors: Mehdi Ameur, Ahmed Essakdi, Tamou Nasser

Abstract:

The purpose of this paper is the analysis and modeling of the photovoltaic system with MPPT techniques. This system is developed by combining the models of established solar module and DC-DC converter with the algorithms of perturb and observe (P&O), incremental conductance (INC) and fuzzy logic controller(FLC). The system is simulated under different climate conditions and MPPT algorithms to determine the influence of these conditions on characteristic power-voltage of PV system. According to the comparisons of the simulation results, the photovoltaic system can extract the maximum power with precision and rapidity using the MPPT algorithms discussed in this paper.

Keywords: photovoltaic array, maximum power point tracking, MPPT, perturb and observe, P&O, incremental conductance, INC, hill climbing, HC, fuzzy logic controller, FLC

Procedia PDF Downloads 429
4839 Evaluation of Dual Polarization Rainfall Estimation Algorithm Applicability in Korea: A Case Study on Biseulsan Radar

Authors: Chulsang Yoo, Gildo Kim

Abstract:

Dual polarization radar provides comprehensive information about rainfall by measuring multiple parameters. In Korea, for the rainfall estimation, JPOLE and CSU-HIDRO algorithms are generally used. This study evaluated the local applicability of JPOLE and CSU-HIDRO algorithms in Korea by using the observed rainfall data collected on August, 2014 by the Biseulsan dual polarization radar data and KMA AWS. A total of 11,372 pairs of radar-ground rain rate data were classified according to thresholds of synthetic algorithms into suitable and unsuitable data. Then, evaluation criteria were derived by comparing radar rain rate and ground rain rate, respectively, for entire, suitable, unsuitable data. The results are as follows: (1) The radar rain rate equation including KDP, was found better in the rainfall estimation than the other equations for both JPOLE and CSU-HIDRO algorithms. The thresholds were found to be adequately applied for both algorithms including specific differential phase. (2) The radar rain rate equation including horizontal reflectivity and differential reflectivity were found poor compared to the others. The result was not improved even when only the suitable data were applied. Acknowledgments: This work was supported by the Basic Science Research Program through the National Research Foundation of Korea, funded by the Ministry of Education (NRF-2013R1A1A2011012).

Keywords: CSU-HIDRO algorithm, dual polarization radar, JPOLE algorithm, radar rainfall estimation algorithm

Procedia PDF Downloads 215