Search results for: segmentation algorithms
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2370

Search results for: segmentation algorithms

1740 Delaunay Triangulations Efficiency for Conduction-Convection Problems

Authors: Bashar Albaalbaki, Roger E. Khayat

Abstract:

This work is a comparative study on the effect of Delaunay triangulation algorithms on discretization error for conduction-convection conservation problems. A structured triangulation and many unstructured Delaunay triangulations using three popular algorithms for node placement strategies are used. The numerical method employed is the vertex-centered finite volume method. It is found that when the computational domain can be meshed using a structured triangulation, the discretization error is lower for structured triangulations compared to unstructured ones for only low Peclet number values, i.e. when conduction is dominant. However, as the Peclet number is increased and convection becomes more significant, the unstructured triangulations reduce the discretization error. Also, no statistical correlation between triangulation angle extremums and the discretization error is found using 200 samples of randomly generated Delaunay and non-Delaunay triangulations. Thus, the angle extremums cannot be an indicator of the discretization error on their own and need to be combined with other triangulation quality measures, which is the subject of further studies.

Keywords: conduction-convection problems, Delaunay triangulation, discretization error, finite volume method

Procedia PDF Downloads 98
1739 FlexPoints: Efficient Algorithm for Detection of Electrocardiogram Characteristic Points

Authors: Daniel Bulanda, Janusz A. Starzyk, Adrian Horzyk

Abstract:

The electrocardiogram (ECG) is one of the most commonly used medical tests, essential for correct diagnosis and treatment of the patient. While ECG devices generate a huge amount of data, only a small part of them carries valuable medical information. To deal with this problem, many compression algorithms and filters have been developed over the past years. However, the rapid development of new machine learning techniques poses new challenges. To address this class of problems, we created the FlexPoints algorithm that searches for characteristic points on the ECG signal and ignores all other points that do not carry relevant medical information. The conducted experiments proved that the presented algorithm can significantly reduce the number of data points which represents ECG signal without losing valuable medical information. These sparse but essential characteristic points (flex points) can be a perfect input for some modern machine learning models, which works much better using flex points as an input instead of raw data or data compressed by many popular algorithms.

Keywords: characteristic points, electrocardiogram, ECG, machine learning, signal compression

Procedia PDF Downloads 159
1738 Comparison Between Genetic Algorithms and Particle Swarm Optimization Optimized Proportional Integral Derirative and PSS for Single Machine Infinite System

Authors: Benalia Nadia, Zerzouri Nora, Ben Si Ali Nadia

Abstract:

Abstract: Among the many different modern heuristic optimization methods, genetic algorithms (GA) and the particle swarm optimization (PSO) technique have been attracting a lot of interest. The GA has gained popularity in academia and business mostly because to its simplicity, ability to solve highly nonlinear mixed integer optimization problems that are typical of complex engineering systems, and intuitiveness. The mechanics of the PSO methodology, a relatively recent heuristic search tool, are modeled after the swarming or cooperative behavior of biological groups. It is suitable to compare the performance of the two techniques since they both aim to solve a particular objective function but make use of distinct computing methods. In this article, PSO and GA optimization approaches are used for the parameter tuning of the power system stabilizer and Proportional integral derivative regulator. Load angle and rotor speed variations in the single machine infinite bus bar system is used to measure the performance of the suggested solution.

Keywords: SMIB, genetic algorithm, PSO, transient stability, power system stabilizer, PID

Procedia PDF Downloads 74
1737 ACO-TS: an ACO-based Algorithm for Optimizing Cloud Task Scheduling

Authors: Fahad Y. Al-dawish

Abstract:

The current trend by a large number of organizations and individuals to use cloud computing. Many consider it a significant shift in the field of computing. Cloud computing are distributed and parallel systems consisting of a collection of interconnected physical and virtual machines. With increasing request and profit of cloud computing infrastructure, diverse computing processes can be executed on cloud environment. Many organizations and individuals around the world depend on the cloud computing environments infrastructure to carry their applications, platform, and infrastructure. One of the major and essential issues in this environment related to allocating incoming tasks to suitable virtual machine (cloud task scheduling). Cloud task scheduling is classified as optimization problem, and there are several meta-heuristic algorithms have been anticipated to solve and optimize this problem. Good task scheduler should execute its scheduling technique on altering environment and the types of incoming task set. In this research project a cloud task scheduling methodology based on ant colony optimization ACO algorithm, we call it ACO-TS Ant Colony Optimization for Task Scheduling has been proposed and compared with different scheduling algorithms (Random, First Come First Serve FCFS, and Fastest Processor to the Largest Task First FPLTF). Ant Colony Optimization (ACO) is random optimization search method that will be used for assigning incoming tasks to available virtual machines VMs. The main role of proposed algorithm is to minimizing the makespan of certain tasks set and maximizing resource utilization by balance the load among virtual machines. The proposed scheduling algorithm was evaluated by using Cloudsim toolkit framework. Finally after analyzing and evaluating the performance of experimental results we find that the proposed algorithm ACO-TS perform better than Random, FCFS, and FPLTF algorithms in each of the makespaan and resource utilization.

Keywords: cloud Task scheduling, ant colony optimization (ACO), cloudsim, cloud computing

Procedia PDF Downloads 418
1736 Prediction of All-Beta Protein Secondary Structure Using Garnier-Osguthorpe-Robson Method

Authors: K. Tejasri, K. Suvarna Vani, S. Prathyusha, S. Ramya

Abstract:

Proteins are chained sequences of amino acids which are brought together by the peptide bonds. Many varying formations of the chains are possible due to multiple combinations of amino acids and rotation in numerous positions along the chain. Protein structure prediction is one of the crucial goals worked towards by the members of bioinformatics and theoretical chemistry backgrounds. Among the four different structure levels in proteins, we emphasize mainly the secondary level structure. Generally, the secondary protein basically comprises alpha-helix and beta-sheets. Multi-class classification problem of data with disparity is truly a challenge to overcome and has to be addressed for the beta strands. Imbalanced data distribution constitutes a couple of the classes of data having very limited training samples collated with other classes. The secondary structure data is extracted from the protein primary sequence, and the beta-strands are predicted using suitable machine learning algorithms.

Keywords: proteins, secondary structure elements, beta-sheets, beta-strands, alpha-helices, machine learning algorithms

Procedia PDF Downloads 91
1735 Hidro-IA: An Artificial Intelligent Tool Applied to Optimize the Operation Planning of Hydrothermal Systems with Historical Streamflow

Authors: Thiago Ribeiro de Alencar, Jacyro Gramulia Junior, Patricia Teixeira Leite

Abstract:

The area of the electricity sector that deals with energy needs by the hydroelectric in a coordinated manner is called Operation Planning of Hydrothermal Power Systems (OPHPS). The purpose of this is to find a political operative to provide electrical power to the system in a given period, with reliability and minimal cost. Therefore, it is necessary to determine an optimal schedule of generation for each hydroelectric, each range, so that the system meets the demand reliably, avoiding rationing in years of severe drought, and that minimizes the expected cost of operation during the planning, defining an appropriate strategy for thermal complementation. Several optimization algorithms specifically applied to this problem have been developed and are used. Although providing solutions to various problems encountered, these algorithms have some weaknesses, difficulties in convergence, simplification of the original formulation of the problem, or owing to the complexity of the objective function. An alternative to these challenges is the development of techniques for simulation optimization and more sophisticated and reliable, it can assist the planning of the operation. Thus, this paper presents the development of a computational tool, namely Hydro-IA for solving optimization problem identified and to provide the User an easy handling. Adopted as intelligent optimization technique is Genetic Algorithm (GA) and programming language is Java. First made the modeling of the chromosomes, then implemented the function assessment of the problem and the operators involved, and finally the drafting of the graphical interfaces for access to the User. The results with the Genetic Algorithms were compared with the optimization technique nonlinear programming (NLP). Tests were conducted with seven hydroelectric plants interconnected hydraulically with historical stream flow from 1953 to 1955. The results of comparison between the GA and NLP techniques shows that the cost of operating the GA becomes increasingly smaller than the NLP when the number of hydroelectric plants interconnected increases. The program has managed to relate a coherent performance in problem resolution without the need for simplification of the calculations together with the ease of manipulating the parameters of simulation and visualization of output results.

Keywords: energy, optimization, hydrothermal power systems, artificial intelligence and genetic algorithms

Procedia PDF Downloads 417
1734 Improving the Performances of the nMPRA Architecture by Implementing Specific Functions in Hardware

Authors: Ionel Zagan, Vasile Gheorghita Gaitan

Abstract:

Minimizing the response time to asynchronous events in a real-time system is an important factor in increasing the speed of response and an interesting concept in designing equipment fast enough for the most demanding applications. The present article will present the results regarding the validation of the nMPRA (Multi Pipeline Register Architecture) architecture using the FPGA Virtex-7 circuit. The nMPRA concept is a hardware processor with the scheduler implemented at the processor level; this is done without affecting a possible bus communication, as is the case with the other CPU solutions. The implementation of static or dynamic scheduling operations in hardware and the improvement of handling interrupts and events by the real-time executive described in the present article represent a key solution for eliminating the overhead of the operating system functions. The nMPRA processor is capable of executing a preemptive scheduling, using various algorithms without a software scheduler. Therefore, we have also presented various scheduling methods and algorithms used in scheduling the real-time tasks.

Keywords: nMPRA architecture, pipeline processor, preemptive scheduling, real-time system

Procedia PDF Downloads 363
1733 Maximum Likelihood Estimation Methods on a Two-Parameter Rayleigh Distribution under Progressive Type-Ii Censoring

Authors: Daniel Fundi Murithi

Abstract:

Data from economic, social, clinical, and industrial studies are in some way incomplete or incorrect due to censoring. Such data may have adverse effects if used in the estimation problem. We propose the use of Maximum Likelihood Estimation (MLE) under a progressive type-II censoring scheme to remedy this problem. In particular, maximum likelihood estimates (MLEs) for the location (µ) and scale (λ) parameters of two Parameter Rayleigh distribution are realized under a progressive type-II censoring scheme using the Expectation-Maximization (EM) and the Newton-Raphson (NR) algorithms. These algorithms are used comparatively because they iteratively produce satisfactory results in the estimation problem. The progressively type-II censoring scheme is used because it allows the removal of test units before the termination of the experiment. Approximate asymptotic variances and confidence intervals for the location and scale parameters are derived/constructed. The efficiency of EM and the NR algorithms is compared given root mean squared error (RMSE), bias, and the coverage rate. The simulation study showed that in most sets of simulation cases, the estimates obtained using the Expectation-maximization algorithm had small biases, small variances, narrower/small confidence intervals width, and small root of mean squared error compared to those generated via the Newton-Raphson (NR) algorithm. Further, the analysis of a real-life data set (data from simple experimental trials) showed that the Expectation-Maximization (EM) algorithm performs better compared to Newton-Raphson (NR) algorithm in all simulation cases under the progressive type-II censoring scheme.

Keywords: expectation-maximization algorithm, maximum likelihood estimation, Newton-Raphson method, two-parameter Rayleigh distribution, progressive type-II censoring

Procedia PDF Downloads 157
1732 Detect Circles in Image: Using Statistical Image Analysis

Authors: Fathi M. O. Hamed, Salma F. Elkofhaifee

Abstract:

The aim of this work is to detect geometrical shape objects in an image. In this paper, the object is considered to be as a circle shape. The identification requires find three characteristics, which are number, size, and location of the object. To achieve the goal of this work, this paper presents an algorithm that combines from some of statistical approaches and image analysis techniques. This algorithm has been implemented to arrive at the major objectives in this paper. The algorithm has been evaluated by using simulated data, and yields good results, and then it has been applied to real data.

Keywords: image processing, median filter, projection, scale-space, segmentation, threshold

Procedia PDF Downloads 427
1731 Lotus Mechanism: Validation of Deployment Mechanism Using Structural and Dynamic Analysis

Authors: Parth Prajapati, A. R. Srinivas

Abstract:

The purpose of this paper is to validate the concept of the Lotus Mechanism using Computer Aided Engineering (CAE) tools considering the statics and dynamics through actual time dependence involving inertial forces acting on the mechanism joints. For a 1.2 m mirror made of hexagonal segments, with simple harnesses and three-point supports, the maximum diameter is 400 mm, minimum segment base thickness is 1.5 mm, and maximum rib height is considered as 12 mm. Manufacturing challenges are explored for the segments using manufacturing research and development approaches to enable use of large lightweight mirrors required for the future space system.

Keywords: dynamics, manufacturing, reflectors, segmentation, statics

Procedia PDF Downloads 368
1730 Hybrid Deep Learning and FAST-BRISK 3D Object Detection Technique for Bin-Picking Application

Authors: Thanakrit Taweesoontorn, Sarucha Yanyong, Poom Konghuayrob

Abstract:

Robotic arms have gained popularity in various industries due to their accuracy and efficiency. This research proposes a method for bin-picking tasks using the Cobot, combining the YOLOv5 CNNs model for object detection and pose estimation with traditional feature detection (FAST), feature description (BRISK), and matching algorithms. By integrating these algorithms and utilizing a small-scale depth sensor camera for capturing depth and color images, the system achieves real-time object detection and accurate pose estimation, enabling the robotic arm to pick objects correctly in both position and orientation. Furthermore, the proposed method is implemented within the ROS framework to provide a seamless platform for robotic control and integration. This integration of robotics, cameras, and AI technology contributes to the development of industrial robotics, opening up new possibilities for automating challenging tasks and improving overall operational efficiency.

Keywords: robotic vision, image processing, applications of robotics, artificial intelligent

Procedia PDF Downloads 90
1729 Learning Algorithms for Fuzzy Inference Systems Composed of Double- and Single-Input Rule Modules

Authors: Hirofumi Miyajima, Kazuya Kishida, Noritaka Shigei, Hiromi Miyajima

Abstract:

Most of self-tuning fuzzy systems, which are automatically constructed from learning data, are based on the steepest descent method (SDM). However, this approach often requires a large convergence time and gets stuck into a shallow local minimum. One of its solutions is to use fuzzy rule modules with a small number of inputs such as DIRMs (Double-Input Rule Modules) and SIRMs (Single-Input Rule Modules). In this paper, we consider a (generalized) DIRMs model composed of double and single-input rule modules. Further, in order to reduce the redundant modules for the (generalized) DIRMs model, pruning and generative learning algorithms for the model are suggested. In order to show the effectiveness of them, numerical simulations for function approximation, Box-Jenkins and obstacle avoidance problems are performed.

Keywords: Box-Jenkins's problem, double-input rule module, fuzzy inference model, obstacle avoidance, single-input rule module

Procedia PDF Downloads 350
1728 Machine Learning Approach for Yield Prediction in Semiconductor Production

Authors: Heramb Somthankar, Anujoy Chakraborty

Abstract:

This paper presents a classification study on yield prediction in semiconductor production using machine learning approaches. A complicated semiconductor production process is generally monitored continuously by signals acquired from sensors and measurement sites. A monitoring system contains a variety of signals, all of which contain useful information, irrelevant information, and noise. In the case of each signal being considered a feature, "Feature Selection" is used to find the most relevant signals. The open-source UCI SECOM Dataset provides 1567 such samples, out of which 104 fail in quality assurance. Feature extraction and selection are performed on the dataset, and useful signals were considered for further study. Afterward, common machine learning algorithms were employed to predict whether the signal yields pass or fail. The most relevant algorithm is selected for prediction based on the accuracy and loss of the ML model.

Keywords: deep learning, feature extraction, feature selection, machine learning classification algorithms, semiconductor production monitoring, signal processing, time-series analysis

Procedia PDF Downloads 104
1727 Blind Super-Resolution Reconstruction Based on PSF Estimation

Authors: Osama A. Omer, Amal Hamed

Abstract:

Successful blind image Super-Resolution algorithms require the exact estimation of the Point Spread Function (PSF). In the absence of any prior information about the imagery system and the true image; this estimation is normally done by trial and error experimentation until an acceptable restored image quality is obtained. Multi-frame blind Super-Resolution algorithms often have disadvantages of slow convergence and sensitiveness to complex noises. This paper presents a Super-Resolution image reconstruction algorithm based on estimation of the PSF that yields the optimum restored image quality. The estimation of PSF is performed by the knife-edge method and it is implemented by measuring spreading of the edges in the reproduced HR image itself during the reconstruction process. The proposed image reconstruction approach is using L1 norm minimization and robust regularization based on a bilateral prior to deal with different data and noise models. A series of experiment results show that the proposed method can outperform other previous work robustly and efficiently.

Keywords: blind, PSF, super-resolution, knife-edge, blurring, bilateral, L1 norm

Procedia PDF Downloads 361
1726 Multi-Spectral Medical Images Enhancement Using a Weber’s law

Authors: Muna F. Al-Sammaraie

Abstract:

The aim of this research is to present a multi spectral image enhancement methods used to achieve highly real digital image populates only a small portion of the available range of digital values. Also, a quantitative measure of image enhancement is presented. This measure is related with concepts of the Webers Low of the human visual system. For decades, several image enhancement techniques have been proposed. Although most techniques require profuse amount of advance and critical steps, the result for the perceive image are not as satisfied. This study involves changing the original values so that more of the available range is used; then increases the contrast between features and their backgrounds. It consists of reading the binary image on the basis of pixels taking them byte-wise and displaying it, calculating the statistics of an image, automatically enhancing the color of the image based on statistics calculation using algorithms and working with RGB color bands. Finally, the enhanced image is displayed along with image histogram. A number of experimental results illustrated the performance of these algorithms. Particularly the quantitative measure has helped to select optimal processing parameters: the best parameters and transform.

Keywords: image enhancement, multi-spectral, RGB, histogram

Procedia PDF Downloads 325
1725 Automatic Target Recognition in SAR Images Based on Sparse Representation Technique

Authors: Ahmet Karagoz, Irfan Karagoz

Abstract:

Synthetic Aperture Radar (SAR) is a radar mechanism that can be integrated into manned and unmanned aerial vehicles to create high-resolution images in all weather conditions, regardless of day and night. In this study, SAR images of military vehicles with different azimuth and descent angles are pre-processed at the first stage. The main purpose here is to reduce the high speckle noise found in SAR images. For this, the Wiener adaptive filter, the mean filter, and the median filters are used to reduce the amount of speckle noise in the images without causing loss of data. During the image segmentation phase, pixel values are ordered so that the target vehicle region is separated from other regions containing unnecessary information. The target image is parsed with the brightest 20% pixel value of 255 and the other pixel values of 0. In addition, by using appropriate parameters of statistical region merging algorithm, segmentation comparison is performed. In the step of feature extraction, the feature vectors belonging to the vehicles are obtained by using Gabor filters with different orientation, frequency and angle values. A number of Gabor filters are created by changing the orientation, frequency and angle parameters of the Gabor filters to extract important features of the images that form the distinctive parts. Finally, images are classified by sparse representation method. In the study, l₁ norm analysis of sparse representation is used. A joint database of the feature vectors generated by the target images of military vehicle types is obtained side by side and this database is transformed into the matrix form. In order to classify the vehicles in a similar way, the test images of each vehicle is converted to the vector form and l₁ norm analysis of the sparse representation method is applied through the existing database matrix form. As a result, correct recognition has been performed by matching the target images of military vehicles with the test images by means of the sparse representation method. 97% classification success of SAR images of different military vehicle types is obtained.

Keywords: automatic target recognition, sparse representation, image classification, SAR images

Procedia PDF Downloads 360
1724 Building Scalable and Accurate Hybrid Kernel Mapping Recommender

Authors: Hina Iqbal, Mustansar Ali Ghazanfar, Sandor Szedmak

Abstract:

Recommender systems uses artificial intelligence practices for filtering obscure information and can predict if a user likes a specified item. Kernel mapping Recommender systems have been proposed which are accurate and state-of-the-art algorithms and resolve recommender system’s design objectives such as; long tail, cold-start, and sparsity. The aim of research is to propose hybrid framework that can efficiently integrate different versions— namely item-based and user-based KMR— of KMR algorithm. We have proposed various heuristic algorithms that integrate different versions of KMR (into a unified framework) resulting in improved accuracy and elimination of problems associated with conventional recommender system. We have tested our system on publically available movies dataset and benchmark with KMR. The results (in terms of accuracy, precision, recall, F1 measure and ROC metrics) reveal that the proposed algorithm is quite accurate especially under cold-start and sparse scenarios.

Keywords: Kernel Mapping Recommender Systems, hybrid recommender systems, cold start, sparsity, long tail

Procedia PDF Downloads 332
1723 Low Overhead Dynamic Channel Selection with Cluster-Based Spatial-Temporal Station Reporting in Wireless Networks

Authors: Zeyad Abdelmageid, Xianbin Wang

Abstract:

Choosing the operational channel for a WLAN access point (AP) in WLAN networks has been a static channel assignment process initiated by the user during the deployment process of the AP, which fails to cope with the dynamic conditions of the assigned channel at the station side afterward. However, the dramatically growing number of Wi-Fi APs and stations operating in the unlicensed band has led to dynamic, distributed, and often severe interference. This highlights the urgent need for the AP to dynamically select the best overall channel of operation for the basic service set (BSS) by considering the distributed and changing channel conditions at all stations. Consequently, dynamic channel selection algorithms which consider feedback from the station side have been developed. Despite the significant performance improvement, existing channel selection algorithms suffer from very high feedback overhead. Feedback latency from the STAs, due to the high overhead, can cause the eventually selected channel to no longer be optimal for operation due to the dynamic sharing nature of the unlicensed band. This has inspired us to develop our own dynamic channel selection algorithm with reduced overhead through the proposed low-overhead, cluster-based station reporting mechanism. The main idea behind the cluster-based station reporting is the observation that STAs which are very close to each other tend to have very similar channel conditions. Instead of requesting each STA to report on every candidate channel while causing high overhead, the AP divides STAs into clusters then assigns each STA in each cluster one channel to report feedback on. With the proper design of the cluster based reporting, the AP does not lose any information about the channel conditions at the station side while reducing feedback overhead. The simulation results show equal performance and, at times, better performance with a fraction of the overhead. We believe that this algorithm has great potential in designing future dynamic channel selection algorithms with low overhead.

Keywords: channel assignment, Wi-Fi networks, clustering, DBSCAN, overhead

Procedia PDF Downloads 113
1722 Profit-Based Artificial Neural Network (ANN) Trained by Migrating Birds Optimization: A Case Study in Credit Card Fraud Detection

Authors: Ashkan Zakaryazad, Ekrem Duman

Abstract:

A typical classification technique ranks the instances in a data set according to the likelihood of belonging to one (positive) class. A credit card (CC) fraud detection model ranks the transactions in terms of probability of being fraud. In fact, this approach is often criticized, because firms do not care about fraud probability but about the profitability or costliness of detecting a fraudulent transaction. The key contribution in this study is to focus on the profit maximization in the model building step. The artificial neural network proposed in this study works based on profit maximization instead of minimizing the error of prediction. Moreover, some studies have shown that the back propagation algorithm, similar to other gradient–based algorithms, usually gets trapped in local optima and swarm-based algorithms are more successful in this respect. In this study, we train our profit maximization ANN using the Migrating Birds optimization (MBO) which is introduced to literature recently.

Keywords: neural network, profit-based neural network, sum of squared errors (SSE), MBO, gradient descent

Procedia PDF Downloads 470
1721 An Analysis on Clustering Based Gene Selection and Classification for Gene Expression Data

Authors: K. Sathishkumar, V. Thiagarasu

Abstract:

Due to recent advances in DNA microarray technology, it is now feasible to obtain gene expression profiles of tissue samples at relatively low costs. Many scientists around the world use the advantage of this gene profiling to characterize complex biological circumstances and diseases. Microarray techniques that are used in genome-wide gene expression and genome mutation analysis help scientists and physicians in understanding of the pathophysiological mechanisms, in diagnoses and prognoses, and choosing treatment plans. DNA microarray technology has now made it possible to simultaneously monitor the expression levels of thousands of genes during important biological processes and across collections of related samples. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. A first step toward addressing this challenge is the use of clustering techniques, which is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. This work presents an analysis of several clustering algorithms proposed to deals with the gene expression data effectively. The existing clustering algorithms like Support Vector Machine (SVM), K-means algorithm and evolutionary algorithm etc. are analyzed thoroughly to identify the advantages and limitations. The performance evaluation of the existing algorithms is carried out to determine the best approach. In order to improve the classification performance of the best approach in terms of Accuracy, Convergence Behavior and processing time, a hybrid clustering based optimization approach has been proposed.

Keywords: microarray technology, gene expression data, clustering, gene Selection

Procedia PDF Downloads 321
1720 Detecting Paraphrases in Arabic Text

Authors: Amal Alshahrani, Allan Ramsay

Abstract:

Paraphrasing is one of the important tasks in natural language processing; i.e. alternative ways to express the same concept by using different words or phrases. Paraphrases can be used in many natural language applications, such as Information Retrieval, Machine Translation, Question Answering, Text Summarization, or Information Extraction. To obtain pairs of sentences that are paraphrases we create a system that automatically extracts paraphrases from a corpus, which is built from different sources of news article since these are likely to contain paraphrases when they report the same event on the same day. There are existing simple standard approaches (e.g. TF-IDF vector space, cosine similarity) and alignment technique (e.g. Dynamic Time Warping (DTW)) for extracting paraphrase which have been applied to the English. However, the performance of these approaches could be affected when they are applied to another language, for instance Arabic language, due to the presence of phenomena which are not present in English, such as Free Word Order, Zero copula, and Pro-dropping. These phenomena will affect the performance of these algorithms. Thus, if we can analysis how the existing algorithms for English fail for Arabic then we can find a solution for Arabic. The results are promising.

Keywords: natural language processing, TF-IDF, cosine similarity, dynamic time warping (DTW)

Procedia PDF Downloads 379
1719 Personalize E-Learning System Based on Clustering and Sequence Pattern Mining Approach

Authors: H. S. Saini, K. Vijayalakshmi, Rishi Sayal

Abstract:

Network-based education has been growing rapidly in size and quality. Knowledge clustering becomes more important in personalized information retrieval for web-learning. A personalized-Learning service after the learners’ knowledge has been classified with clustering. Through automatic analysis of learners’ behaviors, their partition with similar data level and interests may be discovered so as to produce learners with contents that best match educational needs for collaborative learning. We present a specific mining tool and a recommender engine that we have integrated in the online learning in order to help the teacher to carry out the whole e-learning process. We propose to use sequential pattern mining algorithms to discover the most used path by the students and from this information can recommend links to the new students automatically meanwhile they browse in the course. We have Developed a specific author tool in order to help the teacher to apply all the data mining process. We tend to report on many experiments with real knowledge so as to indicate the quality of using both clustering and sequential pattern mining algorithms together for discovering personalized e-learning systems.

Keywords: e-learning, cluster, personalization, sequence, pattern

Procedia PDF Downloads 424
1718 Implications of Optimisation Algorithm on the Forecast Performance of Artificial Neural Network for Streamflow Modelling

Authors: Martins Y. Otache, John J. Musa, Abayomi I. Kuti, Mustapha Mohammed

Abstract:

The performance of an artificial neural network (ANN) is contingent on a host of factors, for instance, the network optimisation scheme. In view of this, the study examined the general implications of the ANN training optimisation algorithm on its forecast performance. To this end, the Bayesian regularisation (Br), Levenberg-Marquardt (LM), and the adaptive learning gradient descent: GDM (with momentum) algorithms were employed under different ANN structural configurations: (1) single-hidden layer, and (2) double-hidden layer feedforward back propagation network. Results obtained revealed generally that the gradient descent with momentum (GDM) optimisation algorithm, with its adaptive learning capability, used a relatively shorter time in both training and validation phases as compared to the Levenberg- Marquardt (LM) and Bayesian Regularisation (Br) algorithms though learning may not be consummated; i.e., in all instances considering also the prediction of extreme flow conditions for 1-day and 5-day ahead, respectively especially using the ANN model. In specific statistical terms on the average, model performance efficiency using the coefficient of efficiency (CE) statistic were Br: 98%, 94%; LM: 98 %, 95 %, and GDM: 96 %, 96% respectively for training and validation phases. However, on the basis of relative error distribution statistics (MAE, MAPE, and MSRE), GDM performed better than the others overall. Based on the findings, it is imperative to state that the adoption of ANN for real-time forecasting should employ training algorithms that do not have computational overhead like the case of LM that requires the computation of the Hessian matrix, protracted time, and sensitivity to initial conditions; to this end, Br and other forms of the gradient descent with momentum should be adopted considering overall time expenditure and quality of the forecast as well as mitigation of network overfitting. On the whole, it is recommended that evaluation should consider implications of (i) data quality and quantity and (ii) transfer functions on the overall network forecast performance.

Keywords: streamflow, neural network, optimisation, algorithm

Procedia PDF Downloads 150
1717 Investigating Data Normalization Techniques in Swarm Intelligence Forecasting for Energy Commodity Spot Price

Authors: Yuhanis Yusof, Zuriani Mustaffa, Siti Sakira Kamaruddin

Abstract:

Data mining is a fundamental technique in identifying patterns from large data sets. The extracted facts and patterns contribute in various domains such as marketing, forecasting, and medical. Prior to that, data are consolidated so that the resulting mining process may be more efficient. This study investigates the effect of different data normalization techniques, which are Min-max, Z-score, and decimal scaling, on Swarm-based forecasting models. Recent swarm intelligence algorithms employed includes the Grey Wolf Optimizer (GWO) and Artificial Bee Colony (ABC). Forecasting models are later developed to predict the daily spot price of crude oil and gasoline. Results showed that GWO works better with Z-score normalization technique while ABC produces better accuracy with the Min-Max. Nevertheless, the GWO is more superior that ABC as its model generates the highest accuracy for both crude oil and gasoline price. Such a result indicates that GWO is a promising competitor in the family of swarm intelligence algorithms.

Keywords: artificial bee colony, data normalization, forecasting, Grey Wolf optimizer

Procedia PDF Downloads 473
1716 Classifying and Analysis 8-Bit to 8-Bit S-Boxes Characteristic Using S-Box Evaluation Characteristic

Authors: Muhammad Luqman, Yusuf Kurniawan

Abstract:

S-Boxes is one of the linear parts of the cryptographic algorithm. The existence of S-Box in the cryptographic algorithm is needed to maintain non-linearity of the algorithm. Nowadays, modern cryptographic algorithms use an S-Box as a part of algorithm process. Despite the fact that several cryptographic algorithms today reuse theoretically secure and carefully constructed S-Boxes, there is an evaluation characteristic that can measure security properties of S-Boxes and hence the corresponding primitives. Analysis of an S-Box usually is done using manual mathematics calculation. Several S-Boxes are presented as a Truth Table without any mathematical background algorithm. Then, it’s rather difficult to determine the strength of Truth Table S-Box without a mathematical algorithm. A comprehensive analysis should be applied to the Truth Table S-Box to determine the characteristic. Several important characteristics should be owned by the S-Boxes, they are Nonlinearity, Balancedness, Algebraic degree, LAT, DAT, differential delta uniformity, correlation immunity and global avalanche criterion. Then, a comprehensive tool will be present to automatically calculate the characteristics of S-Boxes and determine the strength of S-Box. Comprehensive analysis is done on a deterministic process to produce a sequence of S-Boxes characteristic and give advice for a better S-Box construction.

Keywords: cryptographic properties, Truth Table S-Boxes, S-Boxes characteristic, deterministic process

Procedia PDF Downloads 361
1715 Non-Population Search Algorithms for Capacitated Material Requirement Planning in Multi-Stage Assembly Flow Shop with Alternative Machines

Authors: Watcharapan Sukkerd, Teeradej Wuttipornpun

Abstract:

This paper aims to present non-population search algorithms called tabu search (TS), simulated annealing (SA) and variable neighborhood search (VNS) to minimize the total cost of capacitated MRP problem in multi-stage assembly flow shop with two alternative machines. There are three main steps for the algorithm. Firstly, an initial sequence of orders is constructed by a simple due date-based dispatching rule. Secondly, the sequence of orders is repeatedly improved to reduce the total cost by applying TS, SA and VNS separately. Finally, the total cost is further reduced by optimizing the start time of each operation using the linear programming (LP) model. Parameters of the algorithm are tuned by using real data from automotive companies. The result shows that VNS significantly outperforms TS, SA and the existing algorithm.

Keywords: capacitated MRP, tabu search, simulated annealing, variable neighborhood search, linear programming, assembly flow shop, application in industry

Procedia PDF Downloads 229
1714 Kinematic Optimization of Energy Extraction Performances for Flapping Airfoil by Using Radial Basis Function Method and Genetic Algorithm

Authors: M. Maatar, M. Mekadem, M. Medale, B. Hadjed, B. Imine

Abstract:

In this paper, numerical simulations have been carried out to study the performances of a flapping wing used as an energy collector. Metamodeling and genetic algorithms are used to detect the optimal configuration, improving power coefficient and/or efficiency. Radial basis functions and genetic algorithms have been applied to solve this problem. Three optimization factors are controlled, namely dimensionless heave amplitude h₀, pitch amplitude θ₀ and flapping frequency f. ANSYS FLUENT software has been used to solve the principal equations at a Reynolds number of 1100, while the heave and pitch motion of a NACA0015 airfoil has been realized using a developed function (UDF). The results reveal an average power coefficient and efficiency of 0.78 and 0.338 with an inexpensive low-fidelity model and a total relative error of 4.1% versus the simulation. The performances of the simulated optimum RBF-NSGA-II have been improved by 1.2% compared with the validated model.

Keywords: numerical simulation, flapping wing, energy extraction, power coefficient, efficiency, RBF, NSGA-II

Procedia PDF Downloads 37
1713 Anomaly Detection Based Fuzzy K-Mode Clustering for Categorical Data

Authors: Murat Yazici

Abstract:

Anomalies are irregularities found in data that do not adhere to a well-defined standard of normal behavior. The identification of outliers or anomalies in data has been a subject of study within the statistics field since the 1800s. Over time, a variety of anomaly detection techniques have been developed in several research communities. The cluster analysis can be used to detect anomalies. It is the process of associating data with clusters that are as similar as possible while dissimilar clusters are associated with each other. Many of the traditional cluster algorithms have limitations in dealing with data sets containing categorical properties. To detect anomalies in categorical data, fuzzy clustering approach can be used with its advantages. The fuzzy k-Mode (FKM) clustering algorithm, which is one of the fuzzy clustering approaches, by extension to the k-means algorithm, is reported for clustering datasets with categorical values. It is a form of clustering: each point can be associated with more than one cluster. In this paper, anomaly detection is performed on two simulated data by using the FKM cluster algorithm. As a significance of the study, the FKM cluster algorithm allows to determine anomalies with their abnormality degree in contrast to numerous anomaly detection algorithms. According to the results, the FKM cluster algorithm illustrated good performance in the anomaly detection of data, including both one anomaly and more than one anomaly.

Keywords: fuzzy k-mode clustering, anomaly detection, noise, categorical data

Procedia PDF Downloads 48
1712 Short Association Bundle Atlas for Lateralization Studies from dMRI Data

Authors: C. Román, M. Guevara, P. Salas, D. Duclap, J. Houenou, C. Poupon, J. F. Mangin, P. Guevara

Abstract:

Diffusion Magnetic Resonance Imaging (dMRI) allows the non-invasive study of human brain white matter. From diffusion data, it is possible to reconstruct fiber trajectories using tractography algorithms. Our previous work consists in an automatic method for the identification of short association bundles of the superficial white matter (SWM), based on a whole brain inter-subject hierarchical clustering applied to a HARDI database. The method finds representative clusters of similar fibers, belonging to a group of subjects, according to a distance measure between fibers, using a non-linear registration (DTI-TK). The algorithm performs an automatic labeling based on the anatomy, defined by a cortex mesh parcelated with FreeSurfer software. The clustering was applied to two independent groups of 37 subjects. The clusters resulting from both groups were compared using a restrictive threshold of mean distance between each pair of bundles from different groups, in order to keep reproducible connections. In the left hemisphere, 48 reproducible bundles were found, while 43 bundles where found in the right hemisphere. An inter-hemispheric bundle correspondence was then applied. The symmetric horizontal reflection of the right bundles was calculated, in order to obtain the position of them in the left hemisphere. Next, the intersection between similar bundles was calculated. The pairs of bundles with a fiber intersection percentage higher than 50% were considered similar. The similar bundles between both hemispheres were fused and symmetrized. We obtained 30 common bundles between hemispheres. An atlas was created with the resulting bundles and used to segment 78 new subjects from another HARDI database, using a distance threshold between 6-8 mm according to the bundle length. Finally, a laterality index was calculated based on the bundle volume. Seven bundles of the atlas presented right laterality (IP_SP_1i, LO_LO_1i, Op_Tr_0i, PoC_PoC_0i, PoC_PreC_2i, PreC_SM_0i, y RoMF_RoMF_0i) and one presented left laterality (IP_SP_2i), there is no tendency of lateralization according to the brain region. Many factors can affect the results, like tractography artifacts, subject registration, and bundle segmentation. Further studies are necessary in order to establish the influence of these factors and evaluate SWM laterality.

Keywords: dMRI, hierarchical clustering, lateralization index, tractography

Procedia PDF Downloads 328
1711 Inferring Human Mobility in India Using Machine Learning

Authors: Asra Yousuf, Ajaykumar Tannirkulum

Abstract:

Inferring rural-urban migration trends can help design effective policies that promote better urban planning and rural development. In this paper, we describe how machine learning algorithms can be applied to predict internal migration decisions of people. We consider data collected from household surveys in Tamil Nadu to train our model. To measure the performance of the model, we use data on past migration from National Sample Survey Organisation of India. The factors for training the model include socioeconomic characteristic of each individual like age, gender, place of residence, outstanding loans, strength of the household, etc. and his past migration history. We perform a comparative analysis of the performance of a number of machine learning algorithm to determine their prediction accuracy. Our results show that machine learning algorithms provide a stronger prediction accuracy as compared to statistical models. Our goal through this research is to propose the use of data science techniques in understanding human decisions and behaviour in developing countries.

Keywords: development, migration, internal migration, machine learning, prediction

Procedia PDF Downloads 270