Search results for: algorithms

1291 An Attribute-Centre Based Decision Tree Classification Algorithm

Abstract:

Decision tree algorithms have a very important place among the classification models of data mining. In the literature, algorithms use the entropy concept or the Gini index to form the tree. The shape of the classes and their closeness to each other are some of the factors that affect the performance of the algorithm. In this paper we introduce a new decision tree algorithm which employs a data (attribute) folding method and the variation of the class variables over the branches to be created. A comparative performance analysis has been carried out between the proposed algorithm and C4.5.

Keywords: Classification, decision tree, split, pruning, entropy, gini.

Downloads: 1370
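The entropy and Gini criteria mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the two standard impurity measures, not the attribute-centre algorithm the paper proposes; the function names are chosen here for illustration only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_score(left, right, impurity):
    """Weighted impurity of a binary split; lower means a purer split."""
    n = len(left) + len(right)
    return len(left) / n * impurity(left) + len(right) / n * impurity(right)
```

A tree builder would evaluate `split_score` for each candidate split and keep the one with the lowest value.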

1290 Elephant Herding Optimization for Service Selection in QoS-Aware Web Service Composition

Authors: Samia Sadouki Chibani, Abdelkamel Tari

Abstract:

Web service composition combines available services to provide new functionality. Given the number of available services with similar functionalities and different non-functional aspects (QoS), the problem of finding a QoS-optimal web service composition is considered an optimization problem belonging to the NP-hard class. Thus, an optimal solution cannot be found by exact algorithms within a reasonable time. In this paper, a bio-inspired meta-heuristic is presented to address QoS-aware web service composition; it is based on the Elephant Herding Optimization (EHO) algorithm, which is inspired by the herding behavior of elephant groups. EHO is characterized by a process of dividing the population into sub-populations (clans) and recombining them; this process allows the exchange of information between local searches to move toward a global optimum. With other evolutionary algorithms, however, the problem of early stagnation in a local optimum cannot be avoided. Experimental results show that our proposal significantly outperforms the existing PSO algorithm, with better fitness values and fast convergence.

Keywords: Elephant herding optimization, web service composition, bio-inspired algorithms, QoS optimization.

Downloads: 1032

1289 Restartings: A Technique to Improve Classic Genetic Algorithms Performance

Authors: Grigorios N. Beligiannis, Georgios A. Tsirogiannis, Panayotis E. Pintelas

Abstract:

In this contribution, a way to enhance the performance of the classic Genetic Algorithm is proposed. The idea of restarting a Genetic Algorithm is applied in order to obtain better knowledge of the solution space of the problem. A new operator of 'insertion' is introduced so as to exploit (utilize) the information that has already been collected before the restarting procedure. Finally, numerical experiments comparing the performance of the classic Genetic Algorithm and the Genetic Algorithm with restartings, for some well known test functions, are given.

Keywords: Genetic Algorithms, Restartings, Search space exploration, Search space exploitation.

Downloads: 2138
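The restarting idea in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not define the 'insertion' operator, so here it is assumed to mean seeding each restarted population with the best individuals found by earlier runs. The test function `sphere`, the truncation selection, and all parameter values are hypothetical choices.

```python
import random


def sphere(x):
    """Simple test function to minimize (assumed here for illustration)."""
    return sum(v * v for v in x)


def run_ga(rng, fitness, dim, pop_size, generations, seeds=()):
    # Assumed 'insertion': start from previously found solutions, then fill randomly.
    pop = [list(s) for s in seeds][:pop_size]
    while len(pop) < pop_size:
        pop.append([rng.uniform(-5.0, 5.0) for _ in range(dim)])
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]          # elitist truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, dim) if dim > 1 else 0
            child = a[:cut] + b[cut:]           # one-point crossover
            child[rng.randrange(dim)] += rng.gauss(0.0, 0.3)  # mutation
            children.append(child)
        pop = parents + children
    return min(pop, key=fitness)


def ga_with_restarts(fitness, dim=3, pop_size=20, generations=30, restarts=4, seed=0):
    rng = random.Random(seed)
    archive = []                                # best solution of each run
    for _ in range(restarts):
        archive.append(run_ga(rng, fitness, dim, pop_size, generations, seeds=archive))
    return min(archive, key=fitness)
```

Because selection is elitist and each restart is seeded with the archive, the best fitness in this sketch never gets worse from one restart to the next.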

1288 Energy Efficient Clustering Algorithm with Global and Local Re-clustering for Wireless Sensor Networks

Authors: Ashanie Guanathillake, Kithsiri Samarasinghe

Abstract:

Wireless Sensor Networks consist of inexpensive, low-power sensor nodes deployed to monitor the environment and collect data. Gathering information in an energy-efficient manner is critical to prolonging the network lifetime. Clustering algorithms have the advantage of enhancing the network lifetime. Current clustering algorithms usually focus on global re-clustering and local re-clustering separately. This paper proposes a combination of these two re-clustering methods to reduce the energy consumption of the network. Furthermore, the proposed algorithm can be applied to homogeneous as well as heterogeneous wireless sensor networks. In addition, cluster head rotation happens only when the head's energy drops below a dynamic threshold value computed by the algorithm. The simulation results show that the proposed algorithm prolongs the network lifetime compared to existing algorithms.

Keywords: Energy efficient, Global re-clustering, Local re-clustering, Wireless sensor networks.

Downloads: 2370
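The threshold-triggered cluster head rotation described above might look like the sketch below. The abstract does not say how the dynamic threshold is computed, so the rule used here (a fixed fraction of the cluster's mean residual energy) and both function names are assumptions for illustration.

```python
def dynamic_threshold(energies, fraction=0.5):
    """Hypothetical rule: a fraction of the cluster's current mean residual energy."""
    return fraction * sum(energies.values()) / len(energies)


def maybe_rotate_head(head, energies, fraction=0.5):
    """Return the cluster head for the next round.

    The current head is kept while its energy stays at or above the threshold;
    otherwise the node with the most residual energy takes over.
    """
    if energies[head] >= dynamic_threshold(energies, fraction):
        return head
    return max(energies, key=energies.get)
```

Rotating only when the threshold is crossed avoids the control traffic of re-electing a head in every round.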

1287 Implementation of Channel Estimation and Timing Synchronization Algorithms for MIMO-OFDM System Using NI USRP 2920

Authors: Ali Beydoun, Hamzé H. Alaeddine

Abstract:

The MIMO-OFDM communication system is a key solution for the next generation of mobile communication due to its high spectral efficiency, high data rate and robustness against multi-path fading channels. However, a MIMO-OFDM system requires perfect knowledge of the channel state information and good synchronization between the transmitter and the receiver to achieve the expected performance. Recently, we have proposed two algorithms for channel estimation and timing synchronization with good performance and very low implementation complexity compared to those proposed in the literature. In order to validate and evaluate the efficiency of these algorithms in real environments, this paper presents in detail the implementation of a 2 × 2 MIMO-OFDM system based on LabVIEW and USRP 2920. Implementation results show good agreement with the simulation results under different configuration parameters.

Keywords: MIMO-OFDM system, timing synchronization, channel estimation, STBC, USRP 2920.

Downloads: 844

1286 Feature Weighting and Selection - A Novel Genetic Evolutionary Approach

Authors: Serkawt Khola

Abstract:

A feature weighting and selection method is proposed which uses the structure of a weightless neuron and exploits the principles that govern the operation of Genetic Algorithms and Evolution. Features are coded onto chromosomes in a novel way which allows weighting information regarding the features to be directly inferred from the gene values. The proposed method is significant in that it addresses several problems concerned with algorithms for feature selection and weighting as well as providing significant advantages such as speed, simplicity and suitability for real-time systems.

Keywords: Feature weighting, genetic algorithm, pattern recognition, weightless neuron.

Downloads: 1856

1285 Visualization and Indexing of Spectral Databases

Authors: Tibor Kulcsar, Gabor Sarossy, Gabor Bereznai, Robert Auer, Janos Abonyi

Abstract:

On-line (near infrared) spectroscopy is widely used to support the operation of complex process systems. Information extracted from a spectral database can be used to estimate unmeasured product properties and monitor the operation of the process. These techniques are based on looking for similar spectra by nearest neighbor algorithms and distance-based searching methods. Searching for nearest neighbors in the spectral space is an NP-hard problem: the computational complexity increases with the number of points in the discrete spectrum and the number of samples in the database. To reduce the calculation time, some kind of indexing could be used. The main idea presented in this paper is to combine indexing and visualization techniques to reduce the computational requirement of estimation algorithms by providing a two-dimensional indexing that can also be used to visualize the structure of the spectral database. This 2D visualization of the spectral database not only supports the application of distance and similarity based techniques but also enables the utilization of advanced clustering and prediction algorithms based on the Delaunay tessellation of the mapped spectral space. This means that the prediction does not have to use the high-dimensional space but can be based on the mapped space instead. The results illustrate that the proposed method is able to segment (cluster) spectral databases and detect outliers that are not suitable for instance-based learning algorithms.

Keywords: indexing high dimensional databases, dimensional reduction, clustering, similarity, k-nn algorithm.

Downloads: 1769
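The baseline that such indexing aims to speed up is a linear-scan nearest neighbor search, whose cost grows with both the number of stored spectra and the number of points per spectrum. A minimal sketch follows; it does not implement the paper's 2D indexing, and the names `k_nearest` and `predict_property` are illustrative.

```python
from heapq import nsmallest
from math import dist


def k_nearest(query, spectra, k=3):
    """Indices of the k stored spectra closest to the query (Euclidean distance)."""
    return nsmallest(k, range(len(spectra)), key=lambda i: dist(query, spectra[i]))


def predict_property(query, spectra, properties, k=3):
    """Instance-based estimate: mean property value of the k nearest spectra."""
    idx = k_nearest(query, spectra, k)
    return sum(properties[i] for i in idx) / len(idx)
```

Every query touches every stored spectrum, which is the cost an index over a low-dimensional mapping is meant to reduce.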

1284 Design of DC Voltage Control for D-STATCOM

Authors: Kittaya Somsai, Thanatchai Kulworawanichpong, Nitus Voraphonpiput

Abstract:

This paper presents the DC voltage control design of a D-STATCOM used for load voltage regulation. Although the DC voltage can be controlled by the active current of the D-STATCOM, the reactive current still affects the DC voltage. To eliminate this effect, a control strategy that cancels the effect of the reactive current is proposed, and the results of the control with and without this cancellation are compared. To obtain the proportional and integral gains of the PI controllers, the symmetrical optimum and genetic algorithm methods are applied. The stability margins of these methods are obtained and discussed in detail. In addition, the performance of the DC voltage control based on the symmetrical optimum and genetic algorithm methods is compared. The effectiveness of the designed controllers was verified through computer simulation performed by using Power System Tool Block (PSB) in SIMULINK/MATLAB. The simulation results demonstrate that the proposed DC voltage control is effective in regulating the DC voltage when the D-STATCOM is used for load voltage regulation.

Keywords: D-STATCOM, DC voltage control, Symmetrical optimum, Genetic algorithms

Downloads: 5038

1283 Robust Design of Power System Stabilizers Using Adaptive Genetic Algorithms

Authors: H. Alkhatib, J. Duveau

Abstract:

Genetic algorithms (GAs) have been widely used for global optimization problems. GA performance depends highly on the choice of the search space for each parameter to be optimized. Often, this choice relies on problem-specific experience. The search space, being a set of potential solutions, may contain the global optimum and/or other local optima. A bad choice of this search space results in poor solutions. In this paper, our approach consists in extending the search space boundaries during the GA optimization, only when it is required. This leads to more diversification of the GA population by new solutions that were not available with fixed search space boundaries. Thus, these dynamic search spaces can improve GA optimization performance. The proposed approach is applied to power system stabilizer optimization for a multimachine power system (16-generator and 68-bus). The obtained results are evaluated and compared with those obtained by ordinary GAs. Eigenvalue analysis and nonlinear system simulation results show the effectiveness of the proposed approach in damping out the electromechanical oscillations and enhancing the global system stability.

Keywords: Genetic Algorithms, Multiobjective Optimization, Power System Stabilizer, Small Signal Stability.

Downloads: 1724

1282 A Characterized and Optimized Approach for End-to-End Delay Constrained QoS Routing

Authors: P. S. Prakash, S. Selvan

Abstract:

QoS routing aims to find paths between senders and receivers that satisfy the QoS requirements of the application while using the network resources efficiently; the underlying routing algorithm must be able to find low-cost paths that satisfy the given QoS constraints. The problem of finding a least-cost route is known to be NP-hard or NP-complete, and some algorithms have been proposed to find a near-optimal solution. However, these heuristics or algorithms either impose relationships among the link metrics to reduce the complexity of the problem, which may limit the general applicability of the heuristic, or are too costly in terms of execution time to be applicable to large networks. In this paper, we analyze two algorithms, namely Characterized Delay Constrained Routing (CDCR) and Optimized Delay Constrained Routing (ODCR). The CDCR algorithm takes an approach to delay-constrained routing that captures the trade-off between cost minimization and the risk level regarding the delay constraint. ODCR uses an adaptive path weight function together with an additional constraint imposed on the path cost to restrict the search space, and hence finds a near-optimal solution in much less time.

Keywords: QoS, Delay, Routing, Optimization

Downloads: 1274
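To make the delay-constrained least-cost problem above concrete, here is a small exact label-setting search. It is neither CDCR nor ODCR, whose details the abstract does not give; it is a plain reference solver, assuming non-negative link costs and delays and a hypothetical adjacency-list graph format.

```python
import heapq


def delay_constrained_least_cost(graph, source, target, max_delay):
    """graph: {node: [(neighbor, cost, delay), ...]}.

    Returns (cost, delay, path) for the cheapest path whose total delay
    does not exceed max_delay, or None if no such path exists.
    """
    heap = [(0, 0, source, [source])]
    best_delay = {}  # node -> smallest delay among labels expanded so far
    while heap:
        cost, delay, node, path = heapq.heappop(heap)
        if node == target:
            return cost, delay, path
        # Labels are popped in order of increasing cost, so an earlier label
        # at this node with no larger delay dominates the current one.
        if node in best_delay and best_delay[node] <= delay:
            continue
        best_delay[node] = delay
        for nxt, c, d in graph.get(node, []):
            if delay + d <= max_delay:
                heapq.heappush(heap, (cost + c, delay + d, nxt, path + [nxt]))
    return None
```

The number of labels kept per node can grow with the range of delay values, which is why heuristics are used on large networks.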

1281 Open Source Algorithms for 3D Geo-Representation of Subsurface Formations Properties in the Oil and Gas Industry

Authors: Gabriel Quintero

Abstract:

This paper presents the results of implementing a series of algorithms for representing subsurface formation properties in most 3D geographic software, even Google Earth, by combining 2D charts or 3D plots over a 3D background. This allows anyone to use them, regardless of the economic size of the company for which they work. Despite the existence of complex and expensive specialized software for modeling subsurface formations from the same input information, this open source development offers greater and easier usability and good results, while limiting the rendered properties and polygons to a basic set of charts and tubes.

Keywords: Chart, earth, formations, subsurface, visualization.

Downloads: 1918

1280 The Projection Methods for Computing the Pseudospectra of Large Scale Matrices

Authors: Zhengsheng Wang, Xiangyong Ji, Yong Du

Abstract:

Projection methods, usually viewed as methods for computing eigenvalues, can also be used to estimate pseudospectra. This paper proposes projection methods for computing the pseudospectra of large-scale matrices, including an orthogonalization projection method and an oblique projection method. This possibility may be of practical importance in applications involving large-scale, highly nonnormal matrices. Numerical algorithms are given, and some numerical experiments illustrate the efficiency of the new algorithms.

Keywords: Pseudospectra, eigenvalue, projection method, Arnoldi, IOM(q)

Downloads: 1325
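For reference, the standard definition of the pseudospectrum and the usual Arnoldi-based projection estimate can be written as below. This is textbook background given under the assumption that the paper builds on it (the keywords mention Arnoldi); it is not the paper's own derivation.

```latex
% epsilon-pseudospectrum of a matrix A (2-norm)
\Lambda_\varepsilon(A)
  = \{\, z \in \mathbb{C} : \sigma_{\min}(zI - A) \le \varepsilon \,\}
  = \{\, z \in \mathbb{C} : \|(zI - A)^{-1}\|_2 \ge \varepsilon^{-1} \,\}

% m steps of Arnoldi give orthonormal V_m and an (m+1) x m Hessenberg matrix
A V_m = V_{m+1} \tilde{H}_m

% projected estimate, with \tilde{I} the m x m identity plus a zero row
\sigma_{\min}(zI - A) \;\le\; \sigma_{\min}(z\tilde{I} - \tilde{H}_m)
```

The inequality holds because $(zI - A)V_m = V_{m+1}(z\tilde{I} - \tilde{H}_m)$ restricts the minimization to the Krylov subspace, so the small projected problem yields an inner approximation of the pseudospectrum.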

1279 Classification of Political Affiliations by Reduced Number of Features

Authors: Vesile Evrim, Aliyu Awwal

Abstract:

With the evolution of technology, the expression of opinions has shifted to the digital world. The domain of politics, one of the hottest topics of opinion mining research, has merged with behavior analysis for affiliation determination in texts, which constitutes the subject of this paper. This study aims to classify the text in news/blogs as either Republican or Democrat with the minimum number of features. As an initial set, 68 features, of which 64 were Linguistic Inquiry and Word Count (LIWC) features, were tested against 14 benchmark classification algorithms. In the later experiments, the dimension of the feature vector was reduced using 7 feature selection algorithms. The results show that the “Decision Tree”, “Rule Induction” and “M5 Rule” classifiers, when used with the “SVM” and “IGR” feature selection algorithms, performed best, with up to 82.5% accuracy on a given dataset. Further tests on a single feature and on the linguistic-based feature sets showed similar results. The feature “Function”, an aggregate feature of the linguistic category, was found to be the most differentiating feature among the 68 features, with an accuracy of 81% in classifying articles as either Republican or Democrat.

Keywords: Politics, machine learning, feature selection, LIWC.

Downloads: 2365

1278 Optimized Preprocessing for Accurate and Efficient Bioassay Prediction with Machine Learning Algorithms

Authors: Jeff Clarine, Chang-Shyh Peng, Daisy Sang

Abstract:

Bioassay is the measurement of the potency of a chemical substance by its effect on a living animal or plant tissue. Bioassay data and chemical structures from pharmacokinetic and drug metabolism screening are mined from and housed in multiple databases. Bioassay prediction is calculated accordingly to determine further advancement. This paper proposes a four-step preprocessing of datasets for improving bioassay predictions. The first step is instance selection, in which the dataset is divided into training, testing, and validation sets. The second step is discretization, which partitions the data in consideration of accuracy vs. precision. The third step is normalization, where data are normalized between 0 and 1 for subsequent machine learning processing. The fourth step is feature selection, where key chemical properties and attributes are generated. The streamlined results are then analyzed for the prediction of effectiveness by various machine learning algorithms in tools including Pipeline Pilot, R, Weka, and Excel. Experiments and evaluations reveal the effectiveness of various combinations of preprocessing steps and machine learning algorithms in producing more consistent and accurate predictions.

Keywords: Bioassay, machine learning, preprocessing, virtual screen.

Downloads: 982
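Two of the preprocessing steps listed above, normalization to the range 0 to 1 and discretization, can be sketched as follows. The abstract does not state which discretization scheme is used, so equal-width binning is assumed here purely for illustration.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(normalized, n_bins):
    """Map each value in [0, 1] to a bin index 0..n_bins-1 (assumed scheme)."""
    return [min(int(v * n_bins), n_bins - 1) for v in normalized]
```

More bins preserve more precision but leave fewer samples per bin, which is one way to read the accuracy versus precision trade-off the abstract mentions.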

1277 Optimal Algorithm for Constructing the Delaunay Triangulation in E^d

Authors: V. Tereshchenko, D. Taran

Abstract:

In this paper we propose a new approach to constructing the Delaunay triangulation and an optimal algorithm for the case of multidimensional spaces (d ≥ 2). Analysing the state of the art, one can conclude that the ideas behind the existing efficient algorithms developed for the case of d ≥ 2 are not simple to generalize to the multidimensional case without loss of efficiency. To solve this problem, we offer an efficient algorithm that satisfies all the given requirements. The theoretical complexity of the problem, however, cannot be improved, since the worst-case optimality of algorithms for solving it has been proved.

Keywords: Delaunay triangulation, multidimensional space, Voronoi Diagram, optimal algorithm.

Downloads: 1981
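The defining property of a Delaunay triangulation is that no input point lies inside the circumcircle of any triangle. A minimal sketch of that test in the plane is given below; it is the standard in-circle determinant, not the paper's multidimensional algorithm, and it uses plain floating-point arithmetic without robustness safeguards.

```python
def orient(a, b, c):
    """Twice the signed area of triangle abc; positive if counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, d):
    """True if point d lies strictly inside the circumcircle of triangle abc."""
    if orient(a, b, c) < 0:           # make the triangle counter-clockwise
        b, c = c, b
    m = []
    for p in (a, b, c):
        x, y = p[0] - d[0], p[1] - d[1]
        m.append((x, y, x * x + y * y))
    det = (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
           - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
           + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    return det > 0
```

In d dimensions the same idea becomes a (d+1) x (d+1) determinant over the vertices of a simplex.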

1276 A General Variable Neighborhood Search Algorithm to Minimize Makespan of the Distributed Permutation Flowshop Scheduling Problem

Authors: G. M. Komaki, S. Mobin, E. Teymourian, S. Sheikh

Abstract:

This paper addresses minimizing the makespan of the distributed permutation flow shop scheduling problem. In this problem, there are several identical parallel factories or flowshops, each with a series of similar machines. Each job should be allocated to one of the factories, and all of the operations of the job should be performed in the allocated factory. This problem has recently gained attention and, due to the NP-hard nature of the problem, metaheuristic algorithms have been proposed to tackle it. The majority of the proposed algorithms require large computational time, which is their main drawback. In this study, a general variable neighborhood search (GVNS) algorithm is proposed, into which several time-saving schemes have been incorporated. Also, the GVNS uses a sophisticated method to change the shaking procedure or perturbation depending on the progress of the incumbent solution, to prevent stagnation of the search. The performance of the proposed algorithm is compared to the state-of-the-art algorithms on standard benchmark instances.

Keywords: Distributed permutation flow shop, scheduling, makespan, general variable neighborhood search algorithm.

Downloads: 2271
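The objective that the search above minimizes can be sketched as follows. This only evaluates a given solution; it is not the GVNS itself. The data layout (`proc[j][m]` as the processing time of job `j` on machine `m`, and an assignment given as one job sequence per factory) is an assumption for illustration.

```python
def makespan(sequence, proc):
    """Completion time of the last job on the last machine of one flowshop."""
    if not sequence:
        return 0
    n_machines = len(proc[sequence[0]])
    finish = [0] * n_machines         # finish[m]: completion time on machine m
    for job in sequence:
        for m in range(n_machines):
            # A job starts on machine m once the machine is free and the job
            # has left machine m-1.
            ready = finish[m - 1] if m > 0 else 0
            finish[m] = max(finish[m], ready) + proc[job][m]
    return finish[-1]


def distributed_makespan(assignment, proc):
    """Makespan of a distributed schedule: the slowest factory decides."""
    return max(makespan(seq, proc) for seq in assignment)
```

A neighborhood search would move or swap jobs within and between factories and re-evaluate `distributed_makespan` after each move.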

1275 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principles of data cleaning, analyzes the types and causes of dirty data, proposes several key steps of a typical cleaning process, and puts forward a scalable and versatile data cleaning framework. For data with attribute dependency relations, it designs several violation data discovery algorithms expressed as formal formulas, which can find data inconsistent with the conditional attribute dependencies across all target columns, whether the data is structured (SQL) or unstructured (NoSQL). Finally, it gives 6 data cleaning methods based on these algorithms.

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Downloads: 2612
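A simple instance of violation discovery for a dependency rule is sketched below, assuming the rule is a functional dependency of the form "attributes `lhs` determine attribute `rhs`". The paper's own algorithms are not given in the abstract; the function name and record format are hypothetical.

```python
from collections import defaultdict


def fd_violations(rows, lhs, rhs):
    """Find groups of rows that break the dependency lhs -> rhs.

    rows: list of dicts. Returns {lhs values: set of conflicting rhs values}
    for every lhs combination that maps to more than one rhs value.
    """
    groups = defaultdict(set)
    for row in rows:
        groups[tuple(row[a] for a in lhs)].add(row[rhs])
    return {key: values for key, values in groups.items() if len(values) > 1}
```

Because it works on plain dictionaries, the same check applies to rows from a SQL table or documents from a NoSQL store.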

1274 A Review on Image Segmentation Techniques and Performance Measures

Authors: David Libouga Li Gwet, Marius Otesteanu, Ideal Oscar Libouga, Laurent Bitjoka, Gheorghe D. Popa

Abstract:

Image segmentation is a method to extract regions of interest from an image. It remains a fundamental problem in computer vision. The increasing diversity and complexity of segmentation algorithms have led us, firstly, to review and classify segmentation techniques; secondly, to identify the most used measures of segmentation performance; and thirdly, to discuss in depth the philosophy of segmentation in order to help in choosing adequate segmentation techniques for particular applications. To justify the relevance of our analysis, recent segmentation algorithms are presented through the proposed classification.

Keywords: Classification, image segmentation, measures of performance.

Downloads: 2053

1273 Implicit Force Control of a Position Controlled Robot – A Comparison with Explicit Algorithms

Authors: Alexander Winkler, Jozef Suchý

Abstract:

This paper investigates simple implicit force control algorithms realizable with industrial robots. Many published approaches are difficult to implement in commercial robot controllers, because access to the robot joint torques is necessary or the complete dynamic model of the manipulator is used. In the past we have already dealt with explicit force control of a position controlled robot. Well-known schemes of implicit force control are stiffness control, damping control and impedance control. Using such algorithms, the contact force cannot be set directly. Rather, it is the result of the controller impedance, the environment impedance and the commanded robot motion/position. The relationships between these properties are worked out in this paper in detail for the chosen implicit approaches. They have been adapted to be implementable on a position controlled robot. The behaviors of stiffness control and damping control are verified by practical experiments. For this purpose a suitable test bed was configured. Using the full mechanical impedance within the controller structure will not be practical in the case when the robot is in physical contact with the environment. This fact will be verified by simulation.

Keywords: Damping control, impedance control, robot force control, stability, stiffness control.

Downloads: 2871
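Why the contact force cannot be set directly under stiffness control can be illustrated with a one-dimensional steady-state calculation. The model below is an assumption made for illustration (an ideal position controlled robot, an environment acting as a linear spring of stiffness $k_e$ at position $x_e$, and a controller stiffness $k_c$); it is not taken from the paper.

```latex
% Stiffness control on a position controlled robot: the commanded position
% yields to the measured contact force F
x_c = x_d - \frac{F}{k_c}

% Environment modelled as a linear spring; at steady state x = x_c
F = k_e \,(x_c - x_e)

% Substituting and solving for F
F = k_e \left( x_d - \frac{F}{k_c} - x_e \right)
\quad\Longrightarrow\quad
F = \frac{k_c \, k_e}{k_c + k_e}\,(x_d - x_e)
```

Under these assumptions the force depends on the controller stiffness, the environment stiffness and the commanded position $x_d$ together, which matches the qualitative statement in the abstract.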

1272 Distribution Voltage Regulation Under Three-Phase Fault by Using D-STATCOM

Authors: Chaiyut Sumpavakup, Thanatchai Kulworawanichpong

Abstract:

This paper presents a voltage regulation scheme for D-STATCOM under three-phase faults. It consists of voltage detection and voltage regulation schemes in the 0dq reference frame. The proposed control strategy uses a proportional controller in which the proportional gain, kp, is appropriately adjusted by using genetic algorithms. To verify its use, a simplified 4-bus test system is considered, assuming a three-phase fault at bus 4. As a result, the D-STATCOM can restore the load voltage to the desired level within 1.8 ms. This confirms that the proposed voltage regulation scheme performs well under three-phase fault events.

Keywords: D-STATCOM, proportional controller, genetic algorithms.

Downloads: 1791

1271 Classic and Heuristic Approaches in Robot Motion Planning: A Chronological Review

Authors: Ellips Masehian, Davoud Sedighizadeh

Abstract:

This paper reviews the major contributions to the Motion Planning (MP) field throughout a 35-year period, from classic approaches to heuristic algorithms. Due to the NP-hardness of the MP problem, heuristic methods have outperformed the classic approaches and have gained wide popularity. After surveying around 1400 papers in the field, the number of existing works for each method is identified and classified. In particular, the history and applications of numerous heuristic methods in MP are investigated. The paper concludes with comparative tables and graphs demonstrating the frequency of each MP method's application, and so can be used as a guideline for MP researchers.

Keywords: Robot motion planning, Heuristic algorithms.

Downloads: 5192

1270 Segmentation Problems and Solutions in Printed Degraded Gurmukhi Script

Authors: M. K. Jindal, G. S. Lehal, R. K. Sharma

Abstract:

Character segmentation is an important preprocessing step for text recognition. In degraded documents, existence of touching characters decreases recognition rate drastically, for any optical character recognition (OCR) system. In this paper we have proposed a complete solution for segmenting touching characters in all the three zones of printed Gurmukhi script. A study of touching Gurmukhi characters is carried out and these characters have been divided into various categories after a careful analysis. Structural properties of the Gurmukhi characters are used for defining the categories. New algorithms have been proposed to segment the touching characters in middle zone, upper zone and lower zone. These algorithms have shown a reasonable improvement in segmenting the touching characters in degraded printed Gurmukhi script. The algorithms proposed in this paper are applicable only to machine printed text. We have also discussed a new and useful technique to segment the horizontally overlapping lines.

Keywords: Character Segmentation, Middle Zone, Upper Zone, Lower Zone, Touching Characters, Horizontally Overlapping Lines.

Downloads: 1697

1269 Analysis of Modified Heap Sort Algorithm on Different Environment

Authors: Vandana Sharma, Parvinder S. Sandhu, Satwinder Singh, Baljit Saini

Abstract:

In the fields of Computer Science and Mathematics, a sorting algorithm is an algorithm that puts the elements of a list in a certain order, i.e. ascending or descending. Sorting is perhaps the most widely studied problem in computer science and is frequently used as a benchmark of a system's performance. This paper presents a comparative performance study of four sorting algorithms on different platforms. For each machine, it is found that the performance of the algorithm depends upon the number of elements to be sorted. In addition, as expected, the results show that the relative performance of the algorithms differed on the various machines. Thus, algorithm performance depends on data size and is also affected by the hardware.

Keywords: Algorithm, Analysis, Complexity, Sorting.

Downloads: 2413
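For orientation, a textbook heap sort is sketched below. The abstract does not describe the paper's modification, so this is the unmodified algorithm only, returning a sorted copy of its input.

```python
def heap_sort(items):
    """Return a new list with the items in ascending order (standard heap sort)."""
    a = list(items)
    n = len(a)

    def sift_down(root, end):
        """Restore the max-heap property for the subtree at root within a[:end]."""
        while True:
            child = 2 * root + 1
            if child >= end:
                return
            if child + 1 < end and a[child] < a[child + 1]:
                child += 1            # pick the larger child
            if a[root] >= a[child]:
                return
            a[root], a[child] = a[child], a[root]
            root = child

    for start in range(n // 2 - 1, -1, -1):   # build the max-heap
        sift_down(start, n)
    for end in range(n - 1, 0, -1):           # move the maximum to the end
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)
    return a
```

Timing a function like this for growing input sizes on several machines is the kind of comparison the abstract describes.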

1268 Faster FPGA Routing Solution using DNA Computing

Authors: Manpreet Singh, Parvinder Singh Sandhu, Manjinder Singh Kahlon

Abstract:

There are many classical algorithms for finding routes in an FPGA. Using DNA computing, however, routes can be found efficiently and quickly. The run-time complexity of DNA algorithms is much lower than that of the classical algorithms used for solving routing in FPGAs. Research in DNA computing is at an early stage. The high information density of DNA molecules and the massive parallelism involved in DNA reactions make DNA computing a powerful tool. It has been proved by many research accomplishments that any procedure that can be programmed on a silicon computer can be realized as a DNA computing procedure. In this paper we propose a two-tier approach for the FPGA routing solution. First, the geometric FPGA detailed routing task is solved by transforming it into a Boolean satisfiability equation with the property that any assignment of input variables that satisfies the equation specifies a valid routing. A satisfying assignment for a particular route results in a valid routing, and the absence of a satisfying assignment implies that the layout is unroutable. In the second step, a DNA search algorithm is applied to this Boolean equation to solve the routing alternatives, utilizing the properties of DNA computation. The simulated results are satisfactory and indicate the applicability of DNA computing for solving the FPGA routing problem.

Keywords: FPGA, Routing, DNA Computing.

Downloads: 1592
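The first tier, encoding routing as Boolean satisfiability, can be illustrated on a toy model. The encoding below is a simplified assumption (each net must take exactly one track, and conflicting nets may not share a track), and the exhaustive search merely stands in for a solver; it does not model the DNA search step or a real FPGA architecture.

```python
from itertools import product


def route_by_sat(nets, tracks, conflicts):
    """Return {net: track} for a satisfying assignment, or None if unroutable.

    Boolean variable x[(net, t)] is true when the net is routed on track t.
    conflicts: pairs of nets that may not share a track.
    """
    variables = [(n, t) for n in nets for t in range(tracks)]
    for bits in product([False, True], repeat=len(variables)):
        x = dict(zip(variables, bits))
        # Constraint 1: every net uses exactly one track.
        if any(sum(x[(n, t)] for t in range(tracks)) != 1 for n in nets):
            continue
        # Constraint 2: conflicting nets never share a track.
        if any(x[(a, t)] and x[(b, t)] for a, b in conflicts for t in range(tracks)):
            continue
        return {n: next(t for t in range(tracks) if x[(n, t)]) for n in nets}
    return None
```

The search enumerates all 2^(nets x tracks) assignments, which is exactly the kind of exhaustive exploration that massively parallel approaches aim to exploit.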

1267 Sample-Weighted Fuzzy Clustering with Regularizations

Authors: Miin-Shen Yang, Yee-Shan Pan

Abstract:

Although much research in cluster analysis has considered feature weights, little effort has been devoted to sample weights. Recently, Yu et al. (2011) considered a probability distribution over a data set to represent its sample weights and then proposed sample-weighted clustering algorithms. In this paper, we give a sample-weighted version of generalized fuzzy clustering regularization (GFCR), called the sample-weighted GFCR (SW-GFCR). Some experiments are considered. The experimental results and comparisons demonstrate that the proposed SW-GFCR is more effective than most clustering algorithms.

Keywords: Clustering, fuzzy c-means, fuzzy clustering, sample weights, regularization.

Downloads: 1767

1266 Data Preprocessing for Supervised Leaning

Authors: S. B. Kotsiantis, D. Kanellopoulos, P. E. Pintelas

Abstract:

Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data is first and foremost. If there is much irrelevant and redundant information present, or noisy and unreliable data, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take a considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be nice if a single sequence of data pre-processing algorithms had the best performance for each data set, but this is not the case. Thus, we present the best-known algorithms for each step of data pre-processing so that one can achieve the best performance for a given data set.

Keywords: Data mining, feature selection, data cleaning.

Downloads: 6092

1265 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we use data mining to extract biomedical knowledge. In general, complex biomedical data collected in population studies are treated by statistical methods; although these are robust, they are not sufficient in themselves to harness the potential wealth of the data. For this purpose, we used two learning algorithms: Decision Trees and the Support Vector Machine (SVM). These supervised classification methods are used to diagnose thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: Classifier, decision tree algorithms, knowledge extraction, Support Vector Machine.

Downloads: 1870

1264 On Reversal and Transposition Medians

Authors: Martin Bader

Abstract:

In recent years, the genomes of more and more species have been sequenced, providing data for phylogenetic reconstruction based on genome rearrangement measures. A main task in all phylogenetic reconstruction algorithms is to solve the median of three problem. Although this problem is NP-hard even for the simplest distance measures, there are exact algorithms for the breakpoint median and the reversal median that are fast enough for practical use. In this paper, this approach is extended to the transposition median as well as to the weighted reversal and transposition median. Although there is no exact polynomial algorithm known even for the pairwise distances, we will show that it is in most cases possible to solve these problems exactly within reasonable time by using a branch and bound algorithm.

Keywords: Comparative genomics, genome rearrangements, median, reversals, transpositions.

Downloads: 1688
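The median of three problem can be made concrete with the simplest measure mentioned above, the breakpoint distance. The sketch below is a deliberately simplified illustration: it assumes unsigned linear gene orders over 1..n and uses brute force instead of branch and bound, so it is only usable for very small n and is not the paper's method.

```python
from itertools import permutations


def adjacencies(perm):
    """Unordered neighbor pairs of a gene order, framed by 0 and n+1."""
    framed = [0] + list(perm) + [len(perm) + 1]
    return {frozenset(pair) for pair in zip(framed, framed[1:])}


def breakpoint_distance(p, q):
    """Number of adjacencies of p that do not occur in q."""
    return len(adjacencies(p) - adjacencies(q))


def breakpoint_median(a, b, c):
    """Gene order minimizing the summed breakpoint distance to a, b and c."""
    n = len(a)

    def score(m):
        return sum(breakpoint_distance(m, g) for g in (a, b, c))

    best = min(permutations(range(1, n + 1)), key=score)
    return best, score(best)
```

The brute-force search visits all n! orders; branch and bound prunes that space by discarding partial solutions whose lower bound already exceeds the best score found.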

1263 Speed Regulation of a Small BLDC Motor Using Genetic-Based Proportional Control

Authors: S. Poonsawat, T. Kulworawanichpong

Abstract:

This paper presents the speed regulation scheme of a small brushless dc motor (BLDC motor) with trapezoidal back-emf consideration. The proposed control strategy uses the proportional controller in which the proportional gain, kp, is appropriately adjusted by using genetic algorithms. As a result, the proportional control can perform well in order to compensate the BLDC motor with load disturbance. This confirms that the proposed speed regulation scheme gives satisfactory results.

Keywords: BLDC motor, proportional controller, genetic algorithms.

Downloads: 2097

1262 Predication Model for Leukemia Diseases Based on Data Mining Classification Algorithms with Best Accuracy

Authors: Fahd Sabry Esmail, M. Badr Senousy, Mohamed Ragaie

Abstract:

In recent years, there has been an explosion in the use of technologies that help discover diseases. For example, DNA microarrays allow us, for the first time, to obtain a "global" view of the cell. This technology has great potential to provide accurate medical diagnosis and to help in finding the right treatment and cure for many diseases. Various classification algorithms can be applied to such microarray datasets to devise methods that can predict the occurrence of leukemia. In this study, we compared the classification accuracy and response time of eleven decision tree methods and six rule classifier methods using five performance criteria. The experimental results show that, among the tree classifiers, Random Tree produces better results and also takes the least time to build a model. Among the classification rule algorithms, the nearest-neighbor-like algorithm (NNge) is the best, owing to its high accuracy and the lowest model-building time.

Keywords: Data mining, classification techniques, decision tree, classification rule, leukemia diseases, microarray data.

Downloads: 2559