Search results for: Automatic clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1029

Search results for: Automatic clustering

1029 Automatic Clustering of Gene Ontology by Genetic Algorithm

Authors: Razib M. Othman, Safaai Deris, Rosli M. Illias, Zalmiyah Zakaria, Saberi M. Mohamad

Abstract:

Nowadays, Gene Ontology has been used widely by many researchers for biological data mining and information retrieval, integration of biological databases, finding genes, and incorporating knowledge in the Gene Ontology for gene clustering. However, the increase in size of the Gene Ontology has caused problems in maintaining and processing them. One way to obtain their accessibility is by clustering them into fragmented groups. Clustering the Gene Ontology is a difficult combinatorial problem and can be modeled as a graph partitioning problem. Additionally, deciding the number k of clusters to use is not easily perceived and is a hard algorithmic problem. Therefore, an approach for solving the automatic clustering of the Gene Ontology is proposed by incorporating cohesion-and-coupling metric into a hybrid algorithm consisting of a genetic algorithm and a split-and-merge algorithm. Experimental results and an example of modularized Gene Ontology in RDF/XML format are given to illustrate the effectiveness of the algorithm.

Keywords: Automatic clustering, cohesion-and-coupling metric, gene ontology; genetic algorithm, split-and-merge algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1954
1028 Growing Self Organising Map Based Exploratory Analysis of Text Data

Authors: Sumith Matharage, Damminda Alahakoon

Abstract:

Textual data plays an important role in the modern world. The possibilities of applying data mining techniques to uncover hidden information present in large volumes of text collections is immense. The Growing Self Organizing Map (GSOM) is a highly successful member of the Self Organising Map family and has been used as a clustering and visualisation tool across wide range of disciplines to discover hidden patterns present in the data. A comprehensive analysis of the GSOM’s capabilities as a text clustering and visualisation tool has so far not been published. These functionalities, namely map visualisation capabilities, automatic cluster identification and hierarchical clustering capabilities are presented in this paper and are further demonstrated with experiments on a benchmark text corpus.

Keywords: Text Clustering, Growing Self Organizing Map, Automatic Cluster Identification, Hierarchical Clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1996
1027 Multi-Agent Systems for Intelligent Clustering

Authors: Jung-Eun Park, Kyung-Whan Oh

Abstract:

Intelligent systems are required in order to quickly and accurately analyze enormous quantities of data in the Internet environment. In intelligent systems, information extracting processes can be divided into supervised learning and unsupervised learning. This paper investigates intelligent clustering by unsupervised learning. Intelligent clustering is the clustering system which determines the clustering model for data analysis and evaluates results by itself. This system can make a clustering model more rapidly, objectively and accurately than an analyzer. The methodology for the automatic clustering intelligent system is a multi-agent system that comprises a clustering agent and a cluster performance evaluation agent. An agent exchanges information about clusters with another agent and the system determines the optimal cluster number through this information. Experiments using data sets in the UCI Machine Repository are performed in order to prove the validity of the system.

Keywords: Intelligent Clustering, Multi-Agent System, PCA, SOM, VC(Variance Criterion)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1726
1026 Automatic Landmark Selection Based on Feature Clustering for Visual Autonomous Unmanned Aerial Vehicle Navigation

Authors: Paulo Fernando Silva Filho, Elcio Hideiti Shiguemori

Abstract:

The selection of specific landmarks for an Unmanned Aerial Vehicles’ Visual Navigation systems based on Automatic Landmark Recognition has significant influence on the precision of the system’s estimated position. At the same time, manual selection of the landmarks does not guarantee a high recognition rate, which would also result on a poor precision. This work aims to develop an automatic landmark selection that will take the image of the flight area and identify the best landmarks to be recognized by the Visual Navigation Landmark Recognition System. The criterion to select a landmark is based on features detected by ORB or AKAZE and edges information on each possible landmark. Results have shown that disposition of possible landmarks is quite different from the human perception.

Keywords: Clustering, edges, feature points, landmark selection, X-Means.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 817
1025 Fuzzy Types Clustering for Microarray Data

Authors: Seo Young Kim, Tai Myong Choi

Abstract:

The main goal of microarray experiments is to quantify the expression of every object on a slide as precisely as possible, with a further goal of clustering the objects. Recently, many studies have discussed clustering issues involving similar patterns of gene expression. This paper presents an application of fuzzy-type methods for clustering DNA microarray data that can be applied to typical comparisons. Clustering and analyses were performed on microarray and simulated data. The results show that fuzzy-possibility c-means clustering substantially improves the findings obtained by others.

Keywords: Clustering, microarray data, Fuzzy-type clustering, Validation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1520
1024 Automatic LV Segmentation with K-means Clustering and Graph Searching on Cardiac MRI

Authors: Hae-Yeoun Lee

Abstract:

Quantification of cardiac function is performed by calculating blood volume and ejection fraction in routine clinical practice. However, these works have been performed by manual contouring, which requires computational costs and varies on the observer. In this paper, an automatic left ventricle segmentation algorithm on cardiac magnetic resonance images (MRI) is presented. Using knowledge on cardiac MRI, a K-mean clustering technique is applied to segment blood region on a coil-sensitivity corrected image. Then, a graph searching technique is used to correct segmentation errors from coil distortion and noises. Finally, blood volume and ejection fraction are calculated. Using cardiac MRI from 15 subjects, the presented algorithm is tested and compared with manual contouring by experts to show outstanding performance.

Keywords: Cardiac MRI, Graph searching, Left ventricle segmentation, K-means clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2094
1023 Similarity Measures and Weighted Fuzzy C-Mean Clustering Algorithm

Authors: Bainian Li, Kongsheng Zhang, Jian Xu

Abstract:

In this paper we study the fuzzy c-mean clustering algorithm combined with principal components method. Demonstratively analysis indicate that the new clustering method is well rather than some clustering algorithms. We also consider the validity of clustering method.

Keywords: FCM algorithm, Principal Components Analysis, Clustervalidity

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1723
1022 Laser Data Based Automatic Generation of Lane-Level Road Map for Intelligent Vehicles

Authors: Zehai Yu, Hui Zhu, Linglong Lin, Huawei Liang, Biao Yu, Weixin Huang

Abstract:

With the development of intelligent vehicle systems, a high-precision road map is increasingly needed in many aspects. The automatic lane lines extraction and modeling are the most essential steps for the generation of a precise lane-level road map. In this paper, an automatic lane-level road map generation system is proposed. To extract the road markings on the ground, the multi-region Otsu thresholding method is applied, which calculates the intensity value of laser data that maximizes the variance between background and road markings. The extracted road marking points are then projected to the raster image and clustered using a two-stage clustering algorithm. Lane lines are subsequently recognized from these clusters by the shape features of their minimum bounding rectangle. To ensure the storage efficiency of the map, the lane lines are approximated to cubic polynomial curves using a Bayesian estimation approach. The proposed lane-level road map generation system has been tested on urban and expressway conditions in Hefei, China. The experimental results on the datasets show that our method can achieve excellent extraction and clustering effect, and the fitted lines can reach a high position accuracy with an error of less than 10 cm.

Keywords: Curve fitting, lane-level road map, line recognition, multi-thresholding, two-stage clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 512
1021 Grid-based Supervised Clustering - GBSC

Authors: Pornpimol Bungkomkhun, Surapong Auwatanamongkol

Abstract:

This paper presents a supervised clustering algorithm, namely Grid-Based Supervised Clustering (GBSC), which is able to identify clusters of any shapes and sizes without presuming any canonical form for data distribution. The GBSC needs no prespecified number of clusters, is insensitive to the order of the input data objects, and is capable of handling outliers. Built on the combination of grid-based clustering and density-based clustering, under the assistance of the downward closure property of density used in bottom-up subspace clustering, the GBSC can notably reduce its search space to avoid the memory confinement situation during its execution. On two-dimension synthetic datasets, the GBSC can identify clusters with different shapes and sizes correctly. The GBSC also outperforms other five supervised clustering algorithms when the experiments are performed on some UCI datasets.

Keywords: supervised clustering, grid-based clustering, subspace clustering

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1609
1020 Exponential Particle Swarm Optimization Approach for Improving Data Clustering

Authors: Neveen I. Ghali, Nahed El-Dessouki, Mervat A. N., Lamiaa Bakrawi

Abstract:

In this paper we use exponential particle swarm optimization (EPSO) to cluster data. Then we compare between (EPSO) clustering algorithm which depends on exponential variation for the inertia weight and particle swarm optimization (PSO) clustering algorithm which depends on linear inertia weight. This comparison is evaluated on five data sets. The experimental results show that EPSO clustering algorithm increases the possibility to find the optimal positions as it decrease the number of failure. Also show that (EPSO) clustering algorithm has a smaller quantization error than (PSO) clustering algorithm, i.e. (EPSO) clustering algorithm more accurate than (PSO) clustering algorithm.

Keywords: Particle swarm optimization, data clustering, exponential PSO.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1689
1019 A Comparison of Fuzzy Clustering Algorithms to Cluster Web Messages

Authors: Sara El Manar El Bouanani, Ismail Kassou

Abstract:

Our objective in this paper is to propose an approach capable of clustering web messages. The clustering is carried out by assigning, with a certain probability, texts written by the same web user to the same cluster based on Stylometric features and using fuzzy clustering algorithms. Focus in the present work is on comparing the most popular algorithms in fuzzy clustering theory namely, Fuzzy C-means, Possibilistic C-means and Fuzzy Possibilistic C-Means.

Keywords: Authorship detection, fuzzy clustering, profiling, stylometric features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2052
1018 Performance Comparison of Particle Swarm Optimization with Traditional Clustering Algorithms used in Self-Organizing Map

Authors: Anurag Sharma, Christian W. Omlin

Abstract:

Self-organizing map (SOM) is a well known data reduction technique used in data mining. It can reveal structure in data sets through data visualization that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOM, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of an adaptive heuristic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOM. The application of our method to several standard data sets demonstrates its feasibility. PSO algorithm utilizes a so-called U-matrix of SOM to determine cluster boundaries; the results of this novel automatic method compare very favorably to boundary detection through traditional algorithms namely k-means and hierarchical based approach which are normally used to interpret the output of SOM.

Keywords: cluster boundaries, clustering, code vectors, data mining, particle swarm optimization, self-organizing maps, U-matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1908
1017 Analysis of Diverse Clustering Tools in Data Mining

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Clustering in data mining is an unsupervised learning technique of aggregating the data objects into meaningful groups such that the intra cluster similarity of objects are maximized and inter cluster similarity of objects are minimized. Over the past decades several clustering tools were emerged in which clustering algorithms are inbuilt and are easier to use and extract the expected results. Data mining mainly deals with the huge databases that inflicts on cluster analysis and additional rigorous computational constraints. These challenges pave the way for the emergence of powerful expansive data mining clustering softwares. In this survey, a variety of clustering tools used in data mining are elucidated along with the pros and cons of each software.

Keywords: Cluster Analysis, Clustering Algorithms, Clustering Techniques, Association, Visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2201
1016 Hierarchical Clustering Algorithms in Data Mining

Authors: Z. Abdullah, A. R. Hamdan

Abstract:

Clustering is a process of grouping objects and data into groups of clusters to ensure that data objects from the same cluster are identical to each other. Clustering algorithms in one of the area in data mining and it can be classified into partition, hierarchical, density based and grid based. Therefore, in this paper we do survey and review four major hierarchical clustering algorithms called CURE, ROCK, CHAMELEON and BIRCH. The obtained state of the art of these algorithms will help in eliminating the current problems as well as deriving more robust and scalable algorithms for clustering.

Keywords: Clustering, method, algorithm, hierarchical, survey.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3376
1015 Knowledge Representation Based On Interval Type-2 CFCM Clustering

Authors: Myung-Won Lee, Keun-Chang Kwak

Abstract:

This paper is concerned with knowledge representation and extraction of fuzzy if-then rules using Interval Type-2 Context-based Fuzzy C-Means clustering (IT2-CFCM) with the aid of fuzzy granulation. This proposed clustering algorithm is based on information granulation in the form of IT2 based Fuzzy C-Means (IT2-FCM) clustering and estimates the cluster centers by preserving the homogeneity between the clustered patterns from the IT2 contexts produced in the output space. Furthermore, we can obtain the automatic knowledge representation in the design of Radial Basis Function Networks (RBFN), Linguistic Model (LM), and Adaptive Neuro-Fuzzy Networks (ANFN) from the numerical input-output data pairs. We shall focus on a design of ANFN in this paper. The experimental results on an estimation problem of energy performance reveal that the proposed method showed a good knowledge representation and performance in comparison with the previous works.

Keywords: IT2-FCM, IT2-CFCM, context-based fuzzy clustering, adaptive neuro-fuzzy network, knowledge representation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2616
1014 A Survey: Clustering Ensembles Techniques

Authors: Reza Ghaemi , Md. Nasir Sulaiman , Hamidah Ibrahim , Norwati Mustapha

Abstract:

The clustering ensembles combine multiple partitions generated by different clustering algorithms into a single clustering solution. Clustering ensembles have emerged as a prominent method for improving robustness, stability and accuracy of unsupervised classification solutions. So far, many contributions have been done to find consensus clustering. One of the major problems in clustering ensembles is the consensus function. In this paper, firstly, we introduce clustering ensembles, representation of multiple partitions, its challenges and present taxonomy of combination algorithms. Secondly, we describe consensus functions in clustering ensembles including Hypergraph partitioning, Voting approach, Mutual information, Co-association based functions and Finite mixture model, and next explain their advantages, disadvantages and computational complexity. Finally, we compare the characteristics of clustering ensembles algorithms such as computational complexity, robustness, simplicity and accuracy on different datasets in previous techniques.

Keywords: Clustering Ensembles, Combinational Algorithm, Consensus Function, Unsupervised Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3448
1013 Object-Based Image Indexing and Retrieval in DCT Domain using Clustering Techniques

Authors: Hossein Nezamabadi-pour, Saeid Saryazdi

Abstract:

In this paper, we present a new and effective image indexing technique that extracts features directly from DCT domain. Our proposed approach is an object-based image indexing. For each block of size 8*8 in DCT domain a feature vector is extracted. Then, feature vectors of all blocks of image using a k-means algorithm is clustered into groups. Each cluster represents a special object of the image. Then we select some clusters that have largest members after clustering. The centroids of the selected clusters are taken as image feature vectors and indexed into the database. Also, we propose an approach for using of proposed image indexing method in automatic image classification. Experimental results on a database of 800 images from 8 semantic groups in automatic image classification are reported.

Keywords: Object-based image retrieval, DCT domain, Image indexing, Image classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2024
1012 Spatial Clustering Model of Vessel Trajectory to Extract Sailing Routes Based on AIS Data

Authors: Lubna Eljabu, Mohammad Etemad, Stan Matwin

Abstract:

The automatic extraction of shipping routes is advantageous for intelligent traffic management systems to identify events and support decision-making in maritime surveillance. At present, there is a high demand for the extraction of maritime traffic networks that resemble the real traffic of vessels accurately, which is valuable for further analytical processing tasks for vessels trajectories (e.g., naval routing and voyage planning, anomaly detection, destination prediction, time of arrival estimation). With the help of big data and processing huge amounts of vessels’ trajectory data, it is possible to learn these shipping routes from the navigation history of past behaviour of other, similar ships that were travelling in a given area. In this paper, we propose a spatial clustering model of vessels’ trajectories (SPTCLUST) to extract spatial representations of sailing routes from historical Automatic Identification System (AIS) data. The whole model consists of three main parts: data preprocessing, path finding, and route extraction, which consists of clustering and representative trajectory extraction. The proposed clustering method provides techniques to overcome the problems of: (i) optimal input parameters selection; (ii) the high complexity of processing a huge volume of multidimensional data; (iii) and the spatial representation of complete representative trajectory detection in the context of trajectory clustering algorithms. The experimental evaluation showed the effectiveness of the proposed model by using a real-world AIS dataset from the Port of Halifax. The results contribute to further understanding of shipping route patterns. This could aid surveillance authorities in stable and sustainable vessel traffic management.

Keywords: Vessel trajectory clustering, trajectory mining, Spatial Clustering, marine intelligent navigation, maritime traffic network extraction, sdailing routes extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 453
1011 Ontology-based Concept Weighting for Text Documents

Authors: Hmway Hmway Tar, Thi Thi Soe Nyaunt

Abstract:

Documents clustering become an essential technology with the popularity of the Internet. That also means that fast and high-quality document clustering technique play core topics. Text clustering or shortly clustering is about discovering semantically related groups in an unstructured collection of documents. Clustering has been very popular for a long time because it provides unique ways of digesting and generalizing large amounts of information. One of the issues of clustering is to extract proper feature (concept) of a problem domain. The existing clustering technology mainly focuses on term weight calculation. To achieve more accurate document clustering, more informative features including concept weight are important. Feature Selection is important for clustering process because some of the irrelevant or redundant feature may misguide the clustering results. To counteract this issue, the proposed system presents the concept weight for text clustering system developed based on a k-means algorithm in accordance with the principles of ontology so that the important of words of a cluster can be identified by the weight values. To a certain extent, it has resolved the semantic problem in specific areas.

Keywords: Clustering, Concept Weight, Document clustering, Feature Selection, Ontology

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2404
1010 Journey on Image Clustering Based on Color Composition

Authors: Achmad Nizar Hidayanto, Elisabeth Martha Koeanan

Abstract:

Image clustering is a process of grouping images based on their similarity. The image clustering usually uses the color component, texture, edge, shape, or mixture of two components, etc. This research aims to explore image clustering using color composition. In order to complete this image clustering, three main components should be considered, which are color space, image representation (feature extraction), and clustering method itself. We aim to explore which composition of these factors will produce the best clustering results by combining various techniques from the three components. The color spaces use RGB, HSV, and L*a*b* method. The image representations use Histogram and Gaussian Mixture Model (GMM), whereas the clustering methods use KMeans and Agglomerative Hierarchical Clustering algorithm. The results of the experiment show that GMM representation is better combined with RGB and L*a*b* color space, whereas Histogram is better combined with HSV. The experiments also show that K-Means is better than Agglomerative Hierarchical for images clustering.

Keywords: Image clustering, feature extraction, RGB, HSV, L*a*b*, Gaussian Mixture Model (GMM), histogram, Agglomerative Hierarchical Clustering (AHC), K-Means, Expectation-Maximization (EM).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2205
1009 Sample-Weighted Fuzzy Clustering with Regularizations

Authors: Miin-Shen Yang, Yee-Shan Pan

Abstract:

Although there have been many researches in cluster analysis to consider on feature weights, little effort is made on sample weights. Recently, Yu et al. (2011) considered a probability distribution over a data set to represent its sample weights and then proposed sample-weighted clustering algorithms. In this paper, we give a sample-weighted version of generalized fuzzy clustering regularization (GFCR), called the sample-weighted GFCR (SW-GFCR). Some experiments are considered. These experimental results and comparisons demonstrate that the proposed SW-GFCR is more effective than the most clustering algorithms.

Keywords: Clustering; fuzzy c-means, fuzzy clustering, sample weights, regularization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1765
1008 Application of a New Hybrid Optimization Algorithm on Cluster Analysis

Authors: T. Niknam, M. Nayeripour, B.Bahmani Firouzi

Abstract:

Clustering techniques have received attention in many areas including engineering, medicine, biology and data mining. The purpose of clustering is to group together data points, which are close to one another. The K-means algorithm is one of the most widely used techniques for clustering. However, K-means has two shortcomings: dependency on the initial state and convergence to local optima and global solutions of large problems cannot found with reasonable amount of computation effort. In order to overcome local optima problem lots of studies done in clustering. This paper is presented an efficient hybrid evolutionary optimization algorithm based on combining Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO), called PSO-ACO, for optimally clustering N object into K clusters. The new PSO-ACO algorithm is tested on several data sets, and its performance is compared with those of ACO, PSO and K-means clustering. The simulation results show that the proposed evolutionary optimization algorithm is robust and suitable for handing data clustering.

Keywords: Ant Colony Optimization (ACO), Data clustering, Hybrid evolutionary optimization algorithm, K-means clustering, Particle Swarm Optimization (PSO).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2197
1007 A Similarity Measure for Clustering and its Applications

Authors: Guadalupe J. Torres, Ram B. Basnet, Andrew H. Sung, Srinivas Mukkamala, Bernardete M. Ribeiro

Abstract:

This paper introduces a measure of similarity between two clusterings of the same dataset produced by two different algorithms, or even the same algorithm (K-means, for instance, with different initializations usually produce different results in clustering the same dataset). We then apply the measure to calculate the similarity between pairs of clusterings, with special interest directed at comparing the similarity between various machine clusterings and human clustering of datasets. The similarity measure thus can be used to identify the best (in terms of most similar to human) clustering algorithm for a specific problem at hand. Experimental results pertaining to the text categorization problem of a Portuguese corpus (wherein a translation-into-English approach is used) are presented, as well as results on the well-known benchmark IRIS dataset. The significance and other potential applications of the proposed measure are discussed.

Keywords: Clustering Algorithms, Clustering Applications, Similarity Measures, Text Clustering

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1570
1006 Determining Cluster Boundaries Using Particle Swarm Optimization

Authors: Anurag Sharma, Christian W. Omlin

Abstract:

Self-organizing map (SOM) is a well known data reduction technique used in data mining. Data visualization can reveal structure in data sets that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOMs, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of a generic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOMs. The application of our method to unlabeled call data for a mobile phone operator demonstrates its feasibility. PSO algorithm utilizes U-matrix of SOMs to determine cluster boundaries; the results of this novel automatic method correspond well to boundary detection through visual inspection of code vectors and k-means algorithm.

Keywords: Particle swarm optimization, self-organizing maps, clustering, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1717
1005 Clustering in WSN Based on Minimum Spanning Tree Using Divide and Conquer Approach

Authors: Uttam Vijay, Nitin Gupta

Abstract:

Due to heavy energy constraints in WSNs clustering is an efficient way to manage the energy in sensors. There are many methods already proposed in the area of clustering and research is still going on to make clustering more energy efficient. In our paper we are proposing a minimum spanning tree based clustering using divide and conquer approach. The MST based clustering was first proposed in 1970’s for large databases. Here we are taking divide and conquer approach and implementing it for wireless sensor networks with the constraints attached to the sensor networks. This Divide and conquer approach is implemented in a way that we don’t have to construct the whole MST before clustering but we just find the edge which will be the part of the MST to a corresponding graph and divide the graph in clusters there itself if that edge from the graph can be removed judging on certain constraints and hence saving lot of computation.

Keywords: Algorithm, Clustering, Edge-Weighted Graph, Weighted-LEACH.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2474
1004 Minimal Spanning Tree based Fuzzy Clustering

Authors: Ágnes Vathy-Fogarassy, Balázs Feil, János Abonyi

Abstract:

Most of fuzzy clustering algorithms have some discrepancies, e.g. they are not able to detect clusters with convex shapes, the number of the clusters should be a priori known, they suffer from numerical problems, like sensitiveness to the initialization, etc. This paper studies the synergistic combination of the hierarchical and graph theoretic minimal spanning tree based clustering algorithm with the partitional Gath-Geva fuzzy clustering algorithm. The aim of this hybridization is to increase the robustness and consistency of the clustering results and to decrease the number of the heuristically defined parameters of these algorithms to decrease the influence of the user on the clustering results. For the analysis of the resulted fuzzy clusters a new fuzzy similarity measure based tool has been presented. The calculated similarities of the clusters can be used for the hierarchical clustering of the resulted fuzzy clusters, which information is useful for cluster merging and for the visualization of the clustering results. As the examples used for the illustration of the operation of the new algorithm will show, the proposed algorithm can detect clusters from data with arbitrary shape and does not suffer from the numerical problems of the classical Gath-Geva fuzzy clustering algorithm.

Keywords: Clustering, fuzzy clustering, minimal spanning tree, cluster validity, fuzzy similarity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2405
1003 A New Hybrid RMN Image Segmentation Algorithm

Authors: Abdelouahab Moussaoui, Nabila Ferahta, Victor Chen

Abstract:

The development of aid's systems for the medical diagnosis is not easy thing because of presence of inhomogeneities in the MRI, the variability of the data from a sequence to the other as well as of other different source distortions that accentuate this difficulty. A new automatic, contextual, adaptive and robust segmentation procedure by MRI brain tissue classification is described in this article. A first phase consists in estimating the density of probability of the data by the Parzen-Rozenblatt method. The classification procedure is completely automatic and doesn't make any assumptions nor on the clusters number nor on the prototypes of these clusters since these last are detected in an automatic manner by an operator of mathematical morphology called skeleton by influence zones detection (SKIZ). The problem of initialization of the prototypes as well as their number is transformed in an optimization problem; in more the procedure is adaptive since it takes in consideration the contextual information presents in every voxel by an adaptive and robust non parametric model by the Markov fields (MF). The number of bad classifications is reduced by the use of the criteria of MPM minimization (Maximum Posterior Marginal).

Keywords: Clustering, Automatic Classification, SKIZ, MarkovFields, Image segmentation, Maximum Posterior Marginal (MPM).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1411
1002 Using Data Clustering in Oral Medicine

Authors: Fahad Shahbaz Khan, Rao Muhammad Anwer, Olof Torgersson

Abstract:

The vast amount of information hidden in huge databases has created tremendous interests in the field of data mining. This paper examines the possibility of using data clustering techniques in oral medicine to identify functional relationships between different attributes and classification of similar patient examinations. Commonly used data clustering algorithms have been reviewed and as a result several interesting results have been gathered.

Keywords: Oral Medicine, Cluto, Data Clustering, Data Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1976
1001 Dynamic Model of Automatic Loom on SimulationX

Authors: A. Jomartov, A. Tuleshov, B. Tultaev

Abstract:

One of the main tasks in the development of textile machinery is to increase the rapidity of automatic looms, and consequently, their productivity. With increasing automatic loom speeds, the dynamic loads on their separate mechanisms and moving joints sharply increase. Dynamic research allows us to determine the weakest mechanisms of the automatic loom. The modern automatic loom consists of a large number of structurally different mechanisms. These are cam, lever, gear, friction and combined cyclic mechanisms. The modern automatic loom contains various mechatronic devices: A device for the automatic removal of faulty weft, electromechanical drive warp yarns, electronic controllers, servos, etc. In the paper, we consider the multibody dynamic model of the automatic loom on the software complex SimulationX. SimulationX is multidisciplinary software for modeling complex physical and technical facilities and systems. The multibody dynamic model of the automatic loom allows consideration of: The transition processes, backlash at the joints and nodes, the force of resistance and electric motor performance.

Keywords: Automatic loom, dynamics, model, multibody, SimulationX.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1494
1000 A Genetic Algorithm for Clustering on Image Data

Authors: Qin Ding, Jim Gasvoda

Abstract:

Clustering is the process of subdividing an input data set into a desired number of subgroups so that members of the same subgroup are similar and members of different subgroups have diverse properties. Many heuristic algorithms have been applied to the clustering problem, which is known to be NP Hard. Genetic algorithms have been used in a wide variety of fields to perform clustering, however, the technique normally has a long running time in terms of input set size. This paper proposes an efficient genetic algorithm for clustering on very large data sets, especially on image data sets. The genetic algorithm uses the most time efficient techniques along with preprocessing of the input data set. We test our algorithm on both artificial and real image data sets, both of which are of large size. The experimental results show that our algorithm outperforms the k-means algorithm in terms of running time as well as the quality of the clustering.

Keywords: Clustering, data mining, genetic algorithm, image data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2051