Search results for: clustering comparison.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5681

Search results for: clustering comparison.

5681 Mitigating the Negative Effect of Intrabrand Clustering: The Role of Interbrand Clustering and Firm Size

Authors: Moeen Naseer Butt

Abstract:

Clustering –geographic concentrations of entities– has recently received more attention in marketing research and has been shown to affect multiple outcomes. This study investigates the impact of intrabrand clustering (clustering of same-brand outlets) on an outlet’s quality performance. Further, it assesses the moderating effects of interbrand clustering (clustering of other-brand outlets) and firm size. An examination of approximately 21,000 food service establishments in New York State in 2019 finds that the impact of intrabrand clustering on an outlet’s quality performance is context-dependent. Specifically, intrabrand clustering decreases, whereas interbrand clustering and firm size help increase the outlet’s performance. Additionally, this study finds that the role of firm size is more substantial than interbrand clustering in mitigating the adverse effects of intrabrand clustering on outlet quality performance.

Keywords: intraband clustering, interbrand clustering, firm size, brand competition, outlet performance, quality violations

Procedia PDF Downloads 188
5680 Hierarchical Clustering Algorithms in Data Mining

Authors: Z. Abdullah, A. R. Hamdan

Abstract:

Clustering is a process of grouping objects and data into groups of clusters to ensure that data objects from the same cluster are identical to each other. Clustering algorithms in one of the areas in data mining and it can be classified into partition, hierarchical, density based, and grid-based. Therefore, in this paper, we do a survey and review for four major hierarchical clustering algorithms called CURE, ROCK, CHAMELEON, and BIRCH. The obtained state of the art of these algorithms will help in eliminating the current problems, as well as deriving more robust and scalable algorithms for clustering.

Keywords: clustering, unsupervised learning, algorithms, hierarchical

Procedia PDF Downloads 885
5679 Hybrid Hierarchical Clustering Approach for Community Detection in Social Network

Authors: Radhia Toujani, Jalel Akaichi

Abstract:

Social Networks generally present a hierarchy of communities. To determine these communities and the relationship between them, detection algorithms should be applied. Most of the existing algorithms, proposed for hierarchical communities identification, are based on either agglomerative clustering or divisive clustering. In this paper, we present a hybrid hierarchical clustering approach for community detection based on both bottom-up and bottom-down clustering. Obviously, our approach provides more relevant community structure than hierarchical method which considers only divisive or agglomerative clustering to identify communities. Moreover, we performed some comparative experiments to enhance the quality of the clustering results and to show the effectiveness of our algorithm.

Keywords: agglomerative hierarchical clustering, community structure, divisive hierarchical clustering, hybrid hierarchical clustering, opinion mining, social network, social network analysis

Procedia PDF Downloads 365
5678 Performance Analysis of Hierarchical Agglomerative Clustering in a Wireless Sensor Network Using Quantitative Data

Authors: Tapan Jain, Davender Singh Saini

Abstract:

Clustering is a useful mechanism in wireless sensor networks which helps to cope with scalability and data transmission problems. The basic aim of our research work is to provide efficient clustering using Hierarchical agglomerative clustering (HAC). If the distance between the sensing nodes is calculated using their location then it’s quantitative HAC. This paper compares the various agglomerative clustering techniques applied in a wireless sensor network using the quantitative data. The simulations are done in MATLAB and the comparisons are made between the different protocols using dendrograms.

Keywords: routing, hierarchical clustering, agglomerative, quantitative, wireless sensor network

Procedia PDF Downloads 615
5677 Clustering of Extremes in Financial Returns: A Comparison between Developed and Emerging Markets

Authors: Sara Ali Alokley, Mansour Saleh Albarrak

Abstract:

This paper investigates the dependency or clustering of extremes in the financial returns data by estimating the extremal index value θ∈[0,1]. The smaller the value of θ the more clustering we have. Here we apply the method of Ferro and Segers (2003) to estimate the extremal index for a range of threshold values. We compare the dependency structure of extremes in the developed and emerging markets. We use the financial returns of the stock market index in the developed markets of US, UK, France, Germany and Japan and the emerging markets of Brazil, Russia, India, China and Saudi Arabia. We expect that more clustering occurs in the emerging markets. This study will help to understand the dependency structure of the financial returns data.

Keywords: clustring, extremes, returns, dependency, extermal index

Procedia PDF Downloads 405
5676 Flowing Online Vehicle GPS Data Clustering Using a New Parallel K-Means Algorithm

Authors: Orhun Vural, Oguz Bayat, Rustu Akay, Osman N. Ucan

Abstract:

This study presents a new parallel approach clustering of GPS data. Evaluation has been made by comparing execution time of various clustering algorithms on GPS data. This paper aims to propose a parallel based on neighborhood K-means algorithm to make it faster. The proposed parallelization approach assumes that each GPS data represents a vehicle and to communicate between vehicles close to each other after vehicles are clustered. This parallelization approach has been examined on different sized continuously changing GPS data and compared with serial K-means algorithm and other serial clustering algorithms. The results demonstrated that proposed parallel K-means algorithm has been shown to work much faster than other clustering algorithms.

Keywords: parallel k-means algorithm, parallel clustering, clustering algorithms, clustering on flowing data

Procedia PDF Downloads 222
5675 Time Series Regression with Meta-Clusters

Authors: Monika Chuchro

Abstract:

This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was performed as a method to obtain a subgroups of time series data with normal distribution from inflow into waste water treatment plant data which Composed of several groups differing by mean value. Two simple algorithms: K-mean and EM were chosen as a clustering method. The rand index was used to measure the similarity. After simple meta-clustering, regression model was performed for each subgroups. The final model was a sum of subgroups models. The quality of obtained model was compared with the regression model made using the same explanatory variables but with no clustering of data. Results were compared by determination coefficient (R2), measure of prediction accuracy mean absolute percentage error (MAPE) and comparison on linear chart. Preliminary results allows to foresee the potential of the presented technique.

Keywords: clustering, data analysis, data mining, predictive models

Procedia PDF Downloads 466
5674 Semi-Supervised Hierarchical Clustering Given a Reference Tree of Labeled Documents

Authors: Ying Zhao, Xingyan Bin

Abstract:

Semi-supervised clustering algorithms have been shown effective to improve clustering process with even limited supervision. However, semi-supervised hierarchical clustering remains challenging due to the complexities of expressing constraints for agglomerative clustering algorithms. This paper proposes novel semi-supervised agglomerative clustering algorithms to build a hierarchy based on a known reference tree. We prove that by enforcing distance constraints defined by a reference tree during the process of hierarchical clustering, the resultant tree is guaranteed to be consistent with the reference tree. We also propose a framework that allows the hierarchical tree generation be aware of levels of levels of the agglomerative tree under creation, so that metric weights can be learned and adopted at each level in a recursive fashion. The experimental evaluation shows that the additional cost of our contraint-based semi-supervised hierarchical clustering algorithm (HAC) is negligible, and our combined semi-supervised HAC algorithm outperforms the state-of-the-art algorithms on real-world datasets. The experiments also show that our proposed methods can improve clustering performance even with a small number of unevenly distributed labeled data.

Keywords: semi-supervised clustering, hierarchical agglomerative clustering, reference trees, distance constraints

Procedia PDF Downloads 547
5673 A Clustering-Sequencing Approach to the Facility Layout Problem

Authors: Saeideh Salimpour, Sophie-Charlotte Viaux, Ahmed Azab, Mohammed Fazle Baki

Abstract:

The Facility Layout Problem (FLP) is key to the efficient and cost-effective operation of a system. This paper presents a hybrid heuristic- and mathematical-programming-based approach that divides the problem conceptually into those of clustering and sequencing. First, clusters of vertically aligned facilities are formed, which are later on sequenced horizontally. The developed methodology provides promising results in comparison to its counterparts in the literature by minimizing the inter-distances for facilities which have more interactions amongst each other and aims at placing the facilities with more interactions at the centroid of the shop.

Keywords: clustering-sequencing approach, mathematical modeling, optimization, unequal facility layout problem

Procedia PDF Downloads 333
5672 Fuzzy Optimization Multi-Objective Clustering Ensemble Model for Multi-Source Data Analysis

Authors: C. B. Le, V. N. Pham

Abstract:

In modern data analysis, multi-source data appears more and more in real applications. Multi-source data clustering has emerged as a important issue in the data mining and machine learning community. Different data sources provide information about different data. Therefore, multi-source data linking is essential to improve clustering performance. However, in practice multi-source data is often heterogeneous, uncertain, and large. This issue is considered a major challenge from multi-source data. Ensemble is a versatile machine learning model in which learning techniques can work in parallel, with big data. Clustering ensemble has been shown to outperform any standard clustering algorithm in terms of accuracy and robustness. However, most of the traditional clustering ensemble approaches are based on single-objective function and single-source data. This paper proposes a new clustering ensemble method for multi-source data analysis. The fuzzy optimized multi-objective clustering ensemble method is called FOMOCE. Firstly, a clustering ensemble mathematical model based on the structure of multi-objective clustering function, multi-source data, and dark knowledge is introduced. Then, rules for extracting dark knowledge from the input data, clustering algorithms, and base clusterings are designed and applied. Finally, a clustering ensemble algorithm is proposed for multi-source data analysis. The experiments were performed on the standard sample data set. The experimental results demonstrate the superior performance of the FOMOCE method compared to the existing clustering ensemble methods and multi-source clustering methods.

Keywords: clustering ensemble, multi-source, multi-objective, fuzzy clustering

Procedia PDF Downloads 189
5671 Knowledge Representation Based on Interval Type-2 CFCM Clustering

Authors: Lee Myung-Won, Kwak Keun-Chang

Abstract:

This paper is concerned with knowledge representation and extraction of fuzzy if-then rules using Interval Type-2 Context-based Fuzzy C-Means clustering (IT2-CFCM) with the aid of fuzzy granulation. This proposed clustering algorithm is based on information granulation in the form of IT2 based Fuzzy C-Means (IT2-FCM) clustering and estimates the cluster centers by preserving the homogeneity between the clustered patterns from the IT2 contexts produced in the output space. Furthermore, we can obtain the automatic knowledge representation in the design of Radial Basis Function Networks (RBFN), Linguistic Model (LM), and Adaptive Neuro-Fuzzy Networks (ANFN) from the numerical input-output data pairs. We shall focus on a design of ANFN in this paper. The experimental results on an estimation problem of energy performance reveal that the proposed method showed a good knowledge representation and performance in comparison with the previous works.

Keywords: IT2-FCM, IT2-CFCM, context-based fuzzy clustering, adaptive neuro-fuzzy network, knowledge representation

Procedia PDF Downloads 322
5670 ACOPIN: An ACO Algorithm with TSP Approach for Clustering Proteins in Protein Interaction Networks

Authors: Jamaludin Sallim, Rozlina Mohamed, Roslina Abdul Hamid

Abstract:

In this paper, we proposed an Ant Colony Optimization (ACO) algorithm together with Traveling Salesman Problem (TSP) approach to investigate the clustering problem in Protein Interaction Networks (PIN). We named this combination as ACOPIN. The purpose of this work is two-fold. First, to test the efficacy of ACO in clustering PIN and second, to propose the simple generalization of the ACO algorithm that might allow its application in clustering proteins in PIN. We split this paper to three main sections. First, we describe the PIN and clustering proteins in PIN. Second, we discuss the steps involved in each phase of ACO algorithm. Finally, we present some results of the investigation with the clustering patterns.

Keywords: ant colony optimization algorithm, searching algorithm, protein functional module, protein interaction network

Procedia PDF Downloads 612
5669 Spectral Clustering for Manufacturing Cell Formation

Authors: Yessica Nataliani, Miin-Shen Yang

Abstract:

Cell formation (CF) is an important step in group technology. It is used in designing cellular manufacturing systems using similarities between parts in relation to machines so that it can identify part families and machine groups. There are many CF methods in the literature, but there is less spectral clustering used in CF. In this paper, we propose a spectral clustering algorithm for machine-part CF. Some experimental examples are used to illustrate its efficiency. Overall, the spectral clustering algorithm can be used in CF with a wide variety of machine/part matrices.

Keywords: group technology, cell formation, spectral clustering, grouping efficiency

Procedia PDF Downloads 408
5668 Feature Weighting Comparison Based on Clustering Centers in the Detection of Diabetic Retinopathy

Authors: Kemal Polat

Abstract:

In this paper, three feature weighting methods have been used to improve the classification performance of diabetic retinopathy (DR). To classify the diabetic retinopathy, features extracted from the output of several retinal image processing algorithms, such as image-level, lesion-specific and anatomical components, have been used and fed them into the classifier algorithms. The dataset used in this study has been taken from University of California, Irvine (UCI) machine learning repository. Feature weighting methods including the fuzzy c-means clustering based feature weighting, subtractive clustering based feature weighting, and Gaussian mixture clustering based feature weighting, have been used and compered with each other in the classification of DR. After feature weighting, five different classifier algorithms comprising multi-layer perceptron (MLP), k- nearest neighbor (k-NN), decision tree, support vector machine (SVM), and Naïve Bayes have been used. The hybrid method based on combination of subtractive clustering based feature weighting and decision tree classifier has been obtained the classification accuracy of 100% in the screening of DR. These results have demonstrated that the proposed hybrid scheme is very promising in the medical data set classification.

Keywords: machine learning, data weighting, classification, data mining

Procedia PDF Downloads 325
5667 Investigation of Clustering Algorithms Used in Wireless Sensor Networks

Authors: Naim Karasekreter, Ugur Fidan, Fatih Basciftci

Abstract:

Wireless sensor networks are networks in which more than one sensor node is organized among themselves. The working principle is based on the transfer of the sensed data over the other nodes in the network to the central station. Wireless sensor networks concentrate on routing algorithms, energy efficiency and clustering algorithms. In the clustering method, the nodes in the network are divided into clusters using different parameters and the most suitable cluster head is selected from among them. The data to be sent to the center is sent per cluster, and the cluster head is transmitted to the center. With this method, the network traffic is reduced and the energy efficiency of the nodes is increased. In this study, clustering algorithms were examined in terms of clustering performances and cluster head selection characteristics to try to identify weak and strong sides. This work is supported by the Project 17.Kariyer.123 of Afyon Kocatepe University BAP Commission.

Keywords: wireless sensor networks (WSN), clustering algorithm, cluster head, clustering

Procedia PDF Downloads 513
5666 A Comparative Study of Multi-SOM Algorithms for Determining the Optimal Number of Clusters

Authors: Imèn Khanchouch, Malika Charrad, Mohamed Limam

Abstract:

The interpretation of the quality of clusters and the determination of the optimal number of clusters is still a crucial problem in clustering. We focus in this paper on multi-SOM clustering method which overcomes the problem of extracting the number of clusters from the SOM map through the use of a clustering validity index. We then tested multi-SOM using real and artificial data sets with different evaluation criteria not used previously such as Davies Bouldin index, Dunn index and silhouette index. The developed multi-SOM algorithm is compared to k-means and Birch methods. Results show that it is more efficient than classical clustering methods.

Keywords: clustering, SOM, multi-SOM, DB index, Dunn index, silhouette index

Procedia PDF Downloads 599
5665 A Fuzzy Kernel K-Medoids Algorithm for Clustering Uncertain Data Objects

Authors: Behnam Tavakkol

Abstract:

Uncertain data mining algorithms use different ways to consider uncertainty in data such as by representing a data object as a sample of points or a probability distribution. Fuzzy methods have long been used for clustering traditional (certain) data objects. They are used to produce non-crisp cluster labels. For uncertain data, however, besides some uncertain fuzzy k-medoids algorithms, not many other fuzzy clustering methods have been developed. In this work, we develop a fuzzy kernel k-medoids algorithm for clustering uncertain data objects. The developed fuzzy kernel k-medoids algorithm is superior to existing fuzzy k-medoids algorithms in clustering data sets with non-linearly separable clusters.

Keywords: clustering algorithm, fuzzy methods, kernel k-medoids, uncertain data

Procedia PDF Downloads 215
5664 GCM Based Fuzzy Clustering to Identify Homogeneous Climatic Regions of North-East India

Authors: Arup K. Sarma, Jayshree Hazarika

Abstract:

The North-eastern part of India, which receives heavier rainfall than other parts of the subcontinent, is of great concern now-a-days with regard to climate change. High intensity rainfall for short duration and longer dry spell, occurring due to impact of climate change, affects river morphology too. In the present study, an attempt is made to delineate the North-Eastern region of India into some homogeneous clusters based on the Fuzzy Clustering concept and to compare the resulting clusters obtained by using conventional methods and non conventional methods of clustering. The concept of clustering is adapted in view of the fact that, impact of climate change can be studied in a homogeneous region without much variation, which can be helpful in studies related to water resources planning and management. 10 IMD (Indian Meteorological Department) stations, situated in various regions of the North-east, have been selected for making the clusters. The results of the Fuzzy C-Means (FCM) analysis show different clustering patterns for different conditions. From the analysis and comparison it can be concluded that non conventional method of using GCM data is somehow giving better results than the others. However, further analysis can be done by taking daily data instead of monthly means to reduce the effect of standardization.

Keywords: climate change, conventional and nonconventional methods of clustering, FCM analysis, homogeneous regions

Procedia PDF Downloads 386
5663 An Experimental Study on Some Conventional and Hybrid Models of Fuzzy Clustering

Authors: Jeugert Kujtila, Kristi Hoxhalli, Ramazan Dalipi, Erjon Cota, Ardit Murati, Erind Bedalli

Abstract:

Clustering is a versatile instrument in the analysis of collections of data providing insights of the underlying structures of the dataset and enhancing the modeling capabilities. The fuzzy approach to the clustering problem increases the flexibility involving the concept of partial memberships (some value in the continuous interval [0, 1]) of the instances in the clusters. Several fuzzy clustering algorithms have been devised like FCM, Gustafson-Kessel, Gath-Geva, kernel-based FCM, PCM etc. Each of these algorithms has its own advantages and drawbacks, so none of these algorithms would be able to perform superiorly in all datasets. In this paper we will experimentally compare FCM, GK, GG algorithm and a hybrid two-stage fuzzy clustering model combining the FCM and Gath-Geva algorithms. Firstly we will theoretically dis-cuss the advantages and drawbacks for each of these algorithms and we will describe the hybrid clustering model exploiting the advantages and diminishing the drawbacks of each algorithm. Secondly we will experimentally compare the accuracy of the hybrid model by applying it on several benchmark and synthetic datasets.

Keywords: fuzzy clustering, fuzzy c-means algorithm (FCM), Gustafson-Kessel algorithm, hybrid clustering model

Procedia PDF Downloads 514
5662 Using Closed Frequent Itemsets for Hierarchical Document Clustering

Authors: Cheng-Jhe Lee, Chiun-Chieh Hsu

Abstract:

Due to the rapid development of the Internet and the increased availability of digital documents, the excessive information on the Internet has led to information overflow problem. In order to solve these problems for effective information retrieval, document clustering in text mining becomes a popular research topic. Clustering is the unsupervised classification of data items into groups without the need of training data. Many conventional document clustering methods perform inefficiently for large document collections because they were originally designed for relational database. Therefore they are impractical in real-world document clustering and require special handling for high dimensionality and high volume. We propose the FIHC (Frequent Itemset-based Hierarchical Clustering) method, which is a hierarchical clustering method developed for document clustering, where the intuition of FIHC is that there exist some common words for each cluster. FIHC uses such words to cluster documents and builds hierarchical topic tree. In this paper, we combine FIHC algorithm with ontology to solve the semantic problem and mine the meaning behind the words in documents. Furthermore, we use the closed frequent itemsets instead of only use frequent itemsets, which increases efficiency and scalability. The experimental results show that our method is more accurate than those of well-known document clustering algorithms.

Keywords: FIHC, documents clustering, ontology, closed frequent itemset

Procedia PDF Downloads 399
5661 Improved K-Means Clustering Algorithm Using RHadoop with Combiner

Authors: Ji Eun Shin, Dong Hoon Lim

Abstract:

Data clustering is a common technique used in data analysis and is used in many applications, such as artificial intelligence, pattern recognition, economics, ecology, psychiatry and marketing. K-means clustering is a well-known clustering algorithm aiming to cluster a set of data points to a predefined number of clusters. In this paper, we implement K-means algorithm based on MapReduce framework with RHadoop to make the clustering method applicable to large scale data. RHadoop is a collection of R packages that allow users to manage and analyze data with Hadoop. The main idea is to introduce a combiner as a function of our map output to decrease the amount of data needed to be processed by reducers. The experimental results demonstrated that K-means algorithm using RHadoop can scale well and efficiently process large data sets on commodity hardware. We also showed that our K-means algorithm using RHadoop with combiner was faster than regular algorithm without combiner as the size of data set increases.

Keywords: big data, combiner, K-means clustering, RHadoop

Procedia PDF Downloads 438
5660 Clustering Based Level Set Evaluation for Low Contrast Images

Authors: Bikshalu Kalagadda, Srikanth Rangu

Abstract:

The important object of images segmentation is to extract objects with respect to some input features. One of the important methods for image segmentation is Level set method. Generally medical images and synthetic images with low contrast of pixel profile, for such images difficult to locate interested features in images. In conventional level set function, develops irregularity during its process of evaluation of contour of objects, this destroy the stability of evolution process. For this problem a remedy is proposed, a new hybrid algorithm is Clustering Level Set Evolution. Kernel fuzzy particles swarm optimization clustering with the Distance Regularized Level Set (DRLS) and Selective Binary, and Gaussian Filtering Regularized Level Set (SBGFRLS) methods are used. The ability of identifying different regions becomes easy with improved speed. Efficiency of the modified method can be evaluated by comparing with the previous method for similar specifications. Comparison can be carried out by considering medical and synthetic images.

Keywords: segmentation, clustering, level set function, re-initialization, Kernel fuzzy, swarm optimization

Procedia PDF Downloads 352
5659 Effect of Bi-Dispersity on Particle Clustering in Sedimentation

Authors: Ali Abbas Zaidi

Abstract:

In free settling or sedimentation, particles form clusters at high Reynolds number and dilute suspensions. It is due to the entrapment of particles in the wakes of upstream particles. In this paper, the effect of bi-dispersity of settling particles on particle clustering is investigated using particle-resolved direct numerical simulation. Immersed boundary method is used for particle fluid interactions and discrete element method is used for particle-particle interactions. The solid volume fraction used in the simulation is 1% and the Reynolds number based on Sauter mean diameter is 350. Both solid volume fraction and Reynolds number lie in the clustering regime of sedimentation. In simulations, the particle diameter ratio (i.e. diameter of larger particle to smaller particle (d₁/d₂)) is varied from 2:1, 3:1 and 4:1. For each case of particle diameter ratio, solid volume fraction for each particle size (φ₁/φ₂) is varied from 1:1, 1:2 and 2:1. For comparison, simulations are also performed for monodisperse particles. For studying particles clustering, radial distribution function and instantaneous location of particles in the computational domain are studied. It is observed that the degree of particle clustering decreases with the increase in the bi-dispersity of settling particles. The smallest degree of particle clustering or dispersion of particles is observed for particles with d₁/d₂ equal to 4:1 and φ₁/φ₂ equal to 1:2. Simulations showed that the reduction in particle clustering by increasing bi-dispersity is due to the difference in settling velocity of particles. Particles with larger size settle faster and knockout the smaller particles from clustered regions of particles in the computational domain.

Keywords: dispersion in bi-disperse settling particles, particle microstructures in bi-disperse suspensions, particle resolved direct numerical simulations, settling of bi-disperse particles

Procedia PDF Downloads 207
5658 Application of Data Mining for Aquifer Environmental Assessment

Authors: Saman Javadi, Mehdi Hashemy, Mohahammad Mahmoodi

Abstract:

Vulnerability maps are employed as an important solution in order to handle entrance of pollution into the aquifers. The common way to provide vulnerability map is DRASTIC. Meanwhile, application of the method is not easy to apply for any aquifer due to choosing appropriate constant values of weights and ranks. In this study, a new approach using k-means clustering is applied to make vulnerability maps. Four features of depth to groundwater, hydraulic conductivity, recharge value and vadose zone were considered at the same time as features of clustering. Five regions are recognized out of the case study represent zones with different level of vulnerability. The finding results show that clustering provides a realistic vulnerability map so that, Pearson’s correlation coefficients between nitrate concentrations and clustering vulnerability is obtained 61%.

Keywords: clustering, data mining, groundwater, vulnerability assessment

Procedia PDF Downloads 603
5657 Energy Efficient Clustering with Adaptive Particle Swarm Optimization

Authors: KumarShashvat, ArshpreetKaur, RajeshKumar, Raman Chadha

Abstract:

Wireless sensor networks have principal characteristic of having restricted energy and with limitation that energy of the nodes cannot be replenished. To increase the lifetime in this scenario WSN route for data transmission is opted such that utilization of energy along the selected route is negligible. For this energy efficient network, dandy infrastructure is needed because it impinges the network lifespan. Clustering is a technique in which nodes are grouped into disjoints and non–overlapping sets. In this technique data is collected at the cluster head. In this paper, Adaptive-PSO algorithm is proposed which forms energy aware clusters by minimizing the cost of locating the cluster head. The main concern is of the suitability of the swarms by adjusting the learning parameters of PSO. Particle Swarm Optimization converges quickly at the beginning stage of the search but during the course of time, it becomes stable and may be trapped in local optima. In suggested network model swarms are given the intelligence of the spiders which makes them capable enough to avoid earlier convergence and also help them to escape from the local optima. Comparison analysis with traditional PSO shows that new algorithm considerably enhances the performance where multi-dimensional functions are taken into consideration.

Keywords: Particle Swarm Optimization, adaptive – PSO, comparison between PSO and A-PSO, energy efficient clustering

Procedia PDF Downloads 246
5656 3D Mesh Coarsening via Uniform Clustering

Authors: Shuhua Lai, Kairui Chen

Abstract:

In this paper, we present a fast and efficient mesh coarsening algorithm for 3D triangular meshes. Theis approach can be applied to very complex 3D meshes of arbitrary topology and with millions of vertices. The algorithm is based on the clustering of the input mesh elements, which divides the faces of an input mesh into a given number of clusters for clustering purpose by approximating the Centroidal Voronoi Tessellation of the input mesh. Once a clustering is achieved, it provides us an efficient way to construct uniform tessellations, and therefore leads to good coarsening of polygonal meshes. With proliferation of 3D scanners, this coarsening algorithm is particularly useful for reverse engineering applications of 3D models, which in many cases are dense, non-uniform, irregular and arbitrary topology. Examples demonstrating effectiveness of the new algorithm are also included in the paper.

Keywords: coarsening, mesh clustering, shape approximation, mesh simplification

Procedia PDF Downloads 380
5655 Multimodal Optimization of Density-Based Clustering Using Collective Animal Behavior Algorithm

Authors: Kristian Bautista, Ruben A. Idoy

Abstract:

A bio-inspired metaheuristic algorithm inspired by the theory of collective animal behavior (CAB) was integrated to density-based clustering modeled as multimodal optimization problem. The algorithm was tested on synthetic, Iris, Glass, Pima and Thyroid data sets in order to measure its effectiveness relative to CDE-based Clustering algorithm. Upon preliminary testing, it was found out that one of the parameter settings used was ineffective in performing clustering when applied to the algorithm prompting the researcher to do an investigation. It was revealed that fine tuning distance δ3 that determines the extent to which a given data point will be clustered helped improve the quality of cluster output. Even though the modification of distance δ3 significantly improved the solution quality and cluster output of the algorithm, results suggest that there is no difference between the population mean of the solutions obtained using the original and modified parameter setting for all data sets. This implies that using either the original or modified parameter setting will not have any effect towards obtaining the best global and local animal positions. Results also suggest that CDE-based clustering algorithm is better than CAB-density clustering algorithm for all data sets. Nevertheless, CAB-density clustering algorithm is still a good clustering algorithm because it has correctly identified the number of classes of some data sets more frequently in a thirty trial run with a much smaller standard deviation, a potential in clustering high dimensional data sets. Thus, the researcher recommends further investigation in the post-processing stage of the algorithm.

Keywords: clustering, metaheuristics, collective animal behavior algorithm, density-based clustering, multimodal optimization

Procedia PDF Downloads 230
5654 Anomaly Detection Based Fuzzy K-Mode Clustering for Categorical Data

Authors: Murat Yazici

Abstract:

Anomalies are irregularities found in data that do not adhere to a well-defined standard of normal behavior. The identification of outliers or anomalies in data has been a subject of study within the statistics field since the 1800s. Over time, a variety of anomaly detection techniques have been developed in several research communities. The cluster analysis can be used to detect anomalies. It is the process of associating data with clusters that are as similar as possible while dissimilar clusters are associated with each other. Many of the traditional cluster algorithms have limitations in dealing with data sets containing categorical properties. To detect anomalies in categorical data, fuzzy clustering approach can be used with its advantages. The fuzzy k-Mode (FKM) clustering algorithm, which is one of the fuzzy clustering approaches, by extension to the k-means algorithm, is reported for clustering datasets with categorical values. It is a form of clustering: each point can be associated with more than one cluster. In this paper, anomaly detection is performed on two simulated data by using the FKM cluster algorithm. As a significance of the study, the FKM cluster algorithm allows to determine anomalies with their abnormality degree in contrast to numerous anomaly detection algorithms. According to the results, the FKM cluster algorithm illustrated good performance in the anomaly detection of data, including both one anomaly and more than one anomaly.

Keywords: fuzzy k-mode clustering, anomaly detection, noise, categorical data

Procedia PDF Downloads 53
5653 Chemical Reaction Algorithm for Expectation Maximization Clustering

Authors: Li Ni, Pen ManMan, Li KenLi

Abstract:

Clustering is an intensive research for some years because of its multifaceted applications, such as biology, information retrieval, medicine, business and so on. The expectation maximization (EM) is a kind of algorithm framework in clustering methods, one of the ten algorithms of machine learning. Traditionally, optimization of objective function has been the standard approach in EM. Hence, research has investigated the utility of evolutionary computing and related techniques in the regard. Chemical Reaction Optimization (CRO) is a recently established method. So the property embedded in CRO is used to solve optimization problems. This paper presents an algorithm framework (EM-CRO) with modified CRO operators based on EM cluster problems. The hybrid algorithm is mainly to solve the problem of initial value sensitivity of the objective function optimization clustering algorithm. Our experiments mainly take the EM classic algorithm:k-means and fuzzy k-means as an example, through the CRO algorithm to optimize its initial value, get K-means-CRO and FKM-CRO algorithm. The experimental results of them show that there is improved efficiency for solving objective function optimization clustering problems.

Keywords: chemical reaction optimization, expection maimization, initia, objective function clustering

Procedia PDF Downloads 715
5652 Self-Supervised Attributed Graph Clustering with Dual Contrastive Loss Constraints

Authors: Lijuan Zhou, Mengqi Wu, Changyong Niu

Abstract:

Attributed graph clustering can utilize the graph topology and node attributes to uncover hidden community structures and patterns in complex networks, aiding in the understanding and analysis of complex systems. Utilizing contrastive learning for attributed graph clustering can effectively exploit meaningful implicit relationships between data. However, existing attributed graph clustering methods based on contrastive learning suffer from the following drawbacks: 1) Complex data augmentation increases computational cost, and inappropriate data augmentation may lead to semantic drift. 2) The selection of positive and negative samples neglects the intrinsic cluster structure learned from graph topology and node attributes. Therefore, this paper proposes a method called self-supervised Attributed Graph Clustering with Dual Contrastive Loss constraints (AGC-DCL). Firstly, Siamese Multilayer Perceptron (MLP) encoders are employed to generate two views separately to avoid complex data augmentation. Secondly, the neighborhood contrastive loss is introduced to constrain node representation using local topological structure while effectively embedding attribute information through attribute reconstruction. Additionally, clustering-oriented contrastive loss is applied to fully utilize clustering information in global semantics for discriminative node representations, regarding the cluster centers from two views as negative samples to fully leverage effective clustering information from different views. Comparative clustering results with existing attributed graph clustering algorithms on six datasets demonstrate the superiority of the proposed method.

Keywords: attributed graph clustering, contrastive learning, clustering-oriented, self-supervised learning

Procedia PDF Downloads 53