Search results for: cluster analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8923

Search results for: cluster analysis

8893 Some Issues with Extension of an HPC Cluster

Authors: Pil Seong Park

Abstract:

Homemade HPC clusters are widely used in many small labs, because they are easy to build and cost-effective. Even though incremental growth is an advantage of clusters, it results in heterogeneous systems anyhow. Instead of adding new nodes to the cluster, we can extend clusters to include some other Internet servers working independently on the same LAN, so that we can make use of their idle times, especially during the night. However extension across a firewall raises some security problems with NFS. In this paper, we propose a method to solve such a problem using SSH tunneling, and suggest a modified structure of the cluster that implements it.

Keywords: Extension of HPC clusters, Security, NFS, SSH tunneling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1898
8892 A New Evolutionary Algorithm for Cluster Analysis

Authors: B.Bahmani Firouzi, T. Niknam, M. Nayeripour

Abstract:

Clustering is a very well known technique in data mining. One of the most widely used clustering techniques is the kmeans algorithm. Solutions obtained from this technique depend on the initialization of cluster centers and the final solution converges to local minima. In order to overcome K-means algorithm shortcomings, this paper proposes a hybrid evolutionary algorithm based on the combination of PSO, SA and K-means algorithms, called PSO-SA-K, which can find better cluster partition. The performance is evaluated through several benchmark data sets. The simulation results show that the proposed algorithm outperforms previous approaches, such as PSO, SA and K-means for partitional clustering problem.

Keywords: Data clustering, Hybrid evolutionary optimization algorithm, K-means algorithm, Simulated Annealing (SA), Particle Swarm Optimization (PSO).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2277
8891 Collocation Assessment between GEO and GSO Satellites

Authors: A. E. Emam, M. Abd Elghany

Abstract:

The change in orbit evolution between collocated satellites (X, Y) inside +/-0.09° E/W and +/- 0.07° N/S cluster, after one of these satellites is placed in an inclined orbit (satellite X) and the effect of this change in the collocation safety inside the cluster window has been studied and evaluated. Several collocation scenarios had been studied in order to adjust the location of both satellites inside their cluster to maximize the separation between them and safe the mission.

Keywords: Satellite, GEO, collocation, risk assessment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2327
8890 An Energy Efficient Cluster Formation Protocol with Low Latency In Wireless Sensor Networks

Authors: A. Allirani, M. Suganthi

Abstract:

Data gathering is an essential operation in wireless sensor network applications. So it requires energy efficiency techniques to increase the lifetime of the network. Similarly, clustering is also an effective technique to improve the energy efficiency and network lifetime of wireless sensor networks. In this paper, an energy efficient cluster formation protocol is proposed with the objective of achieving low energy dissipation and latency without sacrificing application specific quality. The objective is achieved by applying randomized, adaptive, self-configuring cluster formation and localized control for data transfers. It involves application - specific data processing, such as data aggregation or compression. The cluster formation algorithm allows each node to make independent decisions, so as to generate good clusters as the end. Simulation results show that the proposed protocol utilizes minimum energy and latency for cluster formation, there by reducing the overhead of the protocol.

Keywords: Sensor networks, Low latency, Energy sorting protocol, data processing, Cluster formation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2741
8889 The Effects of Different Level Cluster Tip Reduction and Foliar Boric Acid Applications on Yield and Yield Components of Italia Grape Cultivar

Authors: A. Akin

Abstract:

This study was carried out on Italia grape variety (Vitis vinifera L.) in Konya province, Turkey in 2016. The cultivar is five years old and grown on 1103 Paulsen rootstock. It was determined the effects of applications of the Control (C), 1/3 Cluster Tip Reduction (1/3 CTR), 1/6 Cluster Tip Reduction (1/6 CTR), 1/9 Cluster Tip Reduction (1/9 CTR), 1/3 CTR+Boric Acid (BA), 1/6 CTR+BA, 1/9 CTR+BA, on yield and yield components of the Italia grape variety. The results were obtained as the highest fresh grape yield (4.74 g) with 1/9 CTR+BA application; the highest cluster weight (220.08 g) with 1/3 CTR application; the highest 100 berry weight (565.85 g) with 1/9 CTR+BA application; as the highest maturity index (49.28) with 1/9 CTR+BA application; as the highest must yield (685.33 ml/kg) with 1/3 CTR+BA and (685.33 ml/kg) with 1/9 CTR+BA applications. To increase the fresh grape yield, 100 berry weight and maturity index in the Italia grape variety, the 1/9 CTR+BA application can be recommended.

Keywords: Italia grape variety, boric acid, cluster tip reduction, yield, yield components.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 998
8888 Enhanced Clustering Analysis and Visualization Using Kohonen's Self-Organizing Feature Map Networks

Authors: Kasthurirangan Gopalakrishnan, Siddhartha Khaitan, Anshu Manik

Abstract:

Cluster analysis is the name given to a diverse collection of techniques that can be used to classify objects (e.g. individuals, quadrats, species etc). While Kohonen's Self-Organizing Feature Map (SOFM) or Self-Organizing Map (SOM) networks have been successfully applied as a classification tool to various problem domains, including speech recognition, image data compression, image or character recognition, robot control and medical diagnosis, its potential as a robust substitute for clustering analysis remains relatively unresearched. SOM networks combine competitive learning with dimensionality reduction by smoothing the clusters with respect to an a priori grid and provide a powerful tool for data visualization. In this paper, SOM is used for creating a toroidal mapping of two-dimensional lattice to perform cluster analysis on results of a chemical analysis of wines produced in the same region in Italy but derived from three different cultivators, referred to as the “wine recognition data" located in the University of California-Irvine database. The results are encouraging and it is believed that SOM would make an appealing and powerful decision-support system tool for clustering tasks and for data visualization.

Keywords: Artificial neural networks, cluster analysis, Kohonen maps, wine recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2123
8887 Enhancing K-Means Algorithm with Initial Cluster Centers Derived from Data Partitioning along the Data Axis with the Highest Variance

Authors: S. Deelers, S. Auwatanamongkol

Abstract:

In this paper, we propose an algorithm to compute initial cluster centers for K-means clustering. Data in a cell is partitioned using a cutting plane that divides cell in two smaller cells. The plane is perpendicular to the data axis with the highest variance and is designed to reduce the sum squared errors of the two cells as much as possible, while at the same time keep the two cells far apart as possible. Cells are partitioned one at a time until the number of cells equals to the predefined number of clusters, K. The centers of the K cells become the initial cluster centers for K-means. The experimental results suggest that the proposed algorithm is effective, converge to better clustering results than those of the random initialization method. The research also indicated the proposed algorithm would greatly improve the likelihood of every cluster containing some data in it.

Keywords: Clustering algorithm, K-means algorithm, Datapartitioning, Initial cluster centers.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2866
8886 University Ranking Systems – From League Table to Homogeneous Groups of Universities

Authors: M. Jarocka

Abstract:

The paper contains a review of the literature in terms of the critical analysis of methodologies of university ranking systems. Furthermore, the initiatives supported by the European Commission (U-Map, U-Multirank) and CHE Ranking are described. Special attention is paid to the tendencies in the development of ranking systems. According to the author, the ranking organizations should abandon the classic form of ranking, namely a hierarchical ordering of universities from “the best" to “the worse". In the empirical part of this paper, using one of the method of cluster analysis called k-means clustering, the author presents university classifications of the top universities from the Shanghai Jiao Tong University-s (SJTU) Academic Ranking of World Universities (ARWU).

Keywords: Classification, cluster analysis, ranking, university.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2745
8885 Critical Psychosocial Risk Treatment for Engineers and Technicians

Authors: R. Berglund, T. Backström, M. Bellgran

Abstract:

This study explores how management addresses psychosocial risks in seven teams of engineers and technicians in the midst of the fourth industrial revolution. The sample is from an ongoing quasi-experiment about psychosocial risk management in a manufacturing company in Sweden. Each of the seven teams belongs to one of two clusters: a positive cluster or a negative cluster. The positive cluster reports a significantly positive change in psychosocial risk levels between two time-points and the negative cluster reports a significantly negative change. The data are collected using semi-structured interviews. The results of the computer aided thematic analysis show that there are more differences than similarities when comparing the risk treatment actions taken between the two clusters. Findings show that the managers in the positive cluster use more enabling actions that foster and support formal and informal relationship building. In contrast, managers that use less enabling actions hinder the development of positive group processes and contribute negative changes in psychosocial risk levels. This exploratory study sheds some light on how management can influence significant positive and negative changes in psychosocial risk levels during a risk management process.

Keywords: Group process model, risk treatment, risk management, psychosocial.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1025
8884 Optimizing Hadoop Block Placement Policy and Cluster Blocks Distribution

Authors: Nchimbi Edward Pius, Liu Qin, Fion Yang, Zhu Hong Ming

Abstract:

The current Hadoop block placement policy do not fairly and evenly distributes replicas of blocks written to datanodes in a Hadoop cluster.

This paper presents a new solution that helps to keep the cluster in a balanced state while an HDFS client is writing data to a file in Hadoop cluster. The solution had been implemented, and test had been conducted to evaluate its contribution to Hadoop distributed file system.

It has been found that, the solution has lowered global execution time taken by Hadoop balancer to 22 percent. It also has been found that, Hadoop balancer respectively over replicate 1.75 and 3.3 percent of all re-distributed blocks in the modified and original Hadoop clusters.

The feature that keeps the cluster in a balanced state works as a core part to Hadoop system and not just as a utility like traditional balancer. This is one of the significant achievements and uniqueness of the solution developed during the course of this research work.

Keywords: Balancer, Datanode, Distributed file system, Hadoop, Replicas.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4961
8883 Clustering Unstructured Text Documents Using Fading Function

Authors: Pallav Roxy, Durga Toshniwal

Abstract:

Clustering unstructured text documents is an important issue in data mining community and has a number of applications such as document archive filtering, document organization and topic detection and subject tracing. In the real world, some of the already clustered documents may not be of importance while new documents of more significance may evolve. Most of the work done so far in clustering unstructured text documents overlooks this aspect of clustering. This paper, addresses this issue by using the Fading Function. The unstructured text documents are clustered. And for each cluster a statistics structure called Cluster Profile (CP) is implemented. The cluster profile incorporates the Fading Function. This Fading Function keeps an account of the time-dependent importance of the cluster. The work proposes a novel algorithm Clustering n-ary Merge Algorithm (CnMA) for unstructured text documents, that uses Cluster Profile and Fading Function. Experimental results illustrating the effectiveness of the proposed technique are also included.

Keywords: Clustering, Text Mining, Unstructured TextDocuments, Fading Function.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1985
8882 Integrating Decision Tree and Spatial Cluster Analysis for Landslide Susceptibility Zonation

Authors: Chien-Min Chu, Bor-Wen Tsai, Kang-Tsung Chang

Abstract:

Landslide susceptibility map delineates the potential zones for landslide occurrence. Previous works have applied multivariate methods and neural networks for mapping landslide susceptibility. This study proposed a new approach to integrate decision tree model and spatial cluster statistic for assessing landslide susceptibility spatially. A total of 2057 landslide cells were digitized for developing the landslide decision tree model. The relationships of landslides and instability factors were explicitly represented by using tree graphs in the model. The local Getis-Ord statistics were used to cluster cells with high landslide probability. The analytic result from the local Getis-Ord statistics was classed to create a map of landslide susceptibility zones. The map was validated using new landslide data with 482 cells. Results of validation show an accuracy rate of 86.1% in predicting new landslide occurrence. This indicates that the proposed approach is useful for improving landslide susceptibility mapping.

Keywords: Landslide susceptibility Zonation, Decision treemodel, Spatial cluster, Local Getis-Ord statistics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1941
8881 Game Theory Based Diligent Energy Utilization Algorithm for Routing in Wireless Sensor Network

Authors: X. Mercilin Raajini, R. Raja Kumar, P. Indumathi, V. Praveen

Abstract:

Many cluster based routing protocols have been proposed in the field of wireless sensor networks, in which a group of nodes are formed as clusters. A cluster head is selected from one among those nodes based on residual energy, coverage area, number of hops and that cluster-head will perform data gathering from various sensor nodes and forwards aggregated data to the base station or to a relay node (another cluster-head), which will forward the packet along with its own data packet to the base station. Here a Game Theory based Diligent Energy Utilization Algorithm (GTDEA) for routing is proposed. In GTDEA, the cluster head selection is done with the help of game theory, a decision making process, that selects a cluster-head based on three parameters such as residual energy (RE), Received Signal Strength Index (RSSI) and Packet Reception Rate (PRR). Finding a feasible path to the destination with minimum utilization of available energy improves the network lifetime and is achieved by the proposed approach. In GTDEA, the packets are forwarded to the base station using inter-cluster routing technique, which will further forward it to the base station. Simulation results reveal that GTDEA improves the network performance in terms of throughput, lifetime, and power consumption.

Keywords: Cluster head, Energy utilization, Game Theory, LEACH, Sensor network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1903
8880 Creation of Greater Mekong Subregion Regional Competitiveness through Cluster Mapping

Authors: Danuvasin Charoen

Abstract:

This research investigates cluster development in the area called the Greater Mekong Subregion (GMS), which consists of Thailand, the People’s Republic of China (PRC), the Yunnan Province and Guangxi Zhuang Autonomous Region, Myanmar, the Lao People’s Democratic Republic (Lao PDR), Cambodia, and Vietnam. The study utilized Porter’s competitiveness theory and the cluster mapping approach to analyze the competitiveness of the region. The data collection consists of interviews, focus groups, and the analysis of secondary data. The findings identify some evidence of cluster development in the GMS; however, there is no clear indication of collaboration among the components in the clusters. GMS clusters tend to be stand-alone. The clusters in Vietnam, Lao PDR, Myanmar, and Cambodia tend to be labor intensive, whereas the clusters in Thailand and the PRC (Yunnan) have the potential to successfully develop into innovative clusters. The collaboration and integration among the clusters in the GMS area are promising, though it could take a long time. The most likely relationship between the GMS countries could be, for example, suppliers of the low-end, labor-intensive products will be located in the low income countries such as Myanmar, Lao PDR, and Cambodia, and these countries will be providing input materials for innovative clusters in the middle income countries such as Thailand and the PRC.

Keywords: Greater Mekong Subregion, competitiveness, cluster, development.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1072
8879 Analysis of Palm Perspiration Effect with SVM for Diabetes in People

Authors: Hamdi Melih Saraoğlu, Muhlis Yıldırım, Abdurrahman Özbeyaz, Feyzullah Temurtas

Abstract:

In this research, the diabetes conditions of people (healthy, prediabete and diabete) were tried to be identified with noninvasive palm perspiration measurements. Data clusters gathered from 200 subjects were used (1.Individual Attributes Cluster and 2. Palm Perspiration Attributes Cluster). To decrase the dimensions of these data clusters, Principal Component Analysis Method was used. Data clusters, prepared in that way, were classified with Support Vector Machines. Classifications with highest success were 82% for Glucose parameters and 84% for HbA1c parametres.

Keywords: Palm perspiration, Diabetes, Support Vector Machine, Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1948
8878 Two-Photon Ionization of Silver Clusters

Authors: V. Paployan, K. Madoyan, A. Melikyan, H. Minassian

Abstract:

In this paper, we calculate the two-photon ionization (TPI) cross-section for pump-probe scheme in Ag neutral cluster. The pump photon energy is assumed to be close to the surface plasmon (SP) energy of cluster in dielectric media. Due to this choice, the pump wave excites collective oscillations of electrons-SP and the probe wave causes ionization of the cluster. Since the interband transition energy in Ag exceeds the SP resonance energy, the main contribution into the TPI comes from the latter. The advantage of Ag clusters as compared to the other noble metals is that the SP resonance in silver cluster is much sharper because of peculiarities of its dielectric function. The calculations are performed by separating the coordinates of electrons corresponding to the collective oscillations and the individual motion that allows taking into account the resonance contribution of excited SP oscillations. It is shown that the ionization cross section increases by two orders of magnitude if the energy of the pump photon matches the surface plasmon energy in the cluster.

Keywords: Resonance enhancement, silver clusters, surface plasmon, two-photon ionization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1470
8877 Network of Coupled Stochastic Oscillators and One-way Quantum Computations

Authors: Eugene Grichuk, Margarita Kuzmina, Eduard Manykin

Abstract:

A network of coupled stochastic oscillators is proposed for modeling of a cluster of entangled qubits that is exploited as a computation resource in one-way quantum computation schemes. A qubit model has been designed as a stochastic oscillator formed by a pair of coupled limit cycle oscillators with chaotically modulated limit cycle radii and frequencies. The qubit simulates the behavior of electric field of polarized light beam and adequately imitates the states of two-level quantum system. A cluster of entangled qubits can be associated with a beam of polarized light, light polarization degree being directly related to cluster entanglement degree. Oscillatory network, imitating qubit cluster, is designed, and system of equations for network dynamics has been written. The constructions of one-qubit gates are suggested. Changing of cluster entanglement degree caused by measurements can be exactly calculated.

Keywords: network of stochastic oscillators, one-way quantumcomputations, a beam of polarized light.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1400
8876 Cluster Analysis for the Statistical Modeling of Aesthetic Judgment Data Related to Comics Artists

Authors: George E. Tsekouras, Evi Sampanikou

Abstract:

We compare three categorical data clustering algorithms with respect to the problem of classifying cultural data related to the aesthetic judgment of comics artists. Such a classification is very important in Comics Art theory since the determination of any classes of similarities in such kind of data will provide to art-historians very fruitful information of Comics Art-s evolution. To establish this, we use a categorical data set and we study it by employing three categorical data clustering algorithms. The performances of these algorithms are compared each other, while interpretations of the clustering results are also given.

Keywords: Aesthetic judgment, comics artists, cluster analysis, categorical data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1634
8875 A Multivariate Statistical Approach for Water Quality Assessment of River Hindon, India

Authors: Nida Rizvi, Deeksha Katyal, Varun Joshi

Abstract:

River Hindon is an important river catering the demand of highly populated rural and industrial cluster of western Uttar Pradesh, India. Water quality of river Hindon is deteriorating at an alarming rate due to various industrial, municipal and agricultural activities. The present study aimed at identifying the pollution sources and quantifying the degree to which these sources are responsible for the deteriorating water quality of the river. Various water quality parameters, like pH, temperature, electrical conductivity, total dissolved solids, total hardness, calcium, chloride, nitrate, sulphate, biological oxygen demand, chemical oxygen demand, and total alkalinity were assessed. Water quality data obtained from eight study sites for one year has been subjected to the two multivariate techniques, namely, principal component analysis and cluster analysis. Principal component analysis was applied with the aim to find out spatial variability and to identify the sources responsible for the water quality of the river. Three Varifactors were obtained after varimax rotation of initial principal components using principal component analysis. Cluster analysis was carried out to classify sampling stations of certain similarity, which grouped eight different sites into two clusters. The study reveals that the anthropogenic influence (municipal, industrial, waste water and agricultural runoff) was the major source of river water pollution. Thus, this study illustrates the utility of multivariate statistical techniques for analysis and elucidation of multifaceted data sets, recognition of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.

Keywords: Cluster analysis, multivariate statistical technique, river Hindon, water Quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3814
8874 Communities of Ammonia-oxidizing Archaea and Bacteria in Enriched Nitrifying Activated Sludge

Authors: Puntipar Sonthiphand, Tawan Limpiyakorn

Abstract:

In this study, communities of ammonia-oxidizing archaea (AOA) and ammonia-oxidizing bacteria (AOB) in nitrifying activated sludge (NAS) prepared by enriching sludge from a municipal wastewater treatment plant in three continuous-flow reactors receiving an inorganic medium containing different ammonium concentrations of 2, 10, and 30 mM NH4 +-N (NAS2, NAS10, and NAS30, respectively) were investigated using molecular analysis. Results suggested that almost all AOA clones from NAS2, NAS10, and NAS30 fell into the same AOA cluster and AOA communities in NAS2 and NAS10 were more diverse than those of NAS30. In contrast to AOA, AOB communities obviously shifted from the seed sludge to enriched NASs and in each enriched NAS, communities of AOB varied particularly. The seed sludge contained members of N. communis cluster and N. oligotropha cluster. After it was enriched under various ammonium loads, members of N. communis cluster disappeared from all enriched NASs. AOB with high affinity to ammonia presented in NAS 2, AOB with low affinity to ammonia presented in NAS 30, and both types of AOB survived in NAS 10. These demonstrated that ammonium load significantly influenced AOB communities, but not AOA communities in enriched NASs.

Keywords: ammonia-oxidizing bacteria, ammonia-oxidizingarchaea, nitrifying activated sludge.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1588
8873 Research of Potential Cluster Development in Pannonian Croatia

Authors: Mirjana Radman-Funarić, Katarina Potnik Galić

Abstract:

The paper presents an analysis of linkages and structures of co-operation and their intensity like the potential for the establishment of clusters in the Central and Eastern (Pannonian) Croatian. Starting from the theoretical elaboration of the need for entrepreneurs to organize through the cluster model and the terms of their self-actualization, related to the importance of traditional values in terms of benefits, social capital and assess where the company now is, in order to prove the need to create their own identity in terms of clustering. The institutional dimensions of social capital where the public sector has the best role in creating the social structure of clusters, and social dimensions of social capital in terms of trust, cooperation and networking will be analyzed to what extent the trust and coherency are present between companies in the Brod posavina and Pozega slavonia County, expressed through the readiness of inclusion in clusters in the NUTS II region - Central and Eastern (Pannonian) Croatia, as a homogeneous economic entity, with emphasis on limiting factors that stand in the way of greater competitiveness.

Keywords: Analysis of linkages, structures of co-operation, Cluster, Region

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1867
8872 Fuzzy Relatives of the CLARANS Algorithm With Application to Text Clustering

Authors: Mohamed A. Mahfouz, M. A. Ismail

Abstract:

This paper introduces new algorithms (Fuzzy relative of the CLARANS algorithm FCLARANS and Fuzzy c Medoids based on randomized search FCMRANS) for fuzzy clustering of relational data. Unlike existing fuzzy c-medoids algorithm (FCMdd) in which the within cluster dissimilarity of each cluster is minimized in each iteration by recomputing new medoids given current memberships, FCLARANS minimizes the same objective function minimized by FCMdd by changing current medoids in such away that that the sum of the within cluster dissimilarities is minimized. Computing new medoids may be effected by noise because outliers may join the computation of medoids while the choice of medoids in FCLARANS is dictated by the location of a predominant fraction of points inside a cluster and, therefore, it is less sensitive to the presence of outliers. In FCMRANS the step of computing new medoids in FCMdd is modified to be based on randomized search. Furthermore, a new initialization procedure is developed that add randomness to the initialization procedure used with FCMdd. Both FCLARANS and FCMRANS are compared with the robust and linearized version of fuzzy c-medoids (RFCMdd). Experimental results with different samples of the Reuter-21578, Newsgroups (20NG) and generated datasets with noise show that FCLARANS is more robust than both RFCMdd and FCMRANS. Finally, both FCMRANS and FCLARANS are more efficient and their outputs are almost the same as that of RFCMdd in terms of classification rate.

Keywords: Data Mining, Fuzzy Clustering, Relational Clustering, Medoid-Based Clustering, Cluster Analysis, Unsupervised Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2402
8871 On the Noise Distance in Robust Fuzzy C-Means

Authors: M. G. C. A. Cimino, G. Frosini, B. Lazzerini, F. Marcelloni

Abstract:

In the last decades, a number of robust fuzzy clustering algorithms have been proposed to partition data sets affected by noise and outliers. Robust fuzzy C-means (robust-FCM) is certainly one of the most known among these algorithms. In robust-FCM, noise is modeled as a separate cluster and is characterized by a prototype that has a constant distance δ from all data points. Distance δ determines the boundary of the noise cluster and therefore is a critical parameter of the algorithm. Though some approaches have been proposed to automatically determine the most suitable δ for the specific application, up to today an efficient and fully satisfactory solution does not exist. The aim of this paper is to propose a novel method to compute the optimal δ based on the analysis of the distribution of the percentage of objects assigned to the noise cluster in repeated executions of the robust-FCM with decreasing values of δ . The extremely encouraging results obtained on some data sets found in the literature are shown and discussed.

Keywords: noise prototype, robust fuzzy clustering, robustfuzzy C-means

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1822
8870 Parallelization and Optimization of SIFT Feature Extraction on Cluster System

Authors: Mingling Zheng, Zhenlong Song, Ke Xu, Hengzhu Liu

Abstract:

Scale Invariant Feature Transform (SIFT) has been widely applied, but extracting SIFT feature is complicated and time-consuming. In this paper, to meet the demand of the real-time applications, SIFT is parallelized and optimized on cluster system, which is named pSIFT. Redundancy storage and communication are used for boundary data to improve the performance, and before representation of feature descriptor, data reallocation is adopted to keep load balance in pSIFT. Experimental results show that pSIFT achieves good speedup and scalability.

Keywords: cluster, image matching, parallelization and optimization, SIFT.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1863
8869 Color Image Segmentation Using Competitive and Cooperative Learning Approach

Authors: Yinggan Tang, Xinping Guan

Abstract:

Color image segmentation can be considered as a cluster procedure in feature space. k-means and its adaptive version, i.e. competitive learning approach are powerful tools for data clustering. But k-means and competitive learning suffer from several drawbacks such as dead-unit problem and need to pre-specify number of cluster. In this paper, we will explore to use competitive and cooperative learning approach to perform color image segmentation. In competitive and cooperative learning approach, seed points not only compete each other, but also the winner will dynamically select several nearest competitors to form a cooperative team to adapt to the input together, finally it can automatically select the correct number of cluster and avoid the dead-units problem. Experimental results show that CCL can obtain better segmentation result.

Keywords: Color image segmentation, competitive learning, cluster, k-means algorithm, competitive and cooperative learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1616
8868 Routing Algorithm for a Clustered Network

Authors: Hemanth KumarA.R, Sudhakara G., Satyanarayana B.S.

Abstract:

The Cluster Dimension of a network is defined as, which is the minimum cardinality of a subset S of the set of nodes having the property that for any two distinct nodes x and y, there exist the node Si, s2 (need not be distinct) in S such that ld(x,s1) — d(y, s1)1 > 1 and d(x,s2) < d(x,$) for all s E S — {s2}. In this paper, strictly non overlap¬ping clusters are constructed. The concept of LandMarks for Unique Addressing and Clustering (LMUAC) routing scheme is developed. With the help of LMUAC routing scheme, It is shown that path length (upper bound)PLN,d < PLD, Maximum memory space requirement for the networkMSLmuAc(Az) < MSEmuAc < MSH3L < MSric and Maximum Link utilization factor MLLMUAC(i=3) < MLLMUAC(z03) < M Lc

Keywords: Metric dimension, Cluster dimension, Cluster.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1225
8867 A Text Clustering System based on k-means Type Subspace Clustering and Ontology

Authors: Liping Jing, Michael K. Ng, Xinhua Yang, Joshua Zhexue Huang

Abstract:

This paper presents a text clustering system developed based on a k-means type subspace clustering algorithm to cluster large, high dimensional and sparse text data. In this algorithm, a new step is added in the k-means clustering process to automatically calculate the weights of keywords in each cluster so that the important words of a cluster can be identified by the weight values. For understanding and interpretation of clustering results, a few keywords that can best represent the semantic topic are extracted from each cluster. Two methods are used to extract the representative words. The candidate words are first selected according to their weights calculated by our new algorithm. Then, the candidates are fed to the WordNet to identify the set of noun words and consolidate the synonymy and hyponymy words. Experimental results have shown that the clustering algorithm is superior to the other subspace clustering algorithms, such as PROCLUS and HARP and kmeans type algorithm, e.g., Bisecting-KMeans. Furthermore, the word extraction method is effective in selection of the words to represent the topics of the clusters.

Keywords: Subspace Clustering, Text Mining, Feature Weighting, Cluster Interpretation, Ontology

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2462
8866 Formosa3: A Cloud-Enabled HPC Cluster in NCHC

Authors: Chin-Hung Li, Te-Ming Chen, Ying-Chuan Chen, Shuen-Tai Wang

Abstract:

This paper proposes a new approach to offer a private cloud service in HPC clusters. In particular, our approach relies on automatically scheduling users- customized environment request as a normal job in batch system. After finishing virtualization request jobs, those guest operating systems will dismiss so that compute nodes will be released again for computing. We present initial work on the innovative integration of HPC batch system and virtualization tools that aims at coexistence such that they suffice for meeting the minimizing interference required by a traditional HPC cluster. Given the design of initial infrastructure, the proposed effort has the potential to positively impact on synergy model. The results from the experiment concluded that goal for provisioning customized cluster environment indeed can be fulfilled by using virtual machines, and efficiency can be improved with proper setup and arrangements.

Keywords: Cloud Computing, HPC Cluster, Private Cloud, Virtualization

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2042
8865 Clustering Multivariate Empiric Characteristic Functions for Multi-Class SVM Classification

Authors: María-Dolores Cubiles-de-la-Vega, Rafael Pino-Mejías, Esther-Lydia Silva-Ramírez

Abstract:

A dissimilarity measure between the empiric characteristic functions of the subsamples associated to the different classes in a multivariate data set is proposed. This measure can be efficiently computed, and it depends on all the cases of each class. It may be used to find groups of similar classes, which could be joined for further analysis, or it could be employed to perform an agglomerative hierarchical cluster analysis of the set of classes. The final tree can serve to build a family of binary classification models, offering an alternative approach to the multi-class SVM problem. We have tested this dendrogram based SVM approach with the oneagainst- one SVM approach over four publicly available data sets, three of them being microarray data. Both performances have been found equivalent, but the first solution requires a smaller number of binary SVM models.

Keywords: Cluster Analysis, Empiric Characteristic Function, Multi-class SVM, R.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1878
8864 Development of a Clustered Network based on Unique Hop ID

Authors: Hemanth Kumar, A. R., Sudhakar G, Satyanarayana B. S.

Abstract:

In this paper, Land Marks for Unique Addressing( LMUA) algorithm is develped to generate unique ID for each and every node which leads to the formation of overlapping/Non overlapping clusters based on unique ID. To overcome the draw back of the developed LMUA algorithm, the concept of clustering is introduced. Based on the clustering concept a Land Marks for Unique Addressing and Clustering(LMUAC) Algorithm is developed to construct strictly non-overlapping clusters and classify those nodes in to Cluster Heads, Member Nodes, Gate way nodes and generating the Hierarchical code for the cluster heads to operate in the level one hierarchy for wireless communication switching. The expansion of the existing network can be performed or not without modifying the cost of adding the clusterhead is shown. The developed algorithm shows one way of efficiently constructing the

Keywords: Cluster Dimension, Cluster Basis, Metric Dimension, Metric Basis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1305