Search results for: conventional and nonconventional methods of clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5504

Search results for: conventional and nonconventional methods of clustering

5354 Quantity and Quality Aware Artificial Bee Colony Algorithm for Clustering

Authors: U. Idachaba, F. Z. Wang, A. Qi, N. Helian

Abstract:

Artificial Bee Colony (ABC) algorithm is a relatively new swarm intelligence technique for clustering. It produces higher quality clusters compared to other population-based algorithms but with poor energy efficiency, cluster quality consistency and typically slower in convergence speed. Inspired by energy saving foraging behavior of natural honey bees this paper presents a Quality and Quantity Aware Artificial Bee Colony (Q2ABC) algorithm to improve quality of cluster identification, energy efficiency and convergence speed of the original ABC. To evaluate the performance of Q2ABC algorithm, experiments were conducted on a suite of ten benchmark UCI datasets. The results demonstrate Q2ABC outperformed ABC and K-means algorithm in the quality of clusters delivered.

Keywords: Artificial bee colony algorithm, clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2087
5353 A Computational Cost-Effective Clustering Algorithm in Multidimensional Space Using the Manhattan Metric: Application to the Global Terrorism Database

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

The increasing amount of collected data has limited the performance of the current analyzing algorithms. Thus, developing new cost-effective algorithms in terms of complexity, scalability, and accuracy raised significant interests. In this paper, a modified effective k-means based algorithm is developed and experimented. The new algorithm aims to reduce the computational load without significantly affecting the quality of the clusterings. The algorithm uses the City Block distance and a new stop criterion to guarantee the convergence. Conducted experiments on a real data set show its high performance when compared with the original k-means version.

Keywords: Pattern recognition, partitional clustering, K-means clustering, Manhattan distance, terrorism data analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1320
5352 Effect of Clustering on Energy Efficiency and Network Lifetime in Wireless Sensor Networks

Authors: Prakash G L, Chaitra K Meti, Poojitha K, Divya R.K.

Abstract:

Wireless Sensor Network is Multi hop Self-configuring Wireless Network consisting of sensor nodes. The deployment of wireless sensor networks in many application areas, e.g., aggregation services, requires self-organization of the network nodes into clusters. Efficient way to enhance the lifetime of the system is to partition the network into distinct clusters with a high energy node as cluster head. The different methods of node clustering techniques have appeared in the literature, and roughly fall into two families; those based on the construction of a dominating set and those which are based solely on energy considerations. Energy optimized cluster formation for a set of randomly scattered wireless sensors is presented. Sensors within a cluster are expected to be communicating with cluster head only. The energy constraint and limited computing resources of the sensor nodes present the major challenges in gathering the data. In this paper we propose a framework to study how partially correlated data affect the performance of clustering algorithms. The total energy consumption and network lifetime can be analyzed by combining random geometry techniques and rate distortion theory. We also present the relation between compression distortion and data correlation.

Keywords: Clusters, multi hop, random geometry, rate distortion.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1594
5351 A 3D Approach for Extraction of the Coronaryartery and Quantification of the Stenosis

Authors: Mahdi Mazinani, S. D. Qanadli, Rahil Hosseini, Tim Ellis, Jamshid Dehmeshki

Abstract:

Segmentation and quantification of stenosis is an important task in assessing coronary artery disease. One of the main challenges is measuring the real diameter of curved vessels. Moreover, uncertainty in segmentation of different tissues in the narrow vessel is an important issue that affects accuracy. This paper proposes an algorithm to extract coronary arteries and measure the degree of stenosis. Markovian fuzzy clustering method is applied to model uncertainty arises from partial volume effect problem. The algorithm employs: segmentation, centreline extraction, estimation of orthogonal plane to centreline, measurement of the degree of stenosis. To evaluate the accuracy and reproducibility, the approach has been applied to a vascular phantom and the results are compared with real diameter. The results of 10 patient datasets have been visually judged by a qualified radiologist. The results reveal the superiority of the proposed method compared to the Conventional thresholding Method (CTM) on both datasets.

Keywords: 3D coronary artery tree extraction, segmentation, quantification, fuzzy clustering, and Markov random field

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1550
5350 Fast and Accuracy Control Chart Pattern Recognition using a New cluster-k-Nearest Neighbor

Authors: Samir Brahim Belhaouari

Abstract:

By taking advantage of both k-NN which is highly accurate and K-means cluster which is able to reduce the time of classification, we can introduce Cluster-k-Nearest Neighbor as "variable k"-NN dealing with the centroid or mean point of all subclasses generated by clustering algorithm. In general the algorithm of K-means cluster is not stable, in term of accuracy, for that reason we develop another algorithm for clustering our space which gives a higher accuracy than K-means cluster, less subclass number, stability and bounded time of classification with respect to the variable data size. We find between 96% and 99.7 % of accuracy in the lassification of 6 different types of Time series by using K-means cluster algorithm and we find 99.7% by using the new clustering algorithm.

Keywords: Pattern recognition, Time series, k-Nearest Neighbor, k-means cluster, Gaussian Mixture Model, Classification

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1932
5349 Neural Networks Learning Improvement using the K-Means Clustering Algorithm to Detect Network Intrusions

Authors: K. M. Faraoun, A. Boukelif

Abstract:

In the present work, we propose a new technique to enhance the learning capabilities and reduce the computation intensity of a competitive learning multi-layered neural network using the K-means clustering algorithm. The proposed model use multi-layered network architecture with a back propagation learning mechanism. The K-means algorithm is first applied to the training dataset to reduce the amount of samples to be presented to the neural network, by automatically selecting an optimal set of samples. The obtained results demonstrate that the proposed technique performs exceptionally in terms of both accuracy and computation time when applied to the KDD99 dataset compared to a standard learning schema that use the full dataset.

Keywords: Neural networks, Intrusion detection, learningenhancement, K-means clustering

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3561
5348 Data Oriented Model of Image: as a Framework for Image Processing

Authors: A. Habibizad Navin, A. Sadighi, M. Naghian Fesharaki, M. Mirnia, M. Teshnelab, R. Keshmiri

Abstract:

This paper presents a new data oriented model of image. Then a representation of it, ADBT, is introduced. The ability of ADBT is clustering, segmentation, measuring similarity of images etc, with desired precision and corresponding speed.

Keywords: Data oriented modelling, image, clustering, segmentation, classification, ADBT and image processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1763
5347 Detection of Salmonella in Egg Shell and Egg Content from Different Housing Systems for Laying Hens

Authors: Wiriya Loongyai, Kiettisak Promphet, Nilubol Kangsukul, Ratchawat Noppha

Abstract:

Polymerase chain reaction (PCR) assay and conventional microbiological methods were used to detect bacterial contamination of egg shells and egg content in different commercial housing systems, open house system and evaporative cooling system. A PCR assay was developed for direct detection using a set of primers specific for the invasion by A gene (invA) of Salmonella spp. PCR detected the presence of Salmonella in 2 samples of shell egg from the evaporative cooling system, while conventional cultural methods detected no Salmonella from the same samples.

Keywords: egg content, egg shell, invA gene, PCR, Salmonellaspp.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3259
5346 Comparison between the Conventional Methods and PSO Based MPPT Algorithm for Photovoltaic Systems

Authors: Ramdan B. A. Koad, Ahmed. F. Zobaa

Abstract:

Since the output characteristics of photovoltaic (PV) system depends on the ambient temperature, solar radiation and load impedance, its maximum power point (MPP) is not constant. Under each condition PV module has a point at which it can produce its MPP. Therefore, a maximum power point tracking (MPPT) method is needed to uphold the PV panel operating at its MPP. This paper presents comparative study between the conventional MPPT methods used in (PV) system: Perturb and Observe (P&O), Incremental Conductance (IncCond), andParticle Swarm Optimization (PSO) algorithmfor (MPPT) of (PV) system. To evaluate the study, the proposed PSO MPPT is implemented on a DC-DC cuk converter and has been compared with P&O and INcond methods in terms of their tracking speed, accuracy and performance by using the Matlab tool Simulink. The simulation result shows that the proposed algorithm is simple, and is superior to the P&O and IncCond methods.

Keywords: Incremental Conductance (IncCond) Method, Perturb and Observe (P&O) Method, Photovoltaic Systems (PV) and Practical Swarm Optimization Algorithm (PSO).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5622
5345 Characteristic Study on Conventional and Soliton Based Transmission System

Authors: Bhupeshwaran Mani, S. Radha, A. Jawahar, A. Sivasubramanian

Abstract:

Here, we study the characteristic feature of conventional (ON-OFF keying) and soliton based transmission system. We consider 20Gbps transmission system implemented with Conventional Single Mode Fiber (C-SMF) to examine the role of Gaussian pulse which is the characteristic of conventional propagation and Hyperbolic-secant pulse which is the characteristic of soliton propagation in it. We note the influence of these pulses with respect to different dispersion lengths and soliton period in conventional and soliton system respectively and evaluate the system performance in terms of Quality factor. From the analysis, we could prove that the soliton pulse has the consistent performance even for long distance without dispersion compensation than the conventional system as it is robust to dispersion. For the length of transmission of 200Km, soliton system yielded Q of 33.958 while the conventional system totally exhausted with Q=0.

Keywords: Soliton, dispersion length, Soliton period, Return-tozero (RZ), Q-factor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1595
5344 Spatial Clustering Model of Vessel Trajectory to Extract Sailing Routes Based on AIS Data

Authors: Lubna Eljabu, Mohammad Etemad, Stan Matwin

Abstract:

The automatic extraction of shipping routes is advantageous for intelligent traffic management systems to identify events and support decision-making in maritime surveillance. At present, there is a high demand for the extraction of maritime traffic networks that resemble the real traffic of vessels accurately, which is valuable for further analytical processing tasks for vessels trajectories (e.g., naval routing and voyage planning, anomaly detection, destination prediction, time of arrival estimation). With the help of big data and processing huge amounts of vessels’ trajectory data, it is possible to learn these shipping routes from the navigation history of past behaviour of other, similar ships that were travelling in a given area. In this paper, we propose a spatial clustering model of vessels’ trajectories (SPTCLUST) to extract spatial representations of sailing routes from historical Automatic Identification System (AIS) data. The whole model consists of three main parts: data preprocessing, path finding, and route extraction, which consists of clustering and representative trajectory extraction. The proposed clustering method provides techniques to overcome the problems of: (i) optimal input parameters selection; (ii) the high complexity of processing a huge volume of multidimensional data; (iii) and the spatial representation of complete representative trajectory detection in the context of trajectory clustering algorithms. The experimental evaluation showed the effectiveness of the proposed model by using a real-world AIS dataset from the Port of Halifax. The results contribute to further understanding of shipping route patterns. This could aid surveillance authorities in stable and sustainable vessel traffic management.

Keywords: Vessel trajectory clustering, trajectory mining, Spatial Clustering, marine intelligent navigation, maritime traffic network extraction, sdailing routes extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 364
5343 Probabilistic Graphical Model for the Web

Authors: M. Nekri, A. Khelladi

Abstract:

The world wide web network is a network with a complex topology, the main properties of which are the distribution of degrees in power law, A low clustering coefficient and a weak average distance. Modeling the web as a graph allows locating the information in little time and consequently offering a help in the construction of the research engine. Here, we present a model based on the already existing probabilistic graphs with all the aforesaid characteristics. This work will consist in studying the web in order to know its structuring thus it will enable us to modelize it more easily and propose a possible algorithm for its exploration.

Keywords: Clustering coefficient, preferential attachment, small world, Web community.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1575
5342 Developing a Viral Artifact to Improve Employees’ Security Behavior

Authors: Stefan Bauer, Josef Frysak

Abstract:

According to the scientific information management literature, the improper use of information technology (e.g. personal computers) by employees are one main cause for operational and information security loss events. Therefore, organizations implement information security awareness programs to increase employees’ awareness to further prevention of loss events. However, in many cases these information security awareness programs consist of conventional delivery methods like posters, leaflets, or internal messages to make employees aware of information security policies. We assume that a viral information security awareness video might be more effective medium than conventional methods commonly used by organizations. The purpose of this research is to develop a viral video artifact to improve employee security behavior concerning information technology.

Keywords: Information Security Awareness, Delivery Methods, Viral Videos, Employee Security Behavior.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1762
5341 Scattering Operator and Spectral Clustering for Ultrasound Images: Application on Deep Venous Thrombi

Authors: Thibaud Berthomier, Ali Mansour, Luc Bressollette, Frédéric Le Roy, Dominique Mottier, Léo Fréchier, Barthélémy Hermenault

Abstract:

Deep Venous Thrombosis (DVT) occurs when a thrombus is formed within a deep vein (most often in the legs). This disease can be deadly if a part or the whole thrombus reaches the lung and causes a Pulmonary Embolism (PE). This disorder, often asymptomatic, has multifactorial causes: immobilization, surgery, pregnancy, age, cancers, and genetic variations. Our project aims to relate the thrombus epidemiology (origins, patient predispositions, PE) to its structure using ultrasound images. Ultrasonography and elastography were collected using Toshiba Aplio 500 at Brest Hospital. This manuscript compares two classification approaches: spectral clustering and scattering operator. The former is based on the graph and matrix theories while the latter cascades wavelet convolutions with nonlinear modulus and averaging operators.

Keywords: Deep venous thrombosis, ultrasonography, elastography, scattering operator, wavelet, spectral clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1131
5340 A Hybrid Approach for Color Image Quantization Using K-means and Firefly Algorithms

Authors: Parisut Jitpakdee, Pakinee Aimmanee, Bunyarit Uyyanonvara

Abstract:

Color Image quantization (CQ) is an important problem in computer graphics, image and processing. The aim of quantization is to reduce colors in an image with minimum distortion. Clustering is a widely used technique for color quantization; all colors in an image are grouped to small clusters. In this paper, we proposed a new hybrid approach for color quantization using firefly algorithm (FA) and K-means algorithm. Firefly algorithm is a swarmbased algorithm that can be used for solving optimization problems. The proposed method can overcome the drawbacks of both algorithms such as the local optima converge problem in K-means and the early converge of firefly algorithm. Experiments on three commonly used images and the comparison results shows that the proposed algorithm surpasses both the base-line technique k-means clustering and original firefly algorithm.

Keywords: Clustering, Color quantization, Firefly algorithm, Kmeans.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2182
5339 Skin Lesion Segmentation Using Color Channel Optimization and Clustering-based Histogram Thresholding

Authors: Rahil Garnavi, Mohammad Aldeen, M. Emre Celebi, Alauddin Bhuiyan, Constantinos Dolianitis, George Varigos

Abstract:

Automatic segmentation of skin lesions is the first step towards the automated analysis of malignant melanoma. Although numerous segmentation methods have been developed, few studies have focused on determining the most effective color space for melanoma application. This paper proposes an automatic segmentation algorithm based on color space analysis and clustering-based histogram thresholding, a process which is able to determine the optimal color channel for detecting the borders in dermoscopy images. The algorithm is tested on a set of 30 high resolution dermoscopy images. A comprehensive evaluation of the results is provided, where borders manually drawn by four dermatologists, are compared to automated borders detected by the proposed algorithm, applying three previously used metrics of accuracy, sensitivity, and specificity and a new metric of similarity. By performing ROC analysis and ranking the metrics, it is demonstrated that the best results are obtained with the X and XoYoR color channels, resulting in an accuracy of approximately 97%. The proposed method is also compared with two state-of-theart skin lesion segmentation methods.

Keywords: Border detection, Color space analysis, Dermoscopy, Histogram thresholding, Melanoma, Segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2206
5338 An Adaptive Fuzzy Clustering Approach for the Network Management

Authors: Amal Elmzabi, Mostafa Bellafkih, Mohammed Ramdani

Abstract:

The Chiu-s method which generates a Takagi-Sugeno Fuzzy Inference System (FIS) is a method of fuzzy rules extraction. The rules output is a linear function of inputs. In addition, these rules are not explicit for the expert. In this paper, we develop a method which generates Mamdani FIS, where the rules output is fuzzy. The method proceeds in two steps: first, it uses the subtractive clustering principle to estimate both the number of clusters and the initial locations of a cluster centers. Each obtained cluster corresponds to a Mamdani fuzzy rule. Then, it optimizes the fuzzy model parameters by applying a genetic algorithm. This method is illustrated on a traffic network management application. We suggest also a Mamdani fuzzy rules generation method, where the expert wants to classify the output variables in some fuzzy predefined classes.

Keywords: Fuzzy entropy, fuzzy inference systems, genetic algorithms, network management, subtractive clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1849
5337 Modeling Peer-to-Peer Networks with Interest-Based Clusters

Authors: Bertalan Forstner, Dr. Hassan Charaf

Abstract:

In the world of Peer-to-Peer (P2P) networking different protocols have been developed to make the resource sharing or information retrieval more efficient. The SemPeer protocol is a new layer on Gnutella that transforms the connections of the nodes based on semantic information to make information retrieval more efficient. However, this transformation causes high clustering in the network that decreases the number of nodes reached, therefore the probability of finding a document is also decreased. In this paper we describe a mathematical model for the Gnutella and SemPeer protocols that captures clustering-related issues, followed by a proposition to modify the SemPeer protocol to achieve moderate clustering. This modification is a sort of link management for the individual nodes that allows the SemPeer protocol to be more efficient, because the probability of a successful query in the P2P network is reasonably increased. For the validation of the models, we evaluated a series of simulations that supported our results.

Keywords: Peer-to-Peer, model, performance, networkmanagement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1272
5336 An Energy Aware Data Aggregation in Wireless Sensor Network Using Connected Dominant Set

Authors: M. Santhalakshmi, P Suganthi

Abstract:

Wireless Sensor Networks (WSNs) have many advantages. Their deployment is easier and faster than wired sensor networks or other wireless networks, as they do not need fixed infrastructure. Nodes are partitioned into many small groups named clusters to aggregate data through network organization. WSN clustering guarantees performance achievement of sensor nodes. Sensor nodes energy consumption is reduced by eliminating redundant energy use and balancing energy sensor nodes use over a network. The aim of such clustering protocols is to prolong network life. Low Energy Adaptive Clustering Hierarchy (LEACH) is a popular protocol in WSN. LEACH is a clustering protocol in which the random rotations of local cluster heads are utilized in order to distribute energy load among all sensor nodes in the network. This paper proposes Connected Dominant Set (CDS) based cluster formation. CDS aggregates data in a promising approach for reducing routing overhead since messages are transmitted only within virtual backbone by means of CDS and also data aggregating lowers the ratio of responding hosts to the hosts existing in virtual backbones. CDS tries to increase networks lifetime considering such parameters as sensors lifetime, remaining and consumption energies in order to have an almost optimal data aggregation within networks. Experimental results proved CDS outperformed LEACH regarding number of cluster formations, average packet loss rate, average end to end delay, life computation, and remaining energy computation.

Keywords: Wireless sensor network, connected dominant set, clustering, data aggregation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1097
5335 Comparison between Torsional Ultrasonic Assisted Drilling and Conventional Drilling of Bone: An in vitro Study

Authors: Nikoo Soleimani

Abstract:

Background: Reducing torque during bone drilling is one of the effective factors in reaching to an optimal drilling process. Methods: 15 bovine femurs were drilled in vitro with a drill bit with a diameter of 4 mm using two methods of torsional ultrasonic assisted drilling (T-UAD) and convent conventional drilling (CD) and the effects of changing the feed rate and rotational speed on the torque were compared in both methods. Results: There was no significant difference in the thrust force measured in both methods due to the direction of vibrations. Results showed that using T-UAD method for bone drilling at feed rates of 0.16, 0.24 and 0.32 mm/rev led for all rotational speeds to a decrease of at least 16.3% in torque compared to the CD method. Further, using T-UAD at rotational speeds of 355~1000 rpm with various feed rates resulted in a torque reduction of 16.3~50.5% compared to CD method. Conclusions: Reducing the feed rate and increasing the rotational speed, except for the rotational speed of 500 rpm and a feed rate of 0.32 mm/rev, resulted generally in torque reduction in both methods. However, T-UAD is a more effective and desirable option for bone drilling considering its significant torque reduction.

Keywords: Torsional ultrasonic assisted drilling, torque, bone drilling, rotational speed, feed rate.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 607
5334 Identification of a PWA Model of a Batch Reactor for Model Predictive Control

Authors: Gorazd Karer, Igor Skrjanc, Borut Zupancic

Abstract:

The complex hybrid and nonlinear nature of many processes that are met in practice causes problems with both structure modelling and parameter identification; therefore, obtaining a model that is suitable for MPC is often a difficult task. The basic idea of this paper is to present an identification method for a piecewise affine (PWA) model based on a fuzzy clustering algorithm. First we introduce the PWA model. Next, we tackle the identification method. We treat the fuzzy clustering algorithm, deal with the projections of the fuzzy clusters into the input space of the PWA model and explain the estimation of the parameters of the PWA model by means of a modified least-squares method. Furthermore, we verify the usability of the proposed identification approach on a hybrid nonlinear batch reactor example. The result suggest that the batch reactor can be efficiently identified and thus formulated as a PWA model, which can eventually be used for model predictive control purposes.

Keywords: Batch reactor, fuzzy clustering, hybrid systems, identification, nonlinear systems, PWA systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2152
5333 Automatic LV Segmentation with K-means Clustering and Graph Searching on Cardiac MRI

Authors: Hae-Yeoun Lee

Abstract:

Quantification of cardiac function is performed by calculating blood volume and ejection fraction in routine clinical practice. However, these works have been performed by manual contouring, which requires computational costs and varies on the observer. In this paper, an automatic left ventricle segmentation algorithm on cardiac magnetic resonance images (MRI) is presented. Using knowledge on cardiac MRI, a K-mean clustering technique is applied to segment blood region on a coil-sensitivity corrected image. Then, a graph searching technique is used to correct segmentation errors from coil distortion and noises. Finally, blood volume and ejection fraction are calculated. Using cardiac MRI from 15 subjects, the presented algorithm is tested and compared with manual contouring by experts to show outstanding performance.

Keywords: Cardiac MRI, Graph searching, Left ventricle segmentation, K-means clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2056
5332 A Optimal Subclass Detection Method for Credit Scoring

Authors: Luciano Nieddu, Giuseppe Manfredi, Salvatore D'Acunto, Katia La Regina

Abstract:

In this paper a non-parametric statistical pattern recognition algorithm for the problem of credit scoring will be presented. The proposed algorithm is based on a clustering k- means algorithm and allows for the determination of subclasses of homogenous elements in the data. The algorithm will be tested on two benchmark datasets and its performance compared with other well known pattern recognition algorithm for credit scoring.

Keywords: Constrained clustering, Credit scoring, Statistical pattern recognition, Supervised classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2010
5331 Predictive Clustering Hybrid Regression(pCHR) Approach and Its Application to Sucrose-Based Biohydrogen Production

Authors: Nikhil, Ari Visa, Chin-Chao Chen, Chiu-Yue Lin, Jaakko A. Puhakka, Olli Yli-Harja

Abstract:

A predictive clustering hybrid regression (pCHR) approach was developed and evaluated using dataset from H2- producing sucrose-based bioreactor operated for 15 months. The aim was to model and predict the H2-production rate using information available about envirome and metabolome of the bioprocess. Selforganizing maps (SOM) and Sammon map were used to visualize the dataset and to identify main metabolic patterns and clusters in bioprocess data. Three metabolic clusters: acetate coupled with other metabolites, butyrate only, and transition phases were detected. The developed pCHR model combines principles of k-means clustering, kNN classification and regression techniques. The model performed well in modeling and predicting the H2-production rate with mean square error values of 0.0014 and 0.0032, respectively.

Keywords: Biohydrogen, bioprocess modeling, clusteringhybrid regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1733
5330 A Minimum Spanning Tree-Based Method for Initializing the K-Means Clustering Algorithm

Authors: J. Yang, Y. Ma, X. Zhang, S. Li, Y. Zhang

Abstract:

The traditional k-means algorithm has been widely used as a simple and efficient clustering method. However, the algorithm often converges to local minima for the reason that it is sensitive to the initial cluster centers. In this paper, an algorithm for selecting initial cluster centers on the basis of minimum spanning tree (MST) is presented. The set of vertices in MST with same degree are regarded as a whole which is used to find the skeleton data points. Furthermore, a distance measure between the skeleton data points with consideration of degree and Euclidean distance is presented. Finally, MST-based initialization method for the k-means algorithm is presented, and the corresponding time complexity is analyzed as well. The presented algorithm is tested on five data sets from the UCI Machine Learning Repository. The experimental results illustrate the effectiveness of the presented algorithm compared to three existing initialization methods.

Keywords: Degree, initial cluster center, k-means, minimum spanning tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1502
5329 Enhancing K-Means Algorithm with Initial Cluster Centers Derived from Data Partitioning along the Data Axis with the Highest Variance

Authors: S. Deelers, S. Auwatanamongkol

Abstract:

In this paper, we propose an algorithm to compute initial cluster centers for K-means clustering. Data in a cell is partitioned using a cutting plane that divides cell in two smaller cells. The plane is perpendicular to the data axis with the highest variance and is designed to reduce the sum squared errors of the two cells as much as possible, while at the same time keep the two cells far apart as possible. Cells are partitioned one at a time until the number of cells equals to the predefined number of clusters, K. The centers of the K cells become the initial cluster centers for K-means. The experimental results suggest that the proposed algorithm is effective, converge to better clustering results than those of the random initialization method. The research also indicated the proposed algorithm would greatly improve the likelihood of every cluster containing some data in it.

Keywords: Clustering algorithm, K-means algorithm, Datapartitioning, Initial cluster centers.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2820
5328 Recursive Similarity Hashing of Fractal Geometry

Authors: Timothee G. Leleu

Abstract:

A new technique of topological multi-scale analysis is introduced. By performing a clustering recursively to build a hierarchy, and analyzing the co-scale and intra-scale similarities, an Iterated Function System can be extracted from any data set. The study of fractals shows that this method is efficient to extract self-similarities, and can find elegant solutions the inverse problem of building fractals. The theoretical aspects and practical implementations are discussed, together with examples of analyses of simple fractals.

Keywords: hierarchical clustering, multi-scale analysis, Similarity hashing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1823
5327 An Optimal Unsupervised Satellite image Segmentation Approach Based on Pearson System and k-Means Clustering Algorithm Initialization

Authors: Ahmed Rekik, Mourad Zribi, Ahmed Ben Hamida, Mohamed Benjelloun

Abstract:

This paper presents an optimal and unsupervised satellite image segmentation approach based on Pearson system and k-Means Clustering Algorithm Initialization. Such method could be considered as original by the fact that it utilised K-Means clustering algorithm for an optimal initialisation of image class number on one hand and it exploited Pearson system for an optimal statistical distributions- affectation of each considered class on the other hand. Satellite image exploitation requires the use of different approaches, especially those founded on the unsupervised statistical segmentation principle. Such approaches necessitate definition of several parameters like image class number, class variables- estimation and generalised mixture distributions. Use of statistical images- attributes assured convincing and promoting results under the condition of having an optimal initialisation step with appropriated statistical distributions- affectation. Pearson system associated with a k-means clustering algorithm and Stochastic Expectation-Maximization 'SEM' algorithm could be adapted to such problem. For each image-s class, Pearson system attributes one distribution type according to different parameters and especially the Skewness 'β1' and the kurtosis 'β2'. The different adapted algorithms, K-Means clustering algorithm, SEM algorithm and Pearson system algorithm, are then applied to satellite image segmentation problem. Efficiency of those combined algorithms was firstly validated with the Mean Quadratic Error 'MQE' evaluation, and secondly with visual inspection along several comparisons of these unsupervised images- segmentation.

Keywords: Unsupervised classification, Pearson system, Satellite image, Segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1989
5326 Development of a Clustered Network based on Unique Hop ID

Authors: Hemanth Kumar, A. R., Sudhakar G, Satyanarayana B. S.

Abstract:

In this paper, Land Marks for Unique Addressing( LMUA) algorithm is develped to generate unique ID for each and every node which leads to the formation of overlapping/Non overlapping clusters based on unique ID. To overcome the draw back of the developed LMUA algorithm, the concept of clustering is introduced. Based on the clustering concept a Land Marks for Unique Addressing and Clustering(LMUAC) Algorithm is developed to construct strictly non-overlapping clusters and classify those nodes in to Cluster Heads, Member Nodes, Gate way nodes and generating the Hierarchical code for the cluster heads to operate in the level one hierarchy for wireless communication switching. The expansion of the existing network can be performed or not without modifying the cost of adding the clusterhead is shown. The developed algorithm shows one way of efficiently constructing the

Keywords: Cluster Dimension, Cluster Basis, Metric Dimension, Metric Basis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1272
5325 Performance Evaluation of Energy Efficient Communication Protocol for Mobile Ad Hoc Networks

Authors: Toshihiko Sasama, Kentaro Kishida, Kazunori Sugahara, Hiroshi Masuyama

Abstract:

A mobile ad hoc network is a network of mobile nodes without any notion of centralized administration. In such a network, each mobile node behaves not only as a host which runs applications but also as a router to forward packets on behalf of others. Clustering has been applied to routing protocols to achieve efficient communications. A CH network expresses the connected relationship among cluster-heads. This paper discusses the methods for constructing a CH network, and produces the following results: (1) The required running costs of 3 traditional methods for constructing a CH network are not so different from each other in the static circumstance, or in the dynamic circumstance. Their running costs in the static circumstance do not differ from their costs in the dynamic circumstance. Meanwhile, although the routing costs required for the above 3 methods are not so different in the static circumstance, the costs are considerably different from each other in the dynamic circumstance. Their routing costs in the static circumstance are also very different from their costs in the dynamic circumstance, and the former is one tenths of the latter. The routing cost in the dynamic circumstance is mostly the cost for re-routing. (2) On the strength of the above results, we discuss new 2 methods regarding whether they are tolerable or not in the dynamic circumstance, that is, whether the times of re-routing are small or not. These new methods are revised methods that are based on the traditional methods. We recommended the method which produces the smallest routing cost in the dynamic circumstance, therefore producing the smallest total cost.

Keywords: cluster, mobile ad hoc network, re-routing cost, simulation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1314