Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 10

K-Means Clustering Related Abstracts

10 Optical Flow Direction Determination for Railway Crossing Occupancy Monitoring

Authors: Martin Dobrovolny, Zdenek Silar


This article deals with the obstacle detection on a railway crossing (clearance detection). Detection is based on the optical flow estimation and classification of the flow vectors by K-means clustering algorithm. For classification of passing vehicles is used optical flow direction determination. The optical flow estimation is based on a modified Lucas-Kanade method.

Keywords: K-Means Clustering, background estimation, direction of optical flow, objects detection, railway crossing monitoring, velocity vectors

Procedia PDF Downloads 330
9 Feature Based Unsupervised Intrusion Detection

Authors: Deeman Yousif Mahmood, Mohammed Abdullah Hussein


The goal of a network-based intrusion detection system is to classify activities of network traffics into two major categories: normal and attack (intrusive) activities. Nowadays, data mining and machine learning plays an important role in many sciences; including intrusion detection system (IDS) using both supervised and unsupervised techniques. However, one of the essential steps of data mining is feature selection that helps in improving the efficiency, performance and prediction rate of proposed approach. This paper applies unsupervised K-means clustering algorithm with information gain (IG) for feature selection and reduction to build a network intrusion detection system. For our experimental analysis, we have used the new NSL-KDD dataset, which is a modified dataset for KDDCup 1999 intrusion detection benchmark dataset. With a split of 60.0% for the training set and the remainder for the testing set, a 2 class classifications have been implemented (Normal, Attack). Weka framework which is a java based open source software consists of a collection of machine learning algorithms for data mining tasks has been used in the testing process. The experimental results show that the proposed approach is very accurate with low false positive rate and high true positive rate and it takes less learning time in comparison with using the full features of the dataset with the same algorithm.

Keywords: K-Means Clustering, information gain (IG), intrusion detection system (IDS), Weka

Procedia PDF Downloads 177
8 Wireless Sensor Networks Optimization by Using 2-Stage Algorithm Based on Imperialist Competitive Algorithm (2S-ICA)

Authors: Hamid Reza Lashgarian Azad, Seyed Nader Shetab Boushehri


Wireless sensor networks (WSN) have become progressively popular due to their wide range of applications. Wireless Sensor Network is made of numerous tiny sensor nodes that are battery powered. Maximizing the lifetime of wireless sensor network is very important problem. In this paper, we propose two-stage protocol based on imperialist competitive algorithm (2S-ICA) to solve a sensor network optimization problem. Long communication distances between sensors and a sink (or destination) in a sensor network can greatly drain the energy of sensors and reduce the lifetime of a network. By clustering a sensor network into a number of independent clusters using a 2S-ICA, we can greatly minimize the total communication distance, thus prolonging the network lifetime. Comparison results of proposed protocol and LEACH protocol, which is the common to solve WSN problem, show that our protocol has a better performance in term of improving network life and increasing number of transmitted data.

Keywords: Wireless Sensor Network, K-Means Clustering, imperialist competitive algorithm, LEACH protocol

Procedia PDF Downloads 379
7 Automatic LV Segmentation with K-means Clustering and Graph Searching on Cardiac MRI

Authors: Hae-Yeoun Lee


Quantification of cardiac function is performed by calculating blood volume and ejection fraction in routine clinical practice. However, these works have been performed by manual contouring,which requires computational costs and varies on the observer. In this paper, an automatic left ventricle segmentation algorithm on cardiac magnetic resonance images (MRI) is presented. Using knowledge on cardiac MRI, a K-mean clustering technique is applied to segment blood region on a coil-sensitivity corrected image. Then, a graph searching technique is used to correct segmentation errors from coil distortion and noises. Finally, blood volume and ejection fraction are calculated. Using cardiac MRI from 15 subjects, the presented algorithm is tested and compared with manual contouring by experts to show outstanding performance.

Keywords: K-Means Clustering, cardiac MRI, graph searching, left ventricle segmentation

Procedia PDF Downloads 281
6 Computing Customer Lifetime Value in E-Commerce Websites with Regard to Returned Orders and Payment Method

Authors: Morteza Giti


As online shopping is becoming increasingly popular, computing customer lifetime value for better knowing the customers is also gaining more importance. Two distinct factors that can affect the value of a customer in the context of online shopping is the number of returned orders and payment method. Returned orders are those which have been shipped but not collected by the customer and are returned to the store. Payment method refers to the way that customers choose to pay for the price of the order which are usually two: Pre-pay and Cash-on-delivery. In this paper, a novel model called RFMSP is presented to calculated the customer lifetime value, taking these two parameters into account. The RFMSP model is based on the common RFM model while adding two extra parameter. The S represents the order status and the P indicates the payment method. As a case study for this model, the purchase history of customers in an online shop is used to compute the customer lifetime value over a period of twenty months.

Keywords: E-Commerce, K-Means Clustering, AHP, RFMSP model, customer lifetime value

Procedia PDF Downloads 186
5 Improved K-Means Clustering Algorithm Using RHadoop with Combiner

Authors: Dong Hoon Lim, Ji Eun Shin


Data clustering is a common technique used in data analysis and is used in many applications, such as artificial intelligence, pattern recognition, economics, ecology, psychiatry and marketing. K-means clustering is a well-known clustering algorithm aiming to cluster a set of data points to a predefined number of clusters. In this paper, we implement K-means algorithm based on MapReduce framework with RHadoop to make the clustering method applicable to large scale data. RHadoop is a collection of R packages that allow users to manage and analyze data with Hadoop. The main idea is to introduce a combiner as a function of our map output to decrease the amount of data needed to be processed by reducers. The experimental results demonstrated that K-means algorithm using RHadoop can scale well and efficiently process large data sets on commodity hardware. We also showed that our K-means algorithm using RHadoop with combiner was faster than regular algorithm without combiner as the size of data set increases.

Keywords: Big Data, K-Means Clustering, RHadoop, combiner

Procedia PDF Downloads 239
4 Electricity Generation from Renewables and Targets: An Application of Multivariate Statistical Techniques

Authors: Filiz Ersoz, Taner Ersoz, Tugrul Bayraktar


Renewable energy is referred to as "clean energy" and common popular support for the use of renewable energy (RE) is to provide electricity with zero carbon dioxide emissions. This study provides useful insight into the European Union (EU) RE, especially, into electricity generation obtained from renewables, and their targets. The objective of this study is to identify groups of European countries, using multivariate statistical analysis and selected indicators. The hierarchical clustering method is used to decide the number of clusters for EU countries. The conducted statistical hierarchical cluster analysis is based on the Ward’s clustering method and squared Euclidean distances. Hierarchical cluster analysis identified eight distinct clusters of European countries. Then, non-hierarchical clustering (k-means) method was applied. Discriminant analysis was used to determine the validity of the results with data normalized by Z score transformation. To explore the relationship between the selected indicators, correlation coefficients were computed. The results of the study reveal the current situation of RE in European Union Member States.

Keywords: K-Means Clustering, CO2 emission, share of electricity generation, discriminant

Procedia PDF Downloads 242
3 A Computational Cost-Effective Clustering Algorithm in Multidimensional Space Using the Manhattan Metric: Application to the Global Terrorism Database

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami


The increasing amount of collected data has limited the performance of the current analyzing algorithms. Thus, developing new cost-effective algorithms in terms of complexity, scalability, and accuracy raised significant interests. In this paper, a modified effective k-means based algorithm is developed and experimented. The new algorithm aims to reduce the computational load without significantly affecting the quality of the clusterings. The algorithm uses the City Block distance and a new stop criterion to guarantee the convergence. Conducted experiments on a real data set show its high performance when compared with the original k-means version.

Keywords: Pattern Recognition, K-Means Clustering, global terrorism database, Manhattan distance, terrorism data analysis

Procedia PDF Downloads 224
2 K-Means Based Matching Algorithm for Multi-Resolution Feature Descriptors

Authors: Chen-Chien Hsu, Shao-Tzu Huang, Wei-Yen Wang


Matching high dimensional features between images is computationally expensive for exhaustive search approaches in computer vision. Although the dimension of the feature can be degraded by simplifying the prior knowledge of homography, matching accuracy may degrade as a tradeoff. In this paper, we present a feature matching method based on k-means algorithm that reduces the matching cost and matches the features between images instead of using a simplified geometric assumption. Experimental results show that the proposed method outperforms the previous linear exhaustive search approaches in terms of the inlier ratio of matched pairs.

Keywords: K-Means Clustering, SIFT, feature matching, RANSAC

Procedia PDF Downloads 187
1 Real-Time Classification of Marbles with Decision-Tree Method

Authors: E. Turan, K. S. Parlak


The separation of marbles according to the pattern quality is a process made according to expert decision. The classification phase is the most critical part in terms of economic value. In this study, a self-learning system is proposed which performs the classification of marbles quickly and with high success. This system performs ten feature extraction by taking ten marble images from the camera. The marbles are classified by decision tree method using the obtained properties. The user forms the training set by training the system at the marble classification stage. The system evolves itself in every marble image that is classified. The aim of the proposed system is to minimize the error caused by the person performing the classification and achieve it quickly.

Keywords: Feature Extraction, K-Means Clustering, Decision Tree, marble classification

Procedia PDF Downloads 198