Search results for: Subspace clustering
359 Fuzzy Clustering of Locations for Degree of Accident Proneness based on Vehicle User Perceptions
Authors: Jayanth Jacob, C. V. Hariharakrishnan, Suganthi L.
Abstract:
The rapid urbanization of cities has a bane in the form road accidents that cause extensive damage to life and limbs. A number of location based factors are enablers of road accidents in the city. The speed of travel of vehicles is non-uniform among locations within a city. In this study, the perception of vehicle users is captured on a 10-point rating scale regarding the degree of variation in speed of travel at chosen locations in the city. The average rating is used to cluster locations using fuzzy c-means clustering and classify them as low, moderate and high speed of travel locations. The high speed of travel locations can be classified proactively to ensure that accidents do not occur due to the speeding of vehicles at such locations. The advantage of fuzzy c-means clustering is that a location may be a part of more than one cluster to a varying degree and this gives a better picture about the location with respect to the characteristic (speed of travel) being studied.Keywords: C-means clustering, Location Specific, Road Accidents.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1841358 A Distributed Weighted Cluster Based Routing Protocol for Manets
Authors: Naveen Chauhan, L.K. Awasthi, Narottam chand, Vivek Katiyar, Ankit Chug
Abstract:
Mobile ad-hoc networks (MANETs) are a form of wireless networks which do not require a base station for providing network connectivity. Mobile ad-hoc networks have many characteristics which distinguish them from other wireless networks which make routing in such networks a challenging task. Cluster based routing is one of the routing schemes for MANETs in which various clusters of mobile nodes are formed with each cluster having its own clusterhead which is responsible for routing among clusters. In this paper we have proposed and implemented a distributed weighted clustering algorithm for MANETs. This approach is based on combined weight metric that takes into account several system parameters like the node degree, transmission range, energy and mobility of the nodes. We have evaluated the performance of proposed scheme through simulation in various network situations. Simulation results show that proposed scheme outperforms the original distributed weighted clustering algorithm (DWCA).Keywords: MANETs, Clustering, Routing, WirelessCommunication, Distributed Clustering
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1890357 Double Clustering as an Unsupervised Approach for Order Picking of Distributed Warehouses
Authors: Hsin-Yi Huang, Ming-Sheng Liu, Jiun-Yan Shiau
Abstract:
Planning the order picking lists for warehouses to achieve some operational performances is a significant challenge when the costs associated with logistics are relatively high, and it is especially important in e-commerce era. Nowadays, many order planning techniques employ supervised machine learning algorithms. However, to define features for supervised machine learning algorithms is not a simple task. Against this background, we consider whether unsupervised algorithms can enhance the planning of order-picking lists. A double zone picking approach, which is based on using clustering algorithms twice, is developed. A simplified example is given to demonstrate the merit of our approach.
Keywords: order picking, warehouse, clustering, unsupervised learning
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 519356 A Distributed Algorithm for Intrinsic Cluster Detection over Large Spatial Data
Authors: Sauravjyoti Sarmah, Rosy Das, Dhruba Kr. Bhattacharyya
Abstract:
Clustering algorithms help to understand the hidden information present in datasets. A dataset may contain intrinsic and nested clusters, the detection of which is of utmost importance. This paper presents a Distributed Grid-based Density Clustering algorithm capable of identifying arbitrary shaped embedded clusters as well as multi-density clusters over large spatial datasets. For handling massive datasets, we implemented our method using a 'sharednothing' architecture where multiple computers are interconnected over a network. Experimental results are reported to establish the superiority of the technique in terms of scale-up, speedup as well as cluster quality.Keywords: Clustering, Density-based, Grid-based, Adaptive Grid.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1597355 Optimal Economic Restructuring Aimed at an Increase in GDP Constrained by a Decrease in Energy Consumption and CO2 Emissions
Authors: Alexander Y. Vaninsky
Abstract:
The objective of this paper is finding the way of economic restructuring - that is, change in the shares of sectoral gross outputs - resulting in the maximum possible increase in the gross domestic product (GDP) combined with decreases in energy consumption and CO2 emissions. It uses an input-output model for the GDP and factorial models for the energy consumption and CO2 emissions to determine the projection of the gradient of GDP, and the antigradients of the energy consumption and CO2 emissions, respectively, on a subspace formed by the structure-related variables. Since the gradient (antigradient) provides a direction of the steepest increase (decrease) of the objective function, and their projections retain this property for the functions' limitation to the subspace, each of the three directional vectors solves a particular problem of optimal structural change. In the next step, a type of factor analysis is applied to find a convex combination of the projected gradient and antigradients having maximal possible positive correlation with each of the three. This convex combination provides the desired direction of the structural change. The national economy of the United States is used as an example of applications.
Keywords: Economic restructuring, Input-Output analysis, Divisia index, Factorial decomposition, E3 models.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1608354 A Subtractive Clustering Based Approach for Early Prediction of Fault Proneness in Software Modules
Authors: Ramandeep S. Sidhu, Sunil Khullar, Parvinder S. Sandhu, R. P. S. Bedi, Kiranbir Kaur
Abstract:
In this paper, subtractive clustering based fuzzy inference system approach is used for early detection of faults in the function oriented software systems. This approach has been tested with real time defect datasets of NASA software projects named as PC1 and CM1. Both the code based model and joined model (combination of the requirement and code based metrics) of the datasets are used for training and testing of the proposed approach. The performance of the models is recorded in terms of Accuracy, MAE and RMSE values. The performance of the proposed approach is better in case of Joined Model. As evidenced from the results obtained it can be concluded that Clustering and fuzzy logic together provide a simple yet powerful means to model the earlier detection of faults in the function oriented software systems.
Keywords: Subtractive clustering, fuzzy inference system, fault proneness.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2579353 Towards Clustering of Web-based Document Structures
Authors: Matthias Dehmer, Frank Emmert Streib, Jürgen Kilian, Andreas Zulauf
Abstract:
Methods for organizing web data into groups in order to analyze web-based hypertext data and facilitate data availability are very important in terms of the number of documents available online. Thereby, the task of clustering web-based document structures has many applications, e.g., improving information retrieval on the web, better understanding of user navigation behavior, improving web users requests servicing, and increasing web information accessibility. In this paper we investigate a new approach for clustering web-based hypertexts on the basis of their graph structures. The hypertexts will be represented as so called generalized trees which are more general than usual directed rooted trees, e.g., DOM-Trees. As a important preprocessing step we measure the structural similarity between the generalized trees on the basis of a similarity measure d. Then, we apply agglomerative clustering to the obtained similarity matrix in order to create clusters of hypertext graph patterns representing navigation structures. In the present paper we will run our approach on a data set of hypertext structures and obtain good results in Web Structure Mining. Furthermore we outline the application of our approach in Web Usage Mining as future work.Keywords: Clustering methods, graph-based patterns, graph similarity, hypertext structures, web structure mining
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1505352 An Improved Adaptive Dot-Shape Beamforming Algorithm Research on Frequency Diverse Array
Authors: Yanping Liao, Zenan Wu, Ruigang Zhao
Abstract:
Frequency diverse array (FDA) beamforming is a technology developed in recent years, and its antenna pattern has a unique angle-distance-dependent characteristic. However, the beam is always required to have strong concentration, high resolution and low sidelobe level to form the point-to-point interference in the concentrated set. In order to eliminate the angle-distance coupling of the traditional FDA and to make the beam energy more concentrated, this paper adopts a multi-carrier FDA structure based on proposed power exponential frequency offset to improve the array structure and frequency offset of the traditional FDA. The simulation results show that the beam pattern of the array can form a dot-shape beam with more concentrated energy, and its resolution and sidelobe level performance are improved. However, the covariance matrix of the signal in the traditional adaptive beamforming algorithm is estimated by the finite-time snapshot data. When the number of snapshots is limited, the algorithm has an underestimation problem, which leads to the estimation error of the covariance matrix to cause beam distortion, so that the output pattern cannot form a dot-shape beam. And it also has main lobe deviation and high sidelobe level problems in the case of limited snapshot. Aiming at these problems, an adaptive beamforming technique based on exponential correction for multi-carrier FDA is proposed to improve beamforming robustness. The steps are as follows: first, the beamforming of the multi-carrier FDA is formed under linear constrained minimum variance (LCMV) criteria. Then the eigenvalue decomposition of the covariance matrix is performed to obtain the diagonal matrix composed of the interference subspace, the noise subspace and the corresponding eigenvalues. Finally, the correction index is introduced to exponentially correct the small eigenvalues of the noise subspace, improve the divergence of small eigenvalues in the noise subspace, and improve the performance of beamforming. The theoretical analysis and simulation results show that the proposed algorithm can make the multi-carrier FDA form a dot-shape beam at limited snapshots, reduce the sidelobe level, improve the robustness of beamforming, and have better performance.
Keywords: Multi-carrier frequency diverse array, adaptive beamforming, correction index, limited snapshot, robust.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 677351 Innovation Development of Food Market of Kazakhstan
Authors: G.B. Nurlikhina, G.N. Azhimetova, R. M. Ashimova
Abstract:
Currently, one of the main directions is developing of development based on the clustering of economic operations of Kazakhstan, providing for the organization and concentration of production capacity in one region or the most optimal system. In the modern economic literature clustering is regarded as one of the most effective tools to ensure competitive businesses, and improve their business itself.Keywords: A cluster, food market, innovation cluster.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2243350 UDCA: An Energy Efficient Clustering Algorithm for Wireless Sensor Network
Authors: Boregowda S.B., Hemanth Kumar A.R. Babu N.V, Puttamadappa C., And H.S Mruthyunjaya
Abstract:
In the past few years, the use of wireless sensor networks (WSNs) potentially increased in applications such as intrusion detection, forest fire detection, disaster management and battle field. Sensor nodes are generally battery operated low cost devices. The key challenge in the design and operation of WSNs is to prolong the network life time by reducing the energy consumption among sensor nodes. Node clustering is one of the most promising techniques for energy conservation. This paper presents a novel clustering algorithm which maximizes the network lifetime by reducing the number of communication among sensor nodes. This approach also includes new distributed cluster formation technique that enables self-organization of large number of nodes, algorithm for maintaining constant number of clusters by prior selection of cluster head and rotating the role of cluster head to evenly distribute the energy load among all sensor nodes.
Keywords: Clustering algorithms, Cluster head, Energy consumption, Sensor nodes, and Wireless sensor networks.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2390349 Generating Concept Trees from Dynamic Self-organizing Map
Authors: Norashikin Ahmad, Damminda Alahakoon
Abstract:
Self-organizing map (SOM) provides both clustering and visualization capabilities in mining data. Dynamic self-organizing maps such as Growing Self-organizing Map (GSOM) has been developed to overcome the problem of fixed structure in SOM to enable better representation of the discovered patterns. However, in mining large datasets or historical data the hierarchical structure of the data is also useful to view the cluster formation at different levels of abstraction. In this paper, we present a technique to generate concept trees from the GSOM. The formation of tree from different spread factor values of GSOM is also investigated and the quality of the trees analyzed. The results show that concept trees can be generated from GSOM, thus, eliminating the need for re-clustering of the data from scratch to obtain a hierarchical view of the data under study.
Keywords: dynamic self-organizing map, concept formation, clustering.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1458348 Information Gain Ratio Based Clustering for Investigation of Environmental Parameters Effects on Human Mental Performance
Authors: H. Mehdi, Kh. S. Karimov, A. A. Kavokin
Abstract:
Methods of clustering which were developed in the data mining theory can be successfully applied to the investigation of different kinds of dependencies between the conditions of environment and human activities. It is known, that environmental parameters such as temperature, relative humidity, atmospheric pressure and illumination have significant effects on the human mental performance. To investigate these parameters effect, data mining technique of clustering using entropy and Information Gain Ratio (IGR) K(Y/X) = (H(X)–H(Y/X))/H(Y) is used, where H(Y)=-ΣPi ln(Pi). This technique allows adjusting the boundaries of clusters. It is shown that the information gain ratio (IGR) grows monotonically and simultaneously with degree of connectivity between two variables. This approach has some preferences if compared, for example, with correlation analysis due to relatively smaller sensitivity to shape of functional dependencies. Variant of an algorithm to implement the proposed method with some analysis of above problem of environmental effects is also presented. It was shown that proposed method converges with finite number of steps.Keywords: Clustering, Correlation analysis, EnvironmentalParameters, Information Gain Ratio, Mental Performance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1823347 Pattern Recognition Using Feature Based Die-Map Clusteringin the Semiconductor Manufacturing Process
Authors: Seung Hwan Park, Cheng-Sool Park, Jun Seok Kim, Youngji Yoo, Daewoong An, Jun-Geol Baek
Abstract:
Depending on the big data analysis becomes important, yield prediction using data from the semiconductor process is essential. In general, yield prediction and analysis of the causes of the failure are closely related. The purpose of this study is to analyze pattern affects the final test results using a die map based clustering. Many researches have been conducted using die data from the semiconductor test process. However, analysis has limitation as the test data is less directly related to the final test results. Therefore, this study proposes a framework for analysis through clustering using more detailed data than existing die data. This study consists of three phases. In the first phase, die map is created through fail bit data in each sub-area of die. In the second phase, clustering using map data is performed. And the third stage is to find patterns that affect final test result. Finally, the proposed three steps are applied to actual industrial data and experimental results showed the potential field application.
Keywords: Die-Map Clustering, Feature Extraction, Pattern Recognition, Semiconductor Manufacturing Process.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3151346 K-Means for Spherical Clusters with Large Variance in Sizes
Authors: A. M. Fahim, G. Saake, A. M. Salem, F. A. Torkey, M. A. Ramadan
Abstract:
Data clustering is an important data exploration technique with many applications in data mining. The k-means algorithm is well known for its efficiency in clustering large data sets. However, this algorithm is suitable for spherical shaped clusters of similar sizes and densities. The quality of the resulting clusters decreases when the data set contains spherical shaped with large variance in sizes. In this paper, we introduce a competent procedure to overcome this problem. The proposed method is based on shifting the center of the large cluster toward the small cluster, and recomputing the membership of small cluster points, the experimental results reveal that the proposed algorithm produces satisfactory results.Keywords: K-Means, Data Clustering, Cluster Analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3280345 Improving RBF Networks Classification Performance by using K-Harmonic Means
Authors: Z. Zainuddin, W. K. Lye
Abstract:
In this paper, a clustering algorithm named KHarmonic means (KHM) was employed in the training of Radial Basis Function Networks (RBFNs). KHM organized the data in clusters and determined the centres of the basis function. The popular clustering algorithms, namely K-means (KM) and Fuzzy c-means (FCM), are highly dependent on the initial identification of elements that represent the cluster well. In KHM, the problem can be avoided. This leads to improvement in the classification performance when compared to other clustering algorithms. A comparison of the classification accuracy was performed between KM, FCM and KHM. The classification performance is based on the benchmark data sets: Iris Plant, Diabetes and Breast Cancer. RBFN training with the KHM algorithm shows better accuracy in classification problem.Keywords: Neural networks, Radial basis functions, Clusteringmethod, K-harmonic means.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1849344 Clustering Based Formulation for Short Term Load Forecasting
Authors: Ajay Shekhar Pandey, D. Singh, S. K. Sinha
Abstract:
A clustering based technique has been developed and implemented for Short Term Load Forecasting, in this article. Formulation has been done using Mean Absolute Percentage Error (MAPE) as an objective function. Data Matrix and cluster size are optimization variables. Model designed, uses two temperature variables. This is compared with six input Radial Basis Function Neural Network (RBFNN) and Fuzzy Inference Neural Network (FINN) for the data of the same system, for same time period. The fuzzy inference system has the network structure and the training procedure of a neural network which initially creates a rule base from existing historical load data. It is observed that the proposed clustering based model is giving better forecasting accuracy as compared to the other two methods. Test results also indicate that the RBFNN can forecast future loads with accuracy comparable to that of proposed method, where as the training time required in the case of FINN is much less.
Keywords: Load forecasting, clustering, fuzzy inference.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1625343 Fuzzy Relatives of the CLARANS Algorithm With Application to Text Clustering
Authors: Mohamed A. Mahfouz, M. A. Ismail
Abstract:
This paper introduces new algorithms (Fuzzy relative of the CLARANS algorithm FCLARANS and Fuzzy c Medoids based on randomized search FCMRANS) for fuzzy clustering of relational data. Unlike existing fuzzy c-medoids algorithm (FCMdd) in which the within cluster dissimilarity of each cluster is minimized in each iteration by recomputing new medoids given current memberships, FCLARANS minimizes the same objective function minimized by FCMdd by changing current medoids in such away that that the sum of the within cluster dissimilarities is minimized. Computing new medoids may be effected by noise because outliers may join the computation of medoids while the choice of medoids in FCLARANS is dictated by the location of a predominant fraction of points inside a cluster and, therefore, it is less sensitive to the presence of outliers. In FCMRANS the step of computing new medoids in FCMdd is modified to be based on randomized search. Furthermore, a new initialization procedure is developed that add randomness to the initialization procedure used with FCMdd. Both FCLARANS and FCMRANS are compared with the robust and linearized version of fuzzy c-medoids (RFCMdd). Experimental results with different samples of the Reuter-21578, Newsgroups (20NG) and generated datasets with noise show that FCLARANS is more robust than both RFCMdd and FCMRANS. Finally, both FCMRANS and FCLARANS are more efficient and their outputs are almost the same as that of RFCMdd in terms of classification rate.Keywords: Data Mining, Fuzzy Clustering, Relational Clustering, Medoid-Based Clustering, Cluster Analysis, Unsupervised Learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2401342 Electricity Generation from Renewables and Targets: An Application of Multivariate Statistical Techniques
Authors: Filiz Ersoz, Taner Ersoz, Tugrul Bayraktar
Abstract:
Renewable energy is referred to as "clean energy" and common popular support for the use of renewable energy (RE) is to provide electricity with zero carbon dioxide emissions. This study provides useful insight into the European Union (EU) RE, especially, into electricity generation obtained from renewables, and their targets. The objective of this study is to identify groups of European countries, using multivariate statistical analysis and selected indicators. The hierarchical clustering method is used to decide the number of clusters for EU countries. The conducted statistical hierarchical cluster analysis is based on the Ward’s clustering method and squared Euclidean distances. Hierarchical cluster analysis identified eight distinct clusters of European countries. Then, non-hierarchical clustering (k-means) method was applied. Discriminant analysis was used to determine the validity of the results with data normalized by Z score transformation. To explore the relationship between the selected indicators, correlation coefficients were computed. The results of the study reveal the current situation of RE in European Union Member States.Keywords: Share of electricity generation, CO2 emission, targets, multivariate methods, hierarchical clustering, K-means clustering, discriminant analyzed, correlation, EU member countries.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1247341 Quantity and Quality Aware Artificial Bee Colony Algorithm for Clustering
Authors: U. Idachaba, F. Z. Wang, A. Qi, N. Helian
Abstract:
Artificial Bee Colony (ABC) algorithm is a relatively new swarm intelligence technique for clustering. It produces higher quality clusters compared to other population-based algorithms but with poor energy efficiency, cluster quality consistency and typically slower in convergence speed. Inspired by energy saving foraging behavior of natural honey bees this paper presents a Quality and Quantity Aware Artificial Bee Colony (Q2ABC) algorithm to improve quality of cluster identification, energy efficiency and convergence speed of the original ABC. To evaluate the performance of Q2ABC algorithm, experiments were conducted on a suite of ten benchmark UCI datasets. The results demonstrate Q2ABC outperformed ABC and K-means algorithm in the quality of clusters delivered.
Keywords: Artificial bee colony algorithm, clustering.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2120340 A Hybrid Method for Determination of Effective Poles Using Clustering Dominant Pole Algorithm
Authors: Anuj Abraham, N. Pappa, Daniel Honc, Rahul Sharma
Abstract:
In this paper, an analysis of some model order reduction techniques is presented. A new hybrid algorithm for model order reduction of linear time invariant systems is compared with the conventional techniques namely Balanced Truncation, Hankel Norm reduction and Dominant Pole Algorithm (DPA). The proposed hybrid algorithm is known as Clustering Dominant Pole Algorithm (CDPA), is able to compute the full set of dominant poles and its cluster center efficiently. The dominant poles of a transfer function are specific eigenvalues of the state space matrix of the corresponding dynamical system. The effectiveness of this novel technique is shown through the simulation results.
Keywords: Balanced truncation, Clustering, Dominant pole, Hankel norm, Model reduction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2687339 A Computational Cost-Effective Clustering Algorithm in Multidimensional Space Using the Manhattan Metric: Application to the Global Terrorism Database
Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami
Abstract:
The increasing amount of collected data has limited the performance of the current analyzing algorithms. Thus, developing new cost-effective algorithms in terms of complexity, scalability, and accuracy raised significant interests. In this paper, a modified effective k-means based algorithm is developed and experimented. The new algorithm aims to reduce the computational load without significantly affecting the quality of the clusterings. The algorithm uses the City Block distance and a new stop criterion to guarantee the convergence. Conducted experiments on a real data set show its high performance when compared with the original k-means version.
Keywords: Pattern recognition, partitional clustering, K-means clustering, Manhattan distance, terrorism data analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1359338 Random Subspace Neural Classifier for Meteor Recognition in the Night Sky
Authors: Carlos Vera, Tetyana Baydyk, Ernst Kussul, Graciela Velasco, Miguel Aparicio
Abstract:
This article describes the Random Subspace Neural Classifier (RSC) for the recognition of meteors in the night sky. We used images of meteors entering the atmosphere at night between 8:00 p.m.-5: 00 a.m. The objective of this project is to classify meteor and star images (with stars as the image background). The monitoring of the sky and the classification of meteors are made for future applications by scientists. The image database was collected from different websites. We worked with RGB-type images with dimensions of 220x220 pixels stored in the BitMap Protocol (BMP) format. Subsequent window scanning and processing were carried out for each image. The scan window where the characteristics were extracted had the size of 20x20 pixels with a scanning step size of 10 pixels. Brightness, contrast and contour orientation histograms were used as inputs for the RSC. The RSC worked with two classes and classified into: 1) with meteors and 2) without meteors. Different tests were carried out by varying the number of training cycles and the number of images for training and recognition. The percentage error for the neural classifier was calculated. The results show a good RSC classifier response with 89% correct recognition. The results of these experiments are presented and discussed.
Keywords: Contour orientation histogram, meteors, night sky, RSC neural classifier, stars.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 406337 Fast and Accuracy Control Chart Pattern Recognition using a New cluster-k-Nearest Neighbor
Authors: Samir Brahim Belhaouari
Abstract:
By taking advantage of both k-NN which is highly accurate and K-means cluster which is able to reduce the time of classification, we can introduce Cluster-k-Nearest Neighbor as "variable k"-NN dealing with the centroid or mean point of all subclasses generated by clustering algorithm. In general the algorithm of K-means cluster is not stable, in term of accuracy, for that reason we develop another algorithm for clustering our space which gives a higher accuracy than K-means cluster, less subclass number, stability and bounded time of classification with respect to the variable data size. We find between 96% and 99.7 % of accuracy in the lassification of 6 different types of Time series by using K-means cluster algorithm and we find 99.7% by using the new clustering algorithm.Keywords: Pattern recognition, Time series, k-Nearest Neighbor, k-means cluster, Gaussian Mixture Model, Classification
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1964336 Neural Networks Learning Improvement using the K-Means Clustering Algorithm to Detect Network Intrusions
Authors: K. M. Faraoun, A. Boukelif
Abstract:
In the present work, we propose a new technique to enhance the learning capabilities and reduce the computation intensity of a competitive learning multi-layered neural network using the K-means clustering algorithm. The proposed model use multi-layered network architecture with a back propagation learning mechanism. The K-means algorithm is first applied to the training dataset to reduce the amount of samples to be presented to the neural network, by automatically selecting an optimal set of samples. The obtained results demonstrate that the proposed technique performs exceptionally in terms of both accuracy and computation time when applied to the KDD99 dataset compared to a standard learning schema that use the full dataset.Keywords: Neural networks, Intrusion detection, learningenhancement, K-means clustering
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3609335 Data Oriented Model of Image: as a Framework for Image Processing
Authors: A. Habibizad Navin, A. Sadighi, M. Naghian Fesharaki, M. Mirnia, M. Teshnelab, R. Keshmiri
Abstract:
This paper presents a new data oriented model of image. Then a representation of it, ADBT, is introduced. The ability of ADBT is clustering, segmentation, measuring similarity of images etc, with desired precision and corresponding speed.
Keywords: Data oriented modelling, image, clustering, segmentation, classification, ADBT and image processing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1798334 Using Pattern Search Methods for Minimizing Clustering Problems
Authors: Parvaneh Shabanzadeh, Malik Hj Abu Hassan, Leong Wah June, Maryam Mohagheghtabar
Abstract:
Clustering is one of an interesting data mining topics that can be applied in many fields. Recently, the problem of cluster analysis is formulated as a problem of nonsmooth, nonconvex optimization, and an algorithm for solving the cluster analysis problem based on nonsmooth optimization techniques is developed. This optimization problem has a number of characteristics that make it challenging: it has many local minimum, the optimization variables can be either continuous or categorical, and there are no exact analytical derivatives. In this study we show how to apply a particular class of optimization methods known as pattern search methods to address these challenges. These methods do not explicitly use derivatives, an important feature that has not been addressed in previous studies. Results of numerical experiments are presented which demonstrate the effectiveness of the proposed method.Keywords: Clustering functions, Non-smooth Optimization, Nonconvex Optimization, Pattern Search Method.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1639333 Ensemble Learning with Decision Tree for Remote Sensing Classification
Authors: Mahesh Pal
Abstract:
In recent years, a number of works proposing the combination of multiple classifiers to produce a single classification have been reported in remote sensing literature. The resulting classifier, referred to as an ensemble classifier, is generally found to be more accurate than any of the individual classifiers making up the ensemble. As accuracy is the primary concern, much of the research in the field of land cover classification is focused on improving classification accuracy. This study compares the performance of four ensemble approaches (boosting, bagging, DECORATE and random subspace) with a univariate decision tree as base classifier. Two training datasets, one without ant noise and other with 20 percent noise was used to judge the performance of different ensemble approaches. Results with noise free data set suggest an improvement of about 4% in classification accuracy with all ensemble approaches in comparison to the results provided by univariate decision tree classifier. Highest classification accuracy of 87.43% was achieved by boosted decision tree. A comparison of results with noisy data set suggests that bagging, DECORATE and random subspace approaches works well with this data whereas the performance of boosted decision tree degrades and a classification accuracy of 79.7% is achieved which is even lower than that is achieved (i.e. 80.02%) by using unboosted decision tree classifier.Keywords: Ensemble learning, decision tree, remote sensingclassification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2583332 Spatial Clustering Model of Vessel Trajectory to Extract Sailing Routes Based on AIS Data
Authors: Lubna Eljabu, Mohammad Etemad, Stan Matwin
Abstract:
The automatic extraction of shipping routes is advantageous for intelligent traffic management systems to identify events and support decision-making in maritime surveillance. At present, there is a high demand for the extraction of maritime traffic networks that resemble the real traffic of vessels accurately, which is valuable for further analytical processing tasks for vessels trajectories (e.g., naval routing and voyage planning, anomaly detection, destination prediction, time of arrival estimation). With the help of big data and processing huge amounts of vessels’ trajectory data, it is possible to learn these shipping routes from the navigation history of past behaviour of other, similar ships that were travelling in a given area. In this paper, we propose a spatial clustering model of vessels’ trajectories (SPTCLUST) to extract spatial representations of sailing routes from historical Automatic Identification System (AIS) data. The whole model consists of three main parts: data preprocessing, path finding, and route extraction, which consists of clustering and representative trajectory extraction. The proposed clustering method provides techniques to overcome the problems of: (i) optimal input parameters selection; (ii) the high complexity of processing a huge volume of multidimensional data; (iii) and the spatial representation of complete representative trajectory detection in the context of trajectory clustering algorithms. The experimental evaluation showed the effectiveness of the proposed model by using a real-world AIS dataset from the Port of Halifax. The results contribute to further understanding of shipping route patterns. This could aid surveillance authorities in stable and sustainable vessel traffic management.
Keywords: Vessel trajectory clustering, trajectory mining, Spatial Clustering, marine intelligent navigation, maritime traffic network extraction, sdailing routes extraction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 453331 Advanced Information Extraction with n-gram based LSI
Authors: Ahmet Güven, Ö. Özgür Bozkurt, Oya Kalıpsız
Abstract:
Number of documents being created increases at an increasing pace while most of them being in already known topics and little of them introducing new concepts. This fact has started a new era in information retrieval discipline where the requirements have their own specialties. That is digging into topics and concepts and finding out subtopics or relations between topics. Up to now IR researches were interested in retrieving documents about a general topic or clustering documents under generic subjects. However these conventional approaches can-t go deep into content of documents which makes it difficult for people to reach to right documents they were searching. So we need new ways of mining document sets where the critic point is to know much about the contents of the documents. As a solution we are proposing to enhance LSI, one of the proven IR techniques by supporting its vector space with n-gram forms of words. Positive results we have obtained are shown in two different application area of IR domain; querying a document database, clustering documents in the document database.Keywords: Document clustering, Information Extraction, Information Retrieval, LSI, n-gram.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1802330 A Spanning Tree for Enhanced Cluster Based Routing in Wireless Sensor Network
Authors: M. Saravanan, M. Madheswaran
Abstract:
Wireless Sensor Network (WSN) clustering architecture enables features like network scalability, communication overhead reduction, and fault tolerance. After clustering, aggregated data is transferred to data sink and reducing unnecessary, redundant data transfer. It reduces nodes transmitting, and so saves energy consumption. Also, it allows scalability for many nodes, reduces communication overhead, and allows efficient use of WSN resources. Clustering based routing methods manage network energy consumption efficiently. Building spanning trees for data collection rooted at a sink node is a fundamental data aggregation method in sensor networks. The problem of determining Cluster Head (CH) optimal number is an NP-Hard problem. In this paper, we combine cluster based routing features for cluster formation and CH selection and use Minimum Spanning Tree (MST) for intra-cluster communication. The proposed method is based on optimizing MST using Simulated Annealing (SA). In this work, normalized values of mobility, delay, and remaining energy are considered for finding optimal MST. Simulation results demonstrate the effectiveness of the proposed method in improving the packet delivery ratio and reducing the end to end delay.
Keywords: Wireless sensor network, clustering, minimum spanning tree, genetic algorithm, low energy adaptive clustering hierarchy, simulated annealing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1786