Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 11

Data Clustering Related Publications

11 Resistance and Sub-Resistances of RC Beams Subjected to Multiple Failure Modes

Authors: F. Sangiorgio, J. Silfwerbrand, G. Mancini

Abstract:

Geometric and mechanical properties all influence the resistance of RC structures and may, in certain combination of property values, increase the risk of a brittle failure of the whole system. This paper presents a statistical and probabilistic investigation on the resistance of RC beams designed according to Eurocodes 2 and 8, and subjected to multiple failure modes, under both the natural variation of material properties and the uncertainty associated with cross-section and transverse reinforcement geometry. A full probabilistic model based on JCSS Probabilistic Model Code is derived. Different beams are studied through material nonlinear analysis via Monte Carlo simulations. The resistance model is consistent with Eurocode 2. Both a multivariate statistical evaluation and the data clustering analysis of outcomes are then performed. Results show that the ultimate load behaviour of RC beams subjected to flexural and shear failure modes seems to be mainly influenced by the combination of the mechanical properties of both longitudinal reinforcement and stirrups, and the tensile strength of concrete, of which the latter appears to affect the overall response of the system in a nonlinear way. The model uncertainty of the resistance model used in the analysis plays undoubtedly an important role in interpreting results.

Keywords: Modelling, Data Clustering, Monte Carlo simulations, reinforced concrete members, Probabilistic Models, Structural Design

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1743
10 A Growing Natural Gas Approach for Evaluating Quality of Software Modules

Authors: Parvinder S. Sandhu, Sandeep Khimta, Kiranpreet Kaur

Abstract:

The prediction of Software quality during development life cycle of software project helps the development organization to make efficient use of available resource to produce the product of highest quality. “Whether a module is faulty or not" approach can be used to predict quality of a software module. There are numbers of software quality prediction models described in the literature based upon genetic algorithms, artificial neural network and other data mining algorithms. One of the promising aspects for quality prediction is based on clustering techniques. Most quality prediction models that are based on clustering techniques make use of K-means, Mixture-of-Guassians, Self-Organizing Map, Neural Gas and fuzzy K-means algorithm for prediction. In all these techniques a predefined structure is required that is number of neurons or clusters should be known before we start clustering process. But in case of Growing Neural Gas there is no need of predetermining the quantity of neurons and the topology of the structure to be used and it starts with a minimal neurons structure that is incremented during training until it reaches a maximum number user defined limits for clusters. Hence, in this work we have used Growing Neural Gas as underlying cluster algorithm that produces the initial set of labeled cluster from training data set and thereafter this set of clusters is used to predict the quality of test data set of software modules. The best testing results shows 80% accuracy in evaluating the quality of software modules. Hence, the proposed technique can be used by programmers in evaluating the quality of modules during software development.

Keywords: Fault Prediction, Data Clustering, Growing Neural Gas

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1451
9 A Context-Aware Supplier Selection Model

Authors: Mohammadreza Razzazi, Maryam Bayat

Abstract:

Selection of the best possible set of suppliers has a significant impact on the overall profitability and success of any business. For this reason, it is usually necessary to optimize all business processes and to make use of cost-effective alternatives for additional savings. This paper proposes a new efficient context-aware supplier selection model that takes into account possible changes of the environment while significantly reducing selection costs. The proposed model is based on data clustering techniques while inspiring certain principles of online algorithms for an optimally selection of suppliers. Unlike common selection models which re-run the selection algorithm from the scratch-line for any decision-making sub-period on the whole environment, our model considers the changes only and superimposes it to the previously defined best set of suppliers to obtain a new best set of suppliers. Therefore, any recomputation of unchanged elements of the environment is avoided and selection costs are consequently reduced significantly. A numerical evaluation confirms applicability of this model and proves that it is a more optimal solution compared with common static selection models in this field.

Keywords: Supplier Selection, Data Clustering, Context-Awareness, OnlineAlgorithms

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1401
8 Application of a New Hybrid Optimization Algorithm on Cluster Analysis

Authors: B.Bahmani Firouzi, T. Niknam, M. Nayeripour

Abstract:

Clustering techniques have received attention in many areas including engineering, medicine, biology and data mining. The purpose of clustering is to group together data points, which are close to one another. The K-means algorithm is one of the most widely used techniques for clustering. However, K-means has two shortcomings: dependency on the initial state and convergence to local optima and global solutions of large problems cannot found with reasonable amount of computation effort. In order to overcome local optima problem lots of studies done in clustering. This paper is presented an efficient hybrid evolutionary optimization algorithm based on combining Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO), called PSO-ACO, for optimally clustering N object into K clusters. The new PSO-ACO algorithm is tested on several data sets, and its performance is compared with those of ACO, PSO and K-means clustering. The simulation results show that the proposed evolutionary optimization algorithm is robust and suitable for handing data clustering.

Keywords: Data Clustering, K-Means Clustering, Particle Swarm Optimization (PSO), ant colony optimization (ACO), Hybrid evolutionary optimization algorithm

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1839
7 A New Evolutionary Algorithm for Cluster Analysis

Authors: B.Bahmani Firouzi, T. Niknam, M. Nayeripour

Abstract:

Clustering is a very well known technique in data mining. One of the most widely used clustering techniques is the kmeans algorithm. Solutions obtained from this technique depend on the initialization of cluster centers and the final solution converges to local minima. In order to overcome K-means algorithm shortcomings, this paper proposes a hybrid evolutionary algorithm based on the combination of PSO, SA and K-means algorithms, called PSO-SA-K, which can find better cluster partition. The performance is evaluated through several benchmark data sets. The simulation results show that the proposed algorithm outperforms previous approaches, such as PSO, SA and K-means for partitional clustering problem.

Keywords: Data Clustering, Particle Swarm Optimization (PSO), K-means algorithm, Hybrid evolutionary optimization algorithm, Simulated Annealing (SA)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1872
6 DCBOR: A Density Clustering Based on Outlier Removal

Authors: F. A. Torkey, A. M. Salem, M. A. Ramadan, A. M. Fahim, G. Saake

Abstract:

Data clustering is an important data exploration technique with many applications in data mining. We present an enhanced version of the well known single link clustering algorithm. We will refer to this algorithm as DCBOR. The proposed algorithm alleviates the chain effect by removing the outliers from the given dataset. So this algorithm provides outlier detection and data clustering simultaneously. This algorithm does not need to update the distance matrix, since the algorithm depends on merging the most k-nearest objects in one step and the cluster continues grow as long as possible under specified condition. So the algorithm consists of two phases; at the first phase, it removes the outliers from the input dataset. At the second phase, it performs the clustering process. This algorithm discovers clusters of different shapes, sizes, densities and requires only one input parameter; this parameter represents a threshold for outlier points. The value of the input parameter is ranging from 0 to 1. The algorithm supports the user in determining an appropriate value for it. We have tested this algorithm on different datasets contain outlier and connecting clusters by chain of density points, and the algorithm discovers the correct clusters. The results of our experiments demonstrate the effectiveness and the efficiency of DCBOR.

Keywords: Data Clustering, Clustering Algorithms, Arbitrary Shape of clusters, Handling Noise

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1557
5 K-Means for Spherical Clusters with Large Variance in Sizes

Authors: F. A. Torkey, A. M. Salem, M. A. Ramadan, A. M. Fahim, G. Saake

Abstract:

Data clustering is an important data exploration technique with many applications in data mining. The k-means algorithm is well known for its efficiency in clustering large data sets. However, this algorithm is suitable for spherical shaped clusters of similar sizes and densities. The quality of the resulting clusters decreases when the data set contains spherical shaped with large variance in sizes. In this paper, we introduce a competent procedure to overcome this problem. The proposed method is based on shifting the center of the large cluster toward the small cluster, and recomputing the membership of small cluster points, the experimental results reveal that the proposed algorithm produces satisfactory results.

Keywords: Cluster Analysis, Data Clustering, k-means

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2954
4 Exponential Particle Swarm Optimization Approach for Improving Data Clustering

Authors: Neveen I. Ghali, Nahed El-Dessouki, Mervat A. N., Lamiaa Bakrawi

Abstract:

In this paper we use exponential particle swarm optimization (EPSO) to cluster data. Then we compare between (EPSO) clustering algorithm which depends on exponential variation for the inertia weight and particle swarm optimization (PSO) clustering algorithm which depends on linear inertia weight. This comparison is evaluated on five data sets. The experimental results show that EPSO clustering algorithm increases the possibility to find the optimal positions as it decrease the number of failure. Also show that (EPSO) clustering algorithm has a smaller quantization error than (PSO) clustering algorithm, i.e. (EPSO) clustering algorithm more accurate than (PSO) clustering algorithm.

Keywords: Particle Swarm Optimization, Data Clustering, exponential PSO

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1320
3 Observations about the Principal Components Analysis and Data Clustering Techniques in the Study of Medical Data

Authors: Cristina G. Dascalu, Corina Dima Cozma, Elena Carmen Cotrutz

Abstract:

The medical data statistical analysis often requires the using of some special techniques, because of the particularities of these data. The principal components analysis and the data clustering are two statistical methods for data mining very useful in the medical field, the first one as a method to decrease the number of studied parameters, and the second one as a method to analyze the connections between diagnosis and the data about the patient-s condition. In this paper we investigate the implications obtained from a specific data analysis technique: the data clustering preceded by a selection of the most relevant parameters, made using the principal components analysis. Our assumption was that, using the principal components analysis before data clustering - in order to select and to classify only the most relevant parameters – the accuracy of clustering is improved, but the practical results showed the opposite fact: the clustering accuracy decreases, with a percentage approximately equal with the percentage of information loss reported by the principal components analysis.

Keywords: Medical Data, Data Clustering, principal components analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1139
2 Using Data Clustering in Oral Medicine

Authors: Fahad Shahbaz Khan, Rao Muhammad Anwer, Olof Torgersson

Abstract:

The vast amount of information hidden in huge databases has created tremendous interests in the field of data mining. This paper examines the possibility of using data clustering techniques in oral medicine to identify functional relationships between different attributes and classification of similar patient examinations. Commonly used data clustering algorithms have been reviewed and as a result several interesting results have been gathered.

Keywords: Data Mining, Oral medicine, Data Clustering, Cluto

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1579
1 IMDC: An Image-Mapped Data Clustering Technique for Large Datasets

Authors: Faruq A. Al-Omari, Nabeel I. Al-Fayoumi

Abstract:

In this paper, we present a new algorithm for clustering data in large datasets using image processing approaches. First the dataset is mapped into a binary image plane. The synthesized image is then processed utilizing efficient image processing techniques to cluster the data in the dataset. Henceforth, the algorithm avoids exhaustive search to identify clusters. The algorithm considers only a small set of the data that contains critical boundary information sufficient to identify contained clusters. Compared to available data clustering techniques, the proposed algorithm produces similar quality results and outperforms them in execution time and storage requirements.

Keywords: Data Mining, Pattern Discovery, Data Clustering, predictive analysis, Image-mapping

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1130