Search results for: clustering algorithms
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1841

Search results for: clustering algorithms

1721 Performance Comparison of Parallel Sorting Algorithms on the Cluster of Workstations

Authors: Lai Lai Win Kyi, Nay Min Tun

Abstract:

Sorting appears the most attention among all computational tasks over the past years because sorted data is at the heart of many computations. Sorting is of additional importance to parallel computing because of its close relation to the task of routing data among processes, which is an essential part of many parallel algorithms. Many parallel sorting algorithms have been investigated for a variety of parallel computer architectures. In this paper, three parallel sorting algorithms have been implemented and compared in terms of their overall execution time. The algorithms implemented are the odd-even transposition sort, parallel merge sort and parallel rank sort. Cluster of Workstations or Windows Compute Cluster has been used to compare the algorithms implemented. The C# programming language is used to develop the sorting algorithms. The MPI (Message Passing Interface) library has been selected to establish the communication and synchronization between processors. The time complexity for each parallel sorting algorithm will also be mentioned and analyzed.

Keywords: Cluster of Workstations, Parallel sorting algorithms, performance analysis, parallel computing and MPI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1482
1720 Hierarchical Clustering Analysis with SOM Networks

Authors: Diego Ordonez, Carlos Dafonte, Minia Manteiga, Bernardino Arcayy

Abstract:

This work presents a neural network model for the clustering analysis of data based on Self Organizing Maps (SOM). The model evolves during the training stage towards a hierarchical structure according to the input requirements. The hierarchical structure symbolizes a specialization tool that provides refinements of the classification process. The structure behaves like a single map with different resolutions depending on the region to analyze. The benefits and performance of the algorithm are discussed in application to the Iris dataset, a classical example for pattern recognition.

Keywords: Neural networks, Self-organizing feature maps, Hierarchicalsystems, Pattern clustering methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1947
1719 Order Reduction using Modified Pole Clustering and Pade Approximations

Authors: C.B. Vishwakarma

Abstract:

The authors present a mixed method for reducing the order of the large-scale dynamic systems. In this method, the denominator polynomial of the reduced order model is obtained by using the modified pole clustering technique while the coefficients of the numerator are obtained by Pade approximations. This method is conceptually simple and always generates stable reduced models if the original high-order system is stable. The proposed method is illustrated with the help of the numerical examples taken from the literature.

Keywords: Modified pole clustering, order reduction, padeapproximation, stability, transfer function.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2980
1718 Image Clustering Framework for BAVM Segmentation in 3DRA Images: Performance Analysis

Authors: FH. Sarieddeen, R. El Berbari, S. Imad, J. Abdel Baki, M. Hamad, R. Blanc, A. Nakib, Y.Chenoune

Abstract:

Brain ArterioVenous Malformation (BAVM) is an abnormal tangle of brain blood vessels where arteries shunt directly into veins with no intervening capillary bed which causes high pressure and hemorrhage risk. The success of treatment by embolization in interventional neuroradiology is highly dependent on the accuracy of the vessels visualization. In this paper the performance of clustering techniques on vessel segmentation from 3- D rotational angiography (3DRA) images is investigated and a new technique of segmentation is proposed. This method consists in: preprocessing step of image enhancement, then K-Means (KM), Fuzzy C-Means (FCM) and Expectation Maximization (EM) clustering are used to separate vessel pixels from background and artery pixels from vein pixels when possible. A post processing step of removing false-alarm components is applied before constructing a three-dimensional volume of the vessels. The proposed method was tested on six datasets along with a medical assessment of an expert. Obtained results showed encouraging segmentations.

Keywords: Brain arteriovenous malformation (BAVM), 3-D rotational angiography (3DRA), K-Means (KM) clustering, Fuzzy CMeans (FCM) clustering, Expectation Maximization (EM) clustering, volume rendering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1910
1717 The Robust Clustering with Reduction Dimension

Authors: Dyah E. Herwindiati

Abstract:

A clustering is process to identify a homogeneous groups of object called as cluster. Clustering is one interesting topic on data mining. A group or class behaves similarly characteristics. This paper discusses a robust clustering process for data images with two reduction dimension approaches; i.e. the two dimensional principal component analysis (2DPCA) and principal component analysis (PCA). A standard approach to overcome this problem is dimension reduction, which transforms a high-dimensional data into a lower-dimensional space with limited loss of information. One of the most common forms of dimensionality reduction is the principal components analysis (PCA). The 2DPCA is often called a variant of principal component (PCA), the image matrices were directly treated as 2D matrices; they do not need to be transformed into a vector so that the covariance matrix of image can be constructed directly using the original image matrices. The decomposed classical covariance matrix is very sensitive to outlying observations. The objective of paper is to compare the performance of robust minimizing vector variance (MVV) in the two dimensional projection PCA (2DPCA) and the PCA for clustering on an arbitrary data image when outliers are hiden in the data set. The simulation aspects of robustness and the illustration of clustering images are discussed in the end of paper

Keywords: Breakdown point, Consistency, 2DPCA, PCA, Outlier, Vector Variance

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1697
1716 Queen-bee Algorithm for Energy Efficient Clusters in Wireless Sensor Networks

Authors: Z. Pooranian, A. Barati, A. Movaghar

Abstract:

Wireless sensor networks include small nodes which have sensing ability; calculation and connection extend themselves everywhere soon. Such networks have source limitation on connection, calculation and energy consumption. So, since the nodes have limited energy in sensor networks, the optimized energy consumption in these networks is of more importance and has created many challenges. The previous works have shown that by organizing the network nodes in a number of clusters, the energy consumption could be reduced considerably. So the lifetime of the network would be increased. In this paper, we used the Queen-bee algorithm to create energy efficient clusters in wireless sensor networks. The Queen-bee (QB) is similar to nature in that the queen-bee plays a major role in reproduction process. The QB is simulated with J-sim simulator. The results of the simulation showed that the clustering by the QB algorithm decreases the energy consumption with regard to the other existing algorithms and increases the lifetime of the network.

Keywords: Queen-bee, sensor network, energy efficient, clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1974
1715 Fuzzy Clustering Analysis in Real Estate Companies in China

Authors: Jianfeng Li, Feng Jin, Xiaoyu Yang

Abstract:

This paper applies fuzzy clustering algorithm in classifying real estate companies in China according to some general financial indexes, such as income per share, share accumulation fund, net profit margins, weighted net assets yield and shareholders' equity. By constructing and normalizing initial partition matrix, getting fuzzy similar matrix with Minkowski metric and gaining the transitive closure, the dynamic fuzzy clustering analysis for real estate companies is shown clearly that different clustered result change gradually with the threshold reducing, and then, it-s shown there is the similar relationship with the prices of those companies in stock market. In this way, it-s great valuable in contrasting the real estate companies- financial condition in order to grasp some good chances of investment, and so on.

Keywords: Fuzzy clustering algorithm, data mining, real estate company, financial analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1917
1714 Hybrid Recommender Systems using Social Network Analysis

Authors: Kyoung-Jae Kim, Hyunchul Ahn

Abstract:

This study proposes novel hybrid social network analysis and collaborative filtering approach to enhance the performance of recommender systems. The proposed model selects subgroups of users in Internet community through social network analysis (SNA), and then performs clustering analysis using the information about subgroups. Finally, it makes recommendations using cluster-indexing CF based on the clustering results. This study tries to use the cores in subgroups as an initial seed for a conventional clustering algorithm. This model chooses five cores which have the highest value of degree centrality from SNA, and then performs clustering analysis by using the cores as initial centroids (cluster centers). Then, the model amplifies the impact of friends in social network in the process of cluster-indexing CF.

Keywords: Social network analysis, Recommender systems, Collaborative filtering, Customer relationship management

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2773
1713 A Context-Aware Supplier Selection Model

Authors: Mohammadreza Razzazi, Maryam Bayat

Abstract:

Selection of the best possible set of suppliers has a significant impact on the overall profitability and success of any business. For this reason, it is usually necessary to optimize all business processes and to make use of cost-effective alternatives for additional savings. This paper proposes a new efficient context-aware supplier selection model that takes into account possible changes of the environment while significantly reducing selection costs. The proposed model is based on data clustering techniques while inspiring certain principles of online algorithms for an optimally selection of suppliers. Unlike common selection models which re-run the selection algorithm from the scratch-line for any decision-making sub-period on the whole environment, our model considers the changes only and superimposes it to the previously defined best set of suppliers to obtain a new best set of suppliers. Therefore, any recomputation of unchanged elements of the environment is avoided and selection costs are consequently reduced significantly. A numerical evaluation confirms applicability of this model and proves that it is a more optimal solution compared with common static selection models in this field.

Keywords: Supplier Selection, Context-Awareness, OnlineAlgorithms, Data Clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1818
1712 Visualization of Searching and Sorting Algorithms

Authors: Bremananth R, Radhika.V, Thenmozhi.S

Abstract:

Sequences of execution of algorithms in an interactive manner using multimedia tools are employed in this paper. It helps to realize the concept of fundamentals of algorithms such as searching and sorting method in a simple manner. Visualization gains more attention than theoretical study and it is an easy way of learning process. We propose methods for finding runtime sequence of each algorithm in an interactive way and aims to overcome the drawbacks of the existing character systems. System illustrates each and every step clearly using text and animation. Comparisons of its time complexity have been carried out and results show that our approach provides better perceptive of algorithms.

Keywords: Algorithms, Searching, Sorting, Visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2113
1711 Segmentation of Breast Lesions in Ultrasound Images Using Spatial Fuzzy Clustering and Structure Tensors

Authors: Yan Xu, Toshihiro Nishimura

Abstract:

Segmentation in ultrasound images is challenging due to the interference from speckle noise and fuzziness of boundaries. In this paper, a segmentation scheme using fuzzy c-means (FCM) clustering incorporating both intensity and texture information of images is proposed to extract breast lesions in ultrasound images. Firstly, the nonlinear structure tensor, which can facilitate to refine the edges detected by intensity, is used to extract speckle texture. And then, a spatial FCM clustering is applied on the image feature space for segmentation. In the experiments with simulated and clinical ultrasound images, the spatial FCM clustering with both intensity and texture information gets more accurate results than the conventional FCM or spatial FCM without texture information.

Keywords: fuzzy c-means, spatial information, structure tensor, ultrasound image segmentation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1801
1710 Investigation on Novel Based Metaheuristic Algorithms for Combinatorial Optimization Problems in Ad Hoc Networks

Authors: C. Rajan, N. Shanthi, C. Rasi Priya, K. Geetha

Abstract:

Routing in MANET is extremely challenging because of MANETs dynamic features, its limited bandwidth, frequent topology changes caused by node mobility and power energy consumption. In order to efficiently transmit data to destinations, the applicable routing algorithms must be implemented in mobile ad-hoc networks. Thus we can increase the efficiency of the routing by satisfying the Quality of Service (QoS) parameters by developing routing algorithms for MANETs. The algorithms that are inspired by the principles of natural biological evolution and distributed collective behavior of social colonies have shown excellence in dealing with complex optimization problems and are becoming more popular. This paper presents a survey on few meta-heuristic algorithms and naturally-inspired algorithms.

Keywords: Ant colony optimization, genetic algorithm, Naturally-inspired algorithms and particle swarm optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2702
1709 Self-Organizing Maps in Evolutionary Approachmeant for Dimensioning Routes to the Demand

Authors: J.-C. Créput, A. Koukam, A. Hajjam

Abstract:

We present a non standard Euclidean vehicle routing problem adding a level of clustering, and we revisit the use of self-organizing maps as a tool which naturally handles such problems. We present how they can be used as a main operator into an evolutionary algorithm to address two conflicting objectives of route length and distance from customers to bus stops minimization and to deal with capacity constraints. We apply the approach to a real-life case of combined clustering and vehicle routing for the transportation of the 780 employees of an enterprise. Basing upon a geographic information system we discuss the influence of road infrastructures on the solutions generated.

Keywords: Evolutionary algorithm, self-organizing map, clustering and vehicle routing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1382
1708 An Efficient Clustering Technique for Copy-Paste Attack Detection

Authors: N. Chaitawittanun, M. Munlin

Abstract:

Due to rapid advancement of powerful image processing software, digital images are easy to manipulate and modify by ordinary people. Lots of digital images are edited for a specific purpose and more difficult to distinguish form their original ones. We propose a clustering method to detect a copy-move image forgery of JPEG, BMP, TIFF, and PNG. The process starts with reducing the color of the photos. Then, we use the clustering technique to divide information of measuring data by Hausdorff Distance. The result shows that the purposed methods is capable of inspecting the image file and correctly identify the forgery.

Keywords: Image detection, forgery image, copy-paste.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1319
1707 A Novel Microarray Biclustering Algorithm

Authors: Chieh-Yuan Tsai, Chuang-Cheng Chiu

Abstract:

Biclustering aims at identifying several biclusters that reveal potential local patterns from a microarray matrix. A bicluster is a sub-matrix of the microarray consisting of only a subset of genes co-regulates in a subset of conditions. In this study, we extend the motif of subspace clustering to present a K-biclusters clustering (KBC) algorithm for the microarray biclustering issue. Besides minimizing the dissimilarities between genes and bicluster centers within all biclusters, the objective function of the KBC algorithm additionally takes into account how to minimize the residues within all biclusters based on the mean square residue model. In addition, the objective function also maximizes the entropy of conditions to stimulate more conditions to contribute the identification of biclusters. The KBC algorithm adopts the K-means type clustering process to efficiently make the partition of K biclusters be optimized. A set of experiments on a practical microarray dataset are demonstrated to show the performance of the proposed KBC algorithm.

Keywords: Microarray, Biclustering, Subspace clustering, Meansquare residue model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1615
1706 GCM Based Fuzzy Clustering to Identify Homogeneous Climatic Regions of North-East India

Authors: Arup K. Sarma, Jayshree Hazarika

Abstract:

The North-eastern part of India, which receives heavier rainfall than other parts of the subcontinent, is of great concern now-a-days with regard to climate change. High intensity rainfall for short duration and longer dry spell, occurring due to impact of climate change, affects river morphology too. In the present study, an attempt is made to delineate the North-eastern region of India into some homogeneous clusters based on the Fuzzy Clustering concept and to compare the resulting clusters obtained by using conventional methods and nonconventional methods of clustering. The concept of clustering is adapted in view of the fact that, impact of climate change can be studied in a homogeneous region without much variation, which can be helpful in studies related to water resources planning and management. 10 IMD (Indian Meteorological Department) stations, situated in various regions of the North-east, have been selected for making the clusters. The results of the Fuzzy C-Means (FCM) analysis show different clustering patterns for different conditions. From the analysis and comparison it can be concluded that nonconventional method of using GCM data is somehow giving better results than the others. However, further analysis can be done by taking daily data instead of monthly means to reduce the effect of standardization.

Keywords: Climate change, conventional and nonconventional methods of clustering, FCM analysis, homogeneous regions.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2211
1705 Selective Mutation for Genetic Algorithms

Authors: Sung Hoon Jung

Abstract:

In this paper, we propose a selective mutation method for improving the performances of genetic algorithms. In selective mutation, individuals are first ranked and then additionally mutated one bit in a part of their strings which is selected corresponding to their ranks. This selective mutation helps genetic algorithms to fast approach the global optimum and to quickly escape local optima. This results in increasing the performances of genetic algorithms. We measured the effects of selective mutation with four function optimization problems. It was found from extensive experiments that the selective mutation can significantly enhance the performances of genetic algorithms.

Keywords: Genetic algorithm, selective mutation, function optimization

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1836
1704 A Bibliometric Assessment on Sustainability and Clustering

Authors: Fernanda M. Assef, Maria Teresinha A. Steiner, David Gabriel F. de Barros

Abstract:

Review researches are useful in terms of analysis of research problems. Between the types of review documents, we commonly find bibliometric studies. This type of application often helps the global visualization of a research problem and helps academics worldwide to understand the context of a research area better. In this document, a bibliometric view surrounding clustering techniques and sustainability problems is presented. The authors aimed at which issues mostly use clustering techniques and even which sustainability issue would be more impactful on today’s moment of research. During the bibliometric analysis, we found 10 different groups of research in clustering applications for sustainability issues: Energy; Environmental; Non-urban Planning; Sustainable Development; Sustainable Supply Chain; Transport; Urban Planning; Water; Waste Disposal; and, Others. Moreover, by analyzing the citations of each group, it was discovered that the Environmental group could be classified as the most impactful research cluster in the area mentioned. After the content analysis of each paper classified in the environmental group, it was found that the k-means technique is preferred for solving sustainability problems with clustering methods since it appeared the most amongst the documents. The authors finally conclude that a bibliometric assessment could help indicate a gap of researches on waste disposal – which was the group with the least amount of publications – and the most impactful research on environmental problems.

Keywords: Bibliometric assessment, clustering, sustainability, territorial partitioning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 389
1703 MIBiClus: Mutual Information based Biclustering Algorithm

Authors: Neelima Gupta, Seema Aggarwal

Abstract:

Most of the biclustering/projected clustering algorithms are based either on the Euclidean distance or correlation coefficient which capture only linear relationships. However, in many applications, like gene expression data and word-document data, non linear relationships may exist between the objects. Mutual Information between two variables provides a more general criterion to investigate dependencies amongst variables. In this paper, we improve upon our previous algorithm that uses mutual information for biclustering in terms of computation time and also the type of clusters identified. The algorithm is able to find biclusters with mixed relationships and is faster than the previous one. To the best of our knowledge, none of the other existing algorithms for biclustering have used mutual information as a similarity measure. We present the experimental results on synthetic data as well as on the yeast expression data. Biclusters on the yeast data were found to be biologically and statistically significant using GO Tool Box and FuncAssociate.

Keywords: Biclustering, mutual information.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1631
1702 Model Order Reduction of Discrete-Time Systems Using Fuzzy C-Means Clustering

Authors: Anirudha Narain, Dinesh Chandra, Ravindra K. S.

Abstract:

A computationally simple approach of model order reduction for single input single output (SISO) and linear timeinvariant discrete systems modeled in frequency domain is proposed in this paper. Denominator of the reduced order model is determined using fuzzy C-means clustering while the numerator parameters are found by matching time moments and Markov parameters of high order system.

Keywords: Model Order reduction, Discrete-time system, Fuzzy C-Means Clustering, Padé approximation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2813
1701 Unsupervised Outlier Detection in Streaming Data Using Weighted Clustering

Authors: Yogita, Durga Toshniwal

Abstract:

Outlier detection in streaming data is very challenging because streaming data cannot be scanned multiple times and also new concepts may keep evolving. Irrelevant attributes can be termed as noisy attributes and such attributes further magnify the challenge of working with data streams. In this paper, we propose an unsupervised outlier detection scheme for streaming data. This scheme is based on clustering as clustering is an unsupervised data mining task and it does not require labeled data, both density based and partitioning clustering are combined for outlier detection. In this scheme partitioning clustering is also used to assign weights to attributes depending upon their respective relevance and weights are adaptive. Weighted attributes are helpful to reduce or remove the effect of noisy attributes. Keeping in view the challenges of streaming data, the proposed scheme is incremental and adaptive to concept evolution. Experimental results on synthetic and real world data sets show that our proposed approach outperforms other existing approach (CORM) in terms of outlier detection rate, false alarm rate, and increasing percentages of outliers.

Keywords: Concept Evolution, Irrelevant Attributes, Streaming Data, Unsupervised Outlier Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2637
1700 Selection Initial modes for Belief K-modes Method

Authors: Sarra Ben Hariz, Zied Elouedi, Khaled Mellouli

Abstract:

The belief K-modes method (BKM) approach is a new clustering technique handling uncertainty in the attribute values of objects in both the cluster construction task and the classification one. Like the standard version of this method, the BKM results depend on the chosen initial modes. So, one selection method of initial modes is developed, in this paper, aiming at improving the performances of the BKM approach. Experiments with several sets of real data show that by considered the developed selection initial modes method, the clustering algorithm produces more accurate results.

Keywords: Clustering, Uncertainty, Belief function theory, Belief K-modes Method, Initial modes selection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1813
1699 An Exploratory Environment for Concurrency Control Algorithms

Authors: Jinhua Guo

Abstract:

Designing, implementing, and debugging concurrency control algorithms in a real system is a complex, tedious, and errorprone process. Further, understanding concurrency control algorithms and distributed computations is itself a difficult task. Visualization can help with both of these problems. Thus, we have developed an exploratory environment in which people can prototype and test various versions of concurrency control algorithms, study and debug distributed computations, and view performance statistics of distributed systems. In this paper, we describe the exploratory environment and show how it can be used to explore concurrency control algorithms for the interactive steering of distributed computations.

Keywords: Consistency, Distributed Computing, InteractiveSteering, Simulation, Visualization

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1817
1698 A Cuckoo Search with Differential Evolution for Clustering Microarray Gene Expression Data

Authors: M. Pandi, K. Premalatha

Abstract:

A DNA microarray technology is a collection of microscopic DNA spots attached to a solid surface. Scientists use DNA microarrays to measure the expression levels of large numbers of genes simultaneously or to genotype multiple regions of a genome. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. It is handled by clustering which reveals the natural structures and identifying the interesting patterns in the underlying data. In this paper, gene based clustering in gene expression data is proposed using Cuckoo Search with Differential Evolution (CS-DE). The experiment results are analyzed with gene expression benchmark datasets. The results show that CS-DE outperforms CS in benchmark datasets. To find the validation of the clustering results, this work is tested with one internal and one external cluster validation indexes.

Keywords: DNA, Microarray, genomics, Cuckoo Search, Differential Evolution, Gene expression data, Clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1483
1697 Identifying Autism Spectrum Disorder Using Optimization-Based Clustering

Authors: Sharifah Mousli, Sona Taheri, Jiayuan He

Abstract:

Autism spectrum disorder (ASD) is a complex developmental condition involving persistent difficulties with social communication, restricted interests, and repetitive behavior. The challenges associated with ASD can interfere with an affected individual’s ability to function in social, academic, and employment settings. Although there is no effective medication known to treat ASD, to our best knowledge, early intervention can significantly improve an affected individual’s overall development. Hence, an accurate diagnosis of ASD at an early phase is essential. The use of machine learning approaches improves and speeds up the diagnosis of ASD. In this paper, we focus on the application of unsupervised clustering methods in ASD, as a large volume of ASD data generated through hospitals, therapy centers, and mobile applications has no pre-existing labels. We conduct a comparative analysis using seven clustering approaches, such as K-means, agglomerative hierarchical, model-based, fuzzy-C-means, affinity propagation, self organizing maps, linear vector quantisation – as well as the recently developed optimization-based clustering (COMSEP-Clust) approach. We evaluate the performances of the clustering methods extensively on real-world ASD datasets encompassing different age groups: toddlers, children, adolescents, and adults. Our experimental results suggest that the COMSEP-Clust approach outperforms the other seven methods in recognizing ASD with well-separated clusters.

Keywords: Autism spectrum disorder, clustering, optimization, unsupervised machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 415
1696 Grocery Customer Behavior Analysis using RFID-based Shopping Paths Data

Authors: In-Chul Jung, Young S. Kwon

Abstract:

Knowing about the customer behavior in a grocery has been a long-standing issue in the retailing industry. The advent of RFID has made it easier to collect moving data for an individual shopper's behavior. Most of the previous studies used the traditional statistical clustering technique to find the major characteristics of customer behavior, especially shopping path. However, in using the clustering technique, due to various spatial constraints in the store, standard clustering methods are not feasible because moving data such as the shopping path should be adjusted in advance of the analysis, which is time-consuming and causes data distortion. To alleviate this problem, we propose a new approach to spatial pattern clustering based on the longest common subsequence. Experimental results using real data obtained from a grocery confirm the good performance of the proposed method in finding the hot spot, dead spot and major path patterns of customer movements.

Keywords: customer path, shopping behavior, exploratoryanalysis, LCS, RFID

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3148
1695 Using Suffix Tree Document Representation in Hierarchical Agglomerative Clustering

Authors: Daniel I. Morariu, Radu G. Cretulescu, Lucian N. Vintan

Abstract:

In text categorization problem the most used method for documents representation is based on words frequency vectors called VSM (Vector Space Model). This representation is based only on words from documents and in this case loses any “word context" information found in the document. In this article we make a comparison between the classical method of document representation and a method called Suffix Tree Document Model (STDM) that is based on representing documents in the Suffix Tree format. For the STDM model we proposed a new approach for documents representation and a new formula for computing the similarity between two documents. Thus we propose to build the suffix tree only for any two documents at a time. This approach is faster, it has lower memory consumption and use entire document representation without using methods for disposing nodes. Also for this method is proposed a formula for computing the similarity between documents, which improves substantially the clustering quality. This representation method was validated using HAC - Hierarchical Agglomerative Clustering. In this context we experiment also the stemming influence in the document preprocessing step and highlight the difference between similarity or dissimilarity measures to find “closer" documents.

Keywords: Text Clustering, Suffix tree documentrepresentation, Hierarchical Agglomerative Clustering

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1910
1694 Granulation using Clustering and Rough Set Theory and its Tree Representation

Authors: Girish Kumar Singh, Sonajharia Minz

Abstract:

Granular computing deals with representation of information in the form of some aggregates and related methods for transformation and analysis for problem solving. A granulation scheme based on clustering and Rough Set Theory is presented with focus on structured conceptualization of information has been presented in this paper. Experiments for the proposed method on four labeled data exhibit good result with reference to classification problem. The proposed granulation technique is semi-supervised imbibing global as well as local information granulation. To represent the results of the attribute oriented granulation a tree structure is proposed in this paper.

Keywords: Granular computing, clustering, Rough sets, datamining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1719
1693 Clustering Approach to Unveiling Relationships between Gene Regulatory Networks

Authors: Hiba Hasan, Khalid Raza

Abstract:

Reverse engineering of genetic regulatory network involves the modeling of the given gene expression data into a form of the network. Computationally it is possible to have the relationships between genes, so called gene regulatory networks (GRNs), that can help to find the genomics and proteomics based diagnostic approach for any disease. In this paper, clustering based method has been used to reconstruct genetic regulatory network from time series gene expression data. Supercoiled data set from Escherichia coli has been taken to demonstrate the proposed method.

Keywords: Gene expression, gene regulatory networks (GRNs), clustering, data preprocessing, network visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2152
1692 Genetic Algorithms with Oracle for the Traveling Salesman Problem

Authors: Robin Gremlich, Andreas Hamfelt, Héctor de Pereda, Vladislav Valkovsky

Abstract:

By introducing the concept of Oracle we propose an approach for improving the performance of genetic algorithms for large-scale asymmetric Traveling Salesman Problems. The results have shown that the proposed approach allows overcoming some traditional problems for creating efficient genetic algorithms.

Keywords: Genetic algorithms, Traveling Salesman Problem, optimal decision distribution, oracle.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1722