Search results for: hybrid hierarchical clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2803

2743 3D Mesh Coarsening via Uniform Clustering

Authors: Shuhua Lai, Kairui Chen

Abstract:

In this paper, we present a fast and efficient mesh coarsening algorithm for 3D triangular meshes. This approach can be applied to very complex 3D meshes of arbitrary topology with millions of vertices. The algorithm is based on clustering the input mesh elements: it divides the faces of an input mesh into a given number of clusters by approximating the Centroidal Voronoi Tessellation of the input mesh. Once a clustering is achieved, it provides an efficient way to construct uniform tessellations and therefore leads to good coarsening of polygonal meshes. With the proliferation of 3D scanners, this coarsening algorithm is particularly useful for reverse engineering applications of 3D models, which in many cases are dense, non-uniform, irregular, and of arbitrary topology. Examples demonstrating the effectiveness of the new algorithm are also included in the paper.

Keywords: coarsening, mesh clustering, shape approximation, mesh simplification

Procedia PDF Downloads 380
2742 The Properties of Na2CO3 and Ti Hybrid Modified LM 6 Alloy Using Ladle Metallurgy

Authors: M. N. Ervina Efzan, H. J. Kong, C. K. Kok

Abstract:

The present work studies the influence of a hybrid modifier on LM 6 added through ladle metallurgy. In this study, LM 6 served as the reference alloy, while Na2CO3 and Ti powders were used as the hybrid modifier. The effects of the hybrid modifier on the microstructural enhancement of LM 6 were investigated using an optical microscope (OM) and a scanning electron microscope (SEM). The results showed fragmented Si-rich needles and strength-enhanced petal/globular-like structures, without obvious formation of soft primary α-Al and β-Fe-rich intermetallic compound (IMC), after the hybrid modification. A hardness test was conducted to examine the mechanical improvement of the hybrid modified LM 6. A 10% hardness improvement was recorded in the hybrid modified LM 6 produced through ladle metallurgy.

Keywords: Al-Si, hybrid modifier, ladle metallurgy, hardness

Procedia PDF Downloads 395
2741 Identification of Watershed Landscape Character Types in Middle Yangtze River within Wuhan Metropolitan Area

Authors: Huijie Wang, Bin Zhang

Abstract:

In China, the middle reaches of the Yangtze River are well-developed, boasting a wealth of different types of watershed landscape. In this regard, landscape character assessment (LCA) can serve as a basis for the protection, management and planning of trans-regional watershed landscape types. For this study, we chose the middle reaches of the Yangtze River in the Wuhan metropolitan area as our study site, where the water system comprises a rich variety of landscape types. We analyzed trans-regional data to cluster and identify types of landscape characteristics at two levels. 55 basins were analyzed, with topography, land cover and river system features as variables, in order to identify the watershed landscape character types. For watershed landscape, drainage density and degree of curvature were specified as special variables to directly reflect the regional differences in river system features. Then, we used principal component analysis (PCA) and a hierarchical clustering algorithm, based on the geographic information system (GIS) and statistical products and services solution (SPSS), to obtain clusters of watershed landscape, which were divided into 8 characteristic groups. These groups highlighted the watershed landscape characteristics of different river systems as well as key landscape characteristics that can serve as a basis for targeted protection of watershed landscape characteristics, thus helping to rationally develop multi-value landscape resources and promote coordinated trans-regional development.

Keywords: GIS, hierarchical clustering, landscape character, landscape typology, principal component analysis, watershed

Procedia PDF Downloads 228
2740 Multimodal Optimization of Density-Based Clustering Using Collective Animal Behavior Algorithm

Authors: Kristian Bautista, Ruben A. Idoy

Abstract:

A bio-inspired metaheuristic algorithm based on the theory of collective animal behavior (CAB) was integrated into density-based clustering modeled as a multimodal optimization problem. The algorithm was tested on synthetic, Iris, Glass, Pima and Thyroid data sets in order to measure its effectiveness relative to the CDE-based clustering algorithm. Preliminary testing found that one of the parameter settings used was ineffective for clustering when applied to the algorithm, prompting the researcher to investigate. The investigation revealed that fine-tuning the distance δ3, which determines the extent to which a given data point will be clustered, helped improve the quality of the cluster output. Even though the modification of distance δ3 significantly improved the solution quality and cluster output of the algorithm, the results suggest that there is no difference between the population means of the solutions obtained using the original and modified parameter settings for all data sets. This implies that using either the original or the modified parameter setting will not have any effect on obtaining the best global and local animal positions. The results also suggest that the CDE-based clustering algorithm is better than the CAB-density clustering algorithm for all data sets. Nevertheless, the CAB-density clustering algorithm is still a good clustering algorithm because it correctly identified the number of classes of some data sets more frequently in a thirty-trial run with a much smaller standard deviation, indicating potential for clustering high-dimensional data sets. Thus, the researcher recommends further investigation of the post-processing stage of the algorithm.

Keywords: clustering, metaheuristics, collective animal behavior algorithm, density-based clustering, multimodal optimization

Procedia PDF Downloads 230
2739 Digital Geography and Geographic Information System in Schools: Towards a Hierarchical Geospatial Approach

Authors: Mary Fargher

Abstract:

This paper examines the opportunities for using a more hierarchical approach to geospatial enquiry when using GIS in school geography. A case is made that it is not just a lack of technological knowledge that stops some teachers from using GIS in the classroom, but also a gap in their understanding of how to link GIS use more specifically to the pedagogy of teaching geography with GIS. Using a hierarchical approach to geospatial enquiry as a theoretical framework, the analysis shows clearly how concepts of spatial distribution, interaction, relation, comparison, and temporal relationships can be used by teachers more explicitly to capitalise on the analytical power of GIS and to construct what can be interpreted as powerful geographical knowledge. An exemplar illustrating this approach on the topic of geo-hazards is then presented for critical analysis and discussion. Recommendations are then made for a model of progression for geography teacher education with GIS through hierarchical geospatial enquiry that takes into account beginner, intermediate, and more advanced users.

Keywords: digital geography, GIS, education, hierarchical geospatial enquiry, powerful geographical knowledge

Procedia PDF Downloads 152
2738 A Hybrid Fuzzy Clustering Approach for Fertile and Unfertile Analysis

Authors: Shima Soltanzadeh, Mohammad Hosain Fazel Zarandi, Mojtaba Barzegar Astanjin

Abstract:

Diagnosing male infertility with laboratory tests is expensive and is sometimes intolerable for patients. Filling out a questionnaire and then applying a classification method can be the first step in the decision-making process, so that laboratory tests are used only in cases with a high probability of infertility. In this paper, we evaluated the performance of four classification methods, namely naive Bayesian, neural network, logistic regression and fuzzy c-means clustering used as a classifier, in the diagnosis of male infertility due to environmental factors. Since the data are unbalanced, ROC curves are the most suitable method for the comparison. We also selected the more important features using a filtering method and examined the impact of this feature reduction on the performance of each method; in general, most of the methods performed better after applying the filter. We have shown that fuzzy c-means clustering used as a classifier performs well according to the ROC curves and that its performance is comparable to that of other classification methods such as logistic regression.

Keywords: classification, fuzzy c-means, logistic regression, Naive Bayesian, neural network, ROC curve

Procedia PDF Downloads 337
2737 Filtering Intrusion Detection Alarms Using Ant Clustering Approach

Authors: Ghodhbani Salah, Jemili Farah

Abstract:

With the growth of cyber attacks, information safety has become an important issue all over the world. Many firms rely on security technologies such as intrusion detection systems (IDSs) to manage information technology security risks. IDSs are considered to be the last line of defense to secure a network and play a very important role in detecting a large number of attacks. However, the main problem with today’s most popular commercial IDSs is that they generate a high volume of alerts and a huge number of false positives. This drawback has become the main motivation for many research papers in the IDS area. Hence, in this paper we present a data mining technique to assist network administrators in analyzing and reducing the false positive alarms produced by an IDS and in increasing detection accuracy. Our data mining technique is an unsupervised clustering method based on a hybrid ANT algorithm. This algorithm discovers clusters of intruders’ behavior without prior knowledge of the possible number of classes; we then apply the K-means algorithm to improve the convergence of the ANT clustering. Experimental results on a real dataset show that our proposed approach is efficient, with a high detection rate and a low false alarm rate.

Keywords: intrusion detection system, alarm filtering, ANT class, ant clustering, intruders’ behaviors, false alarms

Procedia PDF Downloads 403
2736 An Analysis on Clustering Based Gene Selection and Classification for Gene Expression Data

Authors: K. Sathishkumar, V. Thiagarasu

Abstract:

Due to recent advances in DNA microarray technology, it is now feasible to obtain gene expression profiles of tissue samples at relatively low cost. Many scientists around the world take advantage of this gene profiling to characterize complex biological circumstances and diseases. Microarray techniques that are used in genome-wide gene expression and genome mutation analysis help scientists and physicians in understanding pathophysiological mechanisms, in diagnoses and prognoses, and in choosing treatment plans. DNA microarray technology has now made it possible to simultaneously monitor the expression levels of thousands of genes during important biological processes and across collections of related samples. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. A first step toward addressing this challenge is the use of clustering techniques, which are essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. This work presents an analysis of several clustering algorithms proposed to deal with gene expression data effectively. Existing algorithms such as the Support Vector Machine (SVM), the K-means algorithm and evolutionary algorithms are analyzed thoroughly to identify their advantages and limitations. The performance of the existing algorithms is evaluated to determine the best approach. In order to improve the classification performance of the best approach in terms of accuracy, convergence behavior and processing time, a hybrid clustering-based optimization approach is proposed.

Keywords: microarray technology, gene expression data, clustering, gene Selection

Procedia PDF Downloads 323
2735 Anomaly Detection Based Fuzzy K-Mode Clustering for Categorical Data

Authors: Murat Yazici

Abstract:

Anomalies are irregularities found in data that do not adhere to a well-defined standard of normal behavior. The identification of outliers or anomalies in data has been a subject of study within the statistics field since the 1800s. Over time, a variety of anomaly detection techniques have been developed in several research communities. Cluster analysis can be used to detect anomalies. It is the process of assigning data to clusters so that the members of a cluster are as similar as possible while dissimilar data fall into different clusters. Many traditional clustering algorithms have limitations in dealing with data sets containing categorical properties. To detect anomalies in categorical data, a fuzzy clustering approach can be used to advantage. The fuzzy k-Mode (FKM) clustering algorithm, one of the fuzzy clustering approaches and an extension of the k-means algorithm, is reported for clustering datasets with categorical values. It is a form of clustering in which each point can be associated with more than one cluster. In this paper, anomaly detection is performed on two simulated data sets by using the FKM clustering algorithm. The significance of the study is that, in contrast to numerous anomaly detection algorithms, the FKM clustering algorithm allows anomalies to be determined together with their degree of abnormality. According to the results, the FKM clustering algorithm showed good performance in anomaly detection on data containing one anomaly as well as on data containing more than one anomaly.

Keywords: fuzzy k-mode clustering, anomaly detection, noise, categorical data

Procedia PDF Downloads 53
2734 Hybrid Algorithm for Frequency Channel Selection in Wi-Fi Networks

Authors: Cesar Hernández, Diego Giral, Ingrid Páez

Abstract:

This article proposes a hybrid algorithm for spectrum allocation in cognitive radio networks, based on the Analytical Hierarchical Process (AHP) and the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), to improve the performance of the spectrum mobility of secondary users in cognitive radio networks. To assess the performance of the proposed algorithm, a comparative analysis between the proposed AHP-TOPSIS algorithm, the Grey Relational Analysis (GRA) algorithm and the Multiplicative Exponent Weighting (MEW) algorithm is performed. Four evaluation metrics are used: the accumulative average of failed handoffs, the accumulative average of handoffs performed, the accumulative average of transmission bandwidth, and the accumulative average of the transmission delay. The results of the comparison show that the AHP-TOPSIS algorithm provides 2.4 times better performance than the GRA algorithm and 1.5 times better performance than the MEW algorithm.
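The TOPSIS half of the hybrid named above can be illustrated with a minimal sketch of plain TOPSIS ranking. The channel values, the weights, and the choice of bandwidth as a benefit criterion and delay as a cost criterion are all assumptions made for illustration; the abstract does not state how the authors derive their weights or which criteria they use.

```python
import math

def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] for each alternative (row) of matrix."""
    # Vector-normalize each criterion column, then apply the criterion weight.
    norms = [math.sqrt(sum(v * v for v in col)) for col in zip(*matrix)]
    weighted = [[w * v / n for v, w, n in zip(row, weights, norms)] for row in matrix]
    cols = list(zip(*weighted))
    # Ideal point: best value per criterion; anti-ideal point: worst value.
    ideal = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    worst = [min(c) if b else max(c) for c, b in zip(cols, benefit)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, ideal)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Hypothetical channels: [bandwidth (higher is better), delay (lower is better)].
channels = [[20.0, 5.0], [10.0, 10.0], [15.0, 8.0]]
scores = topsis(channels, weights=[0.6, 0.4], benefit=[True, False])
best_channel = scores.index(max(scores))
```

The first channel is best on both criteria and so coincides with the ideal point, which is why it receives the highest score.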

Keywords: cognitive radio, decision making, hybrid algorithm, spectrum handoff, wireless networks

Procedia PDF Downloads 541
2733 Self-Supervised Attributed Graph Clustering with Dual Contrastive Loss Constraints

Authors: Lijuan Zhou, Mengqi Wu, Changyong Niu

Abstract:

Attributed graph clustering can utilize the graph topology and node attributes to uncover hidden community structures and patterns in complex networks, aiding in the understanding and analysis of complex systems. Utilizing contrastive learning for attributed graph clustering can effectively exploit meaningful implicit relationships between data. However, existing attributed graph clustering methods based on contrastive learning suffer from the following drawbacks: 1) Complex data augmentation increases computational cost, and inappropriate data augmentation may lead to semantic drift. 2) The selection of positive and negative samples neglects the intrinsic cluster structure learned from graph topology and node attributes. Therefore, this paper proposes a method called self-supervised Attributed Graph Clustering with Dual Contrastive Loss constraints (AGC-DCL). Firstly, Siamese Multilayer Perceptron (MLP) encoders are employed to generate two views separately to avoid complex data augmentation. Secondly, the neighborhood contrastive loss is introduced to constrain node representation using local topological structure while effectively embedding attribute information through attribute reconstruction. Additionally, clustering-oriented contrastive loss is applied to fully utilize clustering information in global semantics for discriminative node representations, regarding the cluster centers from two views as negative samples to fully leverage effective clustering information from different views. Comparative clustering results with existing attributed graph clustering algorithms on six datasets demonstrate the superiority of the proposed method.

Keywords: attributed graph clustering, contrastive learning, clustering-oriented, self-supervised learning

Procedia PDF Downloads 53
2732 Integrating Molecular Approaches to Understand Diatom Assemblages in Marine Environment

Authors: Shruti Malviya, Chris Bowler

Abstract:

Environmental processes acting at multiple spatial scales control marine diatom community structure. However, the contribution of local factors (e.g., temperature, salinity, etc.) in these highly complex systems is poorly understood. We therefore investigated diatom community organization as a function of environmental predictors and determined the relative contribution of various environmental factors to the structure of marine diatom assemblages in the world’s ocean. The dataset for this study was derived from the Tara Oceans expedition, constituting 46 sampling stations from diverse oceanic provinces. The V9 hypervariable region of 18S rDNA was organized into assemblages based on their distributional co-occurrence. Using Ward’s hierarchical clustering, nine clusters were defined. The number of ribotypes and reads varied within each cluster: three clusters (II, VIII and IX) contained only a few reads, whereas two of them (I and IV) were highly abundant. Of the nine clusters, seven can be divided into two categories, one defined by a positive correlation with phosphate and nitrate and a negative correlation with longitude, and the other by a negative correlation with salinity, temperature and latitude and a positive correlation with the Lyapunov exponent. All the clusters were found to be remarkably dominant in the South Pacific Ocean and can be placed into three classes, namely Southern Ocean-South Pacific Ocean clusters (I, II, V, VIII, IX), South Pacific Ocean clusters (IV and VII), and cosmopolitan clusters (III and VI). Our findings showed that co-occurring ribotypes can be significantly associated into recognizable clusters which exhibit a distinct response to environmental variables. This study thus demonstrated the distinct behavior of each recognized assemblage, displaying a taxonomic and environmental signature.

Keywords: assemblage, diatoms, hierarchical clustering, Tara Oceans

Procedia PDF Downloads 202
2731 Decision Trees Constructing Based on K-Means Clustering Algorithm

Authors: Loai Abdallah, Malik Yousef

Abstract:

A domain space for the data should reflect the actual similarity between objects, since objects belonging to the same cluster usually share some common traits even though their geometric distance might be relatively large. In general, the Euclidean distance between data points that are represented by a large number of features does not capture the actual relation between those points. In this study, we propose a new method to construct a different space that is based on clustering to form a new distance metric. The new distance space is based on ensemble clustering (EC). The EC distance space is defined by tracking the membership of the points over multiple runs of a clustering algorithm. Over this distance, we train the decision tree classifier (DT-EC). The results obtained by applying DT-EC to 10 datasets confirm our hypothesis that embedding the EC space as a distance metric would improve the performance.
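One plausible reading of a distance built from cluster membership over multiple runs is a co-association style measure: the fraction of runs in which two points land in different clusters. The sketch below assumes that definition and uses made-up cluster labels; the abstract does not give the authors' exact formula, and `ec_distance` is a hypothetical name.

```python
def ec_distance(labelings, i, j):
    """Fraction of clustering runs in which points i and j were put in different clusters."""
    disagreements = sum(1 for labels in labelings if labels[i] != labels[j])
    return disagreements / len(labelings)

# Hypothetical cluster labels for 4 points over 4 clustering runs (one list per run).
labelings = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
]

d01 = ec_distance(labelings, 0, 1)  # points 0 and 1 are separated in one run only
d03 = ec_distance(labelings, 0, 3)  # points 0 and 3 are separated in every run
d12 = ec_distance(labelings, 1, 2)  # points 1 and 2 are separated in half of the runs
```

Because only label agreement matters, the measure is unaffected by the arbitrary numbering of clusters in each run.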

Keywords: ensemble clustering, decision trees, classification, K nearest neighbors

Procedia PDF Downloads 190
2730 K-Means Clustering-Based Infinite Feature Selection Method

Authors: Seyyedeh Faezeh Hassani Ziabari, Sadegh Eskandari, Maziar Salahi

Abstract:

The Infinite Feature Selection (IFS) algorithm is an efficient feature selection algorithm that selects a subset of features of all sizes (including infinity). In this paper, we present an improved version of it, called clustering IFS (CIFS), which clusters the dataset in advance. To do so, we first apply the K-means algorithm to cluster the dataset and then apply IFS. In the CIFS method, the space and time complexities are reduced compared to the IFS method. Experimental results on 6 datasets show the superiority of CIFS over IFS in terms of accuracy, running time, and memory consumption.

Keywords: feature selection, infinite feature selection, clustering, graph

Procedia PDF Downloads 128
2729 Embedded Hybrid Intuition: A Deep Learning and Fuzzy Logic Approach to Collective Creation and Computational Assisted Narratives

Authors: Roberto Cabezas H

Abstract:

The current work shows the methodology developed to create narrative lighting spaces for the multimedia performance piece 'cluster: the vanished paradise.' This empirical research is focused on exploring unconventional roles for machines in subjective creative processes, by delving into the semantics of data and machine intelligence algorithms in hybrid technological, creative contexts to expand epistemic domains through human-machine cooperation. The creative process in scenic and performing arts is guided mostly by intuition; from that idea, we developed an approach to embed collective intuition in computational creative systems, by joining the properties of Generative Adversarial Networks (GANs) and fuzzy clustering based on a semi-supervised data creation and analysis pipeline. The model makes use of GANs to learn from phenomenological data (data generated from experience with lighting scenography) and algorithmic design data (data augmented by procedural design methods). Fuzzy logic clustering is then applied to the data artificially created by the GANs to define narrative transitions built on a membership index. This process allowed for the creation of simple and complex spaces with expressive capabilities, based on position and light intensity as the parameters to guide the narrative. Hybridization comes not only from the human-machine symbiosis but also from the integration of different techniques for the implementation of the aided design system. Machine intelligence tools as proposed in this work are well suited to redefine collaborative creation by learning to express and expand a conglomerate of ideas and a wide range of opinions for the creation of sensory experiences. We found in GANs and fuzzy logic an ideal tool to develop new computational models based on interaction, learning, emotion and imagination to expand the traditional algorithmic model of computation.

Keywords: fuzzy clustering, generative adversarial networks, human-machine cooperation, hybrid collective data, multimedia performance

Procedia PDF Downloads 142
2728 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Billel Kenidra, Mohamed Benmohammed

Abstract:

Data mining techniques used in the field of clustering are a subject of active research and assist in biological pattern recognition and the extraction of new knowledge from raw data. Clustering means partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar to one another and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters, leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers, and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy for initializing the K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results show that efficiency is significantly improved.

Keywords: microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization

Procedia PDF Downloads 190
2727 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well-known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can contain either categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: the k-means for clustering numeric datasets and the k-modes for categorical datasets. The main problem encountered in data mining applications is clustering categorical data, which are so common in datasets. One way to carry out the clustering process on categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead of the k-modes. In this paper, we experiment with an approach based on this idea, transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previous method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are tested. The obtained results show that our proposed method outperforms the binary method in all cases.
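The relative-frequency transformation described above can be sketched as follows. This is a minimal illustration on made-up rows, assuming each categorical value is simply replaced by its share of the rows in its own attribute; the function name is hypothetical and the authors' exact procedure may differ.

```python
from collections import Counter

def relative_frequency_encode(rows):
    """Replace every categorical value by its relative frequency within its column."""
    n = len(rows)
    # One frequency table per attribute (column).
    freqs = [Counter(col) for col in zip(*rows)]
    return [[freqs[j][value] / n for j, value in enumerate(row)] for row in rows]

# Hypothetical categorical dataset with two attributes: colour and size.
rows = [
    ("red", "small"),
    ("red", "large"),
    ("blue", "small"),
    ("red", "small"),
]
encoded = relative_frequency_encode(rows)
```

The resulting numeric rows can then be passed to any k-means implementation. Note that two different modalities with the same frequency map to the same number, which is a known limitation of this kind of encoding.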

Keywords: clustering, unsupervised learning, pattern recognition, categorical datasets, knowledge discovery, k-means

Procedia PDF Downloads 259
2726 Cluster Analysis of Retailers’ Benefits from Their Cooperation with Manufacturers: Business Models Perspective

Authors: M. K. Witek-Hajduk, T. M. Napiórkowski

Abstract:

A number of studies have discussed the benefits of retailer-manufacturer cooperation and coopetition. However, only a few publications focus on the benefits of cooperation and coopetition between retailers and their suppliers of durable consumer goods, especially in the context of the business models of the cooperating partners. This paper aims to provide a clustering approach to segment retailers selling consumer durables according to the benefits they obtain from their cooperation with key manufacturers and to differentiate these retailers in terms of the business models of the cooperating partners. For the purpose of the study, a survey (using the CATI method) collected data on 603 consumer durables retailers present on the Polish market. Retailers are clustered with both hierarchical and non-hierarchical methods. Five distinctive groups of consumer durables retailers are identified (based on the studied benefits) using the two-stage clustering approach. The clusters are then characterized with a set of exogenous variables, key among which are the business models employed by the retailer and its partnering key manufacturer. The paper finds that a combination of a medium-sized retailer classified as an Integrator with chiefly domestic capital and a manufacturer categorized as a Market Player yields the highest benefits. On the other side of the spectrum is the medium-sized Distributor retailer with solely domestic capital; in this case, the business model of the cooperating manufacturer appears to be irrelevant. This paper is one of the first empirical studies using cluster analysis on primary data to define the types of cooperation between consumer durables retailers and manufacturers, their key suppliers. The analysis integrates the perspective of both retailers’ and manufacturers’ business models and matches them with individual and joint benefits.

Keywords: benefits of cooperation, business model, cluster analysis, retailer-manufacturer cooperation

Procedia PDF Downloads 256
2725 Generalization of Clustering Coefficient on Lattice Networks Applied to Criminal Networks

Authors: Christian H. Sanabria-Montaña, Rodrigo Huerta-Quintanilla

Abstract:

A lattice network is a special type of network in which all nodes have the same number of links and whose boundary conditions are periodic. The most basic lattice network is the ring, a one-dimensional network with periodic boundary conditions. More generally, the Cartesian product of d rings forms a d-dimensional lattice network. An analytical expression currently exists for the clustering coefficient in this type of network, but the theoretical value is valid only up to a certain connectivity value; in other words, the analytical expression is incomplete. Here we obtain analytically the clustering coefficient expression in d-dimensional lattice networks for any link density. Our analytical results show that the clustering coefficient of a lattice network whose link density tends to 1 approaches the clustering coefficient of a fully connected network. We developed a model in criminology in which the generalized clustering coefficient expression is applied. The model states that delinquents learn the know-how of the crime business by sharing knowledge, directly or indirectly, with their friends in the gang. This generalization sheds light on the network properties, which is important for developing new models in different fields where network structure plays an important role in the system dynamics, such as criminology, evolutionary game theory and econophysics, among others.
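The quantity discussed above can be checked numerically on small rings. The sketch below is not the authors' analytical expression: it builds a one-dimensional ring lattice and counts links among neighbors directly, using the usual local definition of the clustering coefficient. The parameter `k` (neighbors linked on each side) and the function names are assumptions made for illustration.

```python
from itertools import combinations

def ring_lattice(n, k):
    """Adjacency sets of a ring of n nodes, each linked to its k nearest neighbors on each side."""
    return {i: {(i + s) % n for s in range(-k, k + 1) if s != 0} for i in range(n)}

def clustering_coefficient(adj):
    """Average over nodes of (links among neighbors) / (possible links among neighbors)."""
    total = 0.0
    for nbrs in adj.values():
        deg = len(nbrs)
        if deg < 2:
            continue  # a node with fewer than two neighbors contributes zero
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += links / (deg * (deg - 1) / 2)
    return total / len(adj)

c_sparse = clustering_coefficient(ring_lattice(20, 2))  # 4 links per node out of 19 possible
c_full = clustering_coefficient(ring_lattice(5, 2))     # every node linked to every other node
```

In the second case the link density is 1, so the ring coincides with a fully connected network and the coefficient reaches 1, in line with the limit described in the abstract.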

Keywords: clustering coefficient, criminology, generalized, regular network d-dimensional

Procedia PDF Downloads 411
2724 Hierarchical Piecewise Linear Representation of Time Series Data

Authors: Vineetha Bettaiah, Heggere S. Ranganath

Abstract:

This paper presents a Hierarchical Piecewise Linear Approximation (HPLA) for the representation of time series data in which the time series is treated as a curve in the time-amplitude image space. The curve is partitioned into segments by choosing perceptually important points as break points. Each segment between adjacent break points is recursively partitioned into two segments at the best point or midpoint until the error between the approximating line and the original curve becomes less than a pre-specified threshold. The HPLA representation achieves dimensionality reduction while preserving prominent local features and the general shape of the time series. The representation permits coarse-to-fine processing at different levels of detail, allows flexible definition of similarity based on mathematical measures or general time series shape, and supports time series data mining operations including query by content, clustering and classification based on whole or subsequence similarity.
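The recursive partitioning step described above can be sketched as follows. This is a minimal illustration, not the authors' HPLA: it assumes the "best point" is the sample with the largest vertical deviation from the straight line joining the segment ends, it skips the initial selection of perceptually important points, and it stops when the error is less than or equal to the threshold so that a threshold of zero also terminates. All names are hypothetical.

```python
def max_deviation(series, lo, hi):
    """Return (error, index) of the largest vertical gap between the series and the chord lo..hi."""
    worst_err, worst_idx = 0.0, lo
    for i in range(lo + 1, hi):
        t = (i - lo) / (hi - lo)
        chord = series[lo] + t * (series[hi] - series[lo])
        err = abs(series[i] - chord)
        if err > worst_err:
            worst_err, worst_idx = err, i
    return worst_err, worst_idx

def split_segment(series, lo, hi, threshold):
    """Recursively split [lo, hi]; return the sorted break-point indices, both ends included."""
    err, idx = max_deviation(series, lo, hi)
    if err <= threshold:
        return [lo, hi]  # the straight line is already close enough
    left = split_segment(series, lo, idx, threshold)
    right = split_segment(series, idx, hi, threshold)
    return left + right[1:]  # drop the duplicated shared break point

# Hypothetical triangular time series: rises to a peak at index 3, then falls.
series = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
breaks = split_segment(series, 0, len(series) - 1, threshold=0.5)
```

Seven samples are reduced to three break points here, and keeping the recursion tree instead of the flat list would give the coarse-to-fine levels the abstract mentions.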

Keywords: data mining, dimensionality reduction, piecewise linear representation, time series representation

Procedia PDF Downloads 275
2723 A Relative Entropy Regularization Approach for Fuzzy C-Means Clustering Problem

Authors: Ouafa Amira, Jiangshe Zhang

Abstract:

Clustering is an unsupervised machine learning technique whose aim is to extract the structure of the data: similar data objects are grouped in the same cluster, whereas dissimilar objects are grouped in different clusters. Clustering methods are widely used in fields such as image processing, computer vision, and pattern recognition. Fuzzy c-means clustering (FCM) is one of the best-known fuzzy clustering methods. It is based on solving an optimization problem in which a given cost function is minimized. This minimization aims to decrease the dissimilarity inside clusters, where the dissimilarity is measured by the distances between data objects and cluster centers. The degree to which a data point belongs to a cluster is measured by a membership function that takes values in the interval [0, 1]. In FCM clustering, the membership degree is constrained by the condition that the sum of a data object's memberships in all clusters must equal one. This constraint can cause several problems, especially when the data objects lie in a noisy space. Regularization approaches have been applied to the fuzzy c-means clustering technique; regularization introduces additional information in order to solve an ill-posed optimization problem. In this study, we focus on regularization by relative entropy, where the optimization problem aims to minimize the dissimilarity inside clusters. Our objective is to find an appropriate membership degree for each data object, because an appropriate membership degree leads to an accurate clustering result. Our clustering results on synthetic data sets, Gaussian-based data sets, and real-world data sets show that the proposed model achieves good accuracy.
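For orientation, the standard (unregularized) fuzzy c-means iteration that the abstract takes as its starting point can be sketched as below. This is the textbook algorithm, not the relative entropy variant proposed by the authors; the function names, the fuzzifier default m = 2, and the fixed iteration count are assumptions made for this illustration.

```python
import math


def memberships(points, centers, m=2.0):
    """Standard FCM membership update; every row sums to one."""
    power = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # The point coincides with a center: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
            continue
        rows.append([1.0 / sum((dj / dk) ** power for dk in dists) for dj in dists])
    return rows


def update_centers(points, u, m=2.0):
    """Each center is the mean of the points weighted by membership ** m."""
    dim = len(points[0])
    centers = []
    for j in range(len(u[0])):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iterations):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
    return centers, memberships(points, centers, m)


data = [(0.0,), (0.2,), (5.0,), (5.2,)]
final_centers, u = fuzzy_c_means(data, centers=[(0.0,), (5.0,)])
outlier = memberships([(100.0,)], final_centers)[0]
```

The last line illustrates the problem with the sum-to-one constraint mentioned in the abstract: a distant outlier still receives memberships close to one half in each cluster, because its memberships are forced to sum to one.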

Keywords: clustering, fuzzy c-means, regularization, relative entropy

Procedia PDF Downloads 259
2722 Short Association Bundle Atlas for Lateralization Studies from dMRI Data

Authors: C. Román, M. Guevara, P. Salas, D. Duclap, J. Houenou, C. Poupon, J. F. Mangin, P. Guevara

Abstract:

Diffusion Magnetic Resonance Imaging (dMRI) allows the non-invasive study of human brain white matter. From diffusion data, it is possible to reconstruct fiber trajectories using tractography algorithms. Our previous work consists of an automatic method for the identification of short association bundles of the superficial white matter (SWM), based on a whole-brain inter-subject hierarchical clustering applied to a HARDI database. The method finds representative clusters of similar fibers, belonging to a group of subjects, according to a distance measure between fibers, using a non-linear registration (DTI-TK). The algorithm performs an automatic labeling based on the anatomy, defined by a cortex mesh parcellated with FreeSurfer software. The clustering was applied to two independent groups of 37 subjects. The clusters resulting from both groups were compared using a restrictive threshold on the mean distance between each pair of bundles from different groups, in order to keep reproducible connections. In the left hemisphere, 48 reproducible bundles were found, while 43 bundles were found in the right hemisphere. An inter-hemispheric bundle correspondence was then applied. The symmetric horizontal reflection of the right bundles was calculated in order to obtain their position in the left hemisphere. Next, the intersection between similar bundles was calculated. The pairs of bundles with a fiber intersection percentage higher than 50% were considered similar. The similar bundles between both hemispheres were fused and symmetrized. We obtained 30 common bundles between hemispheres. An atlas was created with the resulting bundles and used to segment 78 new subjects from another HARDI database, using a distance threshold of 6 to 8 mm according to the bundle length. Finally, a laterality index was calculated based on the bundle volume. Seven bundles of the atlas presented right laterality (IP_SP_1i, LO_LO_1i, Op_Tr_0i, PoC_PoC_0i, PoC_PreC_2i, PreC_SM_0i, and RoMF_RoMF_0i) and one presented left laterality (IP_SP_2i). There is no tendency of lateralization according to the brain region. Many factors can affect the results, such as tractography artifacts, subject registration, and bundle segmentation. Further studies are necessary in order to establish the influence of these factors and evaluate SWM laterality.

Keywords: dMRI, hierarchical clustering, lateralization index, tractography

Procedia PDF Downloads 331
2721 Investigation of Cylindrical Multi-Layer Hybrid Plasmonic Waveguides

Authors: Prateeksha Sharma, V. Dinesh Kumar

Abstract:

The performance of cylindrical multilayer hybrid plasmonic waveguides has been investigated in detail, considering their structural and material aspects. The characteristics of hybrid metal insulator metal (HMIM) and hybrid insulator metal insulator (HIMI) waveguides have been compared on the basis of propagation length and confinement factor. The aim of this study is to understand newer kinds of waveguides that overcome the limitations of conventional waveguides. The investigation reveals that subwavelength confinement can be obtained in two low-dielectric spacer layers. This study provides a gateway to many applications, such as nanolasers, interconnects, biosensors, and optical trapping.

Keywords: hybrid insulator metal insulator, hybrid metal insulator metal, nano laser, surface plasmon polariton

Procedia PDF Downloads 427
2720 Max-Entropy Feed-Forward Clustering Neural Network

Authors: Xiaohan Bookman, Xiaoyan Zhu

Abstract:

The outputs of a non-linear feed-forward neural network are positive and can be treated as probabilities once they are normalized to sum to one. If we take the Entropy-Based Principle into consideration, the outputs for each sample can be interpreted as the distribution of that sample over the different clusters. The Entropy-Based Principle is the principle with which we can estimate an unknown distribution under some limited conditions. This paper defines two processes in the feed-forward neural network: the limited condition is given by the abstracted features of the samples, which are computed in the abstraction process, and the final outputs are the probability distributions over the different clusters, produced in the clustering process. Incorporating the Entropy-Based Principle into the feed-forward neural network yields a clustering method. We conducted experiments on six open UCI data sets, comparing against several baselines and using purity as the evaluation measure. The results show that our method outperforms all the baselines, which are among the most popular clustering methods.
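Two ingredients of the evaluation described above are easy to make concrete: turning positive network outputs into a distribution over clusters, and scoring a hard assignment with purity. The sketch below uses the standard definition of purity (the fraction of samples that fall in the majority true class of their cluster); the function names and the toy numbers are assumptions made for illustration and do not come from the paper.

```python
from collections import Counter


def normalize(outputs):
    """Scale positive outputs so that they sum to one."""
    total = sum(outputs)
    return [o / total for o in outputs]


def assign_cluster(outputs):
    """Pick the cluster with the highest normalized output."""
    probs = normalize(outputs)
    return max(range(len(probs)), key=probs.__getitem__)


def purity(cluster_labels, true_labels):
    """Fraction of samples belonging to the majority class of their cluster."""
    by_cluster = {}
    for c, t in zip(cluster_labels, true_labels):
        by_cluster.setdefault(c, Counter())[t] += 1
    majority = sum(max(counts.values()) for counts in by_cluster.values())
    return majority / len(true_labels)


# Hypothetical positive outputs for four samples and two clusters.
raw_outputs = [[0.9, 0.3], [0.8, 0.1], [0.2, 0.7], [0.4, 0.6]]
clusters = [assign_cluster(o) for o in raw_outputs]
score = purity(clusters, ["a", "a", "b", "a"])
```

Purity rewards clusters dominated by a single class, but it also grows with the number of clusters, so it is usually compared at a fixed cluster count.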

Keywords: feed-forward neural network, clustering, max-entropy principle, probabilistic models

Procedia PDF Downloads 435
2719 Recurrent Neural Networks with Deep Hierarchical Mixed Structures for Chinese Document Classification

Authors: Zhaoxin Luo, Michael Zhu

Abstract:

In natural languages, there are always complex semantic hierarchies. Obtaining a feature representation based on these complex semantic hierarchies is the key to the success of a model. Several RNN models have recently been proposed that use latent indicators to obtain the hierarchical structure of documents. However, a model that uses only a single-layer latent indicator cannot capture the true hierarchical structure of a language, especially a complex language like Chinese. In this paper, we propose a deep layered model that stacks arbitrarily many RNN layers equipped with latent indicators. By using EM and training hierarchically, our model solves the computational problem of stacking RNN layers and makes it possible to stack arbitrarily many of them. Our deep hierarchical model not only achieves results comparable to large pre-trained models on the Chinese short text classification problem but also achieves state-of-the-art results on the Chinese long text classification problem.

Keywords: natural language processing, recurrent neural network, hierarchical structure, document classification, Chinese

Procedia PDF Downloads 68
2718 Clustering of Extremes in Financial Returns: A Comparison between Developed and Emerging Markets

Authors: Sara Ali Alokley, Mansour Saleh Albarrak

Abstract:

This paper investigates the dependency, or clustering, of extremes in financial returns data by estimating the extremal index θ∈[0,1]. The smaller the value of θ, the stronger the clustering. We apply the method of Ferro and Segers (2003) to estimate the extremal index for a range of threshold values. We compare the dependency structure of extremes in developed and emerging markets, using the financial returns of the stock market indices of the developed markets of the US, the UK, France, Germany, and Japan and the emerging markets of Brazil, Russia, India, China, and Saudi Arabia. We expect more clustering to occur in the emerging markets. This study will help in understanding the dependency structure of financial returns data.
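The intervals estimator of Ferro and Segers (2003) works on the gaps between successive threshold exceedances. The sketch below follows the published form of that estimator as commonly stated; the function name, the choice to treat values above the threshold as exceedances (negate the series to study losses), and the toy series are assumptions made here for illustration.

```python
def extremal_index(returns, threshold):
    """Intervals estimator of the extremal index from inter-exceedance times."""
    times = [i for i, r in enumerate(returns) if r > threshold]
    gaps = [b - a for a, b in zip(times, times[1:])]
    n = len(gaps)
    if n == 0:
        raise ValueError("need at least two exceedances of the threshold")
    if max(gaps) <= 2:
        num = 2.0 * sum(gaps) ** 2
        den = n * sum(g * g for g in gaps)
    else:
        # Bias-corrected form used when some gap is longer than two steps.
        num = 2.0 * sum(g - 1 for g in gaps) ** 2
        den = n * sum((g - 1) * (g - 2) for g in gaps)
    return min(1.0, num / den)


# Toy series: two tight bursts of extremes versus evenly spread extremes.
clustered = [0.0] * 60
for i in (0, 1, 2, 50, 51, 52):
    clustered[i] = 1.0
spread = [1.0 if i % 10 == 0 else 0.0 for i in range(60)]

theta_clustered = extremal_index(clustered, threshold=0.5)
theta_spread = extremal_index(spread, threshold=0.5)
```

On these toy inputs the burst series gives a value well below one and the evenly spread series gives one, matching the interpretation that smaller θ means more clustering. In practice the estimate would be computed over a range of thresholds, as the abstract describes.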

Keywords: clustering, extremes, returns, dependency, extremal index

Procedia PDF Downloads 405
2717 An Energy Efficient Clustering Approach for Underwater Wireless Sensor Networks

Authors: Mohammad Reza Taherkhani

Abstract:

Wireless sensor networks, which are used to monitor a particular environment, are formed from a large number of sensor nodes. The role of these sensors is to sense specific parameters of the surroundings and to establish a connection. In these networks, the most important challenge is the management of energy usage. Clustering is one of the methods broadly used to address this challenge. In this paper, a distributed clustering protocol based on learning automata is proposed for underwater wireless sensor networks. The proposed algorithm, called LA-Clustering, forms clusters at the same energy level based on the energy level of the nodes and the connection radius, regardless of the size and structure of the sensor network. The proposed approach is simulated and compared with other protocols on metrics such as network lifetime, number of alive nodes, and amount of transmitted data. The simulation results demonstrate the efficiency of the proposed approach.

Keywords: underwater sensor networks, clustering, learning automata, energy consumption

Procedia PDF Downloads 361
2716 Time Series Regression with Meta-Clusters

Authors: Monika Chuchro

Abstract:

This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was performed to obtain normally distributed subgroups of time series data from wastewater treatment plant inflow data, which are composed of several groups differing in mean value. Two simple algorithms, K-means and EM, were chosen as the clustering methods. The Rand index was used to measure the similarity. After simple meta-clustering, a regression model was fitted for each subgroup. The final model was the sum of the subgroup models. The quality of the obtained model was compared with that of a regression model built using the same explanatory variables but with no clustering of the data. Results were compared using the coefficient of determination (R2), the mean absolute percentage error (MAPE) as a measure of prediction accuracy, and a comparison on a line chart. Preliminary results indicate the potential of the presented technique.
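The two numerical criteria named above have standard definitions, sketched here in plain Python. The function names and the sample numbers are illustrative assumptions, not data from the study.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical inflow observations and model predictions.
observed = [100.0, 200.0, 300.0]
modelled = [110.0, 190.0, 300.0]
r2 = r_squared(observed, modelled)
err = mape(observed, modelled)
```

Computing both values for the clustered model and for the single model without clustering gives the kind of side-by-side comparison the abstract describes.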

Keywords: clustering, data analysis, data mining, predictive models

Procedia PDF Downloads 466
2715 Electrification Strategy of Hybrid Electric Vehicle as a Solution to Decrease CO2 Emission in Cities

Authors: M. Mourad, K. Mahmoud

Abstract:

Recently, hybrid vehicles have attracted major interest as an alternative type of vehicle. This type of vehicle contributes greatly to reducing pollution. Therefore, this work studies the influence of the electrification phase of a hybrid electric vehicle on vehicle emissions under different road conditions. To accomplish this investigation, a simulation model was used to evaluate the external characteristics of the hybrid electric vehicle under varying road resistance conditions. This paper thus reports a methodology to decrease vehicle emissions, especially greenhouse gas emissions, inside cities. The results show the effect of electrification on vehicle performance characteristics. They also show that the vehicle's CO2 emissions decrease by up to 50.6% over an urban driving cycle when the electrification strategy is applied to the hybrid electric vehicle.

Keywords: electrification strategy, hybrid electric vehicle, driving cycle, CO2 emission

Procedia PDF Downloads 442
2714 Hybrid Concrete Construction (HCC) for Sustainable Infrastructure Development in Nigeria

Authors: Muhammad Bello Ibrahim, M. Auwal Zakari, Aliyu Usman

Abstract:

Hybrid concrete construction (HCC) combines the benefits of pre-casting with the advantages of cast in-situ construction. Merging the two as a hybrid structure results in even greater construction speed, value, and overall economy. Its variety of uses has gained popularity in the United States and in Europe due to its distinctive benefits. However, the growth of its application in some countries (including Nigeria) has been relatively slow. Several studies have shown that hybrid construction offers an ultra-high performance concrete with superior strength, durability, and aesthetics, along with design flexibility and sustainability credentials, based on available and economically viable technologies. This paper examines and documents the criteria that will help inform the process of deciding whether or not to adopt hybrid concrete construction (HCC) technology rather than more traditional alternatives. It also presents the current state of design, construction, and research on hybrid structures.

Keywords: hybrid concrete construction, Nigeria, sustainable infrastructure development, design flexibility

Procedia PDF Downloads 561