Search results for: volatility clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 814

754 Self-Supervised Attributed Graph Clustering with Dual Contrastive Loss Constraints

Authors: Lijuan Zhou, Mengqi Wu, Changyong Niu

Abstract:

Attributed graph clustering can utilize the graph topology and node attributes to uncover hidden community structures and patterns in complex networks, aiding in the understanding and analysis of complex systems. Utilizing contrastive learning for attributed graph clustering can effectively exploit meaningful implicit relationships between data. However, existing attributed graph clustering methods based on contrastive learning suffer from the following drawbacks: 1) Complex data augmentation increases computational cost, and inappropriate data augmentation may lead to semantic drift. 2) The selection of positive and negative samples neglects the intrinsic cluster structure learned from graph topology and node attributes. Therefore, this paper proposes a method called self-supervised Attributed Graph Clustering with Dual Contrastive Loss constraints (AGC-DCL). Firstly, Siamese Multilayer Perceptron (MLP) encoders are employed to generate two views separately to avoid complex data augmentation. Secondly, the neighborhood contrastive loss is introduced to constrain node representation using local topological structure while effectively embedding attribute information through attribute reconstruction. Additionally, clustering-oriented contrastive loss is applied to fully utilize clustering information in global semantics for discriminative node representations, regarding the cluster centers from two views as negative samples to fully leverage effective clustering information from different views. Comparative clustering results with existing attributed graph clustering algorithms on six datasets demonstrate the superiority of the proposed method.

Keywords: attributed graph clustering, contrastive learning, clustering-oriented, self-supervised learning

Procedia PDF Downloads 21
753 Decision Trees Constructing Based on K-Means Clustering Algorithm

Authors: Loai Abdallah, Malik Yousef

Abstract:

A domain space for the data should reflect the actual similarity between objects, since objects belonging to the same cluster usually share common traits even though their geometric distance may be relatively large. In general, the Euclidean distance between data points that are represented by a large number of features does not capture the actual relation between those points. In this study, we propose a new method that constructs a different space, based on clustering, to form a new distance metric. The new distance space is based on ensemble clustering (EC). The EC distance space is defined by tracking the membership of the points over multiple runs of a clustering algorithm. Over this distance, we train a decision tree classifier (DT-EC). The results obtained by applying DT-EC to 10 datasets confirm our hypothesis that embedding the EC space as a distance metric improves performance.
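The idea of a distance built from cluster memberships can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact definition, so the distance here is assumed to be the fraction of k-means runs in which two points land in different clusters, and the names `kmeans_labels` and `ec_distance_matrix` are hypothetical.

```python
import random


def kmeans_labels(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members (empty clusters keep their center).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def ec_distance_matrix(points, k=2, runs=10, seed=0):
    """Assumed EC distance: share of runs in which two points are separated."""
    rng = random.Random(seed)
    n = len(points)
    separated = [[0] * n for _ in range(n)]
    for _ in range(runs):
        labels = kmeans_labels(points, k, rng)
        for i in range(n):
            for j in range(n):
                if labels[i] != labels[j]:
                    separated[i][j] += 1
    return [[count / runs for count in row] for row in separated]


points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
D = ec_distance_matrix(points)
```

A matrix such as `D` could then be handed to any classifier that accepts precomputed distances; how the paper couples it with decision trees is not described in the abstract.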

Keywords: ensemble clustering, decision trees, classification, K nearest neighbors

Procedia PDF Downloads 165
752 A Non-parametric Clustering Approach for Multivariate Geostatistical Data

Authors: Francky Fouedjio

Abstract:

Multivariate geostatistical data have become omnipresent in the geosciences and pose substantial analysis challenges. One of them is the grouping of data locations into spatially contiguous clusters so that data locations within the same cluster are more similar while clusters are different from each other, in some sense. Spatially contiguous clusters can significantly improve interpretation, turning the resulting clusters into meaningful geographical subregions. In this paper, we develop an agglomerative hierarchical clustering approach that takes into account the spatial dependency between observations. It relies on a dissimilarity matrix built from a non-parametric kernel estimator of the spatial dependence structure of the data. It integrates existing methods to find the optimal cluster number and to evaluate the contribution of variables to the clustering. The capability of the proposed approach to provide spatially compact, connected and meaningful clusters is assessed using a bivariate synthetic dataset and a multivariate geochemical dataset. The proposed clustering method gives satisfactory results compared to other similar geostatistical clustering methods.

Keywords: clustering, geostatistics, multivariate data, non-parametric

Procedia PDF Downloads 461
751 Power Iteration Clustering Based on Deflation Technique on Large Scale Graphs

Authors: Taysir Soliman

Abstract:

Spectral Clustering (SC) is one of the currently popular clustering techniques because of its advantages over conventional approaches such as hierarchical clustering and k-means. However, one disadvantage of SC is that it is time-consuming, because it requires computing eigenvectors. To overcome this disadvantage, a number of approaches have been proposed, such as the Power Iteration Clustering (PIC) technique, which is a variant of SC. Some of PIC's advantages are: 1) its scalability and efficiency, 2) finding one pseudo-eigenvector instead of computing eigenvectors, and 3) obtaining a linear combination of the eigenvectors in linear time. However, its worst disadvantage is the inter-class collision problem, because it uses only one pseudo-eigenvector, which is not enough. Previous researchers developed Deflation-based Power Iteration Clustering (DPIC) to overcome PIC's inter-class collision problem with the same efficiency as PIC. In this paper, we develop Parallel DPIC (PDPIC) to improve time and memory complexity; it runs on the Apache Spark framework using sparse matrices. To test the performance of PDPIC, we compared it to the SC, ESCG, and ESCALG algorithms on four small graph benchmark datasets and nine large graph benchmark datasets, where PDPIC achieved higher accuracy and lower running time than the other algorithms.
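The single pseudo-eigenvector that plain PIC relies on can be illustrated with a short sketch. It covers only the basic power iteration on a row-normalised affinity matrix, stopped early; it is not DPIC or PDPIC, and it does not use Spark. The helper names and the toy graph are assumptions, and the final split at the largest gap is a simplification standing in for the k-means step normally applied to the embedding.

```python
def power_iteration_embedding(affinity, iters=10):
    """Run a few power-iteration steps on W = D^-1 A and return the 1-D embedding."""
    n = len(affinity)
    degrees = [sum(row) for row in affinity]
    W = [[a / degrees[i] for a in affinity[i]] for i in range(n)]
    total = sum(degrees)
    v = [d / total for d in degrees]  # degree-based starting vector
    for _ in range(iters):
        v = [sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(abs(x) for x in v)  # L1 normalisation keeps values in a stable range
        v = [x / norm for x in v]
    return v


def split_at_largest_gap(v):
    """Two-way split of the embedding at its widest gap (stand-in for k-means)."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    gaps = [v[order[i + 1]] - v[order[i]] for i in range(len(v) - 1)]
    cut = gaps.index(max(gaps))
    labels = [0] * len(v)
    for position, index in enumerate(order):
        labels[index] = 0 if position <= cut else 1
    return labels


def two_cliques(size_a, size_b, bridge_weight):
    """Toy graph: two cliques joined by one weak edge (hypothetical test data)."""
    n = size_a + size_b
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and (i < size_a) == (j < size_a):
                A[i][j] = 1.0
    A[size_a - 1][size_a] = A[size_a][size_a - 1] = bridge_weight
    return A


embedding = power_iteration_embedding(two_cliques(3, 4, 0.01))
labels = split_at_largest_gap(embedding)
```

Because every node is reduced to one number, two different clusters can end up with nearly the same value; that is the inter-class collision the abstract refers to.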

Keywords: spectral clustering, power iteration clustering, deflation-based power iteration clustering, Apache Spark, large graph

Procedia PDF Downloads 168
750 Agglomerative Hierarchical Clustering Using the Tθ Family of Similarity Measures

Authors: Salima Kouici, Abdelkader Khelladi

Abstract:

In this work, we begin with the presentation of the Tθ family of usual similarity measures concerning multidimensional binary data. Subsequently, some properties of these measures are proposed. Finally, the impact of the use of different inter-elements measures on the results of the Agglomerative Hierarchical Clustering Methods is studied.

Keywords: binary data, similarity measure, Tθ measures, agglomerative hierarchical clustering

Procedia PDF Downloads 463
749 Finding Bicluster on Gene Expression Data of Lymphoma Based on Singular Value Decomposition and Hierarchical Clustering

Authors: Alhadi Bustaman, Soeganda Formalidin, Titin Siswantining

Abstract:

DNA microarray technology is used to analyze thousands of gene expression data simultaneously and is a very important tool for drug development and testing, function annotation, and cancer diagnosis. Various clustering methods have been used for analyzing gene expression data. However, when analyzing very large and heterogeneous collections of gene expression data, conventional clustering methods often cannot produce a satisfactory solution. Biclustering algorithms have been used as an alternative approach to identifying structures in gene expression data. In this paper, we introduce a transform technique based on singular value decomposition to identify a normalized matrix of gene expression data, followed by the Mixed-Clustering algorithm and the Lift algorithm, inspired by the node-deletion and node-addition phases proposed by Cheng and Church and based on Agglomerative Hierarchical Clustering (AHC). An experimental study on standard datasets demonstrated the effectiveness of the algorithm on gene expression data.

Keywords: agglomerative hierarchical clustering (AHC), biclustering, gene expression data, lymphoma, singular value decomposition (SVD)

Procedia PDF Downloads 258
748 Comparison Study of Capital Protection Risk Management Strategies: Constant Proportion Portfolio Insurance versus Volatility Target Based Investment Strategy with a Guarantee

Authors: Olga Biedova, Victoria Steblovskaya, Kai Wallbaum

Abstract:

In the current capital market environment, investors constantly face the challenge of finding a successful and stable investment mechanism. Highly volatile equity markets and extremely low bond returns bring about the demand for sophisticated yet reliable risk management strategies. Investors are looking for risk management solutions to efficiently protect their investments. This study compares a classic Constant Proportion Portfolio Insurance (CPPI) strategy to a Volatility Target portfolio insurance (VTPI). VTPI is an extension of the well-known Option Based Portfolio Insurance (OBPI) to the case where an embedded option is linked not to a pure risky asset such as the S&P 500, but to a Volatility Target (VolTarget) portfolio. The VolTarget strategy is a recently emerged rule-based dynamic asset allocation mechanism where the portfolio’s volatility is kept under control. As a result, a typical VTPI strategy allows higher participation rates in the market due to reduced embedded option prices. In addition, controlled volatility levels eliminate the volatility spread in option pricing, one of the frequently cited reasons why OBPI strategies fall behind CPPI. The strategies are compared within the framework of stochastic dominance theory based on numerical simulations, rather than on the restrictive assumption of Black-Scholes type dynamics of the underlying asset. An extended comparative quantitative analysis of the performance of the above investment strategies in various market scenarios and within a range of input parameter values is presented.
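For readers unfamiliar with CPPI, the textbook allocation rule can be sketched in a few lines: the risky exposure is a fixed multiple of the cushion (portfolio value minus floor). The parameter values, the constant floor, the no-leverage cap and the function name `cppi_path` are illustrative assumptions, not the setup used in the study.

```python
def cppi_path(risky_returns, start_value=100.0, floor=80.0, multiplier=3.0, safe_rate=0.0):
    """Simulate a CPPI portfolio with a constant floor, rebalanced every period."""
    value = start_value
    path = [value]
    for r in risky_returns:
        cushion = max(value - floor, 0.0)
        exposure = min(multiplier * cushion, value)  # cap at 100% invested (no leverage)
        safe = value - exposure
        value = exposure * (1.0 + r) + safe * (1.0 + safe_rate)
        path.append(value)
    return path


# Hypothetical per-period returns of the risky asset.
path = cppi_path([-0.10, -0.20, 0.05, -0.30])
```

With a multiplier of 3, the floor holds as long as no single-period loss exceeds one third, which is why gap risk between rebalancing dates matters for CPPI in practice.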

Keywords: CPPI, portfolio insurance, stochastic dominance, volatility target

Procedia PDF Downloads 147
747 K-Means Clustering-Based Infinite Feature Selection Method

Authors: Seyyedeh Faezeh Hassani Ziabari, Sadegh Eskandari, Maziar Salahi

Abstract:

Infinite Feature Selection (IFS) algorithm is an efficient feature selection algorithm that selects a subset of features of all sizes (including infinity). In this paper, we present an improved version of it, called clustering IFS (CIFS), by clustering the dataset in advance. To do so, first, we apply the K-means algorithm to cluster the dataset, then we apply IFS. In the CIFS method, the spatial and temporal complexities are reduced compared to the IFS method. Experimental results on 6 datasets show the superiority of CIFS compared to IFS in terms of accuracy, running time, and memory consumption.

Keywords: feature selection, infinite feature selection, clustering, graph

Procedia PDF Downloads 103
746 The Effect of Oil Price Uncertainty on Food Price in South Africa

Authors: Goodness C. Aye

Abstract:

This paper examines the effect of the volatility of oil prices on food price in South Africa using monthly data covering the period 2002:01 to 2014:09. Food price is measured by the South African consumer price index for food, while oil price is proxied by the Brent crude oil price. The study employs the GARCH-in-mean VAR model, which allows the investigation of the effect of negative and positive shocks in oil price volatility on food price. The model also allows oil price uncertainty to be measured as the conditional standard deviation of a one-step-ahead forecast error of the change in oil price. The results show that oil price uncertainty has a positive and significant effect on food price in South Africa. The responses of food price to positive and negative oil price shocks are asymmetric.

Keywords: oil price volatility, food price, bivariate, GARCH-in-mean VAR, asymmetric

Procedia PDF Downloads 458
745 Cryptocurrency as a Payment Method in the Tourism Industry: A Comparison of Volatility, Correlation and Portfolio Performance

Authors: Shu-Han Hsu, Jiho Yoon, Chwen Sheu

Abstract:

With the rapid growth of blockchain technology and cryptocurrency, various industries, including tourism, have adopted cryptocurrency as a payment method for their transactions. More and more tourism companies accept payments in digital currency for flights, hotel reservations, transportation, and more. For travellers and tourists, using cryptocurrency as a payment method has become a way to circumvent costs and prevent risks. Understanding volatility dynamics and interdependencies between standard currency and cryptocurrency is important for appropriate financial risk management and assists policy-makers and investors in making more informed decisions. The purpose of this paper is to understand and explain the risk spillover effects between six major cryptocurrencies and the top ten most traded standard currencies. The data are daily closing prices of cryptocurrencies and currency exchange rates from 7 August 2015 to 10 December 2019, with 1,133 observations. The diagonal BEKK model was used to analyze the co-volatility spillover effects between cryptocurrency returns and exchange rate returns, which are measures of how the shocks to returns in different assets affect each other’s subsequent volatility. The empirical results show there are co-volatility spillover effects between the cryptocurrency returns and GBP/USD, CNY/USD and MXN/USD exchange rate returns. Therefore, currencies (British Pound, Chinese Yuan and Mexican Peso) and cryptocurrencies (Bitcoin, Ethereum, Ripple, Tether, Litecoin and Stellar) are suitable for constructing a financial portfolio from an optimal risk management perspective and also for dynamic hedging purposes.

Keywords: blockchain, co-volatility effects, cryptocurrencies, diagonal BEKK model, exchange rates, risk spillovers

Procedia PDF Downloads 123
744 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Billel Kenidra, Mohamed Benmohammed

Abstract:

Data mining techniques used in the field of clustering are a subject of active research and assist in biological pattern recognition and the extraction of new knowledge from raw data. Clustering means the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar to each other and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters, leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers, and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy to initialize K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results show that efficiency has been significantly improved.

Keywords: microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization

Procedia PDF Downloads 168
743 Application of Forward Contract and Crop Insurance as Risk Management Tools of Agriculture: A Case Study in Bangladesh

Authors: M. Bokhtiar Hasan, M. Delowar Hossain, Abu N. M. Wahid

Abstract:

The principal aim of the study is to find a way to effectively manage agricultural risks such as price volatility, weather risks, and fund shortage. To hedge price volatility, farmers sometimes make contracts with agro-traders but fail to protect themselves effectively because there is no legal framework for such contracts. The study extensively reviews the existing literature and finds evidence that the majority of studies deal with either price volatility or weather risks. If these risks could be addressed through a single model, it would be more useful to both farmers and traders. The authors make an effort in this direction, and this is the key contribution of the study. Initially, we conduct a small survey to identify the shortcomings of existing contracts. Later, we propose a model encompassing forward and insurance contracts together, where the forward contract is used to hedge price volatility and the insurance contract is used to protect against weather risks. Contribution/Originality: The study adds to the existing literature by proposing an integrated model comprising a forward contract and crop insurance, which will support both farmers and traders in coping with agricultural risks such as price volatility, weather hazards, and fund shortage. JEL Classifications: O13, Q13

Keywords: agriculture, forward contract, insurance contract, risk management, model

Procedia PDF Downloads 136
742 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well-known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can contain either categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are commonly used: k-means for clustering numeric datasets and k-modes for categorical datasets. A main problem encountered in data mining applications is clustering categorical data, which are so prevalent in datasets. One way to cluster categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead of k-modes. In this paper, we experiment with an approach along these lines, transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previous method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are evaluated. The obtained results show that our proposed method outperforms the binary method in all cases.
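The relative-frequency transformation described above can be sketched as follows. The function name and the toy records are hypothetical, and the paper may apply further scaling that the abstract does not mention; the sketch only replaces each categorical value by the share of records carrying that value in the same attribute.

```python
from collections import Counter


def relative_frequency_encode(rows):
    """Replace each categorical value by its relative frequency within its column."""
    n = len(rows)
    columns = list(zip(*rows))
    frequencies = [Counter(column) for column in columns]
    return [[frequencies[j][value] / n for j, value in enumerate(row)] for row in rows]


# Hypothetical records with two categorical attributes (colour, size).
rows = [("red", "S"), ("red", "M"), ("blue", "S"), ("red", "S")]
encoded = relative_frequency_encode(rows)
```

The encoded rows are plain numeric vectors, so a standard k-means implementation can be applied to them directly. One design consequence is that two distinct modalities with the same frequency map to the same number.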

Keywords: clustering, unsupervised learning, pattern recognition, categorical datasets, knowledge discovery, k-means

Procedia PDF Downloads 241
741 Generalization of Clustering Coefficient on Lattice Networks Applied to Criminal Networks

Authors: Christian H. Sanabria-Montaña, Rodrigo Huerta-Quintanilla

Abstract:

A lattice network is a special type of network in which all nodes have the same number of links, and its boundary conditions are periodic. The most basic lattice network is the ring, a one-dimensional network with periodic border conditions. In contrast, the Cartesian product of d rings forms a d-dimensional lattice network. An analytical expression currently exists for the clustering coefficient in this type of network, but the theoretical value is valid only up to a certain connectivity value; in other words, the analytical expression is incomplete. Here we obtain analytically the clustering coefficient expression in d-dimensional lattice networks for any link density. Our analytical results show that the clustering coefficient of a lattice network whose link density tends to 1 converges to the clustering coefficient of a fully connected network. We developed a model in criminology in which the generalized clustering coefficient expression is applied. The model states that delinquents learn the know-how of the crime business by sharing knowledge, directly or indirectly, with their friends in the gang. This generalization sheds light on the network properties, which is important for developing new models in different fields where network structure plays an important role in the system dynamics, such as criminology, evolutionary game theory, and econophysics, among others.
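The limited validity of the existing expression can be checked numerically on the one-dimensional ring. The sketch below computes the clustering coefficient by direct counting and compares it with the well-known sparse-regime formula for a ring in which each node links to its k nearest neighbours on each side, C = 3(k - 1) / (2(2k - 1)). The function names are assumptions, and the paper's generalized expression is not reproduced here.

```python
from itertools import combinations


def ring_lattice(n, k):
    """Ring of n nodes; each node is linked to its k nearest neighbours on each side."""
    return {i: {(i + s) % n for s in range(-k, k + 1) if s != 0} for i in range(n)}


def clustering_coefficient(adj):
    """Average over nodes of (linked neighbour pairs) / (all neighbour pairs)."""
    total = 0.0
    for neighbours in adj.values():
        pairs = list(combinations(neighbours, 2))
        if pairs:
            total += sum(1 for a, b in pairs if b in adj[a]) / len(pairs)
    return total / len(adj)


def sparse_ring_formula(k):
    """Classical expression, valid only while neighbourhoods do not wrap around."""
    return 3 * (k - 1) / (2 * (2 * k - 1))


c_sparse = clustering_coefficient(ring_lattice(20, 3))  # low link density
c_dense = clustering_coefficient(ring_lattice(7, 3))    # link density 1 (complete graph)
```

For the sparse ring the count and the formula agree at 0.6, while for the dense ring the count gives 1 and the formula still gives 0.6, which illustrates why an expression valid for any link density is needed.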

Keywords: clustering coefficient, criminology, generalized, regular network d-dimensional

Procedia PDF Downloads 385
740 Modelling Volatility Spillovers and Cross Hedging among Major Agricultural Commodity Futures

Authors: Roengchai Tansuchat, Woraphon Yamaka, Paravee Maneejuk

Abstract:

In recent years, the global financial crisis, economic instability, and large fluctuations in agricultural commodity prices have led to increased concerns about volatility transmission among them. The problem is further exacerbated by commodity volatility caused by other commodity price fluctuations; hence, decisions on hedging strategy have become both costly and useless. Thus, this paper analyzes the volatility spillover effect among major agricultural commodities, including corn, soybeans, wheat and rice, to help commodity suppliers hedge their portfolios and manage their risk and co-volatility. We provide a switching-regime approach to analyzing volatility spillovers in different economic conditions, namely economic upturns and downturns. In particular, we investigate relationships and volatility transmissions between these commodities in different economic conditions. We propose a copula-based multivariate Markov Switching GARCH model with two regimes that depend on economic conditions and perform a simulation study to check the accuracy of our proposed model. In this study, the correlation term in the cross-hedge ratio is obtained from six copula families: two elliptical copulas (Gaussian and Student-t) and four Archimedean copulas (Clayton, Gumbel, Frank, and Joe). We use one-step maximum likelihood estimation to estimate our models and compare the performance of these copulas using the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). In the application study of agricultural commodities, weekly data from 4 January 2005 to 1 September 2016 are used, covering 612 observations. The empirical results indicate that the volatility spillover effects among cereal futures differ in response to different economic conditions. In addition, the results on hedge effectiveness suggest the optimal cross-hedge strategies in different economic conditions, especially economic upturns and downturns.

Keywords: agricultural commodity futures, cereal, cross-hedge, spillover effect, switching regime approach

Procedia PDF Downloads 182
739 A Relative Entropy Regularization Approach for Fuzzy C-Means Clustering Problem

Authors: Ouafa Amira, Jiangshe Zhang

Abstract:

Clustering is an unsupervised machine learning technique; its aim is to extract the data structures, in which similar data objects are grouped in the same cluster, whereas dissimilar objects are grouped in different clusters. Clustering methods are widely utilized in different fields, such as image processing, computer vision, and pattern recognition. Fuzzy c-means clustering (FCM) is one of the most well-known fuzzy clustering methods. It is based on solving an optimization problem in which a given cost function is minimized. This minimization aims to decrease the dissimilarity inside clusters, where the dissimilarity is measured by the distances between data objects and cluster centers. The degree of belonging of a data point to a cluster is measured by a membership function which takes values in the interval [0, 1]. In FCM clustering, the membership degree is constrained by the condition that the sum of a data object’s memberships in all clusters must be equal to one. This constraint can cause several problems, especially when the data objects lie in a noisy space. Regularization approaches have been applied to the fuzzy c-means clustering technique. Regularization introduces additional information in order to solve an ill-posed optimization problem. In this study, we focus on regularization by the relative entropy approach, where in our optimization problem we aim to minimize the dissimilarity inside clusters. Finding an appropriate membership degree for each data object is our objective, because an appropriate membership degree leads to an accurate clustering result. Our clustering results on synthetic data sets, Gaussian-based data sets, and real-world data sets show that our proposed model achieves good accuracy.
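The sum-to-one constraint mentioned above is easiest to see in the standard (unregularized) FCM membership update, sketched below. This is the classical update with fuzzifier m, not the relative-entropy-regularized model proposed in the paper, and the function name and sample data are assumptions.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Classical FCM update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1))."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # A point sitting exactly on a center belongs fully to that center.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            row = [h / sum(hits) for h in hits]
        else:
            row = [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]
        memberships.append(row)
    return memberships


centers = [(0.0, 0.0), (4.0, 0.0)]
U = fcm_memberships([(1.0, 0.0), (2.0, 0.0), (0.0, 0.0), (50.0, 0.0)], centers)
```

The last point is a far-away outlier, yet its memberships still sum to one and are split between the two clusters; this is the kind of behaviour in noisy data that motivates modified formulations.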

Keywords: clustering, fuzzy c-means, regularization, relative entropy

Procedia PDF Downloads 245
738 The Impact of the Global Financial Crises on MILA Stock Markets

Authors: Miriam Sosa, Edgar Ortiz, Alejandra Cabello

Abstract:

This paper examines the volatility changes and leverage effects of the MILA stock markets and their changes since the 2007 global financial crisis. This group integrates the stock markets of Chile, Colombia, Mexico and Peru. Volatility changes and leverage effects are tested with symmetric GARCH(1,1) and asymmetric TARCH(1,1) models with a dummy variable in the variance equation. Daily closing prices of the stock indexes of Chile (IPSA), Colombia (COLCAP), Mexico (IPC) and Peru (IGBVL) are examined for the period 2003:01 to 2015:02. The evidence confirms the presence of an overall increase in asymmetric market volatility in the Peruvian share market since the 2007 crisis.
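A variance equation of the kind described above can be sketched as a simple recursion. The textbook form is assumed here: a GARCH(1,1) core, an optional threshold term that is active only after negative shocks (TARCH), and an optional additive crisis dummy. The parameter values and the starting variance are illustrative, not estimates from the paper.

```python
def conditional_variance(residuals, omega, alpha, beta, gamma=0.0, delta=0.0, crisis=None):
    """sigma2_t = omega + alpha*e^2 + gamma*e^2*[e < 0] + beta*sigma2_{t-1} + delta*D_t,
    where e is the previous period's residual and D_t is a 0/1 crisis dummy."""
    if crisis is None:
        crisis = [0] * len(residuals)
    # Assumed initialisation: the sample variance of the residuals.
    variance = sum(e * e for e in residuals) / len(residuals)
    variances = [variance]
    for t in range(1, len(residuals)):
        e = residuals[t - 1]
        negative = 1.0 if e < 0 else 0.0
        variance = (omega + alpha * e * e + gamma * e * e * negative
                    + beta * variance + delta * crisis[t])
        variances.append(variance)
    return variances


residuals = [0.01, -0.02, 0.015]  # hypothetical return shocks
symmetric = conditional_variance(residuals, 1e-6, 0.1, 0.8)
asymmetric = conditional_variance(residuals, 1e-6, 0.1, 0.8, gamma=0.1)
with_dummy = conditional_variance(residuals, 1e-6, 0.1, 0.8, delta=5e-6, crisis=[0, 1, 1])
```

A positive `gamma` raises the variance only after a negative shock, which is the leverage effect; a positive `delta` shifts the variance level during the periods flagged by the dummy.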

Keywords: financial crisis, Latin American Integrated Market, TARCH, GARCH

Procedia PDF Downloads 255
737 Mean and Volatility Spillover between US Stocks Market and Crude Oil Markets

Authors: Kamel Malik Bensafta, Gervasio Bensafta

Abstract:

The purpose of this paper is to investigate the relationship between oil prices and stock markets. The empirical analysis in this paper is conducted within the context of multivariate GARCH models, using a transformed version of the so-called BEKK parameterization. We show that the mean and uncertainty of the US market are transmitted to the oil market and the European market. We also identify an important transmission from WTI prices to Brent prices.

Keywords: oil volatility, stock markets, MGARCH, transmission, structural break

Procedia PDF Downloads 466
736 Economic Growth: The Nexus of Oil Price Volatility and Renewable Energy Resources among Selected Developed and Developing Economies

Authors: Muhammad Siddique, Volodymyr Lugovskyy

Abstract:

This paper explores how nations might mitigate the unfavorable impacts of oil price volatility on economic growth by switching to renewable energy sources. The impacts of uncertain factor prices on economic activity are examined by looking at the Realized Volatility (RV) of oil prices rather than the more traditional method of looking at oil price shocks. The United States of America (USA), China (C), India (I), United Kingdom (UK), Germany (G), Malaysia (M), and Pakistan (P) are all included to round out the traditional literature's examination of selected nations, which focuses on oil-importing and exporting economies. Granger Causality Tests (GCT), Impulse Response Functions (IRF), and Variance Decompositions (VD) demonstrate that in a Vector Auto-Regressive (VAR) scenario, the negative impacts of oil price volatility extend beyond what can be explained by oil price shocks alone for all of the nations in the sample. Different nations have different levels of vulnerability to changes in oil prices and other factors that may play a role in a sectoral composition and the energy mix. The conventional method, which only takes into account whether a country is a net oil importer or exporter, is inadequate. The potential economic advantages of initiatives to decouple the macroeconomy from volatile commodities markets are shown through simulations of volatility shocks in alternative energy mixes (with greater proportions of renewables). It is determined that in developing countries like Pakistan, increasing the use of renewable energy sources might lessen an economy's sensitivity to changes in oil prices; nonetheless, a country-specific study is required to identify particular policy actions. In sum, the research provides an innovative justification for mitigating economic growth's dependence on stable oil prices in our sample countries.

Keywords: oil price volatility, renewable energy, economic growth, developed and developing economies

Procedia PDF Downloads 62
735 Statistical Inferences for GQARCH-Itô-Jumps Model Based on The Realized Range Volatility

Authors: Fu Jinyu, Lin Jinguan

Abstract:

This paper introduces a novel approach that unifies two types of models: one is the continuous-time jump-diffusion used to model high-frequency data, and the other is the discrete-time GQARCH employed to model low-frequency financial data, by embedding the discrete GQARCH structure with jumps in the instantaneous volatility process. This model is named the “GQARCH-Itô-Jumps model.” We adopt the realized range-based threshold estimation for high-frequency financial data rather than the realized return-based volatility estimators, which entail the loss of intra-day information on the price movement. Meanwhile, a quasi-likelihood function for the low-frequency GQARCH structure with jumps is developed for the parametric estimate. The asymptotic theories are mainly established for the proposed estimators in the case of finite activity jumps. Moreover, simulation studies are implemented to check the finite sample performance of the proposed methodology. Specifically, it is demonstrated how our proposed approaches can be practically used on some financial data.

Keywords: Itô process, GQARCH, leverage effects, threshold, realized range-based volatility estimator, quasi-maximum likelihood estimate

Procedia PDF Downloads 135
734 Combining the Dynamic Conditional Correlation and Range-GARCH Models to Improve Covariance Forecasts

Authors: Piotr Fiszeder, Marcin Fałdziński, Peter Molnár

Abstract:

The dynamic conditional correlation model of Engle (2002) is one of the most popular multivariate volatility models. However, this model is based solely on closing prices. It has been documented in the literature that the high and low price of the day can be used in an efficient volatility estimation. We, therefore, suggest a model which incorporates high and low prices into the dynamic conditional correlation framework. Empirical evaluation of this model is conducted on three datasets: currencies, stocks, and commodity exchange-traded funds. The utilisation of realized variances and covariances as proxies for true variances and covariances allows us to reach a strong conclusion that our model outperforms not only the standard dynamic conditional correlation model but also a competing range-based dynamic conditional correlation model.
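The claim that daily high and low prices carry usable volatility information can be illustrated with the classical Parkinson range estimator. It is shown only as a familiar example of a range-based variance estimate; it is not the Range-GARCH or DCC specification of the paper, and the function name and sample prices are assumptions.

```python
import math


def parkinson_variance(highs, lows):
    """Parkinson estimator: mean of ln(H/L)^2 divided by 4*ln(2)."""
    n = len(highs)
    return sum(math.log(h / l) ** 2 for h, l in zip(highs, lows)) / (4.0 * n * math.log(2.0))


no_range = parkinson_variance([50.0, 50.0], [50.0, 50.0])  # high equals low every day
unit_range = parkinson_variance([math.e], [1.0])            # ln(H/L) = 1 on a single day
daily = parkinson_variance([101.5, 102.0, 100.8], [99.0, 100.2, 98.9])
```

A close-to-close estimator sees nothing on a day that ends where it began, whereas the range still records how far prices travelled during the day.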

Keywords: volatility, DCC model, high and low prices, range-based models, covariance forecasting

Procedia PDF Downloads 157
733 Max-Entropy Feed-Forward Clustering Neural Network

Authors: Xiaohan Bookman, Xiaoyan Zhu

Abstract:

The outputs of a non-linear feed-forward neural network are positive and can be treated as probabilities when they are normalized to sum to one. If we take the Entropy-Based Principle into consideration, the outputs for each sample can be represented as the distribution of this sample over different clusters. The Entropy-Based Principle is the principle by which we can estimate an unknown distribution under some limited conditions. As this paper defines two processes in the feed-forward neural network, our limited condition is the abstracted features of samples, which are worked out in the abstraction process, and the final outputs are the probability distribution over different clusters in the clustering process. By incorporating the Entropy-Based Principle into the feed-forward neural network, a clustering method is obtained. We have conducted experiments on six open UCI data sets, comparing with a few baselines and using purity as the measurement. The results illustrate that our method outperforms all the baselines, which are among the most popular clustering methods.

Keywords: feed-forward neural network, clustering, max-entropy principle, probabilistic models

Procedia PDF Downloads 417
732 Clustering of Extremes in Financial Returns: A Comparison between Developed and Emerging Markets

Authors: Sara Ali Alokley, Mansour Saleh Albarrak

Abstract:

This paper investigates the dependency, or clustering, of extremes in financial returns data by estimating the extremal index θ∈[0,1]. The smaller the value of θ, the stronger the clustering. We apply the method of Ferro and Segers (2003) to estimate the extremal index for a range of threshold values. We compare the dependency structure of extremes in developed and emerging markets, using the returns of the stock market indices of the developed markets of the US, the UK, France, Germany and Japan and the emerging markets of Brazil, Russia, India, China and Saudi Arabia. We expect more clustering to occur in the emerging markets. This study will help in understanding the dependency structure of financial returns data.
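The intervals estimator of Ferro and Segers (2003) computes θ from the gaps between successive threshold exceedances. The sketch below is a plain-Python illustration under stated assumptions: the function name `intervals_estimator` and the toy series are hypothetical, exceedances are taken as values strictly above the threshold, and for lower-tail extremes (losses) one would pass the negated returns.

```python
def intervals_estimator(returns, threshold):
    """Intervals estimator of the extremal index (Ferro and Segers, 2003)."""
    times = [i for i, x in enumerate(returns) if x > threshold]
    if len(times) < 2:
        raise ValueError("need at least two exceedances")
    # Inter-exceedance times between consecutive exceedances.
    gaps = [b - a for a, b in zip(times, times[1:])]
    n = len(gaps)
    if max(gaps) <= 2:
        num = 2.0 * sum(gaps) ** 2
        den = n * sum(g * g for g in gaps)
    else:
        # Bias-adjusted form, used when some gap is longer than 2.
        num = 2.0 * sum(g - 1 for g in gaps) ** 2
        den = n * sum((g - 1) * (g - 2) for g in gaps)
    return min(1.0, num / den)

# Hypothetical series with exceedances bunched in groups of three.
clustered = [0.0] * 21
for i in (0, 1, 2, 10, 11, 12, 20):
    clustered[i] = 5.0
theta_clustered = intervals_estimator(clustered, 1.0)
```

Repeating the call over a grid of thresholds gives the θ-versus-threshold profile that the abstract describes.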

Keywords: clustering, extremes, returns, dependency, extremal index

Procedia PDF Downloads 379
731 An Energy Efficient Clustering Approach for Underwater Wireless Sensor Networks

Authors: Mohammad Reza Taherkhani

Abstract:

Wireless sensor networks used to monitor a particular environment are formed from a large number of sensor nodes. The role of these sensors is to sense particular parameters of the ambient environment and to establish a connection. In these networks, the most important challenge is the management of energy usage. Clustering is one of the methods widely used to address this challenge. In this paper, a distributed clustering protocol based on learning automata is proposed for underwater wireless sensor networks. The proposed algorithm, called LA-Clustering, forms clusters at the same energy level, based on the energy level of the nodes and the connection radius, regardless of the size and structure of the sensor network. The proposed approach is simulated and compared with other protocols on metrics such as network lifetime, number of alive nodes, and amount of transmitted data. The simulation results demonstrate the efficiency of the proposed approach.

Keywords: underwater sensor networks, clustering, learning automata, energy consumption

Procedia PDF Downloads 337
730 Time Series Regression with Meta-Clusters

Authors: Monika Chuchro

Abstract:

This paper presents a preliminary attempt to classify time series using meta-clusters in order to improve the quality of regression models. Clustering was used to obtain normally distributed subgroups of time series data from inflow data for a waste water treatment plant, which is composed of several groups differing in mean value. Two simple algorithms, K-means and EM, were chosen as clustering methods. The Rand index was used to measure similarity. After simple meta-clustering, a regression model was fitted for each subgroup. The final model was the sum of the subgroup models. The quality of the resulting model was compared with that of a regression model built with the same explanatory variables but without clustering the data. Results were compared using the coefficient of determination (R²), the mean absolute percentage error (MAPE) as a measure of prediction accuracy, and a visual comparison on a line chart. Preliminary results indicate the potential of the presented technique.
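The two numeric comparison measures named here, the coefficient of determination and MAPE, follow directly from their standard definitions. A minimal sketch, with hypothetical function names and made-up inflow values:

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical observed and modelled inflow values.
observed = [100.0, 200.0, 300.0, 400.0]
modelled = [110.0, 190.0, 310.0, 390.0]
r2 = r_squared(observed, modelled)
error_pct = mape(observed, modelled)
```

Computing both measures for the clustered and unclustered models on the same observations gives the comparison the abstract describes.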

Keywords: clustering, data analysis, data mining, predictive models

Procedia PDF Downloads 447
729 A Learning Automata Based Clustering Approach for Underwater Sensor Networks to Reduce Energy Consumption

Authors: Motahareh Fadaei

Abstract:

Wireless sensor networks used to monitor a particular environment are formed from a large number of sensor nodes. The role of these sensors is to sense particular parameters of the ambient environment and to establish a connection. In these networks, the most important challenge is the management of energy usage. Clustering is one of the methods widely used to address this challenge. In this paper, a distributed clustering protocol based on learning automata is proposed for underwater wireless sensor networks. The proposed algorithm, called LA-Clustering, forms clusters at the same energy level, based on the energy level of the nodes and the connection radius, regardless of the size and structure of the sensor network. The proposed approach is simulated and compared with other protocols on metrics such as network lifetime, number of alive nodes, and amount of transmitted data. The simulation results demonstrate the efficiency of the proposed approach.

Keywords: clustering, energy consumption, learning automata, underwater sensor networks

Procedia PDF Downloads 296
728 Knowledge Representation Based on Interval Type-2 CFCM Clustering

Authors: Lee Myung-Won, Kwak Keun-Chang

Abstract:

This paper is concerned with knowledge representation and extraction of fuzzy if-then rules using Interval Type-2 Context-based Fuzzy C-Means clustering (IT2-CFCM) with the aid of fuzzy granulation. The proposed clustering algorithm is based on information granulation in the form of IT2 based Fuzzy C-Means (IT2-FCM) clustering and estimates the cluster centers by preserving the homogeneity between the clustered patterns from the IT2 contexts produced in the output space. Furthermore, we can obtain automatic knowledge representation in the design of Radial Basis Function Networks (RBFN), Linguistic Models (LM), and Adaptive Neuro-Fuzzy Networks (ANFN) from numerical input-output data pairs. We focus on the design of an ANFN in this paper. The experimental results on an energy performance estimation problem show that the proposed method achieves good knowledge representation and performance compared with previous works.

Keywords: IT2-FCM, IT2-CFCM, context-based fuzzy clustering, adaptive neuro-fuzzy network, knowledge representation

Procedia PDF Downloads 299
727 Intelligent Recognition of Diabetes Disease via FCM Based Attribute Weighting

Authors: Kemal Polat

Abstract:

In this paper, an attribute weighting method called fuzzy C-means clustering based attribute weighting (FCMAW) is used for classification of a diabetes dataset. The aims of this study are to reduce the variance within the attributes of the diabetes dataset and to improve the classification accuracy of classifier algorithms by transforming a non-linearly separable dataset into a linearly separable one. The Pima Indians Diabetes dataset has two classes: normal subjects (500 instances) and diabetes subjects (268 instances). Fuzzy C-means clustering is an improved version of the K-means clustering method and is one of the most widely used clustering methods in data mining and machine learning applications. In the first stage, fuzzy C-means clustering is used to find the centers of the attributes in the Pima Indians diabetes dataset, and the dataset is then weighted according to the ratios of the attribute means to their centers. In the second stage, after weighting, support vector machine (SVM) and k-nearest neighbor (k-NN) classifiers are used to classify the weighted Pima Indians diabetes dataset. Experimental results show that the proposed attribute weighting method (FCMAW) obtains very promising results in the classification of the Pima Indians diabetes dataset.
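The weighting stage can be sketched with a small fuzzy C-means routine. This is one possible reading of the abstract, not the authors' code. As an assumption, the "center" of an attribute is taken to be the average of the cluster centers along that attribute, and the weight is the attribute mean divided by that center. All function names, parameters, and the toy data are hypothetical.

```python
import math
import random

def fuzzy_c_means(rows, n_clusters=2, m=2.0, n_iter=50, seed=0):
    """Standard fuzzy C-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    # Random initial memberships, each row normalized to sum to one.
    u = []
    for _ in range(n):
        raw = [rng.random() + 1e-9 for _ in range(n_clusters)]
        s = sum(raw)
        u.append([r / s for r in raw])
    centers = []
    for _ in range(n_iter):
        # Center update: membership^m weighted mean of the samples.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * rows[i][j] for i in range(n)) / total for j in range(d)]
            )
        # Membership update: proportional to distance^(-2 / (m - 1)).
        for i in range(n):
            dist = [max(math.dist(rows[i], c), 1e-12) for c in centers]
            inv = [dk ** (-2.0 / (m - 1.0)) for dk in dist]
            s = sum(inv)
            u[i] = [v / s for v in inv]
    return centers, u

def fcm_attribute_weights(rows, centers):
    """Assumed weighting: attribute mean / mean of cluster centers on that attribute."""
    n, d = len(rows), len(rows[0])
    weights = []
    for j in range(d):
        attr_mean = sum(r[j] for r in rows) / n
        attr_center = sum(c[j] for c in centers) / len(centers)
        weights.append(attr_mean / attr_center)
    return weights

def apply_weights(rows, weights):
    """Scale every attribute of every sample by its weight."""
    return [[x * w for x, w in zip(r, weights)] for r in rows]

# Hypothetical two-attribute toy data with two visible groups.
data = [[1.0, 10.0], [1.2, 11.0], [5.0, 50.0], [5.5, 52.0]]
centers, memberships = fuzzy_c_means(data)
weighted = apply_weights(data, fcm_attribute_weights(data, centers))
```

The weighted rows would then be passed to a classifier such as an SVM or k-NN, as the abstract describes.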

Keywords: fuzzy C-means clustering, fuzzy C-means clustering based attribute weighting, Pima Indians diabetes, SVM

Procedia PDF Downloads 389
726 Method of Cluster Based Cross-Domain Knowledge Acquisition for Biologically Inspired Design

Authors: Shen Jian, Hu Jie, Ma Jin, Peng Ying Hong, Fang Yi, Liu Wen Hai

Abstract:

Biologically inspired design inspires inventions and new technologies in the field of engineering by mimicking functions, principles, and structures in the biological domain. To deal with the obstacles of cross-domain knowledge acquisition in the existing biologically inspired design process, functional semantic clustering based on functional feature semantic correlation and environmental constraint clustering composition based on environmental characteristic constraining adaptability are proposed. A knowledge cell clustering algorithm and the corresponding prototype system are developed. Finally, the effectiveness of the method is verified through the design of a visual prosthetic device.

Keywords: knowledge clustering, knowledge acquisition, knowledge based engineering, knowledge cell, biologically inspired design

Procedia PDF Downloads 408
725 Pattern Recognition Using Feature Based Die-Map Clustering in the Semiconductor Manufacturing Process

Authors: Seung Hwan Park, Cheng-Sool Park, Jun Seok Kim, Youngji Yoo, Daewoong An, Jun-Geol Baek

Abstract:

As big data analysis becomes increasingly important, yield prediction using data from the semiconductor process is essential. In general, yield prediction and analysis of the causes of failure are closely related. The purpose of this study is to analyze patterns that affect the final test results using die-map based clustering. Many studies have been conducted using die data from the semiconductor test process. However, such analysis is limited because the test data are less directly related to the final test results. Therefore, this study proposes a framework for analysis through clustering using more detailed data than existing die data. The study consists of three phases. In the first phase, a die map is created from the fail bit data in each sub-area of the die. In the second phase, clustering is performed on the map data. In the third phase, patterns that affect the final test result are identified. Finally, the proposed three phases are applied to actual industrial data, and the experimental results show the potential for field application.

Keywords: die-map clustering, feature extraction, pattern recognition, semiconductor manufacturing process

Procedia PDF Downloads 379