Search results for: cluster ensemble methods
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4318

Search results for: cluster ensemble methods

4228 Influence of Ambiguity Cluster on Quality Improvement in Image Compression

Authors: Safaa Al-Ali, Ahmad Shahin, Fadi Chakik

Abstract:

Image coding based on clustering provides immediate access to targeted features of interest in a high quality decoded image. This approach is useful for intelligent devices, as well as for multimedia content-based description standards. The result of image clustering cannot be precise in some positions especially on pixels with edge information which produce ambiguity among the clusters. Even with a good enhancement operator based on PDE, the quality of the decoded image will highly depend on the clustering process. In this paper, we introduce an ambiguity cluster in image coding to represent pixels with vagueness properties. The presence of such cluster allows preserving some details inherent to edges as well for uncertain pixels. It will also be very useful during the decoding phase in which an anisotropic diffusion operator, such as Perona-Malik, enhances the quality of the restored image. This work also offers a comparative study to demonstrate the effectiveness of a fuzzy clustering technique in detecting the ambiguity cluster without losing lot of the essential image information. Several experiments have been carried out to demonstrate the usefulness of ambiguity concept in image compression. The coding results and the performance of the proposed algorithms are discussed in terms of the peak signal-tonoise ratio and the quantity of ambiguous pixels.

Keywords: Ambiguity Cluster, Anisotropic Diffusion, Fuzzy Clustering, Image Compression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1523
4227 A Balanced Cost Cluster-Heads Selection Algorithm for Wireless Sensor Networks

Authors: Ouadoudi Zytoune, Youssef Fakhri, Driss Aboutajdine

Abstract:

This paper focuses on reducing the power consumption of wireless sensor networks. Therefore, a communication protocol named LEACH (Low-Energy Adaptive Clustering Hierarchy) is modified. We extend LEACHs stochastic cluster-head selection algorithm by a modifying the probability of each node to become cluster-head based on its required energy to transmit to the sink. We present an efficient energy aware routing algorithm for the wireless sensor networks. Our contribution consists in rotation selection of clusterheads considering the remoteness of the nodes to the sink, and then, the network nodes residual energy. This choice allows a best distribution of the transmission energy in the network. The cluster-heads selection algorithm is completely decentralized. Simulation results show that the energy is significantly reduced compared with the previous clustering based routing algorithm for the sensor networks.

Keywords: Wireless Sensor Networks, Energy efficiency, WirelessCommunications, Clustering-based algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2597
4226 Study of Functional Relevant Conformational Mobility of β-2 Adrenoreceptor by Means of Molecular Dynamics Simulation

Authors: G. V. Novikov, V. S. Sivozhelezov, S. S. Kolesnikov, K. V. Shaitan

Abstract:

The study reports about the influence of binding of orthosteric ligands as well as point mutations on the conformational dynamics of β-2-adrenoreceptor. Using molecular dynamics simulation we found that there was a little fraction of active states of the receptor in its apo (ligand free) ensemble corresponded to its constitutive activity. Analysis of MD trajectories indicated that such spontaneous activation of the receptor is accompanied by the motion in intracellular part of its alpha-helices. Thus receptor’s constitutive activity directly results from its conformational dynamics. On the other hand the binding of a full agonist resulted in a significant shift of the initial equilibrium towards its active state. Finally, the binding of the inverse agonist stabilized the receptor in its inactive state. It is likely that the binding of inverse agonists might be a universal way of constitutive activity inhibition in vivo. Our results indicate that ligand binding redistribute pre-existing conformational degrees of freedom (in accordance to the Monod-Wyman-Changeux-Model) of the receptor rather than cause induced fit in it. Therefore, the ensemble of biologically relevant receptor conformations is encoded in its spatial structure, and individual conformations from that ensemble might be used by the cell in conformity with the physiological behavior.

Keywords: Seven-transmembrane receptors, constitutive activity, activation, x-ray crystallography, principal component analysis, molecular dynamics simulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3908
4225 Meta Random Forests

Authors: Praveen Boinee, Alessandro De Angelis, Gian Luca Foresti

Abstract:

Leo Breimans Random Forests (RF) is a recent development in tree based classifiers and quickly proven to be one of the most important algorithms in the machine learning literature. It has shown robust and improved results of classifications on standard data sets. Ensemble learning algorithms such as AdaBoost and Bagging have been in active research and shown improvements in classification results for several benchmarking data sets with mainly decision trees as their base classifiers. In this paper we experiment to apply these Meta learning techniques to the random forests. We experiment the working of the ensembles of random forests on the standard data sets available in UCI data sets. We compare the original random forest algorithm with their ensemble counterparts and discuss the results.

Keywords: Random Forests [RF], ensembles, UCI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2644
4224 Creation of Greater Mekong Subregion Regional Competitiveness through Cluster Mapping

Authors: Danuvasin Charoen

Abstract:

This research investigates cluster development in the area called the Greater Mekong Subregion (GMS), which consists of Thailand, the People’s Republic of China (PRC), the Yunnan Province and Guangxi Zhuang Autonomous Region, Myanmar, the Lao People’s Democratic Republic (Lao PDR), Cambodia, and Vietnam. The study utilized Porter’s competitiveness theory and the cluster mapping approach to analyze the competitiveness of the region. The data collection consists of interviews, focus groups, and the analysis of secondary data. The findings identify some evidence of cluster development in the GMS; however, there is no clear indication of collaboration among the components in the clusters. GMS clusters tend to be stand-alone. The clusters in Vietnam, Lao PDR, Myanmar, and Cambodia tend to be labor intensive, whereas the clusters in Thailand and the PRC (Yunnan) have the potential to successfully develop into innovative clusters. The collaboration and integration among the clusters in the GMS area are promising, though it could take a long time. The most likely relationship between the GMS countries could be, for example, suppliers of the low-end, labor-intensive products will be located in the low income countries such as Myanmar, Lao PDR, and Cambodia, and these countries will be providing input materials for innovative clusters in the middle income countries such as Thailand and the PRC.

Keywords: Greater Mekong Subregion, competitiveness, cluster, development.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1014
4223 Idiopathic Constipation can be Subdivided in Clinical Subtypes: Data Mining by Cluster Analysis on a Population based Study

Authors: Mauro Giacomini, Stefania Bertone, Carlo Mansi, Pietro Dulbecco, Vincenzo Savarino

Abstract:

The prevalence of non organic constipation differs from country to country and the reliability of the estimate rates is uncertain. Moreover, the clinical relevance of subdividing the heterogeneous functional constipation disorders into pre-defined subgroups is largely unknown.. Aim: to estimate the prevalence of constipation in a population-based sample and determine whether clinical subgroups can be identified. An age and gender stratified sample population from 5 Italian cities was evaluated using a previously validated questionnaire. Data mining by cluster analysis was used to determine constipation subgroups. Results: 1,500 complete interviews were obtained from 2,083 contacted households (72%). Self-reported constipation correlated poorly with symptombased constipation found in 496 subjects (33.1%). Cluster analysis identified four constipation subgroups which correlated to subgroups identified according to pre-defined symptom criteria. Significant differences in socio-demographics and lifestyle were observed among subgroups.

Keywords: Cluster analysis, constipation, data mining, statistical analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1252
4222 Determining Cluster Boundaries Using Particle Swarm Optimization

Authors: Anurag Sharma, Christian W. Omlin

Abstract:

Self-organizing map (SOM) is a well known data reduction technique used in data mining. Data visualization can reveal structure in data sets that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOMs, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of a generic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOMs. The application of our method to unlabeled call data for a mobile phone operator demonstrates its feasibility. PSO algorithm utilizes U-matrix of SOMs to determine cluster boundaries; the results of this novel automatic method correspond well to boundary detection through visual inspection of code vectors and k-means algorithm.

Keywords: Particle swarm optimization, self-organizing maps, clustering, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1672
4221 Innovation Development of Food Market of Kazakhstan

Authors: G.B. Nurlikhina, G.N. Azhimetova, R. M. Ashimova

Abstract:

Currently, one of the main directions is developing of development based on the clustering of economic operations of Kazakhstan, providing for the organization and concentration of production capacity in one region or the most optimal system. In the modern economic literature clustering is regarded as one of the most effective tools to ensure competitive businesses, and improve their business itself.

Keywords: A cluster, food market, innovation cluster.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2204
4220 UDCA: An Energy Efficient Clustering Algorithm for Wireless Sensor Network

Authors: Boregowda S.B., Hemanth Kumar A.R. Babu N.V, Puttamadappa C., And H.S Mruthyunjaya

Abstract:

In the past few years, the use of wireless sensor networks (WSNs) potentially increased in applications such as intrusion detection, forest fire detection, disaster management and battle field. Sensor nodes are generally battery operated low cost devices. The key challenge in the design and operation of WSNs is to prolong the network life time by reducing the energy consumption among sensor nodes. Node clustering is one of the most promising techniques for energy conservation. This paper presents a novel clustering algorithm which maximizes the network lifetime by reducing the number of communication among sensor nodes. This approach also includes new distributed cluster formation technique that enables self-organization of large number of nodes, algorithm for maintaining constant number of clusters by prior selection of cluster head and rotating the role of cluster head to evenly distribute the energy load among all sensor nodes.

Keywords: Clustering algorithms, Cluster head, Energy consumption, Sensor nodes, and Wireless sensor networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2328
4219 Specific Frequency of Globular Clusters in Different Galaxy Types

Authors: Ahmed H. Abdullah, Pavel Kroupa

Abstract:

Globular clusters (GC) are important objects for tracing the early evolution of a galaxy. We study the correlation between the cluster population and the global properties of the host galaxy. We found that the correlation between cluster population (NGC) and the baryonic mass (Mb) of the host galaxy are best described as 10 −5.6038Mb. In order to understand the origin of the U -shape relation between the GC specific frequency (SN) and Mb (caused by the high value of SN for dwarfs galaxies and giant ellipticals and a minimum SN for intermediate mass galaxies≈ 1010M), we derive a theoretical model for the specific frequency (SNth). The theoretical model for SNth is based on the slope of the power-law embedded cluster mass function (β) and different time scale (Δt) of the forming galaxy. Our results show a good agreement between the observation and the model at a certain β and Δt. The model seems able to reproduce higher value of SNth of β = 1.5 at the midst formation time scale.

Keywords: Galaxies, dwarf, globular cluster, specific frequency, formation time scale.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 754
4218 Fuzzy Clustering of Categorical Attributes and its Use in Analyzing Cultural Data

Authors: George E. Tsekouras, Dimitris Papageorgiou, Sotiris Kotsiantis, Christos Kalloniatis, Panagiotis Pintelas

Abstract:

We develop a three-step fuzzy logic-based algorithm for clustering categorical attributes, and we apply it to analyze cultural data. In the first step the algorithm employs an entropy-based clustering scheme, which initializes the cluster centers. In the second step we apply the fuzzy c-modes algorithm to obtain a fuzzy partition of the data set, and the third step introduces a novel cluster validity index, which decides the final number of clusters.

Keywords: Categorical data, cultural data, fuzzy logic clustering, fuzzy c-modes, cluster validity index.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1661
4217 A Trainable Neural Network Ensemble for ECG Beat Classification

Authors: Atena Sajedin, Shokoufeh Zakernejad, Soheil Faridi, Mehrdad Javadi, Reza Ebrahimpour

Abstract:

This paper illustrates the use of a combined neural network model for classification of electrocardiogram (ECG) beats. We present a trainable neural network ensemble approach to develop customized electrocardiogram beat classifier in an effort to further improve the performance of ECG processing and to offer individualized health care. We process a three stage technique for detection of premature ventricular contraction (PVC) from normal beats and other heart diseases. This method includes a denoising, a feature extraction and a classification. At first we investigate the application of stationary wavelet transform (SWT) for noise reduction of the electrocardiogram (ECG) signals. Then feature extraction module extracts 10 ECG morphological features and one timing interval feature. Then a number of multilayer perceptrons (MLPs) neural networks with different topologies are designed. The performance of the different combination methods as well as the efficiency of the whole system is presented. Among them, Stacked Generalization as a proposed trainable combined neural network model possesses the highest recognition rate of around 95%. Therefore, this network proves to be a suitable candidate in ECG signal diagnosis systems. ECG samples attributing to the different ECG beat types were extracted from the MIT-BIH arrhythmia database for the study.

Keywords: ECG beat Classification; Combining Classifiers;Premature Ventricular Contraction (PVC); Multi Layer Perceptrons;Wavelet Transform

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2164
4216 Performance Assessment of Multi-Level Ensemble for Multi-Class Problems

Authors: Rodolfo Lorbieski, Silvia Modesto Nassar

Abstract:

Many supervised machine learning tasks require decision making across numerous different classes. Multi-class classification has several applications, such as face recognition, text recognition and medical diagnostics. The objective of this article is to analyze an adapted method of Stacking in multi-class problems, which combines ensembles within the ensemble itself. For this purpose, a training similar to Stacking was used, but with three levels, where the final decision-maker (level 2) performs its training by combining outputs from the tree-based pair of meta-classifiers (level 1) from Bayesian families. These are in turn trained by pairs of base classifiers (level 0) of the same family. This strategy seeks to promote diversity among the ensembles forming the meta-classifier level 2. Three performance measures were used: (1) accuracy, (2) area under the ROC curve, and (3) time for three factors: (a) datasets, (b) experiments and (c) levels. To compare the factors, ANOVA three-way test was executed for each performance measure, considering 5 datasets by 25 experiments by 3 levels. A triple interaction between factors was observed only in time. The accuracy and area under the ROC curve presented similar results, showing a double interaction between level and experiment, as well as for the dataset factor. It was concluded that level 2 had an average performance above the other levels and that the proposed method is especially efficient for multi-class problems when compared to binary problems.

Keywords: Stacking, multi-layers, ensemble, multi-class.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1044
4215 Fuzzy Relatives of the CLARANS Algorithm With Application to Text Clustering

Authors: Mohamed A. Mahfouz, M. A. Ismail

Abstract:

This paper introduces new algorithms (Fuzzy relative of the CLARANS algorithm FCLARANS and Fuzzy c Medoids based on randomized search FCMRANS) for fuzzy clustering of relational data. Unlike existing fuzzy c-medoids algorithm (FCMdd) in which the within cluster dissimilarity of each cluster is minimized in each iteration by recomputing new medoids given current memberships, FCLARANS minimizes the same objective function minimized by FCMdd by changing current medoids in such away that that the sum of the within cluster dissimilarities is minimized. Computing new medoids may be effected by noise because outliers may join the computation of medoids while the choice of medoids in FCLARANS is dictated by the location of a predominant fraction of points inside a cluster and, therefore, it is less sensitive to the presence of outliers. In FCMRANS the step of computing new medoids in FCMdd is modified to be based on randomized search. Furthermore, a new initialization procedure is developed that add randomness to the initialization procedure used with FCMdd. Both FCLARANS and FCMRANS are compared with the robust and linearized version of fuzzy c-medoids (RFCMdd). Experimental results with different samples of the Reuter-21578, Newsgroups (20NG) and generated datasets with noise show that FCLARANS is more robust than both RFCMdd and FCMRANS. Finally, both FCMRANS and FCLARANS are more efficient and their outputs are almost the same as that of RFCMdd in terms of classification rate.

Keywords: Data Mining, Fuzzy Clustering, Relational Clustering, Medoid-Based Clustering, Cluster Analysis, Unsupervised Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2356
4214 Parallelization of Ensemble Kalman Filter (EnKF) for Oil Reservoirs with Time-lapse Seismic Data

Authors: Md Khairullah, Hai-Xiang Lin, Remus G. Hanea, Arnold W. Heemink

Abstract:

In this paper we describe the design and implementation of a parallel algorithm for data assimilation with ensemble Kalman filter (EnKF) for oil reservoir history matching problem. The use of large number of observations from time-lapse seismic leads to a large turnaround time for the analysis step, in addition to the time consuming simulations of the realizations. For efficient parallelization it is important to consider parallel computation at the analysis step. Our experiments show that parallelization of the analysis step in addition to the forecast step has good scalability, exploiting the same set of resources with some additional efforts.

Keywords: EnKF, Data assimilation, Parallel computing, Parallel efficiency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2222
4213 Proposal to Increase the Efficiency, Reliability and Safety of the Centre of Data Collection Management and Their Evaluation Using Cluster Solutions

Authors: Martin Juhas, Bohuslava Juhasova, Igor Halenar, Andrej Elias

Abstract:

This article deals with the possibility of increasing efficiency, reliability and safety of the system for teledosimetric data collection management and their evaluation as a part of complex study for activity “Research of data collection, their measurement and evaluation with mobile and autonomous units” within project “Research of monitoring and evaluation of non-standard conditions in the area of nuclear power plants”. Possible weaknesses in existing system are identified. A study of available cluster solutions with possibility of their deploying to analysed system is presented

Keywords: Teledosimetric data, efficiency, reliability, safety, cluster solution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1515
4212 Performance Comparison of Particle Swarm Optimization with Traditional Clustering Algorithms used in Self-Organizing Map

Authors: Anurag Sharma, Christian W. Omlin

Abstract:

Self-organizing map (SOM) is a well known data reduction technique used in data mining. It can reveal structure in data sets through data visualization that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOM, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of an adaptive heuristic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOM. The application of our method to several standard data sets demonstrates its feasibility. PSO algorithm utilizes a so-called U-matrix of SOM to determine cluster boundaries; the results of this novel automatic method compare very favorably to boundary detection through traditional algorithms namely k-means and hierarchical based approach which are normally used to interpret the output of SOM.

Keywords: cluster boundaries, clustering, code vectors, data mining, particle swarm optimization, self-organizing maps, U-matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1869
4211 Communities of Ammonia-oxidizing Archaea and Bacteria in Enriched Nitrifying Activated Sludge

Authors: Puntipar Sonthiphand, Tawan Limpiyakorn

Abstract:

In this study, communities of ammonia-oxidizing archaea (AOA) and ammonia-oxidizing bacteria (AOB) in nitrifying activated sludge (NAS) prepared by enriching sludge from a municipal wastewater treatment plant in three continuous-flow reactors receiving an inorganic medium containing different ammonium concentrations of 2, 10, and 30 mM NH4 +-N (NAS2, NAS10, and NAS30, respectively) were investigated using molecular analysis. Results suggested that almost all AOA clones from NAS2, NAS10, and NAS30 fell into the same AOA cluster and AOA communities in NAS2 and NAS10 were more diverse than those of NAS30. In contrast to AOA, AOB communities obviously shifted from the seed sludge to enriched NASs and in each enriched NAS, communities of AOB varied particularly. The seed sludge contained members of N. communis cluster and N. oligotropha cluster. After it was enriched under various ammonium loads, members of N. communis cluster disappeared from all enriched NASs. AOB with high affinity to ammonia presented in NAS 2, AOB with low affinity to ammonia presented in NAS 30, and both types of AOB survived in NAS 10. These demonstrated that ammonium load significantly influenced AOB communities, but not AOA communities in enriched NASs.

Keywords: ammonia-oxidizing bacteria, ammonia-oxidizingarchaea, nitrifying activated sludge.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1542
4210 Outlier Pulse Detection and Feature Extraction for Wrist Pulse Analysis

Authors: Bhaskar Thakker, Anoop Lal Vyas

Abstract:

Wrist pulse analysis for identification of health status is found in Ancient Indian as well as Chinese literature. The preprocessing of wrist pulse is necessary to remove outlier pulses and fluctuations prior to the analysis of pulse pressure signal. This paper discusses the identification of irregular pulses present in the pulse series and intricacies associated with the extraction of time domain pulse features. An approach of Dynamic Time Warping (DTW) has been utilized for the identification of outlier pulses in the wrist pulse series. The ambiguity present in the identification of pulse features is resolved with the help of first derivative of Ensemble Average of wrist pulse series. An algorithm for detecting tidal and dicrotic notch in individual wrist pulse segment is proposed.

Keywords: Wrist Pulse Segment, Ensemble Average, Dynamic Time Warping (DTW), Pulse Similarity Vector.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2036
4209 Ensembling Classifiers – An Application toImage Data Classification from Cherenkov Telescope Experiment

Authors: Praveen Boinee, Alessandro De Angelis, Gian Luca Foresti

Abstract:

Ensemble learning algorithms such as AdaBoost and Bagging have been in active research and shown improvements in classification results for several benchmarking data sets with mainly decision trees as their base classifiers. In this paper we experiment to apply these Meta learning techniques with classifiers such as random forests, neural networks and support vector machines. The data sets are from MAGIC, a Cherenkov telescope experiment. The task is to classify gamma signals from overwhelmingly hadron and muon signals representing a rare class classification problem. We compare the individual classifiers with their ensemble counterparts and discuss the results. WEKA a wonderful tool for machine learning has been used for making the experiments.

Keywords: Ensembles, WEKA, Neural networks [NN], SupportVector Machines [SVM], Random Forests [RF].

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1724
4208 Optimizing Approach for Sifting Process to Solve a Common Type of Empirical Mode Decomposition Mode Mixing

Authors: Saad Al-Baddai, Karema Al-Subari, Elmar Lang, Bernd Ludwig

Abstract:

Empirical mode decomposition (EMD), a new data-driven of time-series decomposition, has the advantage of supposing that a time series is non-linear or non-stationary, as is implicitly achieved in Fourier decomposition. However, the EMD suffers of mode mixing problem in some cases. The aim of this paper is to present a solution for a common type of signals causing of EMD mode mixing problem, in case a signal suffers of an intermittency. By an artificial example, the solution shows superior performance in terms of cope EMD mode mixing problem comparing with the conventional EMD and Ensemble Empirical Mode decomposition (EEMD). Furthermore, the over-sifting problem is also completely avoided; and computation load is reduced roughly six times compared with EEMD, an ensemble number of 50.

Keywords: Empirical mode decomposition, mode mixing, sifting process, over-sifting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 947
4207 The Effects of Yield and Yield Components of Some Quality Increase Applications on Razakı Grape Variety

Authors: Şehri Çınar, Aydın Akın

Abstract:

This study was conducted Razakı grape variety (Vitis vinifera L.) and its vine which was aged 19 was grown on 5 BB rootstock in a vegetation period of 2014 in Afyon province in Turkey. In this research, it was investigated whether the applications of Control (C), 1/3 Cluster Tip Reduction (1/3 CTR), Shoot Tip Reduction (STR), 1/3 CTR + STR, Boric Acid (BA), 1/3 CTR + BA, STR + BA, 1/3 CTR + STR + BA on yield and yield components of Razakı grape variety. The results were obtained as the highest fresh grape yield (7.74 kg/vine) with C application; as the highest cluster weight (244.62 g) with STR application; as the highest 100 berry weight (504.08 g) with C application; as the highest maturity index (36.89) with BA application; as the highest must yield (695.00 ml) with BA and (695.00 ml) with 1/3 CTR + STR + BA applications; as the highest intensity of L* color (46.93) with STR and (46.10) with 1/3 CTR + STR + BA applications; as the highest intensity of a* color (-5.37) with 1/3 CTR + STR and (-5.01) with STR, as the highest intensity of b* color (12.59) with STR application. The shoot tip reduction to increase cluster weight and boric acid application to increase maturity index of Razakı grape variety can be recommended.

Keywords: Razakı, 1/3 cluster tip reduction, shoot tip reduction, boric acid, yield and yield components.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3513
4206 On the Noise Distance in Robust Fuzzy C-Means

Authors: M. G. C. A. Cimino, G. Frosini, B. Lazzerini, F. Marcelloni

Abstract:

In the last decades, a number of robust fuzzy clustering algorithms have been proposed to partition data sets affected by noise and outliers. Robust fuzzy C-means (robust-FCM) is certainly one of the most known among these algorithms. In robust-FCM, noise is modeled as a separate cluster and is characterized by a prototype that has a constant distance δ from all data points. Distance δ determines the boundary of the noise cluster and therefore is a critical parameter of the algorithm. Though some approaches have been proposed to automatically determine the most suitable δ for the specific application, up to today an efficient and fully satisfactory solution does not exist. The aim of this paper is to propose a novel method to compute the optimal δ based on the analysis of the distribution of the percentage of objects assigned to the noise cluster in repeated executions of the robust-FCM with decreasing values of δ . The extremely encouraging results obtained on some data sets found in the literature are shown and discussed.

Keywords: noise prototype, robust fuzzy clustering, robustfuzzy C-means

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1772
4205 A New Algorithm for Cluster Initialization

Authors: Moth'd Belal. Al-Daoud

Abstract:

Clustering is a very well known technique in data mining. One of the most widely used clustering techniques is the k-means algorithm. Solutions obtained from this technique are dependent on the initialization of cluster centers. In this article we propose a new algorithm to initialize the clusters. The proposed algorithm is based on finding a set of medians extracted from a dimension with maximum variance. The algorithm has been applied to different data sets and good results are obtained.

Keywords: clustering, k-means, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2057
4204 Hybrid Recommender Systems using Social Network Analysis

Authors: Kyoung-Jae Kim, Hyunchul Ahn

Abstract:

This study proposes novel hybrid social network analysis and collaborative filtering approach to enhance the performance of recommender systems. The proposed model selects subgroups of users in Internet community through social network analysis (SNA), and then performs clustering analysis using the information about subgroups. Finally, it makes recommendations using cluster-indexing CF based on the clustering results. This study tries to use the cores in subgroups as an initial seed for a conventional clustering algorithm. This model chooses five cores which have the highest value of degree centrality from SNA, and then performs clustering analysis by using the cores as initial centroids (cluster centers). Then, the model amplifies the impact of friends in social network in the process of cluster-indexing CF.

Keywords: Social network analysis, Recommender systems, Collaborative filtering, Customer relationship management

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2716
4203 Regression Approach for Optimal Purchase of Hosts Cluster in Fixed Fund for Hadoop Big Data Platform

Authors: Haitao Yang, Jianming Lv, Fei Xu, Xintong Wang, Yilin Huang, Lanting Xia, Xuewu Zhu

Abstract:

Given a fixed fund, purchasing fewer hosts of higher capability or inversely more of lower capability is a must-be-made trade-off in practices for building a Hadoop big data platform. An exploratory study is presented for a Housing Big Data Platform project (HBDP), where typical big data computing is with SQL queries of aggregate, join, and space-time condition selections executed upon massive data from more than 10 million housing units. In HBDP, an empirical formula was introduced to predict the performance of host clusters potential for the intended typical big data computing, and it was shaped via a regression approach. With this empirical formula, it is easy to suggest an optimal cluster configuration. The investigation was based on a typical Hadoop computing ecosystem HDFS+Hive+Spark. A proper metric was raised to measure the performance of Hadoop clusters in HBDP, which was tested and compared with its predicted counterpart, on executing three kinds of typical SQL query tasks. Tests were conducted with respect to factors of CPU benchmark, memory size, virtual host division, and the number of element physical host in cluster. The research has been applied to practical cluster procurement for housing big data computing.

Keywords: Hadoop platform planning, optimal cluster scheme at fixed-fund, performance empirical formula, typical SQL query tasks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 783
4202 Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach

Authors: Rajvir Kaur, Jeewani Anupama Ginige

Abstract:

With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is at the stage of its transition from clinician oriented to technology oriented. Many people around the world die of cancer because the diagnosis of disease was not done at an early stage. Nowadays, the computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper aims to carry out the comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is carried out based on standard evaluation metrics Precision (P), Recall (R), F1-score and Accuracy. The experimental results based on the evaluation metrics show that ANN showed the highest-level accuracy (99.4%) when tested with breast cancer dataset. On the other hand, when these ML classifiers are tested with the cervical cancer dataset, Ensemble (Bagged Tree) technique gave better accuracy (93.1%) in comparison to other classifiers.

Keywords: Artificial neural networks, breast cancer, cancer dataset, classifiers, cervical cancer, F-score, logistic regression, machine learning, precision, recall, support vector machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1489
4201 Multi-Sensor Target Tracking Using Ensemble Learning

Authors: Bhekisipho Twala, Mantepu Masetshaba, Ramapulana Nkoana

Abstract:

Multiple classifier systems combine several individual classifiers to deliver a final classification decision. However, an increasingly controversial question is whether such systems can outperform the single best classifier, and if so, what form of multiple classifiers system yields the most significant benefit. Also, multi-target tracking detection using multiple sensors is an important research field in mobile techniques and military applications. In this paper, several multiple classifiers systems are evaluated in terms of their ability to predict a system’s failure or success for multi-sensor target tracking tasks. The Bristol Eden project dataset is utilised for this task. Experimental and simulation results show that the human activity identification system can fulfil requirements of target tracking due to improved sensors classification performances with multiple classifier systems constructed using boosting achieving higher accuracy rates.

Keywords: Single classifier, machine learning, ensemble learning, multi-sensor target tracking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 531
4200 A Comparison of Fuzzy Clustering Algorithms to Cluster Web Messages

Authors: Sara El Manar El Bouanani, Ismail Kassou

Abstract:

Our objective in this paper is to propose an approach capable of clustering web messages. The clustering is carried out by assigning, with a certain probability, texts written by the same web user to the same cluster based on Stylometric features and using fuzzy clustering algorithms. Focus in the present work is on comparing the most popular algorithms in fuzzy clustering theory namely, Fuzzy C-means, Possibilistic C-means and Fuzzy Possibilistic C-Means.

Keywords: Authorship detection, fuzzy clustering, profiling, stylometric features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2005
4199 Soft Computing Based Cluster Head Selection in Wireless Sensor Network Using Bacterial Foraging Optimization Algorithm

Authors: A. Rajagopal, S. Somasundaram, B. Sowmya, T. Suguna

Abstract:

Wireless Sensor Networks (WSNs) enable new applications and need non-conventional paradigms for the protocol because of energy and bandwidth constraints, In WSN, sensor node’s life is a critical parameter. Research on life extension is based on Low-Energy Adaptive Clustering Hierarchy (LEACH) scheme, which rotates Cluster Head (CH) among sensor nodes to distribute energy consumption over all network nodes. CH selection in WSN affects network energy efficiency greatly. This study proposes an improved CH selection for efficient data aggregation in sensor networks. This new algorithm is based on Bacterial Foraging Optimization (BFO) incorporated in LEACH.

Keywords: Bacterial Foraging Optimization (BFO), Cluster Head (CH), Data-aggregation protocols, Low-Energy Adaptive Clustering Hierarchy (LEACH).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3428