Search results for: Cluster Basis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1411

1291 A Literature Review on the Effect of Industrial Clusters and the Absorptive Capacity on Innovation

Authors: Enrique Claver Cortés, Bartolomé Marco Lajara, Eduardo Sánchez García, Pedro Seva Larrosa, Encarnación Manresa Marhuenda, Lorena Ruiz Fernández, Esther Poveda Pareja

Abstract:

In recent decades, the analysis of the effects of clustering as an essential factor for the development of innovations and the competitiveness of enterprises has raised great interest in different areas. Nowadays, companies have access to almost all tangible and intangible resources located and/or developed in any country in the world. However, despite the obvious advantages that this situation entails for companies, their geographical location has shown itself, increasingly clearly, to be a fundamental factor that positively influences their innovative performance and competitiveness. Industrial clusters could represent a unique level of analysis, positioned between the individual company and the industry, which makes them an ideal unit of analysis to determine the effects derived from company membership of a cluster. Also, the absorptive capacity (hereinafter 'AC') can mediate the process of innovation development by companies located in a cluster. The transformation and exploitation of knowledge could have a mediating effect between knowledge acquisition and innovative performance. The main objective of this work is to determine the key factors that affect the degree of generation and use of knowledge from the environment by companies and, consequently, their innovative performance and competitiveness. The elements analyzed are the companies' membership of a cluster and the AC. To this end, the 30 most relevant papers published on this subject in the "Web of Science" database have been reviewed. Our findings show that, within a cluster, the knowledge coming from the companies' environment can significantly influence their innovative performance and competitiveness, although in this relationship the degree to which companies access and exploit this knowledge plays a fundamental role, and it depends on a series of elements both internal and external to the company.

Keywords: Absorptive capacity, clusters, innovation, knowledge.

Downloads: 896
1290 Quantity and Quality Aware Artificial Bee Colony Algorithm for Clustering

Authors: U. Idachaba, F. Z. Wang, A. Qi, N. Helian

Abstract:

The Artificial Bee Colony (ABC) algorithm is a relatively new swarm intelligence technique for clustering. It produces higher quality clusters than other population-based algorithms, but it suffers from poor energy efficiency, inconsistent cluster quality and typically slower convergence. Inspired by the energy-saving foraging behavior of natural honey bees, this paper presents a Quality and Quantity Aware Artificial Bee Colony (Q2ABC) algorithm to improve the quality of cluster identification, energy efficiency and convergence speed of the original ABC. To evaluate the performance of the Q2ABC algorithm, experiments were conducted on a suite of ten benchmark UCI datasets. The results demonstrate that Q2ABC outperformed the ABC and K-means algorithms in the quality of clusters delivered.

Keywords: Artificial bee colony algorithm, clustering.

Downloads: 2120
1289 Ensembling Adaptively Constructed Polynomial Regression Models

Authors: Gints Jekabsons

Abstract:

The approach of subset selection in polynomial regression model building assumes that the chosen fixed full set of predefined basis functions contains a subset that is sufficient to describe the target relation sufficiently well. However, in most cases the necessary set of basis functions is not known and needs to be guessed – a potentially non-trivial (and long) trial and error process. In our research we consider a potentially more efficient approach – Adaptive Basis Function Construction (ABFC). It lets the model building method itself construct the basis functions necessary for creating a model of arbitrary complexity with adequate predictive performance. However, there are two issues that to some extent plague the methods of both the subset selection and the ABFC, especially when working with relatively small data samples: the selection bias and the selection instability. We try to correct these issues by model post-evaluation using Cross-Validation and model ensembling. To evaluate the proposed method, we empirically compare it to ABFC methods without ensembling, to a widely used method of subset selection, as well as to some other well-known regression modeling methods, using publicly available data sets.

Keywords: Basis function construction, heuristic search, model ensembles, polynomial regression.

Downloads: 1673
1288 Customer Segmentation Model in E-commerce Using Clustering Techniques and LRFM Model: The Case of Online Stores in Morocco

Authors: Rachid Ait daoud, Abdellah Amine, Belaid Bouikhalene, Rachid Lbibb

Abstract:

Given the increase in the number of e-commerce sites, the number of competitors has become very large. This means that companies have to take appropriate decisions in order to meet the expectations of their customers and satisfy their needs. In this paper, we present a case study applying the LRFM (length, recency, frequency and monetary) model and clustering techniques in the electronic commerce sector, with a view to evaluating the value of customers of Moroccan e-commerce websites and then developing effective marketing strategies. To achieve these objectives, we adopt the LRFM model by applying a two-stage clustering method. In the first stage, the self-organizing maps method is used to determine the best number of clusters and the initial centroids. In the second stage, the k-means method is applied to segment 730 customers into nine clusters according to their L, R, F and M values. The results show that cluster 6 is the most important cluster because its average values of L, R, F and M are higher than the overall average values. In addition, this study has considered another variable that describes the mode of payment used by customers, in order to improve and strengthen the cluster analysis. The cluster analysis demonstrates that the payment method is one of the key indicators of a new index that makes it possible to assess the level of customers' confidence in the company's website.
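
As an illustration of the L, R, F and M values that feed the clustering stages, here is a minimal sketch that derives them for one customer from a list of dated purchases. The abstract does not spell out the definitions, so the ones used here (days between first and last purchase, days since the last purchase, purchase count, total spend) are assumed conventions, and the function name `lrfm` and the sample orders are hypothetical.

```python
from datetime import date

def lrfm(purchases, period_end):
    """Compute (L, R, F, M) for one customer.

    purchases: list of (date, amount) tuples; period_end: last day of the
    observation window. Definitions are assumed conventions, not the paper's.
    """
    dates = sorted(d for d, _ in purchases)
    length = (dates[-1] - dates[0]).days       # L: days between first and last purchase
    recency = (period_end - dates[-1]).days    # R: days since the last purchase
    frequency = len(purchases)                 # F: number of purchases
    monetary = sum(a for _, a in purchases)    # M: total amount spent
    return length, recency, frequency, monetary

# Hypothetical customer with three orders
orders = [(date(2015, 1, 10), 40.0), (date(2015, 3, 1), 25.0), (date(2015, 6, 20), 60.0)]
features = lrfm(orders, period_end=date(2015, 6, 30))
```

Each customer's four values would then form one row of the input to the self-organizing map and k-means stages; in practice the columns are usually rescaled first so that no single variable dominates the distance.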

Keywords: Customer value, LRFM model, Cluster analysis, Self-Organizing Maps method (SOM), K-means algorithm, loyalty.

Downloads: 6253
1287 Face Recognition using Radial Basis Function Network based on LDA

Authors: Byung-Joo Oh

Abstract:

This paper describes a method to improve the robustness of a face recognition system based on the combination of two compensating classifiers. The face images are preprocessed by appearance-based statistical approaches such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). LDA features of the face image are taken as the input of the Radial Basis Function Network (RBFN). The proposed approach has been tested on the ORL database. The experimental results show that the LDA+RBFN algorithm achieved a recognition rate of 93.5%.

Keywords: Face recognition, linear discriminant analysis, radial basis function network.

Downloads: 2122
1286 Performance Optimization of Data Mining Application Using Radial Basis Function Classifier

Authors: M. Govindarajan, R. M. Chandrasekaran

Abstract:

Text data mining is a process of exploratory data analysis. Classification maps data into predefined groups or classes. It is often referred to as supervised learning because the classes are determined before examining the data. This paper describes a proposed radial basis function classifier that performs comparative cross-validation for the existing radial basis function classifier. The feasibility and the benefits of the proposed approach are demonstrated by means of a data mining problem: direct marketing. Direct marketing has become an important application field of data mining. Comparative cross-validation involves estimation of accuracy by either stratified k-fold cross-validation or equivalent repeated random subsampling. While the proposed method may have high bias, its performance (accuracy estimation in our case) may be poor due to high variance. Thus the accuracy with the proposed radial basis function classifier was less than with the existing radial basis function classifier. However, there is a smaller improvement in runtime and a larger improvement in precision and recall. In the proposed method, classification accuracy and prediction accuracy are determined, and the prediction accuracy is comparatively high.

Keywords: Text Data Mining, Comparative Cross-validation, Radial Basis Function, runtime, accuracy.

Downloads: 1554
1285 Multidimensional Data Mining by Means of Randomly Travelling Hyper-Ellipsoids

Authors: Pavel Y. Tabakov, Kevin Duffy

Abstract:

The present study presents a new approach to automatic data clustering and classification problems in large and complex databases and, at the same time, derives specific types of explicit rules describing each cluster. The method works well in both sparse and dense multidimensional data spaces. The members of the data space can be of the same nature or represent different classes. A number of N-dimensional ellipsoids are used for enclosing the data clouds. Due to the geometry of an ellipsoid and its free rotation in space the detection of clusters becomes very efficient. The method is based on genetic algorithms that are used for the optimization of location, orientation and geometric characteristics of the hyper-ellipsoids. The proposed approach can serve as a basis for the development of general knowledge systems for discovering hidden knowledge and unexpected patterns and rules in various large databases.

Keywords: Classification, clustering, data mining, genetic algorithms.

Downloads: 1772
1284 Indexing and Searching of Image Data in Multimedia Databases Using Axial Projection

Authors: Khalid A. Kaabneh

Abstract:

This paper introduces and studies new indexing techniques for content-based queries in image databases. Indexing is the key to providing sophisticated, accurate and fast searches for queries in image data. This research describes a new indexing approach, which depends on linear modeling of signals, using bases for modeling. A basis is a set of chosen images, and modeling an image is a least-squares approximation of the image as a linear combination of the basis images. The coefficients of the basis images are taken together to serve as the index for that image. The paper describes the implementation of the indexing scheme, and presents the findings of our extensive evaluation that was conducted to optimize (1) the choice of the basis matrix (B), and (2) the size of the index A (N). Furthermore, we compare the performance of our indexing scheme with other schemes. Our results show that our scheme has significantly higher performance.
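
The least-squares modeling step described above can be sketched as follows: the index of an image is the coefficient vector that best reconstructs it from the basis images. This is an illustrative sketch only, not the paper's implementation; the function names `solve` and `index_of` are hypothetical, images are assumed to be flattened into lists of numbers, and the basis images are assumed to be linearly independent so that the normal equations have a unique solution.

```python
def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                       # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def index_of(image, basis):
    """Least-squares coefficients of `image` over the basis images (normal equations)."""
    gram = [[sum(p * q for p, q in zip(bi, bj)) for bj in basis] for bi in basis]
    rhs = [sum(p * q for p, q in zip(bi, image)) for bi in basis]
    return solve(gram, rhs)

# Two hypothetical 2x2 basis images, flattened row by row
basis = [[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
image = [5.0, 3.0, 1.0, 1.0]
coeffs = index_of(image, basis)   # one coefficient per basis image
```

Two images can then be compared through their short coefficient vectors instead of their full pixel arrays, which is what makes the coefficients usable as an index.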

Keywords: Axial Projection, images, indexing, multimedia database, searching.

Downloads: 1387
1283 Using Pattern Search Methods for Minimizing Clustering Problems

Authors: Parvaneh Shabanzadeh, Malik Hj Abu Hassan, Leong Wah June, Maryam Mohagheghtabar

Abstract:

Clustering is an interesting data mining topic that can be applied in many fields. Recently, the problem of cluster analysis has been formulated as a problem of nonsmooth, nonconvex optimization, and an algorithm for solving the cluster analysis problem based on nonsmooth optimization techniques has been developed. This optimization problem has a number of characteristics that make it challenging: it has many local minima, the optimization variables can be either continuous or categorical, and there are no exact analytical derivatives. In this study we show how to apply a particular class of optimization methods known as pattern search methods to address these challenges. These methods do not explicitly use derivatives, an important feature that has not been addressed in previous studies. Results of numerical experiments are presented which demonstrate the effectiveness of the proposed method.

Keywords: Clustering functions, Non-smooth Optimization, Nonconvex Optimization, Pattern Search Method.

Downloads: 1640
1282 Using Spectral Vectors and M-Tree for Graph Clustering and Searching in Graph Databases of Protein Structures

Authors: Do Phuc, Nguyen Thi Kim Phung

Abstract:

In this paper, we represent protein structures as graphs, so that a protein structure database becomes a graph database. Each graph is represented by a spectral vector. We use the Jacobi rotation algorithm to calculate the eigenvalues of the normalized Laplacian representation of the adjacency matrix of the graph. To measure the similarity between two graphs, we calculate the Euclidean distance between the two graph spectral vectors. To cluster the graphs, we use an M-tree with the Euclidean distance to cluster the spectral vectors. In addition, the M-tree can be used for graph searching in the graph database. Our proposed method was tested with a graph database of 100 graphs representing 100 protein structures downloaded from the Protein Data Bank (PDB), and we compare the result with the SCOP hierarchical structure.
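
The pipeline from graph to spectral vector to distance can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: graphs are undirected with no isolated vertices, the spectral vector is taken to be the sorted list of normalized-Laplacian eigenvalues, and the two graphs compared have the same number of vertices (the abstract does not say how graphs of different sizes are handled). All function names are hypothetical.

```python
import math

def normalized_laplacian(adj):
    """L = I - D^{-1/2} A D^{-1/2} for an undirected graph with no isolated vertices."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    return [[(1.0 if i == j else 0.0) - adj[i][j] / math.sqrt(deg[i] * deg[j])
             for j in range(n)] for i in range(n)]

def jacobi_eigenvalues(a, sweeps=50, tol=1e-24):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in a]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Angle that zeroes the (p, q) entry after the rotation
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):      # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):      # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

def spectral_vector(adj):
    return jacobi_eigenvalues(normalized_laplacian(adj))

def spectral_distance(u, v):
    """Euclidean distance between two spectral vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]       # 3-vertex path
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]   # 3-vertex complete graph
d = spectral_distance(spectral_vector(path), spectral_vector(triangle))
```

Because the distance is a plain Euclidean metric on fixed-length vectors, the vectors can be stored in any metric index; the abstract uses an M-tree for that purpose.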

Keywords: Eigenvalues, m-tree, graph database, protein structure, spectral graph theory.

Downloads: 1656
1281 Predicting Protein Interaction Sites Based on a New Integrated Radial Basis Functional Neural Network

Authors: Xiaoli Shen, Yuehui Chen

Abstract:

Interactions among proteins are the basis of various life events, so it is important to recognize and research protein interaction sites. A control set that contains 149 protein molecules was used here. Then 10 features were extracted, and 4 sample sets that contained 9 sliding windows were made according to the features. These 4 sample sets were each processed by Radial Basis Functional neural networks optimized by Particle Swarm Optimization, giving 4 groups of results. Finally, these 4 groups of results were integrated by decision fusion (DF) and Genetic Algorithm based Selected Ensemble (GASEN). Better accuracy was obtained with DF and GASEN, so the integrated methods proved to be effective.

Keywords: protein interaction sites, features, sliding windows, radial basis functional neural networks, genetic algorithm based selected ensemble.

Downloads: 1421
1280 Growing Self Organising Map Based Exploratory Analysis of Text Data

Authors: Sumith Matharage, Damminda Alahakoon

Abstract:

Textual data plays an important role in the modern world. The possibilities of applying data mining techniques to uncover hidden information present in large volumes of text collections are immense. The Growing Self Organizing Map (GSOM) is a highly successful member of the Self Organising Map family and has been used as a clustering and visualisation tool across a wide range of disciplines to discover hidden patterns present in the data. A comprehensive analysis of the GSOM's capabilities as a text clustering and visualisation tool has so far not been published. These functionalities, namely map visualisation capabilities, automatic cluster identification and hierarchical clustering capabilities, are presented in this paper and are further demonstrated with experiments on a benchmark text corpus.

Keywords: Text Clustering, Growing Self Organizing Map, Automatic Cluster Identification, Hierarchical Clustering.

Downloads: 1996
1279 Energy-Aware Routing in Mobile Wireless Sensor Networks

Authors: R. Geetha, G. Umarani Srikanth, S. Prabhu

Abstract:

Wireless sensor networks are resource-constrained networks in which energy is the major resource. Therefore, energy conservation is a major aspect in the deployment of a wireless sensor network. This work makes use of an extended Greedy Perimeter Stateless Routing (eGPSR) protocol that mainly focuses on energy-efficient data transmission. This data transmission is based on the fact that a message sent to a distant node consumes more energy than a message sent over a short range. Every cluster contains a head set that consists of many virtual cluster heads. Routing is decided by head set members. The energy level of the received signal is the major constraint in choosing the head set from its members. The experimental results show that the use of eGPSR in routing improves throughput with comparatively less delay.

Keywords: eGPSR, energy efficiency, routing, wireless sensor networks, WSN.

Downloads: 926
1278 Clustering Multivariate Empiric Characteristic Functions for Multi-Class SVM Classification

Authors: María-Dolores Cubiles-de-la-Vega, Rafael Pino-Mejías, Esther-Lydia Silva-Ramírez

Abstract:

A dissimilarity measure between the empiric characteristic functions of the subsamples associated with the different classes in a multivariate data set is proposed. This measure can be efficiently computed, and it depends on all the cases of each class. It may be used to find groups of similar classes, which could be joined for further analysis, or it could be employed to perform an agglomerative hierarchical cluster analysis of the set of classes. The final tree can serve to build a family of binary classification models, offering an alternative approach to the multi-class SVM problem. We have compared this dendrogram-based SVM approach with the one-against-one SVM approach over four publicly available data sets, three of them being microarray data. Both performances have been found equivalent, but the first solution requires a smaller number of binary SVM models.
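
The difference in model counts can be made concrete with a short calculation. The sketch below assumes that the dendrogram is a binary tree with one binary SVM per merge (internal node), which the abstract implies but does not state; one-against-one trains one model per pair of classes. The function names are hypothetical.

```python
def one_against_one_models(k):
    """Binary classifiers needed when every pair of the k classes gets its own model."""
    return k * (k - 1) // 2

def dendrogram_models(k):
    """Binary classifiers needed with one model per internal node of a binary tree with k leaves."""
    return k - 1

# (one-against-one, dendrogram) counts for a few class counts
counts = {k: (one_against_one_models(k), dendrogram_models(k)) for k in (3, 5, 10)}
```

Under these assumptions the pairwise scheme grows quadratically with the number of classes while the tree-based scheme grows linearly.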

Keywords: Cluster Analysis, Empiric Characteristic Function, Multi-class SVM, R.

Downloads: 1877
1277 Environmental Interference Cancellation of Speech with the Radial Basis Function Networks: An Experimental Comparison

Authors: Nima Hatami

Abstract:

In this paper, we use Radial Basis Function Networks (RBFN) to solve the problem of environmental interference cancellation of speech signals. We show that the Second Order Thin-Plate Spline (SOTPS) kernel cancels the interferences effectively. For comparison, we also run our experiments with the two most commonly used conventional RBFN kernels: the Gaussian and First Order TPS (FOTPS) basis functions. The speech signals used here were taken from the OGI Multi-Language Telephone Speech Corpus database and were corrupted with six types of environmental noise from the NOISEX-92 database. Experimental results show that the SOTPS kernel can considerably outperform the Gaussian and FOTPS functions on the speech interference cancellation problem.

Keywords: Environmental interference, interference cancellation of speech, Radial Basis Function networks, Gaussian and TPS kernels.

Downloads: 1563
1276 The Role of Knowledge Management in Innovation: Spanish Evidence

Authors: María Jesús Luengo-Valderrey, Mónica Moso-Díez

Abstract:

In the knowledge-based economy, innovation is considered essential in order to achieve survival and growth in organizations. On the other hand, knowledge management is currently understood as one of the keys to the innovation process. Both factors are generally accepted as generators of competitive advantage in organizations. Specifically, R&D&I activities and those that generate internal knowledge have a positive influence on innovation results. This paper examines this effect and aims to quantify whether or not it is similar. We focus on the impact that the proportion of knowledge workers, the R&D&I investment, the amounts destined for ICTs and training for innovation have on the variation of tangible and intangible returns for the high and medium technology sector in Spain. To do this, we have performed an empirical analysis on the results of questionnaires about innovation in enterprises in Spain, collected by the National Statistics Institute. First, using cluster methodology, the behavior of these enterprises regarding knowledge management is identified. Then, using SEM methodology, we studied, for each cluster, the cause-effect relationships among constructs defined through variables, setting their type and quantification. The cluster analysis results in four groups, in which clusters 1 and 3 present the best performance in innovation, with differentiating nuances between them, while clusters 2 and 4 obtained divergent results from a similar innovative effort. However, the results of the SEM analysis for each cluster show that, in all cases, knowledge workers are those that affect innovation performance most, regardless of the level of investment, and that there is a strong correlation between knowledge workers and investment in knowledge generation.
The main finding is that Spanish high and medium technology companies improve their innovation performance by investing in internal knowledge generation measures, especially in terms of R&D activities, and underinvest in external ones. This, and the strong correlation between knowledge workers and the set of activities that promote knowledge generation, should be taken into account by company managers when making decisions about their investments for innovation, since they are key for improving their opportunities in the global market.

Keywords: High and medium technology sector, innovation, knowledge management, Spanish companies.

Downloads: 2196
1275 An Adaptive Fuzzy Clustering Approach for the Network Management

Authors: Amal Elmzabi, Mostafa Bellafkih, Mohammed Ramdani

Abstract:

Chiu's method, which generates a Takagi-Sugeno Fuzzy Inference System (FIS), is a method of fuzzy rule extraction. The rule output is a linear function of the inputs. In addition, these rules are not explicit for the expert. In this paper, we develop a method which generates a Mamdani FIS, where the rule output is fuzzy. The method proceeds in two steps: first, it uses the subtractive clustering principle to estimate both the number of clusters and the initial locations of the cluster centers. Each obtained cluster corresponds to a Mamdani fuzzy rule. Then, it optimizes the fuzzy model parameters by applying a genetic algorithm. This method is illustrated on a traffic network management application. We also suggest a Mamdani fuzzy rule generation method for the case where the expert wants to classify the output variables into some predefined fuzzy classes.
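
The first step, estimating the number and initial locations of cluster centers by subtractive clustering, can be sketched as follows. This is a simplified illustration, not the authors' method: the potential formula with the constants 4/ra² and rb = 1.5·ra follows the commonly cited form of subtractive clustering and is assumed here, and the single `stop_ratio` threshold is a simplification of the usual accept/reject criteria. The function name and the sample points are hypothetical.

```python
import math

def subtractive_clustering(points, ra=1.0, stop_ratio=0.15):
    """Pick cluster centers by repeatedly taking the point with the highest potential."""
    rb = 1.5 * ra                                   # assumed penalty radius
    alpha, beta = 4.0 / ra ** 2, 4.0 / rb ** 2

    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Potential of a point: how densely it is surrounded by other points
    potential = [sum(math.exp(-alpha * d2(p, q)) for q in points) for p in points]
    centers, first_peak = [], max(potential)
    while True:
        best = max(range(len(points)), key=lambda i: potential[i])
        peak = potential[best]
        if peak < stop_ratio * first_peak:          # simplified stopping rule
            break
        centers.append(points[best])
        # Subtract the new center's influence so nearby points are not picked again
        potential = [pi - peak * math.exp(-beta * d2(p, points[best]))
                     for pi, p in zip(potential, points)]
    return centers

# Two hypothetical, well-separated groups of points
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers = subtractive_clustering(data)
```

In the scheme described in the abstract, each center found this way would seed one Mamdani rule, with the genetic algorithm refining the rule parameters afterwards.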

Keywords: Fuzzy entropy, fuzzy inference systems, genetic algorithms, network management, subtractive clustering.

Downloads: 1883
1274 Liver Lesion Extraction with Fuzzy Thresholding in Contrast Enhanced Ultrasound Images

Authors: Abder-Rahman Ali, Adélaïde Albouy-Kissi, Manuel Grand-Brochier, Viviane Ladan-Marcus, Christine Hoeffl, Claude Marcus, Antoine Vacavant, Jean-Yves Boire

Abstract:

In this paper, we present a new segmentation approach for focal liver lesions in contrast enhanced ultrasound imaging. This approach, based on a two-cluster Fuzzy C-Means methodology, considers type-II fuzzy sets to handle uncertainty due to the image modality (presence of speckle noise, low contrast, etc.), and to calculate the optimum inter-cluster threshold. Fine boundaries are detected by a local recursive merging of ambiguous pixels. The method has been tested on a representative database. Compared to both Otsu and type-I Fuzzy C-Means techniques, the proposed method significantly reduces the segmentation errors.

Keywords: Defuzzification, fuzzy clustering, image segmentation, type-II fuzzy sets.

Downloads: 2290
1273 An Energy Aware Data Aggregation in Wireless Sensor Network Using Connected Dominant Set

Authors: M. Santhalakshmi, P Suganthi

Abstract:

Wireless Sensor Networks (WSNs) have many advantages. Their deployment is easier and faster than wired sensor networks or other wireless networks, as they do not need fixed infrastructure. Nodes are partitioned into many small groups named clusters to aggregate data through network organization. WSN clustering guarantees performance achievement of sensor nodes. Sensor node energy consumption is reduced by eliminating redundant energy use and balancing energy use among sensor nodes across the network. The aim of such clustering protocols is to prolong network life. Low Energy Adaptive Clustering Hierarchy (LEACH) is a popular protocol in WSN. LEACH is a clustering protocol in which random rotation of local cluster heads is used to distribute the energy load among all sensor nodes in the network. This paper proposes Connected Dominant Set (CDS) based cluster formation. CDS-based data aggregation is a promising approach for reducing routing overhead, since messages are transmitted only within the virtual backbone formed by the CDS, and data aggregation lowers the ratio of responding hosts to the hosts existing in virtual backbones. CDS tries to increase network lifetime by considering parameters such as sensor lifetime and remaining and consumed energy, in order to achieve almost optimal data aggregation within networks. Experimental results showed that CDS outperformed LEACH regarding number of cluster formations, average packet loss rate, average end-to-end delay, life computation, and remaining energy computation.

Keywords: Wireless sensor network, connected dominant set, clustering, data aggregation.

Downloads: 1129
1272 Multi-Agent Systems for Intelligent Clustering

Authors: Jung-Eun Park, Kyung-Whan Oh

Abstract:

Intelligent systems are required in order to quickly and accurately analyze enormous quantities of data in the Internet environment. In intelligent systems, information extracting processes can be divided into supervised learning and unsupervised learning. This paper investigates intelligent clustering by unsupervised learning. Intelligent clustering is a clustering system which determines the clustering model for data analysis and evaluates results by itself. This system can make a clustering model more rapidly, objectively and accurately than an analyst. The methodology for the automatic clustering intelligent system is a multi-agent system that comprises a clustering agent and a cluster performance evaluation agent. An agent exchanges information about clusters with another agent, and the system determines the optimal cluster number through this information. Experiments using data sets in the UCI Machine Learning Repository are performed in order to prove the validity of the system.

Keywords: Intelligent Clustering, Multi-Agent System, PCA, SOM, VC (Variance Criterion).

Downloads: 1727
1271 A Software of Intrusion Detection Mechanism for Virtual Platforms

Authors: Ying-Chuan Chen, Shuen-Tai Wang

Abstract:

Security is an interesting and significant issue for popular virtual platforms, such as virtualization clusters and cloud platforms. Virtualization is a powerful technology for cloud computing services. Virtual machine tools, called hypervisors, offer many benefits: they can quickly deploy all kinds of virtual operating systems on a single platform, control all virtual system resources effectively, reduce the cost of system platform deployment, and provide customization, high elasticity and high reliability. However, some important security problems need to be addressed and resolved in virtual platforms, including viruses, malicious programs, illegal operations and intrusion behavior. In this paper, we present Intrusion Detection Mechanism (IDM) software that not only automatically analyzes all of the system's operations using the accounting journal database, but is also able to monitor the system's state for virtual platforms.

Keywords: security, cluster, cloud, virtualization, virtual machine, virus, intrusion detection

Downloads: 1546
1270 Simultaneous Clustering and Feature Selection Method for Gene Expression Data

Authors: T. Chandrasekhar, K. Thangavel, E. N. Sathishkumar

Abstract:

Microarrays have made it possible to simultaneously monitor the expression profiles of thousands of genes under various experimental conditions. They are used to identify the co-expressed genes in specific cells or tissues that are actively used to make proteins. This method is used to analyze gene expression, an important task in bioinformatics research. Cluster analysis of gene expression data has proved to be a useful tool for identifying co-expressed genes and biologically relevant groupings of genes and samples. In this work, the K-Means algorithm has been applied for clustering of gene expression data. Further, the rough set based Quick Reduct algorithm has been applied to each cluster in order to select the most similar genes having high correlation. Then the ACV measure is used to evaluate the refined clusters, and classification is used to evaluate the proposed method. The method could identify compact clusters, with the feature selection method used to select the genes.

Keywords: Clustering, Feature selection, Gene expression data, Quick reduct.

Downloads: 1967
1269 Clustering Protein Sequences with Tailored General Regression Model Technique

Authors: G. Lavanya Devi, Allam Appa Rao, A. Damodaram, GR Sridhar, G. Jaya Suma

Abstract:

Cluster analysis divides data into groups that are meaningful, useful, or both. Analysis of biological data is creating a new generation of epidemiologic, prognostic, diagnostic and treatment modalities. Clustering of protein sequences is one of the current research topics in the field of computer science. A linear relation is valuable in rule discovery for given data, such as "if value X goes up 1, value Y will go down 3", etc. Classical linear regression models the linear relation of two sequences perfectly. However, if we need to cluster a large repository of protein sequences into groups where sequences have a strong linear relationship with each other, it is prohibitively expensive to compare sequences one by one. In this paper, we propose a new technique named General Regression Model Technique Clustering Algorithm (GRMTCA) to benignly handle the problem of linear sequence clustering. GRMT gives a measure, GR*, to tell the degree of linearity of multiple sequences without having to compare each pair of them.

Keywords: Clustering, General Regression Model, Protein Sequences, Similarity Measure.

Downloads: 1567
1268 Knowledge Representation Based On Interval Type-2 CFCM Clustering

Authors: Myung-Won Lee, Keun-Chang Kwak

Abstract:

This paper is concerned with knowledge representation and extraction of fuzzy if-then rules using Interval Type-2 Context-based Fuzzy C-Means clustering (IT2-CFCM) with the aid of fuzzy granulation. The proposed clustering algorithm is based on information granulation in the form of IT2-based Fuzzy C-Means (IT2-FCM) clustering and estimates the cluster centers by preserving the homogeneity between the clustered patterns from the IT2 contexts produced in the output space. Furthermore, we can obtain the automatic knowledge representation in the design of Radial Basis Function Networks (RBFN), Linguistic Model (LM), and Adaptive Neuro-Fuzzy Networks (ANFN) from the numerical input-output data pairs. We focus on the design of ANFN in this paper. The experimental results on an estimation problem of energy performance reveal that the proposed method shows good knowledge representation and performance in comparison with previous works.

Keywords: IT2-FCM, IT2-CFCM, context-based fuzzy clustering, adaptive neuro-fuzzy network, knowledge representation.

Downloads 2617
1267 Sparsity-Based Unsupervised Unmixing of Hyperspectral Imaging Data Using Basis Pursuit

Authors: Ahmed Elrewainy

Abstract:

Mixing in hyperspectral imaging occurs due to the low spatial resolution of the cameras used. The pure materials ("endmembers") present in the scene share the pixel spectra in different amounts called "abundances". Unmixing of the data cube is an important task for identifying the endmembers present in the cube when analyzing these images. Unsupervised unmixing is done with no information about the given data cube. Sparsity is one of the recent approaches used in source recovery or unmixing techniques. The l1-norm optimization problem "basis pursuit" can be used as a sparsity-based approach to solve this unmixing problem, where the endmembers are assumed to be sparse in an appropriate domain known as a dictionary. This optimization problem is solved using the proximal method "iterative thresholding". The l1-norm basis pursuit optimization problem was used as a sparsity-based unmixing technique to unmix real and synthetic hyperspectral data cubes.
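A minimal sketch of iterative soft thresholding (ISTA) is given below, assuming the penalized form of the problem, min 0.5·||Ax − y||² + lam·||x||₁, rather than the equality-constrained basis pursuit itself. The dictionary `A`, the observation `y`, the weight `lam`, and the function names are hypothetical; the abstract does not specify the dictionary or the parameters used.

```python
import math

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return [math.copysign(max(abs(a) - t, 0.0), a) for a in v]

def ista(A, y, lam, iters=500):
    """Iterative soft thresholding for 0.5 * ||A x - y||^2 + lam * ||x||_1."""
    m, n = len(A), len(A[0])
    # The squared Frobenius norm bounds the Lipschitz constant of the gradient,
    # so its reciprocal is a safe (if conservative) step size.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(iters):
        residual = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        x = soft_threshold([x[j] - step * grad[j] for j in range(n)], step * lam)
    return x

# Toy check with an identity dictionary, where the minimizer is soft_threshold(y, lam).
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, 0.05, -2.0]
x_hat = ista(A, y, lam=0.1)
print(x_hat)
```

The small middle entry is driven exactly to zero, which is the sparsity-promoting effect of the l1 term.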

Keywords: Basis pursuit, blind source separation, hyperspectral imaging, spectral unmixing, wavelets.

Downloads 837
1266 DEA Method for Evaluation of EU Performance

Authors: M. Staníčková

Abstract:

The paper deals with an application of quantitative analysis – the Data Envelopment Analysis (DEA) method – to performance evaluation of the European Union Member States in the reference years 2000 and 2011. The main aim of the paper is to measure efficiency changes over the reference years, to analyze the level of productivity in individual countries based on the DEA method, and to classify the EU Member States into homogeneous units (clusters) according to the efficiency results. The theoretical part is devoted to the fundamental basis of performance theory and the methodology of DEA. The empirical part is aimed at measuring the degree of productivity and the level of efficiency changes of the evaluated countries by the basic DEA model – the CCR CRS model – and a specialized DEA approach – the Malmquist Index, which measures the change of technical efficiency and the movement of the production possibility frontier. Here, the DEA method becomes a suitable tool for setting the competitive/uncompetitive position of each country, because not just one factor is evaluated but a set of different factors that determine the degree of economic development.
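To make the notion of CCR efficiency under constant returns to scale concrete, the sketch below covers only the special case of one input and one output, where the efficiency score reduces to each unit's output/input ratio divided by the best ratio observed. The general multi-factor model used in the paper requires solving a linear program per unit, which is not shown; the function name and the figures are made up.

```python
def ccr_efficiency_single(inputs, outputs):
    """CCR (CRS) efficiency for the one-input, one-output special case.

    Each unit's productivity ratio is compared with the best ratio in the sample,
    so efficient units score 1.0 and the rest score below 1.0.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]

# Three hypothetical decision-making units.
scores = ccr_efficiency_single(inputs=[2.0, 4.0, 5.0], outputs=[4.0, 4.0, 10.0])
print(scores)
```

Units with similar scores could then be grouped into clusters, in the spirit of the classification described above.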

Keywords: CCR CRS model, cluster analysis, DEA method, efficiency, EU, Malmquist index, performance.

Downloads 2619
1265 Fuzzy Clustering of Locations for Degree of Accident Proneness based on Vehicle User Perceptions

Authors: Jayanth Jacob, C. V. Hariharakrishnan, Suganthi L.

Abstract:

The rapid urbanization of cities has a bane in the form of road accidents that cause extensive damage to life and limb. A number of location-based factors are enablers of road accidents in the city. The speed of travel of vehicles is non-uniform among locations within a city. In this study, the perception of vehicle users is captured on a 10-point rating scale regarding the degree of variation in speed of travel at chosen locations in the city. The average rating is used to cluster locations using fuzzy c-means clustering and classify them as low, moderate and high speed of travel locations. The high speed of travel locations can be classified proactively to ensure that accidents do not occur due to the speeding of vehicles at such locations. The advantage of fuzzy c-means clustering is that a location may be a part of more than one cluster to a varying degree, and this gives a better picture of the location with respect to the characteristic (speed of travel) being studied.
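The idea of partial membership can be sketched with a one-dimensional fuzzy c-means on average ratings. This is a minimal version assuming the standard update rules with fuzzifier m = 2; the ratings, the initial centers, and the function names are invented for illustration and are not the study's data.

```python
def memberships(values, centers, m=2.0):
    """Return u[k][i]: the degree to which values[k] belongs to cluster i."""
    exponent = 2.0 / (m - 1.0)
    table = []
    for v in values:
        # Floor the distance so a value sitting exactly on a center cannot divide by zero.
        d = [max(abs(v - c), 1e-12) for c in centers]
        table.append([1.0 / sum((di / dj) ** exponent for dj in d) for di in d])
    return table

def fuzzy_c_means_1d(values, initial_centers, m=2.0, iters=100):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = list(initial_centers)
    for _ in range(iters):
        u = memberships(values, centers, m)
        centers = [
            sum((u[k][i] ** m) * values[k] for k in range(len(values)))
            / sum(u[k][i] ** m for k in range(len(values)))
            for i in range(len(centers))
        ]
    return memberships(values, centers, m), centers

# Hypothetical average ratings (10-point scale) for eight locations.
ratings = [2.0, 2.5, 3.0, 5.0, 5.5, 8.5, 9.0, 9.5]
u, centers = fuzzy_c_means_1d(ratings, initial_centers=[2.0, 5.0, 9.0])
for rating, row in zip(ratings, u):
    print(rating, [round(x, 3) for x in row])
```

Each row of memberships sums to 1, so a location between two centers is reported as partly belonging to both rather than being forced into one.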

Keywords: C-means clustering, Location Specific, Road Accidents.

Downloads 1842
1264 Forest Growth Simulation: Tropical Rain Forest Stand Table Projection

Authors: Yasmin Yahya, Roslan Ismail, Samreth Vanna, Khorn Saret

Abstract:

A study of tree growth for four species groups of commercial timber in the tropical rainforest of Koh Kong province, Cambodia, is described. The simulation for these four groups was successfully developed at 5-year intervals through year 60. Data were obtained from twenty permanent sample plots over a period of thirteen years. The aim of this study was to develop a stand table simulation system of tree growth by species group. There were five steps involved in the development of the tree growth simulation: aggregate the tree species into meaningful groups by using cluster analysis; allocate the trees in the diameter classes by the species group; observe the diameter movement of the species group. The diameter growth rate, mortality rate and recruitment rate were calculated using mathematical formulas. A simulation equation was created by combining those parameters. The results showed dissimilarity in diameter growth among the species groups.
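One common way to combine growth, mortality and recruitment in a stand table projection is sketched below. The abstract does not give the simulation equation, so the update rule, the function name `project_stand_table`, and all rates are assumptions: in each period a share of every diameter class dies, a share of the survivors moves up one class, and recruits enter the smallest class.

```python
def project_stand_table(stand, upgrowth, mortality, recruitment):
    """Project trees per diameter class forward by one period (e.g. 5 years).

    stand[i]     -- trees currently in diameter class i (smallest class first)
    upgrowth[i]  -- share of surviving trees in class i that grow into class i + 1
    mortality[i] -- share of trees in class i that die during the period
    recruitment  -- new trees entering the smallest class
    """
    n = len(stand)
    survivors = [stand[i] * (1.0 - mortality[i]) for i in range(n)]
    projected = [0.0] * n
    for i in range(n):
        # Trees in the largest class have nowhere to move, so they stay put.
        moving = survivors[i] * upgrowth[i] if i < n - 1 else 0.0
        projected[i] += survivors[i] - moving
        if i < n - 1:
            projected[i + 1] += moving
    projected[0] += recruitment
    return projected

# Hypothetical species group with three diameter classes.
next_stand = project_stand_table(
    stand=[100.0, 50.0, 10.0],
    upgrowth=[0.2, 0.1, 0.0],
    mortality=[0.1, 0.1, 0.1],
    recruitment=5.0,
)
print(next_stand)
```

Calling the function repeatedly on its own output would step the table forward period by period, for example twelve times to reach year 60 with 5-year intervals.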

Keywords: cluster analysis, diameter growth, simulation

Downloads 2213
1263 Solar-Inducted Cluster Head Relocation Algorithm

Authors: Goran Djukanovic, Goran Popovic

Abstract:

A special area in the study of Wireless Sensor Networks (WSNs) is how to move sensor nodes, as it expands the scope of application of wireless sensors and provides new opportunities to improve network performance. On the other hand, it opens a set of new problems, especially if complete clusters are mobile. Node mobility can prolong the network lifetime. In such a WSN, some nodes are possibly moveable or nomadic (relocated periodically), while others are static. This paper presents an idea of mobile, solar-powered cluster heads (CHs) that relocate themselves inside clusters in such a way that the total energy consumption in the network is reduced and the lifetime of the network is extended. Positioning of CHs is done in each round based on the selfish herd hypothesis, where the leader retreats to the center of gravity. Based on this idea, an algorithm, together with its modified version, is presented and tested in this paper. Simulation results show that both algorithms improve the network lifetime and prolong the network stability period.
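The "retreat to the center of gravity" step can be pictured with the sketch below, which moves a cluster head toward the unweighted centroid of its member nodes. The per-round movement limit `max_step`, the function names, and the coordinates are assumptions made for illustration; the abstract does not describe how far a CH may travel per round or whether the centroid is weighted.

```python
import math

def center_of_gravity(nodes):
    """Unweighted centroid of the member nodes' (x, y) positions."""
    n = len(nodes)
    return (sum(x for x, _ in nodes) / n, sum(y for _, y in nodes) / n)

def relocate(cluster_head, nodes, max_step):
    """Move the cluster head toward the centroid, by at most max_step per round."""
    gx, gy = center_of_gravity(nodes)
    dx, dy = gx - cluster_head[0], gy - cluster_head[1]
    distance = math.hypot(dx, dy)
    if distance <= max_step:
        return (gx, gy)  # the centroid is reachable within this round
    scale = max_step / distance
    return (cluster_head[0] + dx * scale, cluster_head[1] + dy * scale)

# Hypothetical cluster: four static nodes on a square, CH starting in one corner.
members = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(relocate((0.0, 0.0), members, max_step=1.0))
print(relocate((0.0, 0.0), members, max_step=10.0))
```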

Keywords: CH-active algorithm, mobile cluster head, sensors, wireless sensor network.

Downloads 1038
1262 Comparison of Polynomial and Radial Basis Kernel Functions based SVR and MLR in Modeling Mass Transfer by Vertical and Inclined Multiple Plunging Jets

Authors: S. Deswal, M. Pal

Abstract:

Presently, various computational techniques are used in modeling and analyzing environmental engineering data. In the present study, an intra-comparison of polynomial and radial basis kernel functions based Support Vector Regression and, in turn, an inter-comparison with Multi Linear Regression has been attempted in modeling the mass transfer capacity of vertical (θ = 90°) and inclined (θ multiple plunging jets (varying in number from 1 to 16). The data set used in this study consists of four input parameters with a total of eighty-eight cases, forty-four each for vertical and inclined multiple plunging jets. For testing, tenfold cross validation was used. Correlation coefficient values of 0.971 and 0.981, along with corresponding root mean square error values of 0.0025 and 0.0020, were achieved using polynomial and radial basis kernel functions based Support Vector Regression, respectively. The intra-comparison suggests improved performance by the radial basis function in comparison to the polynomial kernel based Support Vector Regression. Further, the inter-comparison with Multi Linear Regression (correlation coefficient = 0.973 and root mean square error = 0.0024) reveals that radial basis kernel functions based Support Vector Regression performs better in modeling and estimating mass transfer by multiple plunging jets.
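For reference, the two kernel families compared and the two reported metrics can be written out as below. The kernel parameters (`degree`, `coef0`, `gamma`) are illustrative defaults, not the values tuned in the study, and the sample vectors are made up; no Support Vector Regression training is shown.

```python
import math

def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """K(x, z) = (x . z + coef0) ** degree"""
    return (sum(a * b for a, b in zip(x, z)) + coef0) ** degree

def rbf_kernel(x, z, gamma=0.5):
    """K(x, z) = exp(-gamma * ||x - z||^2)"""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def correlation(actual, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

print(polynomial_kernel([1.0, 2.0], [3.0, 4.0]))
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))
```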

Keywords: Mass transfer, multiple plunging jets, polynomial and radial basis kernel functions, Support Vector Regression.

Downloads 1433