Search results for: definable subset
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 115

55 An SVM based Classification Method for Cancer Data using Minimum Microarray Gene Expressions

Authors: R. Mallika, V. Saravanan

Abstract:

This paper presents a novel method for improving cancer classification performance using very few microarray gene expression data. The method employs classification with individual gene ranking and gene subset ranking. The same classifier is used for both selection and classification. The method is applied to three publicly available cancer gene expression datasets: Lymphoma, Liver and Leukaemia. Three different classifiers, namely Support vector machines-one against all (SVM-OAA), K nearest neighbour (KNN) and Linear Discriminant analysis (LDA), were tested. The results indicate improved performance of the SVM-OAA classifier, with satisfactory results on all three datasets when compared with the other two classifiers.

Keywords: Support vector machines-one against all, cancer classification, Linear Discriminant analysis, K nearest neighbour, microarray gene expression, gene pair ranking.

Downloads 2500
54 The Effect of Increment in Simulation Samples on a Combined Selection Procedure

Authors: Mohammad H. Almomani, Rosmanjawati Abdul Rahman

Abstract:

Statistical selection procedures are used to select the best simulated system from a finite set of alternatives. In this paper, we present a procedure that can be used to select the best system when the number of alternatives is large. The proposed procedure consists of a combination of Ranking and Selection and Ordinal Optimization procedures. In order to improve the performance of Ordinal Optimization, the Optimal Computing Budget Allocation technique is used to determine the best simulation lengths for all simulation systems and to reduce the total computation time. We also discuss the effect of increasing the number of simulation samples on the combined procedure. The results of a numerical illustration clearly show the effect of this increase on the proposed combination of selection procedures.

Keywords: Indifference-Zone, Optimal Computing Budget Allocation, Ordinal Optimization, Ranking and Selection, Subset Selection.

Downloads 1197
53 A Flexible Flowshop Scheduling Problem with Machine Eligibility Constraint and Two Criteria Objective Function

Authors: Bita Tadayon, Nasser Salmasi

Abstract:

This research deals with a flexible flowshop scheduling problem in which jobs arrive and are delivered in groups but are processed individually. Due to the special characteristics of each job, only a subset of machines in each stage is eligible to process that job. The objective function minimizes, on the one hand, the sum of the completion times of groups and, on the other hand, the sum of the differences between the completion time of each job and the delivery time of the group containing that job (the waiting period). The problem can be stated as FFc / rj, Mj / irreg, which has many applications in production and service industries. A mathematical model is proposed, the problem is proved to be NP-complete, and an effective heuristic method is presented to schedule the jobs efficiently. This algorithm can then be used within the body of any metaheuristic algorithm for solving the problem.

Keywords: flexible flowshop scheduling, group processing, machine eligibility constraint, mathematical modeling.

Downloads 1785
52 Wheat Yield Prediction through Agro Meteorological Indices for Ardebil District

Authors: Fariba Esfandiary, Ghafoor Aghaie, Ali Dolati Mehr

Abstract:

Wheat yield prediction was carried out using different meteorological variables together with agro meteorological indices in Ardebil district for the years 2004–2005 and 2005–2006. On the basis of correlation coefficients, standard error of estimate, and relative deviation of predicted yield from actual yield using different statistical models, the best subset of agro meteorological indices was selected, including daily minimum temperature (Tmin), accumulated difference of maximum and minimum temperatures (TD), growing degree days (GDD), accumulated water vapor pressure deficit (VPD), sunshine hours (SH) and potential evapotranspiration (PET). Yield prediction was done two months before harvesting time, which coincided with the commencement of the reproductive stage of wheat (5th of June). In the final statistical models, 83% of wheat yield variability was accounted for by variation in the above agro meteorological indices.

Keywords: Wheat yields prediction, agro meteorological indices, statistical models

Downloads 2086
51 Genetic Algorithms and Kernel Matrix-based Criteria Combined Approach to Perform Feature and Model Selection for Support Vector Machines

Authors: A. Perolini

Abstract:

Feature and model selection are at the center of attention of much research because of their impact on classifiers' performance. Both selections are usually performed separately, but recent developments suggest using a combined GA-SVM approach to perform them simultaneously. This approach improves the performance of the classifier by identifying the best subset of variables and the optimal parameters' values. Although GA-SVM is an effective method, it is computationally expensive, so a rougher method can be considered. The paper investigates a joint approach of Genetic Algorithm and kernel matrix criteria to perform feature and model selection simultaneously for the SVM classification problem. The purpose of this research is to improve the classification performance of SVM through an efficient approach, the Kernel Matrix Genetic Algorithm method (KMGA).

Keywords: Feature and model selection, Genetic Algorithms, Support Vector Machines, kernel matrix.

Downloads 1538
50 Heterogeneous Attribute Reduction in Noisy System based on a Generalized Neighborhood Rough Sets Model

Authors: Siyuan Jing, Kun She

Abstract:

Neighborhood Rough Sets (NRS) has been proven to be an efficient tool for heterogeneous attribute reduction. However, most research focuses on complete and noiseless data. In practice, most information systems are noisy, namely, filled with incomplete and inconsistent data. In this paper, we introduce a generalized neighborhood rough sets model, called VPTNRS, to deal with the problem of heterogeneous attribute reduction in noisy systems. We generalize the classical NRS model with a tolerance neighborhood relation and probabilistic theory. Furthermore, we use the neighborhood dependency to evaluate the significance of a subset of heterogeneous attributes and construct a forward greedy algorithm for attribute reduction based on it. Experimental results show that the model deals efficiently with noisy data.

Keywords: attribute reduction, incomplete data, inconsistent data, tolerance neighborhood relation, rough sets

Downloads 1543
49 Secure Secret Recovery by using Weighted Personal Entropy

Authors: Leau Y. B., Dinna Nina M. N., Habeeb S. A. H., Jetol B.

Abstract:

Authentication plays a vital role in many secure systems. Most of these systems require the user to log in with a secret password or pass phrase before gaining access. This ensures that all valuable information is kept confidential while also guaranteeing its integrity and availability. However, to achieve this goal, users are required to memorize high-entropy passwords or pass phrases, and it is sometimes difficult for users to remember meaningless strings of data. This paper presents a new scheme which assigns a weight to each personal question given to the user in revealing the encrypted secrets or password. The scheme concentrates on offering fault tolerance to users by allowing them to forget the specific password for a subset of questions and still recover the secret and achieve successful authentication. A comparison of the level of security of weight-based and weightless secret recovery schemes is also discussed. The paper concludes with a few areas that require more investigation in this research.

Keywords: Secret Recovery, Personal Entropy, Cryptography, Secret Sharing and Key Management.

Downloads 1918
48 The Optimal Equilibrium Capacity of Information Hiding Based on Game Theory

Authors: Ziquan Hu, Kun She, Shahzad Ali, Kai Yan

Abstract:

Game theory can be used to analyze conflicting issues in the field of information hiding. In this paper, a 2-phase game is used to build the embedder-attacker system and analyze the limits of the hiding capacity of embedding algorithms: the embedder minimizes the expected damage and the attacker maximizes it. In the system, the embedder first consumes its resource to build embedded units (EU) and inserts the secret information into the EU. Then the attacker distributes its resource evenly among the attacked EU. The expected equilibrium damage, which is the maximum damage from the point of view of the attacker and the minimum from that of the embedder, is evaluated for the case in which the attacker attacks a subset of all the EU. Furthermore, the optimal equilibrium capacity of hiding information is calculated through the optimal number of EU with the embedded secret information. Finally, illustrative examples of the optimal equilibrium capacity are presented.

Keywords: 2-Phase Game, Expected Equilibrium Damage, Information Hiding, Optimal Equilibrium Capacity.

Downloads 1571
47 Coverage and Connectivity Problem in Sensor Networks

Authors: Meenakshi Bansal, Iqbal Singh, Parvinder S. Sandhu

Abstract:

In over-deployed sensor networks, one approach to conserving energy is to keep only a small subset of sensors active at any instant. For coverage problems, the monitored area is represented as a set of points that require sensing, called demand points, and the coverage area of a node is taken to be a circle of radius R, where R is the sensing range. If the distance between a demand point and a sensor node is less than R, the node is able to cover this point. We consider a wireless sensor network consisting of a set of sensors deployed randomly. A point in the monitored area is covered if it is within the sensing range of a sensor. In some applications, when the network is sufficiently dense, area coverage can be approximated by guaranteeing point coverage. In this case, all the points of wireless devices could be used to represent the whole area, and the working sensors are supposed to cover all the sensors. We also introduce a Hybrid Algorithm and challenges related to coverage in sensor networks.

Keywords: Wireless sensor networks, network coverage, Energy conservation, Hybrid Algorithms.
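
The coverage rule stated in the abstract above (a demand point is covered when its distance to some sensor node is less than R) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the sample coordinates, and the idea of reporting a coverage ratio are hypothetical and are not taken from the paper, and the paper's Hybrid Algorithm is not reproduced here.

```python
import math

def is_covered(point, sensors, sensing_range):
    """Return True if some sensor is strictly closer than sensing_range to point."""
    return any(math.dist(point, s) < sensing_range for s in sensors)

def coverage_ratio(demand_points, active_sensors, sensing_range):
    """Fraction of demand points covered by the active subset of sensors."""
    if not demand_points:
        return 1.0
    covered = sum(is_covered(p, active_sensors, sensing_range) for p in demand_points)
    return covered / len(demand_points)

# Hypothetical data: three demand points, two active sensors, R = 2.0.
demand_points = [(0.0, 0.0), (3.0, 0.0), (10.0, 10.0)]
active_sensors = [(1.0, 0.0), (9.0, 9.0)]
ratio = coverage_ratio(demand_points, active_sensors, sensing_range=2.0)
```

The point (3.0, 0.0) lies exactly at distance R from the nearest sensor, so under the strict "less than R" rule it counts as uncovered.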

Downloads 1673
46 The Feedback Control for Distributed Systems

Authors: Kamil Aida-zade, C. Ardil

Abstract:

We study the problem of synthesizing lumped source control for objects with distributed parameters on the basis of continuous observation of the phase state at given points of the object. In the proposed approach, the phase state space (phase space) is partitioned in advance, at the observable points, into given subsets (zones). The synthesized control actions are taken from the class of piecewise constant functions. The current values of the control actions are determined by the subset of phase space that contains the aggregate of current states of the object at the observable points (within these states the control actions take constant values). In the paper, such synthesized control actions are called zone control actions. A technique for obtaining optimal values of zone control actions using smooth optimization methods is given. To this end, formulas for the gradient of the objective functional in the space of zone control actions are obtained.

Keywords: Feedback control, distributed systems, smooth optimization methods, lumped control synthesis.

Downloads 552
45 Image Retrieval Based on Multi-Feature Fusion for Heterogeneous Image Databases

Authors: N. W. U. D. Chathurani, Shlomo Geva, Vinod Chandran, Proboda Rajapaksha

Abstract:

Selecting an appropriate image representation is the most important factor in implementing an effective Content-Based Image Retrieval (CBIR) system. This paper presents a multi-feature fusion approach for efficient CBIR, based on the distance distribution of features and relative feature weights at the time of query processing. It is a simple yet effective approach, which is free from the effect of features' dimensions, ranges, internal feature normalization and the distance measure. This approach can easily be adopted in any feature combination to improve retrieval quality. The proposed approach is empirically evaluated using two benchmark datasets for image classification (a subset of the Corel dataset and Oliva and Torralba) and compared with existing approaches. The performance of the proposed approach is confirmed with the significantly improved performance in comparison with the independently evaluated baseline of the previously proposed feature fusion approaches.

Keywords: Feature fusion, image retrieval, membership function, normalization.

Downloads 1300
44 FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule

Authors: Lu Si, Jie Yu, Shasha Li, Jun Ma, Lei Luo, Qingbo Wu, Yongqi Ma, Zhengji Liu

Abstract:

Instance selection (IS) techniques are used to reduce data size in order to improve the performance of data mining methods. Recently, to process very large data sets, several proposed methods divide the training set into disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitations of these methods and give our viewpoint on how to divide and conquer in the IS procedure. Then, based on the fast condensed nearest neighbor (FCNN) rule, we propose an instance selection method for large data sets using the MapReduce framework. Besides ensuring prediction accuracy and reduction rate, it has two desirable properties: first, it reduces the workload in the aggregation node; second, and most important, it produces the same result as the sequential version, which other parallel methods cannot achieve. We evaluate the performance of FCNN-MR on one small data set and two large data sets. The experimental results show that it is effective and practical.

Keywords: Instance selection, data reduction, MapReduce, kNN.

Downloads 962
43 Threshold Concepts in TESOL: A Thematic Analysis of Disciplinary Guiding Principles

Authors: Neil Morgan

Abstract:

The notion of Threshold Concepts has offered a fertile new perspective on the transformative effects of mastery of particular concepts on student understanding of subject matter and their developing identities as inductees into disciplinary discourse communities. Only by successfully traversing essential knowledge thresholds can neophytes achieve the more sophisticated understandings of subject matter possessed by mature members of a discipline. This paper uses thematic analysis of disciplinary guiding principles to identify nine candidate Threshold Concepts that appear to underpin effective TESOL practice. The relationship between these candidate TESOL Threshold Concepts, TESOL principles, and TESOL instructional techniques appears to be amenable to a schematic representation based on superordinate categories of TESOL practitioner concern and, as such, offers an alternative to the view of Threshold Concepts as a privileged subset of disciplinary core concepts. The paper concludes by exploring the potential of a Threshold Concepts framework to productively inform TESOL initial teacher education (ITE) and in-service education and training (INSET).

Keywords: TESOL, threshold concepts, TESOL principles, TESOL ITE/INSET, community of practice.

Downloads 646
42 Data Mining Determination of Sunlight Average Input for Solar Power Plant

Authors: Fl. Loury, P. Sablonière, C. Lamoureux, G. Magnier, Th. Gutierrez

Abstract:

A method is proposed to extract faithful representative patterns from a data set of observations that suffer from non-negligible fluctuations. Supposing the time interval between measurements to be extremely small compared to the observation time, the method first defines a subset of intermediate time intervals characterizing coherent behavior. Projecting the data onto these intervals gives a set of curves, out of which an ideally “perfect” one is constructed by taking their sup limit. Comparison with the average real curve in the corresponding interval then gives an efficiency parameter expressing the degradation caused by the fluctuation effect. The method is applied to sunlight data collected in a specific place, where ideal sunlight is the one resulting from direct exposure at the location's latitude over the year, and efficiency results from the action of meteorological parameters, mainly cloudiness, at different periods of the year. The extracted information already provides useful input for decision-making, before being used for analysis of plant control.

Keywords: Base Input Reconstruction, Data Mining, Efficiency Factor, Information Pattern Operator.

Downloads 1473
41 Robot Movement Using the Trust Region Policy Optimization

Authors: Romisaa Ali

Abstract:

The Policy Gradient approach is a subset of Deep Reinforcement Learning (DRL), which combines Deep Neural Networks (DNN) with Reinforcement Learning (RL). This approach finds the optimal policy of robot movement based on the experience it gains from interaction with its environment. Unlike previous policy gradient algorithms, which were unable to handle the two types of error, variance and bias, introduced by the DNN model due to over- or underestimation, this algorithm is capable of handling both. This article discusses the state-of-the-art (SOTA) policy gradient technique, trust region policy optimization (TRPO), by applying this method in various environments and comparing it with another policy gradient method, Proximal Policy Optimization (PPO), to explain their robust optimization, using this SOTA to gather experience data during various training phases after observing the impact of hyper-parameters on neural network performance.

Keywords: Deep neural networks, deep reinforcement learning, Proximal Policy Optimization, state-of-the-art, trust region policy optimization.

Downloads 104
40 Robust Camera Calibration using Discrete Optimization

Authors: Stephan Rupp, Matthias Elter, Michael Breitung, Walter Zink, Christian Küblbeck

Abstract:

Camera calibration is an indispensable step for augmented reality or image-guided applications where quantitative information should be derived from the images. Usually, a camera calibration is obtained by taking images of a special calibration object and extracting the image coordinates of projected calibration marks, enabling the calculation of the projection from the 3D world coordinates to the 2D image coordinates. Such a procedure thus exhibits typical steps, including feature point localization in the acquired images, camera model fitting, correction of distortion introduced by the optics and finally an optimization of the model's parameters. In this paper we propose to extend this list by a further step concerning the identification of the optimal subset of images yielding the smallest overall calibration error. For this, we present a Monte Carlo based algorithm along with a deterministic extension that automatically determines the images yielding an optimal calibration. Finally, we present results showing that the calibration can be significantly improved by automated image selection.

Keywords: Camera Calibration, Discrete Optimization, Monte Carlo Method.

Downloads 1758
39 Decision Trees for Predicting Risk of Mortality using Routinely Collected Data

Authors: Tessy Badriyah, Jim S. Briggs, Dave R. Prytherch

Abstract:

It is well known that Logistic Regression is the gold standard method for predicting clinical outcome, especially predicting risk of mortality. In this paper, the Decision Tree method is proposed to solve specific problems that commonly use Logistic Regression as a solution. The Biochemistry and Haematology Outcome Model (BHOM) dataset obtained from Portsmouth NHS Hospital from 1 January to 31 December 2001 was divided into four subsets. One subset of training data was used to generate a model, and the model obtained was then applied to three testing datasets. The performance of each model from both methods was then compared using calibration (the χ2 test or chi-test) and discrimination (area under ROC curve or c-index). The experiment showed that both methods give reasonable results in terms of the c-index. However, in some cases the calibration value (χ2) was quite high. After conducting experiments and investigating the advantages and disadvantages of each method, we can conclude that Decision Trees can be seen as a worthy alternative to Logistic Regression in the area of Data Mining.

Keywords: Decision Trees, Logistic Regression, clinical outcome, risk of mortality.

Downloads 2473
38 Phenotypes of B Cells Differ in EBV-positive Burkitt's lymphoma Derived Cell Lines

Authors: Irina Spaka, Rita Birkenfelde, Svetlana Kozireva, Jevgenija Osmjana, Madara Upmane, Elena Kashuba, Irina Kholodnyuk Holodnuka

Abstract:

Epstein-Barr virus (EBV) is implicated in the pathogenesis of the endemic Burkitt's lymphoma (BL). The EBV-positive BL-derived cell lines initially maintain the original tumor phenotype of EBV infection (latency I, LatI), but most of them drift toward a lymphoblast phenotype of EBV latency III (LatIII) during in vitro culturing. The aim of the present work was to characterize the B-cell subsets in EBV-positive BL cell lines and to verify whether a particular cell subset correlates with the type of EBV infection. The phenotype analysis of two EBV-negative and eleven EBV-positive (three of LatI and eight of LatIII) BL cell lines was performed by polychromatic flow cytometry, based on the expression pattern of CD19, CD10, CD38, CD27, and CD5 markers. Two cell subsets, CD19+CD10+ and CD19+CD10-, were defined in LatIII BL cell lines. In both subsets, CD27 and CD5 cell surface expression was detected in a proportion of the cells.

Keywords: B-cell subsets, Burkitt's lymphoma cell lines, EBV latency, phenotype profiles.

Downloads 1899
37 Intelligent Modeling of the Electrical Activity of the Human Heart

Authors: Lambros V. Skarlas, Grigorios N. Beligiannis, Efstratios F. Georgopoulos, Adam V. Adamopoulos

Abstract:

The aim of this contribution is to present a new approach in modeling the electrical activity of the human heart. A recurrent artificial neural network is being used in order to exhibit a subset of the dynamics of the electrical behavior of the human heart. The proposed model can also be used, when integrated, as a diagnostic tool of the human heart system. What makes this approach unique is the fact that every model is being developed from physiological measurements of an individual. This kind of approach is very difficult to apply successfully in many modeling problems, because of the complexity and entropy of the free variables describing the complex system. Differences between the modeled variables and the variables of an individual, measured at specific moments, can be used for diagnostic purposes. The sensor fusion used in order to optimize the utilization of biomedical sensors is another point that this paper focuses on. Sensor fusion has been known for its advantages in applications such as control and diagnostics of mechanical and chemical processes.

Keywords: Artificial Neural Networks, Diagnostic System, Health Condition Modeling Tool, Heart Diagnostics Model, Heart Electricity Model.

Downloads 1778
36 Multiclass Support Vector Machines with Simultaneous Multi-Factors Optimization for Corporate Credit Ratings

Authors: Hyunchul Ahn, William X. S. Wong

Abstract:

Corporate credit rating prediction is one of the most important topics studied by researchers in the last decade. Over this period, researchers have pushed to improve the accuracy of corporate credit rating prediction models by applying several data-driven tools, including statistical and artificial intelligence methods. Among them, the multiclass support vector machine (MSVM) has been widely applied due to its good predictability. However, heuristics, for example the parameters of a kernel function and the appropriate feature and instance subsets, have become the main source of criticism of MSVM, as they dictate the MSVM architectural variables. This study presents a hybrid MSVM model that is intended to optimize all of these parameters, namely feature selection, instance selection, and kernel parameters. Our model adopts a genetic algorithm (GA) to simultaneously optimize multiple heterogeneous design factors of MSVM.

Keywords: Corporate credit rating prediction, feature selection, genetic algorithms, instance selection, multiclass support vector machines.

Downloads 1367
35 Increasing the Efficiency of Rake Receivers for Ultra-Wideband Applications

Authors: Aimilia P. Doukeli, Athanasios S. Lioumpas, George K. Karagiannidis, Panayiotis V. Frangos, P. Takis Mathiopoulos

Abstract:

In diversity-rich environments, such as in Ultra-Wideband (UWB) applications, the a priori determination of the number of strong diversity branches is difficult because of the considerably large number of diversity paths, which are characterized by a variety of power delay profiles (PDPs). Several Rake implementations have been proposed in the past in order to reduce the number of estimated and combined paths. To this end, we introduce two adaptive Rake receivers, which combine a subset of the resolvable paths, considering simultaneously the quality of both the total combining output signal-to-noise ratio (SNR) and the individual SNR of each path. These schemes achieve better adaptation to channel conditions compared to other known receivers, without further increasing the complexity. Their performance is evaluated in different practical UWB channels, whose models are based on extensive propagation measurements. The proposed receivers strike a compromise between power consumption, complexity and the performance gain from additional paths, resulting in important savings in power and computational resources.

Keywords: Adaptive Rake receivers, diversity techniques, fading channels, UWB channel.

Downloads 1506
34 Identifying Factors Contributing to the Spread of Lyme Disease: A Regression Analysis of Virginia’s Data

Authors: Fatemeh Valizadeh Gamchi, Edward L. Boone

Abstract:

This research focuses on Lyme disease, a widespread infectious condition in the United States caused by the bacterium Borrelia burgdorferi sensu stricto. It is critical to identify environmental and economic elements that are contributing to the spread of the disease. This study examined data from Virginia to identify a subset of explanatory variables significant for Lyme disease case numbers. To identify relevant variables and avoid overfitting, linear Poisson, and regularization regression methods such as ridge, lasso, and elastic net penalty were employed. Cross-validation was performed to obtain the tuning parameters. The proposed methods can automatically identify relevant disease count covariates. The efficacy of the techniques was assessed using four criteria on three simulated datasets. Finally, using the Virginia Department of Health’s Lyme disease dataset, the study successfully identified key factors, and the results were consistent with previous studies.

Keywords: Lyme disease, Poisson generalized linear model, Ridge regression, Lasso Regression, elastic net regression.

Downloads 30
33 A Comparison of SVM-based Criteria in Evolutionary Method for Gene Selection and Classification of Microarray Data

Authors: Rameswar Debnath, Haruhisa Takahashi

Abstract:

An evolutionary method whose selection and recombination operations are based on generalization error-bounds of the support vector machine (SVM) can select a subset of potentially informative genes for an SVM classifier very efficiently [7]. In this paper, we use the derivative of the error-bound (first-order criteria) to select and recombine gene features in the evolutionary process, and compare its performance with that of the error-bound itself (zero-order criteria). We also investigate several error-bounds and their derivatives to compare their performance and find the best criteria for gene selection and classification. We use 7 cancer-related human gene expression datasets to evaluate the performance of the zero-order and first-order criteria of error-bounds. Though both criteria follow the same strategy in theory, experimental results demonstrate the best criterion for microarray gene expression data.

Keywords: support vector machine, generalization error-bound, feature selection, evolutionary algorithm, microarray data

Downloads 1484
32 A Study about the Distribution of the Spanning Ratios of Yao Graphs

Authors: Maryam Hsaini, Mostafa Nouri-Baygi

Abstract:

A critical problem in wireless sensor networks is the limited battery and memory of nodes. Therefore, each node in the network can maintain only a subset of its neighbors to communicate with. This increases battery usage in the network because each packet must take more hops to reach its destination. In order to tackle these problems, spanner graphs are defined. Since each node has a small degree in a spanner graph and the distance in the graph is not much greater than the actual geographical distance, spanner graphs are suitable candidates for the topology of a wireless sensor network. In this paper, we study Yao graphs and their behavior for a randomly selected set of points. We generate several random point sets and compare the properties of their Yao graphs with the complete graph. Based on our data sets, we obtain several charts demonstrating how Yao graphs behave for a randomly chosen point set. As the results show, the stretch factor of a Yao graph follows a normal distribution. Furthermore, the stretch factor is on average far less than the worst-case stretch factor proved for Yao graphs in previous results. Finally, we use a Yao graph for a realistic point set and study its stretch factor in a real-world setting.

Keywords: Wireless sensor network, spanner graph, Yao Graph.
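The kind of experiment described in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names `yao_graph` and `stretch_factor`, the choice of 8 cones, and the uniform random point set are all assumptions. It uses the common definition of a Yao graph (each point keeps its nearest neighbour in each of k equal-angle cones) and measures the stretch factor as the largest ratio of graph distance to Euclidean distance over all point pairs.

```python
import math
import random


def yao_graph(points, k):
    """Undirected Yao graph: each point keeps its nearest neighbour in each of k cones."""
    cone = 2 * math.pi / k
    edges = set()
    for i, (xi, yi) in enumerate(points):
        nearest = {}  # cone index -> (distance, neighbour index)
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue
            angle = math.atan2(yj - yi, xj - xi) % (2 * math.pi)
            c = min(int(angle / cone), k - 1)  # guard against rounding at 2*pi
            d = math.hypot(xj - xi, yj - yi)
            if c not in nearest or d < nearest[c][0]:
                nearest[c] = (d, j)
        for _, j in nearest.values():
            edges.add((min(i, j), max(i, j)))
    return edges


def stretch_factor(points, edges):
    """Largest ratio of shortest-path length to Euclidean distance (Floyd-Warshall)."""
    n = len(points)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for i, j in edges:
        dist[i][j] = dist[j][i] = math.dist(points[i], points[j])
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][m] + dist[m][j] < dist[i][j]:
                    dist[i][j] = dist[i][m] + dist[m][j]
    return max(
        dist[i][j] / math.dist(points[i], points[j])
        for i in range(n)
        for j in range(i + 1, n)
    )


# Hypothetical experiment: 30 uniform random points, 8 cones per point.
rng = random.Random(0)
pts = [(rng.random(), rng.random()) for _ in range(30)]
print(stretch_factor(pts, yao_graph(pts, 8)))
```

Repeating the last three lines over many random point sets and collecting the printed values would give an empirical distribution of the stretch factor, which is the quantity the abstract reports on.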

Downloads 548
31 HaskellFL: A Tool for Detecting Logical Errors in Haskell

Authors: Vanessa Vasconcelos, Mariza A. S. Bigonha

Abstract:

Understanding and using the functional paradigm is a challenge for many programmers. Looking for logical errors in code may take a lot of a developer’s time when a program grows in size. In order to facilitate both processes, this paper presents HaskellFL, a tool that uses fault localization techniques to locate a logical error in Haskell code. The Haskell subset used in this work is sufficiently expressive for those studying Functional Programming to get immediate help debugging their code and to answer questions about key concepts associated with the functional paradigm. HaskellFL was tested against Functional Programming assignments submitted by students enrolled in the Functional Programming class at the Federal University of Minas Gerais and against exercises from the Exercism Haskell track that are publicly available on GitHub. This work also evaluated the effectiveness of two fault localization techniques, Tarantula and Ochiai, in the Haskell context. Furthermore, the EXAM score was chosen to evaluate the tool’s effectiveness, and results showed that HaskellFL reduced the effort needed to locate an error for all tested scenarios. The results also showed that the Ochiai method was more effective than Tarantula.

Keywords: Debug, fault localization, functional programming, Haskell.
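For readers unfamiliar with the two techniques named in this abstract, the sketch below computes the Tarantula and Ochiai suspiciousness scores as they are commonly defined in the spectrum-based fault localization literature. It is not taken from HaskellFL: the function names, argument names, and the convention of returning 0.0 for a zero denominator are assumptions made for illustration. Here `ef` and `ep` are the numbers of failing and passing tests that execute a given program element.

```python
import math


def tarantula(ef, ep, total_failed, total_passed):
    """Tarantula score: share of failing coverage relative to all coverage."""
    fail_ratio = ef / total_failed if total_failed else 0.0
    pass_ratio = ep / total_passed if total_passed else 0.0
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom else 0.0


def ochiai(ef, ep, total_failed):
    """Ochiai score: ef / sqrt(total_failed * (ef + ep))."""
    denom = math.sqrt(total_failed * (ef + ep))
    return ef / denom if denom else 0.0


# Hypothetical element covered by both failing tests and 1 of 4 passing tests.
print(tarantula(ef=2, ep=1, total_failed=2, total_passed=4))  # 0.8
print(ochiai(ef=2, ep=1, total_failed=2))  # about 0.816
```

Both scores lie between 0 and 1, and elements are ranked from most to least suspicious. An effort measure such as the EXAM score mentioned in the abstract is then computed from how far down that ranking the faulty element appears.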

Downloads 649
30 MIMO Antenna Selections using CSI from Reciprocal Channel

Authors: P. Uthansakul, K. Attakitmongkol, N. Promsuvana, M. Uthansakul

Abstract:

It is well known that the channel capacity of a Multiple-Input-Multiple-Output (MIMO) system increases as the number of antenna pairs between transmitter and receiver increases, but the system then requires multiple expensive RF chains. To reduce the cost of RF chains, the Antenna Selection (AS) method can offer a good tradeoff between expense and performance. In a transmitting AS system, Channel State Information (CSI) feedback is required to choose the best subset of antennas, and the delays and errors that occur in feedback channels are the dominant factors degrading the performance of the AS method. This paper presents the concept of an AS method that uses CSI from channel reciprocity instead of feedback. The reciprocity technique can easily obtain CSI by utilizing a reverse channel, where the forward and reverse channels are considered symmetric in time, frequency and location. In this work, the capacity performance of a MIMO system using the AS method at the transmitter with reciprocal channels is investigated on a testbed developed by the authors. The obtained results show that the reciprocity technique offers capacity close to that of a system with perfect CSI and achieves 0.9 to 2.2 bps/Hz higher capacity than a system without the AS method at an SNR of 10 dB.

Keywords: Antenna Selection, Capacity, Channel, Measurement, MIMO, Reciprocity.

Downloads 1921
29 Synthetic Aperture Radar Remote Sensing Classification Using the Bag of Visual Words Model to Land Cover Studies

Authors: Reza Mohammadi, Mahmod R. Sahebi, Mehrnoosh Omati, Milad Vahidi

Abstract:

Classification of high-resolution polarimetric Synthetic Aperture Radar (PolSAR) images plays an important role in land cover and land use management. Recently, classification algorithms based on the Bag of Visual Words (BOVW) model have attracted significant interest among scholars and researchers in and out of the field of remote sensing. In this paper, the BOVW model with pixel-based low-level features has been implemented to classify a subset of a San Francisco Bay PolSAR image acquired by RADARSAT-2 in C-band. We have used a segment-based decision-making strategy and compared the result with that of a traditional Support Vector Machine (SVM) classifier. An overall classification accuracy of 90.95% shows that the proposed algorithm is comparable with state-of-the-art methods. In addition to increasing the classification accuracy, the proposed method decreases the undesirable speckle effect of SAR images.

Keywords: Bag of Visual Words, classification, feature extraction, land cover management, Polarimetric Synthetic Aperture Radar.

Downloads 715
28 Local Curvelet Based Classification Using Linear Discriminant Analysis for Face Recognition

Authors: Mohammed Rziza, Mohamed El Aroussi, Mohammed El Hassouni, Sanaa Ghouzali, Driss Aboutajdine

Abstract:

In this paper, an efficient local appearance feature extraction method based on the multi-resolution Curvelet transform is proposed in order to further enhance the performance of the well-known Linear Discriminant Analysis (LDA) method when applied to face recognition. Each face is described by a subset of band-filtered images containing block-based Curvelet coefficients. These coefficients characterize the face texture, and a set of simple statistical measures allows us to form compact and meaningful feature vectors. The proposed method is compared with related feature extraction methods such as Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Independent Component Analysis (ICA). Two different multi-resolution transforms, Wavelet (DWT) and Contourlet, were also compared against the Block Based Curvelet-LDA algorithm. Experimental results on the ORL, YALE and FERET face databases show that the proposed method provides a better representation of the class information and obtains much higher recognition accuracies.

Keywords: Curvelet, Linear Discriminant Analysis (LDA), Contourlet, Discrete Wavelet Transform, DWT, Block-based analysis, face recognition (FR).

Downloads 1759
27 A Monte Carlo Method to Data Stream Analysis

Authors: Kittisak Kerdprasop, Nittaya Kerdprasop, Pairote Sattayatham

Abstract:

Data stream analysis is the process of computing various summaries and derived values from large amounts of data that are continuously generated at a rapid rate. The nature of a stream does not allow each data element to be revisited. Furthermore, data processing must be fast to produce timely analysis results. These requirements impose constraints on the design of the algorithms, which must balance correctness against timely responses. Several techniques have been proposed over the past few years to address these challenges. These techniques can be categorized as either data-oriented or task-oriented. The data-oriented approach analyzes a subset of the data or a smaller transformed representation, whereas the task-oriented scheme solves the problem directly via approximation techniques. We propose a hybrid approach to tackle the data stream analysis problem. The data stream is both statistically transformed to a smaller size and has its characteristics computationally approximated. We adopt a Monte Carlo method in the approximation step. The data reduction is performed horizontally and vertically through our EMR sampling method. The proposed method is analyzed by a series of experiments. We apply our algorithm to clustering and classification tasks to evaluate the utility of our approach.

Keywords: Data Stream, Monte Carlo, Sampling, Density Estimation.
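The single-pass constraint mentioned in this abstract (no element can be revisited) is what makes stream sampling different from ordinary sampling. The sketch below illustrates it with classic reservoir sampling, which is a standard technique and not the EMR sampling method of the paper, whose details the abstract does not give. The function name `reservoir_sample` and its parameters are assumptions for illustration.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream seen only once."""
    rng = rng or random.Random()
    reservoir = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item n replaces a stored item with probability k / n.
            j = rng.randrange(n)
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical stream of 10,000 readings reduced to a 100-item sample.
sample = reservoir_sample(iter(range(10_000)), 100, random.Random(42))
print(len(sample))  # 100
```

Memory use stays fixed at k items regardless of stream length, which is the property a data-oriented reduction step needs before any downstream clustering or classification.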

Downloads 1380
26 The Determination of Aflatoxins in Paddy and Milled Fractions of Rice in Guyana: Preliminary Results

Authors: Donna M. Morrison, Lambert Chester, Coretta A. N. Samuels, David R. Ledoux

Abstract:

A survey was conducted in the five rice-growing regions of Guyana to determine the presence of aflatoxins in multiple fractions of rice in the June/October 2015 growing season. The fractions were paddy, steamed paddy, cargo rice, white rice and parboiled rice. Samples were analyzed by High Performance Liquid Chromatography. A subset of the samples was further analyzed by enzyme-linked immunosorbent assay (ELISA) for concurrence. All analyses were conducted at the University of Missouri, USA. Of the 186 samples tested, 16 had aflatoxin concentrations greater than 20 ppb, the recommended limit for aflatoxins in food according to the United States Food and Drug Administration. An additional three samples had aflatoxin B1 concentrations greater than the European Union Commission maximum levels for aflatoxin B1 in rice at 5 µg/kg and total aflatoxins (B1, B2, G1 and G2) at 10 µg/kg. The survey indicates that there is no widespread aflatoxin problem in rice in Guyana. The incidence of aflatoxins appears to be localized.

Keywords: Aflatoxins, enzyme-linked immunosorbent assay, high-performance liquid chromatography, rice fractions.

Downloads 1517