Search results for: Classification and regression tree (CART)
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2107

Search results for: Classification and regression tree (CART)

1597 An Improved Preprocessing for Biosonar Target Classification

Authors: Turgay Temel, John Hallam

Abstract:

An improved processing description to be employed in biosonar signal processing in a cochlea model is proposed and examined. It is compared to conventional models using a modified discrimination analysis and both are tested. Their performances are evaluated with echo data captured from natural targets (trees).Results indicate that the phase characteristics of low-pass filters employed in the echo processing have a significant effect on class separability for this data.

Keywords: Cochlea model, discriminant analysis, neurospikecoding, classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1466
1596 A CTL Specification of Serializability for Transactions Accessing Uniform Data

Authors: Rafat Alshorman, Walter Hussak

Abstract:

Existing work in temporal logic on representing the execution of infinitely many transactions, uses linear-time temporal logic (LTL) and only models two-step transactions. In this paper, we use the comparatively efficient branching-time computational tree logic CTL and extend the transaction model to a class of multistep transactions, by introducing distinguished propositional variables to represent the read and write steps of n multi-step transactions accessing m data items infinitely many times. We prove that the well known correspondence between acyclicity of conflict graphs and serializability for finite schedules, extends to infinite schedules. Furthermore, in the case of transactions accessing the same set of data items in (possibly) different orders, serializability corresponds to the absence of cycles of length two. This result is used to give an efficient encoding of the serializability condition into CTL.

Keywords: computational tree logic, serializability, multi-step transactions.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1156
1595 The Strengths and Limitations of the Statistical Modeling of Complex Social Phenomenon: Focusing on SEM, Path Analysis, or Multiple Regression Models

Authors: Jihye Jeon

Abstract:

This paper analyzes the conceptual framework of three statistical methods, multiple regression, path analysis, and structural equation models. When establishing research model of the statistical modeling of complex social phenomenon, it is important to know the strengths and limitations of three statistical models. This study explored the character, strength, and limitation of each modeling and suggested some strategies for accurate explaining or predicting the causal relationships among variables. Especially, on the studying of depression or mental health, the common mistakes of research modeling were discussed.

Keywords: Multiple regression, path analysis, structural equation models, statistical modeling, social and psychological phenomenon.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9183
1594 Gene Expression Data Classification Using Discriminatively Regularized Sparse Subspace Learning

Authors: Chunming Xu

Abstract:

Sparse representation which can represent high dimensional data effectively has been successfully used in computer vision and pattern recognition problems. However, it doesn-t consider the label information of data samples. To overcome this limitation, we develop a novel dimensionality reduction algorithm namely dscriminatively regularized sparse subspace learning(DR-SSL) in this paper. The proposed DR-SSL algorithm can not only make use of the sparse representation to model the data, but also can effective employ the label information to guide the procedure of dimensionality reduction. In addition,the presented algorithm can effectively deal with the out-of-sample problem.The experiments on gene-expression data sets show that the proposed algorithm is an effective tool for dimensionality reduction and gene-expression data classification.

Keywords: sparse representation, dimensionality reduction, labelinformation, sparse subspace learning, gene-expression data classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1426
1593 Ensembling Adaptively Constructed Polynomial Regression Models

Authors: Gints Jekabsons

Abstract:

The approach of subset selection in polynomial regression model building assumes that the chosen fixed full set of predefined basis functions contains a subset that is sufficient to describe the target relation sufficiently well. However, in most cases the necessary set of basis functions is not known and needs to be guessed – a potentially non-trivial (and long) trial and error process. In our research we consider a potentially more efficient approach – Adaptive Basis Function Construction (ABFC). It lets the model building method itself construct the basis functions necessary for creating a model of arbitrary complexity with adequate predictive performance. However, there are two issues that to some extent plague the methods of both the subset selection and the ABFC, especially when working with relatively small data samples: the selection bias and the selection instability. We try to correct these issues by model post-evaluation using Cross-Validation and model ensembling. To evaluate the proposed method, we empirically compare it to ABFC methods without ensembling, to a widely used method of subset selection, as well as to some other well-known regression modeling methods, using publicly available data sets.

Keywords: Basis function construction, heuristic search, modelensembles, polynomial regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1648
1592 Gene Selection Guided by Feature Interdependence

Authors: Hung-Ming Lai, Andreas Albrecht, Kathleen Steinhöfel

Abstract:

Cancers could normally be marked by a number of differentially expressed genes which show enormous potential as biomarkers for a certain disease. Recent years, cancer classification based on the investigation of gene expression profiles derived by high-throughput microarrays has widely been used. The selection of discriminative genes is, therefore, an essential preprocess step in carcinogenesis studies. In this paper, we have proposed a novel gene selector using information-theoretic measures for biological discovery. This multivariate filter is a four-stage framework through the analyses of feature relevance, feature interdependence, feature redundancy-dependence and subset rankings, and having been examined on the colon cancer data set. Our experimental result show that the proposed method outperformed other information theorem based filters in all aspect of classification errors and classification performance.

Keywords: Colon cancer, feature interdependence, feature subset selection, gene selection, microarray data analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2114
1591 Dynamic Load Balancing Strategy for Grid Computing

Authors: Belabbas Yagoubi, Yahya Slimani

Abstract:

Workload and resource management are two essential functions provided at the service level of the grid software infrastructure. To improve the global throughput of these software environments, workloads have to be evenly scheduled among the available resources. To realize this goal several load balancing strategies and algorithms have been proposed. Most strategies were developed in mind, assuming homogeneous set of sites linked with homogeneous and fast networks. However for computational grids we must address main new issues, namely: heterogeneity, scalability and adaptability. In this paper, we propose a layered algorithm which achieve dynamic load balancing in grid computing. Based on a tree model, our algorithm presents the following main features: (i) it is layered; (ii) it supports heterogeneity and scalability; and, (iii) it is totally independent from any physical architecture of a grid.

Keywords: Grid computing, load balancing, workload, tree based model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3098
1590 A New Weighted LDA Method in Comparison to Some Versions of LDA

Authors: Delaram Jarchi, Reza Boostani

Abstract:

Linear Discrimination Analysis (LDA) is a linear solution for classification of two classes. In this paper, we propose a variant LDA method for multi-class problem which redefines the between class and within class scatter matrices by incorporating a weight function into each of them. The aim is to separate classes as much as possible in a situation that one class is well separated from other classes, incidentally, that class must have a little influence on classification. It has been suggested to alleviate influence of classes that are well separated by adding a weight into between class scatter matrix and within class scatter matrix. To obtain a simple and effective weight function, ordinary LDA between every two classes has been used in order to find Fisher discrimination value and passed it as an input into two weight functions and redefined between class and within class scatter matrices. Experimental results showed that our new LDA method improved classification rate, on glass, iris and wine datasets, in comparison to different versions of LDA.

Keywords: Discriminant vectors, weighted LDA, uncorrelation, principle components, Fisher-face method, Bootstarp method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1499
1589 Harmonics Elimination in Multilevel Inverter Using Linear Fuzzy Regression

Authors: A. K. Al-Othman, H. A. Al-Mekhaizim

Abstract:

Multilevel inverters supplied from equal and constant dc sources almost don-t exist in practical applications. The variation of the dc sources affects the values of the switching angles required for each specific harmonic profile, as well as increases the difficulty of the harmonic elimination-s equations. This paper presents an extremely fast optimal solution of harmonic elimination of multilevel inverters with non-equal dc sources using Tanaka's fuzzy linear regression formulation. A set of mathematical equations describing the general output waveform of the multilevel inverter with nonequal dc sources is formulated. Fuzzy linear regression is then employed to compute the optimal solution set of switching angles.

Keywords: Multilevel converters, harmonics, pulse widthmodulation (PWM), optimal control.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1780
1588 Classification of the Latin Alphabet as Pattern on ARToolkit Markers for Augmented Reality Applications

Authors: Mohamed Badeche, Mohamed Benmohammed

Abstract:

augmented reality is a technique used to insert virtual objects in real scenes. One of the most used libraries in the area is the ARToolkit library. It is based on the recognition of the markers that are in the form of squares with a pattern inside. This pattern which is mostly textual is source of confusing. In this paper, we present the results of a classification of Latin characters as a pattern on the ARToolkit markers to know the most distinguishable among them.

Keywords: ARToolkit library, augmented reality, K-means, patterns

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1806
1587 Genetic Algorithm for Feature Subset Selection with Exploitation of Feature Correlations from Continuous Wavelet Transform: a real-case Application

Authors: G. Van Dijck, M. M. Van Hulle, M. Wevers

Abstract:

A genetic algorithm (GA) based feature subset selection algorithm is proposed in which the correlation structure of the features is exploited. The subset of features is validated according to the classification performance. Features derived from the continuous wavelet transform are potentially strongly correlated. GA-s that do not take the correlation structure of features into account are inefficient. The proposed algorithm forms clusters of correlated features and searches for a good candidate set of clusters. Secondly a search within the clusters is performed. Different simulations of the algorithm on a real-case data set with strong correlations between features show the increased classification performance. Comparison is performed with a standard GA without use of the correlation structure.

Keywords: Classification, genetic algorithm, hierarchicalagglomerative clustering, wavelet transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1197
1586 Mining of Interesting Prediction Rules with Uniform Two-Level Genetic Algorithm

Authors: Bilal Alatas, Ahmet Arslan

Abstract:

The main goal of data mining is to extract accurate, comprehensible and interesting knowledge from databases that may be considered as large search spaces. In this paper, a new, efficient type of Genetic Algorithm (GA) called uniform two-level GA is proposed as a search strategy to discover truly interesting, high-level prediction rules, a difficult problem and relatively little researched, rather than discovering classification knowledge as usual in the literatures. The proposed method uses the advantage of uniform population method and addresses the task of generalized rule induction that can be regarded as a generalization of the task of classification. Although the task of generalized rule induction requires a lot of computations, which is usually not satisfied with the normal algorithms, it was demonstrated that this method increased the performance of GAs and rapidly found interesting rules.

Keywords: Classification rule mining, data mining, genetic algorithms.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1573
1585 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, veracity, and come from a variety of sources. Public Administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver (big) data related integrated services, provision of insights to users, and for good governance. Government (Big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services like data storage, hosting services to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationship in the government (big) data ecosystem. We also discuss our research findings. We did not find too much published research articles about the government (big) data ecosystem, including its definition and classification of actors and their roles. Therefore, we lent ideas for the government (big) data ecosystem from numerous areas that include scientific research data, humanitarian data, open government data, industry data, in the literature.

Keywords: Big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2062
1584 Burning Rate Response of Solid Fuels in Laminar Boundary Layer

Authors: A. M. Tahsini

Abstract:

Solid fuel transient burning behavior under oxidizer gas flow is numerically investigated. It is done using analysis of the regression rate responses to the imposed sudden and oscillatory variation at inflow properties. The conjugate problem is considered by simultaneous solution of flow and solid phase governing equations to compute the fuel regression rate. The advection upstream splitting method is used as flow computational scheme in finite volume method. The ignition phase is completely simulated to obtain the exact initial condition for response analysis. The results show that the transient burning effects which lead to the combustion instabilities and intermittent extinctions could be observed in solid fuels as the solid propellants.

Keywords: Extinction, Oscillation, Regression rate, Response, Transient burning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2345
1583 A Model for Test Case Selection in the Software-Development Life Cycle

Authors: Adtha Lawanna

Abstract:

Software maintenance is one of the essential processes of Software-Development Life Cycle. The main philosophies of retaining software concern the improvement of errors, the revision of codes, the inhibition of future errors, and the development in piece and capacity. While the adjustment has been employing, the software structure has to be retested to an upsurge a level of assurance that it will be prepared due to the requirements. According to this state, the test cases must be considered for challenging the revised modules and the whole software. A concept of resolving this problem is ongoing by regression test selection such as the retest-all selections, random/ad-hoc selection and the safe regression test selection. Particularly, the traditional techniques concern a mapping between the test cases in a test suite and the lines of code it executes. However, there are not only the lines of code as one of the requirements that can affect the size of test suite but including the number of functions and faulty versions. Therefore, a model for test case selection is developed to cover those three requirements by the integral technique which can produce the smaller size of the test cases when compared with the traditional regression selection techniques.

Keywords: Software maintenance, regression test selection, test case.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1677
1582 A Model for Test Case Selection in the Software-Development Life Cycle

Authors: Adtha Lawanna

Abstract:

Software maintenance is one of the essential processes of Software-Development Life Cycle. The main philosophies of retaining software concern the improvement of errors, the revision of codes, the inhibition of future errors, and the development in piece and capacity. While the adjustment has been employing, the software structure has to be retested to an upsurge a level of assurance that it will be prepared due to the requirements. According to this state, the test cases must be considered for challenging the revised modules and the whole software. A concept of resolving this problem is ongoing by regression test selection such as the retest-all selections, random/ad-hoc selection and the safe regression test selection. Particularly, the traditional techniques concern a mapping between the test cases in a test suite and the lines of code it executes. However, there are not only the lines of code as one of the requirements that can affect the size of test suite but including the number of functions and faulty versions. Therefore, a model for test case selection is developed to cover those three requirements by the integral technique which can produce the smaller size of the test cases when compared with the traditional regression selection techniques.

Keywords: Software maintenance, regression test selection, test case.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1577
1581 Effects of Hidden Unit Sizes and Autoregressive Features in Mental Task Classification

Authors: Ramaswamy Palaniappan, Nai-Jen Huan

Abstract:

Classification of electroencephalogram (EEG) signals extracted during mental tasks is a technique that is actively pursued for Brain Computer Interfaces (BCI) designs. In this paper, we compared the classification performances of univariateautoregressive (AR) and multivariate autoregressive (MAR) models for representing EEG signals that were extracted during different mental tasks. Multilayer Perceptron (MLP) neural network (NN) trained by the backpropagation (BP) algorithm was used to classify these features into the different categories representing the mental tasks. Classification performances were also compared across different mental task combinations and 2 sets of hidden units (HU): 2 to 10 HU in steps of 2 and 20 to 100 HU in steps of 20. Five different mental tasks from 4 subjects were used in the experimental study and combinations of 2 different mental tasks were studied for each subject. Three different feature extraction methods with 6th order were used to extract features from these EEG signals: AR coefficients computed with Burg-s algorithm (ARBG), AR coefficients computed with stepwise least square algorithm (ARLS) and MAR coefficients computed with stepwise least square algorithm. The best results were obtained with 20 to 100 HU using ARBG. It is concluded that i) it is important to choose the suitable mental tasks for different individuals for a successful BCI design, ii) higher HU are more suitable and iii) ARBG is the most suitable feature extraction method.

Keywords: Autoregressive, Brain-Computer Interface, Electroencephalogram, Neural Network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1784
1580 Mathematical Modeling to Predict Surface Roughness in CNC Milling

Authors: Ab. Rashid M.F.F., Gan S.Y., Muhammad N.Y.

Abstract:

Surface roughness (Ra) is one of the most important requirements in machining process. In order to obtain better surface roughness, the proper setting of cutting parameters is crucial before the process take place. This research presents the development of mathematical model for surface roughness prediction before milling process in order to evaluate the fitness of machining parameters; spindle speed, feed rate and depth of cut. 84 samples were run in this study by using FANUC CNC Milling α-Τ14ιE. Those samples were randomly divided into two data sets- the training sets (m=60) and testing sets(m=24). ANOVA analysis showed that at least one of the population regression coefficients was not zero. Multiple Regression Method was used to determine the correlation between a criterion variable and a combination of predictor variables. It was established that the surface roughness is most influenced by the feed rate. By using Multiple Regression Method equation, the average percentage deviation of the testing set was 9.8% and 9.7% for training data set. This showed that the statistical model could predict the surface roughness with about 90.2% accuracy of the testing data set and 90.3% accuracy of the training data set.

Keywords: Surface roughness, regression analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2105
1579 Multi-View Neural Network Based Gait Recognition

Authors: Saeid Fazli, Hadis Askarifar, Maryam Sheikh Shoaie

Abstract:

Human identification at a distance has recently gained growing interest from computer vision researchers. Gait recognition aims essentially to address this problem by identifying people based on the way they walk [1]. Gait recognition has 3 steps. The first step is preprocessing, the second step is feature extraction and the third one is classification. This paper focuses on the classification step that is essential to increase the CCR (Correct Classification Rate). Multilayer Perceptron (MLP) is used in this work. Neural Networks imitate the human brain to perform intelligent tasks [3].They can represent complicated relationships between input and output and acquire knowledge about these relationships directly from the data [2]. In this paper we apply MLP NN for 11 views in our database and compare the CCR values for these views. Experiments are performed with the NLPR databases, and the effectiveness of the proposed method for gait recognition is demonstrated.

Keywords: Human motion analysis, biometrics, gait recognition, principal component analysis, MLP neural network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2082
1578 Stature Prediction Model Based On Hand Anthropometry

Authors: Arunesh Chandra, Pankaj Chandna, Surinder Deswal, Rajesh Kumar Mishra, Rajender Kumar

Abstract:

The arm length, hand length, hand breadth and middle finger length of 1540 right-handed industrial workers of Haryana state was used to assess the relationship between the upper limb dimensions and stature. Initially, the data were analyzed using basic univariate analysis and independent t-tests; then simple and multiple linear regression models were used to estimate stature using SPSS (version 17). There was a positive correlation between upper limb measurements (hand length, hand breadth, arm length and middle finger length) and stature (p < 0.01), which was highest for hand length. The accuracy of stature prediction ranged from ± 54.897 mm to ± 58.307 mm. The use of multiple regression equations gave better results than simple regression equations. This study provides new forensic standards for stature estimation from the upper limb measurements of male industrial workers of Haryana (India). The results of this research indicate that stature can be determined using hand dimensions with accuracy, when only upper limb is available due to any reasons likewise explosions, train/plane crashes, mutilated bodies, etc. The regression formula derived in this study will be useful for anatomists, archaeologists, anthropologists, design engineers and forensic scientists for fairly prediction of stature using regression equations.

Keywords: Anthropometric dimensions, Forensic identification, Industrial workers, Stature prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2924
1577 A Spatial Hypergraph Based Semi-Supervised Band Selection Method for Hyperspectral Imagery Semantic Interpretation

Authors: Akrem Sellami, Imed Riadh Farah

Abstract:

Hyperspectral imagery (HSI) typically provides a wealth of information captured in a wide range of the electromagnetic spectrum for each pixel in the image. Hence, a pixel in HSI is a high-dimensional vector of intensities with a large spectral range and a high spectral resolution. Therefore, the semantic interpretation is a challenging task of HSI analysis. We focused in this paper on object classification as HSI semantic interpretation. However, HSI classification still faces some issues, among which are the following: The spatial variability of spectral signatures, the high number of spectral bands, and the high cost of true sample labeling. Therefore, the high number of spectral bands and the low number of training samples pose the problem of the curse of dimensionality. In order to resolve this problem, we propose to introduce the process of dimensionality reduction trying to improve the classification of HSI. The presented approach is a semi-supervised band selection method based on spatial hypergraph embedding model to represent higher order relationships with different weights of the spatial neighbors corresponding to the centroid of pixel. This semi-supervised band selection has been developed to select useful bands for object classification. The presented approach is evaluated on AVIRIS and ROSIS HSIs and compared to other dimensionality reduction methods. The experimental results demonstrate the efficacy of our approach compared to many existing dimensionality reduction methods for HSI classification.

Keywords: Hyperspectral image, spatial hypergraph, dimensionality reduction, semantic interpretation, band selection, feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1197
1576 Use of Regression Analysis in Determining the Length of Plastic Hinge in Reinforced Concrete Columns

Authors: Mehmet Alpaslan Köroğlu, Musa Hakan Arslan, Muslu Kazım Körez

Abstract:

Basic objective of this study is to create a regression analysis method that can estimate the length of a plastic hinge which is an important design parameter, by making use of the outcomes of (lateral load-lateral displacement hysteretic curves) the experimental studies conducted for the reinforced square concrete columns. For this aim, 170 different square reinforced concrete column tests results have been collected from the existing literature. The parameters which are thought affecting the plastic hinge length such as crosssection properties, features of material used, axial loading level, confinement of the column, longitudinal reinforcement bars in the columns etc. have been obtained from these 170 different square reinforced concrete column tests. In the study, when determining the length of plastic hinge, using the experimental test results, a regression analysis have been separately tested and compared with each other. In addition, the outcome of mentioned methods on determination of plastic hinge length of the reinforced concrete columns has been compared to other methods available in the literature.

Keywords: Columns, plastic hinge length, regression analysis, reinforced concrete.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4220
1575 Use of Gaussian-Euclidean Hybrid Function Based Artificial Immune System for Breast Cancer Diagnosis

Authors: Cuneyt Yucelbas, Seral Ozsen, Sule Yucelbas, Gulay Tezel

Abstract:

Due to the fact that there exist only a small number of complex systems in artificial immune system (AIS) that work out nonlinear problems, nonlinear AIS approaches, among the well-known solution techniques, need to be developed. Gaussian function is usually used as similarity estimation in classification problems and pattern recognition. In this study, diagnosis of breast cancer, the second type of the most widespread cancer in women, was performed with different distance calculation functions that euclidean, gaussian and gaussian-euclidean hybrid function in the clonal selection model of classical AIS on Wisconsin Breast Cancer Dataset (WBCD), which was taken from the University of California, Irvine Machine-Learning Repository. We used 3-fold cross validation method to train and test the dataset. According to the results, the maximum test classification accuracy was reported as 97.35% by using of gaussian-euclidean hybrid function for fold-3. Also, mean of test classification accuracies for all of functions were obtained as 94.78%, 94.45% and 95.31% with use of euclidean, gaussian and gaussian-euclidean, respectively. With these results, gaussian-euclidean hybrid function seems to be a potential distance calculation method, and it may be considered as an alternative distance calculation method for hard nonlinear classification problems.

Keywords: Artificial Immune System, Breast Cancer Diagnosis, Euclidean Function, Gaussian Function.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2095
1574 Modeling Aeration of Sharp Crested Weirs by Using Support Vector Machines

Authors: Arun Goel

Abstract:

The present paper attempts to investigate the prediction of air entrainment rate and aeration efficiency of a free overfall jets issuing from a triangular sharp crested weir by using regression based modelling. The empirical equations, Support vector machine (polynomial and radial basis function) models and the linear regression techniques were applied on the triangular sharp crested weirs relating the air entrainment rate and the aeration efficiency to the input parameters namely drop height, discharge, and vertex angle. It was observed that there exists a good agreement between the measured values and the values obtained using empirical equations, Support vector machine (Polynomial and rbf) models and the linear regression techniques. The test results demonstrated that the SVM based (Poly & rbf) model also provided acceptable prediction of the measured values with reasonable accuracy along with empirical equations and linear regression techniques in modelling the air entrainment rate and the aeration efficiency of a free overfall jets issuing from triangular sharp crested weir. Further sensitivity analysis has also been performed to study the impact of input parameter on the output in terms of air entrainment rate and aeration efficiency.

Keywords: Air entrainment rate, dissolved oxygen, regression, SVM, weir.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1934
1573 Clustering in WSN Based on Minimum Spanning Tree Using Divide and Conquer Approach

Authors: Uttam Vijay, Nitin Gupta

Abstract:

Due to heavy energy constraints in WSNs clustering is an efficient way to manage the energy in sensors. There are many methods already proposed in the area of clustering and research is still going on to make clustering more energy efficient. In our paper we are proposing a minimum spanning tree based clustering using divide and conquer approach. The MST based clustering was first proposed in 1970’s for large databases. Here we are taking divide and conquer approach and implementing it for wireless sensor networks with the constraints attached to the sensor networks. This Divide and conquer approach is implemented in a way that we don’t have to construct the whole MST before clustering but we just find the edge which will be the part of the MST to a corresponding graph and divide the graph in clusters there itself if that edge from the graph can be removed judging on certain constraints and hence saving lot of computation.

Keywords: Algorithm, Clustering, Edge-Weighted Graph, Weighted-LEACH.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2442
1572 A Research on Inference from Multiple Distance Variables in Hedonic Regression – Focus on Three Variables

Authors: Yan Wang, Yasushi Asami, Yukio Sadahiro

Abstract:

In urban context, urban nodes such as amenity or hazard will certainly affect house price, while classic hedonic analysis will employ distance variables measured from each urban nodes. However, effects from distances to facilities on house prices generally do not represent the true price of the property. Distance variables measured on the same surface are suffering a problem called multicollinearity, which is usually presented as magnitude variance and mean value in regression, errors caused by instability. In this paper, we provided a theoretical framework to identify and gather the data with less bias, and also provided specific sampling method on locating the sample region to avoid the spatial multicollinerity problem in three distance variable’s case.

Keywords: Hedonic regression, urban node, distance variables, multicollinerity, collinearity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1967
1571 Computational Identification of MicroRNAs and their Targets in two Species of Evergreen Spruce Tree (Picea)

Authors: Muhammad Y.K. Barozai, Ifthikhar A. Baloch, M. Din

Abstract:

MicroRNAs (miRNAs) are small, non-coding and regulatory RNAs about 20 to 24 nucleotides long. Their conserved nature among the various organisms makes them a good source of new miRNAs discovery by comparative genomics approach. The study resulted in 21 miRNAs of 20 pre-miRNAs belonging to 16 families (miR156, 157, 158, 164, 165, 168, 169, 172, 319, 390, 393, 394, 395, 400, 472 and 861) in evergreen spruce tree (Picea). The miRNA families; miR 157, 158, 164, 165, 168, 169, 319, 390, 393, 394, 400, 472 and 861 are reported for the first time in the Picea. All 20 miRNA precursors form stable minimum free energy stem-loop structure as their orthologues form in Arabidopsis and the mature miRNA reside in the stem portion of the stem loop structure. Sixteen (16) miRNAs are from Picea glauca and five (5) belong to Picea sitchensis. Their targets consist of transcription factors, growth related, stressed related and hypothetical proteins.

Keywords: BLAST, Comparative Genomics, Micro-RNAs, Spruce

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2023
1570 Adaptive Hierarchical Key Structure Generation for Key Management in Wireless Sensor Networks using A*

Authors: Jin Myoung Kim, Tae Ho Cho

Abstract:

Wireless Sensor networks have a wide spectrum of civil and military applications that call for secure communication such as the terrorist tracking, target surveillance in hostile environments. For the secure communication in these application areas, we propose a method for generating a hierarchical key structure for the efficient group key management. In this paper, we apply A* algorithm in generating a hierarchical key structure by considering the history data of the ratio of addition and eviction of sensor nodes in a location where sensor nodes are deployed. Thus generated key tree structure provides an efficient way of managing the group key in terms of energy consumption when addition and eviction event occurs. A* algorithm tries to minimize the number of messages needed for group key management by the history data. The experimentation with the tree shows efficiency of the proposed method.

Keywords: Heuristic search, key management, security, sensor network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1665
1569 A Comprehensive Method of Fault Detection and Isolation Based On Testability Modeling Data

Authors: Junyou Shi, Weiwei Cui

Abstract:

Testability modeling is a commonly used method in testability design and analysis of system. A dependency matrix will be obtained from testability modeling, and we will give a quantitative evaluation about fault detection and isolation. Based on the dependency matrix, we can obtain the diagnosis tree. The tree provides the procedures of the fault detection and isolation. But the dependency matrix usually includes built-in test (BIT) and manual test in fact. BIT runs the test automatically and is not limited by the procedures. The method above cannot give a more efficient diagnosis and use the advantages of the BIT. A Comprehensive method of fault detection and isolation is proposed. This method combines the advantages of the BIT and Manual test by splitting the matrix. The result of the case study shows that the method is effective.

Keywords: BIT, fault detection, fault isolation, testability modeling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1648
1568 Graph-Based Text Similarity Measurement by Exploiting Wikipedia as Background Knowledge

Authors: Lu Zhang, Chunping Li, Jun Liu, Hui Wang

Abstract:

Text similarity measurement is a fundamental issue in many textual applications such as document clustering, classification, summarization and question answering. However, prevailing approaches based on Vector Space Model (VSM) more or less suffer from the limitation of Bag of Words (BOW), which ignores the semantic relationship among words. Enriching document representation with background knowledge from Wikipedia is proven to be an effective way to solve this problem, but most existing methods still cannot avoid similar flaws of BOW in a new vector space. In this paper, we propose a novel text similarity measurement which goes beyond VSM and can find semantic affinity between documents. Specifically, it is a unified graph model that exploits Wikipedia as background knowledge and synthesizes both document representation and similarity computation. The experimental results on two different datasets show that our approach significantly improves VSM-based methods in both text clustering and classification.

Keywords: Text classification, Text clustering, Text similarity, Wikipedia

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2086