Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 137

Search results for: attribute

137 Relation between Significance of Attribute Set and Single Attribute

Authors: Xiuqin Ma, Norrozila Binti Sulaiman, Hongwu Qin

Abstract:

In the research field of Rough Set, few papers concern the significance of attribute set. However, there is important relation between the significance of single attribute and that of attribute set, which should not be ignored. In this paper, we draw conclusions by case analysis that (1) the attribute set including single attributes with high significance is certainly significant, while, (2)the attribute set which consists of single attributes with low significance possibly has high significance. We validate the conclusions on discernibility matrix and the results demonstrate the contribution of our conclusions.

Keywords: relation, attribute set, single attribute, rough set, significance

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1493
136 Dynamic Attribute Dependencies in Relational Attribute Grammars

Authors: K. Barbar, M. Dehayni, A. Awada, M. Smaili

Abstract:

Considering the theory of attribute grammars, we use logical formulas instead of traditional functional semantic rules. Following the decoration of a derivation tree, a suitable algorithm should maintain the consistency of the formulas together with the evaluation of the attributes. This may be a Prolog-like resolution, but this paper examines a somewhat different strategy, based on production specialization, local consistency and propagation: given a derivation tree, it is interactively decorated, i.e. incrementally checked and evaluated. The non-directed dependencies are dynamically directed during attribute evaluation.

Keywords: Input/Output attribute grammars, local consistency, logical programming, propagation, relational attribute grammars.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1239
135 A Design-Based Cohesion Metric for Object-Oriented Classes

Authors: Jehad Al Dallal

Abstract:

Class cohesion is an important object-oriented software quality attribute. It indicates how much the members in a class are related. Assessing the class cohesion and improving the class quality accordingly during the object-oriented design phase allows for cheaper management of the later phases. In this paper, the notion of distance between pairs of methods and pairs of attribute types in a class is introduced and used as a basis for introducing a novel class cohesion metric. The metric considers the methodmethod, attribute-attribute, and attribute-method direct interactions. It is shown that the metric gives more sensitive values than other well-known design-based class cohesion metrics.

Keywords: Object-oriented software quality, object-orienteddesign, class cohesion.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2030
134 Invariant Characters of Tolerance Class and Reduction under Homomorphism in IIS

Authors: Chen Wu, Lijuan Wang

Abstract:

Some invariant properties of incomplete information systems homomorphism are studied in this paper. Demand conditions of tolerance class, attribute reduction, indispensable attribute and dispensable attribute being invariant under homomorphism in incomplete information system are revealed and discussed. The existing condition of endohomomorphism on an incomplete information system is also explored. It establishes some theoretical foundations for further investigations on incomplete information systems in rough set theory, like in information systems.

Keywords: Attribute reduction, homomorphism, incomplete information system, rough set, tolerance relation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 497
133 Fuzzy Decision Making via Multiple Attribute

Authors: Behnaz Zohouri, Mahdi Zowghiand, Mohsen haghighi

Abstract:

In this paper, a method for decision making in fuzzy environment is presented.A new subjective and objective integrated approach is introduced that used to assign weight attributes in fuzzy multiple attribute decision making (FMADM) problems and alternatives and fmally ranked by proposed method.

Keywords: Multiple Attribute Decision Making, Triangular fuzzy numbers, ranking index, Fuzzy Entropy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1184
132 Studies on Properties of Knowledge Dependency and Reduction Algorithm in Tolerance Rough Set Model

Authors: Chen Wu, Lijuan Wang

Abstract:

Relation between tolerance class and indispensable attribute and knowledge dependency in rough set model with tolerance relation is explored. After giving definitions and concepts of knowledge dependency and knowledge dependency degree for incomplete information system in tolerance rough set model by distinguishing decision attribute containing missing attribute value or not, the result of maintaining reflectivity, transitivity, augmentation, decomposition law and merge law for complete knowledge dependency is proved. Knowledge dependency degrees (not complete knowledge dependency degrees) only satisfy some laws after transitivity, augmentation and decomposition operations. An algorithm to solve attribute reduction in an incomplete decision table is designed. The correctness is checked by an example.

Keywords: Incomplete information system, rough set, tolerance relation, knowledge dependence, attribute reduction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 493
131 An Attribute Based Access Control Model with POL Module for Dynamically Granting and Revoking Authorizations

Authors: Gang Liu, Huimin Song, Can Wang, Runnan Zhang, Lu Fang

Abstract:

Currently, resource sharing and system security are critical issues. This paper proposes a POL module composed of PRIV ILEGE attribute (PA), obligation and log which improves attribute based access control (ABAC) model in dynamically granting authorizations and revoking authorizations. The following describes the new model termed PABAC in terms of the POL module structure, attribute definitions, policy formulation and authorization architecture, which demonstrate the advantages of it. The POL module addresses the problems which are not predicted before and not described by access control policy. It can be one of the subject attributes or resource attributes according to the practical application, which enhances the flexibility of the model compared with ABAC. A scenario that illustrates how this model is applied to the real world is provided.

Keywords: Access control, attribute based access control, granting authorizations, privilege, revoking authorizations, system security.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 706
130 Heterogeneous Attribute Reduction in Noisy System based on a Generalized Neighborhood Rough Sets Model

Authors: Siyuan Jing, Kun She

Abstract:

Neighborhood Rough Sets (NRS) has been proven to be an efficient tool for heterogeneous attribute reduction. However, most of researches are focused on dealing with complete and noiseless data. Factually, most of the information systems are noisy, namely, filled with incomplete data and inconsistent data. In this paper, we introduce a generalized neighborhood rough sets model, called VPTNRS, to deal with the problem of heterogeneous attribute reduction in noisy system. We generalize classical NRS model with tolerance neighborhood relation and the probabilistic theory. Furthermore, we use the neighborhood dependency to evaluate the significance of a subset of heterogeneous attributes and construct a forward greedy algorithm for attribute reduction based on it. Experimental results show that the model is efficient to deal with noisy data.

Keywords: attribute reduction, incomplete data, inconsistent data, tolerance neighborhood relation, rough sets

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1349
129 Attribute Selection Methods Comparison for Classification of Diffuse Large B-Cell Lymphoma

Authors: Helyane Bronoski Borges, Júlio Cesar Nievola

Abstract:

The most important subtype of non-Hodgkin-s lymphoma is the Diffuse Large B-Cell Lymphoma. Approximately 40% of the patients suffering from it respond well to therapy, whereas the remainder needs a more aggressive treatment, in order to better their chances of survival. Data Mining techniques have helped to identify the class of the lymphoma in an efficient manner. Despite that, thousands of genes should be processed to obtain the results. This paper presents a comparison of the use of various attribute selection methods aiming to reduce the number of genes to be searched, looking for a more effective procedure as a whole.

Keywords: Attribute selection, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1166
128 A Family of Entropies on Interval-valued Intuitionistic Fuzzy Sets and Their Applications in Multiple Attribute Decision Making

Authors: Min Sun, Jing Liu

Abstract:

The entropy of intuitionistic fuzzy sets is used to indicate the degree of fuzziness of an interval-valued intuitionistic fuzzy set(IvIFS). In this paper, we deal with the entropies of IvIFS. Firstly, we propose a family of entropies on IvIFS with a parameter λ ∈ [0, 1], which generalize two entropy measures defined independently by Zhang and Wei, for IvIFS, and then we prove that the new entropy is an increasing function with respect to the parameter λ. Furthermore, a new multiple attribute decision making (MADM) method using entropy-based attribute weights is proposed to deal with the decision making situations where the alternatives on attributes are expressed by IvIFS and the attribute weights information is unknown. Finally, a numerical example is given to illustrate the applications of the proposed method.

Keywords: Interval-valued intuitionistic fuzzy sets, intervalvalued intuitionistic fuzzy entropy, multiple attribute decision making

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1424
127 Attribute Weighted Class Complexity: A New Metric for Measuring Cognitive Complexity of OO Systems

Authors: Dr. L. Arockiam, A. Aloysius

Abstract:

In general, class complexity is measured based on any one of these factors such as Line of Codes (LOC), Functional points (FP), Number of Methods (NOM), Number of Attributes (NOA) and so on. There are several new techniques, methods and metrics with the different factors that are to be developed by the researchers for calculating the complexity of the class in Object Oriented (OO) software. Earlier, Arockiam et.al has proposed a new complexity measure namely Extended Weighted Class Complexity (EWCC) which is an extension of Weighted Class Complexity which is proposed by Mishra et.al. EWCC is the sum of cognitive weights of attributes and methods of the class and that of the classes derived. In EWCC, a cognitive weight of each attribute is considered to be 1. The main problem in EWCC metric is that, every attribute holds the same value but in general, cognitive load in understanding the different types of attributes cannot be the same. So here, we are proposing a new metric namely Attribute Weighted Class Complexity (AWCC). In AWCC, the cognitive weights have to be assigned for the attributes which are derived from the effort needed to understand their data types. The proposed metric has been proved to be a better measure of complexity of class with attributes through the case studies and experiments

Keywords: Software Complexity, Attribute Weighted Class Complexity, Weighted Class Complexity, Data Type

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1644
126 Some Investigations on Higher Mathematics Scores for Chinese University Student

Authors: Xun Ge, Jingju Qian

Abstract:

To investigate some relations between higher mathe¬matics scores in Chinese graduate student entrance examination and calculus (resp. linear algebra, probability statistics) scores in subject's completion examination of Chinese university, we select 20 students as a sample, take higher mathematics score as a decision attribute and take calculus score, linear algebra score, probability statistics score as condition attributes. In this paper, we are based on rough-set theory (Rough-set theory is a logic-mathematical method proposed by Z. Pawlak. In recent years, this theory has been widely implemented in the many fields of natural science and societal science.) to investigate importance of condition attributes with respective to decision attribute and strength of condition attributes supporting decision attribute. Results of this investigation will be helpful for university students to raise higher mathematics scores in Chinese graduate student entrance examination.

Keywords: Rough set, higher mathematics scores, decision attribute, condition attribute.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 866
125 Fuzzy Population-Based Meta-Heuristic Approaches for Attribute Reduction in Rough Set Theory

Authors: Mafarja Majdi, Salwani Abdullah, Najmeh S. Jaddi

Abstract:

One of the global combinatorial optimization problems in machine learning is feature selection. It concerned with removing the irrelevant, noisy, and redundant data, along with keeping the original meaning of the original data. Attribute reduction in rough set theory is an important feature selection method. Since attribute reduction is an NP-hard problem, it is necessary to investigate fast and effective approximate algorithms. In this paper, we proposed two feature selection mechanisms based on memetic algorithms (MAs) which combine the genetic algorithm with a fuzzy record to record travel algorithm and a fuzzy controlled great deluge algorithm, to identify a good balance between local search and genetic search. In order to verify the proposed approaches, numerical experiments are carried out on thirteen datasets. The results show that the MAs approaches are efficient in solving attribute reduction problems when compared with other meta-heuristic approaches.

Keywords: Rough Set Theory, Attribute Reduction, Fuzzy Logic, Memetic Algorithms, Record to Record Algorithm, Great Deluge Algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1713
124 A Combined Cipher Text Policy Attribute-Based Encryption and Timed-Release Encryption Method for Securing Medical Data in Cloud

Authors: G. Shruthi, Purohit Shrinivasacharya

Abstract:

The biggest problem in cloud is securing an outsourcing data. A cloud environment cannot be considered to be trusted. It becomes more challenging when outsourced data sources are managed by multiple outsourcers with different access rights. Several methods have been proposed to protect data confidentiality against the cloud service provider to support fine-grained data access control. We propose a method with combined Cipher Text Policy Attribute-based Encryption (CP-ABE) and Timed-release encryption (TRE) secure method to control medical data storage in public cloud.

Keywords: Attribute, encryption, security, trapdoor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 404
123 Examining the Value of Attribute Scores for Author-Supplied Keyphrases in Automatic Keyphrase Extraction

Authors: Vicky Min-How Lim, Siew Fan Wong, Tong Ming Lim

Abstract:

Automatic keyphrase extraction is useful in efficiently locating specific documents in online databases. While several techniques have been introduced over the years, improvement on accuracy rate is minimal. This research examines attribute scores for author-supplied keyphrases to better understand how the scores affect the accuracy rate of automatic keyphrase extraction. Five attributes are chosen for examination: Term Frequency, First Occurrence, Last Occurrence, Phrase Position in Sentences, and Term Cohesion Degree. The results show that First Occurrence is the most reliable attribute. Term Frequency, Last Occurrence and Term Cohesion Degree display a wide range of variation but are still usable with suggested tweaks. Only Phrase Position in Sentences shows a totally unpredictable pattern. The results imply that the commonly used ranking approach which directly extracts top ranked potential phrases from candidate keyphrase list as the keyphrases may not be reliable.

Keywords: Accuracy, Attribute Score, Author-supplied keyphrases, Automatic keyphrase extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1117
122 Further Investigations on Higher Mathematics Scores for Chinese University Students

Authors: Xun Ge

Abstract:

Recently, X. Ge and J. Qian investigated some relations between higher mathematics scores and calculus scores (resp. linear algebra scores, probability statistics scores) for Chinese university students. Based on rough-set theory, they established an information system S = (U,CuD,V, f). In this information system, higher mathematics score was taken as a decision attribute and calculus score, linear algebra score, probability statistics score were taken as condition attributes. They investigated importance of each condition attribute with respective to decision attribute and strength of each condition attribute supporting decision attribute. In this paper, we give further investigations for this issue. Based on the above information system S = (U, CU D, V, f), we analyze the decision rules between condition and decision granules. For each x E U, we obtain support (resp. strength, certainty factor, coverage factor) of the decision rule C —>x D, where C —>x D is the decision rule induced by x in S = (U, CU D, V, f). Results of this paper gives new analysis of on higher mathematics scores for Chinese university students, which can further lead Chinese university students to raise higher mathematics scores in Chinese graduate student entrance examination.

Keywords: Rough set, support, strength, certainty factor, coverage factor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1118
121 Intelligent Recognition of Diabetes Disease via FCM Based Attribute Weighting

Authors: Kemal Polat

Abstract:

In this paper, an attribute weighting method called fuzzy C-means clustering based attribute weighting (FCMAW) for classification of Diabetes disease dataset has been used. The aims of this study are to reduce the variance within attributes of diabetes dataset and to improve the classification accuracy of classifier algorithm transforming from non-linear separable datasets to linearly separable datasets. Pima Indians Diabetes dataset has two classes including normal subjects (500 instances) and diabetes subjects (268 instances). Fuzzy C-means clustering is an improved version of K-means clustering method and is one of most used clustering methods in data mining and machine learning applications. In this study, as the first stage, fuzzy C-means clustering process has been used for finding the centers of attributes in Pima Indians diabetes dataset and then weighted the dataset according to the ratios of the means of attributes to centers of theirs. Secondly, after weighting process, the classifier algorithms including support vector machine (SVM) and k-NN (k- nearest neighbor) classifiers have been used for classifying weighted Pima Indians diabetes dataset. Experimental results show that the proposed attribute weighting method (FCMAW) has obtained very promising results in the classification of Pima Indians diabetes dataset.

Keywords: Fuzzy C-means clustering, Fuzzy C-means clustering based attribute weighting, Pima Indians diabetes dataset, SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1449
120 A Multi-Attribute Utility Model for Performance Evaluation of Sustainable Banking

Authors: Sonia Rebai, Mohamed Naceur Azaiez, Dhafer Saidane

Abstract:

In this study, we develop a performance evaluation model based on a multi-attribute utility approach aiming at reaching the sustainable banking (SB) status. This model is built accounting for various banks’ stakeholders in a win-win paradigm. In addition, it offers the opportunity for adopting a global measure of performance as an indication of a bank’s sustainability degree. This measure is referred to as banking sustainability performance index (BSPI). This index may constitute a basis for ranking banks. Moreover, it may constitute a bridge between the assessment types of financial and extra-financial rating agencies. A real application is performed on three French banks.

Keywords: Multi-attribute utility theory, Performance, Sustainable banking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1954
119 Sensory Evaluation of the Selected Coffee Products Using Fuzzy Approach

Authors: M.A. Lazim, M. Suriani

Abstract:

Knowing consumers' preferences and perceptions of the sensory evaluation of drink products are very significant to manufacturers and retailers alike. With no appropriate sensory analysis, there is a high risk of market disappointment. This paper aims to rank the selected coffee products and also to determine the best of quality attribute through sensory evaluation using fuzzy decision making model. Three products of coffee drinks were used for sensory evaluation. Data were collected from thirty judges at a hypermarket in Kuala Terengganu, Malaysia. The judges were asked to specify their sensory evaluation in linguistic terms of the quality attributes of colour, smell, taste and mouth feel for each product and also the weight of each quality attribute. Five fuzzy linguistic terms represent the quality attributes were introduced prior analysing. The judgment membership function and the weights were compared to rank the products and also to determine the best quality attribute. The product of Indoc was judged as the first in ranking and 'taste' as the best quality attribute. These implicate the importance of sensory evaluation in identifying consumers- preferences and also the competency of fuzzy approach in decision making.

Keywords: fuzzy decision making, fuzzy linguistic, membership function, sensory evaluation,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2520
118 Evaluation of Attribute II Bt Sweet Corn Resistance and Reduced-Risk Insecticide Applications for Control of Corn Earworm

Authors: R. Weinzierl, R. Estes, N. Tinsley, M. Keshlaf

Abstract:

The corn earworm, Helicoverpa zea Boddie, is a serious pest of corn. Larval feeding in ear tips destroys kernels and allows growth of fungi and production of mycotoxins. Infested sweet corn is not marketable. Development of improved transgenic hybrids expressing insecticidal toxins from Bacillus thuringiensis (Bt) may limit or prevent crop losses. The effectiveness of Attribute® II Bt resistance and applications of Voliam Xpress insecticide were evaluated for effectiveness in controlling corn earworm in plots near Urbana, IL, USA, in 2013. Where no insecticides were applied, ear infestations and kernel damage in Attribute® II ‘Protector’ plots were consistently lower (near zero) than in plots of the non-Bt isoline ‘Garrison.’ Multiple applications of Voliam Xpress significantly reduced the number of corn earworm larvae and kernel damage in the Garrison plots, but infestations and damage in these plots were greater than in Protectorplots that did not receive insecticide applications. Our results indicate that Attribute® II Bt resistance is more effective than multiple applications of an insecticide for preventing losses caused by corn earworm in sweet corn.

Keywords: Bacillus thuringiensis, Helicoverpa zea, insect pest management, transgenic sweet corn.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1945
117 Attribute Based Comparison and Selection of Modular Self-Reconfigurable Robot Using Multiple Attribute Decision Making Approach

Authors: Manpreet Singh, V. P. Agrawal, Gurmanjot Singh Bhatti

Abstract:

From the last decades, there is a significant technological advancement in the field of robotics, and a number of modular self-reconfigurable robots were introduced that can help in space exploration, bucket to stuff, search, and rescue operation during earthquake, etc. As there are numbers of self-reconfigurable robots, choosing the optimum one is always a concern for robot user since there is an increase in available features, facilities, complexity, etc. The objective of this research work is to present a multiple attribute decision making based methodology for coding, evaluation, comparison ranking and selection of modular self-reconfigurable robots using a technique for order preferences by similarity to ideal solution approach. However, 86 attributes that affect the structure and performance are identified. A database for modular self-reconfigurable robot on the basis of different pertinent attribute is generated. This database is very useful for the user, for selecting a robot that suits their operational needs. Two visual methods namely linear graph and spider chart are proposed for ranking of modular self-reconfigurable robots. Using five robots (Atron, Smores, Polybot, M-Tran 3, Superbot), an example is illustrated, and raking of the robots is successfully done, which shows that Smores is the best robot for the operational need illustrated, and this methodology is found to be very effective and simple to use.

Keywords: Self-reconfigurable robots, MADM, TOPSIS, morphogenesis, scalability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 546
116 Research on Urban Point of Interest Generalization Method Based on Mapping Presentation

Authors: Chengming Li, Yong Yin, Peipei Guo, Xiaoli Liu

Abstract:

Without taking account of the attribute richness of POI (point of interest) data and spatial distribution limited by roads, a POI generalization method considering both attribute information and spatial distribution has been proposed against the existing point generalization algorithm merely focusing on overall information of point groups. Hierarchical characteristic of urban POI information expression has been firstly analyzed to point out the measurement feature of the corresponding hierarchy. On this basis, an urban POI generalizing strategy has been put forward: POIs urban road network have been divided into three distribution pattern; corresponding generalization methods have been proposed according to the characteristic of POI data in different distribution patterns. Experimental results showed that the method taking into account both attribute information and spatial distribution characteristics of POI can better implement urban POI generalization in the mapping presentation.

Keywords: POI, Road network, spatial information expression, selection method, distribution pattern.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 683
115 An Attribute-Centre Based Decision Tree Classification Algorithm

Authors: Gökhan Silahtaroğlu

Abstract:

Decision tree algorithms have very important place at classification model of data mining. In literature, algorithms use entropy concept or gini index to form the tree. The shape of the classes and their closeness to each other some of the factors that affect the performance of the algorithm. In this paper we introduce a new decision tree algorithm which employs data (attribute) folding method and variation of the class variables over the branches to be created. A comparative performance analysis has been held between the proposed algorithm and C4.5.

Keywords: Classification, decision tree, split, pruning, entropy, gini.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1154
114 Computing Entropy for Ortholog Detection

Authors: Hsing-Kuo Pao, John Case

Abstract:

Biological sequences from different species are called or-thologs if they evolved from a sequence of a common ancestor species and they have the same biological function. Approximations of Kolmogorov complexity or entropy of biological sequences are already well known to be useful in extracting similarity information between such sequences -in the interest, for example, of ortholog detection. As is well known, the exact Kolmogorov complexity is not algorithmically computable. In prac-tice one can approximate it by computable compression methods. How-ever, such compression methods do not provide a good approximation to Kolmogorov complexity for short sequences. Herein is suggested a new ap-proach to overcome the problem that compression approximations may notwork well on short sequences. This approach is inspired by new, conditional computations of Kolmogorov entropy. A main contribution of the empir-ical work described shows the new set of entropy-based machine learning attributes provides good separation between positive (ortholog) and nega-tive (non-ortholog) data - better than with good, previously known alter-natives (which do not employ some means to handle short sequences well).Also empirically compared are the new entropy based attribute set and a number of other, more standard similarity attributes sets commonly used in genomic analysis. The various similarity attributes are evaluated by cross validation, through boosted decision tree induction C5.0, and by Receiver Operating Characteristic (ROC) analysis. The results point to the conclu-sion: the new, entropy based attribute set by itself is not the one giving the best prediction; however, it is the best attribute set for use in improving the other, standard attribute sets when conjoined with them.

Keywords: compression, decision tree, entropy, ortholog, ROC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1602
113 Adaptive Network Intrusion Detection Learning: Attribute Selection and Classification

Authors: Dewan Md. Farid, Jerome Darmont, Nouria Harbi, Nguyen Huu Hoa, Mohammad Zahidur Rahman

Abstract:

In this paper, a new learning approach for network intrusion detection using naïve Bayesian classifier and ID3 algorithm is presented, which identifies effective attributes from the training dataset, calculates the conditional probabilities for the best attribute values, and then correctly classifies all the examples of training and testing dataset. Most of the current intrusion detection datasets are dynamic, complex and contain large number of attributes. Some of the attributes may be redundant or contribute little for detection making. It has been successfully tested that significant attribute selection is important to design a real world intrusion detection systems (IDS). The purpose of this study is to identify effective attributes from the training dataset to build a classifier for network intrusion detection using data mining algorithms. The experimental results on KDD99 benchmark intrusion detection dataset demonstrate that this new approach achieves high classification rates and reduce false positives using limited computational resources.

Keywords: Attributes selection, Conditional probabilities, information gain, network intrusion detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2460
112 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of typical cleaning process, puts forward a well scalability and versatility data cleaning framework, in view of data with attribute dependency relation, designs several of violation data discovery algorithms by formal formula, which can obtain inconsistent data to all target columns with condition attribute dependent no matter data is structured (SQL) or unstructured (NoSql), and gives 6 data cleaning methods based on these algorithms.

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2071
111 Rethinking the Analysis of Means-End Chain Data in Marketing Research

Authors: P. Puustinen, A. Kanto

Abstract:

This paper proposes a new procedure for analyzing means-end chain data in marketing research. Most commonly the collected data is summarized in the Hierarchical Value Map (HVM) illustrating the main attribute-consequence-value linkages. This paper argues that traditionally constructed HVM may give an erroneous impression of the results of a means-end study. To justify the arguments, an alternative procedure to (1) determine the dominant attribute-consequence-value linkages and (2) construct HVM in a precise manner is presented. The current approach makes a contribution to means-end analysis, allowing marketers to address a set of marketing problems, such as advertising strategy.

Keywords: Means-end chain analysis, Laddering, Hierarchical Value Map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2517
110 Improved K-Modes for Categorical Clustering Using Weighted Dissimilarity Measure

Authors: S.Aranganayagi, K.Thangavel

Abstract:

K-Modes is an extension of K-Means clustering algorithm, developed to cluster the categorical data, where the mean is replaced by the mode. The similarity measure proposed by Huang is the simple matching or mismatching measure. Weight of attribute values contribute much in clustering; thus in this paper we propose a new weighted dissimilarity measure for K-Modes, based on the ratio of frequency of attribute values in the cluster and in the data set. The new weighted measure is experimented with the data sets obtained from the UCI data repository. The results are compared with K-Modes and K-representative, which show that the new measure generates clusters with high purity.

Keywords: Clustering, categorical data, K-Modes, weighted dissimilarity measure

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3438
109 An Interval-Based Multi-Attribute Decision Making Approach for Electric Utility Resource Planning

Authors: M. Sedighizadeh, A. Rezazadeh

Abstract:

This paper presents an interval-based multi-attribute decision making (MADM) approach in support of the decision process with imprecise information. The proposed decision methodology is based on the model of linear additive utility function but extends the problem formulation with the measure of composite utility variance. A sample study concerning with the evaluation of electric generation expansion strategies is provided showing how the imprecise data may affect the choice toward the best solution and how a set of alternatives, acceptable to the decision maker (DM), may be identified with certain confidence.

Keywords: Decision Making, Power Generation, ElectricUtilities, Resource Planning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1363
108 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Application of five implementations of three data mining classification techniques was experimented for extracting important insights from tourism data. The aim was to find out the best performing algorithm among the compared ones for tourism knowledge discovery. Knowledge discovery process from data was used as a process model. 10-fold cross validation method is used for testing purpose. Various data preprocessing activities were performed to get the final dataset for model building. Classification models of the selected algorithms were built with different scenarios on the preprocessed dataset. The outperformed algorithm tourism dataset was Random Forest (76%) before applying information gain based attribute selection and J48 (C4.5) (75%) after selection of top relevant attributes to the class (target) attribute. In terms of time for model building, attribute selection improves the efficiency of all algorithms. Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented, which showed intricate, non-trivial knowledge/insight that would otherwise not be discovered by simple statistical analysis with mediocre accuracy of the machine using classification algorithms.

Keywords: Classification algorithms; data mining; tourism; knowledge discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2119