Search results for: attribute ranking.

295 Some Investigations on Higher Mathematics Scores for Chinese University Student

Abstract:

To investigate some relations between higher mathe¬matics scores in Chinese graduate student entrance examination and calculus (resp. linear algebra, probability statistics) scores in subject's completion examination of Chinese university, we select 20 students as a sample, take higher mathematics score as a decision attribute and take calculus score, linear algebra score, probability statistics score as condition attributes. In this paper, we are based on rough-set theory (Rough-set theory is a logic-mathematical method proposed by Z. Pawlak. In recent years, this theory has been widely implemented in the many fields of natural science and societal science.) to investigate importance of condition attributes with respective to decision attribute and strength of condition attributes supporting decision attribute. Results of this investigation will be helpful for university students to raise higher mathematics scores in Chinese graduate student entrance examination.

Keywords: Rough set, higher mathematics scores, decision attribute, condition attribute.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1045

294 Compromise Ratio Method for Decision Making under Fuzzy Environment using Fuzzy Distance Measure

Authors: Debashree Guha, Debjani Chakraborty

Abstract:

The aim of this paper is to adopt a compromise ratio (CR) methodology for fuzzy multi-attribute single-expert decision making proble. In this paper, the rating of each alternative has been described by linguistic terms, which can be expressed as triangular fuzzy numbers. The compromise ratio method for fuzzy multi-attribute single expert decision making has been considered here by taking the ranking index based on the concept that the chosen alternative should be as close as possible to the ideal solution and as far away as possible from the negative-ideal solution simultaneously. From logical point of view, the distance between two triangular fuzzy numbers also is a fuzzy number, not a crisp value. Therefore a fuzzy distance measure, which is itself a fuzzy number, has been used here to calculate the difference between two triangular fuzzy numbers. Now in this paper, with the help of this fuzzy distance measure, it has been shown that the compromise ratio is a fuzzy number and this eases the problem of the decision maker to take the decision. The computation principle and the procedure of the compromise ratio method have been described in detail in this paper. A comparative analysis of the compromise ratio method previously proposed [1] and the newly adopted method have been illustrated with two numerical examples.

Keywords: Compromise ratio method, Fuzzy multi-attributesingle-expert decision making, Fuzzy number, Linguistic variable

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1363

293 Fuzzy Population-Based Meta-Heuristic Approaches for Attribute Reduction in Rough Set Theory

Authors: Mafarja Majdi, Salwani Abdullah, Najmeh S. Jaddi

Abstract:

One of the global combinatorial optimization problems in machine learning is feature selection. It concerned with removing the irrelevant, noisy, and redundant data, along with keeping the original meaning of the original data. Attribute reduction in rough set theory is an important feature selection method. Since attribute reduction is an NP-hard problem, it is necessary to investigate fast and effective approximate algorithms. In this paper, we proposed two feature selection mechanisms based on memetic algorithms (MAs) which combine the genetic algorithm with a fuzzy record to record travel algorithm and a fuzzy controlled great deluge algorithm, to identify a good balance between local search and genetic search. In order to verify the proposed approaches, numerical experiments are carried out on thirteen datasets. The results show that the MAs approaches are efficient in solving attribute reduction problems when compared with other meta-heuristic approaches.

Keywords: Rough Set Theory, Attribute Reduction, Fuzzy Logic, Memetic Algorithms, Record to Record Algorithm, Great Deluge Algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1893

292 A Centroid Ranking Approach Based Fuzzy MCDM Model

Authors: T. C. Chu, S.H. Wu

Abstract:

This paper suggests ranking alternatives under fuzzy MCDM (multiple criteria decision making) via an centroid based ranking approach, where criteria are classified to benefit qualitative, benefit quantitative and cost quantitative ones. The ratings of alternatives versus qualitative criteria and the importance weights of all criteria are assessed in linguistic values represented by fuzzy numbers. The membership function for the final fuzzy evaluation value of each alternative can be developed through α-cuts and interval arithmetic of fuzzy numbers. The distance between the original point and the relative centroid is applied to defuzzify the final fuzzy evaluation values in order to rank alternatives. Finally a numerical example demonstrates the computation procedure of the proposed model.

Keywords: Fuzzy MCDM, Criteria, Fuzzy number, Ranking, Relative centroid.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1633

291 Multidimensional Compromise Optimization for Development Ranking of the Gulf Cooperation Council Countries and Turkey

Authors: C. Ardil

Abstract:

In this research, a multidimensional compromise optimization method is proposed for multidimensional decision making analysis in the development ranking of the Gulf Cooperation Council Countries and Turkey. The proposed approach presents ranking solutions resulting from different multicriteria decision analyses, which yield different ranking orders for the same ranking problem, consisting of a set of alternatives in terms of numerous competing criteria when they are applied with the same numerical data. The multiobjective optimization decision making problem is considered in three sequential steps. In the first step, five different criteria related to the development ranking are gathered from the research field. In the second step, identified evaluation criteria are, objectively, weighted using standard deviation procedure. In the third step, a country selection problem is illustrated with a numerical example as an application of the proposed multidimensional compromise optimization model. Finally, multidimensional compromise optimization approach is applied to rank the Gulf Cooperation Council Countries and Turkey.

Keywords: Standard deviation, performance evaluation, multicriteria decision making, multidimensional compromise optimization, vector normalization, multicriteria decision making, multicriteria analysis, multidimensional decision analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 758

290 Ranking - Convex Risk Minimization

Authors: Wojciech Rejchel

Abstract:

The problem of ranking (rank regression) has become popular in the machine learning community. This theory relates to problems, in which one has to predict (guess) the order between objects on the basis of vectors describing their observed features. In many ranking algorithms a convex loss function is used instead of the 0-1 loss. It makes these procedures computationally efficient. Hence, convex risk minimizers and their statistical properties are investigated in this paper. Fast rates of convergence are obtained under conditions, that look similarly to the ones from the classification theory. Methods used in this paper come from the theory of U-processes as well as empirical processes.

Keywords: Convex loss function, empirical risk minimization, empirical process, U-process, boosting, euclidean family.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1372

289 A Combined Cipher Text Policy Attribute-Based Encryption and Timed-Release Encryption Method for Securing Medical Data in Cloud

Authors: G. Shruthi, Purohit Shrinivasacharya

Abstract:

The biggest problem in cloud is securing an outsourcing data. A cloud environment cannot be considered to be trusted. It becomes more challenging when outsourced data sources are managed by multiple outsourcers with different access rights. Several methods have been proposed to protect data confidentiality against the cloud service provider to support fine-grained data access control. We propose a method with combined Cipher Text Policy Attribute-based Encryption (CP-ABE) and Timed-release encryption (TRE) secure method to control medical data storage in public cloud.

Keywords: Attribute, encryption, security, trapdoor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 691

288 A Hybrid Ontology Based Approach for Ranking Documents

Authors: Sarah Motiee, Azadeh Nematzadeh, Mehrnoush Shamsfard

Abstract:

Increasing growth of information volume in the internet causes an increasing need to develop new (semi)automatic methods for retrieval of documents and ranking them according to their relevance to the user query. In this paper, after a brief review on ranking models, a new ontology based approach for ranking HTML documents is proposed and evaluated in various circumstances. Our approach is a combination of conceptual, statistical and linguistic methods. This combination reserves the precision of ranking without loosing the speed. Our approach exploits natural language processing techniques to extract phrases from documents and the query and doing stemming on words. Then an ontology based conceptual method will be used to annotate documents and expand the query. To expand a query the spread activation algorithm is improved so that the expansion can be done flexible and in various aspects. The annotated documents and the expanded query will be processed to compute the relevance degree exploiting statistical methods. The outstanding features of our approach are (1) combining conceptual, statistical and linguistic features of documents, (2) expanding the query with its related concepts before comparing to documents, (3) extracting and using both words and phrases to compute relevance degree, (4) improving the spread activation algorithm to do the expansion based on weighted combination of different conceptual relationships and (5) allowing variable document vector dimensions. A ranking system called ORank is developed to implement and test the proposed model. The test results will be included at the end of the paper.

Keywords: Document ranking, Ontology, Spread activation algorithm, Annotation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1572

287 Using a Semantic Self-Organising Web Page-Ranking Mechanism for Public Administration and Education

Authors: Marios Poulos, Sozon Papavlasopoulos, V. S. Belesiotis

Abstract:

In the proposed method for Web page-ranking, a novel theoretic model is introduced and tested by examples of order relationships among IP addresses. Ranking is induced using a convexity feature, which is learned according to these examples using a self-organizing procedure. We consider the problem of selforganizing learning from IP data to be represented by a semi-random convex polygon procedure, in which the vertices correspond to IP addresses. Based on recent developments in our regularization theory for convex polygons and corresponding Euclidean distance based methods for classification, we develop an algorithmic framework for learning ranking functions based on a Computational Geometric Theory. We show that our algorithm is generic, and present experimental results explaining the potential of our approach. In addition, we explain the generality of our approach by showing its possible use as a visualization tool for data obtained from diverse domains, such as Public Administration and Education.

Keywords: Computational Geometry, Education, e-Governance, Semantic Web.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1716

286 ORank: An Ontology Based System for Ranking Documents

Authors: Mehrnoush Shamsfard, Azadeh Nematzadeh, Sarah Motiee

Abstract:

Increasing growth of information volume in the internet causes an increasing need to develop new (semi)automatic methods for retrieval of documents and ranking them according to their relevance to the user query. In this paper, after a brief review on ranking models, a new ontology based approach for ranking HTML documents is proposed and evaluated in various circumstances. Our approach is a combination of conceptual, statistical and linguistic methods. This combination reserves the precision of ranking without loosing the speed. Our approach exploits natural language processing techniques for extracting phrases and stemming words. Then an ontology based conceptual method will be used to annotate documents and expand the query. To expand a query the spread activation algorithm is improved so that the expansion can be done in various aspects. The annotated documents and the expanded query will be processed to compute the relevance degree exploiting statistical methods. The outstanding features of our approach are (1) combining conceptual, statistical and linguistic features of documents, (2) expanding the query with its related concepts before comparing to documents, (3) extracting and using both words and phrases to compute relevance degree, (4) improving the spread activation algorithm to do the expansion based on weighted combination of different conceptual relationships and (5) allowing variable document vector dimensions. A ranking system called ORank is developed to implement and test the proposed model. The test results will be included at the end of the paper.

Keywords: Document ranking, Ontology, Spread activation algorithm, Annotation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1835

285 Algorithm for Information Retrieval Optimization

Authors: Kehinde K. Agbele, Kehinde Daniel Aruleba, Eniafe F. Ayetiran

Abstract:

When using Information Retrieval Systems (IRS), users often present search queries made of ad-hoc keywords. It is then up to the IRS to obtain a precise representation of the user’s information need and the context of the information. This paper investigates optimization of IRS to individual information needs in order of relevance. The study addressed development of algorithms that optimize the ranking of documents retrieved from IRS. This study discusses and describes a Document Ranking Optimization (DROPT) algorithm for information retrieval (IR) in an Internet-based or designated databases environment. Conversely, as the volume of information available online and in designated databases is growing continuously, ranking algorithms can play a major role in the context of search results. In this paper, a DROPT technique for documents retrieved from a corpus is developed with respect to document index keywords and the query vectors. This is based on calculating the weight (

Keywords: Internet ranking,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1431

284 Further Investigations on Higher Mathematics Scores for Chinese University Students

Authors: Xun Ge

Abstract:

Recently, X. Ge and J. Qian investigated some relations between higher mathematics scores and calculus scores (resp. linear algebra scores, probability statistics scores) for Chinese university students. Based on rough-set theory, they established an information system S = (U,CuD,V, f). In this information system, higher mathematics score was taken as a decision attribute and calculus score, linear algebra score, probability statistics score were taken as condition attributes. They investigated importance of each condition attribute with respective to decision attribute and strength of each condition attribute supporting decision attribute. In this paper, we give further investigations for this issue. Based on the above information system S = (U, CU D, V, f), we analyze the decision rules between condition and decision granules. For each x E U, we obtain support (resp. strength, certainty factor, coverage factor) of the decision rule C —>x D, where C —>x D is the decision rule induced by x in S = (U, CU D, V, f). Results of this paper gives new analysis of on higher mathematics scores for Chinese university students, which can further lead Chinese university students to raise higher mathematics scores in Chinese graduate student entrance examination.

Keywords: Rough set, support, strength, certainty factor, coverage factor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1318

283 Intelligent Recognition of Diabetes Disease via FCM Based Attribute Weighting

Authors: Kemal Polat

Abstract:

In this paper, an attribute weighting method called fuzzy C-means clustering based attribute weighting (FCMAW) for classification of Diabetes disease dataset has been used. The aims of this study are to reduce the variance within attributes of diabetes dataset and to improve the classification accuracy of classifier algorithm transforming from non-linear separable datasets to linearly separable datasets. Pima Indians Diabetes dataset has two classes including normal subjects (500 instances) and diabetes subjects (268 instances). Fuzzy C-means clustering is an improved version of K-means clustering method and is one of most used clustering methods in data mining and machine learning applications. In this study, as the first stage, fuzzy C-means clustering process has been used for finding the centers of attributes in Pima Indians diabetes dataset and then weighted the dataset according to the ratios of the means of attributes to centers of theirs. Secondly, after weighting process, the classifier algorithms including support vector machine (SVM) and k-NN (k- nearest neighbor) classifiers have been used for classifying weighted Pima Indians diabetes dataset. Experimental results show that the proposed attribute weighting method (FCMAW) has obtained very promising results in the classification of Pima Indians diabetes dataset.

Keywords: Fuzzy C-means clustering, Fuzzy C-means clustering based attribute weighting, Pima Indians diabetes dataset, SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1714

282 Application of Neural Network for Contingency Ranking Based on Combination of Severity Indices

Authors: S. Jadid, S. Jalilzadeh

Abstract:

In this paper, an improved technique for contingency ranking using artificial neural network (ANN) is presented. The proposed approach is based on multi-layer perceptrons trained by backpropagation to contingency analysis. Severity indices in dynamic stability assessment are presented. These indices are based on the concept of coherency and three dot products of the system variables. It is well known that some indices work better than others for a particular power system. This paper along with test results using several different systems, demonstrates that combination of indices with ANN provides better ranking than a single index. The presented results are obtained through the use of power system simulation (PSS/E) and MATLAB 6.5 software.

Keywords: composite indices, transient stability, neural network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2188

281 A Study of the Effectiveness of the Routing Decision Support Algorithm

Authors: Wayne Goodridge, Alexander Nikov, Ashok Sahai

Abstract:

Multi criteria decision making (MCDM) methods like analytic hierarchy process, ELECTRE and multi-attribute utility theory are critically studied. They have irregularities in terms of the reliability of ranking of the best alternatives. The Routing Decision Support (RDS) algorithm is trying to improve some of their deficiencies. This paper gives a mathematical verification that the RDS algorithm conforms to the test criteria for an effective MCDM method when a linear preference function is considered.

Keywords: Decision support systems, linear preference function, multi-criteria decision-making algorithm, analytic hierarchy process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1544

280 Retrieval of User Specific Images Using Semantic Signatures

Authors: K. Venkateswari, U. K. Balaji Saravanan, K. Thangaraj, K. V. Deepana

Abstract:

Image search engines rely on the surrounding textual keywords for the retrieval of images. It is a tedious work for the search engines like Google and Bing to interpret the user’s search intention and to provide the desired results. The recent researches also state that the Google image search engines do not work well on all the images. Consequently, this leads to the emergence of efficient image retrieval technique, which interprets the user’s search intention and shows the desired results. In order to accomplish this task, an efficient image re-ranking framework is required. Sequentially, to provide best image retrieval, the new image re-ranking framework is experimented in this paper. The implemented new image re-ranking framework provides best image retrieval from the image dataset by making use of re-ranking of retrieved images that is based on the user’s desired images. This is experimented in two sections. One is offline section and other is online section. In offline section, the reranking framework studies differently (reference classes or Semantic Spaces) for diverse user query keywords. The semantic signatures get generated by combining the textual and visual features of the images. In the online section, images are re-ranked by comparing the semantic signatures that are obtained from the reference classes with the user specified image query keywords. This re-ranking methodology will increases the retrieval image efficiency and the result will be effective to the user.

Keywords: CBIR, Image Re-ranking, Image Retrieval, Semantic Signature, Semantic Space.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1897

279 An Efficiency Measurement of E-Government Performance for United Nation Ranking Index

Authors: Yassine Jadi, Lin Jie

Abstract:

In order to serve the society in an electronic manner, many developing countries have launched tremendous e-government projects. The strategies of development and implementation e-government system have reached different levels, and to ensure consistency of development, the governments need to evaluate e-government performance. The United nation has design e-government development ranking index (EGDI) that rely on three indexes, Online service index (OSI), Telecommunication Infrastructure index (TII), and human capital index( HCI) which are not reflecting the interaction between a government and their citizens. Based on data envelopment analyses (DEA) technique, we are using E-participating index (EPI) as an output of government effort to evaluate the performance of e-government system. Therefore, the ranking index can be achieved in efficiency manner.

Keywords: E-government, DEA, efficiency measurement, EGDI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1164

278 Ranking of Inventory Policies Using Distance Based Approach Method

Authors: Gupta Amit, Kumar Ramesh, Tewari P. C.

Abstract:

Globalization is putting enormous pressure on the business organizations specially manufacturing one to rethink the supply chain in innovative manners. Inventory consumes major portion of total sale revenue. Effective and efficient inventory management plays a vital role for the successful functioning of any organization. Selection of inventory policy is one of the important purchasing activities. This paper focuses on selection and ranking of alternative inventory policies. A deterministic quantitative model based on Distance Based Approach (DBA) method has been developed for evaluation and ranking of inventory policies. We have employed this concept first time for this type of the selection problem. Four inventory policies economic order quantity (EOQ), just in time (JIT), vendor managed inventory (VMI) and monthly policy are considered. Improper selection could affect a company’s competitiveness in terms of the productivity of its facilities and quality of its products. The ranking of inventory policies is a multi-criteria problem. There is a need to first identify the selection criteria and then processes the information with reference to relative importance of attributes for comparison. Criteria values for each inventory policy can be obtained either analytically or by using a simulation technique or they are linguistic subjective judgments defined by fuzzy sets, like, for example, the values of criteria. A methodology is developed and applied to rank the inventory policies.

Keywords: Inventory Policy, Ranking, DBA, Selection criteria.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1779

277 Evaluation of Attribute II Bt Sweet Corn Resistance and Reduced-Risk Insecticide Applications for Control of Corn Earworm

Authors: R. Weinzierl, R. Estes, N. Tinsley, M. Keshlaf

Abstract:

The corn earworm, Helicoverpa zea Boddie, is a serious pest of corn. Larval feeding in ear tips destroys kernels and allows growth of fungi and production of mycotoxins. Infested sweet corn is not marketable. Development of improved transgenic hybrids expressing insecticidal toxins from Bacillus thuringiensis (Bt) may limit or prevent crop losses. The effectiveness of Attribute^® II Bt resistance and applications of Voliam Xpress insecticide were evaluated for effectiveness in controlling corn earworm in plots near Urbana, IL, USA, in 2013. Where no insecticides were applied, ear infestations and kernel damage in Attribute^® II ‘Protector’ plots were consistently lower (near zero) than in plots of the non-Bt isoline ‘Garrison.’ Multiple applications of Voliam Xpress significantly reduced the number of corn earworm larvae and kernel damage in the Garrison plots, but infestations and damage in these plots were greater than in Protectorplots that did not receive insecticide applications. Our results indicate that Attribute^® II Bt resistance is more effective than multiple applications of an insecticide for preventing losses caused by corn earworm in sweet corn.

Keywords: Bacillus thuringiensis, Helicoverpa zea, insect pest management, transgenic sweet corn.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2173

276 Choice of Efficient Information System with Service-Oriented Architecture using Multiple Criteria Threshold Algorithms (With Practical Example)

Authors: Irina Pyrlina

Abstract:

Author presents the results of a study conducted to identify criteria of efficient information system (IS) with serviceoriented architecture (SOA) realization and proposes a ranking method to evaluate SOA information systems using a set of architecture quality criteria before the systems are implemented. The method is used to compare 7 SOA projects and ranking result for SOA efficiency of the projects is provided. The choice of SOA realization project depends on following criteria categories: IS internal work and organization, SOA policies, guidelines and change management, processes and business services readiness, risk management and mitigation. The last criteria category was analyzed on the basis of projects statistics.

Keywords: multiple criteria threshold algorithm, serviceoriented architecture, SOA operational risks, efficiency criteria for IS architecture, projects ranking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1350

275 Ranking of Performance Measures of GSCM towards Sustainability: Using Analytic Hierarchy Process

Authors: Dixit Garg, S. Luthra, A. Haleem

Abstract:

During recent years, the natural environment has become a challenging topic that business organizations must consider due to the economic and ecological impacts and increasing awareness of environment protection among society. Organizations are trying to achieve the goals of improvement in environment, low cost, high quality, flexibility and more customer satisfaction. Performance measurement frameworks are very useful to monitor the performance of any organization. The basic goal of this paper is to identify performance measures and ranking of these performance measures of GSCM performance measurement towards sustainability framework. Five perspectives (Environment, Economic, Social, Operational and Cost performances) and nineteen performance measures of GSCM performance towards sustainability have been have been identified from extensive literature review. Analytical Hierarchy Process (AHP) technique has been utilized for ranking of these performance perspectives and measures. All pair comparisons in AHP have been made on the basis on the experts’ opinions (selected from academia and industry). Ranking of these performance perspectives and measures will help to understand the importance of environmental, economic, social, operational performances and cost performances in the supply chain.

Keywords: Analytical Hierarchy Process (AHP), Green Supply Chain Management, Performance Measures (PM), Sustainability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3189

274 A Framework for Ranking Quality of Information on Weblog

Authors: Mohammad Javad Kargar, Fatemeh Azimzadeh

Abstract:

The vast amount of information on the World Wide Web is created and published by many different types of providers. Unlike books and journals, most of this information is not subject to editing or peer review by experts. This lack of quality control and the explosion of web sites make the task of finding quality information on the web especially critical. Meanwhile new facilities for producing web pages such as Blogs make this issue more significant because Blogs have simple content management tools enabling nonexperts to build easily updatable web diaries or online journals. On the other hand despite a decade of active research in information quality (IQ) there is no framework for measuring information quality on the Blogs yet. This paper presents a novel experimental framework for ranking quality of information on the Weblog. The results of data analysis revealed seven IQ dimensions for the Weblog. For each dimension, variables and related coefficients were calculated so that presented framework is able to assess IQ of Weblogs automatically.

Keywords: Information Quality, Weblog, Web Ranking, Web- Quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1805

273 Research on Urban Point of Interest Generalization Method Based on Mapping Presentation

Authors: Chengming Li, Yong Yin, Peipei Guo, Xiaoli Liu

Abstract:

Without taking account of the attribute richness of POI (point of interest) data and spatial distribution limited by roads, a POI generalization method considering both attribute information and spatial distribution has been proposed against the existing point generalization algorithm merely focusing on overall information of point groups. Hierarchical characteristic of urban POI information expression has been firstly analyzed to point out the measurement feature of the corresponding hierarchy. On this basis, an urban POI generalizing strategy has been put forward: POIs urban road network have been divided into three distribution pattern; corresponding generalization methods have been proposed according to the characteristic of POI data in different distribution patterns. Experimental results showed that the method taking into account both attribute information and spatial distribution characteristics of POI can better implement urban POI generalization in the mapping presentation.

Keywords: POI, Road network, spatial information expression, selection method, distribution pattern.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 992

272 Finding Authoritative Researchers on Academic Web Sites

Authors: Dalibor Fiala, Karel Jezek, Francois Rousselot

Abstract:

In this paper, we present a methodology for finding authoritative researchers by analyzing academic Web sites. We show a case study in which we concentrate on a set of Czech computer science departments- Web sites. We analyze the relations between them via hyperlinks and find the most important ones using several common ranking algorithms. We then examine the contents of the research papers present on these sites and determine the most authoritative Czech authors.

Keywords: Authorities, citation analysis, prestige, ranking algorithms, Web mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1205

271 An Attribute-Centre Based Decision Tree Classification Algorithm

Authors: Gökhan Silahtaroğlu

Abstract:

Decision tree algorithms have very important place at classification model of data mining. In literature, algorithms use entropy concept or gini index to form the tree. The shape of the classes and their closeness to each other some of the factors that affect the performance of the algorithm. In this paper we introduce a new decision tree algorithm which employs data (attribute) folding method and variation of the class variables over the branches to be created. A comparative performance analysis has been held between the proposed algorithm and C4.5.

Keywords: Classification, decision tree, split, pruning, entropy, gini.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1330

270 Assessing and Visualizing the Stability of Feature Selectors: A Case Study with Spectral Data

Authors: R.Guzman-Martinez, Oscar Garcia-Olalla, R.Alaiz-Rodriguez

Abstract:

Feature selection plays an important role in applications with high dimensional data. The assessment of the stability of feature selection/ranking algorithms becomes an important issue when the dataset is small and the aim is to gain insight into the underlying process by analyzing the most relevant features. In this work, we propose a graphical approach that enables to analyze the similarity between feature ranking techniques as well as their individual stability. Moreover, it works with whatever stability metric (Canberra distance, Spearman's rank correlation coefficient, Kuncheva's stability index,...). We illustrate this visualization technique evaluating the stability of several feature selection techniques on a spectral binary dataset. Experimental results with a neural-based classifier show that stability and ranking quality may not be linked together and both issues have to be studied jointly in order to offer answers to the domain experts.

Keywords: Feature Selection Stability, Spectral data, Data visualization

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1480

269 An SVM based Classification Method for Cancer Data using Minimum Microarray Gene Expressions

Authors: R. Mallika, V. Saravanan

Abstract:

This paper gives a novel method for improving classification performance for cancer classification with very few microarray Gene expression data. The method employs classification with individual gene ranking and gene subset ranking. For selection and classification, the proposed method uses the same classifier. The method is applied to three publicly available cancer gene expression datasets from Lymphoma, Liver and Leukaemia datasets. Three different classifiers namely Support vector machines-one against all (SVM-OAA), K nearest neighbour (KNN) and Linear Discriminant analysis (LDA) were tested and the results indicate the improvement in performance of SVM-OAA classifier with satisfactory results on all the three datasets when compared with the other two classifiers.

Keywords: Support vector machines-one against all, cancerclassification, Linear Discriminant analysis, K nearest neighbour, microarray gene expression, gene pair ranking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2505

268 Computing Entropy for Ortholog Detection

Authors: Hsing-Kuo Pao, John Case

Abstract:

Biological sequences from different species are called or-thologs if they evolved from a sequence of a common ancestor species and they have the same biological function. Approximations of Kolmogorov complexity or entropy of biological sequences are already well known to be useful in extracting similarity information between such sequences -in the interest, for example, of ortholog detection. As is well known, the exact Kolmogorov complexity is not algorithmically computable. In prac-tice one can approximate it by computable compression methods. How-ever, such compression methods do not provide a good approximation to Kolmogorov complexity for short sequences. Herein is suggested a new ap-proach to overcome the problem that compression approximations may notwork well on short sequences. This approach is inspired by new, conditional computations of Kolmogorov entropy. A main contribution of the empir-ical work described shows the new set of entropy-based machine learning attributes provides good separation between positive (ortholog) and nega-tive (non-ortholog) data - better than with good, previously known alter-natives (which do not employ some means to handle short sequences well).Also empirically compared are the new entropy based attribute set and a number of other, more standard similarity attributes sets commonly used in genomic analysis. The various similarity attributes are evaluated by cross validation, through boosted decision tree induction C5.0, and by Receiver Operating Characteristic (ROC) analysis. The results point to the conclu-sion: the new, entropy based attribute set by itself is not the one giving the best prediction; however, it is the best attribute set for use in improving the other, standard attribute sets when conjoined with them.

Keywords: compression, decision tree, entropy, ortholog, ROC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1788

267 Combining the Deep Neural Network with the K-Means for Traffic Accident Prediction

Authors: Celso L. Fernando, Toshio Yoshii, Takahiro Tsubota

Abstract:

Understanding the causes of a road accident and predicting their occurrence is key to prevent deaths and serious injuries from road accident events. Traditional statistical methods such as the Poisson and the Logistics regressions have been used to find the association of the traffic environmental factors with the accident occurred; recently, an artificial neural network, ANN, a computational technique that learns from historical data to make a more accurate prediction, has emerged. Although the ability to make accurate predictions, the ANN has difficulty dealing with highly unbalanced attribute patterns distribution in the training dataset; in such circumstances, the ANN treats the minority group as noise. However, in the real world data, the minority group is often the group of interest; e.g., in the road traffic accident data, the events of the accident are the group of interest. This study proposes a combination of the k-means with the ANN to improve the predictive ability of the neural network model by alleviating the effect of the unbalanced distribution of the attribute patterns in the training dataset. The results show that the proposed method improves the ability of the neural network to make a prediction on a highly unbalanced distributed attribute patterns dataset; however, on an even distributed attribute patterns dataset, the proposed method performs almost like a standard neural network.

Keywords: Accident risks estimation, artificial neural network, deep learning, K-mean, road safety.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 861

266 Adaptive Network Intrusion Detection Learning: Attribute Selection and Classification

Authors: Dewan Md. Farid, Jerome Darmont, Nouria Harbi, Nguyen Huu Hoa, Mohammad Zahidur Rahman

Abstract:

In this paper, a new learning approach for network intrusion detection using naïve Bayesian classifier and ID3 algorithm is presented, which identifies effective attributes from the training dataset, calculates the conditional probabilities for the best attribute values, and then correctly classifies all the examples of training and testing dataset. Most of the current intrusion detection datasets are dynamic, complex and contain large number of attributes. Some of the attributes may be redundant or contribute little for detection making. It has been successfully tested that significant attribute selection is important to design a real world intrusion detection systems (IDS). The purpose of this study is to identify effective attributes from the training dataset to build a classifier for network intrusion detection using data mining algorithms. The experimental results on KDD99 benchmark intrusion detection dataset demonstrate that this new approach achieves high classification rates and reduce false positives using limited computational resources.

Keywords: Attributes selection, Conditional probabilities, information gain, network intrusion detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2649