Search results for: similarity ranking

401 Destination Port Detection for Vessels: An Analytic Tool for Optimizing Port Authorities Resources

Authors: Lubna Eljabu, Mohammad Etemad, Stan Matwin

Abstract:

Port authorities have many challenges in congested ports to allocate their resources to provide a safe and secure loading/unloading procedure for cargo vessels. Selecting a destination port is the decision of a vessel master based on many factors such as weather, wavelength and changes of priorities. Having access to a tool which leverages Automatic Identification System (AIS) messages to monitor vessel’s movements and accurately predict their next destination port promotes an effective resource allocation process for port authorities. In this research, we propose a method, namely, Reference Route of Trajectory (RRoT) to assist port authorities in predicting inflow and outflow traffic in their local environment by monitoring AIS messages. Our RRo method creates a reference route based on historical AIS messages. It utilizes some of the best trajectory similarity measures to identify the destination of a vessel using their recent movement. We evaluated five different similarity measures such as Discrete Frechet Distance (DFD), Dynamic Time ´ Warping (DTW), Partial Curve Mapping (PCM), Area between two curves (Area) and Curve length (CL). Our experiments show that our method identifies the destination port with an accuracy of 98.97% and an f-measure of 99.08% using Dynamic Time Warping (DTW) similarity measure.

Keywords: Spatial temporal data mining, trajectory mining, trajectory similarity, resource optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 698

400 Applications of Rough Set Decompositions in Information Retrieval

Authors: Chen Wu, Xiaohua Hu

Abstract:

This paper proposes rough set models with three different level knowledge granules in incomplete information system under tolerance relation by similarity between objects according to their attribute values. Through introducing dominance relation on the discourse to decompose similarity classes into three subclasses: little better subclass, little worse subclass and vague subclass, it dismantles lower and upper approximations into three components. By using these components, retrieving information to find naturally hierarchical expansions to queries and constructing answers to elaborative queries can be effective. It illustrates the approach in applying rough set models in the design of information retrieval system to access different granular expanded documents. The proposed method enhances rough set model application in the flexibility of expansions and elaborative queries in information retrieval.

Keywords: Incomplete information system, Rough set model, tolerance relation, dominance relation, approximation, decomposition, elaborative query.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1612

399 Performance Analysis of Software Reliability Models using Matrix Method

Authors: RajPal Garg, Kapil Sharma, Rajive Kumar, R. K. Garg

Abstract:

This paper presents a computational methodology based on matrix operations for a computer based solution to the problem of performance analysis of software reliability models (SRMs). A set of seven comparison criteria have been formulated to rank various non-homogenous Poisson process software reliability models proposed during the past 30 years to estimate software reliability measures such as the number of remaining faults, software failure rate, and software reliability. Selection of optimal SRM for use in a particular case has been an area of interest for researchers in the field of software reliability. Tools and techniques for software reliability model selection found in the literature cannot be used with high level of confidence as they use a limited number of model selection criteria. A real data set of middle size software project from published papers has been used for demonstration of matrix method. The result of this study will be a ranking of SRMs based on the Permanent value of the criteria matrix formed for each model based on the comparison criteria. The software reliability model with highest value of the Permanent is ranked at number – 1 and so on.

Keywords: Matrix method, Model ranking, Model selection, Model selection criteria, Software reliability models.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2319

398 Online Brands: A Comparative Study of World Top Ranked Universities with Science and Technology Programs

Authors: Zullina H. Shaari, Amzairi Amar, Abdul Mutalib Embong, Hezlina Hashim

Abstract:

University websites are considered as one of the brand primary touch points for multiple stakeholders, but most of them did not have great designs to create favorable impressions. Some of the elements that web designers should carefully consider are the appearance, the content, the functionality, usability and search engine optimization. However, priority should be placed on website simplicity and negative space. In terms of content, previous research suggests that universities should include reputation, learning environment, graduate career prospects, image destination, cultural integration, and virtual tour on their websites. The study examines how top 200 world ranking science and technology-based universities present their brands online and whether the websites capture the content dimensions. Content analysis of the websites revealed that the top ranking universities captured these dimensions at varying degree. Besides, the UK-based university had better priority on website simplicity and negative space compared to the Malaysian-based university.

Keywords: Science and technology programs, top-ranked universities, online brands, university websites.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2292

397 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has, over recent years, seen big advances because of the spread of internet, which generates everyday a tremendous volume of data, and also the immense advances in technologies which facilitate the analysis of these data. In particular, classification techniques are a subdomain of Data Mining which determines in which group each data instance is related within a given dataset. It is used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine learning. Each type of these techniques has its own limits. Nowadays, current data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach which is different from actual algorithms by its reliability computations. Results of the proposed approach exceeded most common classification techniques with an f-measure exceeding 97% on the IRIS Dataset.

Keywords: Data mining, knowledge discovery, machine learning, similarity measurement, supervised classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1529

396 Military Fighter Aircraft Selection Using Multiplicative Multiple Criteria Decision Making Analysis Method

Authors: C. Ardil

Abstract:

Multiplicative multiple criteria decision making analysis (MCDMA) method is a systematic decision support system to aid decision makers reach appropriate decisions. The application of multiplicative MCDMA in the military aircraft selection problem is significant for proper decision making process, which is the decisive factor in minimizing expenditures and increasing defense capability and capacity. Nine military fighter aircraft alternatives were evaluated by ten decision criteria to solve the decision making problem. In this study, multiplicative MCDMA model aims to evaluate and select an appropriate military fighter aircraft for the Air Force fleet planning. The ranking results of multiplicative MCDMA model were compared with the ranking results of additive MCDMA, logarithmic MCDMA, and regrettive MCDMA models under the L₂ norm data normalization technique to substantiate the robustness of the proposed method. The final ranking results indicate the military fighter aircraft Su-57 as the best available solution.

Keywords: Aircraft Selection, Military Fighter Aircraft Selection, Air Force Fleet Planning, Multiplicative MCDMA, Additive MCDMA, Logarithmic MCDMA, Regrettive MCDMA, Mean Weight, Multiple Criteria Decision Making Analysis, Sensitivity Analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 773

395 A Comparative Analysis Approach Based on Fuzzy AHP, TOPSIS and PROMETHEE for the Selection Problem of GSCM Solutions

Authors: Omar Boutkhoum, Mohamed Hanine, Abdessadek Bendarag

Abstract:

Sustainable economic growth is nowadays driving firms to extend toward the adoption of many green supply chain management (GSCM) solutions. However, the evaluation and selection of these solutions is a matter of concern that needs very serious decisions, involving complexity owing to the presence of various associated factors. To resolve this problem, a comparative analysis approach based on multi-criteria decision-making methods is proposed for adequate evaluation of sustainable supply chain management solutions. In the present paper, we propose an integrated decision-making model based on FAHP (Fuzzy Analytic Hierarchy Process), TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution) and PROMETHEE (Preference Ranking Organisation METHod for Enrichment Evaluations) to contribute to a better understanding and development of new sustainable strategies for industrial organizations. Due to the varied importance of the selected criteria, FAHP is used to identify the evaluation criteria and assign the importance weights for each criterion, while TOPSIS and PROMETHEE methods employ these weighted criteria as inputs to evaluate and rank the alternatives. The main objective is to provide a comparative analysis based on TOPSIS and PROMETHEE processes to help make sound and reasoned decisions related to the selection problem of GSCM solution.

Keywords: GSCM solutions, multi-criteria analysis, FAHP, TOPSIS, PROMETHEE, decision support system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 939

394 Image Retrieval Using Fused Features

Authors: K. Sakthivel, R. Nallusamy, C. Kavitha

Abstract:

The system is designed to show images which are related to the query image. Extracting color, texture, and shape features from an image plays a vital role in content-based image retrieval (CBIR). Initially RGB image is converted into HSV color space due to its perceptual uniformity. From the HSV image, Color features are extracted using block color histogram, texture features using Haar transform and shape feature using Fuzzy C-means Algorithm. Then, the characteristics of the global and local color histogram, texture features through co-occurrence matrix and Haar wavelet transform and shape are compared and analyzed for CBIR. Finally, the best method of each feature is fused during similarity measure to improve image retrieval effectiveness and accuracy.

Keywords: Color Histogram, Haar Wavelet Transform, Fuzzy C-means, Co-occurrence matrix; Similarity measure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2128

393 A Numerical Solution Based On Operational Matrix of Differentiation of Shifted Second Kind Chebyshev Wavelets for a Stefan Problem

Authors: Rajeev, N. K. Raigar

Abstract:

In this study, one dimensional phase change problem (a Stefan problem) is considered and a numerical solution of this problem is discussed. First, we use similarity transformation to convert the governing equations into ordinary differential equations with its boundary conditions. The solutions of ordinary differential equation with the associated boundary conditions and interface condition (Stefan condition) are obtained by using a numerical approach based on operational matrix of differentiation of shifted second kind Chebyshev wavelets. The obtained results are compared with existing exact solution which is sufficiently accurate.

Keywords: Operational matrix of differentiation, Similarity transformation, Shifted second kind Chebyshev wavelets, Stefan problem.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2001

392 Generation of Photo-Mosaic Images through Block Matching and Color Adjustment

Authors: Hae-Yeoun Lee

Abstract:

Mosaic refers to a technique that makes image by gathering lots of small materials in various colors. This paper presents an automatic algorithm that makes the photo-mosaic image using photos. The algorithm is composed of 4 steps: partition and feature extraction, block matching, redundancy removal and color adjustment. The input image is partitioned in the small block to extract feature. Each block is matched to find similar photo in database by comparing similarity with Euclidean difference between blocks. The intensity of the block is adjusted to enhance the similarity of image by replacing the value of light and darkness with that of relevant block. Further, the quality of image is improved by minimizing the redundancy of tiles in the adjacent blocks. Experimental results support that the proposed algorithm is excellent in quantitative analysis and qualitative analysis.

Keywords: Photo-mosaic, Euclidean distance, Block matching, Intensity adjustment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3571

391 Environmental Decision Making Model for Assessing On-Site Performances of Building Subcontractors

Authors: Buket Metin

Abstract:

Buildings cause a variety of loads on the environment due to activities performed at each stage of the building life cycle. Construction is the first stage that affects both the natural and built environments at different steps of the process, which can be defined as transportation of materials within the construction site, formation and preparation of materials on-site and the application of materials to realize the building subsystems. All of these steps require the use of technology, which varies based on the facilities that contractors and subcontractors have. Hence, environmental consequences of the construction process should be tackled by focusing on construction technology options used in every step of the process. This paper presents an environmental decision-making model for assessing on-site performances of subcontractors based on the construction technology options which they can supply. First, construction technologies, which constitute information, tools and methods, are classified. Then, environmental performance criteria are set forth related to resource consumption, ecosystem quality, and human health issues. Finally, the model is developed based on the relationships between the construction technology components and the environmental performance criteria. The Fuzzy Analytical Hierarchy Process (FAHP) method is used for weighting the environmental performance criteria according to environmental priorities of decision-maker(s), while the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) method is used for ranking on-site environmental performances of subcontractors using quantitative data related to the construction technology components. Thus, the model aims to provide an insight to decision-maker(s) about the environmental consequences of the construction process and to provide an opportunity to improve the overall environmental performance of construction sites.

Keywords: Construction process, construction technology, decision making, environmental performance, subcontractors.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1172

390 A Study on Finding Similar Document with Multiple Categories

Authors: R. Saraçoğlu, N. Allahverdi

Abstract:

Searching similar documents and document management subjects have important place in text mining. One of the most important parts of similar document research studies is the process of classifying or clustering the documents. In this study, a similar document search approach that includes discussion of out the case of belonging to multiple categories (multiple categories problem) has been carried. The proposed method that based on Fuzzy Similarity Classification (FSC) has been compared with Rocchio algorithm and naive Bayes method which are widely used in text mining. Empirical results show that the proposed method is quite successful and can be applied effectively. For the second stage, multiple categories vector method based on information of categories regarding to frequency of being seen together has been used. Empirical results show that achievement is increased almost two times, when proposed method is compared with classical approach.

Keywords: Document similarity, Fuzzy classification, Multiple categories, Text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1707

389 3D CAD Models and its Feature Similarity

Authors: Elmi Abu Bakar, Tetsuo Miyake, Zhong Zhang, Takashi Imamura

Abstract:

Knowing the geometrical object pose of products in manufacturing line before robot manipulation is required and less time consuming for overall shape measurement. In order to perform it, the information of shape representation and matching of objects is become required. Objects are compared with its descriptor that conceptually subtracted from each other to form scalar metric. When the metric value is smaller, the object is considered closed to each other. Rotating the object from static pose in some direction introduce the change of value in scalar metric value of boundary information after feature extraction of related object. In this paper, a proposal method for indexing technique for retrieval of 3D geometrical models based on similarity between boundaries shapes in order to measure 3D CAD object pose using object shape feature matching for Computer Aided Testing (CAT) system in production line is proposed. In experimental results shows the effectiveness of proposed method.

Keywords: CAD, rendering, feature extraction, feature classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1979

388 Real Time Speed Estimation of Vehicles

Authors: Azhar Hussain, Kashif Shahzad, Chunming Tang

Abstract:

this paper gives a novel approach towards real-time speed estimation of multiple traffic vehicles using fuzzy logic and image processing techniques with proper arrangement of camera parameters. The described algorithm consists of several important steps. First, the background is estimated by computing median over time window of specific frames. Second, the foreground is extracted using fuzzy similarity approach (FSA) between estimated background pixels and the current frame pixels containing foreground and background. Third, the traffic lanes are divided into two parts for both direction vehicles for parallel processing. Finally, the speeds of vehicles are estimated by Maximum a Posterior Probability (MAP) estimator. True ground speed is determined by utilizing infrared sensors for three different vehicles and the results are compared to the proposed algorithm with an accuracy of ± 0.74 kmph.

Keywords: Defuzzification, Fuzzy similarity approach, lane cropping, Maximum a Posterior Probability (MAP) estimator, Speed estimation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2806

387 Multi-objective Optimization with Fuzzy Based Ranking for TCSC Supplementary Controller to Improve Rotor Angle and Voltage Stability

Authors: S. Panda, S. C. Swain, A. K. Baliarsingh, A. K. Mohanty, C. Ardil

Abstract:

Many real-world optimization problems involve multiple conflicting objectives and the use of evolutionary algorithms to solve the problems has attracted much attention recently. This paper investigates the application of multi-objective optimization technique for the design of a Thyristor Controlled Series Compensator (TCSC)-based controller to enhance the performance of a power system. The design objective is to improve both rotor angle stability and system voltage profile. A Genetic Algorithm (GA) based solution technique is applied to generate a Pareto set of global optimal solutions to the given multi-objective optimisation problem. Further, a fuzzy-based membership value assignment method is employed to choose the best compromise solution from the obtained Pareto solution set. Simulation results are presented to show the effectiveness and robustness of the proposed approach.

Keywords: Multi-objective optimisation, thyristor controlled series compensator, power system stability, genetic algorithm, pareto solution set, fuzzy ranking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1938

386 Effect of Neighborhood Size on Negative Weights in Punctual Kriging Based Image Restoration

Authors: Asmatullah Chaudhry, Anwar M. Mirza

Abstract:

We present a general comparison of punctual kriging based image restoration for different neighbourhood sizes. The formulation of the technique under consideration is based on punctual kriging and fuzzy concepts for image restoration in spatial domain. Three different neighbourhood windows are considered to estimate the semivariance at different lags for studying its effect in reduction of negative weights resulted in punctual kriging, consequently restoration of degraded images. Our results show that effect of neighbourhood size higher than 5x5 on reduction in negative weights is insignificant. In addition, image quality measures, such as structure similarity indices, peak signal to noise ratios and the new variogram based quality measures; show that 3x3 window size gives better performance as compared with larger window sizes.

Keywords: Image restoration, punctual kriging, semi-variance, structure similarity index, negative weights in punctual kriging.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2358

385 Parallezation Protein Sequence Similarity Algorithms using Remote Method Interface

Authors: Mubarak Saif Mohsen, Zurinahni Zainol, Rosalina Abdul Salam, Wahidah Husain

Abstract:

One of the major problems in genomic field is to perform sequence comparison on DNA and protein sequences. Executing sequence comparison on the DNA and protein data is a computationally intensive task. Sequence comparison is the basic step for all algorithms in protein sequences similarity. Parallel computing is an attractive solution to provide the computational power needed to speedup the lengthy process of the sequence comparison. Our main research is to enhance the protein sequence algorithm using dynamic programming method. In our approach, we parallelize the dynamic programming algorithm using multithreaded program to perform the sequence comparison and also developed a distributed protein database among many PCs using Remote Method Interface (RMI). As a result, we showed how different sizes of protein sequences data and computation of scoring matrix of these protein sequence on different number of processors affected the processing time and speed, as oppose to sequential processing.

Keywords: Protein sequence algorithm, dynamic programming algorithm, multithread

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1903

384 OCIRS: An Ontology-based Chinese Idioms Retrieval System

Authors: Hu Haibo, Tu Chunmei, Fu Chunlei, Fu Li, Mao Fan, Ma Yuan

Abstract:

Chinese Idioms are a type of traditional Chinese idiomatic expressions with specific meanings and stereotypes structure which are widely used in classical Chinese and are still common in vernacular written and spoken Chinese today. Currently, Chinese Idioms are retrieved in glossary with key character or key word in morphology or pronunciation index that can not meet the need of searching semantically. OCIRS is proposed to search the desired idiom in the case of users only knowing its meaning without any key character or key word. The user-s request in a sentence or phrase will be grammatically analyzed in advance by word segmentation, key word extraction and semantic similarity computation, thus can be mapped to the idiom domain ontology which is constructed to provide ample semantic relations and to facilitate description logics-based reasoning for idiom retrieval. The experimental evaluation shows that OCIRS realizes the function of searching idioms via semantics, obtaining preliminary achievement as requested by the users.

Keywords: Chinese idiom, idiom retrieval, semantic searching, ontology, semantics similarity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1719

383 Multicriteria Decision Analysis for Development Ranking of Balkan Countries

Authors: C. Ardil

Abstract:

In this research, the Balkan peninsula countries' developmental integration into European Union represents the strategic economic development objectives of the countries in the region. In order to objectively analyze the level of economic development competition of Balkan Peninsula countries, the mathematical compromise programming technique of multicriteria evaluation is used in this ranking problem. The primary aim of this research is to explain the role and significance of the multicriteria method evaluation using a real example of compromise solutions. Using the mathematical compromise programming technique, twelve countries of the Balkan Peninsula are economically evaluated and mutually compared. The economic development evaluation of the countries is performed according to five evaluation criteria forming the basis for economic development evaluation. The multiattribute model is solved using the mathematical compromise programming technique for producing different Pareto solutions. The results obtained by the multicriteria evaluation gives the possibility of identification and evaluation of the most eminent economic development indicators for each country separately. Finally, in this way, the proposed method has proved to be a successful model for the evaluation of the Balkan peninsula countries' economic development competition.

Keywords: Balkan peninsula countries, standard deviation, multicriteria decision making, mathematical compromise programming, multicriteria decision making, multicriteria analysis, multicriteria decision analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 794

382 Detecting Remote Protein Evolutionary Relationships via String Scoring Method

Authors: Nazar Zaki, Safaai Deris

Abstract:

The amount of the information being churned out by the field of biology has jumped manifold and now requires the extensive use of computer techniques for the management of this information. The predominance of biological information such as protein sequence similarity in the biological information sea is key information for detecting protein evolutionary relationship. Protein sequence similarity typically implies homology, which in turn may imply structural and functional similarities. In this work, we propose, a learning method for detecting remote protein homology. The proposed method uses a transformation that converts protein sequence into fixed-dimensional representative feature vectors. Each feature vector records the sensitivity of a protein sequence to a set of amino acids substrings generated from the protein sequences of interest. These features are then used in conjunction with support vector machines for the detection of the protein remote homology. The proposed method is tested and evaluated on two different benchmark protein datasets and it-s able to deliver improvements over most of the existing homology detection methods.

Keywords: Protein homology detection; support vectormachine; string kernel.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1392

381 Computational Analysis of Potential Inhibitors Selected Based On Structural Similarity for the Src SH2 Domain

Authors: W. P. Hu, J. V. Kumar, Jeffrey J. P. Tsai

Abstract:

The inhibition of SH2 domain regulated protein-protein interactions is an attractive target for developing an effective chemotherapeutic approach in the treatment of disease. Molecular simulation is a useful tool for developing new drugs and for studying molecular recognition. In this study, we searched potential drug compounds for the inhibition of SH2 domain by performing structural similarity search in PubChem Compound Database. A total of 37 compounds were screened from the database, and then we used the LibDock docking program to evaluate the inhibition effect. The best three compounds (AP22408, CID 71463546 and CID 9917321) were chosen for MD simulations after the LibDock docking. Our results show that the compound CID 9917321 can produce a more stable protein-ligand complex compared to other two currently known inhibitors of Src SH2 domain. The compound CID 9917321 may be useful for the inhibition of SH2 domain based on these computational results. Subsequently experiments are needed to verify the effect of compound CID 9917321 on the SH2 domain in the future studies.

Keywords: Nonpeptide inhibitor, Src SH2 domain, LibDock, molecular dynamics simulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2078

380 Computational Method for Annotation of Protein Sequence According to Gene Ontology Terms

Authors: Razib M. Othman, Safaai Deris, Rosli M. Illias

Abstract:

Annotation of a protein sequence is pivotal for the understanding of its function. Accuracy of manual annotation provided by curators is still questionable by having lesser evidence strength and yet a hard task and time consuming. A number of computational methods including tools have been developed to tackle this challenging task. However, they require high-cost hardware, are difficult to be setup by the bioscientists, or depend on time intensive and blind sequence similarity search like Basic Local Alignment Search Tool. This paper introduces a new method of assigning highly correlated Gene Ontology terms of annotated protein sequences to partially annotated or newly discovered protein sequences. This method is fully based on Gene Ontology data and annotations. Two problems had been identified to achieve this method. The first problem relates to splitting the single monolithic Gene Ontology RDF/XML file into a set of smaller files that can be easy to assess and process. Thus, these files can be enriched with protein sequences and Inferred from Electronic Annotation evidence associations. The second problem involves searching for a set of semantically similar Gene Ontology terms to a given query. The details of macro and micro problems involved and their solutions including objective of this study are described. This paper also describes the protein sequence annotation and the Gene Ontology. The methodology of this study and Gene Ontology based protein sequence annotation tool namely extended UTMGO is presented. Furthermore, its basic version which is a Gene Ontology browser that is based on semantic similarity search is also introduced.

Keywords: automatic clustering, bioinformatics tool, gene ontology, protein sequence annotation, semantic similarity search

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3128

379 Fuzzy Uncertainty Theory for Stealth Fighter Aircraft Selection in Entropic Fuzzy TOPSIS Decision Analysis Process

Authors: C. Ardil

Abstract:

The purpose of this paper is to present fuzzy TOPSIS in an entropic fuzzy environment. Due to the ambiguous concepts often represented in decision data, exact values are insufficient to model real-life situations. In this paper, the rating of each alternative is defined in fuzzy linguistic terms, which can be expressed with triangular fuzzy numbers. The weight of each criterion is then derived from the decision matrix using the entropy weighting method. Next, a vertex method is proposed to calculate the distance between two triangular fuzzy numbers. According to the TOPSIS concept, a closeness coefficient is defined to determine the ranking order of all alternatives by simultaneously calculating the distances to both the fuzzy positive-ideal solution (FPIS) and the fuzzy negative-ideal solution (FNIS). Finally, an illustrative example of selecting stealth fighter aircraft is shown at the end of this article to highlight the procedure of the proposed method. Correlation analysis and validation analysis using TOPSIS, WSM, and WPM methods were performed to compare the ranking order of the alternatives.

Keywords: stealth fighter aircraft selection, fuzzy uncertainty theory (FUT), fuzzy entropic decision (FED), fuzzy linguistic variables, triangular fuzzy numbers, multiple criteria decision making analysis, MCDMA, TOPSIS, WSM, WPM

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 603

378 Feature Selection with Kohonen Self Organizing Classification Algorithm

Authors: Francesco Maiorana

Abstract:

In this paper a one-dimension Self Organizing Map algorithm (SOM) to perform feature selection is presented. The algorithm is based on a first classification of the input dataset on a similarity space. From this classification for each class a set of positive and negative features is computed. This set of features is selected as result of the procedure. The procedure is evaluated on an in-house dataset from a Knowledge Discovery from Text (KDT) application and on a set of publicly available datasets used in international feature selection competitions. These datasets come from KDT applications, drug discovery as well as other applications. The knowledge of the correct classification available for the training and validation datasets is used to optimize the parameters for positive and negative feature extractions. The process becomes feasible for large and sparse datasets, as the ones obtained in KDT applications, by using both compression techniques to store the similarity matrix and speed up techniques of the Kohonen algorithm that take advantage of the sparsity of the input matrix. These improvements make it feasible, by using the grid, the application of the methodology to massive datasets.

Keywords: Clustering algorithm, Data mining, Feature selection, Grid, Kohonen Self Organizing Map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3052

377 Exploiting Global Self Similarity for Head-Shoulder Detection

Authors: Lae-Jeong Park, Jung-Ho Moon

Abstract:

People detection from images has a variety of applications such as video surveillance and driver assistance system, but is still a challenging task and more difficult in crowded environments such as shopping malls in which occlusion of lower parts of human body often occurs. Lack of the full-body information requires more effective features than common features such as HOG. In this paper, new features are introduced that exploits global self-symmetry (GSS) characteristic in head-shoulder patterns. The features encode the similarity or difference of color histograms and oriented gradient histograms between two vertically symmetric blocks. The domain-specific features are rapid to compute from the integral images in Viola-Jones cascade-of-rejecters framework. The proposed features are evaluated with our own head-shoulder dataset that, in part, consists of a well-known INRIA pedestrian dataset. Experimental results show that the GSS features are effective in reduction of false alarmsmarginally and the gradient GSS features are preferred more often than the color GSS ones in the feature selection.

Keywords: Pedestrian detection, cascade of rejecters, feature extraction, self-symmetry, HOG.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2400

376 A Distance Function for Data with Missing Values and Its Application

Authors: Loai AbdAllah, Ilan Shimshoni

Abstract:

Missing values in data are common in real world applications. Since the performance of many data mining algorithms depend critically on it being given a good metric over the input space, we decided in this paper to define a distance function for unlabeled datasets with missing values. We use the Bhattacharyya distance, which measures the similarity of two probability distributions, to define our new distance function. According to this distance, the distance between two points without missing attributes values is simply the Mahalanobis distance. When on the other hand there is a missing value of one of the coordinates, the distance is computed according to the distribution of the missing coordinate. Our distance is general and can be used as part of any algorithm that computes the distance between data points. Because its performance depends strongly on the chosen distance measure, we opted for the k nearest neighbor classifier to evaluate its ability to accurately reflect object similarity. We experimented on standard numerical datasets from the UCI repository from different fields. On these datasets we simulated missing values and compared the performance of the kNN classifier using our distance to other three basic methods. Our experiments show that kNN using our distance function outperforms the kNN using other methods. Moreover, the runtime performance of our method is only slightly higher than the other methods.

Keywords: Missing values, Distance metric, Bhattacharyya distance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2752

375 Modeling Uncertainty in Multiple Criteria Decision Making Using the Technique for Order Preference by Similarity to Ideal Solution for the Selection of Stealth Combat Aircraft

Authors: C. Ardil

Abstract:

Uncertainty set theory is a generalization of fuzzy set theory and intuitionistic fuzzy set theory. It serves as an effective tool for dealing with inconsistent, imprecise, and vague information. The technique for order preference by similarity to ideal solution (TOPSIS) method is a multiple-attribute method used to identify solutions from a finite set of alternatives. It simultaneously minimizes the distance from an ideal point and maximizes the distance from a nadir point. In this paper, an extension of the TOPSIS method for multiple attribute group decision-making (MAGDM) based on uncertainty sets is presented. In uncertainty decision analysis, decision-makers express information about attribute values and weights using uncertainty numbers to select the best stealth combat aircraft.

Keywords: Uncertainty set, stealth combat aircraft selection multiple criteria decision-making analysis, MCDM, uncertainty decision analysis, TOPSIS

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 146

374 Application of Scientific Metrics to Evaluate Academic Reputation in Different Research Areas

Authors: Cristiano R. Cervi, Renata Galante, José Palazzo M. de Oliveira

Abstract:

In this paper, we address the problem of identifying academic reputation of researchers using scientific metrics in different research areas. Due to the characteristics of each area, researchers can present different behaviors. In previous work, we define Rep-Index that makes use of a profile template to individually identify the reputation of researchers. The Rep-Index is comprehensive and adaptive because involves hole trajectory of the researcher built throughout his career and can be used in different areas and in different contexts. Now, we compare our metric (Rep-Index) with the h-index and the g-index through experiments with researchers in the fields of Economics, Dentistry and Computer Science. We analyze the trajectory of 830 Brazilian researchers from the National Council of Technological and Scientific Development (CNPq), which receive grants research productivity. The grants are aimed at productivity researchers that stand out among their peers, enhancing their scientific normative criteria established by CNPq. Of the 830 researchers, 210 are in the area of Economics, 216 of Dentistry e 404 of Computer Science. The experiments show that our metric is strongly correlated with h-index, g-index and CNPq ranking. We also show good results for our hypothesis that our metric can be used to evaluate research in several areas. We apply our metric (Rep-Index) to compare the behavior of researchers in relation to their h-index and g-index through extensive experiments. The experiments showed that our metric is strongly correlated with h-index, g-index and CNPq ranking.

Keywords: Researcher reputation, profile model, scientific metrics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2001

373 Minimal Spanning Tree based Fuzzy Clustering

Authors: Ágnes Vathy-Fogarassy, Balázs Feil, János Abonyi

Abstract:

Most of fuzzy clustering algorithms have some discrepancies, e.g. they are not able to detect clusters with convex shapes, the number of the clusters should be a priori known, they suffer from numerical problems, like sensitiveness to the initialization, etc. This paper studies the synergistic combination of the hierarchical and graph theoretic minimal spanning tree based clustering algorithm with the partitional Gath-Geva fuzzy clustering algorithm. The aim of this hybridization is to increase the robustness and consistency of the clustering results and to decrease the number of the heuristically defined parameters of these algorithms to decrease the influence of the user on the clustering results. For the analysis of the resulted fuzzy clusters a new fuzzy similarity measure based tool has been presented. The calculated similarities of the clusters can be used for the hierarchical clustering of the resulted fuzzy clusters, which information is useful for cluster merging and for the visualization of the clustering results. As the examples used for the illustration of the operation of the new algorithm will show, the proposed algorithm can detect clusters from data with arbitrary shape and does not suffer from the numerical problems of the classical Gath-Geva fuzzy clustering algorithm.

Keywords: Clustering, fuzzy clustering, minimal spanning tree, cluster validity, fuzzy similarity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2408

372 Thermosolutal MHD Mixed Marangoni Convective Boundary Layers in the Presence of Suction or Injection

Authors: Noraini Ahmad, Seripah Awang Kechil, Norma Mohd Basir

Abstract:

The steady coupled dissipative layers, called Marangoni mixed convection boundary layers, in the presence of a magnetic field and solute concentration that are formed along the surface of two immiscible fluids with uniform suction or injection effects is examined. The similarity boundary layer equations are solved numerically using the Runge-Kutta Fehlberg with shooting technique. The Marangoni, buoyancy and external pressure gradient effects that are generated in mixed convection boundary layer flow are assessed. The velocity, temperature and concentration boundary layers thickness decrease with the increase of the magnetic field strength and the injection to suction. For buoyancy-opposed flow, the Marangoni mixed convection parameter enhances the velocity boundary layer but decreases the temperature and concentration boundary layers. However, for the buoyancy-assisted flow, the Marangoni mixed convection parameter decelerates the velocity but increases the temperature and concentration boundary layers.

Keywords: Magnetic field, mixed Marangoni convection, similarity boundary layers, solute concentration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1882