Search results for: Approximate similarity
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 605

Search results for: Approximate similarity

455 Detecting Remote Protein Evolutionary Relationships via String Scoring Method

Authors: Nazar Zaki, Safaai Deris

Abstract:

The amount of the information being churned out by the field of biology has jumped manifold and now requires the extensive use of computer techniques for the management of this information. The predominance of biological information such as protein sequence similarity in the biological information sea is key information for detecting protein evolutionary relationship. Protein sequence similarity typically implies homology, which in turn may imply structural and functional similarities. In this work, we propose, a learning method for detecting remote protein homology. The proposed method uses a transformation that converts protein sequence into fixed-dimensional representative feature vectors. Each feature vector records the sensitivity of a protein sequence to a set of amino acids substrings generated from the protein sequences of interest. These features are then used in conjunction with support vector machines for the detection of the protein remote homology. The proposed method is tested and evaluated on two different benchmark protein datasets and it-s able to deliver improvements over most of the existing homology detection methods.

Keywords: Protein homology detection; support vectormachine; string kernel.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1346
454 Computational Analysis of Potential Inhibitors Selected Based On Structural Similarity for the Src SH2 Domain

Authors: W. P. Hu, J. V. Kumar, Jeffrey J. P. Tsai

Abstract:

The inhibition of SH2 domain regulated protein-protein interactions is an attractive target for developing an effective chemotherapeutic approach in the treatment of disease. Molecular simulation is a useful tool for developing new drugs and for studying molecular recognition. In this study, we searched potential drug compounds for the inhibition of SH2 domain by performing structural similarity search in PubChem Compound Database. A total of 37 compounds were screened from the database, and then we used the LibDock docking program to evaluate the inhibition effect. The best three compounds (AP22408, CID 71463546 and CID 9917321) were chosen for MD simulations after the LibDock docking. Our results show that the compound CID 9917321 can produce a more stable protein-ligand complex compared to other two currently known inhibitors of Src SH2 domain. The compound CID 9917321 may be useful for the inhibition of SH2 domain based on these computational results. Subsequently experiments are needed to verify the effect of compound CID 9917321 on the SH2 domain in the future studies.

Keywords: Nonpeptide inhibitor, Src SH2 domain, LibDock, molecular dynamics simulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2033
453 Computational Method for Annotation of Protein Sequence According to Gene Ontology Terms

Authors: Razib M. Othman, Safaai Deris, Rosli M. Illias

Abstract:

Annotation of a protein sequence is pivotal for the understanding of its function. Accuracy of manual annotation provided by curators is still questionable by having lesser evidence strength and yet a hard task and time consuming. A number of computational methods including tools have been developed to tackle this challenging task. However, they require high-cost hardware, are difficult to be setup by the bioscientists, or depend on time intensive and blind sequence similarity search like Basic Local Alignment Search Tool. This paper introduces a new method of assigning highly correlated Gene Ontology terms of annotated protein sequences to partially annotated or newly discovered protein sequences. This method is fully based on Gene Ontology data and annotations. Two problems had been identified to achieve this method. The first problem relates to splitting the single monolithic Gene Ontology RDF/XML file into a set of smaller files that can be easy to assess and process. Thus, these files can be enriched with protein sequences and Inferred from Electronic Annotation evidence associations. The second problem involves searching for a set of semantically similar Gene Ontology terms to a given query. The details of macro and micro problems involved and their solutions including objective of this study are described. This paper also describes the protein sequence annotation and the Gene Ontology. The methodology of this study and Gene Ontology based protein sequence annotation tool namely extended UTMGO is presented. Furthermore, its basic version which is a Gene Ontology browser that is based on semantic similarity search is also introduced.

Keywords: automatic clustering, bioinformatics tool, gene ontology, protein sequence annotation, semantic similarity search

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3093
452 Military Combat Aircraft Selection Using Trapezoidal Fuzzy Numbers with the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS)

Authors: C. Ardil

Abstract:

This article presents a new approach to uncertainty, vagueness, and imprecision analysis for ranking alternatives with fuzzy data for decision making using the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS). In the proposed approach, fuzzy decision information related to the aircraft selection problem is taken into account in ranking the alternatives and selecting the best one. The basic procedural step is to transform the fuzzy decision matrices into matrices of alternatives evaluated according to all decision criteria. A numerical example illustrates the proposed approach for the military combat aircraft selection problem.

Keywords: trapezoidal fuzzy numbers, multiple criteria decision making analysis, decision making, aircraft selection, MCDMA, fuzzy TOPSIS

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 409
451 Feature Selection with Kohonen Self Organizing Classification Algorithm

Authors: Francesco Maiorana

Abstract:

In this paper a one-dimension Self Organizing Map algorithm (SOM) to perform feature selection is presented. The algorithm is based on a first classification of the input dataset on a similarity space. From this classification for each class a set of positive and negative features is computed. This set of features is selected as result of the procedure. The procedure is evaluated on an in-house dataset from a Knowledge Discovery from Text (KDT) application and on a set of publicly available datasets used in international feature selection competitions. These datasets come from KDT applications, drug discovery as well as other applications. The knowledge of the correct classification available for the training and validation datasets is used to optimize the parameters for positive and negative feature extractions. The process becomes feasible for large and sparse datasets, as the ones obtained in KDT applications, by using both compression techniques to store the similarity matrix and speed up techniques of the Kohonen algorithm that take advantage of the sparsity of the input matrix. These improvements make it feasible, by using the grid, the application of the methodology to massive datasets.

Keywords: Clustering algorithm, Data mining, Feature selection, Grid, Kohonen Self Organizing Map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3006
450 Exploiting Global Self Similarity for Head-Shoulder Detection

Authors: Lae-Jeong Park, Jung-Ho Moon

Abstract:

People detection from images has a variety of applications such as video surveillance and driver assistance system, but is still a challenging task and more difficult in crowded environments such as shopping malls in which occlusion of lower parts of human body often occurs. Lack of the full-body information requires more effective features than common features such as HOG. In this paper, new features are introduced that exploits global self-symmetry (GSS) characteristic in head-shoulder patterns. The features encode the similarity or difference of color histograms and oriented gradient histograms between two vertically symmetric blocks. The domain-specific features are rapid to compute from the integral images in Viola-Jones cascade-of-rejecters framework. The proposed features are evaluated with our own head-shoulder dataset that, in part, consists of a well-known INRIA pedestrian dataset. Experimental results show that the GSS features are effective in reduction of false alarmsmarginally and the gradient GSS features are preferred more often than the color GSS ones in the feature selection.

Keywords: Pedestrian detection, cascade of rejecters, feature extraction, self-symmetry, HOG.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2357
449 A Distance Function for Data with Missing Values and Its Application

Authors: Loai AbdAllah, Ilan Shimshoni

Abstract:

Missing values in data are common in real world applications. Since the performance of many data mining algorithms depend critically on it being given a good metric over the input space, we decided in this paper to define a distance function for unlabeled datasets with missing values. We use the Bhattacharyya distance, which measures the similarity of two probability distributions, to define our new distance function. According to this distance, the distance between two points without missing attributes values is simply the Mahalanobis distance. When on the other hand there is a missing value of one of the coordinates, the distance is computed according to the distribution of the missing coordinate. Our distance is general and can be used as part of any algorithm that computes the distance between data points. Because its performance depends strongly on the chosen distance measure, we opted for the k nearest neighbor classifier to evaluate its ability to accurately reflect object similarity. We experimented on standard numerical datasets from the UCI repository from different fields. On these datasets we simulated missing values and compared the performance of the kNN classifier using our distance to other three basic methods. Our  experiments show that kNN using our distance function outperforms the kNN using other methods. Moreover, the runtime performance of our method is only slightly higher than the other methods.

Keywords: Missing values, Distance metric, Bhattacharyya distance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2700
448 The Wavelet-Based DFT: A New Interpretation, Extensions and Applications

Authors: Abdulnasir Hossen, Ulrich Heute

Abstract:

In 1990 [1] the subband-DFT (SB-DFT) technique was proposed. This technique used the Hadamard filters in the decomposition step to split the input sequence into low- and highpass sequences. In the next step, either two DFTs are needed on both bands to compute the full-band DFT or one DFT on one of the two bands to compute an approximate DFT. A combination network with correction factors was to be applied after the DFTs. Another approach was proposed in 1997 [2] for using a special discrete wavelet transform (DWT) to compute the discrete Fourier transform (DFT). In the first step of the algorithm, the input sequence is decomposed in a similar manner to the SB-DFT into two sequences using wavelet decomposition with Haar filters. The second step is to perform DFTs on both bands to obtain the full-band DFT or to obtain a fast approximate DFT by implementing pruning at both input and output sides. In this paper, the wavelet-based DFT (W-DFT) with Haar filters is interpreted as SB-DFT with Hadamard filters. The only difference is in a constant factor in the combination network. This result is very important to complete the analysis of the W-DFT, since all the results concerning the accuracy and approximation errors in the SB-DFT are applicable. An application example in spectral analysis is given for both SB-DFT and W-DFT (with different filters). The adaptive capability of the SB-DFT is included in the W-DFT algorithm to select the band of most energy as the band to be computed. Finally, the W-DFT is extended to the two-dimensional case. An application in image transformation is given using two different types of wavelet filters.

Keywords: Image Transform, Spectral Analysis, Sub-Band DFT, Wavelet DFT.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1627
447 Development of an Automatic Calibration Framework for Hydrologic Modelling Using Approximate Bayesian Computation

Authors: A. Chowdhury, P. Egodawatta, J. M. McGree, A. Goonetilleke

Abstract:

Hydrologic models are increasingly used as tools to predict stormwater quantity and quality from urban catchments. However, due to a range of practical issues, most models produce gross errors in simulating complex hydraulic and hydrologic systems. Difficulty in finding a robust approach for model calibration is one of the main issues. Though automatic calibration techniques are available, they are rarely used in common commercial hydraulic and hydrologic modelling software e.g. MIKE URBAN. This is partly due to the need for a large number of parameters and large datasets in the calibration process. To overcome this practical issue, a framework for automatic calibration of a hydrologic model was developed in R platform and presented in this paper. The model was developed based on the time-area conceptualization. Four calibration parameters, including initial loss, reduction factor, time of concentration and time-lag were considered as the primary set of parameters. Using these parameters, automatic calibration was performed using Approximate Bayesian Computation (ABC). ABC is a simulation-based technique for performing Bayesian inference when the likelihood is intractable or computationally expensive to compute. To test the performance and usefulness, the technique was used to simulate three small catchments in Gold Coast. For comparison, simulation outcomes from the same three catchments using commercial modelling software, MIKE URBAN were used. The graphical comparison shows strong agreement of MIKE URBAN result within the upper and lower 95% credible intervals of posterior predictions as obtained via ABC. Statistical validation for posterior predictions of runoff result using coefficient of determination (CD), root mean square error (RMSE) and maximum error (ME) was found reasonable for three study catchments. The main benefit of using ABC over MIKE URBAN is that ABC provides a posterior distribution for runoff flow prediction, and therefore associated uncertainty in predictions can be obtained. In contrast, MIKE URBAN just provides a point estimate. Based on the results of the analysis, it appears as though ABC the developed framework performs well for automatic calibration.

Keywords: Automatic calibration framework, approximate Bayesian computation, hydrologic and hydraulic modelling, MIKE URBAN software, R platform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1674
446 Surrogate based Evolutionary Algorithm for Design Optimization

Authors: Maumita Bhattacharya

Abstract:

Optimization is often a critical issue for most system design problems. Evolutionary Algorithms are population-based, stochastic search techniques, widely used as efficient global optimizers. However, finding optimal solution to complex high dimensional, multimodal problems often require highly computationally expensive function evaluations and hence are practically prohibitive. The Dynamic Approximate Fitness based Hybrid EA (DAFHEA) model presented in our earlier work [14] reduced computation time by controlled use of meta-models to partially replace the actual function evaluation by approximate function evaluation. However, the underlying assumption in DAFHEA is that the training samples for the meta-model are generated from a single uniform model. Situations like model formation involving variable input dimensions and noisy data certainly can not be covered by this assumption. In this paper we present an enhanced version of DAFHEA that incorporates a multiple-model based learning approach for the SVM approximator. DAFHEA-II (the enhanced version of the DAFHEA framework) also overcomes the high computational expense involved with additional clustering requirements of the original DAFHEA framework. The proposed framework has been tested on several benchmark functions and the empirical results illustrate the advantages of the proposed technique.

Keywords: Evolutionary algorithm, Fitness function, Optimization, Meta-model, Stochastic method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1533
445 Surface Flattening Assisted with 3D Mannequin Based On Minimum Energy

Authors: Shih-Wen Hsiao, Rong-Qi Chen, Chien-Yu Lin

Abstract:

The topic of surface flattening plays a vital role in the field of computer aided design and manufacture. Surface flattening enables the production of 2D patterns and it can be used in design and manufacturing for developing a 3D surface to a 2D platform, especially in fashion design. This study describes surface flattening based on minimum energy methods according to the property of different fabrics. Firstly, through the geometric feature of a 3D surface, the less transformed area can be flattened on a 2D platform by geodesic. Then, strain energy that has accumulated in mesh can be stably released by an approximate implicit method and revised error function. In some cases, cutting mesh to further release the energy is a common way to fix the situation and enhance the accuracy of the surface flattening, and this makes the obtained 2D pattern naturally generate significant cracks. When this methodology is applied to a 3D mannequin constructed with feature lines, it enhances the level of computer-aided fashion design. Besides, when different fabrics are applied to fashion design, it is necessary to revise the shape of a 2D pattern according to the properties of the fabric. With this model, the outline of 2D patterns can be revised by distributing the strain energy with different results according to different fabric properties. Finally, this research uses some common design cases to illustrate and verify the feasibility of this methodology.

Keywords: Surface flattening, Strain energy, Minimum energy, approximate implicit method, Fashion design.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2542
444 An Overview of Some High Order and Multi-Level Finite Difference Schemes in Computational Aeroacoustics

Authors: Appanah Rao Appadu, Muhammad Zaid Dauhoo

Abstract:

In this paper, we have combined some spatial derivatives with the optimised time derivative proposed by Tam and Webb in order to approximate the linear advection equation which is given by = 0. Ôêé Ôêé + Ôêé Ôêé x f t u These spatial derivatives are as follows: a standard 7-point 6 th -order central difference scheme (ST7), a standard 9-point 8 th -order central difference scheme (ST9) and optimised schemes designed by Tam and Webb, Lockard et al., Zingg et al., Zhuang and Chen, Bogey and Bailly. Thus, these seven different spatial derivatives have been coupled with the optimised time derivative to obtain seven different finite-difference schemes to approximate the linear advection equation. We have analysed the variation of the modified wavenumber and group velocity, both with respect to the exact wavenumber for each spatial derivative. The problems considered are the 1-D propagation of a Boxcar function, propagation of an initial disturbance consisting of a sine and Gaussian function and the propagation of a Gaussian profile. It is known that the choice of the cfl number affects the quality of results in terms of dissipation and dispersion characteristics. Based on the numerical experiments solved and numerical methods used to approximate the linear advection equation, it is observed in this work, that the quality of results is dependent on the choice of the cfl number, even for optimised numerical methods. The errors from the numerical results have been quantified into dispersion and dissipation using a technique devised by Takacs. Also, the quantity, Exponential Error for Low Dispersion and Low Dissipation, eeldld has been computed from the numerical results. Moreover, based on this work, it has been found that when the quantity, eeldld can be used as a measure of the total error. In particular, the total error is a minimum when the eeldld is a minimum.

Keywords: Optimised time derivative, dissipation, dispersion, cfl number, Nomenclature: k : time step, h : spatial step, β :advection velocity, r: cfl/Courant number, hkrβ= , w =θ, h : exact wave number, n :time level, RPE : Relative phase error per unit time step, AFM :modulus of amplification factor

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1589
443 Modeling Uncertainty in Multiple Criteria Decision Making Using the Technique for Order Preference by Similarity to Ideal Solution for the Selection of Stealth Combat Aircraft

Authors: C. Ardil

Abstract:

Uncertainty set theory is a generalization of fuzzy set theory and intuitionistic fuzzy set theory. It serves as an effective tool for dealing with inconsistent, imprecise, and vague information. The technique for order preference by similarity to ideal solution (TOPSIS) method is a multiple-attribute method used to identify solutions from a finite set of alternatives. It simultaneously minimizes the distance from an ideal point and maximizes the distance from a nadir point. In this paper, an extension of the TOPSIS method for multiple attribute group decision-making (MAGDM) based on uncertainty sets is presented. In uncertainty decision analysis, decision-makers express information about attribute values and weights using uncertainty numbers to select the best stealth combat aircraft.

Keywords: Uncertainty set, stealth combat aircraft selection multiple criteria decision-making analysis, MCDM, uncertainty decision analysis, TOPSIS

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 49
442 An Approximate Lateral-Torsional Buckling Mode Function for Cantilever I-Beams

Authors: H. Ozbasaran

Abstract:

Lateral torsional buckling is a global buckling mode which should be considered in design of slender structural members under flexure about their strong axis. It is possible to compute the load which causes lateral torsional buckling of a beam by finite element analysis, however, closed form equations are needed in engineering practice for calculation ease which can be obtained by using energy method. In lateral torsional buckling applications of energy method, a proper function for the critical lateral torsional buckling mode should be chosen which can be thought as the variation of twisting angle along the buckled beam. Accuracy of the results depends on how close is the chosen function to the exact mode. Since critical lateral torsional buckling mode of the cantilever I-beams varies due to material properties, section properties and loading case, the hardest step is to determine a proper mode function in application of energy method. This paper presents an approximate function for critical lateral torsional buckling mode of doubly symmetric cantilever I-beams. Coefficient matrices are calculated for concentrated load at free end, uniformly distributed load and constant moment along the beam cases. Critical lateral torsional buckling modes obtained by presented function and exact solutions are compared. It is found that the modes obtained by presented function coincide with differential equation solutions for considered loading cases.

Keywords: Buckling mode, cantilever, lateral-torsional buckling, I-beam.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2516
441 Minimal Spanning Tree based Fuzzy Clustering

Authors: Ágnes Vathy-Fogarassy, Balázs Feil, János Abonyi

Abstract:

Most of fuzzy clustering algorithms have some discrepancies, e.g. they are not able to detect clusters with convex shapes, the number of the clusters should be a priori known, they suffer from numerical problems, like sensitiveness to the initialization, etc. This paper studies the synergistic combination of the hierarchical and graph theoretic minimal spanning tree based clustering algorithm with the partitional Gath-Geva fuzzy clustering algorithm. The aim of this hybridization is to increase the robustness and consistency of the clustering results and to decrease the number of the heuristically defined parameters of these algorithms to decrease the influence of the user on the clustering results. For the analysis of the resulted fuzzy clusters a new fuzzy similarity measure based tool has been presented. The calculated similarities of the clusters can be used for the hierarchical clustering of the resulted fuzzy clusters, which information is useful for cluster merging and for the visualization of the clustering results. As the examples used for the illustration of the operation of the new algorithm will show, the proposed algorithm can detect clusters from data with arbitrary shape and does not suffer from the numerical problems of the classical Gath-Geva fuzzy clustering algorithm.

Keywords: Clustering, fuzzy clustering, minimal spanning tree, cluster validity, fuzzy similarity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2354
440 Thermosolutal MHD Mixed Marangoni Convective Boundary Layers in the Presence of Suction or Injection

Authors: Noraini Ahmad, Seripah Awang Kechil, Norma Mohd Basir

Abstract:

The steady coupled dissipative layers, called Marangoni mixed convection boundary layers, in the presence of a magnetic field and solute concentration that are formed along the surface of two immiscible fluids with uniform suction or injection effects is examined. The similarity boundary layer equations are solved numerically using the Runge-Kutta Fehlberg with shooting technique. The Marangoni, buoyancy and external pressure gradient effects that are generated in mixed convection boundary layer flow are assessed. The velocity, temperature and concentration boundary layers thickness decrease with the increase of the magnetic field strength and the injection to suction. For buoyancy-opposed flow, the Marangoni mixed convection parameter enhances the velocity boundary layer but decreases the temperature and concentration boundary layers. However, for the buoyancy-assisted flow, the Marangoni mixed convection parameter decelerates the velocity but increases the temperature and concentration boundary layers.

Keywords: Magnetic field, mixed Marangoni convection, similarity boundary layers, solute concentration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1840
439 Multi-Rate Exact Discretization based on Diagonalization of a Linear System - A Multiple-Real-Eigenvalue Case

Authors: T. Sakamoto, N. Hori

Abstract:

A multi-rate discrete-time model, whose response agrees exactly with that of a continuous-time original at all sampling instants for any sampling periods, is developed for a linear system, which is assumed to have multiple real eigenvalues. The sampling rates can be chosen arbitrarily and individually, so that their ratios can even be irrational. The state space model is obtained as a combination of a linear diagonal state equation and a nonlinear output equation. Unlike the usual lifted model, the order of the proposed model is the same as the number of sampling rates, which is less than or equal to the order of the original continuous-time system. The method is based on a nonlinear variable transformation, which can be considered as a generalization of linear similarity transformation, which cannot be applied to systems with multiple eigenvalues in general. An example and its simulation result show that the proposed multi-rate model gives exact responses at all sampling instants.

Keywords: Multi-rate discretization, linear systems, triangularization, similarity transformation, diagonalization, exponential transformation, multiple eigenvalues

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1325
438 Fast Approximate Bayesian Contextual Cold Start Learning (FAB-COST)

Authors: Jack R. McKenzie, Peter A. Appleby, Thomas House, Neil Walton

Abstract:

Cold-start is a notoriously difficult problem which can occur in recommendation systems, and arises when there is insufficient information to draw inferences for users or items. To address this challenge, a contextual bandit algorithm – the Fast Approximate Bayesian Contextual Cold Start Learning algorithm (FAB-COST) – is proposed, which is designed to provide improved accuracy compared to the traditionally used Laplace approximation in the logistic contextual bandit, while controlling both algorithmic complexity and computational cost. To this end, FAB-COST uses a combination of two moment projection variational methods: Expectation Propagation (EP), which performs well at the cold start, but becomes slow as the amount of data increases; and Assumed Density Filtering (ADF), which has slower growth of computational cost with data size but requires more data to obtain an acceptable level of accuracy. By switching from EP to ADF when the dataset becomes large, it is able to exploit their complementary strengths. The empirical justification for FAB-COST is presented, and systematically compared to other approaches on simulated data. In a benchmark against the Laplace approximation on real data consisting of over 670, 000 impressions from autotrader.co.uk, FAB-COST demonstrates at one point increase of over 16% in user clicks. On the basis of these results, it is argued that FAB-COST is likely to be an attractive approach to cold-start recommendation systems in a variety of contexts.

Keywords: Cold-start, expectation propagation, multi-armed bandits, Thompson sampling, variational inference.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 492
437 Grouping and Indexing Color Features for Efficient Image Retrieval

Authors: M. V. Sudhamani, C. R. Venugopal

Abstract:

Content-based Image Retrieval (CBIR) aims at searching image databases for specific images that are similar to a given query image based on matching of features derived from the image content. This paper focuses on a low-dimensional color based indexing technique for achieving efficient and effective retrieval performance. In our approach, the color features are extracted using the mean shift algorithm, a robust clustering technique. Then the cluster (region) mode is used as representative of the image in 3-D color space. The feature descriptor consists of the representative color of a region and is indexed using a spatial indexing method that uses *R -tree thus avoiding the high-dimensional indexing problems associated with the traditional color histogram. Alternatively, the images in the database are clustered based on region feature similarity using Euclidian distance. Only representative (centroids) features of these clusters are indexed using *R -tree thus improving the efficiency. For similarity retrieval, each representative color in the query image or region is used independently to find regions containing that color. The results of these methods are compared. A JAVA based query engine supporting query-by- example is built to retrieve images by color.

Keywords: Content-based, indexing, cluster, region.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1774
436 Stability Analysis of Three-Dimensional Flow and Heat Transfer over a Permeable Shrinking Surface in a Cu-Water Nanofluid

Authors: Roslinda Nazar, Amin Noor, Khamisah Jafar, Ioan Pop

Abstract:

In this paper, the steady laminar three-dimensional boundary layer flow and heat transfer of a copper (Cu)-water nanofluid in the vicinity of a permeable shrinking flat surface in an otherwise quiescent fluid is studied. The nanofluid mathematical model in which the effect of the nanoparticle volume fraction is taken into account is considered. The governing nonlinear partial differential equations are transformed into a system of nonlinear ordinary differential equations using a similarity transformation which is then solved numerically using the function bvp4c from Matlab. Dual solutions (upper and lower branch solutions) are found for the similarity boundary layer equations for a certain range of the suction parameter. A stability analysis has been performed to show which branch solutions are stable and physically realizable. The numerical results for the skin friction coefficient and the local Nusselt number as well as the velocity and temperature profiles are obtained, presented and discussed in detail for a range of various governing parameters.

Keywords: Heat Transfer, Nanofluid, Shrinking Surface, Stability Analysis, Three-Dimensional Flow.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2153
435 A Review and Comparative Analysis on Cluster Ensemble Methods

Authors: S. Sarumathi, P. Ranjetha, C. Saraswathy, M. Vaishnavi, S. Geetha

Abstract:

Clustering is an unsupervised learning technique for aggregating data objects into meaningful classes so that intra cluster similarity is maximized and inter cluster similarity is minimized in data mining. However, no single clustering algorithm proves to be the most effective in producing the best result. As a result, a new challenging technique known as the cluster ensemble approach has blossomed in order to determine the solution to this problem. For the cluster analysis issue, this new technique is a successful approach. The cluster ensemble's main goal is to combine similar clustering solutions in a way that achieves the precision while also improving the quality of individual data clustering. Because of the massive and rapid creation of new approaches in the field of data mining, the ongoing interest in inventing novel algorithms necessitates a thorough examination of current techniques and future innovation. This paper presents a comparative analysis of various cluster ensemble approaches, including their methodologies, formal working process, and standard accuracy and error rates. As a result, the society of clustering practitioners will benefit from this exploratory and clear research, which will aid in determining the most appropriate solution to the problem at hand.

Keywords: Clustering, cluster ensemble methods, consensus function, data mining, unsupervised learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 744
434 Classifying Biomedical Text Abstracts based on Hierarchical 'Concept' Structure

Authors: Rozilawati Binti Dollah, Masaki Aono

Abstract:

Classifying biomedical literature is a difficult and challenging task, especially when a large number of biomedical articles should be organized into a hierarchical structure. In this paper, we present an approach for classifying a collection of biomedical text abstracts downloaded from Medline database with the help of ontology alignment. To accomplish our goal, we construct two types of hierarchies, the OHSUMED disease hierarchy and the Medline abstract disease hierarchies from the OHSUMED dataset and the Medline abstracts, respectively. Then, we enrich the OHSUMED disease hierarchy before adapting it to ontology alignment process for finding probable concepts or categories. Subsequently, we compute the cosine similarity between the vector in probable concepts (in the “enriched" OHSUMED disease hierarchy) and the vector in Medline abstract disease hierarchies. Finally, we assign category to the new Medline abstracts based on the similarity score. The results obtained from the experiments show the performance of our proposed approach for hierarchical classification is slightly better than the performance of the multi-class flat classification.

Keywords: Biomedical literature, hierarchical text classification, ontology alignment, text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1979
433 Fighter Aircraft Evaluation and Selection Process Based on Triangular Fuzzy Numbers in Multiple Criteria Decision Making Analysis Using the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS)

Authors: C. Ardil

Abstract:

This article presents a multiple criteria evaluation approach to uncertainty, vagueness, and imprecision analysis for ranking alternatives with fuzzy data for decision making using the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS). The fighter aircraft evaluation and selection decision making problem is modeled in a fuzzy environment with triangular fuzzy numbers. The fuzzy decision information related to the fighter aircraft selection problem is taken into account in ordering the alternatives and selecting the best candidate. The basic fuzzy TOPSIS procedure steps transform fuzzy decision matrices into matrices of alternatives evaluated according to all decision criteria. A practical numerical example illustrates the proposed approach to the fighter aircraft selection problem.

Keywords: triangular fuzzy number (TFN), multiple criteria decision making analysis, decision making, aircraft selection, MCDMA, fuzzy TOPSIS

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 409
432 Bird Diversity along Boat Touring Routes in Tha Ka Sub-District, Amphawa District, Samut Songkram Province, Thailand

Authors: N. Charoenpokaraj, P. Chitman

Abstract:

This research aims to study species, abundance, status of birds, the similarities and activity characteristics of birds which reap benefits from the research area in boat touring routes in Tha Ka sub-district, Amphawa District, Samut Songkram Province, Thailand. from October 2012 – September 2013. The data was analyzed to find the abundance, and similarity index of the birds. The results from the survey of birds on all three routes found that there are 33 families and 63 species. Route 3 (traditional coconut sugar making kiln – resort) had the most species; 56 species. There were 18 species of commonly found birds with an abundance level of 5, which calculates to 28.57% of all bird species. In August, 46 species are found, being the greatest number of bird species benefiting from this route. As for the status of the birds, there are 51 resident birds, 7 resident and migratory birds, and 5 migratory birds. On Route 2 and Route 3, the similarity index value is equal to 0.881. The birds are classified by their activity characteristics i.e. insectivore, piscivore, granivore, nectrivore and aquatic invertebrate feeder birds. Some birds also use the area for nesting.

Keywords: Bird diversity, boat touring routes, Samut Songkram.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1672
431 Optimal Model Order Selection for Transient Error Autoregressive Moving Average (TERA) MRI Reconstruction Method

Authors: Abiodun M. Aibinu, Athaur Rahman Najeeb, Momoh J. E. Salami, Amir A. Shafie

Abstract:

An alternative approach to the use of Discrete Fourier Transform (DFT) for Magnetic Resonance Imaging (MRI) reconstruction is the use of parametric modeling technique. This method is suitable for problems in which the image can be modeled by explicit known source functions with a few adjustable parameters. Despite the success reported in the use of modeling technique as an alternative MRI reconstruction technique, two important problems constitutes challenges to the applicability of this method, these are estimation of Model order and model coefficient determination. In this paper, five of the suggested method of evaluating the model order have been evaluated, these are: The Final Prediction Error (FPE), Akaike Information Criterion (AIC), Residual Variance (RV), Minimum Description Length (MDL) and Hannan and Quinn (HNQ) criterion. These criteria were evaluated on MRI data sets based on the method of Transient Error Reconstruction Algorithm (TERA). The result for each criterion is compared to result obtained by the use of a fixed order technique and three measures of similarity were evaluated. Result obtained shows that the use of MDL gives the highest measure of similarity to that use by a fixed order technique.

Keywords: Autoregressive Moving Average (ARMA), MagneticResonance Imaging (MRI), Parametric modeling, Transient Error.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1577
430 Graph Codes-2D Projections of Multimedia Feature Graphs for Fast and Effective Retrieval

Authors: Stefan Wagenpfeil, Felix Engel, Paul McKevitt, Matthias Hemmje

Abstract:

Multimedia Indexing and Retrieval is generally de-signed and implemented by employing feature graphs. These graphs typically contain a significant number of nodes and edges to reflect the level of detail in feature detection. A higher level of detail increases the effectiveness of the results but also leads to more complex graph structures. However, graph-traversal-based algorithms for similarity are quite inefficient and computation intensive, espe-cially for large data structures. To deliver fast and effective retrieval, an efficient similarity algorithm, particularly for large graphs, is mandatory. Hence, in this paper, we define a graph-projection into a 2D space (Graph Code) as well as the corresponding algorithms for indexing and retrieval. We show that calculations in this space can be performed more efficiently than graph-traversals due to a simpler processing model and a high level of parallelisation. In consequence, we prove that the effectiveness of retrieval also increases substantially, as Graph Codes facilitate more levels of detail in feature fusion. Thus, Graph Codes provide a significant increase in efficiency and effectiveness (especially for Multimedia indexing and retrieval) and can be applied to images, videos, audio, and text information.

Keywords: indexing, retrieval, multimedia, graph code, graph algorithm

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 375
429 On the Maximum Theorem: A Constructive Analysis

Authors: Yasuhito Tanaka

Abstract:

We examine the maximum theorem by Berge from the point of view of Bishop style constructive mathematics. We will show an approximate version of the maximum theorem and the maximum theorem for functions with sequentially locally at most one maximum.

Keywords: Maximum theorem, Constructive mathematics, Sequentially locally at most one maximum.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1930
428 Latent Semantic Inference for Agriculture FAQ Retrieval

Authors: Dawei Wang, Rujing Wang, Ying Li, Baozi Wei

Abstract:

FAQ system can make user find answer to the problem that puzzles them. But now the research on Chinese FAQ system is still on the theoretical stage. This paper presents an approach to semantic inference for FAQ mining. To enhance the efficiency, a small pool of the candidate question-answering pairs retrieved from the system for the follow-up work according to the concept of the agriculture domain extracted from user input .Input queries or questions are converted into four parts, the question word segment (QWS), the verb segment (VS), the concept of agricultural areas segment (CS), the auxiliary segment (AS). A semantic matching method is presented to estimate the similarity between the semantic segments of the query and the questions in the pool of the candidate. A thesaurus constructed from the HowNet, a Chinese knowledge base, is adopted for word similarity measure in the matcher. The questions are classified into eleven intension categories using predefined question stemming keywords. For FAQ mining, given a query, the question part and answer part in an FAQ question-answer pair is matched with the input query, respectively. Finally, the probabilities estimated from these two parts are integrated and used to choose the most likely answer for the input query. These approaches are experimented on an agriculture FAQ system. Experimental results indicate that the proposed approach outperformed the FAQ-Finder system in agriculture FAQ retrieval.

Keywords: FAQ, Semantic Inference, Ontology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1340
427 Relevance Feedback within CBIR Systems

Authors: Mawloud Mosbah, Bachir Boucheham

Abstract:

We present here the results for a comparative study of some techniques, available in the literature, related to the relevance feedback mechanism in the case of a short-term learning. Only one method among those considered here is belonging to the data mining field which is the K-nearest neighbors algorithm (KNN) while the rest of the methods is related purely to the information retrieval field and they fall under the purview of the following three major axes: Shifting query, Feature Weighting and the optimization of the parameters of similarity metric. As a contribution, and in addition to the comparative purpose, we propose a new version of the KNN algorithm referred to as an incremental KNN which is distinct from the original version in the sense that besides the influence of the seeds, the rate of the actual target image is influenced also by the images already rated. The results presented here have been obtained after experiments conducted on the Wang database for one iteration and utilizing color moments on the RGB space. This compact descriptor, Color Moments, is adequate for the efficiency purposes needed in the case of interactive systems. The results obtained allow us to claim that the proposed algorithm proves good results; it even outperforms a wide range of techniques available in the literature.

Keywords: CBIR, Category Search, Relevance Feedback (RFB), Query Point Movement, Standard Rocchio’s Formula, Adaptive Shifting Query, Feature Weighting, Optimization of the Parameters of Similarity Metric, Original KNN, Incremental KNN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2255
426 Authenticity of Lipid and Soluble Sugar Profiles of Various Oat Cultivars (Avena sativa)

Authors: Marijana M. Ačanski, Kristian A. Pastor, Djura N. Vujić

Abstract:

The identification of lipid and soluble sugar components in flour samples of different cultivars belonging to common oat species (Avena sativa L.) was performed: spring oat, winter oat and hulless oat. Fatty acids were extracted from flour samples with n-hexane, and derivatized into volatile methyl esters, using TMSH (trimethylsulfonium hydroxide in methanol). Soluble sugars were then extracted from defatted and dried samples of oat flour with 96% ethanol, and further derivatized into corresponding TMS-oximes, using hydroxylamine hydrochloride solution and BSTFA (N,O-bis-(trimethylsilyl)-trifluoroacetamide). The hexane and ethanol extracts of each oat cultivar were analyzed using GC-MS system. Lipid and simple sugar compositions are very similar in all samples of investigated cultivars. Chemometric tool was applied to numeric values of automatically integrated surface areas of detected lipid and simple sugar components in their corresponding derivatized forms. Hierarchical cluster analysis shows a very high similarity between the investigated flour samples of oat cultivars, according to the fatty acid content (0.9955). Moderate similarity was observed according to the content of soluble sugars (0.50). These preliminary results support the idea of establishing methods for oat flour authentication, and provide the means for distinguishing oat flour samples, regardless of the variety, from flour samples made of other cereal species, just by lipid and simple sugar profile analysis.

Keywords: Authentication, chemometrics, GC-MS, lipid and soluble sugar composition, oat cultivars.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1325