Search results for: Co-occurrence matrix; Similarity measure.
2316 Improving Similarity Search Using Clustered Data
Authors: Deokho Kim, Wonwoo Lee, Jaewoong Lee, Teresa Ng, Gun-Ill Lee, Jiwon Jeong
Abstract:
This paper presents a method for improving object search accuracy using a deep learning model. A major obstacle to learning accurate similarity with deep learning is that training pairwise similarity scores (metrics) requires a huge amount of data, which is impractical to collect. Similarity scores are therefore usually trained on a relatively small dataset from a different domain, which limits the accuracy of the similarity measurement. For this reason, this paper proposes a deep learning model that can be trained with a significantly smaller amount of data: clustered data in which each cluster contains a set of visually similar images. To measure similarity distance with the proposed method, visual features of two images are extracted from intermediate layers of a convolutional neural network with various pooling methods, and the network is trained with pairwise similarity scores, which are defined as zero for images in the same cluster. The proposed method outperforms state-of-the-art object similarity scoring techniques when evaluated on finding exact items: it achieves an accuracy of 86.5%, compared with 59.9% for the state-of-the-art technique. That is, an exact item can be found among four retrieved images with an accuracy of 86.5%, and the remaining retrieved images may be similar products. The proposed method can therefore reduce the amount of training data by an order of magnitude while still providing a reliable similarity metric.
Keywords: Visual search, deep learning, convolutional neural network, machine learning.
Downloads 825
2315 Algebraic Riccati Matrix Equation for Eigen-Decomposition of Special Structured Matrices; Applications in Structural Mechanics
Authors: Mahdi Nouri
Abstract:
In this paper, the algebraic Riccati matrix equation is used for the eigen-decomposition of special structured matrices. This is achieved by a similarity transformation followed by use of the algebraic Riccati matrix equation for triangularization of the matrices. The process decomposes the matrices into small, specially structured, low-dimensional submatrices so that eigenpairs can be found quickly and easily. Numerical and structural examples are included to show the efficiency of the present method.
Keywords: Riccati, matrix equation, eigenvalue problem, symmetric, bisymmetric, persymmetric, decomposition, canonical forms, graph theory, adjacency and Laplacian matrices.
Downloads 1806
2314 On Generalized New Class of Matrix Polynomial Set
Authors: Ghazi S. Kahmmash
Abstract:
A new generalization of the new class of matrix polynomial sets is obtained. An explicit representation of these matrix polynomials and an expansion of the matrix exponential in a series of them are given.
Keywords: Generating functions, recurrence relations, generalization of the new class matrix polynomial set.
Downloads 1253
2313 Alphanumeric Hand-Prints Classification: Similarity Analysis between Local Decisions
Authors: G. Dimauro, S. Impedovo, M.G. Lucchese, R. Modugno, G. Pirlo
Abstract:
This paper presents the analysis of similarity between local decisions, in the process of alphanumeric hand-prints classification. From the analysis of local characteristics of handprinted numerals and characters, extracted by a zoning method, the set of classification decisions is obtained and the similarity among them is investigated. For this purpose the Similarity Index is used, which is an estimator of similarity between classifiers, based on the analysis of agreements between their decisions. The experimental tests, carried out using numerals and characters from the CEDAR and ETL database, respectively, show to what extent different parts of the patterns provide similar classification decisions.
Keywords: Handwriting Recognition, Optical Character Recognition, Similarity Index, Zoning.
Downloads 1308
2312 Featured based Segmentation of Color Textured Images using GLCM and Markov Random Field Model
Authors: Dipti Patra, Mridula J
Abstract:
In this paper, we propose a new image segmentation approach for colour textured images. The proposed method consists of two stages. In the first stage, textural features are computed using the gray level co-occurrence matrix (GLCM) for regions of interest (ROI) considered for each class. The ROIs act as ground truth for the classes. The Ohta model (I1, I2, I3) is the colour model used for segmentation. The statistical mean feature of the I2 component at a certain inter pixel distance (IPD) was considered to be the optimized textural feature for further segmentation. In the second stage, the feature matrix obtained is assumed to be a degraded version of the image labels, and the unknown image labels are modeled as a Markov Random Field (MRF). The labels are estimated through the maximum a posteriori (MAP) estimation criterion using the ICM algorithm. The performance of the proposed approach is compared with that of existing schemes: JSEG and another scheme which uses GLCM and MRF in RGB colour space. The proposed method is found to outperform the existing ones in terms of segmentation accuracy, with an acceptable rate of convergence. The results are validated with synthetic and real textured images.
Keywords: Texture Image Segmentation, Gray Level Cooccurrence Matrix, Markov Random Field Model, Ohta colour space, ICM algorithm.
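The first-stage feature in the abstract above can be illustrated with a minimal sketch of a gray level co-occurrence matrix and a mean feature computed from it. The function names `glcm` and `glcm_mean`, the horizontal offset, and the choice of the row-index-weighted mean are assumptions made for illustration; the abstract does not give the authors' exact definition.

```python
def glcm(image, dx=1, dy=0, levels=None):
    """Count co-occurrences of gray levels (i, j) at pixel offset (dx, dy).

    `image` is a list of rows of non-negative integer gray levels.
    The offset length plays the role of the inter pixel distance (IPD).
    """
    if levels is None:
        levels = max(max(row) for row in image) + 1
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


def glcm_mean(counts):
    """Mean of the reference gray level under the normalized GLCM.

    Assumed definition: sum_i sum_j i * p(i, j), with p = counts / total.
    """
    total = sum(sum(row) for row in counts)
    weighted = sum(i * v for i, row in enumerate(counts) for v in row)
    return weighted / total


# Small 4-level example image (illustrative data, not from the paper).
example = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
matrix = glcm(example, dx=1, dy=0)
mean_feature = glcm_mean(matrix)
```

In a pipeline like the one described, such a feature would be computed per region of interest on one colour component; here it is computed on a single small array only to show the counting step.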
Downloads 2173
2311 Approximately Similarity Measurement of Web Sites Using Genetic Algorithms and Binary Trees
Authors: Doru Anastasiu Popescu, Dan Rădulescu
Abstract:
In this paper, we determine the similarity of two HTML web applications. We are going to use a genetic algorithm in order to determine the most significant web pages of each application (we are not going to use every web page of a site). Using these significant web pages, we will find the similarity value between the two applications. The algorithm is going to be efficient because we are going to use a reduced number of web pages for comparisons but it will return an approximate value of the similarity. The binary trees are used to keep the tags from the significant pages. The algorithm was implemented in Java language.
Keywords: Tag, HTML, web page, genetic algorithm, similarity value, binary tree.
Downloads 1309
2310 The Partial Non-combinatorially Symmetric $N_0^1$-Matrix Completion Problem
Authors: Gu-Fang Mou, Ting-Zhu Huang
Abstract:
An n×n matrix is called an $N_0^1$-matrix if all principal minors are non-positive and each entry is non-positive. In this paper, we study the partial non-combinatorially symmetric $N_0^1$-matrix completion problems if the graph of its specified entries is a transitive tournament or a double cycle. In general, these digraphs do not have $N_0^1$-completion. Therefore, we have given sufficient conditions that guarantee the existence of the $N_0^1$-completion for these digraphs.
Keywords: Matrix completion, $N_0^1$-matrix, non-combinatorially symmetric, cycle, digraph.
Downloads 1086
2309 Fuzzy Adjacency Matrix in Graphs
Authors: Mahdi Taheri, Mehrana Niroumand
Abstract:
In this paper, a new definition of the adjacency matrix of a simple graph is presented, called the fuzzy adjacency matrix, whose elements are of the form 0 and $1/n$, $n \in \mathbb{N}$, and therefore lie in the interval [0, 1]. Some characteristics of this matrix are then presented with related examples. This form of the matrix carries complete information about the graph.
Keywords: Graph, adjacency matrix, fuzzy numbers
Downloads 2373
2308 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning
Authors: Walid Cherif
Abstract:
Data mining has, over recent years, seen big advances because of the spread of internet, which generates everyday a tremendous volume of data, and also the immense advances in technologies which facilitate the analysis of these data. In particular, classification techniques are a subdomain of Data Mining which determines in which group each data instance is related within a given dataset. It is used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine learning. Each type of these techniques has its own limits. Nowadays, current data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach which is different from actual algorithms by its reliability computations. Results of the proposed approach exceeded most common classification techniques with an f-measure exceeding 97% on the IRIS Dataset.
Keywords: Data mining, knowledge discovery, machine learning, similarity measurement, supervised classification.
Downloads 1527
2307 A Similarity Function for Global Quality Assessment of Retinal Vessel Segmentations
Authors: Arturo Aquino, Manuel Emilio Gegundez, Jose Manuel Bravo, Diego Marin
Abstract:
Retinal vascularity assessment plays an important role in diagnosis of ophthalmic pathologies. The employment of digital images for this purpose makes possible a computerized approach and has motivated development of many methods for automated vascular tree segmentation. Metrics based on contingency tables for binary classification have been widely used for evaluating performance of these algorithms and, concretely, the accuracy has been mostly used as measure of global performance in this topic. However, this metric shows very poor matching with human perception as well as other notable deficiencies. Here, a new similarity function for measuring quality of retinal vessel segmentations is proposed. This similarity function is based on characterizing the vascular tree as a connected structure with a measurable area and length. Tests made indicate that this new approach shows better behaviour than the current one does. Generalizing, this concept of measuring descriptive properties may be used for designing functions for measuring more successfully segmentation quality of other complex structures.
Keywords: Retinal vessel segmentation, quality assessment, performance evaluation, similarity function.
Downloads 1500
2306 Feature Selection with Kohonen Self Organizing Classification Algorithm
Authors: Francesco Maiorana
Abstract:
In this paper, a one-dimensional Self Organizing Map (SOM) algorithm to perform feature selection is presented. The algorithm is based on a first classification of the input dataset on a similarity space. From this classification, a set of positive and negative features is computed for each class. This set of features is selected as the result of the procedure. The procedure is evaluated on an in-house dataset from a Knowledge Discovery from Text (KDT) application and on a set of publicly available datasets used in international feature selection competitions. These datasets come from KDT applications, drug discovery as well as other applications. The knowledge of the correct classification available for the training and validation datasets is used to optimize the parameters for positive and negative feature extractions. The process becomes feasible for large and sparse datasets, such as the ones obtained in KDT applications, by using both compression techniques to store the similarity matrix and speed-up techniques of the Kohonen algorithm that take advantage of the sparsity of the input matrix. These improvements make it feasible, by using the grid, to apply the methodology to massive datasets.
Keywords: Clustering algorithm, Data mining, Feature selection, Grid, Kohonen Self Organizing Map.
Downloads 3052
2305 Sequence Relationships Similarity of Swine Influenza A (H1N1) Virus
Authors: Patsaraporn Somboonsak, Mud-Armeen Munlin
Abstract:
In April 2009, a new variant of Influenza A virus subtype H1N1 emerged in Mexico and spread all over the world. The influenza has three subtypes in human (H1N1, H1N2 and H3N2). Types B and C influenza tend to be associated with local or regional epidemics. Preliminary genetic characterization of the influenza viruses has identified them as swine influenza A (H1N1) viruses. Nucleotide sequence analysis of the Haemagglutinin (HA) and Neuraminidase (NA) are similar to each other and the majority of their genes of swine influenza viruses, two genes coding for the neuraminidase (NA) and matrix (M) proteins are similar to corresponding genes of swine influenza. Sequence similarity between the 2009 A (H1N1) virus and its nearest relatives indicates that its gene segments have been circulating undetected for an extended period. Nucleic acid sequence Maximum Likelihood (MCL) and DNA Empirical base frequencies, Phylogenetic relationship amongst the HA genes of H1N1 virus isolated in Genbank having high nucleotide sequence homology. In this paper, we used 16 HA nucleotide sequences from NCBI for computing the sequence relationships similarity of swine influenza A virus. The results are 28% for MCL, 36.64% for the optimal tree with the sum of branch length, 35.62% for the interior branch phylogeny Neighbor-Joining tree, 1.85% for the overall transition/transversion, and 8.28% for the overall mean distance.
Keywords: Sequence DNA, Relationship of swine, Swine influenza, Sequence Similarity
Downloads 2124
2304 Interactive, Topic-Oriented Search Support by a Centroid-Based Text Categorisation
Authors: Mario Kubek, Herwig Unger
Abstract:
Centroid terms are single words that semantically and topically characterise text documents and so may serve as their very compact representation in automatic text processing. In the present paper, centroids are used to measure the relevance of text documents with respect to a given search query. Thus, a new graph-based paradigm for searching texts in large corpora is proposed and evaluated against keyword-based methods. The first, promising experimental results demonstrate the usefulness of the centroid-based search procedure. It is shown that especially the routing of search queries in interactive and decentralised search systems can be greatly improved by applying this approach. A detailed discussion on further fields of its application completes this contribution.
Keywords: Search algorithm, centroid, query, keyword, co-occurrence, categorisation.
Downloads 623
2303 Product Configuration Strategy Based On Product Family Similarity
Authors: Heejung Lee
Abstract:
To offer a large variety of products while maintaining low costs, high speed, and high quality in a mass customization product development environment, platform based product development has much benefit and usefulness in many industry fields. This paper proposes a product configuration strategy by similarity measure, incorporating the knowledge engineering principles such as product information model, ontology engineering, and formal concept analysis.
Keywords: Platform, product family, ontology, formal concept analysis.
Downloads 1788
2302 Inverse Matrix in the Theory of Dynamic Systems
Authors: R. Masarova, M. Juhas, B. Juhasova, Z. Sutova
Abstract:
In dynamic system theory, a mathematical model is often used to describe the properties of a system. In order to find the transfer matrix of a dynamic system, we need to calculate an inverse matrix. The paper combines the classical theory with the procedures used in the theory of automated control for calculating the inverse matrix. The final part of the paper models the given problem in MATLAB.
Keywords: Dynamic system, transfer matrix, inverse matrix, modeling.
Downloads 2412
2301 Ranking Genes from DNA Microarray Data of Cervical Cancer by a local Tree Comparison
Authors: Frank Emmert-Streib, Matthias Dehmer, Jing Liu, Max Muhlhauser
Abstract:
The major objective of this paper is to introduce a new method to select genes from DNA microarray data. As criterion to select genes we suggest to measure the local changes in the correlation graph of each gene and to select those genes whose local changes are largest. More precisely, we calculate the correlation networks from DNA microarray data of cervical cancer whereas each network represents a tissue of a certain tumor stage and each node in the network represents a gene. From these networks we extract one tree for each gene by a local decomposition of the correlation network. The interpretation of a tree is that it represents the n-nearest neighbor genes on the n-th level of a tree, measured by the Dijkstra distance, and, hence, gives the local embedding of a gene within the correlation network. For the obtained trees we measure the pairwise similarity between trees rooted by the same gene from normal to cancerous tissues. This evaluates the modification of the tree topology due to tumor progression. Finally, we rank the obtained similarity values from all tissue comparisons and select the top ranked genes. For these genes the local neighborhood in the correlation networks changes most between normal and cancerous tissues. As a result we find that the top ranked genes are candidates suspected to be involved in tumor growth. This indicates that our method captures essential information from the underlying DNA microarray data of cervical cancer.
Keywords: Graph similarity, generalized trees, graph alignment, DNA microarray data, cervical cancer.
Downloads 1753
2300 Numerical Treatment of Matrix Differential Models Using Matrix Splines
Authors: Kholod M. Abualnaja
Abstract:
This paper considers the solution of matrix differential models using quadratic, cubic, quartic, and quintic splines, as well as Taylor's and Picard's matrix methods. One illustrative example is included.
Keywords: Matrix Splines, Cubic Splines, Quartic Splines.
Downloads 1702
2299 Similarity Based Membership of Elements to Uncertain Concept in Information System
Authors: M. Kamel El-Sayed
Abstract:
The process of determining the degree of membership for an element to an uncertain concept has been found in many ways, using equivalence and symmetry relations in information systems. In the case of similarity, these methods did not take into account the degree of symmetry between elements. In this paper, we use a new definition for finding the membership based on the degree of symmetry. We provide an example to clarify the suggested methods and compare it with previous methods. This method opens the door to more accurate decisions in information systems.
Keywords: Information system, uncertain concept, membership function, similarity relation, degree of similarity.
Downloads 810
2298 SAF: A Substitution and Alignment Free Similarity Measure for Protein Sequences
Authors: Abdellali Kelil, Shengrui Wang, Ryszard Brzezinski
Abstract:
The literature reports a large number of approaches for measuring the similarity between protein sequences. Most of these approaches estimate this similarity using alignment-based techniques that do not necessarily yield biologically plausible results, for two reasons. First, for the case of non-alignable (i.e., not yet definitively aligned and biologically approved) sequences such as multi-domain, circular permutation and tandem repeat protein sequences, alignment-based approaches do not succeed in producing biologically plausible results. This is due to the nature of the alignment, which is based on the matching of subsequences in equivalent positions, while non-alignable proteins often have similar and conserved domains in non-equivalent positions. Second, the alignment-based approaches lead to similarity measures that depend heavily on the parameters set by the user for the alignment (e.g., gap penalties and substitution matrices). For easily alignable protein sequences, it's possible to supply a suitable combination of input parameters that allows such an approach to yield biologically plausible results. However, for difficult-to-align protein sequences, supplying different combinations of input parameters yields different results. Such variable results create ambiguities and complicate the similarity measurement task. To overcome these drawbacks, this paper describes a novel and effective approach for measuring the similarity between protein sequences, called SAF for Substitution and Alignment Free. Without resorting either to the alignment of protein sequences or to substitution relations between amino acids, SAF is able to efficiently detect the significant subsequences that best represent the intrinsic properties of protein sequences, those underlying the chronological dependencies of structural features and biochemical activities of protein sequences. 
Moreover, by using a new efficient subsequence matching scheme, SAF more efficiently handles protein sequences that contain similar structural features with significant meaning in chronologically non-equivalent positions. To show the effectiveness of SAF, extensive experiments were performed on protein datasets from different databases, and the results were compared with those obtained by several mainstream algorithms.
Keywords: Protein, Similarity, Substitution, Alignment.
Downloads 1410
2297 The Relationship of Eigenvalues between Backward MPSD and Jacobi Iterative Matrices
Authors: Zhuan-de Wang, Hou-biao Li, Zhong-xi Gao
Abstract:
In this paper, the backward MPSD (Modified Preconditioned Simultaneous Displacement) iterative matrix is firstly proposed. The relationship of eigenvalues between the backward MPSD iterative matrix and backward Jacobi iterative matrix for block p-cyclic case is obtained, which improves and refines the results in the corresponding references.
Keywords: Backward MPSD iterative matrix, Jacobi iterative matrix, eigenvalue, p-cyclic matrix.
Downloads 1777
2296 Non-Overlapping Hierarchical Index Structure for Similarity Search
Authors: Mounira Taileb, Sid Lamrous, Sami Touati
Abstract:
To accelerate similarity search in high-dimensional databases, we propose a new hierarchical indexing method composed of an offline and an online phase. Our contribution concerns both phases. In the offline phase, after gathering all the data into clusters and constructing a hierarchical index, the main originality of our contribution is a method for constructing bounding forms of clusters that avoids overlapping. This considerably improves the performance of similarity search in the online phase, for which we have also developed an adapted search algorithm. Our method, named NOHIS (Non-Overlapping Hierarchical Index Structure), uses Principal Direction Divisive Partitioning (PDDP) as its clustering algorithm. The principle of PDDP is to divide the data recursively into two sub-clusters; the division uses the hyperplane that is orthogonal to the principal direction derived from the covariance matrix and passes through the centroid of the cluster being divided. The data of each of the two resulting sub-clusters are enclosed in a minimum bounding rectangle (MBR). The two MBRs are oriented along the principal direction, so the two forms are guaranteed not to overlap. Experiments use databases containing image descriptors. Results show that the proposed method outperforms sequential scan and SR-tree in processing k-nearest neighbor queries.
Keywords: K-nearest neighbour search, multi-dimensional indexing, multimedia databases, similarity search.
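The PDDP division step described in the abstract above can be sketched as follows. This is a minimal illustration of a single split, assuming the principal direction is estimated by power iteration on the covariance matrix; the function name `pddp_split` and the sample points are hypothetical, and the recursion, MBR construction and search algorithm of NOHIS are not reproduced.

```python
import math


def pddp_split(points, iterations=200):
    """Split points into two sub-clusters with one PDDP-style division.

    The cutting hyperplane passes through the centroid and is orthogonal
    to the principal direction of the covariance matrix.
    """
    n, dim = len(points), len(points[0])
    centroid = [sum(p[k] for p in points) / n for k in range(dim)]
    centered = [[p[k] - centroid[k] for k in range(dim)] for p in points]
    cov = [[sum(x[i] * x[j] for x in centered) / n for j in range(dim)]
           for i in range(dim)]

    # Power iteration for the dominant eigenvector (assumption: the start
    # vector is not orthogonal to the principal direction).
    v = [1.0] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]

    # Side of the hyperplane = sign of the projection onto the direction.
    left, right = [], []
    for p, x in zip(points, centered):
        projection = sum(a * b for a, b in zip(x, v))
        (left if projection <= 0 else right).append(p)
    return left, right, v


# Illustrative data: two groups separated along the first axis.
sample = [(0.0, 0.0), (1.0, 0.1), (0.0, 0.2),
          (10.0, 0.0), (11.0, 0.1), (10.0, 0.2)]
left, right, direction = pddp_split(sample)
```

The sign of the returned direction is arbitrary, so which group ends up in `left` or `right` should not be relied on; only the partition itself is meaningful.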
Downloads 1562
2295 On Positive Definite Solutions of Quaternionic Matrix Equations
Authors: Minghui Wang
Abstract:
The real representation of the quaternionic matrix is defined and studied. The relations between the positive (semi)definite quaternionic matrix and its real representation matrix are presented. By means of the real representation, the relation between the positive (semi)definite solutions of quaternionic matrix equations and those of corresponding real matrix equations is established.
Keywords: Matrix equation, Quaternionic matrix, Real representation, positive (semi)definite solutions.
Downloads 1419
2294 Topographic Arrangement of 3D Design Components on 2D Maps by Unsupervised Feature Extraction
Authors: Stefan Menzel
Abstract:
As a result of the daily workflow in the design development departments of companies, databases containing huge numbers of 3D geometric models are generated. According to the given problem, engineers create CAD drawings based on their design ideas and evaluate the performance of the resulting design, e.g. by computational simulations. Usually, new geometries are built either by utilizing and modifying sets of existing components or by adding single newly designed parts to a more complex design. The present paper addresses the two facets of acquiring components from large design databases automatically and providing a reasonable overview of the parts to the engineer. A unified framework based on the topographic non-negative matrix factorization (TNMF) is proposed which solves both aspects simultaneously. First, on a given database, meaningful components are extracted into a parts-based representation in an unsupervised manner. Second, the extracted components are organized and visualized on square-lattice 2D maps. It is shown on the example of turbine-like geometries that these maps efficiently provide a well-structured overview of the database content and, at the same time, define a measure for spatial similarity allowing easy access and reuse of components in the process of design development.
Keywords: Design decomposition, topographic non-negative matrix factorization, parts-based representation, self-organization, unsupervised feature extraction.
Downloads 1379
2293 Connectivity Estimation from the Inverse Coherence Matrix in a Complex Chaotic Oscillator Network
Authors: Won Sup Kim, Xue-Mei Cui, Seung Kee Han
Abstract:
We present the method of the inverse coherence matrix for estimating network connectivity from multivariate time series of a complex system. In a model system of coupled chaotic oscillators, it is shown that the inverse coherence matrix, defined as the inverse of the cross coherence matrix, is proportional to the network connectivity. The inverse coherence matrix could therefore be used to distinguish directly connected links from indirectly connected links in a complex network. We compare the result of network estimation using the method of the inverse coherence matrix with the results obtained from the coherence matrix and the partial coherence matrix.
Keywords: Chaotic oscillator, complex network, inverse coherence matrix, network estimation.
Downloads 2003
2292 Incremental Algorithm to Cluster the Categorical Data with Frequency Based Similarity Measure
Authors: S.Aranganayagi, K.Thangavel
Abstract:
Clustering categorical data is more complicated than numerical clustering because of its special properties. Scalability and memory constraints are challenging problems in clustering large data sets. This paper presents an incremental algorithm to cluster categorical data. Frequencies of attribute values contribute much to clustering similar categorical objects. In this paper, we propose new similarity measures based on the frequencies of attribute values and their cardinalities. The proposed measures and the algorithm are tested on data sets from the UCI data repository. Results show that the proposed method generates better clusters than the existing one.
Keywords: Clustering, Categorical, Incremental, Frequency, Domain
Downloads 1820
2291 A Combination of Similarity Ranking and Time for Social Research Paper Searching
Authors: P. Jomsri
Abstract:
Nowadays social media are important tools for web resource discovery. The performance and capabilities of web searches are vital, especially search results from social research paper bookmarking. This paper proposes a new ranking algorithm, CSTRank, that combines similarity ranking with the paper posted time. The paper posted time is a static ranking used to improve search results. In this study, the paper posted time is combined with similarity ranking to produce a better ranking than other methods such as similarity ranking or SimRank. The retrieval performance of the combination rankings is evaluated using mean values of NDCG. The experimental evaluation implies that CSTRank with a weight score ratio of 90:10 can improve the efficiency of research paper searching on social bookmarking websites.
Keywords: combination ranking, information retrieval, time, similarity ranking, static ranking, weight score
Downloads 1666
2290 Using Spectral Vectors and M-Tree for Graph Clustering and Searching in Graph Databases of Protein Structures
Authors: Do Phuc, Nguyen Thi Kim Phung
Abstract:
In this paper, we represent protein structures as graphs, so that a protein structure database becomes a graph database. Each graph is represented by a spectral vector. We use the Jacobi rotation algorithm to calculate the eigenvalues of the normalized Laplacian representation of the adjacency matrix of a graph. To measure the similarity between two graphs, we calculate the Euclidean distance between the two graph spectral vectors. To cluster the graphs, we use an M-tree with the Euclidean distance to cluster the spectral vectors. In addition, the M-tree can be used for graph searching in the graph database. Our proposed method was tested with a graph database of 100 graphs representing 100 protein structures downloaded from the Protein Data Bank (PDB), and we compare the result with the SCOP hierarchical structure.
Keywords: Eigenvalues, m-tree, graph database, protein structure, spectral graph theory.
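The spectral-vector comparison described in the abstract above can be sketched as follows. This is a minimal illustration assuming undirected graphs of equal size given as 0/1 adjacency matrices, with the sorted eigenvalues of the normalized Laplacian used as the spectral vector. The function names and the cyclic Jacobi sweep order are assumptions, and the M-tree indexing step is not shown.

```python
import math


def normalized_laplacian(adj):
    """L = I - D^(-1/2) A D^(-1/2); isolated vertices get a zero row."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    scale = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    return [[(1.0 if i == j and deg[i] > 0 else 0.0)
             - scale[i] * adj[i][j] * scale[j]
             for j in range(n)] for i in range(n)]


def jacobi_eigenvalues(matrix, tol=1e-20, max_sweeps=50):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Angle that zeroes the (p, q) entry after rotation.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def spectral_vector(adj):
    return jacobi_eigenvalues(normalized_laplacian(adj))


def spectral_distance(adj_a, adj_b):
    """Euclidean distance between spectral vectors of equal-size graphs."""
    return math.dist(spectral_vector(adj_a), spectral_vector(adj_b))


# Illustrative graphs: a triangle and a three-vertex path.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
distance = spectral_distance(triangle, path)
```

Graphs with different vertex counts would give spectral vectors of different lengths; how the authors handle that case is not stated in the abstract, so this sketch simply requires equal sizes.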
Downloads 1656
2289 Solving Linear Matrix Equations by Matrix Decompositions
Authors: Yongxin Yuan, Kezheng Zuo
Abstract:
In this paper, a system of linear matrix equations is considered. A new necessary and sufficient condition for the consistency of the equations is derived by means of the generalized singular-value decomposition, and the explicit representation of the general solution is provided.
Keywords: Matrix equation, Generalized inverse, Generalized singular-value decomposition.
Downloads 2058
2288 Impact of Similarity Ratings on Human Judgement
Authors: Ian A. McCulloh, Madelaine Zinser, Jesse Patsolic, Michael Ramos
Abstract:
Recommender systems are a common artificial intelligence (AI) application. For any given input, a search system will return a rank-ordered list of similar items. As users review returned items, they must decide when to halt the search and either revise search terms or conclude their requirement is novel with no similar items in the database. We present a statistically designed experiment that investigates the impact of similarity ratings on human judgement to conclude a search item is novel and halt the search. In the study, 450 participants were recruited from Amazon Mechanical Turk to render judgement across 12 decision tasks. We find the inclusion of ratings increases the human perception that items are novel. Percent similarity increases novelty discernment when compared with star-rated similarity or the absence of a rating. Ratings reduce the time to decide and improve decision confidence. This suggests that the inclusion of similarity ratings can aid human decision-makers in knowledge search tasks.
Keywords: Ratings, rankings, crowdsourcing, empirical studies, user studies, similarity measures, human-centered computing, novelty in information retrieval.
Downloads 417
2287 Destination Port Detection for Vessels: An Analytic Tool for Optimizing Port Authorities Resources
Authors: Lubna Eljabu, Mohammad Etemad, Stan Matwin
Abstract:
Port authorities in congested ports face many challenges in allocating their resources to provide a safe and secure loading/unloading procedure for cargo vessels. Selecting a destination port is the decision of a vessel master, based on many factors such as weather, wavelength and changes of priorities. A tool that leverages Automatic Identification System (AIS) messages to monitor vessels' movements and accurately predict their next destination port promotes an effective resource allocation process for port authorities. In this research, we propose a method, namely Reference Route of Trajectory (RRoT), to assist port authorities in predicting inflow and outflow traffic in their local environment by monitoring AIS messages. Our RRoT method creates a reference route based on historical AIS messages. It utilizes some of the best trajectory similarity measures to identify the destination of a vessel using its recent movement. We evaluated five similarity measures: Discrete Fréchet Distance (DFD), Dynamic Time Warping (DTW), Partial Curve Mapping (PCM), Area between two curves (Area) and Curve length (CL). Our experiments show that our method identifies the destination port with an accuracy of 98.97% and an f-measure of 99.08% using the Dynamic Time Warping (DTW) similarity measure.
Keywords: Spatial temporal data mining, trajectory mining, trajectory similarity, resource optimization.
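The idea of matching a vessel's recent movement against reference routes with Dynamic Time Warping can be sketched as follows. This is a minimal illustration, assuming trajectories are lists of planar (x, y) points and that one reference route per candidate port is already available. The function names, the port labels and the coordinates are hypothetical, and the construction of reference routes from AIS messages is not shown.

```python
import math


def dtw_distance(traj_a, traj_b):
    """Classic dynamic-programming DTW with Euclidean point cost."""
    n, m = len(traj_a), len(traj_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(traj_a[i - 1], traj_b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # advance in A only
                                    cost[i][j - 1],      # advance in B only
                                    cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


def predict_destination(recent_track, reference_routes):
    """Return the port whose reference route is closest under DTW."""
    return min(reference_routes,
               key=lambda port: dtw_distance(recent_track,
                                             reference_routes[port]))


# Hypothetical reference routes and a short recent track.
routes = {
    "port_a": [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],
    "port_b": [(0.0, 0.0), (1.0, -1.0), (2.0, -2.0)],
}
track = [(0.0, 0.0), (0.9, 1.1)]
predicted = predict_destination(track, routes)
```

Real AIS positions are latitude and longitude, so a geodesic point distance would be a more faithful cost than the planar one used here.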
Downloads 696