Search results for: informative theoretic similarity metrics.
631 Analysis of Physicochemical Properties on Prediction of R5, X4 and R5X4 HIV-1 Coreceptor Usage
Authors: Kai-Ti Hsu, Hui-Ling Huang, Chun-Wei Tung, Yi-Hsiung Chen, Shinn-Ying Ho
Abstract:
Bioinformatics methods for predicting T cell coreceptor usage from the array of membrane proteins of HIV-1 are investigated. In this study, we aim to propose an effective prediction method for the three-class classification problem of CXCR4 (X4), CCR5 (R5) and CCR5/CXCR4 (R5X4). We investigate the coreceptor prediction problem as follows: 1) proposing a feature set of informative physicochemical properties which, combined with an SVM, achieves a high prediction test accuracy of 81.48%, compared with 70.00% for the existing method; 2) establishing a large up-to-date data set by increasing the size from 159 to 1225 sequences to verify the proposed prediction method, where the mean test accuracy is 88.59%; and 3) analyzing the set of 14 informative physicochemical properties to further understand the characteristics of HIV-1 coreceptors.
Keywords: Coreceptor, genetic algorithm, HIV-1, SVM, physicochemical properties, prediction.
Downloads 2384
630 A Multi-Objective Methodology for Selecting Lean Initiatives in Modular Construction Companies
Authors: Saba Shams Bidhendi, Steven Goh, Andrew Wandel
Abstract:
The implementation of lean manufacturing initiatives has produced significant impacts in improving operational performance and reducing manufacturing wastes in the production process. However, selecting an appropriate set of lean strategies is critical to avoid misapplication of lean manufacturing techniques and a consequent increase in non-value-adding activities. To the best of the authors’ knowledge, there is currently no methodology for selecting lean strategies that considers their impacts on manufacturing wastes and performance metrics simultaneously. In this research, a multi-objective methodology is proposed that suggests an appropriate set of lean initiatives based on their impacts on performance metrics and manufacturing wastes, within manufacturers’ resource limitations. The proposed methodology suggests the set of lean initiatives for implementation that has the highest impact on the identified critical performance metrics and manufacturing wastes. Manufacturers can therefore be assured that implementing the suggested lean tools improves their production performance and reduces manufacturing wastes at the same time. A case study was conducted to validate the proposed model and methodologies and to show their effectiveness.
Keywords: Lean manufacturing, Lean strategies, manufacturing wastes, manufacturing performance metrics, decision making, optimisation.
Downloads 794
629 Empirical Analysis of the Reusability of Object-Oriented Program Code in Open-Source Software
Authors: Fathi Taibi
Abstract:
Measuring the reusability of Object-Oriented (OO) program code is important to ensure a successful and timely adaptation and integration of the reused code in new software projects. It has become even more relevant with the availability of huge numbers of open-source projects. Reuse saves cost, increases the speed of development and improves software reliability. Measuring this reusability is not a straightforward process due to the variety of metrics and qualities linked to software reuse and the lack of comprehensive empirical studies to support the proposed metrics or models. In this paper, a conceptual model is proposed to measure the reusability of OO program code. A comprehensive set of metrics is used to compute the most significant factors of reusability, and an empirical investigation is conducted to measure the reusability of the classes of randomly selected open-source Java projects. Additionally, the impact of using inner and anonymous classes on the reusability of their enclosing classes is assessed. The results obtained are thoroughly analyzed to identify the factors behind the lack of reusability in open-source OO program code and the impact of nesting on it.
Keywords: Code reuse, Low Complexity, Empirical Analysis, Modularity, Software Metrics, Understandability.
Downloads 2181
628 Evolutionary Decision Trees and Software Metrics for Module Defects Identification
Authors: Monica Chiş
Abstract:
A software metric is a measure of some property of a piece of software or its specification. The aim of this paper is to present an application of evolutionary decision trees in software engineering to classify software modules according to whether or not they have one or more reported defects. For this purpose, some metrics are used to detect the class of modules with defects or without defects.
Keywords: Evolutionary decision trees, decision trees, software metrics.
Downloads 1751
627 Cost Sensitive Feature Selection in Decision-Theoretic Rough Set Models for Customer Churn Prediction: The Case of Telecommunication Sector Customers
Authors: Emel Kızılkaya Aydogan, Mihrimah Ozmen, Yılmaz Delice
Abstract:
The telecommunications sector in the global market has recently been undergoing change and ongoing development. In this sector, churn analysis techniques are commonly used to analyse why some customers terminate their service subscriptions prematurely. Customer churn is of the utmost significance in this sector since it causes important business losses. Many companies conduct research in order to prevent losses while increasing customer loyalty. Although a large quantity of accumulated data is available in this sector, its usefulness is limited by data quality and relevance. In this paper, a cost-sensitive feature selection framework is developed with the aim of obtaining feature reducts to predict customer churn. The framework is a cost-based optional pre-processing stage that removes redundant features for churn management. In addition, this cost-based feature selection algorithm is applied in a telecommunication company in Turkey, and the results obtained with this algorithm are presented.
Keywords: Churn prediction, data mining, decision-theoretic rough set, feature selection.
Downloads 1763
626 Flagging Critical Components to Prevent Transient Faults in Real-Time Systems
Authors: Muhammad Sheikh Sadi, D. G. Myers, Cesar Ortega Sanchez
Abstract:
This paper proposes the use of metrics in design space exploration that highlight where in the structure of the model and at what point in the behaviour, prevention is needed against transient faults. Previous approaches to tackle transient faults focused on recovery after detection. Almost no research has been directed towards preventive measures. But in real-time systems, hard deadlines are performance requirements that absolutely must be met and a missed deadline constitutes an erroneous action and a possible system failure. This paper proposes the use of metrics to assess the system design to flag where transient faults may have significant impact. These tools then allow the design to be changed to minimize that impact, and they also flag where particular design techniques – such as coding of communications or memories – need to be applied in later stages of design.
Keywords: Criticality, Metrics, Real-Time Systems, Transient Faults.
Downloads 1340
625 SAF: A Substitution and Alignment Free Similarity Measure for Protein Sequences
Authors: Abdellali Kelil, Shengrui Wang, Ryszard Brzezinski
Abstract:
The literature reports a large number of approaches for measuring the similarity between protein sequences. Most of these approaches estimate this similarity using alignment-based techniques that do not necessarily yield biologically plausible results, for two reasons. First, for the case of non-alignable (i.e., not yet definitively aligned and biologically approved) sequences such as multi-domain, circular permutation and tandem repeat protein sequences, alignment-based approaches do not succeed in producing biologically plausible results. This is due to the nature of the alignment, which is based on the matching of subsequences in equivalent positions, while non-alignable proteins often have similar and conserved domains in non-equivalent positions. Second, the alignment-based approaches lead to similarity measures that depend heavily on the parameters set by the user for the alignment (e.g., gap penalties and substitution matrices). For easily alignable protein sequences, it's possible to supply a suitable combination of input parameters that allows such an approach to yield biologically plausible results. However, for difficult-to-align protein sequences, supplying different combinations of input parameters yields different results. Such variable results create ambiguities and complicate the similarity measurement task. To overcome these drawbacks, this paper describes a novel and effective approach for measuring the similarity between protein sequences, called SAF for Substitution and Alignment Free. Without resorting either to the alignment of protein sequences or to substitution relations between amino acids, SAF is able to efficiently detect the significant subsequences that best represent the intrinsic properties of protein sequences, those underlying the chronological dependencies of structural features and biochemical activities of protein sequences. 
Moreover, by using a new efficient subsequence matching scheme, SAF more efficiently handles protein sequences that contain similar structural features with significant meaning in chronologically non-equivalent positions. To show the effectiveness of SAF, extensive experiments were performed on protein datasets from different databases, and the results were compared with those obtained by several mainstream algorithms.
Keywords: Protein, Similarity, Substitution, Alignment.
Downloads 1409
624 Bee Optimized Fuzzy Geographical Routing Protocol for VANET
Authors: P. Saravanan, T. Arunkumar
Abstract:
Vehicular Adhoc Network (VANET) is a new technology which aims to ensure intelligent inter-vehicle communications and seamless internet connectivity, leading to improved road safety, essential alerts, and access to comfort and entertainment. VANET operations are hindered by the uncertain mobility of mobile nodes (vehicles). Routing algorithms use metrics to evaluate which path is best for packets to travel. Metrics like path length (hop count), delay, reliability, bandwidth, and load determine the optimal route. The proposed scheme exploits link quality, traffic density, and intersections as routing metrics to determine the next hop. This study enhances the Geographical Routing Protocol (GRP) using fuzzy controllers whose rules are optimized with Bee Swarm Optimization (BSO). Simulation results are compared to conventional GRP.
Keywords: Bee Swarm Optimization (BSO), Geographical Routing Protocol (GRP), Vehicular Adhoc Network (VANET).
Downloads 2458
623 Video Quality Control Using a ROI and Two-Component Weighted Metrics
Authors: Petra Heribanová, Jaroslav Polec, Michal Martinovič
Abstract:
In this paper we propose a new content-weighted method for full-reference (FR) video quality control that uses a region of interest (ROI) and two-component weighted metrics for deaf people's video communication. In our approach, an image is partitioned into a region of interest and a "dry-as-dust" region; the region of interest is then partitioned into two parts, edges and background (smooth regions), whereas other methods (metrics) combine and weight three or more parts such as edges, edge errors, texture, smooth regions, blur and block distance. We also use the idea that different image regions in deaf people's video communication have different perceptual significance relative to quality. Intensity edges certainly contain considerable image information and are perceptually significant.
Keywords: Video quality assessment, weighted MSE.
Downloads 1981
622 Towards the Use of Software Product Metrics as an Indicator for Measuring Mobile Applications Power Consumption
Authors: Ching Kin Keong, Koh Tieng Wei, Abdul Azim Abd. Ghani, Khaironi Yatim Sharif
Abstract:
Maintaining the factory default battery endurance rate over time while supporting a huge number of running applications on energy-restricted mobile devices has created a new challenge for mobile application developers. While trying to meet customers’ unlimited expectations, developers are barely aware of how efficiently the application itself uses energy. Thus, developers need a set of valid energy consumption indicators to assist them in developing energy-saving applications. In this paper, we present a few software product metrics that can be used as indicators of the energy consumption of Android-based mobile applications in the early design stage. In particular, Trepn Profiler (a power profiling tool for Qualcomm processors) was used to collect mobile application power consumption data, which were then analyzed against the 23 software metrics in this preliminary study. The results show that McCabe cyclomatic complexity, number of parameters, nested block depth, number of methods, weighted methods per class, number of classes, total lines of code and method lines have a direct relationship with the power consumption of mobile applications.
Keywords: Battery endurance, software metrics, mobile application, power consumption.
Downloads 1943
621 Handover for Dense Small Cells Heterogeneous Networks: A Power-Efficient Game Theoretical Approach
Authors: Mohanad Alhabo, Li Zhang, Naveed Nawaz
Abstract:
In this paper, a non-cooperative game method is formulated where all players compete to transmit at higher power. Every base station represents a player in the game. The game is solved by obtaining the Nash equilibrium (NE) where the game converges to optimality. The proposed method, named Power Efficient Handover Game Theoretic (PEHO-GT) approach, aims to control the handover in dense small cell networks. Players optimize their payoff by adjusting the transmission power to improve the performance in terms of throughput, handover, power consumption and load balancing. To select the desired transmission power for a player, the payoff function considers the gain of increasing the transmission power. Then, the cell selection takes place by deploying Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS). A game theoretical method is implemented for heterogeneous networks to validate the improvement obtained. Results reveal that the proposed method gives a throughput improvement while reducing the power consumption and minimizing the frequent handover.
Keywords: Energy efficiency, game theory, handover, HetNets, small cells.
Downloads 468
620 Sequence Relationships Similarity of Swine Influenza A (H1N1) Virus
Authors: Patsaraporn Somboonsak, Mud-Armeen Munlin
Abstract:
In April 2009, a new variant of Influenza A virus subtype H1N1 emerged in Mexico and spread all over the world. Influenza A has three subtypes in humans (H1N1, H1N2 and H3N2). Types B and C influenza tend to be associated with local or regional epidemics. Preliminary genetic characterization of the influenza viruses has identified them as swine influenza A (H1N1) viruses. Nucleotide sequence analysis shows that the Haemagglutinin (HA) and Neuraminidase (NA) genes are similar to each other and to the majority of the genes of swine influenza viruses; two genes, coding for the neuraminidase (NA) and matrix (M) proteins, are similar to the corresponding genes of swine influenza. Sequence similarity between the 2009 A (H1N1) virus and its nearest relatives indicates that its gene segments have been circulating undetected for an extended period. Nucleic acid sequence Maximum Likelihood (MCL) and DNA empirical base frequencies are used to examine the phylogenetic relationship amongst the HA genes of H1N1 viruses in GenBank that have high nucleotide sequence homology. In this paper we used 16 HA nucleotide sequences from NCBI to compute the sequence relationship similarity of swine influenza A virus. The results are 28% for MCL, 36.64% for the optimal tree with the sum of branch lengths, 35.62% for the interior branch phylogeny neighbor-joining tree, 1.85% for the overall transition/transversion, and 8.28% for the overall mean distance.
Keywords: Sequence DNA, Relationship of swine, Swine influenza, Sequence Similarity
Downloads 2123
619 Robust Face Recognition using AAM and Gabor Features
Authors: Sanghoon Kim, Sun-Tae Chung, Souhwan Jung, Seoungseon Jeon, Jaemin Kim, Seongwon Cho
Abstract:
In this paper, we propose a face recognition algorithm using AAM and Gabor features. Gabor feature vectors, which are well known to be robust with respect to small variations of shape, scaling, rotation, distortion, illumination and pose in images, are popularly employed as feature vectors in many object detection and recognition algorithms. EBGM, which is prominent among face recognition algorithms employing Gabor feature vectors, requires localization of the facial feature points where Gabor feature vectors are extracted. However, the localization method employed in EBGM is based on Gabor jet similarity and is sensitive to initial values. Wrong localization of facial feature points affects the face recognition rate. AAM is known to be successfully applied to the localization of facial feature points. In this paper, we devise a facial feature point localization method which first roughly estimates the facial feature points using AAM and then refines them using the Gabor jet similarity-based localization method, with initial points set to the rough estimates obtained from AAM. We then propose a face recognition algorithm using the devised localization method and Gabor feature vectors. Experiments show that such a cascaded localization method based on both AAM and Gabor jet similarity is more robust than the localization method based on Gabor jet similarity alone. It is also shown that the proposed face recognition algorithm using the devised localization method and Gabor feature vectors performs better than conventional face recognition algorithms, such as EBGM, that use the Gabor jet similarity-based localization method and Gabor feature vectors.
Keywords: Face Recognition, AAM, Gabor features, EBGM.
Downloads 2205
618 Video Quality Assessment Methods: A Bird’s-Eye View
Authors: P. M. Arun Kumar, S. Chandramathi
Abstract:
The proliferation of multimedia technology and services in today’s world provides ample research scope at the frontiers of visual signal processing. The widespread usage of video-based applications in heterogeneous environments needs viable methods of Video Quality Assessment (VQA). The evaluation of video quality not only depends on high QoS requirements but also emphasizes the need for the novel term ‘QoE’ (Quality of Experience), which treats video quality as user-centric. This paper discusses two vital video quality assessment methods, namely subjective and objective assessment methods. The evolution of various video quality metrics, their classification models and their applications are reviewed in this work. Mean Opinion Score (MOS) based subjective measurements and algorithm-based objective metrics are discussed and their challenges are outlined. Further, this paper explores the recent progress of VQA in emerging technologies such as mobile video and 3D video.
Keywords: 3D-Video, no reference metric, quality of experience, video quality assessment, video quality metrics.
Downloads 4053
617 Using Multi-Arm Bandits to Optimize Game Play Metrics and Effective Game Design
Authors: Kenny Raharjo, Ramon Lawrence
Abstract:
Game designers have the challenging task of building games that engage players to spend their time and money on the game. There is an infinite number of game variations and design choices, and it is hard to systematically determine the game design choices that will lead to positive experiences for players. In this work, we demonstrate how multi-arm bandits can be used to automatically explore game design variations to achieve improved player metrics. The advantage of multi-arm bandits is that they allow for continuous experimentation and variation, intrinsically converge to the best solution, and require no special infrastructure to use beyond allowing minor game variations to be deployed to users for evaluation. A user study confirms that applying multi-arm bandits was successful in determining the preferred game variation with the highest play time metrics and can be a useful technique in a game designer's toolkit.
Keywords: Game design, multi-arm bandit, design exploration and data mining, player metric optimization and analytics.
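The abstract does not say which bandit strategy was used, so the following is only a minimal sketch that assumes an epsilon-greedy policy. The class `EpsilonGreedyBandit`, the helper `run_experiment`, and the simulated play times are all hypothetical; each "arm" stands for one game variation and the reward is the observed play time.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit over a fixed set of game variations (arms)."""

    def __init__(self, n_arms, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # plays recorded per variation
        self.means = [0.0] * n_arms   # running mean reward per variation
        self.rng = rng if rng is not None else random.Random()

    def select_arm(self):
        # Explore a random variation with probability epsilon,
        # otherwise exploit the variation with the best mean so far.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.means)), key=lambda a: self.means[a])

    def update(self, arm, reward):
        # Incremental update of the running mean for the played variation.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def run_experiment(true_means, rounds, seed=0):
    """Simulate players whose play time is Gaussian around a hidden mean."""
    rng = random.Random(seed)
    bandit = EpsilonGreedyBandit(len(true_means), epsilon=0.1, rng=rng)
    for _ in range(rounds):
        arm = bandit.select_arm()
        play_time = rng.gauss(true_means[arm], 1.0)  # assumed reward model
        bandit.update(arm, play_time)
    return bandit
```

Calling `run_experiment([5.0, 8.0, 6.0], 2000)` and inspecting `bandit.means` shows which simulated variation the policy has come to prefer. A real deployment would replace the Gaussian draw with the measured player metric.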
Downloads 1535
616 Behavioral Signature Generation using Shadow Honeypot
Authors: Maros Barabas, Michal Drozd, Petr Hanacek
Abstract:
A novel behavioral detection framework is proposed to detect zero-day buffer overflow vulnerabilities (based on network behavioral signatures) using zero-day exploits, instead of the signature-based or anomaly-based detection solutions currently available for IDPS techniques. First, we present the detection model, which uses a shadow honeypot. Our system is used for the online processing of network attacks and for generating a behavior detection profile. The detection profile represents a dataset of 112 types of metrics describing the exact behavior of malware in the network. In this paper we present examples of generating behavioral signatures for two attacks: a buffer overflow exploit on an FTP server and the well-known Conficker worm. We demonstrate the visualization of important aspects by showing the differences between valid behavior and the attacks. Based on these metrics we can detect attacks with a very high probability of success; the detection process is, however, very expensive.
Keywords: behavioral signatures, metrics, network, security design
Downloads 2053
615 Classifier Combination Approach in Motion Imagery Signals Processing for Brain Computer Interface
Authors: Homayoon Zarshenas, Mahdi Bamdad, Hadi Grailu, Akbar A. Shakoori
Abstract:
In this study we focus on improving the performance of a cue-based Motor Imagery Brain Computer Interface (BCI). For this purpose, a data fusion approach is applied to the results of different classifiers to make the best decision. In the first step, the Distinction Sensitive Learning Vector Quantization method is used as a feature selection method to determine the most informative frequencies in the recorded signals, and its performance is evaluated by a frequency search method. Then informative features are extracted by the wavelet packet transform. In the next step, 5 different types of classification methods are applied. The methodologies are tested on BCI Competition II dataset III; the best accuracy obtained is 85% and the best kappa value is 0.8. In the final step, the ordered weighted averaging (OWA) method is used to provide a proper aggregation of the classifiers' outputs. Using OWA enhanced the system accuracy to 95% and the kappa value to 0.9. Applying OWA takes just 50 milliseconds of computation.
Keywords: BCI, EEG, Classifier, Fuzzy operator, OWA.
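The OWA aggregation step mentioned above can be illustrated with a short sketch. It shows only the standard OWA operator (sort the inputs in descending order, then take a weighted sum with weights that add up to 1). The function names, the weights, and the example scores are hypothetical and are not taken from the paper.

```python
def owa(values, weights):
    """Ordered weighted averaging: weights apply to ranks, not to sources."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("OWA weights must sum to 1")
    ordered = sorted(values, reverse=True)  # largest score gets weights[0]
    return sum(w * v for w, v in zip(weights, ordered))


def fuse_classifiers(scores_per_class, weights):
    """Pick the class whose classifier scores have the highest OWA value.

    scores_per_class maps a class label to the list of confidence scores
    that the individual classifiers assigned to that class.
    """
    fused = {label: owa(s, weights) for label, s in scores_per_class.items()}
    return max(fused, key=fused.get), fused
```

For example, `fuse_classifiers({"left": [0.2, 0.9, 0.5], "right": [0.8, 0.1, 0.5]}, [0.5, 0.3, 0.2])` aggregates three classifiers' scores per class. Choosing weights `[1, 0, 0]` reduces OWA to the maximum and equal weights reduce it to the arithmetic mean.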
Downloads 1876
614 Abrupt Scene Change Detection
Authors: Priyadarshinee Adhikari, Neeta Gargote, Jyothi Digge, B.G. Hogade
Abstract:
A number of automated shot-change detection methods for indexing a video sequence to facilitate browsing and retrieval have been proposed in recent years. This paper focuses on the simulation of video shot boundary detection using a color histogram method in which scaling of the histogram metrics is an added feature. The difference between the histograms of two consecutive frames is evaluated, resulting in the metrics. The metrics are then scaled to avoid ambiguity and to enable the choice of an apt threshold for any type of video that involves minor errors due to flashlight, camera motion, etc. Two sample videos with a resolution of 352 × 240 pixels are used here, with the color histogram approach applied to the uncompressed media. An attempt is made at the retrieval of color video. The simulation is performed for abrupt changes in video and yields 90% recall and precision.
Keywords: Abrupt change, color histogram, ground-truthing, precision, recall, scaling, threshold.
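The histogram-difference idea can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: frames are flat lists of 8-bit intensities for a single channel, the scaling divides the absolute bin difference by the total pixel count of both frames so the metric falls in [0, 1], and the bin count and threshold are arbitrary. The abstract does not specify its own scaling.

```python
def histogram(frame, bins=8, max_value=256):
    """Count pixels per intensity bin for one channel of a frame."""
    hist = [0] * bins
    for pixel in frame:
        hist[pixel * bins // max_value] += 1
    return hist


def scaled_difference(frame_a, frame_b, bins=8):
    """Sum of absolute bin differences, scaled into the range [0, 1]."""
    hist_a = histogram(frame_a, bins)
    hist_b = histogram(frame_b, bins)
    diff = sum(abs(a - b) for a, b in zip(hist_a, hist_b))
    return diff / (len(frame_a) + len(frame_b))


def detect_cuts(frames, threshold=0.5, bins=8):
    """Return indices i where frame i starts a new shot (abrupt change)."""
    return [
        i
        for i in range(1, len(frames))
        if scaled_difference(frames[i - 1], frames[i], bins) > threshold
    ]
```

With four tiny synthetic frames, two dark and then two bright, `detect_cuts` flags only the boundary between the second and third frame. For color video the same difference could be computed per channel and combined.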
Downloads 2101
613 Recursive Similarity Hashing of Fractal Geometry
Authors: Timothee G. Leleu
Abstract:
A new technique of topological multi-scale analysis is introduced. By performing a clustering recursively to build a hierarchy, and analyzing the co-scale and intra-scale similarities, an Iterated Function System can be extracted from any data set. The study of fractals shows that this method is efficient at extracting self-similarities and can find elegant solutions to the inverse problem of building fractals. The theoretical aspects and practical implementations are discussed, together with examples of analyses of simple fractals.
Keywords: hierarchical clustering, multi-scale analysis, Similarity hashing.
Downloads 1862
612 Hutchinson-Barnsley Operator in Fuzzy Metric Spaces
Authors: R. Uthayakumar, D. Easwaramoorthy
Abstract:
The purpose of this paper is to present the fuzzy contraction properties of the Hutchinson-Barnsley operator on the fuzzy hyperspace with respect to the Hausdorff fuzzy metrics. We also discuss the relationships between the Hausdorff fuzzy metrics on the fuzzy hyperspaces. Our theorems generalize and extend some recent results related to the Hutchinson-Barnsley operator in metric spaces.
Keywords: Fractals, Iterated Function System, Hutchinson-Barnsley Operator, Fuzzy Metric Space, Hausdorff Fuzzy Metric.
Downloads 1802
611 Towards Clustering of Web-based Document Structures
Authors: Matthias Dehmer, Frank Emmert Streib, Jürgen Kilian, Andreas Zulauf
Abstract:
Methods for organizing web data into groups in order to analyze web-based hypertext data and facilitate data availability are very important given the number of documents available online. The task of clustering web-based document structures therefore has many applications, e.g., improving information retrieval on the web, better understanding of user navigation behavior, improving the servicing of web users' requests, and increasing web information accessibility. In this paper we investigate a new approach for clustering web-based hypertexts on the basis of their graph structures. The hypertexts are represented as so-called generalized trees, which are more general than the usual directed rooted trees, e.g., DOM-Trees. As an important preprocessing step, we measure the structural similarity between the generalized trees on the basis of a similarity measure d. Then, we apply agglomerative clustering to the obtained similarity matrix in order to create clusters of hypertext graph patterns representing navigation structures. In the present paper we run our approach on a data set of hypertext structures and obtain good results in Web Structure Mining. Furthermore, we outline the application of our approach in Web Usage Mining as future work.
Keywords: Clustering methods, graph-based patterns, graph similarity, hypertext structures, web structure mining
Downloads 1505
610 Incorporating Semantic Similarity Measure in Genetic Algorithm: An Approach for Searching the Gene Ontology Terms
Authors: Razib M. Othman, Safaai Deris, Rosli M. Illias, Hany T. Alashwal, Rohayanti Hassan, Farhan Mohamed
Abstract:
The most important property of the Gene Ontology is the terms. These controlled vocabularies are defined to provide consistent descriptions of gene products that are shareable and computationally accessible by humans, software agents, or other machine-readable meta-data. Each term is associated with information such as a definition, synonyms, database references, amino acid sequences, and relationships to other terms. This information has led to the Gene Ontology being broadly applied in microarray and proteomic analysis. However, the process of searching the terms is still carried out using the traditional approach, which is based on keyword matching. The weaknesses of this approach are that it ignores semantic relationships between terms and depends heavily on a specialist to find similar terms. Therefore, this study combines a semantic similarity measure and a genetic algorithm to perform a better retrieval process for searching semantically similar terms. The semantic similarity measure is used to compute the similitude strength between two terms. Then, the genetic algorithm is employed to perform batch retrievals and to handle the large search space of the Gene Ontology graph. The computational results are presented to show the effectiveness of the proposed algorithm.
Keywords: Gene Ontology, Semantic similarity measure, Genetic algorithm, Ontology search
Downloads 1489
609 Effective Keyword and Similarity Thresholds for the Discovery of Themes from the User Web Access Patterns
Authors: Haider A Ramadhan, Khalil Shihab
Abstract:
Clustering techniques have been used by many intelligent software agents to group similar access patterns of Web users into high-level themes which express users' intentions and interests. However, such techniques have mostly focused on one salient feature of the Web documents visited by the user, namely the extracted keywords. The major aim of these techniques is to come up with an optimal threshold for the number of keywords needed to produce more focused themes. In this paper we focus on both keyword and similarity thresholds to generate more concentrated themes, and hence build a sounder model of user behavior. The purpose of this paper is twofold: to use distance-based clustering methods to recognize overall themes from the proxy log file, and to suggest efficient cut-off levels for the keyword and similarity thresholds which tend to produce better clusters with better focus and efficient size.
Keywords: Data mining, knowledge discovery, clustering, data analysis, Web log analysis, theme based searching.
Downloads 1453
608 A Hybrid Gene Selection Technique Using Improved Mutual Information and Fisher Score for Cancer Classification Using Microarrays
Authors: M. Anidha, K. Premalatha
Abstract:
Feature selection is important for performing constructive classification in the area of cancer diagnosis. However, in microarray gene expression datasets a large number of features compared to the number of samples makes the task of classification computationally very hard and prone to errors. In this paper, we present an innovative method for selecting highly informative gene subsets of gene expression data that effectively classifies the cancer data into tumorous and non-tumorous. The hybrid gene selection technique combines Mutual Information and Fisher score to select informative genes. The gene selection is validated by classification using a Support Vector Machine (SVM), a supervised learning algorithm capable of solving complex classification problems. The improved Mutual Information and F-Score with SVM as the classifier produced efficient results.
Keywords: Gene selection, mutual information, Fisher score, classification, SVM.
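As a rough illustration of the two filter scores named in this abstract, the sketch below ranks genes by a standard Fisher score and by mutual information with the class label. The abstract does not describe the "improved" mutual information or how the two scores are combined, so the median binarisation, the sum of min-max normalised scores, and the name `rank_genes` are assumptions. The SVM validation step is omitted.

```python
import math
from collections import Counter


def fisher_score(values, labels):
    """Between-class scatter divided by within-class scatter for one gene."""
    overall = sum(values) / len(values)
    between = within = 0.0
    for c in set(labels):
        group = [v for v, y in zip(values, labels) if y == c]
        mean_c = sum(group) / len(group)
        between += len(group) * (mean_c - overall) ** 2
        within += sum((v - mean_c) ** 2 for v in group)
    return between / (within + 1e-12)  # small constant avoids division by zero


def mutual_information(values, labels):
    """Mutual information (bits) between a median-binarised gene and the label."""
    median = sorted(values)[len(values) // 2]
    bins = [v >= median for v in values]  # assumed discretisation
    n = len(values)
    joint, px, py = Counter(zip(bins, labels)), Counter(bins), Counter(labels)
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def rank_genes(expression, labels, top_k):
    """Rank genes by the sum of min-max normalised scores (assumed combination)."""
    def normalise(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        return {g: (s - lo) / span if span > 0 else 0.0 for g, s in scores.items()}

    fisher = normalise({g: fisher_score(v, labels) for g, v in expression.items()})
    mi = normalise({g: mutual_information(v, labels) for g, v in expression.items()})
    combined = {g: fisher[g] + mi[g] for g in expression}
    return sorted(combined, key=lambda g: (-combined[g], g))[:top_k]


# Hypothetical toy data: geneA separates the classes, geneB does not.
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "geneA": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],
    "geneB": [2.0, 3.0, 2.5, 2.1, 3.1, 2.4],
}
selected = rank_genes(expression, labels, top_k=1)
```

In a full pipeline the selected genes would then be passed to a classifier such as an SVM, and the selection judged by the resulting test accuracy.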
Downloads 1152
607 Using Genetic Algorithm to Improve Information Retrieval Systems
Authors: Ahmed A. A. Radwan, Bahgat A. Abdel Latef, Abdel Mgeid A. Ali, Osman A. Sadek
Abstract:
This study investigates the use of genetic algorithms in information retrieval. The method is shown to be applicable to three well-known document collections, where more relevant documents are presented to users after the genetic modification. In this paper we present a new fitness function for approximate information retrieval that is faster and more flexible than the cosine similarity fitness function.
Keywords: Cosine similarity, Fitness function, Genetic Algorithm, Information Retrieval, Query learning.
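For reference, the baseline that the abstract compares against can be sketched as follows. The abstract does not define the authors' new fitness function, so only a cosine similarity fitness is shown. Scoring a candidate query by its mean similarity to known relevant documents, and the name `cosine_fitness`, are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cosine_fitness(query_weights, relevant_docs):
    """Fitness of a candidate query (one GA chromosome): mean cosine
    similarity to the documents judged relevant (assumed formulation)."""
    total = sum(cosine_similarity(query_weights, doc) for doc in relevant_docs)
    return total / len(relevant_docs)


# Hypothetical three-term vocabulary.
fitness = cosine_fitness([1.0, 1.0, 0.0], [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
```

A genetic algorithm would evaluate this fitness for every chromosome in each generation, which is why the cost of the fitness function matters.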
Downloads 2754
606 Non-Overlapping Hierarchical Index Structure for Similarity Search
Authors: Mounira Taileb, Sid Lamrous, Sami Touati
Abstract:
To accelerate similarity search in high-dimensional databases, we propose a new hierarchical indexing method composed of an offline and an online phase. Our contribution concerns both phases. In the offline phase, after grouping all the data into clusters and constructing a hierarchical index, the main originality of our contribution is a method for constructing bounding forms of clusters that avoids overlapping. In the online phase, this idea considerably improves the performance of similarity search; for this second phase we have also developed an adapted search algorithm. Our method, named NOHIS (Non-Overlapping Hierarchical Index Structure), uses Principal Direction Divisive Partitioning (PDDP) as its clustering algorithm. The principle of PDDP is to divide the data recursively into two sub-clusters; the division uses the hyperplane that is orthogonal to the principal direction derived from the covariance matrix and passes through the centroid of the cluster being divided. The data of each of the two resulting sub-clusters are enclosed in a minimum bounding rectangle (MBR). The two MBRs are oriented according to the principal direction, so the two forms are guaranteed not to overlap. Experiments use databases containing image descriptors. Results show that the proposed method outperforms sequential scan and the SR-tree in processing k-nearest neighbor queries.
Keywords: K-nearest neighbour search, multi-dimensional indexing, multimedia databases, similarity search.
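One PDDP division step, as described in the abstract, can be sketched in plain Python. This is an illustrative sketch only: the principal direction is approximated by power iteration on the covariance matrix (an assumed numerical choice), the name `pddp_split` is hypothetical, and the oriented MBR construction of NOHIS is not included.

```python
import math


def pddp_split(points, iterations=200):
    """Split points by the hyperplane through the centroid that is
    orthogonal to the principal direction of the covariance matrix."""
    n, dim = len(points), len(points[0])
    centroid = [sum(p[j] for p in points) / n for j in range(dim)]
    centered = [[p[j] - centroid[j] for j in range(dim)] for p in points]
    cov = [[sum(r[a] * r[b] for r in centered) / n for b in range(dim)]
           for a in range(dim)]

    # Power iteration for the leading eigenvector. The start vector is an
    # arbitrary choice; it fails if exactly orthogonal to that eigenvector.
    direction = [1.0 + 0.1 * j for j in range(dim)]
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * direction[b] for b in range(dim))
               for a in range(dim)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break  # all points coincide, nothing to split
        direction = [x / norm for x in nxt]

    # The sign of the projection onto the principal direction decides the side.
    left, right = [], []
    for point, row in zip(points, centered):
        projection = sum(row[j] * direction[j] for j in range(dim))
        (left if projection <= 0 else right).append(point)
    return left, right, direction


# Hypothetical 2-D descriptors forming two groups along the x axis.
points = [(0.0, 0.0), (1.0, 0.5), (0.5, -0.5),
          (10.0, 0.0), (11.0, 0.5), (10.5, -0.5)]
left, right, direction = pddp_split(points)
```

Applying the same step recursively to `left` and `right` yields the binary hierarchy. Because each split is a single hyperplane, boxes aligned with `direction` on either side cannot overlap.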
Downloads 1561
605 3D Objects Indexing Using Spherical Harmonic for Optimum Measurement Similarity
Authors: S. Hellam, Y. Oulahrir, F. El Mounchid, A. Sadiq, S. Mbarki
Abstract:
In this paper, we propose a method for three-dimensional (3-D) model indexing based on a new descriptor that uses spherical harmonics. The purpose of the method is to minimize both the processing time on the database of object models and the time needed to search for objects similar to the query object. We first define the new descriptor using a new division of the 3-D object within a sphere. We then define a new distance, which is used to search for similar objects in the database.
Keywords: 3D indexation, spherical harmonic, similarity of 3D objects.
Downloads 2231
604 A Game-Theoretic Approach to Hedonic Housing Prices
Authors: Cielito F. Habito, Michael O. Santos, Andres G. Victorio
Abstract:
A property's selling price is described as the result of sequential bargaining between a buyer and a seller in an environment of asymmetric information. Hedonic housing prices are estimated based upon 17,333 records of New Zealand residential properties sold during the years 2006 and 2007.
Keywords: Housing demand, hedonics and valuation, residential markets.
Downloads 1385
603 Developing and Implementing Successful Key Performance Indicators
Authors: Marie Mikušová, Viktorie Janečková
Abstract:
Measurement and the subsequent evaluation of performance are an important part of management. The paper focuses on indicators as the basic elements of a performance measurement system. It emphasizes the need to identify requirements for quality indicators so that they can become part of a useful system. It introduces criteria for systematically classifying indicators so that they provide the highest possible informative value as background sources for searching, analysing, designing and using indicators. It draws attention to the requirements for indicator quality and, at the same time, deals with some dangers that decrease an indicator's informative value. It presents a draft set of questions that should be answered when constructing an indicator. Individual indicators need to be defined exactly in order to stimulate the desired behavior and attain the expected results. The appendix gives a concrete example of a defined indicator in the specific conditions of a small firm. The authors point out that a quality indicator makes it possible to get to the root causes of a problem and to incorporate the established facts into the company information system. They also emphasize that developing a quality indicator is a prerequisite for using the measurement system in management.
Keywords: performance, measurement, firm, indicator
Downloads 1555
602 Improved Weighted Matching for Speaker Recognition
Authors: Ozan Mut, Mehmet Göktürk
Abstract:
Matching algorithms are of significant importance in speaker recognition. As the last step in speaker recognition, feature vectors of the unknown utterance are compared to feature vectors of the modeled speakers. A similarity score is found for every model in the speaker database. Depending on the type of speaker recognition, these scores are used to determine the author of unknown speech samples. For speaker verification, the similarity score is tested against a predefined threshold, and the result is either acceptance or rejection. In the case of speaker identification, the result depends on whether the identification is open set or closed set. In closed set identification, the model that yields the best similarity score is accepted. In open set identification, the best score is tested against a threshold, so there is one more possible output: the speaker is not one of the registered speakers in the existing database. This paper focuses on closed set speaker identification using a modified version of a well-known matching algorithm. The results of the new matching algorithm indicate better performance on the YOHO international speaker recognition database.
Keywords: Automatic Speaker Recognition, Voice Recognition, Pattern Recognition, Digital Audio Signal Processing.
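The three decision rules described in this abstract (verification, closed set identification, open set identification) can be sketched as below. The sketch assumes that a higher score means a closer match and that scores arrive as a dictionary keyed by speaker name. The function names are hypothetical, and the weighted matching algorithm that produces the scores is not shown.

```python
def verify(score, threshold):
    """Speaker verification: accept the claimed identity if the score
    reaches the predefined threshold."""
    return score >= threshold


def identify_closed_set(scores):
    """Closed set identification: the best-scoring model is always accepted."""
    return max(scores, key=scores.get)


def identify_open_set(scores, threshold):
    """Open set identification: the best score must also pass a threshold;
    None means the speaker is not one of the registered speakers."""
    best = identify_closed_set(scores)
    return best if scores[best] >= threshold else None


# Hypothetical similarity scores of one utterance against three models.
scores = {"alice": 0.62, "bob": 0.81, "carol": 0.40}
closed = identify_closed_set(scores)
open_hit = identify_open_set(scores, threshold=0.75)
open_miss = identify_open_set(scores, threshold=0.90)
```

If the matching algorithm returns distances rather than similarities, the comparisons and the `max` call would need to be reversed.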
Downloads 1731