Search results for: Sequence Similarity
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 847

Search results for: Sequence Similarity

637 On the Effectivity of Different Pseudo-Noise and Orthogonal Sequences for Speech Encryption from Correlation Properties

Authors: V. Anil Kumar, Abhijit Mitra, S. R. Mahadeva Prasanna

Abstract:

We analyze the effectivity of different pseudo noise (PN) and orthogonal sequences for encrypting speech signals in terms of perceptual intelligence. Speech signal can be viewed as sequence of correlated samples and each sample as sequence of bits. The residual intelligibility of the speech signal can be reduced by removing the correlation among the speech samples. PN sequences have random like properties that help in reducing the correlation among speech samples. The mean square aperiodic auto-correlation (MSAAC) and the mean square aperiodic cross-correlation (MSACC) measures are used to test the randomness of the PN sequences. Results of the investigation show the effectivity of large Kasami sequences for this purpose among many PN sequences.

Keywords: Speech encryption, pseudo-noise codes, maximallength, Gold, Barker, Kasami, Walsh-Hadamard, autocorrelation, crosscorrelation, figure of merit.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2002
636 A Low Complexity Frequency Offset Estimation for MB-OFDM based UWB Systems

Authors: Wang Xue, Liu Dan, Liu Ying, Wang Molin, Qian Zhihong

Abstract:

A low-complexity, high-accurate frequency offset estimation for multi-band orthogonal frequency division multiplexing (MB-OFDM) based ultra-wide band systems is presented regarding different carrier frequency offsets, different channel frequency responses, different preamble patterns in different bands. Utilizing a half-cycle Constant Amplitude Zero Auto Correlation (CAZAC) sequence as the preamble sequence, the estimator with a semi-cross contrast scheme between two successive OFDM symbols is proposed. The CRLB and complexity of the proposed algorithm are derived. Compared to the reference estimators, the proposed method achieves significantly less complexity (about 50%) for all preamble patterns of the MB-OFDM systems. The CRLBs turn out to be of well performance.

Keywords: CAZAC, Frequency Offset, Semi-cross Contrast, MB-OFDM, UWB

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1615
635 On Pseudo-Random and Orthogonal Binary Spreading Sequences

Authors: Abhijit Mitra

Abstract:

Different pseudo-random or pseudo-noise (PN) as well as orthogonal sequences that can be used as spreading codes for code division multiple access (CDMA) cellular networks or can be used for encrypting speech signals to reduce the residual intelligence are investigated. We briefly review the theoretical background for direct sequence CDMA systems and describe the main characteristics of the maximal length, Gold, Barker, and Kasami sequences. We also discuss about variable- and fixed-length orthogonal codes like Walsh- Hadamard codes. The equivalence of PN and orthogonal codes are also derived. Finally, a new PN sequence is proposed which is shown to have certain better properties than the existing codes.

Keywords: Code division multiple access, pseudo-noise codes, maximal length, Gold, Barker, Kasami, Walsh-Hadamard, autocorrelation, crosscorrelation, figure of merit.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5990
634 Molecular Dynamics and Circular Dichroism Studies on Aurein 1.2 and Retro Analog

Authors: Safyeh Soufian, Hoosein Naderi-Manesh, Abdoali Alizadeh, Mohammad Nabi Sarbolouki

Abstract:

Aurein 1.2 is a 13-residue amphipathic peptide with antibacterial and anticancer activity. Aurein1.2 and its retro analog were synthesized to study the activity of the peptides in relation to their structure. The antibacterial test result showed the retro-analog is inactive. The secondary structural analysis by CD spectra indicated that both of the peptides at TFE/Water adopt alpha-helical conformation. MD simulation was performed on aurein 1.2 and retro-analog in water and TFE in order to analyse the factors that are involved in the activity difference between retro and the native peptide. The simulation results are discussed and validated in the light of experimental data from the CD experiment. Both of the peptides showed a relatively similar pattern for their hydrophobicity, hydrophilicity, solvent accessible surfaces, and solvent accessible hydrophobic surfaces. However, they showed different in directions of dipole moment of peptides. Also, Our results further indicate that the reversion of the amino acid sequence affects flexibility .The data also showed that factors causing structural rigidity may decrease the activity. Consequently, our finding suggests that in the case of sequence-reversed peptide strategy, one has to pay attention to the role of amino acid sequence order in making flexibility and role of dipole moment direction in peptide activity. KeywordsAntimicrobial peptides, retro, molecular dynamic, circular dichroism.

Keywords: Antimicrobial peptides, retro, molecular dynamic, circular dichroism.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1813
633 Modeling Uncertainty in Multiple Criteria Decision Making Using the Technique for Order Preference by Similarity to Ideal Solution for the Selection of Stealth Combat Aircraft

Authors: C. Ardil

Abstract:

Uncertainty set theory is a generalization of fuzzy set theory and intuitionistic fuzzy set theory. It serves as an effective tool for dealing with inconsistent, imprecise, and vague information. The technique for order preference by similarity to ideal solution (TOPSIS) method is a multiple-attribute method used to identify solutions from a finite set of alternatives. It simultaneously minimizes the distance from an ideal point and maximizes the distance from a nadir point. In this paper, an extension of the TOPSIS method for multiple attribute group decision-making (MAGDM) based on uncertainty sets is presented. In uncertainty decision analysis, decision-makers express information about attribute values and weights using uncertainty numbers to select the best stealth combat aircraft.

Keywords: Uncertainty set, stealth combat aircraft selection multiple criteria decision-making analysis, MCDM, uncertainty decision analysis, TOPSIS

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 46
632 Performance Evaluation of One and Two Dimensional Prime Codes for Optical Code Division Multiple Access Systems

Authors: Gurjit Kaur, Neena Gupta

Abstract:

In this paper, we have analyzed and compared the performance of various coding schemes. The basic ID prime sequence codes are unique in only dimension, i.e. time slots, whereas 2D coding techniques are not unique by their time slots but with their wavelengths also. In this research, we have evaluated and compared the performance of 1D and 2D coding techniques constructed using prime sequence coding pattern for Optical Code Division Multiple Access (OCDMA) system on a single platform. Analysis shows that 2D prime code supports lesser number of active users than 1D codes, but they are having large code family and are the most secure codes compared to other codes. The performance of all these codes is analyzed on basis of number of active users supported at a Bit Error Rate (BER) of 10-9.

Keywords: CDMA, OCDMA, BER, OOC, PC, EPC, MPC, 2-D PC/PC, λc, λa.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1081
631 Inconsistency Discovery in Multiple State Diagrams

Authors: Mohammad N. Alanazi, David A. Gustafson

Abstract:

In this article, we introduce a new approach for analyzing UML designs to detect the inconsistencies between multiple state diagrams and sequence diagrams. The Super State Analysis (SSA) identifies the inconsistencies in super states, single step transitions, and sequences. Because SSA considers multiple UML state diagrams, it discovers inconsistencies that cannot be discovered when considering only a single UML state diagram. We have introduced a transition set that captures relationship information that is not specifiable in UML diagrams. The SSA model uses the transition set to link transitions of multiple state diagrams together. The analysis generates three different sets automatically. These sets are compared to the provided sets to detect the inconsistencies. SSA identifies five types of inconsistencies: impossible super states, unreachable super states, illegal transitions, missing transitions, and illegal sequences.

Keywords: Modeling Languages, Object-Oriented Analysis, Sequence Diagrams, Software Models, State Diagrams, UML.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1606
630 Formal Specification and Description Language and Message Sequence Chart to Model and Validate Session Initiation Protocol Services

Authors: Sa’ed Abed, Mohammad H. Al Shayeji, Ovais Ahmed, Sahel Alouneh

Abstract:

Session Initiation Protocol (SIP) is a signaling layer protocol for building, adjusting and ending sessions among participants including Internet conferences, telephone calls and multimedia distribution. SIP facilitates user movement by proxying and forwarding requests to the present location of the user. In this paper, we provide a formal Specification and Description Language (SDL) and Message Sequence Chart (MSC) to model and define the Internet Engineering Task Force (IETF) SIP protocol and its sample services resulted from informal SIP specification. We create an “Abstract User Interface” using case analysis so that can be applied to identify SIP services more explicitly. The issued sample SIP features are then used as case scenarios; they are revised in MSCs format and validated to their corresponding SDL models.

Keywords: Modeling, MSC, SDL, SIP, validating.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1222
629 Deterministic Random Number Generators for Online Applications

Authors: Natarajan Vijayarangan, Prasanna S. Bidare

Abstract:

Cryptography, Image watermarking and E-banking are filled with apparent oxymora and paradoxes. Random sequences are used as keys to encrypt information to be used as watermark during embedding the watermark and also to extract the watermark during detection. Also, the keys are very much utilized for 24x7x365 banking operations. Therefore a deterministic random sequence is very much useful for online applications. In order to obtain the same random sequence, we need to supply the same seed to the generator. Many researchers have used Deterministic Random Number Generators (DRNGs) for cryptographic applications and Pseudo Noise Random sequences (PNs) for watermarking. Even though, there are some weaknesses in PN due to attacks, the research community used it mostly in digital watermarking. On the other hand, DRNGs have not been widely used in online watermarking due to its computational complexity and non-robustness. Therefore, we have invented a new design of generating DRNG using Pi-series to make it useful for online Cryptographic, Digital watermarking and Banking applications.

Keywords: E-tokens, LFSR, non-linear, Pi series, pseudo random number.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1967
628 Identification of PIP Aquaporin Genes from Wheat

Authors: Sh. A. Yousif, M. Bhave

Abstract:

There is strong evidence that water channel proteins 'aquaporins (AQPs)' are central components in plant-water relations as well as a number of other physiological parameters. We had previously reported the isolation of 24 plasma membrane intrinsic protein (PIP) type AQPs. However, the gene numbers in rice and the polyploid nature of bread wheat indicated a high probability of further genes in the latter. The present work focused on identification of further AQP isoforms in bread wheat. With the use of altered primer design, we identified five genes homologous, designated PIP1;5b, PIP2;9b, TaPIP2;2, TaPIP2;2a, TaPIP2;2b. Sequence alignments indicate PIP1;5b, PIP2;9b are likely to be homeologues of two previously reported genes while the other three are new genes and could be homeologs of each other. The results indicate further AQP diversity in wheat and the sequence data will enable physical mapping of these genes to identify their genomes as well as genetic to determine their association with any quantitative trait loci (QTLs) associated with plant-water relation such as salinity or drought tolerance.

Keywords: Aquaporins, homeologues, PIP, wheat

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1988
627 Structural Basis of Resistance of Helicobacterpylori DnaK to Antimicrobial Peptide Pyrrhocoricin

Authors: Musammat F. Nahar, Anna Roujeinikova

Abstract:

Bacterial molecular chaperone DnaK plays an essential role in protein folding, stress response and transmembrane targeting of proteins. DnaKs from many bacterial species, including Escherichia coli, Salmonella typhimurium and Haemophilus infleunzae are the molecular targets for the insect-derived antimicrobial peptide pyrrhocoricin. Pyrrhocoricin-like peptides bind in the substrate recognition tunnel. Despite the high degree of crossspecies sequence conservation in the substrate-binding tunnel, some bacteria are not sensitive to pyrrhocoricin. This work addresses the molecular mechanism of resistance of Helicobacter pylori DnaK to pyrrhocoricin. Homology modelling, structural and sequence analysis identify a single aminoacid substitution at the interface between the lid and the β-sandwich subdomains of the DnaK substrate-binding domain as the major determinant for its resistance.

Keywords: Helicobacter pylori, molecular chaperone DnaK, pyrrhocoricin, structural biology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1707
626 Flow Duration Curves and Recession Curves Connection through a Mathematical Link

Authors: Elena Carcano, Mirzi Betasolo

Abstract:

This study helps Public Water Bureaus in giving reliable answers to water concession requests. Rapidly increasing water requests can be supported provided that further uses of a river course are not totally compromised, and environmental features are protected as well. Strictly speaking, a water concession can be considered a continuous drawing from the source and causes a mean annual streamflow reduction. Therefore, deciding if a water concession is appropriate or inappropriate seems to be easily solved by comparing the generic demand to the mean annual streamflow value at disposal. Still, the immediate shortcoming for such a comparison is that streamflow data are information available only for few catchments and, most often, limited to specific sites. Subsequently, comparing the generic water demand to mean daily discharge is indeed far from being completely satisfactory since the mean daily streamflow is greater than the water withdrawal for a long period of a year. Consequently, such a comparison appears to be of little significance in order to preserve the quality and the quantity of the river. In order to overcome such a limit, this study aims to complete the information provided by flow duration curves introducing a link between Flow Duration Curves (FDCs) and recession curves and aims to show the chronological sequence of flows with a particular focus on low flow data. The analysis is carried out on 25 catchments located in North-Eastern Italy for which daily data are provided. The results identify groups of catchments as hydrologically homogeneous, having the lower part of the FDCs (corresponding streamflow interval is streamflow Q between 300 and 335, namely: Q(300), Q(335)) smoothly reproduced by a common recession curve. In conclusion, the results are useful to provide more reliable answers to water request, especially for those catchments which show similar hydrological response and can be used for a focused regionalization approach on low flow data. A mathematical link between streamflow duration curves and recession curves is herein provided, thus furnishing streamflow duration curves information upon a temporal sequence of data. In such a way, by introducing assumptions on recession curves, the chronological sequence upon low flow data can also be attributed to FDCs, which are known to lack this information by nature.

Keywords: Chronological sequence of discharges, recession curves, streamflow duration curves, water concession.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 527
625 Minimal Spanning Tree based Fuzzy Clustering

Authors: Ágnes Vathy-Fogarassy, Balázs Feil, János Abonyi

Abstract:

Most of fuzzy clustering algorithms have some discrepancies, e.g. they are not able to detect clusters with convex shapes, the number of the clusters should be a priori known, they suffer from numerical problems, like sensitiveness to the initialization, etc. This paper studies the synergistic combination of the hierarchical and graph theoretic minimal spanning tree based clustering algorithm with the partitional Gath-Geva fuzzy clustering algorithm. The aim of this hybridization is to increase the robustness and consistency of the clustering results and to decrease the number of the heuristically defined parameters of these algorithms to decrease the influence of the user on the clustering results. For the analysis of the resulted fuzzy clusters a new fuzzy similarity measure based tool has been presented. The calculated similarities of the clusters can be used for the hierarchical clustering of the resulted fuzzy clusters, which information is useful for cluster merging and for the visualization of the clustering results. As the examples used for the illustration of the operation of the new algorithm will show, the proposed algorithm can detect clusters from data with arbitrary shape and does not suffer from the numerical problems of the classical Gath-Geva fuzzy clustering algorithm.

Keywords: Clustering, fuzzy clustering, minimal spanning tree, cluster validity, fuzzy similarity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2354
624 Thermosolutal MHD Mixed Marangoni Convective Boundary Layers in the Presence of Suction or Injection

Authors: Noraini Ahmad, Seripah Awang Kechil, Norma Mohd Basir

Abstract:

The steady coupled dissipative layers, called Marangoni mixed convection boundary layers, in the presence of a magnetic field and solute concentration that are formed along the surface of two immiscible fluids with uniform suction or injection effects is examined. The similarity boundary layer equations are solved numerically using the Runge-Kutta Fehlberg with shooting technique. The Marangoni, buoyancy and external pressure gradient effects that are generated in mixed convection boundary layer flow are assessed. The velocity, temperature and concentration boundary layers thickness decrease with the increase of the magnetic field strength and the injection to suction. For buoyancy-opposed flow, the Marangoni mixed convection parameter enhances the velocity boundary layer but decreases the temperature and concentration boundary layers. However, for the buoyancy-assisted flow, the Marangoni mixed convection parameter decelerates the velocity but increases the temperature and concentration boundary layers.

Keywords: Magnetic field, mixed Marangoni convection, similarity boundary layers, solute concentration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1840
623 Automata Theory Approach for Solving Frequent Pattern Discovery Problems

Authors: Renáta Iváncsy, István Vajk

Abstract:

The various types of frequent pattern discovery problem, namely, the frequent itemset, sequence and graph mining problems are solved in different ways which are, however, in certain aspects similar. The main approach of discovering such patterns can be classified into two main classes, namely, in the class of the levelwise methods and in that of the database projection-based methods. The level-wise algorithms use in general clever indexing structures for discovering the patterns. In this paper a new approach is proposed for discovering frequent sequences and tree-like patterns efficiently that is based on the level-wise issue. Because the level-wise algorithms spend a lot of time for the subpattern testing problem, the new approach introduces the idea of using automaton theory to solve this problem.

Keywords: Frequent pattern discovery, graph mining, pushdownautomaton, sequence mining, state machine, tree mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1583
622 An Optimized Virtual Scheme for Reducing Collisions in MAC Layer

Authors: M. Sivakumar, S. Saravanan

Abstract:

The main function of Medium Access Control (MAC) is to share the channel efficiently between all nodes. In the real-time scenario, there will be certain amount of wastage in bandwidth due to back-off periods. More bandwidth will be wasted in idle state if the back-off period is very high and collision may occur if the back-off period is small. So, an optimization is needed for this problem. The main objective of the work is to reduce delay due to back-off period thereby reducing collision and increasing throughput. Here a method, called the virtual back-off algorithm (VBA) is used to optimize the back-off period and thereby it increases throughput and reduces collisions. The main idea is to optimize the number of transmission for every node. A counter is introduced at each node to implement this idea. Here counter value represents the sequence number. VBA is classified into two types VBA with counter sharing (VBA-CS) and VBA with no counter sharing (VBA-NCS). These two classifications of VBA are compared for various parameters. Simulation is done in NS-2 environment. The results obtained are found to be promising. 

Keywords: VBA, sequence number, counter, back-off period.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1351
621 Multi-Rate Exact Discretization based on Diagonalization of a Linear System - A Multiple-Real-Eigenvalue Case

Authors: T. Sakamoto, N. Hori

Abstract:

A multi-rate discrete-time model, whose response agrees exactly with that of a continuous-time original at all sampling instants for any sampling periods, is developed for a linear system, which is assumed to have multiple real eigenvalues. The sampling rates can be chosen arbitrarily and individually, so that their ratios can even be irrational. The state space model is obtained as a combination of a linear diagonal state equation and a nonlinear output equation. Unlike the usual lifted model, the order of the proposed model is the same as the number of sampling rates, which is less than or equal to the order of the original continuous-time system. The method is based on a nonlinear variable transformation, which can be considered as a generalization of linear similarity transformation, which cannot be applied to systems with multiple eigenvalues in general. An example and its simulation result show that the proposed multi-rate model gives exact responses at all sampling instants.

Keywords: Multi-rate discretization, linear systems, triangularization, similarity transformation, diagonalization, exponential transformation, multiple eigenvalues

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1323
620 Grouping and Indexing Color Features for Efficient Image Retrieval

Authors: M. V. Sudhamani, C. R. Venugopal

Abstract:

Content-based Image Retrieval (CBIR) aims at searching image databases for specific images that are similar to a given query image based on matching of features derived from the image content. This paper focuses on a low-dimensional color based indexing technique for achieving efficient and effective retrieval performance. In our approach, the color features are extracted using the mean shift algorithm, a robust clustering technique. Then the cluster (region) mode is used as representative of the image in 3-D color space. The feature descriptor consists of the representative color of a region and is indexed using a spatial indexing method that uses *R -tree thus avoiding the high-dimensional indexing problems associated with the traditional color histogram. Alternatively, the images in the database are clustered based on region feature similarity using Euclidian distance. Only representative (centroids) features of these clusters are indexed using *R -tree thus improving the efficiency. For similarity retrieval, each representative color in the query image or region is used independently to find regions containing that color. The results of these methods are compared. A JAVA based query engine supporting query-by- example is built to retrieve images by color.

Keywords: Content-based, indexing, cluster, region.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1773
619 Hand Gesture Recognition: Sign to Voice System (S2V)

Authors: Oi Mean Foong, Tan Jung Low, Satrio Wibowo

Abstract:

Hand gesture is one of the typical methods used in sign language for non-verbal communication. It is most commonly used by people who have hearing or speech problems to communicate among themselves or with normal people. Various sign language systems have been developed by manufacturers around the globe but they are neither flexible nor cost-effective for the end users. This paper presents a system prototype that is able to automatically recognize sign language to help normal people to communicate more effectively with the hearing or speech impaired people. The Sign to Voice system prototype, S2V, was developed using Feed Forward Neural Network for two-sequence signs detection. Different sets of universal hand gestures were captured from video camera and utilized to train the neural network for classification purpose. The experimental results have shown that neural network has achieved satisfactory result for sign-to-voice translation.

Keywords: Hand gesture detection, neural network, signlanguage, sequence detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1801
618 Stability Analysis of Three-Dimensional Flow and Heat Transfer over a Permeable Shrinking Surface in a Cu-Water Nanofluid

Authors: Roslinda Nazar, Amin Noor, Khamisah Jafar, Ioan Pop

Abstract:

In this paper, the steady laminar three-dimensional boundary layer flow and heat transfer of a copper (Cu)-water nanofluid in the vicinity of a permeable shrinking flat surface in an otherwise quiescent fluid is studied. The nanofluid mathematical model in which the effect of the nanoparticle volume fraction is taken into account is considered. The governing nonlinear partial differential equations are transformed into a system of nonlinear ordinary differential equations using a similarity transformation which is then solved numerically using the function bvp4c from Matlab. Dual solutions (upper and lower branch solutions) are found for the similarity boundary layer equations for a certain range of the suction parameter. A stability analysis has been performed to show which branch solutions are stable and physically realizable. The numerical results for the skin friction coefficient and the local Nusselt number as well as the velocity and temperature profiles are obtained, presented and discussed in detail for a range of various governing parameters.

Keywords: Heat Transfer, Nanofluid, Shrinking Surface, Stability Analysis, Three-Dimensional Flow.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2153
617 A Review and Comparative Analysis on Cluster Ensemble Methods

Authors: S. Sarumathi, P. Ranjetha, C. Saraswathy, M. Vaishnavi, S. Geetha

Abstract:

Clustering is an unsupervised learning technique for aggregating data objects into meaningful classes so that intra cluster similarity is maximized and inter cluster similarity is minimized in data mining. However, no single clustering algorithm proves to be the most effective in producing the best result. As a result, a new challenging technique known as the cluster ensemble approach has blossomed in order to determine the solution to this problem. For the cluster analysis issue, this new technique is a successful approach. The cluster ensemble's main goal is to combine similar clustering solutions in a way that achieves the precision while also improving the quality of individual data clustering. Because of the massive and rapid creation of new approaches in the field of data mining, the ongoing interest in inventing novel algorithms necessitates a thorough examination of current techniques and future innovation. This paper presents a comparative analysis of various cluster ensemble approaches, including their methodologies, formal working process, and standard accuracy and error rates. As a result, the society of clustering practitioners will benefit from this exploratory and clear research, which will aid in determining the most appropriate solution to the problem at hand.

Keywords: Clustering, cluster ensemble methods, consensus function, data mining, unsupervised learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 742
616 Classifying Biomedical Text Abstracts based on Hierarchical 'Concept' Structure

Authors: Rozilawati Binti Dollah, Masaki Aono

Abstract:

Classifying biomedical literature is a difficult and challenging task, especially when a large number of biomedical articles should be organized into a hierarchical structure. In this paper, we present an approach for classifying a collection of biomedical text abstracts downloaded from Medline database with the help of ontology alignment. To accomplish our goal, we construct two types of hierarchies, the OHSUMED disease hierarchy and the Medline abstract disease hierarchies from the OHSUMED dataset and the Medline abstracts, respectively. Then, we enrich the OHSUMED disease hierarchy before adapting it to ontology alignment process for finding probable concepts or categories. Subsequently, we compute the cosine similarity between the vector in probable concepts (in the “enriched" OHSUMED disease hierarchy) and the vector in Medline abstract disease hierarchies. Finally, we assign category to the new Medline abstracts based on the similarity score. The results obtained from the experiments show the performance of our proposed approach for hierarchical classification is slightly better than the performance of the multi-class flat classification.

Keywords: Biomedical literature, hierarchical text classification, ontology alignment, text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1977
615 Solving Single Machine Total Weighted Tardiness Problem Using Gaussian Process Regression

Authors: Wanatchapong Kongkaew

Abstract:

This paper proposes an application of probabilistic technique, namely Gaussian process regression, for estimating an optimal sequence of the single machine with total weighted tardiness (SMTWT) scheduling problem. In this work, the Gaussian process regression (GPR) model is utilized to predict an optimal sequence of the SMTWT problem, and its solution is improved by using an iterated local search based on simulated annealing scheme, called GPRISA algorithm. The results show that the proposed GPRISA method achieves a very good performance and a reasonable trade-off between solution quality and time consumption. Moreover, in the comparison of deviation from the best-known solution, the proposed mechanism noticeably outperforms the recently existing approaches.

 

Keywords: Gaussian process regression, iterated local search, simulated annealing, single machine total weighted tardiness.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2196
614 Comparison of Domain and Hydrophobicity Features for the Prediction of Protein-Protein Interactions using Support Vector Machines

Authors: Hany Alashwal, Safaai Deris, Razib M. Othman

Abstract:

The protein domain structure has been widely used as the most informative sequence feature to computationally predict protein-protein interactions. However, in a recent study, a research group has reported a very high accuracy of 94% using hydrophobicity feature. Therefore, in this study we compare and verify the usefulness of protein domain structure and hydrophobicity properties as the sequence features. Using the Support Vector Machines (SVM) as the learning system, our results indicate that both features achieved accuracy of nearly 80%. Furthermore, domains structure had receiver operating characteristic (ROC) score of 0.8480 with running time of 34 seconds, while hydrophobicity had ROC score of 0.8159 with running time of 20,571 seconds (5.7 hours). These results indicate that protein-protein interaction can be predicted from domain structure with reliable accuracy and acceptable running time.

Keywords: Bioinformatics, protein-protein interactions, support vector machines, protein features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1878
613 Constraint Based Frequent Pattern Mining Technique for Solving GCS Problem

Authors: First G.M. Karthik, Second Ramachandra.V.Pujeri, Dr.

Abstract:

Generalized Center String (GCS) problem are generalized from Common Approximate Substring problem and Common substring problems. GCS are known to be NP-hard allowing the problems lies in the explosion of potential candidates. Finding longest center string without concerning the sequence that may not contain any motifs is not known in advance in any particular biological gene process. GCS solved by frequent pattern-mining techniques and known to be fixed parameter tractable based on the fixed input sequence length and symbol set size. Efficient method known as Bpriori algorithms can solve GCS with reasonable time/space complexities. Bpriori 2 and Bpriori 3-2 algorithm are been proposed of any length and any positions of all their instances in input sequences. In this paper, we reduced the time/space complexity of Bpriori algorithm by Constrained Based Frequent Pattern mining (CBFP) technique which integrates the idea of Constraint Based Mining and FP-tree mining. CBFP mining technique solves the GCS problem works for all center string of any length, but also for the positions of all their mutated copies of input sequence. CBFP mining technique construct TRIE like with FP tree to represent the mutated copies of center string of any length, along with constraints to restraint growth of the consensus tree. The complexity analysis for Constrained Based FP mining technique and Bpriori algorithm is done based on the worst case and average case approach. Algorithm's correctness compared with the Bpriori algorithm using artificial data is shown.

Keywords: Constraint Based Mining, FP tree, Data mining, GCS problem, CBFP mining technique.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1649
612 Fighter Aircraft Evaluation and Selection Process Based on Triangular Fuzzy Numbers in Multiple Criteria Decision Making Analysis Using the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS)

Authors: C. Ardil

Abstract:

This article presents a multiple criteria evaluation approach to uncertainty, vagueness, and imprecision analysis for ranking alternatives with fuzzy data for decision making using the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS). The fighter aircraft evaluation and selection decision making problem is modeled in a fuzzy environment with triangular fuzzy numbers. The fuzzy decision information related to the fighter aircraft selection problem is taken into account in ordering the alternatives and selecting the best candidate. The basic fuzzy TOPSIS procedure steps transform fuzzy decision matrices into matrices of alternatives evaluated according to all decision criteria. A practical numerical example illustrates the proposed approach to the fighter aircraft selection problem.

Keywords: triangular fuzzy number (TFN), multiple criteria decision making analysis, decision making, aircraft selection, MCDMA, fuzzy TOPSIS

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 408
611 Cellular Automata Based Robust Watermarking Architecture towards the VLSI Realization

Authors: V. H. Mankar, T. S. Das, S. K. Sarkar

Abstract:

In this paper, we have proposed a novel blind watermarking architecture towards its hardware implementation in VLSI. In order to facilitate this hardware realization, cellular automata (CA) concept is introduced. The CA has been already accepted as an attractive structure for VLSI implementation because of its modularity, parallelism, high performance and reliability. The hardware realizable multiresolution spread spectrum watermarking techniques are very few in numbers in spite of their best ever resiliency against signal impairments. This is because of the computational cost and complexity associated with their different filter banks and lifting techniques. The concept of cellular automata theory in order to form a new transform domain technique i.e. Cellular Automata Transform (CAT) have been incorporated. Since CA provides spreading sequences having very low cross-correlation properties, the CA based pseudorandom sequence generator is considered in the present work. Considering the watermarking technique as a digital communication process, an error control coding (ECC) must be incorporated in the data hiding schemes. Besides the hardware implementation of entire CA based data hiding technique, the individual blocks of the algorithm using CA provide the best result than that of some other methods irrespective of the hardware and software technique. The Cellular Automata Transform, CA based PN sequence generator, and CA ECC are the requisite blocks that are developed not only to meet the reliable hardware requirements but also for the basic spread spectrum watermarking features. The proposed algorithm shows statistical invisibility and resiliency against various common signal-processing operations. This algorithmic design utilizes the existing allocated bandwidth in the data transmission channel in a more efficient manner.

Keywords: Cellular automata, watermarking, error control coding, PN sequence, VLSI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2031
610 Cloning of a β-Glucosidase Gene (BGL1) from Traditional Starter Yeast Saccharomycopsis fibuligera BMQ 908 and Expression in Pichia pastoris

Authors: Le Thuy Mai, Vu Nguyen Thanh

Abstract:

β-Glucosidase is an important enzyme for production of ethanol from lignocellulose. With hydrolytic activity on cellooligosaccharides, especially cellobiose, β-glucosidase removes product inhibitory effect on cellulases and forms fermentable sugars. In this study, β-glucosidase encoding gene (BGL1) from traditional starter yeast Saccharomycosis fibuligera BMQ908 was cloned and expressed in Pichia pastoris. BGL1 of S. fibuligera BMQ 908 shared 98% nucleotide homology with the closest GenBank sequence (M22475) but identity in amino-acid sequences of catalytic domains. Recombinant plasmid pPICZαA/BGL1 containing the sequence encoding BGL1 mature protein and α-factor secretion signal was constructed and transformed into methylotrophic yeast P. pastoris by electroporation. The recombinant strain produced single extracellular protein with molecular weight of 120 kDa and cellobiase activity of 60 IU/ml. The optimum pH of the recombinant β-glucosidase was 5.0 and the optimum temperature was 50°C.

Keywords: β-Glucosidase, Pichia pastoris, Saccharomycopsisfibuligera, recombinant enzyme.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4926
609 Non-Population Search Algorithms for Capacitated Material Requirement Planning in Multi-Stage Assembly Flow Shop with Alternative Machines

Authors: Watcharapan Sukkerd, Teeradej Wuttipornpun

Abstract:

This paper aims to present non-population search algorithms called tabu search (TS), simulated annealing (SA) and variable neighborhood search (VNS) to minimize the total cost of capacitated MRP problem in multi-stage assembly flow shop with two alternative machines. There are three main steps for the algorithm. Firstly, an initial sequence of orders is constructed by a simple due date-based dispatching rule. Secondly, the sequence of orders is repeatedly improved to reduce the total cost by applying TS, SA and VNS separately. Finally, the total cost is further reduced by optimizing the start time of each operation using the linear programming (LP) model. Parameters of the algorithm are tuned by using real data from automotive companies. The result shows that VNS significantly outperforms TS, SA and the existing algorithm.

Keywords: Capacitated MRP, non-population search algorithms, linear programming, assembly flow shop.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 909
608 A Simplified Single Correlator Rake Receiver for CDMA Communications

Authors: K. Murali Krishna, Abhijit Mitra, C. Ardil

Abstract:

This paper presents a single correlator RAKE receiver for direct sequence code division multiple access (DS-CDMA) systems. In conventional RAKE receivers, multiple correlators are used to despread the multipath signals and then to align and combine those signals in a later stage before making a bit decision. The simplified receiver structure presented here uses a single correlator and single code sequence generator to recover the multipaths. Modified Walsh- Hadamard codes are used here for data spreading that provides better uncorrelation properties for the multipath signals. The main advantage of this receiver structure is that it requires only a single correlator and a code generator in contrary to the conventional RAKE receiver concept with multiple correlators. It is shown in results that the proposed receiver achieves better bit error rates in comparison with the conventional one for more than one multipaths.

Keywords: RAKE receiver, Code division multiple access, ModifiedWalsh-Hadamard codes, Single correlator.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3599