Search results for: Jaccard’s similarity measure.
1307 Minimal Spanning Tree based Fuzzy Clustering
Authors: Ágnes Vathy-Fogarassy, Balázs Feil, János Abonyi
Abstract:
Most of fuzzy clustering algorithms have some discrepancies, e.g. they are not able to detect clusters with convex shapes, the number of the clusters should be a priori known, they suffer from numerical problems, like sensitiveness to the initialization, etc. This paper studies the synergistic combination of the hierarchical and graph theoretic minimal spanning tree based clustering algorithm with the partitional Gath-Geva fuzzy clustering algorithm. The aim of this hybridization is to increase the robustness and consistency of the clustering results and to decrease the number of the heuristically defined parameters of these algorithms to decrease the influence of the user on the clustering results. For the analysis of the resulted fuzzy clusters a new fuzzy similarity measure based tool has been presented. The calculated similarities of the clusters can be used for the hierarchical clustering of the resulted fuzzy clusters, which information is useful for cluster merging and for the visualization of the clustering results. As the examples used for the illustration of the operation of the new algorithm will show, the proposed algorithm can detect clusters from data with arbitrary shape and does not suffer from the numerical problems of the classical Gath-Geva fuzzy clustering algorithm.Keywords: Clustering, fuzzy clustering, minimal spanning tree, cluster validity, fuzzy similarity.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24061306 Clustering Protein Sequences with Tailored General Regression Model Technique
Authors: G. Lavanya Devi, Allam Appa Rao, A. Damodaram, GR Sridhar, G. Jaya Suma
Abstract:
Cluster analysis divides data into groups that are meaningful, useful, or both. Analysis of biological data is creating a new generation of epidemiologic, prognostic, diagnostic and treatment modalities. Clustering of protein sequences is one of the current research topics in the field of computer science. Linear relation is valuable in rule discovery for a given data, such as if value X goes up 1, value Y will go down 3", etc. The classical linear regression models the linear relation of two sequences perfectly. However, if we need to cluster a large repository of protein sequences into groups where sequences have strong linear relationship with each other, it is prohibitively expensive to compare sequences one by one. In this paper, we propose a new technique named General Regression Model Technique Clustering Algorithm (GRMTCA) to benignly handle the problem of linear sequences clustering. GRMT gives a measure, GR*, to tell the degree of linearity of multiple sequences without having to compare each pair of them.Keywords: Clustering, General Regression Model, Protein Sequences, Similarity Measure.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15671305 Sequence Relationships Similarity of Swine Influenza a (H1N1) Virus
Authors: Patsaraporn Somboonsak, Mud-Armeen Munlin
Abstract:
In April 2009, a new variant of Influenza A virus subtype H1N1 emerged in Mexico and spread all over the world. The influenza has three subtypes in human (H1N1, H1N2 and H3N2) Types B and C influenza tend to be associated with local or regional epidemics. Preliminary genetic characterization of the influenza viruses has identified them as swine influenza A (H1N1) viruses. Nucleotide sequence analysis of the Haemagglutinin (HA) and Neuraminidase (NA) are similar to each other and the majority of their genes of swine influenza viruses, two genes coding for the neuraminidase (NA) and matrix (M) proteins are similar to corresponding genes of swine influenza. Sequence similarity between the 2009 A (H1N1) virus and its nearest relatives indicates that its gene segments have been circulating undetected for an extended period. Nucleic acid sequence Maximum Likelihood (MCL) and DNA Empirical base frequencies, Phylogenetic relationship amongst the HA genes of H1N1 virus isolated in Genbank having high nucleotide sequence homology. In this paper we used 16 HA nucleotide sequences from NCBI for computing sequence relationships similarity of swine influenza A virus using the following method MCL the result is 28%, 36.64% for Optimal tree with the sum of branch length, 35.62% for Interior branch phylogeny Neighber – Join Tree, 1.85% for the overall transition/transversion, and 8.28% for Overall mean distance.Keywords: Sequence DNA, Relationship of swine, Swineinfluenza, Sequence Similarity
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21241304 Robust Face Recognition using AAM and Gabor Features
Authors: Sanghoon Kim, Sun-Tae Chung, Souhwan Jung, Seoungseon Jeon, Jaemin Kim, Seongwon Cho
Abstract:
In this paper, we propose a face recognition algorithm using AAM and Gabor features. Gabor feature vectors which are well known to be robust with respect to small variations of shape, scaling, rotation, distortion, illumination and poses in images are popularly employed for feature vectors for many object detection and recognition algorithms. EBGM, which is prominent among face recognition algorithms employing Gabor feature vectors, requires localization of facial feature points where Gabor feature vectors are extracted. However, localization method employed in EBGM is based on Gabor jet similarity and is sensitive to initial values. Wrong localization of facial feature points affects face recognition rate. AAM is known to be successfully applied to localization of facial feature points. In this paper, we devise a facial feature point localization method which first roughly estimate facial feature points using AAM and refine facial feature points using Gabor jet similarity-based facial feature localization method with initial points set by the rough facial feature points obtained from AAM, and propose a face recognition algorithm using the devised localization method for facial feature localization and Gabor feature vectors. It is observed through experiments that such a cascaded localization method based on both AAM and Gabor jet similarity is more robust than the localization method based on only Gabor jet similarity. Also, it is shown that the proposed face recognition algorithm using this devised localization method and Gabor feature vectors performs better than the conventional face recognition algorithm using Gabor jet similarity-based localization method and Gabor feature vectors like EBGM.Keywords: Face Recognition, AAM, Gabor features, EBGM.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22061303 Application of l1-Norm Minimization Technique to Image Retrieval
Authors: C. S. Sastry, Saurabh Jain, Ashish Mishra
Abstract:
Image retrieval is a topic where scientific interest is currently high. The important steps associated with image retrieval system are the extraction of discriminative features and a feasible similarity metric for retrieving the database images that are similar in content with the search image. Gabor filtering is a widely adopted technique for feature extraction from the texture images. The recently proposed sparsity promoting l1-norm minimization technique finds the sparsest solution of an under-determined system of linear equations. In the present paper, the l1-norm minimization technique as a similarity metric is used in image retrieval. It is demonstrated through simulation results that the l1-norm minimization technique provides a promising alternative to existing similarity metrics. In particular, the cases where the l1-norm minimization technique works better than the Euclidean distance metric are singled out.
Keywords: l1-norm minimization, content based retrieval, modified Gabor function.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 34321302 Time Series Regression with Meta-Clusters
Authors: Monika Chuchro
Abstract:
This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was performed as a method to obtain subgroups of time series data with normal distribution from the inflow into wastewater treatment plant data, composed of several groups differing by mean value. Two simple algorithms, K-mean and EM, were chosen as a clustering method. The Rand index was used to measure the similarity. After simple meta-clustering, a regression model was performed for each subgroups. The final model was a sum of the subgroups models. The quality of the obtained model was compared with the regression model made using the same explanatory variables, but with no clustering of data. Results were compared using determination coefficient (R2), measure of prediction accuracy- mean absolute percentage error (MAPE) and comparison on a linear chart. Preliminary results allow us to foresee the potential of the presented technique.
Keywords: Clustering, Data analysis, Data mining, Predictive models.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19511301 Further the Effectiveness of Software Testability Measure
Authors: Liang Zhao, Feng Wang, Bo Deng, Bo Yang
Abstract:
Software testability is proposed to address the problem of increasing cost of test and the quality of software. Testability measure provides a quantified way to denote the testability of software. Since 1990s, many testability measure models are proposed to address the problem. By discussing the contradiction between domain testability and domain range ratio (DRR), a new testability measure, semantic fault distance, is proposed. Its validity is discussed.
Keywords: Software testability, DRR, Domain testability.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20451300 Optimal Model Order Selection for Transient Error Autoregressive Moving Average (TERA) MRI Reconstruction Method
Authors: Abiodun M. Aibinu, Athaur Rahman Najeeb, Momoh J. E. Salami, Amir A. Shafie
Abstract:
An alternative approach to the use of Discrete Fourier Transform (DFT) for Magnetic Resonance Imaging (MRI) reconstruction is the use of parametric modeling technique. This method is suitable for problems in which the image can be modeled by explicit known source functions with a few adjustable parameters. Despite the success reported in the use of modeling technique as an alternative MRI reconstruction technique, two important problems constitutes challenges to the applicability of this method, these are estimation of Model order and model coefficient determination. In this paper, five of the suggested method of evaluating the model order have been evaluated, these are: The Final Prediction Error (FPE), Akaike Information Criterion (AIC), Residual Variance (RV), Minimum Description Length (MDL) and Hannan and Quinn (HNQ) criterion. These criteria were evaluated on MRI data sets based on the method of Transient Error Reconstruction Algorithm (TERA). The result for each criterion is compared to result obtained by the use of a fixed order technique and three measures of similarity were evaluated. Result obtained shows that the use of MDL gives the highest measure of similarity to that use by a fixed order technique.Keywords: Autoregressive Moving Average (ARMA), MagneticResonance Imaging (MRI), Parametric modeling, Transient Error.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16151299 Recursive Similarity Hashing of Fractal Geometry
Authors: Timothee G. Leleu
Abstract:
A new technique of topological multi-scale analysis is introduced. By performing a clustering recursively to build a hierarchy, and analyzing the co-scale and intra-scale similarities, an Iterated Function System can be extracted from any data set. The study of fractals shows that this method is efficient to extract self-similarities, and can find elegant solutions the inverse problem of building fractals. The theoretical aspects and practical implementations are discussed, together with examples of analyses of simple fractals.Keywords: hierarchical clustering, multi-scale analysis, Similarity hashing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18631298 Effective Keyword and Similarity Thresholds for the Discovery of Themes from the User Web Access Patterns
Authors: Haider A Ramadhan, Khalil Shihab
Abstract:
Clustering techniques have been used by many intelligent software agents to group similar access patterns of the Web users into high level themes which express users intentions and interests. However, such techniques have been mostly focusing on one salient feature of the Web document visited by the user, namely the extracted keywords. The major aim of these techniques is to come up with an optimal threshold for the number of keywords needed to produce more focused themes. In this paper we focus on both keyword and similarity thresholds to generate themes with concentrated themes, and hence build a more sound model of the user behavior. The purpose of this paper is two fold: use distance based clustering methods to recognize overall themes from the Proxy log file, and suggest an efficient cut off levels for the keyword and similarity thresholds which tend to produce more optimal clusters with better focus and efficient size.
Keywords: Data mining, knowledge discovery, clustering, dataanalysis, Web log analysis, theme based searching.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14551297 Latent Semantic Inference for Agriculture FAQ Retrieval
Authors: Dawei Wang, Rujing Wang, Ying Li, Baozi Wei
Abstract:
FAQ system can make user find answer to the problem that puzzles them. But now the research on Chinese FAQ system is still on the theoretical stage. This paper presents an approach to semantic inference for FAQ mining. To enhance the efficiency, a small pool of the candidate question-answering pairs retrieved from the system for the follow-up work according to the concept of the agriculture domain extracted from user input .Input queries or questions are converted into four parts, the question word segment (QWS), the verb segment (VS), the concept of agricultural areas segment (CS), the auxiliary segment (AS). A semantic matching method is presented to estimate the similarity between the semantic segments of the query and the questions in the pool of the candidate. A thesaurus constructed from the HowNet, a Chinese knowledge base, is adopted for word similarity measure in the matcher. The questions are classified into eleven intension categories using predefined question stemming keywords. For FAQ mining, given a query, the question part and answer part in an FAQ question-answer pair is matched with the input query, respectively. Finally, the probabilities estimated from these two parts are integrated and used to choose the most likely answer for the input query. These approaches are experimented on an agriculture FAQ system. Experimental results indicate that the proposed approach outperformed the FAQ-Finder system in agriculture FAQ retrieval.
Keywords: FAQ, Semantic Inference, Ontology.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13791296 Using Genetic Algorithm to Improve Information Retrieval Systems
Authors: Ahmed A. A. Radwan, Bahgat A. Abdel Latef, Abdel Mgeid A. Ali, Osman A. Sadek
Abstract:
This study investigates the use of genetic algorithms in information retrieval. The method is shown to be applicable to three well-known documents collections, where more relevant documents are presented to users in the genetic modification. In this paper we present a new fitness function for approximate information retrieval which is very fast and very flexible, than cosine similarity fitness function.Keywords: Cosine similarity, Fitness function, Genetic Algorithm, Information Retrieval, Query learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 27561295 Non-Overlapping Hierarchical Index Structure for Similarity Search
Authors: Mounira Taileb, Sid Lamrous, Sami Touati
Abstract:
In order to accelerate the similarity search in highdimensional database, we propose a new hierarchical indexing method. It is composed of offline and online phases. Our contribution concerns both phases. In the offline phase, after gathering the whole of the data in clusters and constructing a hierarchical index, the main originality of our contribution consists to develop a method to construct bounding forms of clusters to avoid overlapping. For the online phase, our idea improves considerably performances of similarity search. However, for this second phase, we have also developed an adapted search algorithm. Our method baptized NOHIS (Non-Overlapping Hierarchical Index Structure) use the Principal Direction Divisive Partitioning (PDDP) as algorithm of clustering. The principle of the PDDP is to divide data recursively into two sub-clusters; division is done by using the hyper-plane orthogonal to the principal direction derived from the covariance matrix and passing through the centroid of the cluster to divide. Data of each two sub-clusters obtained are including by a minimum bounding rectangle (MBR). The two MBRs are directed according to the principal direction. Consequently, the nonoverlapping between the two forms is assured. Experiments use databases containing image descriptors. Results show that the proposed method outperforms sequential scan and SRtree in processing k-nearest neighbors.
Keywords: K-nearest neighbour search, multi-dimensional indexing, multimedia databases, similarity search.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15621294 3D Objects Indexing Using Spherical Harmonic for Optimum Measurement Similarity
Authors: S. Hellam, Y. Oulahrir, F. El Mounchid, A. Sadiq, S. Mbarki
Abstract:
In this paper, we propose a method for three-dimensional (3-D)-model indexing based on defining a new descriptor, which we call new descriptor using spherical harmonics. The purpose of the method is to minimize, the processing time on the database of objects models and the searching time of similar objects to request object. Firstly we start by defining the new descriptor using a new division of 3-D object in a sphere. Then we define a new distance which will be used in the search for similar objects in the database.
Keywords: 3D indexation, spherical harmonic, similarity of 3D objects.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22311293 Improved Weighted Matching for Speaker Recognition
Authors: Ozan Mut, Mehmet Göktürk
Abstract:
Matching algorithms have significant importance in speaker recognition. Feature vectors of the unknown utterance are compared to feature vectors of the modeled speakers as a last step in speaker recognition. A similarity score is found for every model in the speaker database. Depending on the type of speaker recognition, these scores are used to determine the author of unknown speech samples. For speaker verification, similarity score is tested against a predefined threshold and either acceptance or rejection result is obtained. In the case of speaker identification, the result depends on whether the identification is open set or closed set. In closed set identification, the model that yields the best similarity score is accepted. In open set identification, the best score is tested against a threshold, so there is one more possible output satisfying the condition that the speaker is not one of the registered speakers in existing database. This paper focuses on closed set speaker identification using a modified version of a well known matching algorithm. The results of new matching algorithm indicated better performance on YOHO international speaker recognition database.Keywords: Automatic Speaker Recognition, Voice Recognition, Pattern Recognition, Digital Audio Signal Processing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17321292 Game Skill Measure for Mixed Games
Authors: Roman V. Yampolskiy
Abstract:
Games can be classified as games of skill, games of chance or otherwise be classified as mixed. This paper deals with the topic of scientifically classifying mixed games as more reliant on elements of chance or elements of skill and ways to scientifically measure the amount of skill involved. This is predominantly useful for classification of games as legal or illegal in deferent jurisdictions based on the local gaming laws. We propose a novel measure of skill to chance ratio called the Game Skill Measure (GSM) and utilize it to calculate the skill component of a popular variant of Poker.Keywords: Chance, Game, Skill, Luck.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14631291 Fuzzy Processing of Uncertain Data
Authors: Petr Morávek, Miloš Šeda
Abstract:
In practice, we often come across situations where it is necessary to make decisions based on incomplete or uncertain data. In control systems it may be due to the unknown exact mathematical model, or its excessive complexity (e.g. nonlinearity) when it is necessary to simplify it, respectively, to solve it using a rule base. In the case of databases, searching data we compare a similarity measure with of the requirements of the selection with stored data, where both the select query and the data itself may contain vague terms, for example in the form of linguistic qualifiers. In this paper, we focus on the processing of uncertain data in databases and demonstrate it on the example multi-criteria decision making in the selection of variants, specified by higher number of technical parameters.Keywords: fuzzy logic, linguistic variable, multicriteria decision
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14181290 Flow and Heat Transfer of a Nanofluid over a Shrinking Sheet
Authors: N. Bachok, N. L. Aleng, N. M. Arifin, A. Ishak, N. Senu
Abstract:
The problem of laminar fluid flow which results from the shrinking of a permeable surface in a nanofluid has been investigated numerically. The model used for the nanofluid incorporates the effects of Brownian motion and thermophoresis. A similarity solution is presented which depends on the mass suction parameter S, Prandtl number Pr, Lewis number Le, Brownian motion number Nb and thermophoresis number Nt. It was found that the reduced Nusselt number is decreasing function of each dimensionless number.
Keywords: Boundary layer, Nanofluid, Shrinking sheet, Brownian motion, Thermophoresis, Similarity solution.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 28061289 Learning and Evaluating Possibilistic Decision Trees using Information Affinity
Authors: Ilyes Jenhani, Salem Benferhat, Zied Elouedi
Abstract:
This paper investigates the issue of building decision trees from data with imprecise class values where imprecision is encoded in the form of possibility distributions. The Information Affinity similarity measure is introduced into the well-known gain ratio criterion in order to assess the homogeneity of a set of possibility distributions representing instances-s classes belonging to a given training partition. For the experimental study, we proposed an information affinity based performance criterion which we have used in order to show the performance of the approach on well-known benchmarks.Keywords: Data mining from uncertain data, Decision Trees, Possibility Theory.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15151288 Mixed Convection Boundary Layer Flows Induced by a Permeable Continuous Surface Stretched with Prescribed Skin Friction
Authors: Mohamed Ali
Abstract:
The boundary layer flow and heat transfer on a stretched surface moving with prescribed skin friction is studied for permeable surface. The surface temperature is assumed to vary inversely with the vertical direction x for n = -1. The skin friction at the surface scales as (x-1/2) at m = 0. The constants m and n are the indices of the power law velocity and temperature exponent respectively. Similarity solutions are obtained for the boundary layer equations subject to power law temperature and velocity variation. The effect of various governing parameters, such as the buoyancy parameter λ and the suction/injection parameter fw for air (Pr = 0.72) are studied. The choice of n and m ensures that the used similarity solutions are x independent. The results show that, assisting flow (λ > 0) enhancing the heat transfer coefficient along the surface for any constant value of fw. Furthermore, injection increases the heat transfer coefficient but suction reduces it at constant λ.Keywords: Stretching surface, Boundary layers, Prescribed skin friction, Suction or injection, similarity solutions, buoyancy effects.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18561287 A Simplified and Effective Algorithm Used to Mine Similar Processes: An Illustrated Example
Authors: Min-Hsun Kuo, Yun-Shiow Chen
Abstract:
The running logs of a process hold valuable information about its executed activity behavior and generated activity logic structure. Theses informative logs can be extracted, analyzed and utilized to improve the efficiencies of the process's execution and conduction. One of the techniques used to accomplish the process improvement is called as process mining. To mine similar processes is such an improvement mission in process mining. Rather than directly mining similar processes using a single comparing coefficient or a complicate fitness function, this paper presents a simplified heuristic process mining algorithm with two similarity comparisons that are able to relatively conform the activity logic sequences (traces) of mining processes with those of a normalized (regularized) one. The relative process conformance is to find which of the mining processes match the required activity sequences and relationships, further for necessary and sufficient applications of the mined processes to process improvements. One similarity presented is defined by the relationships in terms of the number of similar activity sequences existing in different processes; another similarity expresses the degree of the similar (identical) activity sequences among the conforming processes. Since these two similarities are with respect to certain typical behavior (activity sequences) occurred in an entire process, the common problems, such as the inappropriateness of an absolute comparison and the incapability of an intrinsic information elicitation, which are often appeared in other process conforming techniques, can be solved by the relative process comparison presented in this paper. To demonstrate the potentiality of the proposed algorithm, a numerical example is illustrated.Keywords: process mining, process similarity, artificial intelligence, process conformance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14431286 Entropy Measures on Neutrosophic Soft Sets and Its Application in Multi Attribute Decision Making
Authors: I. Arockiarani
Abstract:
The focus of the paper is to furnish the entropy measure for a neutrosophic set and neutrosophic soft set which is a measure of uncertainty and it permeates discourse and system. Various characterization of entropy measures are derived. Further we exemplify this concept by applying entropy in various real time decision making problems.Keywords: Entropy measure, Hausdorff distance, neutrosophic set, soft set.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9331285 Combining Skin Color and Optical Flow for Computer Vision Systems
Authors: Muhammad Raza Ali, Tim Morris
Abstract:
Skin color is an important visual cue for computer vision systems involving human users. In this paper we combine skin color and optical flow for detection and tracking of skin regions. We apply these techniques to gesture recognition with encouraging results. We propose a novel skin similarity measure. For grouping detected skin regions we propose a novel skin region grouping mechanism. The proposed techniques work with any number of skin regions making them suitable for a multiuser scenario.Keywords: Bayesian tracking, chromaticity space, optical flowgesture recognition
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19281284 Computing the Similarity and the Diversity in the Species Based on Cronobacter Genome
Authors: E. Al Daoud
Abstract:
The purpose of computing the similarity and the diversity in the species is to trace the process of evolution and to find the relationship between the species and discover the unique, the special, the common and the universal proteins. The proteins of the whole genome of 40 species are compared with the cronobacter genome which is used as reference genome. More than 3 billion pairwise alignments are performed using blastp. Several findings are introduced in this study, for example, we found 172 proteins in cronobacter genome which have insignificant hits in other species, 116 significant proteins in the all tested species with very high score value and 129 common proteins in the plants but have insignificant hits in mammals, birds, fishes, and insects.
Keywords: Genome, species, blastp, conserved genes, cronobacter.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10081283 A Specification-Based Approach for Retrieval of Reusable Business Component for Software Reuse
Authors: Meng Fanchao, Zhan Dechen, Xu Xiaofei
Abstract:
Software reuse can be considered as the most realistic and promising way to improve software engineering productivity and quality. Automated assistance for software reuse involves the representation, classification, retrieval and adaptation of components. The representation and retrieval of components are important to software reuse in Component-Based on Software Development (CBSD). However, current industrial component models mainly focus on the implement techniques and ignore the semantic information about component, so it is difficult to retrieve the components that satisfy user-s requirements. This paper presents a method of business component retrieval based on specification matching to solve the software reuse of enterprise information system. First, a business component model oriented reuse is proposed. In our model, the business data type is represented as sign data type based on XML, which can express the variable business data type that can describe the variety of business operations. Based on this model, we propose specification match relationships in two levels: business operation level and business component level. In business operation level, we use input business data types, output business data types and the taxonomy of business operations evaluate the similarity between business operations. In the business component level, we propose five specification matches between business components. To retrieval reusable business components, we propose the measure of similarity degrees to calculate the similarities between business components. Finally, a business component retrieval command like SQL is proposed to help user to retrieve approximate business components from component repository.Keywords: Business component, business operation, business data type, specification matching.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14091282 Group Similarity Transformation of a Time Dependent Chemical Convective Process
Authors: M. M. Kassem, A. S. Rashed
Abstract:
The time dependent progress of a chemical reaction over a flat horizontal plate is here considered. The problem is solved through the group similarity transformation method which reduces the number of independent by one and leads to a set of nonlinear ordinary differential equation. The problem shows a singularity at the chemical reaction order n=1 and is analytically solved through the perturbation method. The behavior of the process is then numerically investigated for n≠1 and different Schmidt numbers. Graphical results for the velocity and concentration of chemicals based on the analytical and numerical solutions are presented and discussed.
Keywords: Time dependent, chemical convection, grouptransformation method, perturbation method.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16271281 Upgraded Rough Clustering and Outlier Detection Method on Yeast Dataset by Entropy Rough K-Means Method
Authors: P. Ashok, G. M. Kadhar Nawaz
Abstract:
Rough set theory is used to handle uncertainty and incomplete information by applying two accurate sets, Lower approximation and Upper approximation. In this paper, the rough clustering algorithms are improved by adopting the Similarity, Dissimilarity–Similarity and Entropy based initial centroids selection method on three different clustering algorithms namely Entropy based Rough K-Means (ERKM), Similarity based Rough K-Means (SRKM) and Dissimilarity-Similarity based Rough K-Means (DSRKM) were developed and executed by yeast dataset. The rough clustering algorithms are validated by cluster validity indexes namely Rand and Adjusted Rand indexes. An experimental result shows that the ERKM clustering algorithm perform effectively and delivers better results than other clustering methods. Outlier detection is an important task in data mining and very much different from the rest of the objects in the clusters. Entropy based Rough Outlier Factor (EROF) method is seemly to detect outlier effectively for yeast dataset. In rough K-Means method, by tuning the epsilon (ᶓ) value from 0.8 to 1.08 can detect outliers on boundary region and the RKM algorithm delivers better results, when choosing the value of epsilon (ᶓ) in the specified range. An experimental result shows that the EROF method on clustering algorithm performed very well and suitable for detecting outlier effectively for all datasets. Further, experimental readings show that the ERKM clustering method outperformed the other methods.
Keywords: Clustering, Entropy, Outlier, Rough K-Means, validity index.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14121280 Human Action Recognition System Based on Silhouette
Authors: S. Maheswari, P. Arockia Jansi Rani
Abstract:
Human action is recognized directly from the video sequences. The objective of this work is to recognize various human actions like run, jump, walk etc. Human action recognition requires some prior knowledge about actions namely, the motion estimation, foreground and background estimation. Region of interest (ROI) is extracted to identify the human in the frame. Then, optical flow technique is used to extract the motion vectors. Using the extracted features similarity measure based classification is done to recognize the action. From experimentations upon the Weizmann database, it is found that the proposed method offers a high accuracy.Keywords: Background subtraction, human silhouette, optical flow, classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9991279 An Efficient Method of Shot Cut Detection
Authors: Lenka Krulikovská, Jaroslav Polec
Abstract:
In this paper we present a method of abrupt cut detection with a novel logic of frames- comparison. Actual frame is compared with its motion estimated prediction instead of comparison with successive frame. Four different similarity metrics were employed to estimate the resemblance of compared frames. Obtained results were evaluated by standard used measures of test accuracy and compared with existing approach. Based on the results, we claim the proposed method is more effective and Pearson correlation coefficient obtained the best results among chosen similarity metrics.
Keywords: Abrupt cut, mutual information, shot cut detection, Pearson correlation coefficient.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19341278 Genetic Diversity Based Population Study of Freshwater Mud Eel (Monopterus cuchia) in Bangladesh
Authors: M. F. Miah, K. M. A. Zinnah, M. J. Raihan, H. Ali, M. N. Naser
Abstract:
As genetic diversity is most important for existing, breeding and production of any fish; this study was undertaken for investigating genetic diversity of freshwater mud eel, Monopterus cuchia at population level where three ecological populations such as flooded area of Sylhet (P1), open water of Moulvibazar (P2) and open water of Sunamganj (P3) districts of Bangladesh were considered. Four arbitrary RAPD primers (OPB-12, C0-4, B-03 and OPB-08) were screened and RAPD banding patterns were analyzed among the populations considering 15 individuals of each population. In total 174, 138 and 149 bands were detected in the populations of P1, P2 and P3 respectively; however, each primer revealed less number of bands in each population. 100% polymorphic loci were recorded in P2 and P3 whereas only one monomorphic locus was observed in P1, recorded 97.5% polymorphism. Different genetic parameters such as inter-individual pairwise similarity, genetic distance, Nei genetic similarity, linkage distances, cluster analysis and allelic information, etc. were considered for measuring genetic diversity. The average inter-individual pairwise similarity was recorded 2.98, 1.47 and 1.35 in P1, P2 and P3 respectively. Considering genetic distance analysis, the highest distance 1 was recorded in P2 and P3 and the lowest genetic distance 0.444 was found in P2. The average Nei genetic similarity was observed 0.19, 0.16 and 0.13 in P1, P2 and P3, respectively; however, the average linkage distance was recorded 24.92, 17.14 and 15.28 in P1, P3 and P2 respectively. Based on linkage distance, genetic clusters were generated in three populations where 6 clades and 7 clusters were found in P1, 3 clades and 5 clusters were observed in P2 and 4 clades and 7 clusters were detected in P3. In addition, allelic information was observed where the frequency of p and q alleles were observed 0.093 and 0.907 in P1, 0.076 and 0.924 in P2, 0.074 and 0.926 in P3 respectively. The average gene diversity was observed highest in P2 (0.132) followed by P3 (0.131) and P1 (0.121) respectively.
Keywords: Genetic diversity, Monopterus cuchia, population, RAPD, Bangladesh.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1831