Search results for: Semantic textual similarity
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 692

452 Parallezation Protein Sequence Similarity Algorithms using Remote Method Interface

Authors: Mubarak Saif Mohsen, Zurinahni Zainol, Rosalina Abdul Salam, Wahidah Husain

Abstract:

One of the major problems in the genomic field is performing sequence comparison on DNA and protein sequences. Executing sequence comparison on DNA and protein data is a computationally intensive task. Sequence comparison is the basic step for all algorithms in protein sequence similarity. Parallel computing is an attractive solution to provide the computational power needed to speed up the lengthy process of sequence comparison. Our main research is to enhance the protein sequence algorithm using the dynamic programming method. In our approach, we parallelize the dynamic programming algorithm using a multithreaded program to perform the sequence comparison, and we also develop a distributed protein database among many PCs using Remote Method Interface (RMI). As a result, we show how different sizes of protein sequence data and the computation of the scoring matrix of these protein sequences on different numbers of processors affect the processing time and speed, as opposed to sequential processing.

Keywords: Protein sequence algorithm, dynamic programming algorithm, multithread

Downloads: 1892
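The abstract above does not say which dynamic programming recurrence the authors parallelize. As a rough illustration only, the following sequential sketch computes a Needleman-Wunsch global alignment score, one common dynamic programming formulation of sequence comparison. The function name and the `match`, `mismatch`, and `gap` scores are assumptions chosen for the example, not values from the paper.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Needleman-Wunsch global alignment score (illustrative scoring values)."""
    # prev holds the previous row of the scoring matrix; row 0 is all gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: a[:i] aligned against an empty prefix
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            ))
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "AGT"))
```

Each cell depends only on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another. That independence is what typically makes this kind of scoring matrix a candidate for multithreaded computation.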
451 Detecting Remote Protein Evolutionary Relationships via String Scoring Method

Authors: Nazar Zaki, Safaai Deris

Abstract:

The amount of information being churned out by the field of biology has jumped manifold and now requires the extensive use of computer techniques for its management. Within this sea of biological information, protein sequence similarity is key information for detecting protein evolutionary relationships. Protein sequence similarity typically implies homology, which in turn may imply structural and functional similarities. In this work, we propose a learning method for detecting remote protein homology. The proposed method uses a transformation that converts protein sequences into fixed-dimensional representative feature vectors. Each feature vector records the sensitivity of a protein sequence to a set of amino acid substrings generated from the protein sequences of interest. These features are then used in conjunction with support vector machines for the detection of remote protein homology. The proposed method is tested and evaluated on two different benchmark protein datasets, and it is able to deliver improvements over most of the existing homology detection methods.

Keywords: Protein homology detection; support vector machine; string kernel.

Downloads: 1379
450 Computational Analysis of Potential Inhibitors Selected Based On Structural Similarity for the Src SH2 Domain

Authors: W. P. Hu, J. V. Kumar, Jeffrey J. P. Tsai

Abstract:

The inhibition of SH2 domain regulated protein-protein interactions is an attractive target for developing an effective chemotherapeutic approach in the treatment of disease. Molecular simulation is a useful tool for developing new drugs and for studying molecular recognition. In this study, we searched for potential drug compounds for the inhibition of the SH2 domain by performing a structural similarity search in the PubChem Compound Database. A total of 37 compounds were screened from the database, and then we used the LibDock docking program to evaluate the inhibition effect. The best three compounds (AP22408, CID 71463546 and CID 9917321) were chosen for MD simulations after the LibDock docking. Our results show that the compound CID 9917321 can produce a more stable protein-ligand complex compared to the other two currently known inhibitors of the Src SH2 domain. The compound CID 9917321 may be useful for the inhibition of the SH2 domain based on these computational results. Subsequent experiments are needed to verify the effect of compound CID 9917321 on the SH2 domain in future studies.

Keywords: Nonpeptide inhibitor, Src SH2 domain, LibDock, molecular dynamics simulation.

Downloads: 2064
449 Military Combat Aircraft Selection Using Trapezoidal Fuzzy Numbers with the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS)

Authors: C. Ardil

Abstract:

This article presents a new approach to uncertainty, vagueness, and imprecision analysis for ranking alternatives with fuzzy data for decision making using the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS). In the proposed approach, fuzzy decision information related to the aircraft selection problem is taken into account in ranking the alternatives and selecting the best one. The basic procedural step is to transform the fuzzy decision matrices into matrices of alternatives evaluated according to all decision criteria. A numerical example illustrates the proposed approach for the military combat aircraft selection problem.

Keywords: trapezoidal fuzzy numbers, multiple criteria decision making analysis, decision making, aircraft selection, MCDMA, fuzzy TOPSIS

Downloads: 449
448 Emergentist Metaphorical Creativity: Towards a Model of Analysing Metaphorical Creativity in Interactive Talk

Authors: Afef Badri

Abstract:

Metaphorical creativity does not constitute a static property of discourse. It is an interactive dynamic process created online. There has been a lack of research concerning online produced metaphorical creativity. This paper intends to account for metaphorical creativity in online talk-in-interaction as a dynamic process that emerges as discourse unfolds. It brings together insights from the emergentist approach to the study of metaphor in verbal interactions and insights from conceptual blending approach as a model for analysing online metaphorical constructions to propose a model for studying metaphorical creativity in interactive talk. The model is based on three focal points. First, metaphorical creativity is a dynamic emergent and open-to-change process that evolves in real time as interlocutors constantly blend and re-blend previous metaphorical contributions. Second, it is not a product of isolated individual minds but a joint achievement that is co-constructed and co-elaborated by interlocutors. The third and most important point is that the emergent process of metaphorical creativity is tightly shaped by contextual variables surrounding talk-in-interaction. It is grounded in the framework of interpretation of interlocutors. It is constrained by preceding contributions in a way that creates textual cohesion of the verbal exchange and it is also a goal-oriented process predefined by the communicative intention of each participant in a way that reveals the ideological coherence/incoherence of the entire conversation.

Keywords: Communicative intention, conceptual blending, contextual variables, the emergentist approach, ideological coherence, metaphorical creativity, textual cohesion

Downloads: 1033
447 Feature Selection with Kohonen Self Organizing Classification Algorithm

Authors: Francesco Maiorana

Abstract:

In this paper a one-dimension Self Organizing Map algorithm (SOM) to perform feature selection is presented. The algorithm is based on a first classification of the input dataset on a similarity space. From this classification for each class a set of positive and negative features is computed. This set of features is selected as result of the procedure. The procedure is evaluated on an in-house dataset from a Knowledge Discovery from Text (KDT) application and on a set of publicly available datasets used in international feature selection competitions. These datasets come from KDT applications, drug discovery as well as other applications. The knowledge of the correct classification available for the training and validation datasets is used to optimize the parameters for positive and negative feature extractions. The process becomes feasible for large and sparse datasets, as the ones obtained in KDT applications, by using both compression techniques to store the similarity matrix and speed up techniques of the Kohonen algorithm that take advantage of the sparsity of the input matrix. These improvements make it feasible, by using the grid, the application of the methodology to massive datasets.

Keywords: Clustering algorithm, Data mining, Feature selection, Grid, Kohonen Self Organizing Map.

Downloads: 3041
446 Exploiting Global Self Similarity for Head-Shoulder Detection

Authors: Lae-Jeong Park, Jung-Ho Moon

Abstract:

People detection from images has a variety of applications such as video surveillance and driver assistance systems, but it is still a challenging task and is more difficult in crowded environments such as shopping malls, in which occlusion of the lower parts of the human body often occurs. The lack of full-body information requires more effective features than common features such as HOG. In this paper, new features are introduced that exploit the global self-symmetry (GSS) characteristic in head-shoulder patterns. The features encode the similarity or difference of color histograms and oriented gradient histograms between two vertically symmetric blocks. The domain-specific features are rapid to compute from the integral images in the Viola-Jones cascade-of-rejecters framework. The proposed features are evaluated with our own head-shoulder dataset that, in part, consists of the well-known INRIA pedestrian dataset. Experimental results show that the GSS features are effective in reducing false alarms marginally, and that the gradient GSS features are preferred more often than the color GSS ones in the feature selection.

Keywords: Pedestrian detection, cascade of rejecters, feature extraction, self-symmetry, HOG.

Downloads: 2389
445 A Distance Function for Data with Missing Values and Its Application

Authors: Loai AbdAllah, Ilan Shimshoni

Abstract:

Missing values in data are common in real world applications. Since the performance of many data mining algorithms depends critically on being given a good metric over the input space, we decided in this paper to define a distance function for unlabeled datasets with missing values. We use the Bhattacharyya distance, which measures the similarity of two probability distributions, to define our new distance function. According to this distance, the distance between two points without missing attribute values is simply the Mahalanobis distance. When, on the other hand, there is a missing value in one of the coordinates, the distance is computed according to the distribution of the missing coordinate. Our distance is general and can be used as part of any algorithm that computes the distance between data points. Because its performance depends strongly on the chosen distance measure, we opted for the k nearest neighbor classifier to evaluate its ability to accurately reflect object similarity. We experimented on standard numerical datasets from the UCI repository from different fields. On these datasets we simulated missing values and compared the performance of the kNN classifier using our distance to three other basic methods. Our experiments show that kNN using our distance function outperforms kNN using the other methods. Moreover, the runtime of our method is only slightly higher than that of the other methods.

Keywords: Missing values, Distance metric, Bhattacharyya distance.

Downloads: 2736
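The abstract above names the Bhattacharyya distance but does not give the authors' full distance function. The sketch below only shows the standard closed form of the Bhattacharyya distance between two univariate Gaussian distributions; the Gaussian assumption and the function name are choices made for this illustration and are not taken from the paper.

```python
import math


def bhattacharyya_gaussian(mu1: float, var1: float,
                           mu2: float, var2: float) -> float:
    """Bhattacharyya distance between N(mu1, var1) and N(mu2, var2)."""
    if var1 <= 0 or var2 <= 0:
        raise ValueError("variances must be positive")
    # Term driven by the separation of the means.
    mean_term = 0.25 * (mu1 - mu2) ** 2 / (var1 + var2)
    # Term driven by the mismatch of the variances (zero when var1 == var2).
    var_term = 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term


print(bhattacharyya_gaussian(0.0, 1.0, 2.0, 1.0))
```

The distance is zero for identical distributions and grows both when the means move apart and when the variances differ, which is why it can serve as a similarity measure between a known value and the distribution of a missing coordinate.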
444 Domain Knowledge Representation through Multiple Sub Ontologies: An Application Interoperability

Authors: Sunitha Abburu, Golla Suresh Babu

Abstract:

The issues that limit application interoperability are the lack of a common vocabulary, a common structure, and application domain knowledge. Ontology-based semantic technology provides solutions that resolve application interoperability issues. Ontology is broadly used in diverse applications such as artificial intelligence, bioinformatics, biomedical, information integration, etc. Ontology can be used to interpret the knowledge of various domains. To reuse and enrich the available ontologies and reduce the duplication of ontologies of the same domain, there is a strong need to integrate the ontologies of the particular domain. The integrated ontology gives complete knowledge about the domain by sharing this comprehensive domain ontology among the groups. As per the literature survey, there is no well-defined methodology to represent knowledge of a whole domain. The current research addresses a systematic methodology for knowledge representation using multiple sub-ontologies at different levels that addresses application interoperability and enables semantic information retrieval. The current method represents complete knowledge of a domain by importing concepts from multiple sub-ontologies of the same and related domains, which reduces ontology duplication, rework, and implementation cost through ontology reusability.

Keywords: Knowledge acquisition, knowledge representation, knowledge transfer, ontologies, semantics.

Downloads: 951
443 Elemental Graph Data Model: A Semantic and Topological Representation of Building Elements

Authors: Yasmeen A. S. Essawy, Khaled Nassar

Abstract:

With the rapid increase of complexity in the building industry, professionals in the A/E/C industry were forced to adopt Building Information Modeling (BIM) in order to enhance the communication between the different project stakeholders throughout the project life cycle and create a semantic object-oriented building model that can support geometric-topological analysis of building elements during design and construction. This paper presents a model that extracts topological relationships and geometrical properties of building elements from an existing fully designed BIM, and maps this information into a directed acyclic Elemental Graph Data Model (EGDM). The model incorporates BIM-based search algorithms for automatic deduction of geometrical data and topological relationships for each building element type. Using graph search algorithms, such as Depth First Search (DFS) and topological sortings, all possible construction sequences can be generated and compared against production and construction rules to generate an optimized construction sequence and its associated schedule. The model is implemented in a C# platform.

Keywords: Building information modeling, elemental graph data model, geometric and topological data models, and graph theory.

Downloads: 1182
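The abstract above describes deriving construction sequences from a directed acyclic graph by topological sorting. A minimal sketch of that idea follows, using Python's standard `graphlib` module (Python 3.9+) rather than the authors' C# implementation. The building elements and their precedence relations are hypothetical and exist only to make the example concrete.

```python
from graphlib import TopologicalSorter

# Hypothetical precedence graph: each element maps to the set of
# elements that must be built before it.
precedence = {
    "footing": set(),
    "column": {"footing"},
    "beam": {"column"},
    "slab": {"beam"},
    "wall": {"column", "slab"},
}


def construction_sequence(graph: dict) -> list:
    """Return one valid build order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(graph).static_order())


print(construction_sequence(precedence))
```

This returns a single valid order. Enumerating every valid order and comparing each against production rules, as the abstract describes, would require additional search on top of this.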
442 A Robust Visual SLAM for Indoor Dynamic Environment

Authors: Xiang Zhang, Daohong Yang, Ziyuan Wu, Lei Li, Wanting Zhou

Abstract:

Visual Simultaneous Localization and Mapping (VSLAM) uses cameras to gather information in unknown environments to achieve simultaneous localization and mapping of the environment. This technology has a wide range of applications in autonomous driving, virtual reality, and other related fields. Currently, the research advancements related to VSLAM can maintain high accuracy in static environments. But in dynamic environments, the presence of moving objects in the scene can reduce the stability of the VSLAM system, leading to inaccurate localization and mapping, or even system failure. In this paper, a robust VSLAM method is proposed to effectively address the challenges in dynamic environments. We propose a dynamic region removal scheme based on a semantic segmentation neural network and geometric constraints. Firstly, a semantic segmentation neural network is used to extract the prior active motion region, prior static region, and prior passive motion region in the environment. Then, the lightweight frame tracking module initializes the transform pose between the previous frame and the current frame on the prior static region. A motion consistency detection module based on multi-view geometry and scene flow is used to divide the environment into static regions and dynamic regions. Thus, the dynamic object region is eliminated. Finally, only the static region is used for the tracking thread. Our research is based on the ORBSLAM3 system, which is one of the most effective VSLAM systems available. We evaluated our method on the TUM RGB-D benchmark, and the results demonstrate that the proposed VSLAM method improves the accuracy of the original ORBSLAM3 by 70% to 98.5% under a high dynamic environment.

Keywords: Dynamic scene, dynamic visual SLAM, semantic segmentation, scene flow, VSLAM.

Downloads: 147
441 A Knowledge-Based E-mail System Using Semantic Categorization and Rating Mechanisms

Authors: Azleena Mohd Kassim, Muhamad Rashidi A. Rahman, Yu-N. Cheah

Abstract:

Knowledge-based e-mail systems focus on incorporating a knowledge management approach in order to enhance traditional e-mail systems. In this paper, we present a knowledge-based e-mail system called KS-Mail, where people not only send and receive e-mail conventionally but are also able to create a sense of knowledge flow. We introduce semantic processing on the e-mail contents by automatically assigning categories and providing links to semantically related e-mails. This is done to enrich the knowledge value of each e-mail as well as to ease the organization of the e-mails and their contents. At the application level, we have also built components like the service manager, evaluation engine and search engine to handle the e-mail processes efficiently by providing the means to share and reuse knowledge. For this purpose, we present the KS-Mail architecture and elaborate on the details of the e-mail server and the application server. We present the ontology mapping technique used to achieve the categorization of the e-mail content, as well as the protocols that we have developed to handle the transactions in the e-mail system. Finally, we discuss the implementation of the modules presented in the KS-Mail architecture.

Keywords: E-mail rating, knowledge-based system, ontology mapping, text categorization.

Downloads: 1434
440 Modeling Uncertainty in Multiple Criteria Decision Making Using the Technique for Order Preference by Similarity to Ideal Solution for the Selection of Stealth Combat Aircraft

Authors: C. Ardil

Abstract:

Uncertainty set theory is a generalization of fuzzy set theory and intuitionistic fuzzy set theory. It serves as an effective tool for dealing with inconsistent, imprecise, and vague information. The technique for order preference by similarity to ideal solution (TOPSIS) method is a multiple-attribute method used to identify solutions from a finite set of alternatives. It simultaneously minimizes the distance from an ideal point and maximizes the distance from a nadir point. In this paper, an extension of the TOPSIS method for multiple attribute group decision-making (MAGDM) based on uncertainty sets is presented. In uncertainty decision analysis, decision-makers express information about attribute values and weights using uncertainty numbers to select the best stealth combat aircraft.

Keywords: Uncertainty set, stealth combat aircraft selection, multiple criteria decision-making analysis, MCDM, uncertainty decision analysis, TOPSIS

Downloads: 113
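The abstract above describes TOPSIS as ranking alternatives by their distance from an ideal point and from a nadir point. The sketch below implements the classical crisp TOPSIS procedure, which is a deliberate simplification: the uncertainty-set extension proposed in the paper is not reproduced here. The decision matrix, weights, and criterion directions are invented for illustration, and the code assumes no criterion column is all zeros and that the alternatives are not all identical.

```python
import math


def topsis(matrix, weights, benefit):
    """Classical (crisp) TOPSIS closeness scores, one per alternative.

    matrix:  rows are alternatives, columns are criteria
    weights: one weight per criterion
    benefit: True if larger is better for that criterion, else False
    """
    n_crit = len(weights)
    # Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    nadir = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_ideal = math.dist(row, ideal)   # distance to the ideal point
        d_nadir = math.dist(row, nadir)   # distance to the nadir point
        scores.append(d_nadir / (d_ideal + d_nadir))
    return scores


# Hypothetical data: criterion 0 is a benefit, criterion 1 is a cost.
example_scores = topsis([[9, 2], [5, 5], [1, 9]], [0.5, 0.5], [True, False])
print(example_scores)
```

A score closer to 1 means the alternative lies nearer the ideal point and farther from the nadir point, so alternatives are ranked in descending order of score.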
439 Multi-Agents Coordination Model in Inter-Organizational Workflow: Applying in Egovernment

Authors: E. Karoui Chaabane, S. Hadouaj, K. Ghedira

Abstract:

Inter-organizational Workflow (IOW) is commonly used to support the collaboration between heterogeneous and distributed business processes of different autonomous organizations in order to achieve a common goal. E-government is considered an application field of IOW. The coordination of the different organizations is the fundamental problem in IOW and remains the major cause of failure in e-government projects. In this paper, we introduce a new coordination model for IOW that improves the collaboration between government administrations and that respects IOW requirements applied to e-government. For this purpose, we adopt a Multi-Agent approach, which deals more easily with inter-organizational digital government characteristics: distribution, heterogeneity and autonomy. Our model also integrates different technologies to deal with semantic and technological interoperability. Moreover, it preserves the existing systems of government administrations by offering distributed coordination based on interface communication. This is especially applicable in developing countries, where administrations are not necessarily equipped with workflow systems. The use of our coordination techniques allows an easier, lower-cost migration to an e-government solution. To illustrate the applicability of the proposed model, we present a case study of identity card creation in Tunisia.

Keywords: E-government, Inter-organizational workflow, Multi-agent systems, Semantic web services.

Downloads: 2261
438 Porul: Option Generation and Selection and Scoring Algorithms for a Tamil Flash Card Game

Authors: Anitha Narasimhan, Aarthy Anandan, Madhan Karky, C. N. Subalalitha

Abstract:

Games can be excellent tools for teaching a language. There are few e-learning games in Indian languages, such as word scrabble, crossword, and quiz games, which were developed mainly for educational purposes. This paper proposes a Tamil word game called “Porul”, which focuses on education as well as on players’ thinking and decision-making skills. Porul is a multiple-choice quiz game in which the players attempt to answer questions correctly from the given options. The options are generated using a unique algorithm called the Option Selection algorithm, which explores the semantics of the question in various dimensions, namely synonym, rhyme and Universal Networking Language semantic category. This kind of semantic exploration of the question not only increases the complexity of the game but also makes it more interesting. The paper also proposes a Scoring Algorithm which allots a score based on the popularity score of the question word. The proposed game has been tested using 20,000 Tamil words.

Keywords: Porul game, Tamil word game, option selection, flash card, scoring, algorithm.

Downloads: 1150
437 Minimal Spanning Tree based Fuzzy Clustering

Authors: Ágnes Vathy-Fogarassy, Balázs Feil, János Abonyi

Abstract:

Most fuzzy clustering algorithms have some discrepancies, e.g. they are not able to detect clusters with convex shapes, the number of clusters should be known a priori, and they suffer from numerical problems such as sensitivity to the initialization. This paper studies the synergistic combination of the hierarchical and graph theoretic minimal spanning tree based clustering algorithm with the partitional Gath-Geva fuzzy clustering algorithm. The aim of this hybridization is to increase the robustness and consistency of the clustering results and to decrease the number of heuristically defined parameters of these algorithms, in order to decrease the influence of the user on the clustering results. For the analysis of the resulting fuzzy clusters, a new tool based on a fuzzy similarity measure is presented. The calculated similarities of the clusters can be used for the hierarchical clustering of the resulting fuzzy clusters, which is useful for cluster merging and for the visualization of the clustering results. As the examples used to illustrate the operation of the new algorithm show, the proposed algorithm can detect clusters from data with arbitrary shape and does not suffer from the numerical problems of the classical Gath-Geva fuzzy clustering algorithm.

Keywords: Clustering, fuzzy clustering, minimal spanning tree, cluster validity, fuzzy similarity.

Downloads: 2387
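The abstract above combines minimal spanning tree clustering with Gath-Geva fuzzy clustering but does not spell out either step. The sketch below covers only a generic crisp MST step: it grows a minimal spanning tree with Kruskal's algorithm and stops once `k` components remain, which is equivalent to building the full tree and removing its `k - 1` longest edges. The stopping rule, the function name, and the sample points are assumptions for illustration. The paper's own edge-cutting criteria and its fuzzy stage are not shown.

```python
import math
from itertools import combinations


def mst_clusters(points, k):
    """Partition points into k clusters via Kruskal's MST construction."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, shortest first.
    edges = sorted(combinations(range(len(points)), 2),
                   key=lambda e: math.dist(points[e[0]], points[e[1]]))
    components = len(points)
    for i, j in edges:
        if components == k:
            break  # the remaining (longer) MST edges are the ones "cut"
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            components -= 1

    # Relabel union-find roots as consecutive cluster ids 0, 1, 2, ...
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
print(mst_clusters(sample, 2))
```

Because clusters are formed by chaining nearest neighbours rather than by distance to a centroid, this kind of partition can follow elongated or irregular shapes.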
436 Thermosolutal MHD Mixed Marangoni Convective Boundary Layers in the Presence of Suction or Injection

Authors: Noraini Ahmad, Seripah Awang Kechil, Norma Mohd Basir

Abstract:

The steady coupled dissipative layers, called Marangoni mixed convection boundary layers, in the presence of a magnetic field and solute concentration that are formed along the surface of two immiscible fluids with uniform suction or injection effects are examined. The similarity boundary layer equations are solved numerically using the Runge-Kutta-Fehlberg method with a shooting technique. The Marangoni, buoyancy and external pressure gradient effects that are generated in mixed convection boundary layer flow are assessed. The velocity, temperature and concentration boundary layer thicknesses decrease with the increase of the magnetic field strength and the injection to suction. For buoyancy-opposed flow, the Marangoni mixed convection parameter enhances the velocity boundary layer but decreases the temperature and concentration boundary layers. However, for the buoyancy-assisted flow, the Marangoni mixed convection parameter decelerates the velocity but increases the temperature and concentration boundary layers.

Keywords: Magnetic field, mixed Marangoni convection, similarity boundary layers, solute concentration.

Downloads: 1867
435 Multi-Rate Exact Discretization based on Diagonalization of a Linear System - A Multiple-Real-Eigenvalue Case

Authors: T. Sakamoto, N. Hori

Abstract:

A multi-rate discrete-time model, whose response agrees exactly with that of a continuous-time original at all sampling instants for any sampling periods, is developed for a linear system, which is assumed to have multiple real eigenvalues. The sampling rates can be chosen arbitrarily and individually, so that their ratios can even be irrational. The state space model is obtained as a combination of a linear diagonal state equation and a nonlinear output equation. Unlike the usual lifted model, the order of the proposed model is the same as the number of sampling rates, which is less than or equal to the order of the original continuous-time system. The method is based on a nonlinear variable transformation, which can be considered as a generalization of linear similarity transformation, which cannot be applied to systems with multiple eigenvalues in general. An example and its simulation result show that the proposed multi-rate model gives exact responses at all sampling instants.

Keywords: Multi-rate discretization, linear systems, triangularization, similarity transformation, diagonalization, exponential transformation, multiple eigenvalues

Downloads: 1350
434 Grouping and Indexing Color Features for Efficient Image Retrieval

Authors: M. V. Sudhamani, C. R. Venugopal

Abstract:

Content-based Image Retrieval (CBIR) aims at searching image databases for specific images that are similar to a given query image based on matching of features derived from the image content. This paper focuses on a low-dimensional color based indexing technique for achieving efficient and effective retrieval performance. In our approach, the color features are extracted using the mean shift algorithm, a robust clustering technique. Then the cluster (region) mode is used as the representative of the image in 3-D color space. The feature descriptor consists of the representative color of a region and is indexed using a spatial indexing method that uses an R*-tree, thus avoiding the high-dimensional indexing problems associated with the traditional color histogram. Alternatively, the images in the database are clustered based on region feature similarity using Euclidean distance. Only representative (centroid) features of these clusters are indexed using the R*-tree, thus improving the efficiency. For similarity retrieval, each representative color in the query image or region is used independently to find regions containing that color. The results of these methods are compared. A Java based query engine supporting query-by-example is built to retrieve images by color.

Keywords: Content-based, indexing, cluster, region.

Downloads: 1802
433 Stability Analysis of Three-Dimensional Flow and Heat Transfer over a Permeable Shrinking Surface in a Cu-Water Nanofluid

Authors: Roslinda Nazar, Amin Noor, Khamisah Jafar, Ioan Pop

Abstract:

In this paper, the steady laminar three-dimensional boundary layer flow and heat transfer of a copper (Cu)-water nanofluid in the vicinity of a permeable shrinking flat surface in an otherwise quiescent fluid is studied. The nanofluid mathematical model in which the effect of the nanoparticle volume fraction is taken into account is considered. The governing nonlinear partial differential equations are transformed into a system of nonlinear ordinary differential equations using a similarity transformation which is then solved numerically using the function bvp4c from Matlab. Dual solutions (upper and lower branch solutions) are found for the similarity boundary layer equations for a certain range of the suction parameter. A stability analysis has been performed to show which branch solutions are stable and physically realizable. The numerical results for the skin friction coefficient and the local Nusselt number as well as the velocity and temperature profiles are obtained, presented and discussed in detail for a range of various governing parameters.

Keywords: Heat Transfer, Nanofluid, Shrinking Surface, Stability Analysis, Three-Dimensional Flow.

Downloads: 2184
432 A Review and Comparative Analysis on Cluster Ensemble Methods

Authors: S. Sarumathi, P. Ranjetha, C. Saraswathy, M. Vaishnavi, S. Geetha

Abstract:

Clustering is an unsupervised learning technique for aggregating data objects into meaningful classes so that intra cluster similarity is maximized and inter cluster similarity is minimized in data mining. However, no single clustering algorithm proves to be the most effective in producing the best result. As a result, a new challenging technique known as the cluster ensemble approach has blossomed in order to determine the solution to this problem. For the cluster analysis issue, this new technique is a successful approach. The cluster ensemble's main goal is to combine similar clustering solutions in a way that achieves the precision while also improving the quality of individual data clustering. Because of the massive and rapid creation of new approaches in the field of data mining, the ongoing interest in inventing novel algorithms necessitates a thorough examination of current techniques and future innovation. This paper presents a comparative analysis of various cluster ensemble approaches, including their methodologies, formal working process, and standard accuracy and error rates. As a result, the society of clustering practitioners will benefit from this exploratory and clear research, which will aid in determining the most appropriate solution to the problem at hand.

Keywords: Clustering, cluster ensemble methods, consensus function, data mining, unsupervised learning.
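The abstract above does not say which consensus functions the review covers. As a minimal sketch of one widely used idea, the following assumes a co-association (evidence accumulation) consensus: count how often each pair of objects shares a cluster across the base clusterings, then merge pairs that agree in more than half of them. The function names, the 0.5 threshold, and the three example partitions are illustrative assumptions, not taken from the paper.

```python
def co_association(partitions):
    """Fraction of base clusterings that put each pair of objects together."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def consensus_clusters(co, threshold=0.5):
    """Link objects whose co-association exceeds the threshold and
    return the connected components as consensus labels."""
    n = len(co)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if co[i][j] > threshold:
                parent[find(i)] = find(j)

    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Three hypothetical base clusterings of six objects.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
]
co = co_association(partitions)
consensus = consensus_clusters(co)
print(consensus)
```

The second base clustering disagrees about object 2, but the majority vote across the ensemble still recovers two groups of three objects.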

Downloads 798
431 Data Security in a DApp Twitter Alike on Web 3.0 With Blockchain Based Technology

Authors: Vishal Awasthi, Tanya Soni, Vigya Awasthi, Swati Singh, Shivali Verma

Abstract:

There is a growing demand for a network that grants a high level of data security and confidentiality. For this reason, the semantic web was introduced, which allows data to be shared and reused across applications while safeguarding users' privacy and giving users back control of their data. The earlier Web 1.0 and Web 2.0 versions were built on client-server architecture, in which there was a risk of data theft and unconsented sale of user data. A decentralized version, known as Web 3.0, that is mostly built on blockchain technology was introduced to resolve these issues. Recent research focuses on blockchain technology and deals with the privacy, security, transparency, and innovation of decentralized applications (DApps), e.g. a Twitter clone or a WhatsApp clone. In this paper, the Twitter Alike built on the Ethereum blockchain will replace traditional techniques, with improved latency, throughput, and data ownership. The central principle of this DApp is a smart contract implemented in Solidity, which is an object-oriented, high-level language. Consequently, this will provide better quality of service, high data security, and integrity for both present and future internet technologies.

Keywords: Blockchain, DApps, Ethereum, Semantic Web, Smart Contract, Solidity.

Downloads 280
430 Classifying Biomedical Text Abstracts based on Hierarchical 'Concept' Structure

Authors: Rozilawati Binti Dollah, Masaki Aono

Abstract:

Classifying biomedical literature is a difficult and challenging task, especially when a large number of biomedical articles should be organized into a hierarchical structure. In this paper, we present an approach for classifying a collection of biomedical text abstracts downloaded from the Medline database with the help of ontology alignment. To accomplish our goal, we construct two types of hierarchies, the OHSUMED disease hierarchy and the Medline abstract disease hierarchies, from the OHSUMED dataset and the Medline abstracts, respectively. Then, we enrich the OHSUMED disease hierarchy before adapting it to the ontology alignment process for finding probable concepts or categories. Subsequently, we compute the cosine similarity between the vector in probable concepts (in the "enriched" OHSUMED disease hierarchy) and the vector in Medline abstract disease hierarchies. Finally, we assign a category to the new Medline abstracts based on the similarity score. The results obtained from the experiments show that the performance of our proposed approach for hierarchical classification is slightly better than the performance of the multi-class flat classification.

Keywords: Biomedical literature, hierarchical text classification, ontology alignment, text mining.
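The last two steps described above, computing cosine similarity and assigning the best-scoring category, can be sketched as follows. The term-weight vectors and category names are made up for illustration; the abstract does not specify how the authors build or weight their vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def assign_category(abstract_vec, concept_vecs):
    """Return the concept whose vector is most similar to the abstract."""
    return max(
        concept_vecs,
        key=lambda name: cosine_similarity(abstract_vec, concept_vecs[name]),
    )


# Hypothetical concept vectors over a four-term vocabulary.
concepts = {
    "cardiovascular": [3.0, 1.0, 0.0, 0.0],
    "respiratory": [0.0, 0.0, 2.0, 2.0],
}
new_abstract = [2.0, 1.0, 0.0, 1.0]
print(assign_category(new_abstract, concepts))
```

Because cosine similarity ignores vector length, a long and a short abstract with the same term proportions receive the same score.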

Downloads 2001
429 Contextual Distribution for Textual Alignment

Authors: Yuri Bizzoni, Marianne Reboul

Abstract:

Our program compares French and Italian translations of Homer’s Odyssey, from the XVIth to the XXth century. We focus on the third point, showing how distributional semantics systems can be used to improve alignment both between different French translations and between the Greek text and a French translation. Although we focus on French examples, the techniques we display are completely language independent.

Keywords: Translation studies, machine translation, computational linguistics, distributional semantics.

Downloads 1016
428 Fighter Aircraft Evaluation and Selection Process Based on Triangular Fuzzy Numbers in Multiple Criteria Decision Making Analysis Using the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS)

Authors: C. Ardil

Abstract:

This article presents a multiple criteria evaluation approach to uncertainty, vagueness, and imprecision analysis for ranking alternatives with fuzzy data for decision making using the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS). The fighter aircraft evaluation and selection decision making problem is modeled in a fuzzy environment with triangular fuzzy numbers. The fuzzy decision information related to the fighter aircraft selection problem is taken into account in ordering the alternatives and selecting the best candidate. The basic fuzzy TOPSIS procedure steps transform fuzzy decision matrices into matrices of alternatives evaluated according to all decision criteria. A practical numerical example illustrates the proposed approach to the fighter aircraft selection problem.

Keywords: triangular fuzzy number (TFN), multiple criteria decision making analysis, decision making, aircraft selection, MCDMA, fuzzy TOPSIS
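The abstract does not list the exact procedure steps, so the following is only a sketch of one common fuzzy TOPSIS variant. It assumes triangular fuzzy numbers written as (lower, middle, upper) tuples with strictly positive entries, crisp criterion weights, the vertex distance between fuzzy numbers, and fixed ideal and anti-ideal points (1, 1, 1) and (0, 0, 0). The ratings and weights are hypothetical.

```python
import math


def vertex_distance(a, b):
    """Vertex-method distance between two triangular fuzzy numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / 3.0)


def fuzzy_topsis(matrix, weights, is_benefit):
    """Closeness coefficient of each alternative (higher is better).

    matrix[i][j] is the (l, m, u) rating of alternative i on criterion j.
    """
    n_crit = len(weights)
    weighted = [[None] * n_crit for _ in matrix]
    for j in range(n_crit):
        column = [row[j] for row in matrix]
        if is_benefit[j]:
            top = max(u for _, _, u in column)
            normalized = [(l / top, m / top, u / top) for l, m, u in column]
        else:
            # Cost criterion: smaller ratings map to larger normalized values.
            low = min(l for l, _, _ in column)
            normalized = [(low / u, low / m, low / l) for l, m, u in column]
        for i, tfn in enumerate(normalized):
            weighted[i][j] = tuple(weights[j] * v for v in tfn)

    ideal, anti_ideal = (1.0, 1.0, 1.0), (0.0, 0.0, 0.0)
    scores = []
    for row in weighted:
        d_plus = sum(vertex_distance(v, ideal) for v in row)
        d_minus = sum(vertex_distance(v, anti_ideal) for v in row)
        scores.append(d_minus / (d_plus + d_minus))
    return scores


# Two hypothetical aircraft rated on performance (benefit) and cost (cost).
ratings = [
    [(7, 9, 10), (3, 5, 7)],
    [(3, 5, 7), (5, 7, 9)],
]
scores = fuzzy_topsis(ratings, weights=[0.6, 0.4], is_benefit=[True, False])
print(scores)
```

The first alternative rates higher on the benefit criterion and lower on the cost criterion, so it receives the larger closeness coefficient.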

Downloads 453
427 Bird Diversity along Boat Touring Routes in Tha Ka Sub-District, Amphawa District, Samut Songkram Province, Thailand

Authors: N. Charoenpokaraj, P. Chitman

Abstract:

This research aims to study the species, abundance, and status of birds, as well as the similarities and activity characteristics of birds that benefit from the research area along boat touring routes in Tha Ka sub-district, Amphawa District, Samut Songkram Province, Thailand, from October 2012 to September 2013. The data were analyzed to find the abundance and similarity index of the birds. The survey of birds on all three routes found 33 families and 63 species. Route 3 (traditional coconut sugar making kiln – resort) had the most species, with 56. There were 18 species of commonly found birds with an abundance level of 5, which amounts to 28.57% of all bird species. In August, 46 species were found, the greatest number of bird species benefiting from this route. As for the status of the birds, there are 51 resident birds, 7 resident and migratory birds, and 5 migratory birds. The similarity index between Route 2 and Route 3 is 0.881. The birds are classified by their activity characteristics, i.e. insectivore, piscivore, granivore, nectarivore, and aquatic invertebrate feeder birds. Some birds also use the area for nesting.

Keywords: Bird diversity, boat touring routes, Samut Songkram.

Downloads 1709
426 Optimal Model Order Selection for Transient Error Autoregressive Moving Average (TERA) MRI Reconstruction Method

Authors: Abiodun M. Aibinu, Athaur Rahman Najeeb, Momoh J. E. Salami, Amir A. Shafie

Abstract:

An alternative approach to the use of the Discrete Fourier Transform (DFT) for Magnetic Resonance Imaging (MRI) reconstruction is the use of a parametric modeling technique. This method is suitable for problems in which the image can be modeled by explicit known source functions with a few adjustable parameters. Despite the success reported in the use of modeling as an alternative MRI reconstruction technique, two important problems challenge the applicability of this method: estimation of the model order and determination of the model coefficients. In this paper, five of the suggested methods of evaluating the model order have been evaluated: the Final Prediction Error (FPE), Akaike Information Criterion (AIC), Residual Variance (RV), Minimum Description Length (MDL), and Hannan and Quinn (HNQ) criteria. These criteria were evaluated on MRI data sets based on the Transient Error Reconstruction Algorithm (TERA) method. The result for each criterion is compared to the result obtained with a fixed-order technique, and three measures of similarity were evaluated. The results show that the use of MDL gives the highest measure of similarity to the result obtained with a fixed-order technique.

Keywords: Autoregressive Moving Average (ARMA), Magnetic Resonance Imaging (MRI), Parametric modeling, Transient Error.
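To illustrate how order-selection criteria of this kind trade goodness of fit against model size, the sketch below applies four of them to a plain autoregressive (AR) model fitted by the Levinson-Durbin recursion. This is a simplification and not the TERA method: the paper works with ARMA models on MRI data, whereas the test signal here is a synthetic AR(2) series. The criterion formulas are common textbook forms and may differ in detail from those used by the authors; RV is omitted because its exact form is not given in the abstract.

```python
import math
import random


def autocorrelation(x, max_lag):
    """Biased sample autocorrelation r[0..max_lag] after mean removal."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[t] * c[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]


def prediction_errors(r, max_order):
    """Levinson-Durbin recursion: residual variance of AR(p), p = 0..max_order."""
    a, err, errors = [1.0], r[0], [r[0]]
    for k in range(1, max_order + 1):
        acc = sum(a[j] * r[k - j] for j in range(k))
        refl = -acc / err                      # reflection coefficient
        prev = a + [0.0]
        a = [prev[j] + refl * prev[k - j] for j in range(k + 1)]
        err *= 1.0 - refl * refl
        errors.append(err)
    return errors


def select_orders(x, max_order):
    """Order that minimizes each criterion (assumed textbook forms)."""
    n = len(x)
    errors = prediction_errors(autocorrelation(x, max_order), max_order)
    criteria = {
        "FPE": lambda p, e: e * (n + p + 1) / (n - p - 1),
        "AIC": lambda p, e: n * math.log(e) + 2 * p,
        "MDL": lambda p, e: n * math.log(e) + p * math.log(n),
        "HNQ": lambda p, e: n * math.log(e) + 2 * p * math.log(math.log(n)),
    }
    return {name: min(range(max_order + 1), key=lambda p: f(p, errors[p]))
            for name, f in criteria.items()}


# Synthetic AR(2) series: x[t] = 0.75 x[t-1] - 0.5 x[t-2] + noise.
rng = random.Random(0)
x = [0.0, 0.0]
for _ in range(2000):
    x.append(0.75 * x[-1] - 0.5 * x[-2] + rng.gauss(0.0, 1.0))
orders = select_orders(x, max_order=8)
print(orders)
```

All four criteria share the same fit term and differ only in the penalty per parameter. MDL's penalty grows with the logarithm of the sample size, so it tends to choose orders no larger than AIC does.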

Downloads 1606
425 Graph Codes - 2D Projections of Multimedia Feature Graphs for Fast and Effective Retrieval

Authors: Stefan Wagenpfeil, Felix Engel, Paul McKevitt, Matthias Hemmje

Abstract:

Multimedia Indexing and Retrieval is generally designed and implemented by employing feature graphs. These graphs typically contain a significant number of nodes and edges to reflect the level of detail in feature detection. A higher level of detail increases the effectiveness of the results but also leads to more complex graph structures. However, graph-traversal-based algorithms for similarity are quite inefficient and computation-intensive, especially for large data structures. To deliver fast and effective retrieval, an efficient similarity algorithm, particularly for large graphs, is mandatory. Hence, in this paper, we define a graph-projection into a 2D space (Graph Code) as well as the corresponding algorithms for indexing and retrieval. We show that calculations in this space can be performed more efficiently than graph-traversals due to a simpler processing model and a high level of parallelisation. In consequence, we prove that the effectiveness of retrieval also increases substantially, as Graph Codes facilitate more levels of detail in feature fusion. Thus, Graph Codes provide a significant increase in efficiency and effectiveness (especially for Multimedia indexing and retrieval) and can be applied to images, videos, audio, and text information.

Keywords: indexing, retrieval, multimedia, graph code, graph algorithm
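The abstract does not define the projection or its similarity metrics, so the following is only a loose illustration of the general idea of replacing graph traversal with cell-wise matrix comparison. It assumes a shared vocabulary of feature terms, marks node presence on the diagonal, and stores an integer edge-type code off the diagonal. The function names, the encoding, and the single similarity score are all assumptions and should not be read as the authors' definitions.

```python
def graph_code(nodes, edges, vocabulary):
    """Project a labeled feature graph onto a |V| x |V| integer matrix.

    Diagonal cell = 1 if the term occurs as a node; off-diagonal cell holds
    an integer code for the edge type between two terms (0 = no edge).
    """
    index = {term: i for i, term in enumerate(vocabulary)}
    size = len(vocabulary)
    matrix = [[0] * size for _ in range(size)]
    for term in nodes:
        matrix[index[term]][index[term]] = 1
    for source, target, kind in edges:
        matrix[index[source]][index[target]] = kind
    return matrix


def similarity(query, candidate):
    """Share of the query's non-zero cells that the candidate reproduces."""
    filled = [(i, j) for i, row in enumerate(query)
              for j, value in enumerate(row) if value]
    if not filled:
        return 0.0
    hits = sum(1 for i, j in filled if query[i][j] == candidate[i][j])
    return hits / len(filled)


vocabulary = ["person", "dog", "ball"]
query = graph_code(["person", "dog"], [("person", "dog", 2)], vocabulary)
document = graph_code(
    ["person", "dog", "ball"],
    [("person", "dog", 2), ("dog", "ball", 3)],
    vocabulary,
)
print(similarity(query, document), similarity(document, query))
```

Each cell comparison is independent of the others, which is the property that makes this kind of representation easy to parallelise.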

Downloads 425
424 Enhancing Word Meaning Retrieval Using FastText and NLP Techniques

Authors: Sankalp Devanand, Prateek Agasimani, V. S. Shamith, Rohith Neeraje

Abstract:

Machine translation has witnessed significant advancements in recent years, but the translation of languages with distinct linguistic characteristics, such as English and Sanskrit, remains a challenging task. This research presents the development of a dedicated English to Sanskrit machine translation model, aiming to bridge the linguistic and cultural gap between these two languages. Using a variety of natural language processing (NLP) approaches including FastText embeddings, this research proposes a thorough method to improve word meaning retrieval. Data preparation, part-of-speech tagging, dictionary searches, and transliteration are all included in the methodology. The study also addresses the implementation of an interpreter pattern and uses a word similarity task to assess the quality of word embeddings. The experimental outcomes show how the suggested approach may be used to enhance word meaning retrieval tasks with greater efficacy, accuracy, and adaptability. Evaluation of the model's performance is conducted through rigorous testing, comparing its output against existing machine translation systems. The assessment includes quantitative metrics such as BLEU scores, METEOR scores, Jaccard Similarity etc.

Keywords: Machine translation, English to Sanskrit, natural language processing, word meaning retrieval, FastText embeddings.
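Of the metrics named above, Jaccard similarity is the simplest to state: the size of the intersection of two token sets divided by the size of their union. The sketch below assumes lowercase whitespace tokenization, which the abstract does not specify, and uses made-up English sentences in place of real system output.

```python
def jaccard_similarity(candidate, reference):
    """Jaccard similarity between the word sets of two sentences."""
    a = set(candidate.lower().split())
    b = set(reference.lower().split())
    if not a and not b:
        return 1.0  # two empty sentences are treated as identical
    return len(a & b) / len(a | b)


print(jaccard_similarity("the cat sat", "the cat ran"))
```

Unlike BLEU or METEOR, this score ignores word order and repeated words, so it is best read as a rough measure of lexical overlap.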

Downloads 36
423 A Corpus-Based Study on the Styles of Three Translators

Authors: Wang Yunhong

Abstract:

The present paper is concerned with the different styles of three translators in translating the classical Chinese novel Shuihu Zhuan. Based on a parallel corpus, it adopts a target-oriented approach to look into whether the three translations reveal stylistic differences and shifts, and what these are. The findings show that the three translators demonstrate different styles in their word choices and sentence preferences, which implies that identification of recurrent textual patterns may be a basic step for investigating the style of a translator.

Keywords: Corpus, lexical choices, sentence characteristics, style.

Downloads 689