Search results for: searching.

64 A Study on Finding Similar Document with Multiple Categories

Authors: R. Saraçoğlu, N. Allahverdi

Abstract:

Searching for similar documents and document management are important subjects in text mining. One of the most important parts of similar-document research is the process of classifying or clustering the documents. In this study, a similar-document search approach is carried out that addresses the case of a document belonging to multiple categories (the multiple categories problem). The proposed method, which is based on Fuzzy Similarity Classification (FSC), is compared with the Rocchio algorithm and the naive Bayes method, both widely used in text mining. Empirical results show that the proposed method is quite successful and can be applied effectively. For the second stage, a multiple-categories vector method is used, based on information about how frequently categories are seen together. Empirical results show that performance almost doubles when the proposed method is compared with the classical approach.
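
The Rocchio baseline named in the abstract above can be illustrated with a minimal centroid-based sketch. This is not the authors' FSC method, and it is a simplified Rocchio variant (class centroids only, no negative-prototype term). The raw term-frequency weighting, the whitespace tokenizer, and the toy training documents are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def tf_vector(text):
    """Raw term-frequency vector from whitespace tokens (an assumed weighting)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def train_centroids(labeled_docs):
    """Average the term vectors of each class into one prototype per class."""
    sums = defaultdict(Counter)
    counts = Counter()
    for text, label in labeled_docs:
        sums[label].update(tf_vector(text))
        counts[label] += 1
    return {label: {t: w / counts[label] for t, w in vec.items()}
            for label, vec in sums.items()}

def classify(text, centroids):
    """Assign the class whose centroid is most similar to the document."""
    vec = tf_vector(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

# Hypothetical toy corpus, not data from the paper.
training = [
    ("fuzzy sets membership", "fuzzy"),
    ("fuzzy logic rules", "fuzzy"),
    ("neural network training", "nn"),
    ("deep neural network", "nn"),
]
centroids = train_centroids(training)
label = classify("fuzzy rules", centroids)
```

A multi-category variant could return every class whose similarity exceeds a threshold instead of only the best one; the abstract does not say how the authors handle this step.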

Keywords: Document similarity, Fuzzy classification, Multiple categories, Text mining.

Downloads: 1663

63 Security, Securitization and Human Capital: The New Wave of Canadian Immigration Laws

Authors: Robert M. Russo

Abstract:

This paper analyzes the linkage between migration, economic globalization and terrorism concerns. On a broad level, I analyze Canadian economic and political considerations, searching for causal relationships between political and economic actors on the one hand, and Canadian immigration law on the other. Specifically, the paper argues that there are contradictory impulses affecting state sovereignty. These impulses are currently being played out in the field of Canadian immigration law through several proposed changes to Canada's Immigration and Refugee Protection Act (IRPA). These changes reflect an ideological conception of sovereignty that is intrinsically connected with decision-making capacity centered on an individual. This conception of sovereign decision-making views Parliamentary debate and bureaucratic inefficiencies as equally responsible for delaying essential decisions relating to the protection of state sovereignty, economic benefits and immigration control. This paper discusses these concepts in relation to Canadian immigration policy under Canadian governments over the past twenty-five years.

Keywords: Globalization, immigration law, security, anti-terrorism.

Downloads: 3157

62 A Decision Matrix for the Evaluation of Triplestores for Use in a Virtual Research Environment

Authors: Tristan O’Neill, Trina Myers, Jarrod Trevathan

Abstract:

The Tropical Data Hub (TDH) is a virtual research environment that provides researchers with an e-research infrastructure to congregate significant tropical data sets for data reuse, integration, searching, and correlation. However, researchers often require data and metadata synthesis across disciplines for cross-domain analyses and knowledge discovery. A triplestore offers a semantic layer to achieve a more intelligent method of search to support the synthesis requirements by automating latent linkages in the data and metadata. Presently, the benchmarks to aid the decision of which triplestore is best suited for use in an application environment like the TDH are limited to performance. This paper describes a new evaluation tool developed to analyze both features and performance. The tool comprises a weighted decision matrix to evaluate the interoperability, functionality, performance, and support availability of a range of integrated and native triplestores to rank them according to requirements of the TDH.
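
The weighted decision matrix described above can be sketched in a few lines. The four criteria come from the abstract. The weights, the 0-10 scores, and the names `store_a` and `store_b` are hypothetical placeholders, not values from the paper.

```python
def rank_candidates(scores, weights):
    """Return (name, weighted_score) pairs sorted from best to worst."""
    total_weight = sum(weights.values())
    ranked = []
    for name, criteria in scores.items():
        # Weighted average of the per-criterion scores.
        total = sum(weights[c] * criteria[c] for c in weights) / total_weight
        ranked.append((name, total))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

# Hypothetical weights and scores for illustration only.
weights = {"interoperability": 3, "functionality": 2, "performance": 4, "support": 1}
scores = {
    "store_a": {"interoperability": 8, "functionality": 6, "performance": 7, "support": 9},
    "store_b": {"interoperability": 6, "functionality": 9, "performance": 9, "support": 5},
}
ranking = rank_candidates(scores, weights)
```

Changing the weights to reflect a different deployment's priorities can reorder the ranking, which is the point of separating weights from scores.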

Keywords: Virtual research environment, Semantic Web, performance analysis, tropical data hub.

Downloads: 1664

61 Dynamic Measurement System Modeling with Machine Learning Algorithms

Authors: Changqiao Wu, Guoqing Ding, Xin Chen

Abstract:

In this paper, ways of modeling dynamic measurement systems are discussed. Specifically, a linear single-input single-output system can be modeled with a shallow neural network. Gradient-based optimization algorithms are then used to search for the proper coefficients. In addition, methods based on the normal equation and second-order gradient descent are proposed to accelerate the modeling process, and ways of obtaining better gradient estimates are discussed. It is shown that the mathematical essence of the learning objective is maximum likelihood under Gaussian-distributed noise. For conventional gradient descent, mini-batch learning and gradient descent with momentum contribute to faster convergence and improve model ability. Lastly, experimental results proved the effectiveness of the second-order gradient descent algorithm and indicated that optimization with the normal equation was the most suitable for linear dynamic models.
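
As a rough illustration of the normal-equation approach mentioned above, the sketch below fits a linear single-input single-output model by solving X^T X w = X^T y directly. Modeling the system as a short FIR filter, the filter order, and the noise-free synthetic signal are assumptions made here. The abstract does not specify the authors' model structure.

```python
def solve_linear(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w

def fit_fir(x, y, order):
    """Least-squares FIR coefficients from the normal equation X^T X w = X^T y."""
    rows = [[x[t - k] for k in range(order)] for t in range(order - 1, len(x))]
    targets = y[order - 1:]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(order)] for i in range(order)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(order)]
    return solve_linear(xtx, xty)

# Synthetic input and output generated from assumed coefficients (0.5, 0.3).
x = [1.0, 0.0, 2.0, -1.0, 3.0, 0.5, -2.0, 1.5]
true_w = [0.5, 0.3]
y = [0.0] + [true_w[0] * x[t] + true_w[1] * x[t - 1] for t in range(1, len(x))]
w = fit_fir(x, y, order=2)
```

Unlike gradient descent, this solves for the coefficients in one step, which is consistent with the abstract's observation that the normal equation suits linear models.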

Keywords: Dynamic system modeling, neural network, normal equation, second order gradient descent.

Downloads: 725

60 Genetic Algorithms for Feature Generation in the Context of Audio Classification

Authors: José A. Menezes, Giordano Cabral, Bruno T. Gomes

Abstract:

Choosing good features is an essential part of machine learning. Recent techniques aim to automate this process. For instance, feature learning intends to learn the transformation of raw data into a useful representation for machine learning tasks. In automatic audio classification tasks, this is interesting since the audio, usually complex information, needs to be transformed into a computationally convenient input to process. Another technique tries to generate features by searching a feature space. Genetic algorithms, for instance, have been used to generate audio features by combining or modifying them. We find this approach particularly interesting and, despite the undeniable advances of feature learning approaches, we wanted to take a step forward in the use of genetic algorithms to find audio features, combining them with more conventional methods, like PCA, and inserting search control mechanisms, such as constraints over a confusion matrix. This work presents the results obtained on particular audio classification problems.

Keywords: Feature generation, feature learning, genetic algorithm, music information retrieval.

Downloads: 1020

59 Music-Inspired Harmony Search Algorithm for Fixed Outline Non-Slicing VLSI Floorplanning

Authors: K. Sivasubramanian, K. B. Jayanthi

Abstract:

Floorplanning plays a vital role in the physical design process of Very Large Scale Integrated (VLSI) chips. It is an essential design step to estimate the chip area prior to the optimized placement of digital blocks and their interconnections. Since VLSI floorplanning is an NP-hard problem, many optimization techniques have been adopted in the literature. In this work, a music-inspired Harmony Search (HS) algorithm is used for fixed die outline constrained floorplanning, with the aim of reducing the total chip area. HS draws inspiration from the musical improvisation process of searching for a perfect state of harmony. Initially, a B*-tree is used to generate the primary floorplan for the given rectangular hard modules, and then the HS algorithm is applied to obtain an optimal solution for the efficient floorplan. The experimental results of the HS algorithm are obtained for the MCNC benchmark circuits.

Keywords: Floor planning, harmony search, non-slicing floorplan, very large scale integrated circuits.

Downloads: 1914

58 Solution Economic Power Dispatch Problems by an Ant Colony Optimization Approach

Authors: Navid Mehdizadeh Afroozi, Khodakhast Isapour, Mojtaba Hakimzadeh, Abdolmohammad Davodi

Abstract:

The objective of the Economic Dispatch (ED) problem of electric power generation is to schedule the outputs of the committed generating units so as to meet the required load demand at minimum operating cost while satisfying all unit and system equality and inequality constraints. This paper presents a new method for ED problems utilizing Max-Min Ant System optimization. Historically, traditional optimization techniques have been used, such as linear and non-linear programming, but within the past decade the focus has shifted to Evolutionary Algorithms, for example Genetic Algorithms, Simulated Annealing and, recently, Ant Colony Optimization (ACO). In this paper we introduce the Max-Min Ant System version of the Ant System. This algorithm encourages local searching around the best solution found in each iteration. To show its efficiency and effectiveness, the proposed Max-Min Ant System is applied to sample ED problems composed of 4 generators. A comparison with conventional genetic algorithms is presented.

Keywords: Economic Dispatch (ED), Ant Colony Optimization, Fuel Cost, Algorithm.

Downloads: 2540

57 A Novel Genetic Algorithm Designed for Hardware Implementation

Authors: Zhenhuan Zhu, David Mulvaney, Vassilios Chouliaras

Abstract:

A new genetic algorithm, termed the 'optimum individual monogenetic genetic algorithm' (OIMGA), is presented whose properties have been deliberately designed to be well suited to hardware implementation. Specific design criteria were to ensure fast access to the individuals in the population, to keep the required silicon area for hardware implementation to a minimum and to incorporate flexibility in the structure for the targeting of a range of applications. The first two criteria are met by retaining only the current optimum individual, thereby guaranteeing a small memory requirement that can easily be stored in fast on-chip memory. Also, OIMGA can be easily reconfigured to allow the investigation of problems that normally warrant either large GA populations or individuals many genes in length. Local convergence is achieved in OIMGA by retaining elite individuals, while population diversity is ensured by continually searching for the best individuals in fresh regions of the search space. The results given in this paper demonstrate that both the performance of OIMGA and its convergence time are superior to those of a range of existing hardware GA implementations.

Keywords: Genetic algorithms, genetic hardware, machine learning.

Downloads: 1983

56 Simulation of Enhanced Biomass Gasification for Hydrogen Production using iCON

Authors: Mohd K. Yunus, Murni M. Ahmad, Abrar Inayat, Suzana Yusup

Abstract:

Due to the environmental and price issues of the current energy crisis, scientists and technologists around the globe are intensively searching for new, lower-impact forms of clean energy that will reduce the high dependency on fossil fuels. In particular, hydrogen can be produced from biomass via thermochemical processes, including pyrolysis and gasification, owing to their economic advantage, and production can be further enhanced through in-situ carbon dioxide removal using calcium oxide. This work focuses on the synthesis and development of the flowsheet for the enhanced biomass gasification process in PETRONAS's iCON process simulation software. The hydrogen prediction model is run at operating temperatures between 600 and 1000 °C at atmospheric pressure. The effects of temperature, steam-to-biomass ratio and adsorbent-to-biomass ratio were studied, and a hydrogen mole fraction of 0.85 is predicted in the product gas. The results are also compared with experimental data from the literature. The preliminary economic potential of the developed system is RM 12.57 × 10^6 annually, equivalent to USD 3.77 × 10^6, which indicates the economic viability of this process.

Keywords: Biomass, Gasification, Hydrogen, iCON.

Downloads: 2556

55 Research Topic Map Construction

Authors: Hei-Chia Wang, Che-Tsung Yang

Abstract:

With the explosive increase in information published on the Web, researchers have to filter information when searching for conference-related information. To make it easier for users to search for related information, this paper uses Topic Maps and social information to implement an ontology, since an ontology can provide the formalisms and knowledge structuring for the comprehensive and transportable machine understanding that digital information requires. Besides enhancing information in Topic Maps, this paper proposes a method of constructing research Topic Maps that takes social information into account. First, conference data are extracted from the web. Then conference topics and the relationships between them are extracted through the proposed method. Finally, the result is visualized for users to search and browse. This paper uses an ontology, which contains an abundant knowledge hierarchy structure, to help researchers obtain useful search results. However, most previous ontology construction methods didn't take "people" into account. So this paper also analyzes the social information, which helps researchers find possibilities for cooperation/combination as well as associations between research topics, and tries to offer better results.

Keywords: Ontology, topic maps, social information, co-authorship.

Downloads: 1760

54 Ontology-Based Backpropagation Neural Network Classification and Reasoning Strategy for NoSQL and SQL Databases

Authors: Hao-Hsiang Ku, Ching-Ho Chi

Abstract:

Big data applications have become imperative in many fields. Many researchers have devoted themselves to increasing correct rates and reducing time complexity. Hence, this study designs and proposes an ontology-based backpropagation neural network classification and reasoning strategy for NoSQL big data applications, called ON4NoSQL. ON4NoSQL is responsible for enhancing classification performance in NoSQL and SQL databases in order to build mass behavior models. Mass behavior models are built with MapReduce techniques and the Hadoop distributed file system on the Hadoop service platform. The reference engine of ON4NoSQL is the ontology-based backpropagation neural network classification and reasoning strategy. Simulation results indicate that ON4NoSQL can efficiently construct a high-performance environment for data storing, searching, and retrieving.

Keywords: Hadoop, NoSQL, ontology, backpropagation neural network, and high distributed file system.

Downloads: 961

53 Exploiting Query Feedback for Efficient Query Routing in Unstructured Peer-to-peer Networks

Authors: Iskandar Ishak, Naomie Salim

Abstract:

Unstructured peer-to-peer networks are popular due to their robustness and scalability. Query schemes used in unstructured peer-to-peer networks, such as flooding and interest-based shortcuts, suffer from various problems such as large communication overhead and long response delays. The use of routing indices has been a popular approach for peer-to-peer query routing. It helps the query routing process learn routes based on the feedback collected. In an unstructured network where no global information is available, an efficient and low-cost routing approach is needed for routing efficiency. In this paper, we propose a novel mechanism for query-feedback-oriented routing indices to achieve routing efficiency in unstructured networks at minimal cost. The approach also applies an information retrieval technique to make sure the content of the query is understandable, so that the routing process is based not just on query hits but also on the query content. Experiments have shown that the proposed mechanism performs more efficiently than flood-based routing.

Keywords: Unstructured peer-to-peer, Searching, Retrieval, Internet.

Downloads: 1492

52 Molecular Dynamics of Fatty Acid Interacting with Carbon Nanotube as Selective Device

Authors: David L. Azevedo, Jordan Del Nero

Abstract:

In this paper we study a system composed of a carbon nanotube (CNT) and a bundle of carbon nanotubes (BuCNT) interacting with a specific fatty acid as a molecular probe. The full system is represented by an open nanotube (or nanotubes) and linoleic acid (LA) relaxing due to the interaction with the CNT and BuCNT. The LA has an asymmetric shape, with a COOH termination provoking a close BuCNT interaction, mainly through the van der Waals force field. The simulations were performed by classical molecular dynamics with standard parameterizations. Our results show that the BuCNT and CNT are dynamically stable and show a preferential interaction position with LA, resulting in three features: (i) when the LA is interacting with the CNT and BuCNT (including both terminations, CH2 or COOH), the LA is repelled; (ii) when the LA terminated with CH2 is closer to the open extremity of the BuCNT, the LA is also repelled by the interaction between them; and (iii) when the LA terminated with COOH is closer to the open extremity of the BuCNT, the LA is encapsulated by the BuCNT. These simulations are part of a more extensive search for efficient selective molecular devices and could be useful in reaching this goal.

Keywords: Carbon Nanotube, Linoleic Acid, Molecular Dynamics.

Downloads: 1638

51 An Approach to Integrate Ontologies of Open Educational Resources in Knowledge Based Management Systems

Authors: Firas A. Al Laban, Mohamed Chabi, Sammani Danwawu Abdullahi

Abstract:

There is a real need to integrate types of Open Educational Resources (OER) with an intelligent system in order to extract information and knowledge at the semantic searching level. The need arises because most current learning standards have adopted web-based learning, and e-learning systems do not always serve all educational goals. Semantic Web systems provide educators, students, and researchers with intelligent queries based on a semantic knowledge management learning system. An ontology-based learning system is an advanced system in which the ontology forms the core of the semantic web in a smart learning environment. The objective of this paper is to discuss the potential of ontologies and of mapping different kinds of ontologies, heterogeneous or homogeneous, to manage and control different types of Open Educational Resources. The important contribution of this research is that it uses logical rules and conceptual relations to map between ontologies of different educational resources. We expect this methodology to establish an intelligent educational system supporting student tutoring, self-learning and lifelong learning.

Keywords: Knowledge Management Systems, Ontologies, Semantic Web, Open Educational Resources.

Downloads: 1520

50 Effects of Introducing Similarity Measures into Artificial Bee Colony Approach for Optimization of Vehicle Routing Problem

Authors: P. Shunmugapriya, S. Kanmani, P. Jude Fredieric, U. Vignesh, J. Reman Justin, K. Vivek

Abstract:

The Vehicle Routing Problem (VRP) is a complex combinatorial optimization problem, and it is quite difficult to find an optimal solution consisting of a set of routes for vehicles whose total cost is minimum. Evolutionary and swarm intelligence (SI) algorithms play a vital role in solving optimization problems. While SI algorithms perform a search, the diversity between the solutions they exploit is very important. This is because of the need to avoid early convergence and to strike an appropriate balance between exploration and exploitation. Therefore, it is important to check how diverse the solutions are. In this paper, we measure the similarity between the solutions that ABC exploits while optimizing the VRP. The similar solutions found are discarded at the end of the iteration, and only unique solutions are passed on to the next iteration. The bees of discarded solutions become scouts and start searching for new solutions. This process is continued, and the results show that the solution is optimized in fewer iterations, but with the overhead of computing similarity in every iteration. A problem instance from the Solomon benchmark dataset has been used to evaluate the presented methodology.
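
The similarity filter described above can be sketched with a Jaccard measure, which the entry's keywords name. Representing a solution by its set of undirected edges, using depot id 0, and the 0.8 threshold are assumptions made for this illustration. The abstract does not state how the authors encode solutions.

```python
def route_edges(solution):
    """Collect the undirected edges of all routes; each route starts and ends at depot 0."""
    edges = set()
    for route in solution:
        stops = [0] + list(route) + [0]
        for a, b in zip(stops, stops[1:]):
            edges.add(frozenset((a, b)))
    return edges

def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def keep_unique(solutions, threshold=0.8):
    """Keep a solution only if it is not too similar to one already kept."""
    kept = []
    for sol in solutions:
        edges = route_edges(sol)
        if all(jaccard(edges, route_edges(k)) < threshold for k in kept):
            kept.append(sol)
    return kept

# Hypothetical solutions over customers 1..5.
solutions = [
    [[1, 2, 3], [4, 5]],
    [[3, 2, 1], [4, 5]],   # same routes, the first one traversed in reverse
    [[1, 4], [2, 5, 3]],
]
unique = keep_unique(solutions)
```

In an ABC loop, the bees whose solutions are dropped by `keep_unique` would be the ones reassigned as scouts. The pairwise comparison is the per-iteration overhead the abstract mentions.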

Keywords: ABC algorithm, vehicle routing problem, optimization, Jaccard’s similarity measure.

Downloads: 800

49 Thread Lift: Classification, Technique, and How to Approach to the Patient

Authors: Panprapa Yongtrakul, Punyaphat Sirithanabadeekul, Pakjira Siriphan

Abstract:

Background: The thread lift technique has become popular because it is less invasive, requires a shorter operation, less downtime, and results in fewer postoperative complications. The advantage of the technique is that the thread can be inserted under the skin without the need for long incisions. Currently, there are a lot of thread lift techniques with respect to the specific types of thread used on specific areas, such as the mid-face, lower face, or neck area. Objective: To review the thread lift technique for specific areas according to type of thread, patient selection, and how to match the most appropriate to the patient. Materials and Methods: A literature review technique was conducted by searching PubMed and MEDLINE, then compiled and summarized. Result: We have divided our protocols into two sections: Protocols for short suture, and protocols for long suture techniques. We also created 3D pictures for each technique to enhance understanding and application in a clinical setting. Conclusion: There are advantages and disadvantages to short suture and long suture techniques. The best outcome for each patient depends on appropriate patient selection and determining the most suitable technique for the defect and area of patient concern.

Keywords: Thread lift, thread lift method, thread lift technique, thread lift procedure, threading.

Downloads: 10088

48 Application of Exact String Matching Algorithms towards SMILES Representation of Chemical Structure

Authors: Ahmad Fadel Klaib, Zurinahni Zainol, Nurul Hashimah Ahamed, Rosma Ahmad, Wahidah Hussin

Abstract:

Bioinformatics and cheminformatics are disciplines that use computers to provide tools for the acquisition, storage, processing, analysis, and integration of data, and for the development of potential applications of biological and chemical data. A chemical database is a database designed exclusively to store chemical information. NMRShiftDB is one of the main databases used to represent chemical structures in 2D or 3D. The SMILES format is one of many ways to write a chemical structure in a linear format. In this study we extracted antimicrobial structures in SMILES format from NMRShiftDB and stored them in our local data warehouse with their corresponding information. Additionally, we developed a search tool that responds to a user's query using the JME Editor tool, which allows the user to draw or edit molecules and converts the drawn structure into SMILES format. We applied the Quick Search algorithm to search for antimicrobial structures in our local data warehouse.
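
The Quick Search algorithm named above (Sunday's exact string-matching algorithm) can be sketched as follows. It shifts the search window according to the character just past the window. The function name and the aspirin SMILES example are illustrative choices, not taken from the paper.

```python
def quick_search(pattern, text):
    """Return all start positions of pattern in text using Sunday's Quick Search."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Shift for each pattern character: distance from its rightmost
    # occurrence to one past the end of the pattern.
    shift = {ch: m - i for i, ch in enumerate(pattern)}
    hits = []
    i = 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            hits.append(i)
        if i + m >= n:
            break
        # Characters absent from the pattern allow the maximal shift m + 1.
        i += shift.get(text[i + m], m + 1)
    return hits

# Aspirin written as SMILES; search for the textual fragment "C(=O)O".
smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"
positions = quick_search("C(=O)O", smiles)
```

Note that exact string matching finds textual occurrences only. The same substructure can be written as different SMILES strings, so this is not a full chemical substructure search.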

Keywords: Exact String-matching Algorithms, NMRShiftDB, SMILES Format, Antimicrobial Structures.

Downloads: 2173

47 An Alternative and Complementary Medicine Method in Vulnerable Pediatric Cancer Patients: Yoga

Authors: Ç. Erdoğan, T. Turan

Abstract:

Pediatric cancer patients experience multiple distressing and challenging physical symptoms, such as fatigue, pain, sleep disturbance, and balance impairment, that continue for years after treatment completion. In recent years, yoga has often been used to help children with cancer cope with these symptoms. Yoga practice is defined as a unique physical activity that combines physical practice, breath work and mindfulness/meditation. Yoga is an increasingly popular mind-body practice that is also characterized as a mindful mode of exercise. This study aimed to evaluate the impact of yoga interventions on children with cancer. The article was planned as a search of the literature in this field. It has been determined that individualized yoga is feasible and provides benefits for inpatient children, improving health-related quality of life, physical activity levels, and physical fitness. After a yoga program, children's anxiety scores decrease significantly. Additionally, individualized yoga is feasible for inpatient children receiving intensive chemotherapy. In conclusion, yoga is an alternative and complementary medicine method that can be safely used in children with cancer.

Keywords: Cancer treatment, children, nursing, yoga.

Downloads: 1066

46 Dynamic Capitalization and Visualization Strategy in Collaborative Knowledge Management System for EI Process

Authors: Bolanle F. Oladejo, Victor T. Odumuyiwa, Amos A. David

Abstract:

Knowledge is attributed to humans, whose problem-solving behavior is subjective and complex. In today's knowledge economy, the need to manage knowledge produced by a community of actors cannot be overemphasized. This is due to the fact that actors possess some level of tacit knowledge which is generally difficult to articulate. Problem-solving requires searching and sharing of knowledge among a group of actors in a particular context. Knowledge expressed within the context of a problem resolution must be capitalized for future reuse. In this paper, an approach is proposed that permits dynamic capitalization of relevant and reliable actors' knowledge in solving decision problems following the Economic Intelligence process. A knowledge annotation method and temporal attributes are used for handling the complexity in the communication among actors and in contextualizing expressed knowledge. A prototype is built to demonstrate the functionalities of a collaborative Knowledge Management system based on this approach. It is tested with sample cases, and the results show that dynamic capitalization leads to knowledge validation, hence increasing the reliability of captured knowledge for reuse. The system can be adapted to various domains.

Keywords: Actors' communication, knowledge annotation, recursive knowledge capitalization, visualization.

Downloads: 1332

45 Data Hiding by Vector Quantization in Color Image

Authors: Yung-Gi Wu

Abstract:

With the growth of computers and networks, digital data can be spread anywhere in the world quickly. In addition, digital data can be copied or tampered with easily, so security has become an important topic in the protection of digital data. A digital watermark is a method of protecting the ownership of digital data. Embedding the watermark inevitably affects the quality. In this paper, Vector Quantization (VQ) is used to embed the watermark into the image to fulfill the goal of data hiding. This kind of watermarking is invisible, which means that users will not be aware of the existence of the embedded watermark even though the embedded image differs slightly from the original image. Meanwhile, VQ requires a heavy computational burden, so we adopt a fast VQ encoding scheme based on partial distortion searching (PDS) and a mean approximation scheme to speed up the data hiding process. The watermarks we hide in the image can be gray, bi-level or color images. Text can also be embedded as a watermark. In order to test the robustness of the system, we use Photoshop to apply sharpening, cropping and altering, and check whether the extracted watermark is still recognizable. Experimental results demonstrate that the proposed system can resist the above three kinds of tampering in general cases.
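
Partial distortion searching, mentioned above as the fast VQ encoding step, can be sketched as a nearest-codeword search that abandons a codeword as soon as its accumulated squared error reaches the best distance found so far. The toy codebook and input vector are hypothetical. The mean approximation scheme from the abstract is not shown.

```python
def nearest_codeword_pds(vector, codebook):
    """Return (index, squared_distance) of the closest codeword, with early exit."""
    best_index, best_dist = -1, float("inf")
    for index, codeword in enumerate(codebook):
        dist = 0.0
        for v, c in zip(vector, codeword):
            dist += (v - c) ** 2
            if dist >= best_dist:
                break  # partial distortion already too large: reject this codeword
        else:
            # Loop finished without a break, so dist < best_dist.
            best_index, best_dist = index, dist
    return best_index, best_dist

# Hypothetical 4-dimensional codebook and input block.
codebook = [[0, 0, 0, 0], [10, 10, 10, 10], [3, 4, 3, 4]]
vector = [3, 3, 3, 3]
index, dist = nearest_codeword_pds(vector, codebook)
```

The result is identical to a full search. Only the amount of arithmetic per rejected codeword changes.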

Keywords: Data hiding, vector quantization, watermark.

Downloads: 1737

44 Self-evolving Artificial Immune System via Developing T and B Cell for Permutation Flow-shop Scheduling Problems

Authors: Pei-Chann Chang, Wei-Hsiu Huang, Ching-Jung Ting, Hwei-Wen Luo, Yu-Peng Yu

Abstract:

Artificial Immune Systems have been applied as heuristic algorithms for decades. Nevertheless, many of these applications took advantage of the benefits of the algorithm but seldom proposed approaches for enhancing its efficiency. In this paper, a Self-evolving Artificial Immune System is proposed by developing the T and B cells of the immune system and building a self-evolving mechanism for the complexities of different problems. This research focuses on enhancing the efficiency of Clonal Selection, which is responsible for producing affinities to resist invading antigens. T and B cells are the main mechanisms by which Clonal Selection produces different combinations of antibodies. Therefore, the development of T and B cells influences the efficiency of Clonal Selection in searching for better solutions. Furthermore, for better cooperation between the two cells, a co-evolutional strategy is applied to coordinate more effective production of antibodies. Finally, this work adopts flow-shop scheduling instances from the OR-Library to validate the proposed algorithm.

Keywords: Artificial Immune System, Clonal Selection, Flow-shop Scheduling Problems, Co-evolutional strategy

Downloads: 1704

43 Fast Database Indexing for Large Protein Sequence Collections Using Parallel N-Gram Transformation Algorithm

Authors: Jehad A. H. Hammad, Nur'Aini binti Abdul Rashid

Abstract:

With the rapid development in the field of life sciences and the flood of genomic information, the need for faster and more scalable searching methods has become urgent. One of the approaches that has been investigated is indexing. Indexing methods fall into three categories: length-based index algorithms, transformation-based algorithms and mixed-technique algorithms. In this research, we focused on the transformation-based methods. We embedded the N-gram method into the transformation-based method to build an inverted index table. We then applied parallel methods to speed up index building and to reduce the overall retrieval time when querying the genomic database. Our experiments show that the N-gram transformation algorithm is an economical solution; it saves both time and space. The results show that the index is smaller than the dataset when the N-gram size is 5 or 6. The results of the parallel N-gram transformation algorithm indicate that parallel programming with large datasets is promising and can be improved further.
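
A minimal sequential sketch of an N-gram inverted index for sequence retrieval is given below. The sequence ids, the toy protein strings, and N = 3 are assumptions chosen for readability (the abstract reports N = 5 and 6). The authors' transformation step and parallel construction are not shown.

```python
from collections import defaultdict

def build_ngram_index(sequences, n):
    """Map every length-n substring to the set of sequence ids that contain it."""
    index = defaultdict(set)
    for seq_id, seq in sequences.items():
        for i in range(len(seq) - n + 1):
            index[seq[i:i + n]].add(seq_id)
    return index

def query(index, sequences, pattern, n):
    """Return sorted ids of sequences containing pattern, using the index as a filter."""
    grams = [pattern[i:i + n] for i in range(len(pattern) - n + 1)]
    if not grams:  # pattern shorter than n: fall back to a full scan
        return sorted(sid for sid, seq in sequences.items() if pattern in seq)
    # Candidates must contain every n-gram of the pattern.
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Verify candidates, since sharing all n-grams does not guarantee a match.
    return sorted(sid for sid in candidates if pattern in sequences[sid])

# Hypothetical protein fragments.
sequences = {"p1": "MKTAYIAKQR", "p2": "MKTLLVAKQR", "p3": "GGAYIAKAA"}
index = build_ngram_index(sequences, n=3)
hits = query(index, sequences, "AYIAK", n=3)
```

Because each sequence contributes to the index independently, the build loop is a natural candidate for the kind of parallelization the abstract describes, for example by indexing partitions separately and merging the posting sets.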

Keywords: Biological sequence, Database index, N-gram indexing, Parallel computing, Sequence retrieval.

Downloads: 2081

42 Long Term Variability of Temperature in Armenia in the Context of Climate Change

Authors: Hrachuhi Galstyan, Lucian Sfîcă, Pavel Ichim

Abstract:

The purpose of this study is to analyze the temporal and spatial variability of thermal conditions in the Republic of Armenia. The paper describes annual fluctuations in air temperature. The research focuses on Armenia and its surrounding areas as a case study region, where long-term measurements and observations of weather conditions have been performed by the National Meteorological Service of Armenia. The study contains yearly air temperature data recorded between 1961 and 2012. The Mann-Kendall test and the autocorrelation function were applied to detect the trend in annual mean temperature, along with other parametric and non-parametric tests that search for breaks in the long-term evolution of temperature. The analysis of all records reveals a tendency mostly towards warmer years, with increased temperatures especially in valleys and inner basins. The maximum temperature increase is up to 1.5 °C. Negative results have not been observed in Armenia. The patterns of temperature change have been observed since the 1990s over much of the Armenian territory. The climate in Armenia was influenced by global change in the last two decades, as the methods employed in the study indicate.
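
The Mann-Kendall trend test applied in the study can be sketched as follows. This version computes the S statistic and the normal-approximation Z score without the tie correction, which is a simplification. The temperature series is made up for illustration and is not data from the paper.

```python
import math

def mann_kendall(series):
    """Return (S, Z) for the Mann-Kendall trend test, without tie correction."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each later-minus-earlier pair
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z

# Hypothetical annual mean temperatures in degrees Celsius.
temps = [11.2, 11.4, 11.3, 11.8, 11.9, 12.1, 12.0, 12.4]
s, z = mann_kendall(temps)
```

A positive S indicates that later values tend to be larger than earlier ones. Comparing |Z| with a standard normal quantile (for example 1.96 at the 5% level) gives the significance of the trend.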

Keywords: Air temperature, long-term variability, trend, climate change.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2151

41 On Pattern-Based Programming towards the Discovery of Frequent Patterns

Authors: Kittisak Kerdprasop, Nittaya Kerdprasop

Abstract:

The problem of frequent pattern discovery is defined as the process of searching for patterns, such as sets of features or items, that appear frequently in data. Finding such frequent patterns has become an important data mining task because it reveals associations, correlations, and many other interesting relationships hidden in a database. Most of the proposed frequent pattern mining algorithms have been implemented in imperative programming languages. Such a paradigm is inefficient when the set of patterns is large and the frequent patterns are long. We suggest applying a high-level declarative style of programming to the problem of frequent pattern discovery. We consider two languages: Haskell and Prolog. Our intuition is that finding frequent patterns should be efficiently and concisely implementable in a declarative paradigm, since pattern matching is a fundamental feature supported by most functional languages and by Prolog. Our frequent pattern mining implementation in Haskell and Prolog confirms our hypothesis about the conciseness of the program. Comparative performance studies of declarative versus imperative programming, covering lines of code, speed, and memory usage, are reported in the paper.
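To make the task concrete, the following is a small level-wise (Apriori-style) frequent itemset search. It is written in Python for consistency with the other examples on this page, not in the Haskell or Prolog the authors use, and it is only an illustrative sketch: the function name `frequent_itemsets` and the absolute `min_support` count are assumptions, and the abstract does not say which mining algorithm the paper implements.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for all itemsets meeting min_support.

    `min_support` is an absolute count of transactions (an assumption here).
    """
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support]
    result = {}
    k = 1
    while current:
        for itemset in current:
            result[itemset] = support(itemset)
        k += 1
        current_set = set(current)
        # Join frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Keep a candidate only if all its (k-1)-subsets are frequent
        # and the candidate itself meets the support threshold.
        current = [c for c in candidates
                   if all(frozenset(sub) in current_set
                          for sub in combinations(c, k - 1))
                   and support(c) >= min_support]
    return result
```

The pruning step relies on the fact that every subset of a frequent itemset must itself be frequent, so a candidate with any infrequent subset can be discarded without counting it.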

Keywords: Frequent pattern mining, functional programming, pattern matching, logic programming.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1293

40 Cluster Algorithm for Genetic Diversity

Authors: Manpreet Singh, Keerat Kaur, Bhavdeep Singh

Abstract:

As hardware technology advances, the cost of storage is decreasing. Thus there is an urgent need for new techniques and tools that can intelligently and automatically assist us in transforming this data into useful knowledge. Various data mining techniques have been developed that help in handling these large databases [7]. Data mining is also finding a role in the field of biotechnology. Pedigree means the associated ancestry of a crop variety. Genetic diversity is the variation in the genetic composition of individuals within or among species. Genetic diversity depends upon the pedigree information of the varieties. Parents at lower hierarchic levels carry more weight in predicting genetic diversity than those at upper hierarchic levels; the weight decreases as the level increases. For crossbreeding, the two varieties should be as genetically diverse as possible, so that the useful characters of both are incorporated in the newly developed variety. This paper discusses searching and analyzing the possible pairs of varieties, selected on the basis of morphological characters, climatic conditions, and nutrients, so as to obtain the optimal pair that can produce the required crossbreed variety. An algorithm was developed to determine the genetic diversity between the selected wheat varieties. Cluster analysis is used for retrieving the results.

Keywords: Genetic diversity, pedigree, nutrients.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1757

39 Grouping and Indexing Color Features for Efficient Image Retrieval

Authors: M. V. Sudhamani, C. R. Venugopal

Abstract:

Content-based Image Retrieval (CBIR) aims at searching image databases for specific images that are similar to a given query image, based on matching features derived from the image content. This paper focuses on a low-dimensional, color-based indexing technique for achieving efficient and effective retrieval performance. In our approach, the color features are extracted using the mean shift algorithm, a robust clustering technique. The cluster (region) mode is then used as the representative of the image in 3-D color space. The feature descriptor consists of the representative color of a region and is indexed using a spatial indexing method based on the R*-tree, thus avoiding the high-dimensional indexing problems associated with the traditional color histogram. Alternatively, the images in the database are clustered based on region feature similarity using Euclidean distance. Only the representative (centroid) features of these clusters are indexed using the R*-tree, thus improving efficiency. For similarity retrieval, each representative color in the query image or region is used independently to find regions containing that color. The results of these methods are compared. A Java-based query engine supporting query-by-example is built to retrieve images by color.

Keywords: Content-based, indexing, cluster, region.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1770

38 Intelligent Mobile Search Oriented to Global e-Commerce

Authors: Abdelkader Dekdouk

Abstract:

In this paper we propose a novel approach for searching eCommerce products using a mobile phone, illustrated by a prototype, eCoMobile. This approach aims to globalize mobile search by integrating the concept of user multilingualism into it. To demonstrate this, we deal specifically with the English and Arabic languages. The mobile user can formulate a query on a commercial product in either language (English or Arabic). The description of the user's information need relies on an ontology that represents the conceptualization of the product catalogue knowledge domain, defined in both English and Arabic. A query expressed on a mobile device client defines the concept that corresponds to the name of the product, followed by a set of (property, value) pairs specifying the characteristics of the product. Once a query is submitted, it is communicated to the server side, which analyzes it and in turn performs an HTTP request to an eCommerce application server (such as Amazon). The latter responds by returning an XML file representing a set of elements, where each element defines an item of the searched product with its specific characteristics. The XML file is analyzed on the server side, and the items are then displayed on the mobile device client along with their relevant characteristics in the chosen language.

Keywords: Mobile computing, search engine, multilingual global eCommerce, ontology, XML.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2044

37 Fuzzy C-Means Clustering for Biomedical Documents Using Ontology Based Indexing and Semantic Annotation

Authors: S. Logeswari, K. Premalatha

Abstract:

Search is the most obvious application of information retrieval. The variety of widely obtainable biomedical data is enormous and is expanding fast. This expansion makes existing techniques insufficient for extracting the most interesting patterns from the collection according to user requirements. Recent research concentrates more on semantic-based searching than on traditional term-based searches. Algorithms for semantic search are implemented based on the relations that exist between the words of the documents. Ontologies are used as domain knowledge for identifying the semantic relations as well as for structuring the data for effective information retrieval. Annotation of data with ontology concepts is one of the most widespread practices for clustering documents. In this paper, concept-based indexing and annotation are proposed for clustering biomedical documents. The fuzzy c-means (FCM) clustering algorithm is used to cluster the documents. The performance of the proposed methods is compared with traditional term-based clustering on PubMed articles from five different disease communities. The experimental results show that the proposed methods outperform term-based fuzzy clustering.
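The fuzzy c-means step named in this abstract can be sketched in a few lines of Python. This is the textbook algorithm on plain numeric vectors, offered as an assumption-laden illustration: the function names are hypothetical, the initial centers are supplied by the caller, and the ontology-based concept indexing that would produce the document vectors in the paper is not shown.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Membership of each point in each cluster; every row sums to 1."""
    rows = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # The point sits exactly on one or more centers: share it among them.
            zeros = dists.count(0.0)
            rows.append([1.0 / zeros if d == 0.0 else 0.0 for d in dists])
        else:
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            rows.append([v / total for v in inv])
    return rows


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    dim = len(points[0])
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        u = fcm_memberships(points, centers, m)
        new_centers = []
        for j, old in enumerate(centers):
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(old)  # no point supports this center; keep it
                continue
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)))
        centers = new_centers
    return centers, fcm_memberships(points, centers, m)
```

Unlike hard clustering, each document keeps a graded membership in every cluster, which suits articles that touch on more than one disease community.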

Keywords: MeSH Ontology, Concept Indexing, Annotation, semantic relations, Fuzzy c-means.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2255

36 Searching for Forensic Evidence in a Compromised Virtual Web Server against SQL Injection Attacks and PHP Web Shell

Authors: Gigih Supriyatno

Abstract:

SQL injection is one of the most common types of attack and has a critical impact on web servers. In the worst case, an attacker can perform post-exploitation after a successful SQL injection attack. In web server forensics, server analysis is closely related to log file analysis. However, large file sizes and differing log types sometimes make it difficult for investigators to find traces of attackers on the server. The purpose of this paper is to help investigators take appropriate steps when a web server has been attacked. We use attack scenarios based on SQL injection, including PHP backdoor injection as post-exploitation. We perform post-mortem analysis of web server logs based on the Hypertext Transfer Protocol (HTTP) POST and HTTP GET methods that are characteristic of SQL injection attacks. In addition, we propose a structured analysis method that spans the web server application log file, the database application, and other additional logs that exist on the web server. This method gives the investigator a more structured way to analyze the log files, so as to produce evidence of the attack within an acceptable time. Other attack techniques may also be detectable with this method. It can additionally help web administrators prepare their systems for forensic readiness.
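As a hedged illustration of the kind of log triage this abstract describes, the sketch below scans access log lines for request targets that look like SQL injection attempts. Everything specific in it is an assumption rather than the paper's method: the Common Log Format style request field, the small pattern list in `SQLI_RE`, and the function name `suspicious_requests` are hypothetical, and real investigations would need a much broader rule set. POST bodies usually do not appear in access logs, so only the request line is inspected here.

```python
import re
from urllib.parse import unquote_plus

# Request field as it appears in Common Log Format style lines (assumed format).
REQUEST_RE = re.compile(r'"(?P<method>GET|POST) (?P<target>\S+) HTTP/[\d.]+"')

# A few illustrative SQL injection indicators; deliberately not exhaustive.
SQLI_RE = re.compile(
    r"(union\s+select|\bor\s+1\s*=\s*1\b|information_schema|sleep\s*\(|into\s+outfile)",
    re.IGNORECASE,
)


def suspicious_requests(log_lines):
    """Return (line number, method, decoded target) for lines matching SQLI_RE."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        match = REQUEST_RE.search(line)
        if not match:
            continue
        # Decode %XX escapes and '+' so encoded payloads are matched too.
        target = unquote_plus(match.group("target"))
        if SQLI_RE.search(target):
            hits.append((number, match.group("method"), target))
    return hits
```

The flagged line numbers give an investigator starting points that can then be correlated with database and application logs from the same time window.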

Keywords: Web forensic, SQL injection, web shell, investigation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1193

35 Optimizing Spatial Trend Detection By Artificial Immune Systems

Authors: M. Derakhshanfar, B. Minaei-Bidgoli

Abstract:

Spatial trends are among the most valuable patterns in geo databases. They play an important role in data analysis and knowledge discovery from spatial data. A spatial trend is a regular change of one or more non-spatial attributes when moving spatially away from a start object. Spatial trend detection is a graph search problem; therefore, heuristic methods can be a good solution. The artificial immune system (AIS) is a method for searching and optimizing, a novel evolutionary paradigm inspired by the biological immune system. Models based on immune system principles, such as the clonal selection theory, the immune network model, and the negative selection algorithm, have been finding increasing applications in science and engineering. In this paper, we develop a novel immunological algorithm based on the clonal selection algorithm (CSA) for spatial trend detection. We create a neighborhood graph and neighborhood paths, then select the spatial trends whose affinity for the antibody is high. In an evolutionary process driven by the artificial immune algorithm, the affinity of low-affinity trends is increased through mutation until the stop condition is satisfied.

Keywords: Spatial Data Mining, Spatial Trend Detection, Heuristic Methods, Artificial Immune System, Clonal Selection Algorithm (CSA)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2007