Search results for: algorithm techniques
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9577


9127 Framework for Detecting External Plagiarism from Monolingual Documents: Use of Shallow NLP and N-Gram Frequency Comparison

Authors: Saugata Bose, Ritambhra Korpal

Abstract:

The internet has increased copy-paste behaviour among students and researchers, leading to documents with different levels of plagiarism. For this reason, much research focuses on detecting plagiarism automatically. In this paper, an initiative is discussed in which Natural Language Processing (NLP) techniques and supervised machine learning algorithms are combined to detect plagiarized texts. The major emphasis is on constructing a framework that successfully detects external plagiarism in monolingual texts. To this end, an n-gram frequency comparison approach has been implemented to construct the model framework. The framework is based on 120 characteristics extracted while pre-processing the documents using an NLP approach. Afterwards, filter metrics are applied to select the most relevant characteristics, and a supervised classification learning algorithm is then used to classify the documents into four levels of plagiarism. A confusion matrix was built to estimate the false positives and false negatives. Our plagiarism framework achieved a very high accuracy score.
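To make the idea of n-gram frequency comparison concrete, here is a minimal sketch. It is not the authors' framework: the function names, the trigram default, and the containment measure are assumptions chosen for illustration. It counts word n-grams in two texts and reports what fraction of the suspicious text's n-grams also occur in the source.

```python
from collections import Counter
import re


def word_ngrams(text: str, n: int = 3) -> Counter:
    """Count word n-grams in a lower-cased text with punctuation removed."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def containment(suspicious: str, source: str, n: int = 3) -> float:
    """Fraction of the suspicious text's n-grams that also occur in the source.

    Hypothetical helper: a simple containment score, used here only to
    illustrate frequency comparison between two documents.
    """
    susp, src = word_ngrams(suspicious, n), word_ngrams(source, n)
    total = sum(susp.values())
    if total == 0:
        return 0.0
    shared = sum((susp & src).values())  # multiset intersection keeps min counts
    return shared / total
```

A score like this could serve as one of many features fed to a classifier; the abstract does not say which features or thresholds the authors actually used.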

Keywords: lexical matching, shallow NLP, supervised machine learning algorithm, word n-gram

Procedia PDF Downloads 342
9126 Cost Sensitive Feature Selection in Decision-Theoretic Rough Set Models for Customer Churn Prediction: The Case of Telecommunication Sector Customers

Authors: Emel Kızılkaya Aydogan, Mihrimah Ozmen, Yılmaz Delice

Abstract:

Recently, the telecommunications sector has undergone change and ongoing development in the global market. In this sector, churn analysis techniques are commonly used to analyse why some customers terminate their service subscriptions prematurely. Customer churn is of utmost significance in this sector since it causes important business losses. Many companies conduct research in order to prevent losses while increasing customer loyalty. Although a large quantity of accumulated data is available in this sector, its usefulness is limited by data quality and relevance. In this paper, a cost-sensitive feature selection framework is developed with the aim of obtaining feature reducts for predicting customer churn. The framework is a cost-based optional pre-processing stage that removes redundant features for churn management. In addition, this cost-based feature selection algorithm is applied at a telecommunication company in Turkey, and the results obtained with this algorithm are presented.

Keywords: churn prediction, data mining, decision-theoretic rough set, feature selection

Procedia PDF Downloads 428
9125 An Accurate Method for Phylogeny Tree Reconstruction Based on a Modified Wild Dog Algorithm

Authors: Essam Al Daoud

Abstract:

This study solves a phylogeny problem by using modified wild dog pack optimization. The least squares error is considered as a cost function that needs to be minimized. Therefore, in each iteration, new distance matrices based on the constructed trees are calculated and used to select the alpha dog. To test the suggested algorithm, ten homologous genes are selected and collected from National Center for Biotechnology Information (NCBI) databanks (i.e., 16S, 18S, 28S, Cox 1, ITS1, ITS2, ETS, ATPB, Hsp90, and STN). The data are divided into three categories: 50 taxa, 100 taxa and 500 taxa. The empirical results show that the proposed algorithm is more reliable and accurate than other implemented methods.

Keywords: least square, neighbor joining, phylogenetic tree, wild dog pack

Procedia PDF Downloads 308
9124 Identification of Soft Faults in Branched Wire Networks by Distributed Reflectometry and Multi-Objective Genetic Algorithm

Authors: Soumaya Sallem, Marc Olivas

Abstract:

This contribution presents a method for detecting, locating, and characterizing soft faults in a complex wired network. The proposed method is based on Multi-Carrier Time Domain Reflectometry (MCTDR) combined with a multi-objective genetic algorithm. In order to ensure complete network coverage and eliminate diagnosis ambiguities, the MCTDR test signal is injected at several points on the network, and the data is merged between different reflectometers (sensors) distributed on the network. An adapted multi-objective genetic algorithm is used to merge the data in order to obtain more accurate fault location and characterization. The performance of the proposed method is evaluated using numerical and experimental results.

Keywords: wired network, reflectometry, network distributed diagnosis, multi-objective genetic algorithm

Procedia PDF Downloads 178
9123 Estimating Air Particulate Matter 10 Using Satellite Data and Analyzing Its Annual Temporal Pattern over Gaza Strip, Palestine

Authors: Abdallah A. A. Shaheen

Abstract:

The Gaza Strip faces economic and political issues such as conflict, siege and urbanization, all of which have led to an increase in air pollution over the Gaza Strip. In this study, the particulate matter 10 (PM10) concentration over the Gaza Strip has been estimated from Landsat Thematic Mapper (TM) and Landsat Enhanced Thematic Mapper Plus (ETM+) data, based on a multispectral algorithm. Simultaneously, in-situ measurements of the corresponding particulate matter were acquired for the selected time period. Landsat and ground data for eleven years are used to develop the algorithm, while four years of data (2002, 2006, 2010 and 2014) have been used to validate its results. The developed algorithm gives the highest regression, with an R coefficient of 0.86, an RMSE of 9.71 µg/m³, and a P value of 0. Average validation of the algorithm shows that calculated PM10 correlates strongly with measured PM10, indicating high efficiency of the algorithm for mapping PM10 concentration during the years 2000 to 2014. Overall, the results show an increase in minimum, maximum and average yearly PM10 concentrations, with a similar trend over the urban area. The rate of urbanization has been evaluated by supervised classification of the Landsat image. Urban sprawl from 2000 to 2014 results in a high concentration of PM10 in the study area.

Keywords: PM10, landsat, atmospheric reflectance, Gaza strip, urbanization

Procedia PDF Downloads 236
9122 High Secure Data Hiding Using Cropping Image and Least Significant Bit Steganography

Authors: Khalid A. Al-Afandy, El-Sayyed El-Rabaie, Osama Salah, Ahmed El-Mhalaway

Abstract:

This paper presents a highly secure data hiding technique using image cropping and Least Significant Bit (LSB) steganography. Predefined secret coordinate crops are extracted from the cover image. The secret text message is divided into sections, and the number of sections equals the number of image crops. Each section of the secret text message is embedded into an image crop with a secret sequence using the LSB technique. The embedding is done using the cover image color channels. The stego image is obtained by reassembling the image and the stego crops. The results of the technique are compared with other state-of-the-art techniques. Evaluation is based on visual inspection to detect any degradation of the stego image, the difficulty of extracting the embedded data by any unauthorized viewer, the Peak Signal-to-Noise Ratio (PSNR) of the stego image, and the CPU time of the embedding algorithm. Experimental results indicate that the proposed technique is more secure than other traditional techniques.
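As an illustration of the LSB step only, the sketch below hides a message section in the least significant bits of a flat sequence of channel values taken from one crop. It is a minimal sketch under stated assumptions, not the authors' scheme: the function names are hypothetical, the crop is modelled as plain bytes, and the secret coordinates and secret embedding sequence described in the abstract are omitted (bits are written in natural order).

```python
def embed_lsb(pixels: bytes, message: bytes) -> bytearray:
    """Write the message bits (most significant bit first) into pixel LSBs."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message section does not fit in this crop")
    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the bit
    return stego


def extract_lsb(stego: bytes, n_bytes: int) -> bytes:
    """Rebuild n_bytes of message from the first 8 * n_bytes pixel LSBs."""
    out = bytearray()
    for k in range(n_bytes):
        value = 0
        for channel in stego[8 * k: 8 * k + 8]:
            value = (value << 1) | (channel & 1)
        out.append(value)
    return bytes(out)
```

Because only the lowest bit of each channel value changes, every value moves by at most 1, which is why LSB embedding tends to be visually hard to notice.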

Keywords: steganography, stego, LSB, crop

Procedia PDF Downloads 255
9121 Fast and Robust Long-term Tracking with Effective Searching Model

Authors: Thang V. Kieu, Long P. Nguyen

Abstract:

Kernelized Correlation Filter (KCF) based trackers have gained a lot of attention recently because of their accuracy and fast calculation speed. However, this algorithm is not robust in cases where the object is lost through a sudden change of direction, occlusion, or going out of view. In order to improve KCF performance in long-term tracking, this paper proposes an anomaly detection method that warns of target loss by analyzing the response map of each frame, and a classification algorithm based on Random fern for a reliable target re-locating mechanism. Tested on the Visual Tracker Benchmark and Visual Object Tracking datasets, the proposed algorithm achieved precision and success rates 2.92 and 2.61 times higher than those of the original KCF algorithm, respectively. Moreover, the proposed tracker handles occlusion better than many state-of-the-art long-term tracking methods while running at 60 frames per second.

Keywords: correlation filter, long-term tracking, random fern, real-time tracking

Procedia PDF Downloads 125
9120 Design an Development of an Agorithm for Prioritizing the Test Cases Using Neural Network as Classifier

Authors: Amit Verma, Simranjeet Kaur, Sandeep Kaur

Abstract:

Test Case Prioritization (TCP) has gained widespread acceptance as it often results in good-quality software free from defects. Due to the increasing rate of faults in software, traditional prioritization techniques result in increased cost and time. The main challenge in TCP is the difficulty of manually validating the priorities of different test cases due to the large size of test suites, and little emphasis has been placed on automating the TCP process. The objective of this paper is to detect the priorities of different test cases using an artificial neural network (ANN), which helps to predict the correct priorities with the help of the back-propagation algorithm. In our proposed work, one such method is implemented in which priorities are assigned to different test cases based on their frequency. After the priorities are assigned, the ANN predicts whether the correct priority has been assigned to every test case; otherwise, it generates an interrupt when a wrong priority is assigned. Classifiers are used to classify test cases of different priorities. The proposed algorithm is very effective as it reduces complexity with robust efficiency and automates the process of prioritizing test cases.

Keywords: test case prioritization, classification, artificial neural networks, TF-IDF

Procedia PDF Downloads 374
9119 A609 Modeling of AC Servomotor Using Genetic Algorithm and Tests for Control of a Robotic Joint

Authors: J. G. Batista, T. S. Santiago, E. A. Ribeiro, G. A. P. Thé

Abstract:

This work deals with parameter identification of permanent magnet motors, a class of AC motor that is particularly important in industrial automation owing to its high performance. These motors are very attractive for applications with limited space because they have reduced size and volume and can operate over a wide speed range without independent ventilation. Using experimental data and a genetic algorithm, we have been able to extract values for both the motor inductance and the electromechanical coupling constant, which are then compared to measured and/or expected values.

Keywords: modeling, AC servomotor, permanent magnet synchronous motor-PMSM, genetic algorithm, vector control, robotic manipulator, control

Procedia PDF Downloads 504
9118 Artificial Intelligence in Bioscience: The Next Frontier

Authors: Parthiban Srinivasan

Abstract:

With recent advances in computational power and access to enough data in biosciences, artificial intelligence methods are increasingly being used in drug discovery research. These methods are essentially a series of advanced statistics based exercises that review the past to indicate the likely future. Our goal is to develop a model that accurately predicts biological activity and toxicity parameters for novel compounds. We have compiled a robust library of over 150,000 chemical compounds with different pharmacological properties from literature and public domain databases. The compounds are stored in simplified molecular-input line-entry system (SMILES), a commonly used text encoding for organic molecules. We utilize an automated process to generate an array of numerical descriptors (features) for each molecule. Redundant and irrelevant descriptors are eliminated iteratively. Our prediction engine is based on a portfolio of machine learning algorithms. We found Random Forest algorithm to be a better choice for this analysis. We captured non-linear relationship in the data and formed a prediction model with reasonable accuracy by averaging across a large number of randomized decision trees. Our next step is to apply deep neural network (DNN) algorithm to predict the biological activity and toxicity properties. We expect the DNN algorithm to give better results and improve the accuracy of the prediction. This presentation will review all these prominent machine learning and deep learning methods, our implementation protocols and discuss these techniques for their usefulness in biomedical and health informatics.

Keywords: deep learning, drug discovery, health informatics, machine learning, toxicity prediction

Procedia PDF Downloads 345
9117 An Algorithm Based on the Nonlinear Filter Generator for Speech Encryption

Authors: A. Belmeguenai, K. Mansouri, R. Djemili

Abstract:

This work presents a new algorithm based on the nonlinear filter generator for speech encryption and decryption. The proposed algorithm uses a linear feedback shift register (LFSR) whose polynomial is primitive, together with a nonlinear Boolean function. The purpose of this system is to construct a keystream that has good statistical properties and is also easily computable on a machine with limited computing capacity. The proposed speech encryption scheme is very simple, highly efficient, and fast at performing speech encryption and decryption. We conclude the paper by showing that this system can resist certain known attacks.
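The structure of a nonlinear filter generator can be sketched at toy scale. The example below is an assumption-laden illustration, not the authors' design and not secure: it uses a 4-bit LFSR with the primitive polynomial x^4 + x + 1 and a hypothetical filter function f(s0, s1, s2, s3) = s0 XOR (s1 AND s2) XOR s3. The keystream is XORed with the data, so the same call both encrypts and decrypts.

```python
from typing import Iterator, List


def filter_keystream(state: List[int]) -> Iterator[int]:
    """Yield keystream bits from a 4-bit LFSR passed through a nonlinear filter.

    The LFSR follows s[t+4] = s[t] XOR s[t+1], i.e. the primitive polynomial
    x^4 + x + 1, so any non-zero state cycles with period 15.
    """
    s = list(state)
    if len(s) != 4 or not any(s):
        raise ValueError("need a non-zero 4-bit initial state")
    while True:
        # Hypothetical nonlinear Boolean filter applied to the register contents.
        yield s[0] ^ (s[1] & s[2]) ^ s[3]
        feedback = s[0] ^ s[1]
        s = s[1:] + [feedback]


def xor_cipher(data: bytes, state: List[int]) -> bytes:
    """Encrypt or decrypt by XORing each byte with 8 keystream bits."""
    bits = filter_keystream(state)
    out = bytearray()
    for byte in data:
        key_byte = 0
        for _ in range(8):
            key_byte = (key_byte << 1) | next(bits)
        out.append(byte ^ key_byte)
    return bytes(out)
```

A practical generator would use a much longer register and a carefully chosen filter function; the short period here is only meant to make the mechanism easy to trace by hand.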

Keywords: nonlinear filter generator, stream ciphers, speech encryption, security analysis

Procedia PDF Downloads 278
9116 Optimizing Network Latency with Fast Path Assignment for Incoming Flows

Authors: Qing Lyu, Hang Zhu

Abstract:

Various flows in the network need to pass through different types of middlebox. Improper middlebox placement and path assignment for flows can greatly increase network latency and decrease network performance. Minimizing the total end-to-end latency of all the flows requires assigning paths to the incoming flows. In this paper, the flow path assignment problem with regard to the placement of various kinds of middlebox is studied. The flow path assignment problem is formulated as a linear programming problem, which is very time-consuming to solve. A naive greedy algorithm is also studied, which is very fast but causes much more latency than the linear programming algorithm. Finally, the paper presents a heuristic algorithm named FPA, which takes bottleneck link information and estimated bandwidth occupancy into consideration and achieves near-optimal latency in much less time. Evaluation results validate the effectiveness of the proposed algorithm.

Keywords: flow path, latency, middlebox, network

Procedia PDF Downloads 192
9115 A Dynamic Ensemble Learning Approach for Online Anomaly Detection in Alibaba Datacenters

Authors: Wanyi Zhu, Xia Ming, Huafeng Wang, Junda Chen, Lu Liu, Jiangwei Jiang, Guohua Liu

Abstract:

Anomaly detection is a first and imperative step needed to respond to unexpected problems and to assure high performance and security in large data center management. This paper presents an online anomaly detection system through an innovative approach of ensemble machine learning and adaptive differentiation algorithms, and applies them to performance data collected from a continuous monitoring system for multi-tier web applications running in Alibaba data centers. We evaluate the effectiveness and efficiency of this algorithm with production traffic data and compare with the traditional anomaly detection approaches such as a static threshold and other deviation-based detection techniques. The experiment results show that our algorithm correctly identifies the unexpected performance variances of any running application, with an acceptable false positive rate. This proposed approach has already been deployed in real-time production environments to enhance the efficiency and stability in daily data center operations.

Keywords: Alibaba data centers, anomaly detection, big data computation, dynamic ensemble learning

Procedia PDF Downloads 181
9114 An Introduction to E-Content Producing Algorithm for Screen-Recorded Videos

Authors: Jamileh Darsareh, Mohammad Nikafrooz

Abstract:

Some teachers and e-content producers, based on their experiences, try to produce educational videos using screen recording software. There are many challenges that they may encounter while producing screen-recorded videos. These are in the domains of technical and pedagogical challenges like designing the roadmap, preparing the screen, setting the recording software and recording the screen, editing, etc. This study is a descriptive study and tries to present some procedures for producing acceptable and well-made videos. These procedures are presented in the form of an algorithm for producing screen-recorded video. This algorithm presents the main producing phases, including design, pre-production, production, post-production, and distribution. These phases consist of some steps which are supported by several technical and pedagogical considerations. Following these phases and steps according to the suggested order helps the producers to produce their intended and desired video by saving time and also facing fewer technical problems. It is expected that by using this algorithm, e-content producers and teachers gain better performance in producing educational videos.

Keywords: e-content producing algorithm, screen-recorded videos, screen recording software, technical and pedagogical considerations

Procedia PDF Downloads 183
9113 Performance Comparison of Prim’s and Ant Colony Optimization Algorithm to Select Shortest Path in Case of Link Failure

Authors: Rimmy Yadav, Avtar Singh

Abstract:

Ant Colony Optimization (ACO) is a promising modern approach to combinatorial optimization. Here, ACO is applied to finding the shortest path during communication link failure. In this paper, the performances of Prim's algorithm and the ACO algorithm are compared. By comparing time complexity and program execution time as the set of parameters, we demonstrate the good performance of ACO in finding an excellent solution to the shortest-path problem during communication link failure.

Keywords: ant colony optimization, link failure, prim’s algorithm, shortest path

Procedia PDF Downloads 380
9112 3D Reconstruction of Human Body Based on Gender Classification

Authors: Jiahe Liu, Hongyang Yu, Feng Qian, Miao Luo

Abstract:

SMPL-X was a powerful parametric human body model that included male, neutral, and female models, with significant gender differences between these three models. During the process of 3D human body reconstruction, the correct selection of standard templates was crucial for obtaining accurate results. To address this issue, we developed an efficient gender classification algorithm to automatically select the appropriate template for 3D human body reconstruction. The key to this gender classification algorithm was the precise analysis of human body features. By using the SMPL-X model, the algorithm could detect and identify gender features of the human body, thereby determining which standard template should be used. The accuracy of this algorithm made the 3D reconstruction process more accurate and reliable, as it could adjust model parameters based on individual gender differences. SMPL-X and the related gender classification algorithm have brought important advancements to the field of 3D human body reconstruction. By accurately selecting standard templates, they have improved the accuracy of reconstruction and have broad potential in various application fields. These technologies continue to drive the development of the 3D reconstruction field, providing us with more realistic and accurate human body models.

Keywords: gender classification, joint detection, SMPL-X, 3D reconstruction

Procedia PDF Downloads 51
9111 Comparative Performance of Artificial Bee Colony Based Algorithms for Wind-Thermal Unit Commitment

Authors: P. K. Singhal, R. Naresh, V. Sharma

Abstract:

This paper presents the three optimization models, namely New Binary Artificial Bee Colony (NBABC) algorithm, NBABC with Local Search (NBABC-LS), and NBABC with Genetic Crossover (NBABC-GC) for solving the Wind-Thermal Unit Commitment (WTUC) problem. The uncertain nature of the wind power is incorporated using the Weibull probability density function, which is used to calculate the overestimation and underestimation costs associated with the wind power fluctuation. The NBABC algorithm utilizes a mechanism based on the dissimilarity measure between binary strings for generating the binary solutions in WTUC problem. In NBABC algorithm, an intelligent scout bee phase is proposed that replaces the abandoned solution with the global best solution. The local search operator exploits the neighboring region of the current solutions, whereas the integration of genetic crossover with the NBABC algorithm increases the diversity in the search space and thus avoids the problem of local trappings encountered with the NBABC algorithm. These models are then used to decide the units on/off status, whereas the lambda iteration method is used to dispatch the hourly load demand among the committed units. The effectiveness of the proposed models is validated on an IEEE 10-unit thermal system combined with a wind farm over the planning period of 24 hours.

Keywords: artificial bee colony algorithm, economic dispatch, unit commitment, wind power

Procedia PDF Downloads 363
9110 Cooperative Spectrum Sensing Using Hybrid IWO/PSO Algorithm in Cognitive Radio Networks

Authors: Deepa Das, Susmita Das

Abstract:

Cognitive Radio (CR) is an emerging technology to combat the spectrum scarcity issues. This is achieved by consistently sensing the spectrum, and detecting the under-utilized frequency bands without causing undue interference to the primary user (PU). In soft decision fusion (SDF) based cooperative spectrum sensing, various evolutionary algorithms have been discussed, which optimize the weight coefficient vector for maximizing the detection performance. In this paper, we propose the hybrid invasive weed optimization and particle swarm optimization (IWO/PSO) algorithm as a fast and global optimization method, which improves the detection probability with a lesser sensing time. Then, the efficiency of this algorithm is compared with the standard invasive weed optimization (IWO), particle swarm optimization (PSO), genetic algorithm (GA) and other conventional SDF based methods on the basis of convergence and detection probability.

Keywords: cognitive radio, spectrum sensing, soft decision fusion, GA, PSO, IWO, hybrid IWO/PSO

Procedia PDF Downloads 447
9109 3D Human Body Reconstruction Based on Multiple Viewpoints

Authors: Jiahe Liu, Hongyang Yu, Feng Qian, Miao Luo

Abstract:

The aim of this study was to improve the effects of human body 3D reconstruction. The MvP algorithm was adopted to obtain key point information from multiple perspectives. This algorithm allowed the capture of human posture and joint positions from multiple angles, providing more comprehensive and accurate data. The study also incorporated the SMPL-X model, which has been widely used for human body modeling, to achieve more accurate 3D reconstruction results. The use of the MvP algorithm made it possible to observe the reconstructed object from multiple angles, thus reducing the problems of blind spots and missing information. This algorithm was able to effectively capture key point information, including the position and rotation angle of limbs, providing key data for subsequent 3D reconstruction. Compared with traditional single-view methods, the method of multi-view fusion significantly improved the accuracy and stability of reconstruction. By combining the MvP algorithm with the SMPL-X model, we successfully achieved better human body 3D reconstruction effects. The SMPL-X model is highly scalable and can generate highly realistic 3D human body models, thus providing more detail and shape information.

Keywords: 3D human reconstruction, multi-view, joint point, SMPL-X

Procedia PDF Downloads 50
9108 Fuzzy Population-Based Meta-Heuristic Approaches for Attribute Reduction in Rough Set Theory

Authors: Mafarja Majdi, Salwani Abdullah, Najmeh S. Jaddi

Abstract:

One of the global combinatorial optimization problems in machine learning is feature selection. It is concerned with removing irrelevant, noisy, and redundant data while preserving the meaning of the original data. Attribute reduction in rough set theory is an important feature selection method. Since attribute reduction is an NP-hard problem, it is necessary to investigate fast and effective approximate algorithms. In this paper, we propose two feature selection mechanisms based on memetic algorithms (MAs), which combine the genetic algorithm with a fuzzy record-to-record travel algorithm and a fuzzy-controlled great deluge algorithm to strike a good balance between local search and genetic search. To verify the proposed approaches, numerical experiments are carried out on thirteen datasets. The results show that the MA approaches are efficient in solving attribute reduction problems when compared with other meta-heuristic approaches.

Keywords: rough set theory, attribute reduction, fuzzy logic, memetic algorithms, record to record algorithm, great deluge algorithm

Procedia PDF Downloads 436
9107 A Hybrid Genetic Algorithm for Assembly Line Balancing In Automotive Sector

Authors: Qazi Salman Khalid, Muhammad Khalid, Shahid Maqsood

Abstract:

This paper presents a solution for optimizing the cycle time in an assembly line with human-robot collaboration and diverse operators. A genetic algorithm with tailored parameters is used to address the assembly line balancing problem in the automobile sector. A mathematical model is developed, depicting the problem. Currently, the firm runs on the largest candidate rule; however, it causes a lag in orders, which ultimately gets penalized. The results of the study show that the proposed GA is effective in providing efficient solutions and that the cycle time has significantly impacted productivity.

Keywords: line balancing, cycle time, genetic algorithm, productivity

Procedia PDF Downloads 121
9106 Real-Time Detection of Space Manipulator Self-Collision

Authors: Zhang Xiaodong, Tang Zixin, Liu Xin

Abstract:

In order to avoid self-collision of space manipulators during operation process, a real-time detection method is proposed in this paper. The manipulator is fitted into a cylinder enveloping surface, and then the detection algorithm of collision between cylinders is analyzed. The collision model of space manipulator self-links can be detected by using this algorithm in real-time detection during the operation process. To ensure security of the operation, a safety threshold is designed. The simulation and experiment results verify the effectiveness of the proposed algorithm for a 7-DOF space manipulator.

Keywords: space manipulator, collision detection, self-collision, the real-time collision detection

Procedia PDF Downloads 447
9105 A Comparative Study between Different Techniques of Off-Page and On-Page Search Engine Optimization

Authors: Ahmed Ishtiaq, Maeeda Khalid, Umair Sajjad

Abstract:

In the fast-moving world, information is the key to success. If information is easily available, then it makes work easy. The Internet is the biggest collection and source of information nowadays, and with every single day, the data on internet increases, and it becomes difficult to find required data. Everyone wants to make his/her website at the top of search results. This can be possible when you have applied some techniques of SEO inside your application or outside your application, which are two types of SEO, onsite and offsite SEO. SEO is an abbreviation of Search Engine Optimization, and it is a set of techniques, methods to increase users of a website on World Wide Web or to rank up your website in search engine indexing. In this paper, we have compared different techniques of Onpage and Offpage SEO, and we have suggested many things that should be changed inside webpage, outside web page and mentioned some most powerful and search engine considerable elements and techniques in both types of SEO in order to gain high ranking on Search Engine.

Keywords: auto-suggestion, search engine optimization, SEO, query, web mining, web crawler

Procedia PDF Downloads 130
9104 Examining the Performance of Three Multiobjective Evolutionary Algorithms Based on Benchmarking Problems

Authors: Konstantinos Metaxiotis, Konstantinos Liagkouras

Abstract:

The objective of this study is to examine the performance of three well-known multiobjective evolutionary algorithms for solving optimization problems. The first algorithm is the Non-dominated Sorting Genetic Algorithm-II (NSGA-II), the second one is the Strength Pareto Evolutionary Algorithm 2 (SPEA-2), and the third one is the Multiobjective Evolutionary Algorithms based on decomposition (MOEA/D). The examined multiobjective algorithms are analyzed and tested on the ZDT set of test functions by three performance metrics. The results indicate that the NSGA-II performs better than the other two algorithms based on three performance metrics.
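All three algorithms named above build on the notion of Pareto dominance. The sketch below shows that shared building block only, assuming every objective is minimized; the function names are hypothetical and it is not an implementation of NSGA-II, SPEA-2 or MOEA/D.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Return the points that no other point dominates (the first Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, among the objective vectors (1, 5), (2, 2), (3, 3) and (5, 1), only (3, 3) is dominated, by (2, 2), so the other three form the front.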

Keywords: MOEAs, multiobjective optimization, ZDT test functions, evolutionary algorithms

Procedia PDF Downloads 450
9103 Incorporating Information Gain in Regular Expressions Based Classifiers

Authors: Rosa L. Figueroa, Christopher A. Flores, Qing Zeng-Treitler

Abstract:

A regular expression consists of a sequence of characters that describes a text pattern. In clinical research, regular expressions are usually created manually by programmers together with domain experts. Lately, there have been several efforts to investigate how to generate them automatically. This article presents a text classification algorithm based on regexes. The algorithm, named REX, was designed and then implemented as a simplified method to create regexes that classify Spanish text automatically. In order to classify ambiguous cases, such as when multiple labels are assigned to a testing example, REX includes an information gain method. Two sets of data were used to evaluate the algorithm's effectiveness in clinical text classification tasks. The results indicate that the regular expression based classifier proposed in this work performs statistically better in terms of accuracy and F-measure than Support Vector Machine and Naïve Bayes for both datasets.
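Information gain itself is a standard quantity and can be sketched briefly. The example below is an illustration under assumptions, not the REX implementation: it treats "does this regex match the example?" as a binary feature and measures how much knowing the answer reduces the entropy of the class labels. The function names and the boolean `matches` input are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(labels: List[str], matches: List[bool]) -> float:
    """Entropy reduction from splitting the examples by whether a regex matched."""
    matched = [lab for lab, m in zip(labels, matches) if m]
    unmatched = [lab for lab, m in zip(labels, matches) if not m]
    remainder = sum(
        len(part) / len(labels) * entropy(part)
        for part in (matched, unmatched)
        if part
    )
    return entropy(labels) - remainder
```

A regex that matches exactly the examples of one class has the maximum gain for that split, while one that matches both classes equally often has a gain of zero, which is one way candidate regexes could be ranked when labels conflict.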

Keywords: information gain, regular expressions, smith-waterman algorithm, text classification

Procedia PDF Downloads 302
9102 Hybridized Simulated Annealing with Chemical Reaction Optimization for Solving to Sequence Alignment Problem

Authors: Ernesto Linan, Linda Cruz, Lucero Becerra

Abstract:

In this paper, a new hybridized algorithm based on Chemical Reaction Optimization and Simulated Annealing is proposed to solve the Sequence Alignment Problem. Chemical Reaction Optimization is a population-based meta-heuristic algorithm based on the principles of a chemical reaction. Simulated Annealing is a general-purpose method applied to a large number of combinatorial optimization problems. We propose a hybridization of the Chemical Reaction Optimization algorithm and Simulated Annealing in order to solve the Sequence Alignment Problem. An initial population of molecules is defined at the beginning of the proposed algorithm, where each molecule represents a sequence alignment problem. In order to simulate inter-molecule collisions, the Chemical Reaction process is placed inside the Metropolis cycle at certain temperature values. Inside this cycle, molecules change due to collisions, and some molecules are accepted by applying the Boltzmann probability. The results with the hybrid scheme are better than the results obtained with each method separately.
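The Boltzmann acceptance step used inside a Metropolis cycle is standard in Simulated Annealing and can be sketched on its own. The function below is a generic illustration, not the authors' hybrid: `delta` is assumed to be the cost increase of a candidate (for a minimization problem), `temperature` is assumed to be positive, and the function name is hypothetical.

```python
import math
import random


def metropolis_accept(delta: float, temperature: float, rng: random.Random) -> bool:
    """Accept improvements always; accept worse moves with probability exp(-delta / T)."""
    if delta <= 0:
        return True  # the candidate is no worse than the current solution
    return rng.random() < math.exp(-delta / temperature)
```

At high temperature most worsening moves are accepted, which encourages exploration; as the temperature falls the acceptance probability shrinks and the search settles toward improving moves only.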

Keywords: chemical reaction optimization, sequence alignment problem, simulated annealing algorithm, metaheuristics

Procedia PDF Downloads 197
9101 A Comparative Study of Virus Detection Techniques

Authors: Sulaiman Al amro, Ali Alkhalifah

Abstract:

The growing number of computer viruses and the detection of zero-day malware have concerned security researchers for a long time. Existing antivirus products (AVs) rely on detecting virus signatures, which does not fully solve the problems associated with these viruses. The use of logic formulae to model the behaviour of viruses is one of the most encouraging recent developments in virus research and provides an alternative to classic virus detection methods. In this paper, we present a comparative study of different virus detection techniques and outline the advantages and drawbacks of each. The techniques are then compared to discuss which is more effective at detecting computer viruses.

Keywords: computer viruses, virus detection, signature-based, behaviour-based, heuristic-based

Procedia PDF Downloads 457
9100 Evolution under Length Constraints for Convolutional Neural Networks Architecture Design

Authors: Ousmane Youme, Jean Marie Dembele, Eugene Ezin, Christophe Cambier

Abstract:

In recent years, convolutional neural network (CNN) architectures designed by evolution algorithms have proven to be competitive with handcrafted architectures designed by experts. However, these algorithms need a lot of computational power, which is beyond the capabilities of most researchers and engineers. To overcome this problem, we propose an evolution architecture under length constraints. It consists of two algorithms: a search length strategy to find an optimal space and a search architecture strategy based on a genetic algorithm to find the best individual in the optimal space. Our algorithms drastically reduce resource costs while maintaining good performance. On the CIFAR-10 dataset, our framework achieves an error rate of 5.12% and needs only 4.6 GPU days to converge to the optimal individual, 22 GPU days less than the lowest-cost automatic evolutionary algorithm in the peer competition.

Keywords: CNN architecture, genetic algorithm, evolution algorithm, length constraints

Procedia PDF Downloads 111
9099 Performativity and Valuation Techniques: Evidence from Investment Banks in the Wake of the Global Financial Crisis

Authors: Alicja Reuben, Amira Annabi

Abstract:

In this paper, we explore the relationship between the selection of valuation techniques by investment banks and the banks’ risk perceptions and performance in the context of the theory of performativity. We use inferential statistics to study these relationships by building a unique dataset based on the disclosure of 12 investment banks’ 2012-2015 annual financial statements. Moreover, we create two constructs, namely intensity of use and risk perception. We measure the intensity of use as a frequency metric of how often a particular bank adopts valuation techniques for a particular asset or liability. We measure risk perception based on disclosed ranges of values for unobservable inputs. Our results are twofold: we find a significant negative correlation between (1) intensity of use and investment bank performance and (2) intensity of use and risk perception. These results indicate that a performative process takes place, and the valuation techniques are enacting their environment.

Keywords: language, linguistics, performativity, financial techniques

Procedia PDF Downloads 146
9098 LEDs Based Indoor Positioning by Distances Derivation from Lambertian Illumination Model

Authors: Yan-Ren Chen, Jenn-Kaie Lain

Abstract:

This paper proposes a novel indoor positioning algorithm based on visible light communications, implemented with light-emitting diode fixtures. In the proposed positioning algorithm, the distances between the light-emitting diode fixtures and the mobile terminal are derived under the assumption of an ideal Lambertian optical radiation model, and trilateration is then applied to obtain the coordinates of the mobile terminal. Because the proposed algorithm obtains distance information directly from the optical signal model, the statistical distribution of received signal strength at different positions in the interior space does not need to be established in advance. Simulation results show that the proposed indoor positioning algorithm can provide accurate location coordinate estimates.
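
The two steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' derivation: the LEDs are assumed to point straight down and the receiver straight up at a known vertical separation `height`, so that both angle cosines equal height / d, and the line-of-sight received power is taken as P_rx = P_tx (m + 1) A height^(m+1) / (2 pi d^(m+3)), with optical filter and concentrator gains set to 1. The function names and default parameter values are hypothetical.

```python
import math


def lambertian_distance(p_rx, p_tx, height, m=1.0, area=1e-4):
    """Invert the assumed line-of-sight Lambertian model for the distance d.

    p_rx, p_tx : received and transmitted optical power (same units)
    height     : vertical LED-to-receiver separation in metres
    m          : Lambertian order of the LED
    area       : photodetector area in square metres
    """
    k = p_tx * (m + 1) * area * height ** (m + 1) / (2 * math.pi)
    return (k / p_rx) ** (1.0 / (m + 3))


def trilaterate(anchors, ranges):
    """Solve for a 2D position from three anchors and horizontal ranges.

    Subtracting the first circle equation from the other two leaves a
    2x2 linear system, solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = ranges
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y
```

Each estimated distance d is converted to a horizontal range with sqrt(d**2 - height**2) before calling `trilaterate`. With noisy measurements or more than three fixtures, a least-squares solve would be the usual replacement for the exact 2x2 system.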

Keywords: indoor positioning, received signal strength, trilateration, visible light communications

Procedia PDF Downloads 400