Search results for: feature extraction
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1448

Search results for: feature extraction

938 Integration of Educational Data Mining Models to a Web-Based Support System for Predicting High School Student Performance

Authors: Sokkhey Phauk, Takeo Okazaki

Abstract:

The challenging task in educational institutions is to maximize the high performance of students and minimize the failure rate of poor-performing students. An effective method to leverage this task is to know student learning patterns with highly influencing factors and get an early prediction of student learning outcomes at the timely stage for setting up policies for improvement. Educational data mining (EDM) is an emerging disciplinary field of data mining, statistics, and machine learning concerned with extracting useful knowledge and information for the sake of improvement and development in the education environment. The study is of this work is to propose techniques in EDM and integrate it into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, high performing models are developed to get higher performance. The hybrid random forest (Hybrid RF) produces the most successful classification. For the context of intervention and improving the learning outcomes, a feature selection method MICHI, which is the combination of mutual information (MI) and chi-square (CHI) algorithms based on the ranked feature scores, is introduced to select a dominant feature set that improves the performance of prediction and uses the obtained dominant set as information for intervention. By using the proposed techniques of EDM, an academic performance prediction system (APPS) is subsequently developed for educational stockholders to get an early prediction of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is used to help educational stakeholders and related individuals for intervening and improving student performance.

Keywords: Academic performance prediction system, prediction model, educational data mining, dominant factors, feature selection methods, student performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 907
937 Feature Selection for Web Page Classification Using Swarm Optimization

Authors: B. Leela Devi, A. Sankar

Abstract:

The web’s increased popularity has included a huge amount of information, due to which automated web page classification systems are essential to improve search engines’ performance. Web pages have many features like HTML or XML tags, hyperlinks, URLs and text contents which can be considered during an automated classification process. It is known that Webpage classification is enhanced by hyperlinks as it reflects Web page linkages. The aim of this study is to reduce the number of features to be used to improve the accuracy of the classification of web pages. In this paper, a novel feature selection method using an improved Particle Swarm Optimization (PSO) using principle of evolution is proposed. The extracted features were tested on the WebKB dataset using a parallel Neural Network to reduce the computational cost.

Keywords: Web page classification, WebKB Dataset, Term Frequency-Inverse Document Frequency (TF-IDF), Particle Swarm Optimization (PSO).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3210
936 Improved Tropical Wood Species Recognition System based on Multi-feature Extractor and Classifier

Authors: Marzuki Khalid, RubiyahYusof, AnisSalwaMohdKhairuddin

Abstract:

An automated wood recognition system is designed to classify tropical wood species.The wood features are extracted based on two feature extractors: Basic Grey Level Aura Matrix (BGLAM) technique and statistical properties of pores distribution (SPPD) technique. Due to the nonlinearity of the tropical wood species separation boundaries, a pre classification stage is proposed which consists ofKmeans clusteringand kernel discriminant analysis (KDA). Finally, Linear Discriminant Analysis (LDA) classifier and KNearest Neighbour (KNN) are implemented for comparison purposes. The study involves comparison of the system with and without pre classification using KNN classifier and LDA classifier.The results show that the inclusion of the pre classification stage has improved the accuracy of both the LDA and KNN classifiers by more than 12%.

Keywords: Tropical wood species, nonlinear data, featureextractors, classification

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1935
935 Unsupervised Feature Learning by Pre-Route Simulation of Auto-Encoder Behavior Model

Authors: Youngjae Jin, Daeshik Kim

Abstract:

This paper describes a cycle accurate simulation results of weight values learned by an auto-encoder behavior model in terms of pre-route simulation. Given the results we visualized the first layer representations with natural images. Many common deep learning threads have focused on learning high-level abstraction of unlabeled raw data by unsupervised feature learning. However, in the process of handling such a huge amount of data, the learning method’s computation complexity and time limited advanced research. These limitations came from the fact these algorithms were computed by using only single core CPUs. For this reason, parallel-based hardware, FPGAs, was seen as a possible solution to overcome these limitations. We adopted and simulated the ready-made auto-encoder to design a behavior model in VerilogHDL before designing hardware. With the auto-encoder behavior model pre-route simulation, we obtained the cycle accurate results of the parameter of each hidden layer by using MODELSIM. The cycle accurate results are very important factor in designing a parallel-based digital hardware. Finally this paper shows an appropriate operation of behavior model based pre-route simulation. Moreover, we visualized learning latent representations of the first hidden layer with Kyoto natural image dataset.

Keywords: Auto-encoder, Behavior model simulation, Digital hardware design, Pre-route simulation, Unsupervised feature learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2644
934 Mining Genes Relations in Microarray Data Combined with Ontology in Colon Cancer Automated Diagnosis System

Authors: A. Gruzdz, A. Ihnatowicz, J. Siddiqi, B. Akhgar

Abstract:

MATCH project [1] entitle the development of an automatic diagnosis system that aims to support treatment of colon cancer diseases by discovering mutations that occurs to tumour suppressor genes (TSGs) and contributes to the development of cancerous tumours. The constitution of the system is based on a) colon cancer clinical data and b) biological information that will be derived by data mining techniques from genomic and proteomic sources The core mining module will consist of the popular, well tested hybrid feature extraction methods, and new combined algorithms, designed especially for the project. Elements of rough sets, evolutionary computing, cluster analysis, self-organization maps and association rules will be used to discover the annotations between genes, and their influence on tumours [2]-[11]. The methods used to process the data have to address their high complexity, potential inconsistency and problems of dealing with the missing values. They must integrate all the useful information necessary to solve the expert's question. For this purpose, the system has to learn from data, or be able to interactively specify by a domain specialist, the part of the knowledge structure it needs to answer a given query. The program should also take into account the importance/rank of the particular parts of data it analyses, and adjusts the used algorithms accordingly.

Keywords: Bioinformatics, gene expression, ontology, selforganizingmaps.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1925
933 Real-Time Hand Tracking and Gesture Recognition System Using Neural Networks

Authors: Tin Hninn Hninn Maung

Abstract:

This paper introduces a hand gesture recognition system to recognize real time gesture in unstrained environments. Efforts should be made to adapt computers to our natural means of communication: Speech and body language. A simple and fast algorithm using orientation histograms will be developed. It will recognize a subset of MAL static hand gestures. A pattern recognition system will be using a transforrn that converts an image into a feature vector, which will be compared with the feature vectors of a training set of gestures. The final system will be Perceptron implementation in MATLAB. This paper includes experiments of 33 hand postures and discusses the results. Experiments shows that the system can achieve a 90% recognition average rate and is suitable for real time applications.

Keywords: Hand gesture recognition, Orientation Histogram, Myanmar Alphabet Language, Perceptronnetwork, MATLAB.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4634
932 Feature Based Unsupervised Intrusion Detection

Authors: Deeman Yousif Mahmood, Mohammed Abdullah Hussein

Abstract:

The goal of a network-based intrusion detection system is to classify activities of network traffics into two major categories: normal and attack (intrusive) activities. Nowadays, data mining and machine learning plays an important role in many sciences; including intrusion detection system (IDS) using both supervised and unsupervised techniques. However, one of the essential steps of data mining is feature selection that helps in improving the efficiency, performance and prediction rate of proposed approach. This paper applies unsupervised K-means clustering algorithm with information gain (IG) for feature selection and reduction to build a network intrusion detection system. For our experimental analysis, we have used the new NSL-KDD dataset, which is a modified dataset for KDDCup 1999 intrusion detection benchmark dataset. With a split of 60.0% for the training set and the remainder for the testing set, a 2 class classifications have been implemented (Normal, Attack). Weka framework which is a java based open source software consists of a collection of machine learning algorithms for data mining tasks has been used in the testing process. The experimental results show that the proposed approach is very accurate with low false positive rate and high true positive rate and it takes less learning time in comparison with using the full features of the dataset with the same algorithm.

Keywords: Information Gain (IG), Intrusion Detection System (IDS), K-means Clustering, Weka.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2710
931 Effect of Impurities in the Chlorination Process of TiO2

Authors: Seok Hong Min, Tae Kwon Ha

Abstract:

With the increasing interest on Ti alloys, the extraction process of Ti from its typical ore, TiO2, has long been and will be important issue. As an intermediate product for the production of pigment or titanium metal sponge, tetrachloride (TiCl4) is produced by fluidized bed using high TiO2 feedstock. The purity of TiCl4 after chlorination is subjected to the quality of the titanium feedstock. Since the impurities in the TiCl4 product are reported to final products, the purification process of the crude TiCl4 is required. The purification process includes fractional distillation and chemical treatment, which depends on the nature of the impurities present and the required quality of the final product. In this study, thermodynamic analysis on the impurity effect in the chlorination process, which is the first step of extraction of Ti from TiO2, has been conducted. All thermodynamic calculations were performed using the FactSage thermodynamical software.

Keywords: Rutile, titanium, chlorination process, impurities, thermodynamic calculation, FactSage.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1631
930 Defect Detection of Tiles Using 2D-Wavelet Transform and Statistical Features

Authors: M.Ghazvini, S. A. Monadjemi, N. Movahhedinia, K. Jamshidi

Abstract:

In this article, a method has been offered to classify normal and defective tiles using wavelet transform and artificial neural networks. The proposed algorithm calculates max and min medians as well as the standard deviation and average of detail images obtained from wavelet filters, then comes by feature vectors and attempts to classify the given tile using a Perceptron neural network with a single hidden layer. In this study along with the proposal of using median of optimum points as the basic feature and its comparison with the rest of the statistical features in the wavelet field, the relational advantages of Haar wavelet is investigated. This method has been experimented on a number of various tile designs and in average, it has been valid for over 90% of the cases. Amongst the other advantages, high speed and low calculating load are prominent.

Keywords: Defect detection, tile and ceramic quality inspection, wavelet transform, classification, neural networks, statistical features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2311
929 Identifying New Sequence Features for Exon-Intron Discrimination by Rescaled-Range Frameshift Analysis

Authors: Sing-Wu Liou, Yin-Fu Huang

Abstract:

For identifying the discriminative sequence features between exons and introns, a new paradigm, rescaled-range frameshift analysis (RRFA), was proposed. By RRFA, two new sequence features, the frameshift sensitivity (FS) and the accumulative penta-mer complexity (APC), were discovered which were further integrated into a new feature of larger scale, the persistency in anti-mutation (PAM). The feature-validation experiments were performed on six model organisms to test the power of discrimination. All the experimental results highly support that FS, APC and PAM were all distinguishing features between exons and introns. These identified new sequence features provide new insights into the sequence composition of genes and they have great potentials of forming a new basis for recognizing the exonintron boundaries in gene sequences.

Keywords: Exon-Intron Discrimination, Rescaled-Range Frameshift Analysis, Frameshift Sensitivity, Accumulative Sequence Complexity

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1135
928 An Adaptive Dimensionality Reduction Approach for Hyperspectral Imagery Semantic Interpretation

Authors: Akrem Sellami, Imed Riadh Farah, Basel Solaiman

Abstract:

With the development of HyperSpectral Imagery (HSI) technology, the spectral resolution of HSI became denser, which resulted in large number of spectral bands, high correlation between neighboring, and high data redundancy. However, the semantic interpretation is a challenging task for HSI analysis due to the high dimensionality and the high correlation of the different spectral bands. In fact, this work presents a dimensionality reduction approach that allows to overcome the different issues improving the semantic interpretation of HSI. Therefore, in order to preserve the spatial information, the Tensor Locality Preserving Projection (TLPP) has been applied to transform the original HSI. In the second step, knowledge has been extracted based on the adjacency graph to describe the different pixels. Based on the transformation matrix using TLPP, a weighted matrix has been constructed to rank the different spectral bands based on their contribution score. Thus, the relevant bands have been adaptively selected based on the weighted matrix. The performance of the presented approach has been validated by implementing several experiments, and the obtained results demonstrate the efficiency of this approach compared to various existing dimensionality reduction techniques. Also, according to the experimental results, we can conclude that this approach can adaptively select the relevant spectral improving the semantic interpretation of HSI.

Keywords: Band selection, dimensionality reduction, feature extraction, hyperspectral imagery, semantic interpretation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1123
927 Advanced Information Extraction with n-gram based LSI

Authors: Ahmet Güven, Ö. Özgür Bozkurt, Oya Kalıpsız

Abstract:

Number of documents being created increases at an increasing pace while most of them being in already known topics and little of them introducing new concepts. This fact has started a new era in information retrieval discipline where the requirements have their own specialties. That is digging into topics and concepts and finding out subtopics or relations between topics. Up to now IR researches were interested in retrieving documents about a general topic or clustering documents under generic subjects. However these conventional approaches can-t go deep into content of documents which makes it difficult for people to reach to right documents they were searching. So we need new ways of mining document sets where the critic point is to know much about the contents of the documents. As a solution we are proposing to enhance LSI, one of the proven IR techniques by supporting its vector space with n-gram forms of words. Positive results we have obtained are shown in two different application area of IR domain; querying a document database, clustering documents in the document database.

Keywords: Document clustering, Information Extraction, Information Retrieval, LSI, n-gram.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1751
926 Automatic Text Summarization

Authors: Mohamed Abdel Fattah, Fuji Ren

Abstract:

This work proposes an approach to address automatic text summarization. This approach is a trainable summarizer, which takes into account several features, including sentence position, positive keyword, negative keyword, sentence centrality, sentence resemblance to the title, sentence inclusion of name entity, sentence inclusion of numerical data, sentence relative length, Bushy path of the sentence and aggregated similarity for each sentence to generate summaries. First we investigate the effect of each sentence feature on the summarization task. Then we use all features score function to train genetic algorithm (GA) and mathematical regression (MR) models to obtain a suitable combination of feature weights. The proposed approach performance is measured at several compression rates on a data corpus composed of 100 English religious articles. The results of the proposed approach are promising.

Keywords: Automatic Summarization, Genetic Algorithm, Mathematical Regression, Text Features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2274
925 Automatic Landmark Selection Based on Feature Clustering for Visual Autonomous Unmanned Aerial Vehicle Navigation

Authors: Paulo Fernando Silva Filho, Elcio Hideiti Shiguemori

Abstract:

The selection of specific landmarks for an Unmanned Aerial Vehicles’ Visual Navigation systems based on Automatic Landmark Recognition has significant influence on the precision of the system’s estimated position. At the same time, manual selection of the landmarks does not guarantee a high recognition rate, which would also result on a poor precision. This work aims to develop an automatic landmark selection that will take the image of the flight area and identify the best landmarks to be recognized by the Visual Navigation Landmark Recognition System. The criterion to select a landmark is based on features detected by ORB or AKAZE and edges information on each possible landmark. Results have shown that disposition of possible landmarks is quite different from the human perception.

Keywords: Clustering, edges, feature points, landmark selection, X-Means.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 751
924 Speaker Identification using Neural Networks

Authors: R.V Pawar, P.P.Kajave, S.N.Mali

Abstract:

The speech signal conveys information about the identity of the speaker. The area of speaker identification is concerned with extracting the identity of the person speaking the utterance. As speech interaction with computers becomes more pervasive in activities such as the telephone, financial transactions and information retrieval from speech databases, the utility of automatically identifying a speaker is based solely on vocal characteristic. This paper emphasizes on text dependent speaker identification, which deals with detecting a particular speaker from a known population. The system prompts the user to provide speech utterance. System identifies the user by comparing the codebook of speech utterance with those of the stored in the database and lists, which contain the most likely speakers, could have given that speech utterance. The speech signal is recorded for N speakers further the features are extracted. Feature extraction is done by means of LPC coefficients, calculating AMDF, and DFT. The neural network is trained by applying these features as input parameters. The features are stored in templates for further comparison. The features for the speaker who has to be identified are extracted and compared with the stored templates using Back Propogation Algorithm. Here, the trained network corresponds to the output; the input is the extracted features of the speaker to be identified. The network does the weight adjustment and the best match is found to identify the speaker. The number of epochs required to get the target decides the network performance.

Keywords: Average Mean Distance function, Backpropogation, Linear Predictive Coding, MultilayeredPerceptron,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1852
923 Fused Structure and Texture (FST) Features for Improved Pedestrian Detection

Authors: Hussin K. Ragb, Vijayan K. Asari

Abstract:

In this paper, we present a pedestrian detection descriptor called Fused Structure and Texture (FST) features based on the combination of the local phase information with the texture features. Since the phase of the signal conveys more structural information than the magnitude, the phase congruency concept is used to capture the structural features. On the other hand, the Center-Symmetric Local Binary Pattern (CSLBP) approach is used to capture the texture information of the image. The dimension less quantity of the phase congruency and the robustness of the CSLBP operator on the flat images, as well as the blur and illumination changes, lead the proposed descriptor to be more robust and less sensitive to the light variations. The proposed descriptor can be formed by extracting the phase congruency and the CSLBP values of each pixel of the image with respect to its neighborhood. The histogram of the oriented phase and the histogram of the CSLBP values for the local regions in the image are computed and concatenated to construct the FST descriptor. Several experiments were conducted on INRIA and the low resolution DaimlerChrysler datasets to evaluate the detection performance of the pedestrian detection system that is based on the FST descriptor. A linear Support Vector Machine (SVM) is used to train the pedestrian classifier. These experiments showed that the proposed FST descriptor has better detection performance over a set of state of the art feature extraction methodologies.

Keywords: Pedestrian detection, phase congruency, local phase, LBP features, CSLBP features, FST descriptor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1432
922 On the Interactive Search with Web Documents

Authors: Mario Kubek, Herwig Unger

Abstract:

Due to the large amount of information in the World Wide Web (WWW, web) and the lengthy and usually linearly ordered result lists of web search engines that do not indicate semantic relationships between their entries, the search for topically similar and related documents can become a tedious task. Especially, the process of formulating queries with proper terms representing specific information needs requires much effort from the user. This problem gets even bigger when the user's knowledge on a subject and its technical terms is not sufficient enough to do so. This article presents the new and interactive search application DocAnalyser that addresses this problem by enabling users to find similar and related web documents based on automatic query formulation and state-ofthe- art search word extraction. Additionally, this tool can be used to track topics across semantically connected web documents.

Keywords: DocAnalyser, interactive web search, search word extraction, query formulation, source topic detection, topic tracking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1609
921 Noninvasive Brain-Machine Interface to Control Both Mecha TE Robotic Hands Using Emotiv EEG Neuroheadset

Authors: Adrienne Kline, Jaydip Desai

Abstract:

Electroencephalogram (EEG) is a noninvasive technique that registers signals originating from the firing of neurons in the brain. The Emotiv EEG Neuroheadset is a consumer product comprised of 14 EEG channels and was used to record the reactions of the neurons within the brain to two forms of stimuli in 10 participants. These stimuli consisted of auditory and visual formats that provided directions of ‘right’ or ‘left.’ Participants were instructed to raise their right or left arm in accordance with the instruction given. A scenario in OpenViBE was generated to both stimulate the participants while recording their data. In OpenViBE, the Graz Motor BCI Stimulator algorithm was configured to govern the duration and number of visual stimuli. Utilizing EEGLAB under the cross platform MATLAB®, the electrodes most stimulated during the study were defined. Data outputs from EEGLAB were analyzed using IBM SPSS Statistics® Version 20. This aided in determining the electrodes to use in the development of a brain-machine interface (BMI) using real-time EEG signals from the Emotiv EEG Neuroheadset. Signal processing and feature extraction were accomplished via the Simulink® signal processing toolbox. An Arduino™ Duemilanove microcontroller was used to link the Emotiv EEG Neuroheadset and the right and left Mecha TE™ Hands.

Keywords: Brain-machine interface, EEGLAB, emotiv EEG neuroheadset, openViBE, simulink.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2752
920 Intrusion Detection Using a New Particle Swarm Method and Support Vector Machines

Authors: Essam Al Daoud

Abstract:

Intrusion detection is a mechanism used to protect a system and analyse and predict the behaviours of system users. An ideal intrusion detection system is hard to achieve due to nonlinearity, and irrelevant or redundant features. This study introduces a new anomaly-based intrusion detection model. The suggested model is based on particle swarm optimisation and nonlinear, multi-class and multi-kernel support vector machines. Particle swarm optimisation is used for feature selection by applying a new formula to update the position and the velocity of a particle; the support vector machine is used as a classifier. The proposed model is tested and compared with the other methods using the KDD CUP 1999 dataset. The results indicate that this new method achieves better accuracy rates than previous methods.

Keywords: Feature selection, Intrusion detection, Support vector machine, Particle swarm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1932
919 Causal Relation Identification Using Convolutional Neural Networks and Knowledge Based Features

Authors: Tharini N. de Silva, Xiao Zhibo, Zhao Rui, Mao Kezhi

Abstract:

Causal relation identification is a crucial task in information extraction and knowledge discovery. In this work, we present two approaches to causal relation identification. The first is a classification model trained on a set of knowledge-based features. The second is a deep learning based approach training a model using convolutional neural networks to classify causal relations. We experiment with several different convolutional neural networks (CNN) models based on previous work on relation extraction as well as our own research. Our models are able to identify both explicit and implicit causal relations as well as the direction of the causal relation. The results of our experiments show a higher accuracy than previously achieved for causal relation identification tasks.

Keywords: Causal relation identification, convolutional neural networks, natural Language Processing, Machine Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2188
918 A DCT-Based Secure JPEG Image Authentication Scheme

Authors: Mona F. M. Mursi, Ghazy M.R. Assassa, Hatim A. Aboalsamh, Khaled Alghathbar

Abstract:

The challenge in the case of image authentication is that in many cases images need to be subjected to non malicious operations like compression, so the authentication techniques need to be compression tolerant. In this paper we propose an image authentication system that is tolerant to JPEG lossy compression operations. A scheme for JPEG grey scale images is proposed based on a data embedding method that is based on a secret key and a secret mapping vector in the frequency domain. An encrypted feature vector extracted from the image DCT coefficients, is embedded redundantly, and invisibly in the marked image. On the receiver side, the feature vector from the received image is derived again and compared against the extracted watermark to verify the image authenticity. The proposed scheme is robust against JPEG compression up to a maximum compression of approximately 80%,, but sensitive to malicious attacks such as cutting and pasting.

Keywords: Authentication, DCT, JPEG, Watermarking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1687
917 Anomaly Detection in a Data Center with a Reconstruction Method Using a Multi-Autoencoders Model

Authors: Victor Breux, Jérôme Boutet, Alain Goret, Viviane Cattin

Abstract:

Early detection of anomalies in data centers is important to reduce downtimes and the costs of periodic maintenance. However, there is little research on this topic and even fewer on the fusion of sensor data for the detection of abnormal events. The goal of this paper is to propose a method for anomaly detection in data centers by combining sensor data (temperature, humidity, power) and deep learning models. The model described in the paper uses one autoencoder per sensor to reconstruct the inputs. The auto-encoders contain Long-Short Term Memory (LSTM) layers and are trained using the normal samples of the relevant sensors selected by correlation analysis. The difference signal between the input and its reconstruction is then used to classify the samples using feature extraction and a random forest classifier. The data measured by the sensors of a data center between January 2019 and May 2020 are used to train the model, while the data between June 2020 and May 2021 are used to assess it. Performances of the model are assessed a posteriori through F1-score by comparing detected anomalies with the data center’s history. The proposed model outperforms the state-of-the-art reconstruction method, which uses only one autoencoder taking multivariate sequences and detects an anomaly with a threshold on the reconstruction error, with an F1-score of 83.60% compared to 24.16%.

Keywords: Anomaly detection, autoencoder, data centers, deep learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 667
916 Effect of Hemicellulase on Extraction of Essential Oil from Algerian Artemisia campestris

Authors: Khalida Boutemak, Nasssima Benali, Nadji Moulai-Mostefa

Abstract:

Effect of enzyme on the yield and chemical composition of Artemisia campestris essential oil is reported in the present study. It was demonstrated that enzyme facilitated the extraction of essential oil with increase in oil yield and did not affect any noticeable change in flavour profile of the  volatile oil. Essential oil was tested for antibacterial activity using Escherichia coli; which was extremely sensitive against control with the largest inhibition (29mm), whereas Staphylococcus aureus was the most sensitive against essential oil obtained from enzymatic pre-treatment with the largest inhibition zone (25mm). The antioxidant activity of the essential oil with hemicellulase pre-treatment (EO2) and control sample (EO1) was determined through reducing power. It was significantly lower than the standard drug (vitamin C) in this order: vitamin C˃EO2˃EO1.

Keywords: Artemisia campestris, enzyme pre-treatment, hemicellulase, antibacterial activity, antioxidant activity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1507
915 Analysis Model for the Relationship of Users, Products, and Stores on Online Marketplace Based on Distributed Representation

Authors: Ke He, Wumaier Parezhati, Haruka Yamashita

Abstract:

Recently, online marketplaces in the e-commerce industry, such as Rakuten and Alibaba, have become some of the most popular online marketplaces in Asia. In these shopping websites, consumers can select purchase products from a large number of stores. Additionally, consumers of the e-commerce site have to register their name, age, gender, and other information in advance, to access their registered account. Therefore, establishing a method for analyzing consumer preferences from both the store and the product side is required. This study uses the Doc2Vec method, which has been studied in the field of natural language processing. Doc2Vec has been used in many cases to analyze the extraction of semantic relationships between documents (represented as consumers) and words (represented as products) in the field of document classification. This concept is applicable to represent the relationship between users and items; however, the problem is that one more factor (i.e., shops) needs to be considered in Doc2Vec. More precisely, a method for analyzing the relationship between consumers, stores, and products is required. The purpose of our study is to combine the analysis of the Doc2vec model for users and shops, and for users and items in the same feature space. This method enables the calculation of similar shops and items for each user. In this study, we derive the real data analysis accumulated in the online marketplace and demonstrate the efficiency of the proposal.

Keywords: Doc2Vec, marketing, online marketplace, recommendation system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 409
914 OHASD: The First On-Line Arabic Sentence Database Handwritten on Tablet PC

Authors: Randa I. M. Elanwar, Mohsen A. Rashwan, Samia A. Mashali

Abstract:

In this paper we present the first Arabic sentence dataset for on-line handwriting recognition written on tablet pc. The dataset is natural, simple and clear. Texts are sampled from daily newspapers. To collect naturally written handwriting, forms are dictated to writers. The current version of our dataset includes 154 paragraphs written by 48 writers. It contains more than 3800 words and more than 19,400 characters. Handwritten texts are mainly written by researchers from different research centers. In order to use this dataset in a recognition system word extraction is needed. In this paper a new word extraction technique based on the Arabic handwriting cursive nature is also presented. The technique is applied to this dataset and good results are obtained. The results can be considered as a bench mark for future research to be compared with.

Keywords: Arabic, Handwriting recognition, on-line dataset.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2017
913 Feature Preserving Nonlinear Diffusion for Ultrasonic Image Denoising and Edge Enhancement

Authors: Shujun Fu, Qiuqi Ruan, Wenqia Wang, Yu Li

Abstract:

Utilizing echoic intension and distribution from different organs and local details of human body, ultrasonic image can catch important medical pathological changes, which unfortunately may be affected by ultrasonic speckle noise. A feature preserving ultrasonic image denoising and edge enhancement scheme is put forth, which includes two terms: anisotropic diffusion and edge enhancement, controlled by the optimum smoothing time. In this scheme, the anisotropic diffusion is governed by the local coordinate transformation and the first and the second order normal derivatives of the image, while the edge enhancement is done by the hyperbolic tangent function. Experiments on real ultrasonic images indicate effective preservation of edges, local details and ultrasonic echoic bright strips on denoising by our scheme.

Keywords: anisotropic diffusion, coordinate transformationdirectional derivatives, edge enhancement, hyperbolic tangentfunction, image denoising.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1769
912 Applying Case-Based Reasoning in Supporting Strategy Decisions

Authors: S. M. Seyedhosseini, A. Makui, M. Ghadami

Abstract:

Globalization and therefore increasing tight competition among companies, have resulted to increase the importance of making well-timed decision. Devising and employing effective strategies, that are flexible and adaptive to changing market, stand a greater chance of being effective in the long-term. In other side, a clear focus on managing the entire product lifecycle has emerged as critical areas for investment. Therefore, applying wellorganized tools to employ past experience in new case, helps to make proper and managerial decisions. Case based reasoning (CBR) is based on a means of solving a new problem by using or adapting solutions to old problems. In this paper, an adapted CBR model with k-nearest neighbor (K-NN) is employed to provide suggestions for better decision making which are adopted for a given product in the middle of life phase. The set of solutions are weighted by CBR in the principle of group decision making. Wrapper approach of genetic algorithm is employed to generate optimal feature subsets. The dataset of the department store, including various products which are collected among two years, have been used. K-fold approach is used to evaluate the classification accuracy rate. Empirical results are compared with classical case based reasoning algorithm which has no special process for feature selection, CBR-PCA algorithm based on filter approach feature selection, and Artificial Neural Network. The results indicate that the predictive performance of the model, compare with two CBR algorithms, in specific case is more effective.

Keywords: Case based reasoning, Genetic algorithm, Groupdecision making, Product management.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2131
911 Social, Group and Individual Mind extracted from Rule Bases of Multiple Agents

Authors: P. Cermak

Abstract:

This paper shows possibility of extraction Social, Group and Individual Mind from Multiple Agents Rule Bases. Types those Rule bases are selected as two fuzzy systems, namely Mambdani and Takagi-Sugeno fuzzy system. Their rule bases are describing (modeling) agent behavior. Modifying of agent behavior in the time varying environment will be provided by learning fuzzyneural networks and optimization of their parameters with using genetic algorithms in development system FUZNET. Finally, extraction Social, Group and Individual Mind from Multiple Agents Rule Bases are provided by Cognitive analysis and Matching criterion.

Keywords: Mind, Multi-agent system, Cognitive analysis, Fuzzy system, Neural network, Genetic algorithm, Rule base.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1204
910 Assisted Prediction of Hypertension Based on Heart Rate Variability and Improved Residual Networks

Authors: Yong Zhao, Jian He, Cheng Zhang

Abstract:

Cardiovascular disease resulting from hypertension poses a significant threat to human health, and early detection of hypertension can potentially save numerous lives. Traditional methods for detecting hypertension require specialized equipment and are often incapable of capturing continuous blood pressure fluctuations. To address this issue, this study starts by analyzing the principle of heart rate variability (HRV) and introduces the utilization of sliding window and power spectral density (PSD) techniques to analyze both temporal and frequency domain features of HRV. Subsequently, a hypertension prediction network that relies on HRV is proposed, combining Resnet, attention mechanisms, and a multi-layer perceptron. The network leverages a modified ResNet18 to extract frequency domain features, while employing an attention mechanism to integrate temporal domain features, thus enabling auxiliary hypertension prediction through the multi-layer perceptron. The proposed network is trained and tested using the publicly available SHAREE dataset from PhysioNet. The results demonstrate that the network achieves a high prediction accuracy of 92.06% for hypertension, surpassing traditional models such as K Near Neighbor (KNN), Bayes, Logistic regression, and traditional Convolutional Neural Network (CNN).

Keywords: Feature extraction, heart rate variability, hypertension, residual networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 100
909 Boosting Method for Automated Feature Space Discovery in Supervised Quantum Machine Learning Models

Authors: Vladimir Rastunkov, Jae-Eun Park, Abhijit Mitra, Brian Quanz, Steve Wood, Christopher Codella, Heather Higgins, Joseph Broz

Abstract:

Quantum Support Vector Machines (QSVM) have become an important tool in research and applications of quantum kernel methods. In this work we propose a boosting approach for building ensembles of QSVM models and assess performance improvement across multiple datasets. This approach is derived from the best ensemble building practices that worked well in traditional machine learning and thus should push the limits of quantum model performance even further. We find that in some cases, a single QSVM model with tuned hyperparameters is sufficient to simulate the data, while in others - an ensemble of QSVMs that are forced to do exploration of the feature space via proposed method is beneficial.

Keywords: QSVM, Quantum Support Vector Machines, quantum kernel, boosting, ensemble.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 365