Search results for: Python machine learning libraries.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2921

Search results for: Python machine learning libraries.

2801 Real-time Network Anomaly Detection Systems Based on Machine-Learning Algorithms

Authors: Zahra Ramezanpanah, Joachim Carvallo, Aurelien Rodriguez

Abstract:

This paper aims to detect anomalies in streaming data using machine learning algorithms. In this regard, we designed two separate pipelines and evaluated the effectiveness of each separately. The first pipeline, based on supervised machine learning methods, consists of two phases. In the first phase, we trained several supervised models using the UNSW-NB15 data set. We measured the efficiency of each using different performance metrics and selected the best model for the second phase. At the beginning of the second phase, we first, using Argus Server, sniffed a local area network. Several types of attacks were simulated and then sent the sniffed data to a running algorithm at short intervals. This algorithm can display the results of each packet of received data in real-time using the trained model. The second pipeline presented in this paper is based on unsupervised algorithms, in which a Temporal Graph Network (TGN) is used to monitor a local network. The TGN is trained to predict the probability of future states of the network based on its past behavior. Our contribution in this section is introducing an indicator to identify anomalies from these predicted probabilities.

Keywords: Cyber-security, Intrusion Detection Systems, Temporal Graph Network, Anomaly Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 505
2800 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we used data mining to extract biomedical knowledge. In general, complex biomedical data collected in studies of populations are treated by statistical methods, although they are robust, they are not sufficient in themselves to harness the potential wealth of data. For that you used in step two learning algorithms: the Decision Trees and Support Vector Machine (SVM). These supervised classification methods are used to make the diagnosis of thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: A classifier, Algorithms decision tree, knowledge extraction, Support Vector Machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1870
2799 Implementation of the Recursive Formula for Evaluation of the Strength of Daniels’ Model

Authors: Václav Sadílek, Miroslav Vořechovský

Abstract:

The paper deals with the classical fiber bundle model of equal load sharing, sometimes referred to as the Daniels’ bundle or the democratic bundle. Daniels formulated a multidimensional integral and also a recursive formula for evaluation of the strength cumulative distribution function. This paper describes three algorithms for evaluation of the recursive formula and also their implementations with source codes in the Python high-level programming language. A comparison of the algorithms are provided with respect to execution time. Analysis of orders of magnitudes of addends in the recursion is also provided.

Keywords: Daniels bundle model, equal load sharing, Python, mpmath.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2095
2798 Comparison of Machine Learning and Deep Learning Algorithms for Automatic Classification of 80 Different Pollen Species

Authors: Endrick Barnacin, Jean-Luc Henry, Jimmy Nagau, Jack Molinié

Abstract:

Palynology is a field of interest in many disciplines due to its multiple applications: chronological dating, climatology, allergy treatment, and honey characterization. Unfortunately, the analysis of a pollen slide is a complicated and time consuming task that requires the intervention of experts in the field, which are becoming increasingly rare due to economic and social conditions. In this context, the automation of this task is urgent. In this work, we compare classical feature extraction methods (Shape, GLCM, LBP, and others) and Deep Learning (CNN and Transfer Learning) to perform a recognition task over 80 regional pollen species. It has been found that the use of Transfer Learning seems to be more precise than the other approaches.

Keywords: Image segmentation, stuck particles separation, Sobel operator, thresholding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 201
2797 Hybrid Approach for Software Defect Prediction Using Machine Learning with Optimization Technique

Authors: C. Manjula, Lilly Florence

Abstract:

Software technology is developing rapidly which leads to the growth of various industries. Now-a-days, software-based applications have been adopted widely for business purposes. For any software industry, development of reliable software is becoming a challenging task because a faulty software module may be harmful for the growth of industry and business. Hence there is a need to develop techniques which can be used for early prediction of software defects. Due to complexities in manual prediction, automated software defect prediction techniques have been introduced. These techniques are based on the pattern learning from the previous software versions and finding the defects in the current version. These techniques have attracted researchers due to their significant impact on industrial growth by identifying the bugs in software. Based on this, several researches have been carried out but achieving desirable defect prediction performance is still a challenging task. To address this issue, here we present a machine learning based hybrid technique for software defect prediction. First of all, Genetic Algorithm (GA) is presented where an improved fitness function is used for better optimization of features in data sets. Later, these features are processed through Decision Tree (DT) classification model. Finally, an experimental study is presented where results from the proposed GA-DT based hybrid approach is compared with those from the DT classification technique. The results show that the proposed hybrid approach achieves better classification accuracy.

Keywords: Decision tree, genetic algorithm, machine learning, software defect prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1465
2796 Application of Extreme Learning Machine Method for Time Series Analysis

Authors: Rampal Singh, S. Balasundaram

Abstract:

In this paper, we study the application of Extreme Learning Machine (ELM) algorithm for single layered feedforward neural networks to non-linear chaotic time series problems. In this algorithm the input weights and the hidden layer bias are randomly chosen. The ELM formulation leads to solving a system of linear equations in terms of the unknown weights connecting the hidden layer to the output layer. The solution of this general system of linear equations will be obtained using Moore-Penrose generalized pseudo inverse. For the study of the application of the method we consider the time series generated by the Mackey Glass delay differential equation with different time delays, Santa Fe A and UCR heart beat rate ECG time series. For the choice of sigmoid, sin and hardlim activation functions the optimal values for the memory order and the number of hidden neurons which give the best prediction performance in terms of root mean square error are determined. It is observed that the results obtained are in close agreement with the exact solution of the problems considered which clearly shows that ELM is a very promising alternative method for time series prediction.

Keywords: Chaotic time series, Extreme learning machine, Generalization performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3519
2795 Hybrid Machine Learning Approach for Text Categorization

Authors: Nerijus Remeikis, Ignas Skucas, Vida Melninkaite

Abstract:

Text categorization - the assignment of natural language documents to one or more predefined categories based on their semantic content - is an important component in many information organization and management tasks. Performance of neural networks learning is known to be sensitive to the initial weights and architecture. This paper discusses the use multilayer neural network initialization with decision tree classifier for improving text categorization accuracy. An adaptation of the algorithm is proposed in which a decision tree from root node until a final leave is used for initialization of multilayer neural network. The experimental evaluation demonstrates this approach provides better classification accuracy with Reuters-21578 corpus, one of the standard benchmarks for text categorization tasks. We present results comparing the accuracy of this approach with multilayer neural network initialized with traditional random method and decision tree classifiers.

Keywords: Text categorization, decision trees, neural networks, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1806
2794 Digital Preservation in Nigeria Universities Libraries: A Comparison between University of Nigeria Nsukka and Ahmadu Bello University Zaria

Authors: Suleiman Musa, Shuaibu Sidi Safiyanu

Abstract:

This study examined the digital preservation in Nigeria university libraries. A comparison between the university of Nigeria Nsukka (UNN) and Ahmadu Bello University Zaria (ABU, Zaria). The study utilized primary source of data obtained from two selected institution librarians. Finding revealed varying results in terms of skills acquired by librarians before and after digitization of the two institutions. The study reports that journals publication, text book, CD-ROMS, conference papers and proceedings, theses, dissertations and seminar papers are among the information resources available for digitization. The study further documents that copyright issue, power failure, and unavailability of needed materials are among the challenges facing the digitization of library of the institution. On the basis of the finding, the study concluded that digitization of library enhances efficiency in organization and retrieval of information services. The study therefore recommended that software should be upgraded with backup, training of the librarians on digital process, installation of antivirus and enhancement of technical collaboration between the library and MIS.

Keywords: Digitalization, preservation, libraries, comparison.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1726
2793 3D Human Reconstruction over Cloud Based Image Data via AI and Machine Learning

Authors: Kaushik Sathupadi, Sandesh Achar

Abstract:

Human action recognition (HAR) modeling is a critical task in machine learning. These systems require better techniques for recognizing body parts and selecting optimal features based on vision sensors to identify complex action patterns efficiently. Still, there is a considerable gap and challenges between images and videos, such as brightness, motion variation, and random clutters. This paper proposes a robust approach for classifying human actions over cloud-based image data. First, we apply pre-processing and detection, human and outer shape detection techniques. Next, we extract valuable information in terms of cues. We extract two distinct features: fuzzy local binary patterns and sequence representation. Then, we applied a greedy, randomized adaptive search procedure for data optimization and dimension reduction, and for classification, we used a random forest. We tested our model on two benchmark datasets, AAMAZ and the KTH Multi-view Football datasets. Our HAR framework significantly outperforms the other state-of-the-art approaches and achieves a better recognition rate of 91% and 89.6% over the AAMAZ and KTH Multi-view Football datasets, respectively.

Keywords: Computer vision, human motion analysis, random forest, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 36
2792 Artificial Intelligence: A Comprehensive and Systematic Literature Review of Applications and Comparative Technologies

Authors: Z. M. Najmi

Abstract:

Over the years, the question around Artificial Intelligence has always been one with many answers. Whether by means of use in business and industry or complicated algorithmic programming, management of these technologies has always been the core focus. More recently, technologies have been questioned in industry and society alike as to whether they have improved human-centred design, assisted choices and objectives, and had a hand in systematic processes across the board. With these questions the answer may lie within AI technologies, and the steps needed in removing common human error. Elements such as Machine Learning, Deep Learning, Recommender Systems and Natural Language Processing will all be features to consider moving forward. Our previous intervention with AI applications has resulted in increased productivity, however, raised concerns for the continuation of traditional human-centred occupations. Emerging technologies such as Augmented Reality and Virtual Reality have all played a part in this during AI’s prominent rise. As mentioned, AI has been constantly under the microscope; the benefits and drawbacks may seem endless is wide, but AI is something we must take notice of and adapt into our everyday lives. The aim of this paper is to give an overview of the technologies surrounding A.I. and its’ related technologies. A comprehensive review has been written as a timeline of the developing events and key points in the history of Artificial Intelligence. This research is gathered entirely from secondary research, academic statements of knowledge and gathered to produce an understanding of the timeline of AI.

Keywords: Artificial Intelligence, Deep Learning, Augmented Reality, Reinforcement Learning, Machine Learning, Supervised Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 576
2791 Extraction of Significant Phrases from Text

Authors: Yuan J. Lui

Abstract:

Prospective readers can quickly determine whether a document is relevant to their information need if the significant phrases (or keyphrases) in this document are provided. Although keyphrases are useful, not many documents have keyphrases assigned to them, and manually assigning keyphrases to existing documents is costly. Therefore, there is a need for automatic keyphrase extraction. This paper introduces a new domain independent keyphrase extraction algorithm. The algorithm approaches the problem of keyphrase extraction as a classification task, and uses a combination of statistical and computational linguistics techniques, a new set of attributes, and a new machine learning method to distinguish keyphrases from non-keyphrases. The experiments indicate that this algorithm performs better than other keyphrase extraction tools and that it significantly outperforms Microsoft Word 2000-s AutoSummarize feature. The domain independence of this algorithm has also been confirmed in our experiments.

Keywords: classification, keyphrase extraction, machine learning, summarization

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2051
2790 A Cognitive Robot Collaborative Reinforcement Learning Algorithm

Authors: Amit Gil, Helman Stern, Yael Edan

Abstract:

A cognitive collaborative reinforcement learning algorithm (CCRL) that incorporates an advisor into the learning process is developed to improve supervised learning. An autonomous learner is enabled with a self awareness cognitive skill to decide when to solicit instructions from the advisor. The learner can also assess the value of advice, and accept or reject it. The method is evaluated for robotic motion planning using simulation. Tests are conducted for advisors with skill levels from expert to novice. The CCRL algorithm and a combined method integrating its logic with Clouse-s Introspection Approach, outperformed a base-line fully autonomous learner, and demonstrated robust performance when dealing with various advisor skill levels, learning to accept advice received from an expert, while rejecting that of less skilled collaborators. Although the CCRL algorithm is based on RL, it fits other machine learning methods, since advisor-s actions are only added to the outer layer.

Keywords: Robot learning, human-robot collaboration, motion planning, reinforcement learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1724
2789 A Hybrid Machine Learning System for Stock Market Forecasting

Authors: Rohit Choudhry, Kumkum Garg

Abstract:

In this paper, we propose a hybrid machine learning system based on Genetic Algorithm (GA) and Support Vector Machines (SVM) for stock market prediction. A variety of indicators from the technical analysis field of study are used as input features. We also make use of the correlation between stock prices of different companies to forecast the price of a stock, making use of technical indicators of highly correlated stocks, not only the stock to be predicted. The genetic algorithm is used to select the set of most informative input features from among all the technical indicators. The results show that the hybrid GA-SVM system outperforms the stand alone SVM system.

Keywords: Genetic Algorithms, Support Vector Machines, Stock Market Forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9318
2788 An Intelligent Baby Care System Based on IoT and Deep Learning Techniques

Authors: Chinlun Lai, Lunjyh Jiang

Abstract:

Due to the heavy burden and pressure of caring for infants, an integrated automatic baby watching system based on IoT smart sensing and deep learning machine vision techniques is proposed in this paper. By monitoring infant body conditions such as heartbeat, breathing, body temperature, sleeping posture, as well as the surrounding conditions such as dangerous/sharp objects, light, noise, humidity and temperature, the proposed system can analyze and predict the obvious/potential dangerous conditions according to observed data and then adopt suitable actions in real time to protect the infant from harm. Thus, reducing the burden of the caregiver and improving safety efficiency of the caring work. The experimental results show that the proposed system works successfully for the infant care work and thus can be implemented in various life fields practically.

Keywords: Baby care system, internet of things, deep learning, machine vision.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1902
2787 Towards Developing a Self-Explanatory Scheduling System Based on a Hybrid Approach

Authors: Jian Zheng, Yoshiyasu Takahashi, Yuichi Kobayashi, Tatsuhiro Sato

Abstract:

In the study, we present a conceptual framework for developing a scheduling system that can generate self-explanatory and easy-understanding schedules. To this end, a user interface is conceived to help planners record factors that are considered crucial in scheduling, as well as internal and external sources relating to such factors. A hybrid approach combining machine learning and constraint programming is developed to generate schedules and the corresponding factors, and accordingly display them on the user interface. Effects of the proposed system on scheduling are discussed, and it is expected that scheduling efficiency and system understandability will be improved, compared with previous scheduling systems.

Keywords: Constraint programming, Factors considered in scheduling, machine learning, scheduling system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1435
2786 Data Analysis Techniques for Predictive Maintenance on Fleet of Heavy-Duty Vehicles

Authors: Antonis Sideris, Elias Chlis Kalogeropoulos, Konstantia Moirogiorgou

Abstract:

The present study proposes a methodology for the efficient daily management of fleet vehicles and construction machinery. The application covers the area of remote monitoring of heavy-duty vehicles operation parameters, where specific sensor data are stored and examined in order to provide information about the vehicle’s health. The vehicle diagnostics allow the user to inspect whether maintenance tasks need to be performed before a fault occurs. A properly designed machine learning model is proposed for the detection of two different types of faults through classification. Cross validation is used and the accuracy of the trained model is checked with the confusion matrix.

Keywords: Fault detection, feature selection, machine learning, predictive maintenance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 781
2785 Reducing the Imbalance Penalty through Artificial Intelligence Methods Geothermal Production Forecasting: A Case Study for Turkey

Authors: H. Anıl, G. Kar

Abstract:

In addition to being rich in renewable energy resources, Turkey is one of the countries that promise potential in geothermal energy production with its high installed power, cheapness, and sustainability. Increasing imbalance penalties become an economic burden for organizations, since the geothermal generation plants cannot maintain the balance of supply and demand due to the inadequacy of the production forecasts given in the day-ahead market. A better production forecast reduces the imbalance penalties of market participants and provides a better imbalance in the day ahead market. In this study, using machine learning, deep learning and time series methods, the total generation of the power plants belonging to Zorlu Doğal Electricity Generation, which has a high installed capacity in terms of geothermal, was predicted for the first one-week and first two-weeks of March, then the imbalance penalties were calculated with these estimates and compared with the real values. These modeling operations were carried out on two datasets, the basic dataset and the dataset created by extracting new features from this dataset with the feature engineering method. According to the results, Support Vector Regression from traditional machine learning models outperformed other models and exhibited the best performance. In addition, the estimation results in the feature engineering dataset showed lower error rates than the basic dataset. It has been concluded that the estimated imbalance penalty calculated for the selected organization is lower than the actual imbalance penalty, optimum and profitable accounts.

Keywords: Machine learning, deep learning, time series models, feature engineering, geothermal energy production forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 204
2784 Gas Detection via Machine Learning

Authors: Walaa Khalaf, Calogero Pace, Manlio Gaudioso

Abstract:

We present an Electronic Nose (ENose), which is aimed at identifying the presence of one out of two gases, possibly detecting the presence of a mixture of the two. Estimation of the concentrations of the components is also performed for a volatile organic compound (VOC) constituted by methanol and acetone, for the ranges 40-400 and 22-220 ppm (parts-per-million), respectively. Our system contains 8 sensors, 5 of them being gas sensors (of the class TGS from FIGARO USA, INC., whose sensing element is a tin dioxide (SnO2) semiconductor), the remaining being a temperature sensor (LM35 from National Semiconductor Corporation), a humidity sensor (HIH–3610 from Honeywell), and a pressure sensor (XFAM from Fujikura Ltd.). Our integrated hardware–software system uses some machine learning principles and least square regression principle to identify at first a new gas sample, or a mixture, and then to estimate the concentrations. In particular we adopt a training model using the Support Vector Machine (SVM) approach with linear kernel to teach the system how discriminate among different gases. Then we apply another training model using the least square regression, to predict the concentrations. The experimental results demonstrate that the proposed multiclassification and regression scheme is effective in the identification of the tested VOCs of methanol and acetone with 96.61% correctness. The concentration prediction is obtained with 0.979 and 0.964 correlation coefficient for the predicted versus real concentrations of methanol and acetone, respectively.

Keywords: Electronic nose, Least square regression, Mixture ofgases, Support Vector Machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2539
2783 Multi-Factor Optimization Method through Machine Learning in Building Envelope Design: Focusing on Perforated Metal Façade

Authors: Jinwooung Kim, Jae-Hwan Jung, Seong-Jun Kim, Sung-Ah Kim

Abstract:

Because the building envelope has a significant impact on the operation and maintenance stage of the building, designing the facade considering the performance can improve the performance of the building and lower the maintenance cost of the building. In general, however, optimizing two or more performance factors confronts the limits of time and computational tools. The optimization phase typically repeats infinitely until a series of processes that generate alternatives and analyze the generated alternatives achieve the desired performance. In particular, as complex geometry or precision increases, computational resources and time are prohibitive to find the required performance, so an optimization methodology is needed to deal with this. Instead of directly analyzing all the alternatives in the optimization process, applying experimental techniques (heuristic method) learned through experimentation and experience can reduce resource waste. This study proposes and verifies a method to optimize the double envelope of a building composed of a perforated panel using machine learning to the design geometry and quantitative performance. The proposed method is to achieve the required performance with fewer resources by supplementing the existing method which cannot calculate the complex shape of the perforated panel.

Keywords: Building envelope, machine learning, perforated metal, multi-factor optimization, façade.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1224
2782 A Risk Assessment Tool for the Contamination of Aflatoxins on Dried Figs based on Machine Learning Algorithms

Authors: Kottaridi Klimentia, Demopoulos Vasilis, Sidiropoulos Anastasios, Ihara Diego, Nikolaidis Vasileios, Antonopoulos Dimitrios

Abstract:

Aflatoxins are highly poisonous and carcinogenic compounds produced by species of the genus Aspergillus spp. that can infect a variety of agricultural foods, including dried figs. Biological and environmental factors, such as population, pathogenicity and aflatoxinogenic capacity of the strains, topography, soil and climate parameters of the fig orchards are believed to have a strong effect on aflatoxin levels. Existing methods for aflatoxin detection and measurement, such as high-performance liquid chromatography (HPLC), and enzyme-linked immunosorbent assay (ELISA), can provide accurate results, but the procedures are usually time-consuming, sample-destructive and expensive. Predicting aflatoxin levels prior to crop harvest is useful for minimizing the health and financial impact of a contaminated crop. Consequently, there is interest in developing a tool that predicts aflatoxin levels based on topography and soil analysis data of fig orchards. This paper describes the development of a risk assessment tool for the contamination of aflatoxin on dried figs, based on the location and altitude of the fig orchards, the population of the fungus Aspergillus spp. in the soil, and soil parameters such as pH, saturation percentage (SP), electrical conductivity (EC), organic matter, particle size analysis (sand, silt, clay), concentration of the exchangeable cations (Ca, Mg, K, Na), extractable P and trace of elements (B, Fe, Mn, Zn and Cu), by employing machine learning methods. In particular, our proposed method integrates three machine learning techniques i.e., dimensionality reduction on the original dataset (Principal Component Analysis), metric learning (Mahalanobis Metric for Clustering) and K-nearest Neighbors learning algorithm (KNN), into an enhanced model, with mean performance equal to 85% by terms of the Pearson Correlation Coefficient (PCC) between observed and predicted values.

Keywords: aflatoxins, Aspergillus spp., dried figs, k-nearest neighbors, machine learning, prediction

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 648
2781 Load Forecasting in Microgrid Systems with R and Cortana Intelligence Suite

Authors: F. Lazzeri, I. Reiter

Abstract:

Energy production optimization has been traditionally very important for utilities in order to improve resource consumption. However, load forecasting is a challenging task, as there are a large number of relevant variables that must be considered, and several strategies have been used to deal with this complex problem. This is especially true also in microgrids where many elements have to adjust their performance depending on the future generation and consumption conditions. The goal of this paper is to present a solution for short-term load forecasting in microgrids, based on three machine learning experiments developed in R and web services built and deployed with different components of Cortana Intelligence Suite: Azure Machine Learning, a fully managed cloud service that enables to easily build, deploy, and share predictive analytics solutions; SQL database, a Microsoft database service for app developers; and PowerBI, a suite of business analytics tools to analyze data and share insights. Our results show that Boosted Decision Tree and Fast Forest Quantile regression methods can be very useful to predict hourly short-term consumption in microgrids; moreover, we found that for these types of forecasting models, weather data (temperature, wind, humidity and dew point) can play a crucial role in improving the accuracy of the forecasting solution. Data cleaning and feature engineering methods performed in R and different types of machine learning algorithms (Boosted Decision Tree, Fast Forest Quantile and ARIMA) will be presented, and results and performance metrics discussed.

Keywords: Time-series, features engineering methods for forecasting, energy demand forecasting, Azure machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1290
2780 Correlation Analysis to Quantify Learning Outcomes for Different Teaching Pedagogies

Authors: Kanika Sood, Sijie Shang

Abstract:

A fundamental goal of education includes preparing students to become a part of the global workforce by making beneficial contributions to society. In this paper, we analyze student performance for multiple courses that involve different teaching pedagogies: a cooperative learning technique and an inquiry-based learning strategy. Student performance includes student engagement, grades, and attendance records. We perform this study in the Computer Science department for online and in-person courses for 450 students. We will perform correlation analysis to study the relationship between student scores and other parameters such as gender, mode of learning. We use natural language processing and machine learning to analyze student feedback data and performance data. We assess the learning outcomes of two teaching pedagogies for undergraduate and graduate courses to showcase the impact of pedagogical adoption and learning outcome as determinants of academic achievement. Early findings suggest that when using the specified pedagogies, students become experts on their topics and illustrate enhanced engagement with peers.

Keywords: Bag-of-words, cooperative learning, education, inquiry-based learning, in-person learning, Natural Language Processing, online learning, sentiment analysis, teaching pedagogy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 81
2779 Validating Condition-Based Maintenance Algorithms Through Simulation

Authors: Marcel Chevalier, Léo Dupont, Sylvain Marié, Frédérique Roffet, Elena Stolyarova, William Templier, Costin Vasile

Abstract:

Industrial end users are currently facing an increasing need to reduce the risk of unexpected failures and optimize their maintenance. This calls for both short-term analysis and long-term ageing anticipation. At Schneider Electric, we tackle those two issues using both Machine Learning and First Principles models. Machine learning models are incrementally trained from normal data to predict expected values and detect statistically significant short-term deviations. Ageing models are constructed from breaking down physical systems into sub-assemblies, then determining relevant degradation modes and associating each one to the right kinetic law. Validating such anomaly detection and maintenance models is challenging, both because actual incident and ageing data are rare and distorted by human interventions, and incremental learning depends on human feedback. To overcome these difficulties, we propose to simulate physics, systems and humans – including asset maintenance operations – in order to validate the overall approaches in accelerated time and possibly choose between algorithmic alternatives.

Keywords: Degradation models, ageing, anomaly detection, soft sensor, incremental learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 328
2778 On the Learning of Causal Relationships between Banks in Saudi Equities Market Using Ensemble Feature Selection Methods

Authors: Adel Aloraini

Abstract:

Financial forecasting using machine learning techniques has received great efforts in the last decide . In this ongoing work, we show how machine learning of graphical models will be able to infer a visualized causal interactions between different banks in the Saudi equities market. One important discovery from such learned causal graphs is how companies influence each other and to what extend. In this work, a set of graphical models named Gaussian graphical models with developed ensemble penalized feature selection methods that combine ; filtering method, wrapper method and a regularizer will be shown. A comparison between these different developed ensemble combinations will also be shown. The best ensemble method will be used to infer the causal relationships between banks in Saudi equities market.

Keywords: Causal interactions , banks, feature selection, regularizere,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1747
2777 Cognition of Driving Context for Driving Assistance

Authors: Manolo Dulva Hina, Clement Thierry, Assia Soukane, Amar Ramdane-Cherif

Abstract:

In this paper, we presented our innovative way of determining the driving context for a driving assistance system. We invoke the fusion of all parameters that describe the context of the environment, the vehicle and the driver to obtain the driving context. We created a training set that stores driving situation patterns and from which the system consults to determine the driving situation. A machine-learning algorithm predicts the driving situation. The driving situation is an input to the fission process that yields the action that must be implemented when the driver needs to be informed or assisted from the given the driving situation. The action may be directed towards the driver, the vehicle or both. This is an ongoing work whose goal is to offer an alternative driving assistance system for safe driving, green driving and comfortable driving. Here, ontologies are used for knowledge representation.

Keywords: Cognitive driving, intelligent transportation system, multimodal system, ontology, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1459
2776 Normal and Peaberry Coffee Beans Classification from Green Coffee Bean Images Using Convolutional Neural Networks and Support Vector Machine

Authors: Hira Lal Gope, Hidekazu Fukai

Abstract:

The aim of this study is to develop a system which can identify and sort peaberries automatically at low cost for coffee producers in developing countries. In this paper, the focus is on the classification of peaberries and normal coffee beans using image processing and machine learning techniques. The peaberry is not bad and not a normal bean. The peaberry is born in an only single seed, relatively round seed from a coffee cherry instead of the usual flat-sided pair of beans. It has another value and flavor. To make the taste of the coffee better, it is necessary to separate the peaberry and normal bean before green coffee beans roasting. Otherwise, the taste of total beans will be mixed, and it will be bad. In roaster procedure time, all the beans shape, size, and weight must be unique; otherwise, the larger bean will take more time for roasting inside. The peaberry has a different size and different shape even though they have the same weight as normal beans. The peaberry roasts slower than other normal beans. Therefore, neither technique provides a good option to select the peaberries. Defect beans, e.g., sour, broken, black, and fade bean, are easy to check and pick up manually by hand. On the other hand, the peaberry pick up is very difficult even for trained specialists because the shape and color of the peaberry are similar to normal beans. In this study, we use image processing and machine learning techniques to discriminate the normal and peaberry bean as a part of the sorting system. As the first step, we applied Deep Convolutional Neural Networks (CNN) and Support Vector Machine (SVM) as machine learning techniques to discriminate the peaberry and normal bean. As a result, better performance was obtained with CNN than with SVM for the discrimination of the peaberry. The trained artificial neural network with high performance CPU and GPU in this work will be simply installed into the inexpensive and low in calculation Raspberry Pi system. We assume that this system will be used in under developed countries. The study evaluates and compares the feasibility of the methods in terms of accuracy of classification and processing speed.

Keywords: Convolutional neural networks, coffee bean, peaberry, sorting, support vector machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1553
2775 Using Interval Trees for Approximate Indexing of Instances

Authors: Khalil el Hindi

Abstract:

This paper presents a simple and effective method for approximate indexing of instances for instance based learning. The method uses an interval tree to determine a good starting search point for the nearest neighbor. The search stops when an early stopping criterion is met. The method proved to be very effective especially when only the first nearest neighbor is required.

Keywords: Instance based learning, interval trees, the knn algorithm, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1510
2774 One-Class Support Vector Machine for Sentiment Analysis of Movie Review Documents

Authors: Chothmal, Basant Agarwal

Abstract:

Sentiment analysis means to classify a given review document into positive or negative polar document. Sentiment analysis research has been increased tremendously in recent times due to its large number of applications in the industry and academia. Sentiment analysis models can be used to determine the opinion of the user towards any entity or product. E-commerce companies can use sentiment analysis model to improve their products on the basis of users’ opinion. In this paper, we propose a new One-class Support Vector Machine (One-class SVM) based sentiment analysis model for movie review documents. In the proposed approach, we initially extract features from one class of documents, and further test the given documents with the one-class SVM model if a given new test document lies in the model or it is an outlier. Experimental results show the effectiveness of the proposed sentiment analysis model.

Keywords: Feature selection methods, Machine learning, NB, One-class SVM, Sentiment Analysis, Support Vector Machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3303
2773 Fine-Grained Sentiment Analysis: Recent Progress

Authors: Jie Liu, Xudong Luo, Pingping Lin, Yifan Fan

Abstract:

Facebook, Twitter, Weibo, and other social media and significant e-commerce sites generate a massive amount of online texts, which can be used to analyse people’s opinions or sentiments for better decision-making. So, sentiment analysis, especially the fine-grained sentiment analysis, is a very active research topic. In this paper, we survey various methods for fine-grained sentiment analysis, including traditional sentiment lexicon-based methods, ma-chine learning-based methods, and deep learning-based methods in aspect/target/attribute-based sentiment analysis tasks. Besides, we discuss their advantages and problems worthy of careful studies in the future.

Keywords: sentiment analysis, fine-grained, machine learning, deep learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2398
2772 Machine Morphisms and Simulation

Authors: Janis Buls

Abstract:

This paper examines the concept of simulation from a modelling viewpoint. How can one Mealy machine simulate the other one? We create formalism for simulation of Mealy machines. The injective s–morphism of the machine semigroups induces the simulation of machines [1]. We present the example of s–morphism such that it is not a homomorphism of semigroups. The story for the surjective s–morphisms is quite different. These are homomorphisms of semigroups but there exists the surjective s–morphism such that it does not induce the simulation.

Keywords: Mealy machine, simulation, machine semigroup, injective s–morphism, surjective s–morphisms.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1510