Search results for: Search engines; machine learning
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3552

Search results for: Search engines; machine learning

3492 Multi-Level Air Quality Classification in China Using Information Gain and Support Vector Machine

Authors: Bingchun Liu, Pei-Chann Chang, Natasha Huang, Dun Li

Abstract:

Machine Learning and Data Mining are the two important tools for extracting useful information and knowledge from large datasets. In machine learning, classification is a wildly used technique to predict qualitative variables and is generally preferred over regression from an operational point of view. Due to the enormous increase in air pollution in various countries especially China, Air Quality Classification has become one of the most important topics in air quality research and modelling. This study aims at introducing a hybrid classification model based on information theory and Support Vector Machine (SVM) using the air quality data of four cities in China namely Beijing, Guangzhou, Shanghai and Tianjin from Jan 1, 2014 to April 30, 2016. China's Ministry of Environmental Protection has classified the daily air quality into 6 levels namely Serious Pollution, Severe Pollution, Moderate Pollution, Light Pollution, Good and Excellent based on their respective Air Quality Index (AQI) values. Using the information theory, information gain (IG) is calculated and feature selection is done for both categorical features and continuous numeric features. Then SVM Machine Learning algorithm is implemented on the selected features with cross-validation. The final evaluation reveals that the IG and SVM hybrid model performs better than SVM (alone), Artificial Neural Network (ANN) and K-Nearest Neighbours (KNN) models in terms of accuracy as well as complexity.

Keywords: Machine learning, air quality classification, air quality index, information gain, support vector machine, cross-validation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 885
3491 Scheduling Maintenance Actions for Gas Turbines Aircraft Engines

Authors: Anis Gharbi

Abstract:

This paper considers the problem of scheduling maintenance actions for identical aircraft gas turbine engines. Each one of the turbines consists of parts which frequently require replacement. A finite inventory of spare parts is available and all parts are ready for replacement at any time. The inventory consists of both new and refurbished parts. Hence, these parts have different field lives. The goal is to find a replacement part sequencing that maximizes the time that the aircraft will keep functioning before the inventory is replenished. The problem is formulated as an identical parallel machine scheduling problem where the minimum completion time has to be maximized. Two models have been developed. The first one is an optimization model which is based on a 0-1 linear programming formulation, while the second one is an approximate procedure which consists in decomposing the problem into several two-machine subproblems. Each subproblem is optimally solved using the first model. Both models have been implemented using Lingo and have been tested on two sets of randomly generated data with up to 150 parts and 10 turbines. Experimental results show that the optimization model is able to solve only instances with no more than 4 turbines, while the decomposition procedure often provides near-optimal solutions within a maximum CPU time of 3 seconds.

Keywords: Aircraft turbines, Scheduling, Identical parallel machines, 0-1 linear programming, Heuristic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1960
3490 New Enhanced Hexagon-Based Search Using Point-Oriented Inner Search for Fast Block Motion Estimation

Authors: Lai-Man Po, Chi-Wang Ting, Ka-Ho Ng

Abstract:

Recently, an enhanced hexagon-based search (EHS) algorithm was proposed to speedup the original hexagon-based search (HS) by exploiting the group-distortion information of some evaluated points. In this paper, a second version of the EHS is proposed with a new point-oriented inner search technique which can further speedup the HS in both large and small motion environments. Experimental results show that the enhanced hexagon-based search version-2 (EHS2) is faster than the HS up to 34% with negligible PSNR degradation.

Keywords: Inner search, fast motion estimation, block-matching, hexagon search

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1385
3489 Solving Single Machine Total Weighted Tardiness Problem Using Gaussian Process Regression

Authors: Wanatchapong Kongkaew

Abstract:

This paper proposes an application of probabilistic technique, namely Gaussian process regression, for estimating an optimal sequence of the single machine with total weighted tardiness (SMTWT) scheduling problem. In this work, the Gaussian process regression (GPR) model is utilized to predict an optimal sequence of the SMTWT problem, and its solution is improved by using an iterated local search based on simulated annealing scheme, called GPRISA algorithm. The results show that the proposed GPRISA method achieves a very good performance and a reasonable trade-off between solution quality and time consumption. Moreover, in the comparison of deviation from the best-known solution, the proposed mechanism noticeably outperforms the recently existing approaches.

 

Keywords: Gaussian process regression, iterated local search, simulated annealing, single machine total weighted tardiness.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2196
3488 Learning User Keystroke Patterns for Authentication

Authors: Ying Zhao

Abstract:

Keystroke authentication is a new access control system to identify legitimate users via their typing behavior. In this paper, machine learning techniques are adapted for keystroke authentication. Seven learning methods are used to build models to differentiate user keystroke patterns. The selected classification methods are Decision Tree, Naive Bayesian, Instance Based Learning, Decision Table, One Rule, Random Tree and K-star. Among these methods, three of them are studied in more details. The results show that machine learning is a feasible alternative for keystroke authentication. Compared to the conventional Nearest Neighbour method in the recent research, learning methods especially Decision Tree can be more accurate. In addition, the experiment results reveal that 3-Grams is more accurate than 2-Grams and 4-Grams for feature extraction. Also, combination of attributes tend to result higher accuracy.

Keywords: Keystroke Authentication, Pattern recognition, MachineLearning, Instance-based Learning, Bayesian, Decision Tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2773
3487 Injury Prediction for Soccer Players Using Machine Learning

Authors: Amiel Satvedi, Richard Pyne

Abstract:

Injuries in professional sports occur on a regular basis. Some may be minor while others can cause huge impact on a player’s career and earning potential. In soccer, there is a high risk of players picking up injuries during game time. This research work seeks to help soccer players reduce the risk of getting injured by predicting the likelihood of injury while playing in the near future and then providing recommendations for intervention. The injury prediction tool will use a soccer player’s number of minutes played on the field, number of appearances, distance covered and performance data for the current and previous seasons as variables to conduct statistical analysis and provide injury predictive results using a machine learning linear regression model.

Keywords: Injury predictor, soccer injury prevention, machine learning in soccer, big data in soccer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1664
3486 Prediction of MicroRNA-Target Gene by Machine Learning Algorithms in Lung Cancer Study

Authors: Nilubon Kurubanjerdjit, Nattakarn Iam-On, Ka-Lok Ng

Abstract:

MicroRNAs are small non-coding RNA found in many different species. They play crucial roles in cancer such as biological processes of apoptosis and proliferation. The identification of microRNA-target genes can be an essential first step towards to reveal the role of microRNA in various cancer types. In this paper, we predict miRNA-target genes for lung cancer by integrating prediction scores from miRanda and PITA algorithms used as a feature vector of miRNA-target interaction. Then, machine-learning algorithms were implemented for making a final prediction. The approach developed in this study should be of value for future studies into understanding the role of miRNAs in molecular mechanisms enabling lung cancer formation.

Keywords: MicroRNA, miRNAs, lung cancer, machine learning, Naïve Bayes, SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2345
3485 Application of Machine Learning Methods to Online Test Error Detection in Semiconductor Test

Authors: Matthias Kirmse, Uwe Petersohn, Elief Paffrath

Abstract:

As in today's semiconductor industries test costs can make up to 50 percent of the total production costs, an efficient test error detection becomes more and more important. In this paper, we present a new machine learning approach to test error detection that should provide a faster recognition of test system faults as well as an improved test error recall. The key idea is to learn a classifier ensemble, detecting typical test error patterns in wafer test results immediately after finishing these tests. Since test error detection has not yet been discussed in the machine learning community, we define central problem-relevant terms and provide an analysis of important domain properties. Finally, we present comparative studies reflecting the failure detection performance of three individual classifiers and three ensemble methods based upon them. As base classifiers we chose a decision tree learner, a support vector machine and a Bayesian network, while the compared ensemble methods were simple and weighted majority vote as well as stacking. For the evaluation, we used cross validation and a specially designed practical simulation. By implementing our approach in a semiconductor test department for the observation of two products, we proofed its practical applicability.

Keywords: Ensemble methods, fault detection, machine learning, semiconductor test.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2225
3484 Missing Link Data Estimation with Recurrent Neural Network: An Application Using Speed Data of Daegu Metropolitan Area

Authors: JaeHwan Yang, Da-Woon Jeong, Seung-Young Kho, Dong-Kyu Kim

Abstract:

In terms of ITS, information on link characteristic is an essential factor for plan or operation. But in practical cases, not every link has installed sensors on it. The link that does not have data on it is called “Missing Link”. The purpose of this study is to impute data of these missing links. To get these data, this study applies the machine learning method. With the machine learning process, especially for the deep learning process, missing link data can be estimated from present link data. For deep learning process, this study uses “Recurrent Neural Network” to take time-series data of road. As input data, Dedicated Short-range Communications (DSRC) data of Dalgubul-daero of Daegu Metropolitan Area had been fed into the learning process. Neural Network structure has 17 links with present data as input, 2 hidden layers, for 1 missing link data. As a result, forecasted data of target link show about 94% of accuracy compared with actual data.

Keywords: Data Estimation, link data, machine learning, road network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1448
3483 Uplink Throughput Prediction in Cellular Mobile Networks

Authors: Engin Eyceyurt, Josko Zec

Abstract:

The current and future cellular mobile communication networks generate enormous amounts of data. Networks have become extremely complex with extensive space of parameters, features and counters. These networks are unmanageable with legacy methods and an enhanced design and optimization approach is necessary that is increasingly reliant on machine learning. This paper proposes that machine learning as a viable approach for uplink throughput prediction. LTE radio metric, such as Reference Signal Received Power (RSRP), Reference Signal Received Quality (RSRQ), and Signal to Noise Ratio (SNR) are used to train models to estimate expected uplink throughput. The prediction accuracy with high determination coefficient of 91.2% is obtained from measurements collected with a simple smartphone application.

Keywords: Drive test, LTE, machine learning, uplink throughput prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 822
3482 Gravitational Search Algorithm (GSA) Optimized SSSC Based Facts Controller to Improve Power System Oscillation Stability

Authors: Gayadhar Panda, P. K. Rautraya

Abstract:

In this paper, an investigation into the use of modified Genetic Algorithm optimized SSSC based controller to aid damping of low frequency inter-area oscillations in power systems is presented. Controller design is formulated as a nonlinear constrained optimization problem and modified Genetic Algorithm (MGA) is employed to search for the optimal controller parameters. For evaluation of effectiveness and robustness of proposed controllers, the performance was tested on multi-machine system subjected to different disturbances, loading conditions and system parameter variations. Simulation results are presented to show the fine performance of the proposed SSSC controller in damping the critical modes without significantly deteriorating the damping characteristics of other modes in multi-machine power system.

Keywords: SSSC, FACTS, Controller Design, Damping of Oscillations, Multi-machine system, Modified Genetic Algorithm (MGA).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1987
3481 Web Content Mining: A Solution to Consumer's Product Hunt

Authors: Syed Salman Ahmed, Zahid Halim, Rauf Baig, Shariq Bashir

Abstract:

With the rapid growth in business size, today's businesses orient towards electronic technologies. Amazon.com and e-bay.com are some of the major stakeholders in this regard. Unfortunately the enormous size and hugely unstructured data on the web, even for a single commodity, has become a cause of ambiguity for consumers. Extracting valuable information from such an everincreasing data is an extremely tedious task and is fast becoming critical towards the success of businesses. Web content mining can play a major role in solving these issues. It involves using efficient algorithmic techniques to search and retrieve the desired information from a seemingly impossible to search unstructured data on the Internet. Application of web content mining can be very encouraging in the areas of Customer Relations Modeling, billing records, logistics investigations, product cataloguing and quality management. In this paper we present a review of some very interesting, efficient yet implementable techniques from the field of web content mining and study their impact in the area specific to business user needs focusing both on the customer as well as the producer. The techniques we would be reviewing include, mining by developing a knowledge-base repository of the domain, iterative refinement of user queries for personalized search, using a graphbased approach for the development of a web-crawler and filtering information for personalized search using website captions. These techniques have been analyzed and compared on the basis of their execution time and relevance of the result they produced against a particular search.

Keywords: Data mining, web mining, search engines, knowledge discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2010
3480 The Influence of Preprocessing Parameters on Text Categorization

Authors: Jan Pomikalek, Radim Rehurek

Abstract:

Text categorization (the assignment of texts in natural language into predefined categories) is an important and extensively studied problem in Machine Learning. Currently, popular techniques developed to deal with this task include many preprocessing and learning algorithms, many of which in turn require tuning nontrivial internal parameters. Although partial studies are available, many authors fail to report values of the parameters they use in their experiments, or reasons why these values were used instead of others. The goal of this work then is to create a more thorough comparison of preprocessing parameters and their mutual influence, and report interesting observations and results.

Keywords: Text categorization, machine learning, electronic documents, classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1540
3479 Support Vector Machine based Intelligent Watermark Decoding for Anticipated Attack

Authors: Syed Fahad Tahir, Asifullah Khan, Abdul Majid, Anwar M. Mirza

Abstract:

In this paper, we present an innovative scheme of blindly extracting message bits from an image distorted by an attack. Support Vector Machine (SVM) is used to nonlinearly classify the bits of the embedded message. Traditionally, a hard decoder is used with the assumption that the underlying modeling of the Discrete Cosine Transform (DCT) coefficients does not appreciably change. In case of an attack, the distribution of the image coefficients is heavily altered. The distribution of the sufficient statistics at the receiving end corresponding to the antipodal signals overlap and a simple hard decoder fails to classify them properly. We are considering message retrieval of antipodal signal as a binary classification problem. Machine learning techniques like SVM is used to retrieve the message, when certain specific class of attacks is most probable. In order to validate SVM based decoding scheme, we have taken Gaussian noise as a test case. We generate a data set using 125 images and 25 different keys. Polynomial kernel of SVM has achieved 100 percent accuracy on test data.

Keywords: Bit Correct Ratio (BCR), Grid Search, Intelligent Decoding, Jackknife Technique, Support Vector Machine (SVM), Watermarking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1624
3478 A Survey in Techniques for Imbalanced Intrusion Detection System Datasets

Authors: Najmeh Abedzadeh, Matthew Jacobs

Abstract:

An intrusion detection system (IDS) is a software application that monitors malicious activities and generates alerts if any are detected. However, most network activities in IDS datasets are normal, and the relatively few numbers of attacks make the available data imbalanced. Consequently, cyber-attacks can hide inside a large number of normal activities, and machine learning algorithms have difficulty learning and classifying the data correctly. In this paper, a comprehensive literature review is conducted on different types of algorithms for both implementing the IDS and methods in correcting the imbalanced IDS dataset. The most famous algorithms are machine learning (ML), deep learning (DL), synthetic minority over-sampling technique (SMOTE), and reinforcement learning (RL). Most of the research use the CSE-CIC-IDS2017, CSE-CIC-IDS2018, and NSL-KDD datasets for evaluating their algorithms.

Keywords: IDS, intrusion detection system, imbalanced datasets, sampling algorithms, big data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1014
3477 A Study and Implementation of On-line Learning Diagnosis and Inquiry System

Authors: YuLung Wu

Abstract:

In Knowledge Structure Graph, each course unit represents a phase of learning activities. Both learning portfolios and Knowledge Structure Graphs contain learning information of students and let teachers know which content are difficulties and fails. The study purposes "Dual Mode On-line Learning Diagnosis System" that integrates two search methods: learning portfolio and knowledge structure. Teachers can operate the proposed system and obtain the information of specific students without any computer science background. The teachers can find out failed students in advance and provide remedial learning resources.

Keywords: Knowledge Structure Graph, On-line LearningDiagnosis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1414
3476 A Learning Agent for Knowledge Extraction from an Active Semantic Network

Authors: Simon Thiel, Stavros Dalakakis, Dieter Roller

Abstract:

This paper outlines the development of a learning retrieval agent. Task of this agent is to extract knowledge of the Active Semantic Network in respect to user-requests. Based on a reinforcement learning approach, the agent learns to interpret the user-s intention. Especially, the learning algorithm focuses on the retrieval of complex long distant relations. Increasing its learnt knowledge with every request-result-evaluation sequence, the agent enhances his capability in finding the intended information.

Keywords: Reinforcement learning, learning retrieval agent, search in semantic networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1450
3475 Impovement of a Label Extraction Method for a Risk Search System

Authors: Shigeaki Sakurai, Ryohei Orihara

Abstract:

This paper proposes an improvement method of classification efficiency in a classification model. The model is used in a risk search system and extracts specific labels from articles posted at bulletin board sites. The system can analyze the important discussions composed of the articles. The improvement method introduces ensemble learning methods that use multiple classification models. Also, it introduces expressions related to the specific labels into generation of word vectors. The paper applies the improvement method to articles collected from three bulletin board sites selected by users and verifies the effectiveness of the improvement method.

Keywords: Text mining, Risk search system, Corporate reputation, Bulletin board site, Ensemble learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1281
3474 A Text Classification Approach Based on Natural Language Processing and Machine Learning Techniques

Authors: Rim Messaoudi, Nogaye-Gueye Gning, François Azelart

Abstract:

Automatic text classification applies mostly natural language processing (NLP) and other artificial intelligence (AI)-guided techniques to automatically classify text in a faster and more accurate manner. This paper discusses the subject of using predictive maintenance to manage incident tickets inside the sociality. It focuses on proposing a tool that treats and analyses comments and notes written by administrators after resolving an incident ticket. The goal here is to increase the quality of these comments. Additionally, this tool is based on NLP and machine learning techniques to realize the textual analytics of the extracted data. This approach was tested using real data taken from the French National Railways (SNCF) company and was given a high-quality result.

Keywords: Machine learning, text classification, NLP techniques, semantic representation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 77
3473 A Metametadata Architecture forPedagogic Data Description

Authors: A. Ismail, M. S. Joy, J. E. Sinclair, M. I. Hamzah

Abstract:

This paper focuses on a novel method for semantic searching and retrieval of information about learning materials. Metametadata encapsulate metadata instances by using the properties and attributes provided by ontologies rather than describing learning objects. A novel metametadata taxonomy has been developed which provides the basis for a semantic search engine to extract, match and map queries to retrieve relevant results. The use of ontological views is a foundation for viewing the pedagogical content of metadata extracted from learning objects by using the pedagogical attributes from the metametadata taxonomy. Using the ontological approach and metametadata (based on the metametadata taxonomy) we present a novel semantic searching mechanism.These three strands – the taxonomy, the ontological views, and the search algorithm – are incorporated into a novel architecture (OMESCOD) which has been implemented.

Keywords: Metadata, metametadata, semantic, ontologies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1478
3472 An Analysis of Classification of Imbalanced Datasets by Using Synthetic Minority Over-Sampling Technique

Authors: Ghada A. Alfattni

Abstract:

Analysing unbalanced datasets is one of the challenges that practitioners in machine learning field face. However, many researches have been carried out to determine the effectiveness of the use of the synthetic minority over-sampling technique (SMOTE) to address this issue. The aim of this study was therefore to compare the effectiveness of the SMOTE over different models on unbalanced datasets. Three classification models (Logistic Regression, Support Vector Machine and Nearest Neighbour) were tested with multiple datasets, then the same datasets were oversampled by using SMOTE and applied again to the three models to compare the differences in the performances. Results of experiments show that the highest number of nearest neighbours gives lower values of error rates. 

Keywords: Imbalanced datasets, SMOTE, machine learning, logistic regression, support vector machine, nearest neighbour.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1281
3471 An Automated Stock Investment System Using Machine Learning Techniques: An Application in Australia

Authors: Carol Anne Hargreaves

Abstract:

A key issue in stock investment is how to select representative features for stock selection. The objective of this paper is to firstly determine whether an automated stock investment system, using machine learning techniques, may be used to identify a portfolio of growth stocks that are highly likely to provide returns better than the stock market index. The second objective is to identify the technical features that best characterize whether a stock’s price is likely to go up and to identify the most important factors and their contribution to predicting the likelihood of the stock price going up. Unsupervised machine learning techniques, such as cluster analysis, were applied to the stock data to identify a cluster of stocks that was likely to go up in price – portfolio 1. Next, the principal component analysis technique was used to select stocks that were rated high on component one and component two – portfolio 2. Thirdly, a supervised machine learning technique, the logistic regression method, was used to select stocks with a high probability of their price going up – portfolio 3. The predictive models were validated with metrics such as, sensitivity (recall), specificity and overall accuracy for all models. All accuracy measures were above 70%. All portfolios outperformed the market by more than eight times. The top three stocks were selected for each of the three stock portfolios and traded in the market for one month. After one month the return for each stock portfolio was computed and compared with the stock market index returns. The returns for all three stock portfolios was 23.87% for the principal component analysis stock portfolio, 11.65% for the logistic regression portfolio and 8.88% for the K-means cluster portfolio while the stock market performance was 0.38%. This study confirms that an automated stock investment system using machine learning techniques can identify top performing stock portfolios that outperform the stock market.

Keywords: Machine learning, stock market trading, logistic principal component analysis, automated stock investment system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1034
3470 Cirrhosis Mortality Prediction as Classification Using Frequent Subgraph Mining

Authors: Abdolghani Ebrahimi, Diego Klabjan, Chenxi Ge, Daniela Ladner, Parker Stride

Abstract:

In this work, we use machine learning and data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis are collected at a single medical center. Different machine learning models are applied to predict one-year mortality. A comprehensive feature space including demographic information, comorbidity, clinical procedure and laboratory tests is being analyzed. A temporal pattern mining technic called Frequent Subgraph Mining (FSM) is being used. Model for End-stage liver disease (MELD) prediction of mortality is used as a comparator. All of our models statistically significantly outperform the MELD-score model and show an average 10% improvement of the area under the curve (AUC). The FSM technic itself does not improve the model significantly, but FSM, together with a machine learning technique called an ensemble, further improves the model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at risk for higher mortality. However, due to the sparsity of the temporal information needed by FSM, the FSM model does not yield significant improvements. Our work applies modern machine learning algorithms and data analysis methods on predicting one-year mortality of cirrhotic patients and builds a model that predicts one-year mortality significantly more accurate than the MELD score. We have also tested the potential of FSM and provided a new perspective of the importance of clinical features.

Keywords: machine learning, liver cirrhosis, subgraph mining, supervised learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 381
3469 Machine Learning Methods for Environmental Monitoring and Flood Protection

Authors: Alexander L. Pyayt, Ilya I. Mokhov, Bernhard Lang, Valeria V. Krzhizhanovskaya, Robert J. Meijer

Abstract:

More and more natural disasters are happening every year: floods, earthquakes, volcanic eruptions, etc. In order to reduce the risk of possible damages, governments all around the world are investing into development of Early Warning Systems (EWS) for environmental applications. The most important task of the EWS is identification of the onset of critical situations affecting environment and population, early enough to inform the authorities and general public. This paper describes an approach for monitoring of flood protections systems based on machine learning methods. An Artificial Intelligence (AI) component has been developed for detection of abnormal dike behaviour. The AI module has been integrated into an EWS platform of the UrbanFlood project (EU Seventh Framework Programme) and validated on real-time measurements from the sensors installed in a dike.

Keywords: Early Warning System, intelligent environmentalmonitoring, machine learning, flood protection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4048
3468 Predictive Maintenance of Industrial Shredders: Efficient Operation through Real-Time Monitoring Using Statistical Machine Learning

Authors: Federico Pittino, Dominik Holzmann, Krithika Sayar-Chand, Stefan Moser, Sebastian Pliessnig, Thomas Arnold

Abstract:

The shredding of waste materials is a key step in the recycling process towards circular economy. Industrial shredders for waste processing operate in very harsh operating conditions, leading to the need of frequent maintenance of critical components. The maintenance optimization is particularly important also to increase the machine’s efficiency, thereby reducing the operational costs. In this work, a monitoring system has been developed and deployed on an industrial shredder located at a waste recycling plant in Austria. The machine has been monitored for several months and methods for predictive maintenance have been developed for two key components: the cutting knives and the drive belt. The large amount of collected data is leveraged by statistical machine learning techniques, thereby not requiring a very detailed knowledge of the machine or its live operating conditions. The results show that, despite the wide range of operating conditions, a reliable estimate of the optimal time for maintenance can be derived. Moreover, the trade-off between the cost of maintenance and the increase in power consumption due to the wear state of the monitored components of the machine is investigated. This work proves the benefits of real-time monitoring system for efficient operation of industrial shredders.

Keywords: predictive maintenance, circular economy, industrial shredder, cost optimization, statistical machine learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 583
3467 In Search of Excellence – Google vs Baidu

Authors: Linda, Sau-ling LAI

Abstract:

This paper compares the search engine marketing strategies adopted in China and the Western countries through two illustrative cases, namely, Google and Baidu. Marketers in the West use search engine optimization (SEO) to rank their sites higher for queries in Google. Baidu, however, offers paid search placement, or the selling of engine results for particular keywords to the higher bidders. Whereas Google has been providing innovative services ranging from Google Map to Google Blog, Baidu remains focused on search services – the one that it does best. The challenges and opportunities of the Chinese Internet market offered to global entrepreneurs are also discussed in the paper

Keywords: Search Engine, Web analytics, Google, Baidu

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2402
3466 Machine Learning for Aiding Meningitis Diagnosis in Pediatric Patients

Authors: Karina Zaccari, Ernesto Cordeiro Marujo

Abstract:

This paper presents a Machine Learning (ML) approach to support Meningitis diagnosis in patients at a children’s hospital in Sao Paulo, Brazil. The aim is to use ML techniques to reduce the use of invasive procedures, such as cerebrospinal fluid (CSF) collection, as much as possible. In this study, we focus on predicting the probability of Meningitis given the results of a blood and urine laboratory tests, together with the analysis of pain or other complaints from the patient. We tested a number of different ML algorithms, including: Adaptative Boosting (AdaBoost), Decision Tree, Gradient Boosting, K-Nearest Neighbors (KNN), Logistic Regression, Random Forest and Support Vector Machines (SVM). Decision Tree algorithm performed best, with 94.56% and 96.18% accuracy for training and testing data, respectively. These results represent a significant aid to doctors in diagnosing Meningitis as early as possible and in preventing expensive and painful procedures on some children.

Keywords: Machine learning, medical diagnosis, meningitis detection, gradient boosting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1038
3465 Movie Genre Preference Prediction Using Machine Learning for Customer-Based Information

Authors: Haifeng Wang, Haili Zhang

Abstract:

Most movie recommendation systems have been developed for customers to find items of interest. This work introduces a predictive model usable by small and medium-sized enterprises (SMEs) who are in need of a data-based and analytical approach to stock proper movies for local audiences and retain more customers. We used classification models to extract features from thousands of customers’ demographic, behavioral and social information to predict their movie genre preference. In the implementation, a Gaussian kernel support vector machine (SVM) classification model and a logistic regression model were established to extract features from sample data and their test error-in-sample were compared. Comparison of error-out-sample was also made under different Vapnik–Chervonenkis (VC) dimensions in the machine learning algorithm to find and prevent overfitting. Gaussian kernel SVM prediction model can correctly predict movie genre preferences in 85% of positive cases. The accuracy of the algorithm increased to 93% with a smaller VC dimension and less overfitting. These findings advance our understanding of how to use machine learning approach to predict customers’ preferences with a small data set and design prediction tools for these enterprises.

Keywords: Computational social science, movie preference, machine learning, SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1601
3464 Integration of Big Data to Predict Transportation for Smart Cities

Authors: Sun-Young Jang, Sung-Ah Kim, Dongyoun Shin

Abstract:

The Intelligent transportation system is essential to build smarter cities. Machine learning based transportation prediction could be highly promising approach by delivering invisible aspect visible. In this context, this research aims to make a prototype model that predicts transportation network by using big data and machine learning technology. In detail, among urban transportation systems this research chooses bus system.  The research problem that existing headway model cannot response dynamic transportation conditions. Thus, bus delay problem is often occurred. To overcome this problem, a prediction model is presented to fine patterns of bus delay by using a machine learning implementing the following data sets; traffics, weathers, and bus statues. This research presents a flexible headway model to predict bus delay and analyze the result. The prototyping model is composed by real-time data of buses. The data are gathered through public data portals and real time Application Program Interface (API) by the government. These data are fundamental resources to organize interval pattern models of bus operations as traffic environment factors (road speeds, station conditions, weathers, and bus information of operating in real-time). The prototyping model is designed by the machine learning tool (RapidMiner Studio) and conducted tests for bus delays prediction. This research presents experiments to increase prediction accuracy for bus headway by analyzing the urban big data. The big data analysis is important to predict the future and to find correlations by processing huge amount of data. Therefore, based on the analysis method, this research represents an effective use of the machine learning and urban big data to understand urban dynamics.

Keywords: Big data, bus headway prediction, machine learning, public transportation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1520
3463 Socioculture and Cognitivist Perspectives on Language and Communication Barriers in Learning

Authors: David Hallberg

Abstract:

It is believed that major account on language diversity must be taken in learning, and especially in learning using ICT. This paper-s objective is to exhibit language and communication barriers in learning, to approach the topic from socioculture and cognitivist perspectives, and to give exploratory solutions of handling such barriers. The review is mainly conducted by approaching the journal Computers & Education, but also an initially broad search was conducted. The results show that not much attention is paid on language and communication barriers in an immediate relation to learning using ICT. The results shows, inter alia, that language and communication barriers are caused because of not enough account is taken on both the individual-s background and the technology.

Keywords: communication barriers, cognitive, ICT, language barriers, learning, socioculture

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2308