Search results for: unsupervised classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1179

Search results for: unsupervised classification

579 Real-time Network Anomaly Detection Systems Based on Machine-Learning Algorithms

Authors: Zahra Ramezanpanah, Joachim Carvallo, Aurelien Rodriguez

Abstract:

This paper aims to detect anomalies in streaming data using machine learning algorithms. In this regard, we designed two separate pipelines and evaluated the effectiveness of each separately. The first pipeline, based on supervised machine learning methods, consists of two phases. In the first phase, we trained several supervised models using the UNSW-NB15 data set. We measured the efficiency of each using different performance metrics and selected the best model for the second phase. At the beginning of the second phase, we first, using Argus Server, sniffed a local area network. Several types of attacks were simulated and then sent the sniffed data to a running algorithm at short intervals. This algorithm can display the results of each packet of received data in real-time using the trained model. The second pipeline presented in this paper is based on unsupervised algorithms, in which a Temporal Graph Network (TGN) is used to monitor a local network. The TGN is trained to predict the probability of future states of the network based on its past behavior. Our contribution in this section is introducing an indicator to identify anomalies from these predicted probabilities.

Keywords: Cyber-security, Intrusion Detection Systems, Temporal Graph Network, Anomaly Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 505
578 Breast Cancer Survivability Prediction via Classifier Ensemble

Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia

Abstract:

This paper presents a classifier ensemble approach for predicting the survivability of the breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components; features selection and classifier ensemble components. The features selection component divides the features in SEER database into four groups. After that it tries to find the most important features among the four groups that maximizes the weighted average F-score of a certain classification algorithm. The ensemble component uses three different classifiers, each of which models different set of features from SEER through the features selection module. On top of them, another classifier is used to give the final decision based on the output decisions and confidence scores from each of the underlying classifiers. Different classification algorithms have been examined; the best setup found is by using the decision tree, Bayesian network, and Na¨ıve Bayes algorithms for the underlying classifiers and Na¨ıve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same data of SEER (period of 1973-2002). It gives 87.39% weighted average F-score compared to 85.82% and 81.34% of the other published systems. By increasing the data size to cover the whole database (period of 1973-2014), the overall weighted average F-score jumps to 92.4% on the held out unseen test set.

Keywords: Classifier ensemble, breast cancer survivability, data mining, SEER.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1671
577 Fetal and Infant Mortality in Botucatu City, São Paulo State, Brazil: Evaluation of Maternal - Infant Health Care

Authors: Noda L. M., Salvador I. C, C. M. L. G. Parada, Fonseca C. R. B.

Abstract:

In Brazil, neonatal mortality rate is considered incompatible with the country development conditions, and has been a Public Health concern. Reduction in infant mortality rates has also been part of the Millennium Development Goals, a commitment made by countries, members of the Organization of United Nations (OUN), including Brazil. Fetal mortality rate is considered a highly sensitive indicator of health care quality. Suitable actions, such as good quality and access to health services may contribute positively towards reduction in these fetal and neonatal rates. With appropriate antenatal follow-up and health care during gestation and delivery, some death causes could be reduced or even prevented by means of early diagnosis and intervention, as well as changes in risk factors and interventions. Objectives: To study the quality of maternal and infant health care based on fetal and neonatal mortality, as well as the possible actions to prevent those deaths in Botucatu (Brazil). Methods: Classification of prevention according to the International Classification of Diseases and the modified Wigglesworth´s classification. In order to evaluate adequacy, indicators of quality of antenatal and delivery care were established by the authors. Results: Considering fetal deaths, 56.7% of them occurred before delivery, which reveals possible shortcomings in antenatal care, and 38.2% of them were a result of intra- labor changes, which could be prevented or reduced by adequate obstetric management. These findings were different from those in the group of early neonatal deaths which were also studied. Adequacy of health services showed that antenatal and childbirth care was appropriate for 24% and 33.3% of pregnant women, respectively, which corroborates the results of prevention. These results revealed that shortcomings in obstetric and antenatal care could be the causes of deaths in the study. Early and late neonatal deaths have similar characteristics: 76% could be prevented or reduced mainly by adequate newborn care (52.9%) and adequate health care for gestational women (11.7%). When adequacy of care was evaluated, childbirth and newborn care was adequate in 25.8% and antenatal care was adequate in 16.1%. In conclusion, direct relationship was found between adequacy and quality of care rendered to pregnant women and newborns, and fetal and infant mortality. Moreover, our findings highlight that deaths could be prevented by an adequate obstetric and neonatal management.

Keywords: Fetal Mortality, Infant Mortality, Maternal-Child Health Services, Program Evaluation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5069
576 A Review: Comparative Analysis of Different Categorical Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Over the past epoch a rampant amount of work has been done in the data clustering research under the unsupervised learning technique in Data mining. Furthermore several algorithms and methods have been proposed focusing on clustering different data types, representation of cluster models, and accuracy rates of the clusters. However no single clustering algorithm proves to be the most efficient in providing best results. Accordingly in order to find the solution to this issue a new technique, called Cluster ensemble method was bloomed. This cluster ensemble is a good alternative approach for facing the cluster analysis problem. The main hope of the cluster ensemble is to merge different clustering solutions in such a way to achieve accuracy and to improve the quality of individual data clustering. Due to the substantial and unremitting development of new methods in the sphere of data mining and also the incessant interest in inventing new algorithms, makes obligatory to scrutinize a critical analysis of the existing techniques and the future novelty. This paper exposes the comparative study of different cluster ensemble methods along with their features, systematic working process and the average accuracy and error rates of each ensemble methods. Consequently this speculative and comprehensive analysis will be very useful for the community of clustering practitioners and also helps in deciding the most suitable one to rectify the problem in hand.

Keywords: Clustering, Cluster Ensemble methods, Co-association matrix, Consensus function, Median partition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2604
575 Diagnosis of the Heart Rhythm Disorders by Using Hybrid Classifiers

Authors: Sule Yucelbas, Gulay Tezel, Cuneyt Yucelbas, Seral Ozsen

Abstract:

In this study, it was tried to identify some heart rhythm disorders by electrocardiography (ECG) data that is taken from MIT-BIH arrhythmia database by subtracting the required features, presenting to artificial neural networks (ANN), artificial immune systems (AIS), artificial neural network based on artificial immune system (AIS-ANN) and particle swarm optimization based artificial neural network (PSO-NN) classifier systems. The main purpose of this study is to evaluate the performance of hybrid AIS-ANN and PSO-ANN classifiers with regard to the ANN and AIS. For this purpose, the normal sinus rhythm (NSR), atrial premature contraction (APC), sinus arrhythmia (SA), ventricular trigeminy (VTI), ventricular tachycardia (VTK) and atrial fibrillation (AF) data for each of the RR intervals were found. Then these data in the form of pairs (NSR-APC, NSR-SA, NSR-VTI, NSR-VTK and NSR-AF) is created by combining discrete wavelet transform which is applied to each of these two groups of data and two different data sets with 9 and 27 features were obtained from each of them after data reduction. Afterwards, the data randomly was firstly mixed within themselves, and then 4-fold cross validation method was applied to create the training and testing data. The training and testing accuracy rates and training time are compared with each other.

As a result, performances of the hybrid classification systems, AIS-ANN and PSO-ANN were seen to be close to the performance of the ANN system. Also, the results of the hybrid systems were much better than AIS, too. However, ANN had much shorter period of training time than other systems. In terms of training times, ANN was followed by PSO-ANN, AIS-ANN and AIS systems respectively. Also, the features that extracted from the data affected the classification results significantly.

Keywords: AIS, ANN, ECG, hybrid classifiers, PSO.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1916
574 AI-Based Techniques for Online Social Media Network Sentiment Analysis: A Methodical Review

Authors: A. M. John-Otumu, M. M. Rahman, O. C. Nwokonkwo, M. C. Onuoha

Abstract:

Online social media networks have long served as a primary arena for group conversations, gossip, text-based information sharing and distribution. The use of natural language processing techniques for text classification and unbiased decision making has not been far-fetched. Proper classification of these textual information in a given context has also been very difficult. As a result, a systematic review was conducted from previous literature on sentiment classification and AI-based techniques. The study was done in order to gain a better understanding of the process of designing and developing a robust and more accurate sentiment classifier that could correctly classify social media textual information of a given context between hate speech and inverted compliments with a high level of accuracy using the knowledge gain from the evaluation of different artificial intelligence techniques reviewed. The study evaluated over 250 articles from digital sources like ACM digital library, Google Scholar, and IEEE Xplore; and whittled down the number of research to 52 articles. Findings revealed that deep learning approaches such as Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Bidirectional Encoder Representations from Transformer (BERT), and Long Short-Term Memory (LSTM) outperformed various machine learning techniques in terms of performance accuracy. A large dataset is also required to develop a robust sentiment classifier. Results also revealed that data can be obtained from places like Twitter, movie reviews, Kaggle, Stanford Sentiment Treebank (SST), and SemEval Task4 based on the required domain. The hybrid deep learning techniques like CNN+LSTM, CNN+ Gated Recurrent Unit (GRU), CNN+BERT outperformed single deep learning techniques and machine learning techniques. Python programming language outperformed Java programming language in terms of development simplicity and AI-based library functionalities. Finally, the study recommended the findings obtained for building robust sentiment classifier in the future.

Keywords: Artificial Intelligence, Natural Language Processing, Sentiment Analysis, Social Network, Text.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 594
573 Cheiloscopy and Dactylography in Relation to ABO Blood Groups: Egyptian vs. Malay Populations

Authors: Manal Hassan Abdel Aziz, Fatma Mohamed Magdy Badr El Dine, Nourhan Mohamed Mohamed Saeed

Abstract:

Establishing association between lip print patterns and those of fingerprints as well as blood groups is of fundamental importance in the forensic identification domain. The first aim of the current study was to determine the prevalent types of ABO blood groups, lip prints and fingerprints patterns in both studied populations. Secondly, to analyze any relation found between the different print patterns and the blood groups, which would be valuable in identification purposes. The present study was conducted on 60 healthy volunteers, (30 males and 30 females) from each of the studied population. Lip prints and fingerprints were obtained and classified according to Tsuchihashi's classification and Michael Kuchen’s classification, respectively. The results show that the ulnar loop was the most frequent among both populations. Blood group A was the most frequent among Egyptians, while blood groups O and B were the predominant among Malaysians. Significant relations were observed between lip print patterns and fingerprint (in the second quadrant for Egyptian males and the first one for Malaysian). For Malaysian females, a statistically significant association was proved in the fourth quadrant. Regarding the blood groups, 89.5% of ulnar loops were significantly related to blood group A among Egyptian males. The results proved an association between the fingerprint pattern and the lip prints, as well as between the ABO blood group and the pattern of fingerprints. However, further researches with larger sample sizes need to be directed to approve the current results.

Keywords: ABO, cheiloscopy, dactylography, Egyptians, Malaysians.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 884
572 An Automatic Bayesian Classification System for File Format Selection

Authors: Roman Graf, Sergiu Gordea, Heather M. Ryan

Abstract:

This paper presents an approach for the classification of an unstructured format description for identification of file formats. The main contribution of this work is the employment of data mining techniques to support file format selection with just the unstructured text description that comprises the most important format features for a particular organisation. Subsequently, the file format indentification method employs file format classifier and associated configurations to support digital preservation experts with an estimation of required file format. Our goal is to make use of a format specification knowledge base aggregated from a different Web sources in order to select file format for a particular institution. Using the naive Bayes method, the decision support system recommends to an expert, the file format for his institution. The proposed methods facilitate the selection of file format and the quality of a digital preservation process. The presented approach is meant to facilitate decision making for the preservation of digital content in libraries and archives using domain expert knowledge and specifications of file formats. To facilitate decision-making, the aggregated information about the file formats is presented as a file format vocabulary that comprises most common terms that are characteristic for all researched formats. The goal is to suggest a particular file format based on this vocabulary for analysis by an expert. The sample file format calculation and the calculation results including probabilities are presented in the evaluation section.

Keywords: Data mining, digital libraries, digital preservation, file format.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1660
571 sEMG Interface Design for Locomotion Identification

Authors: Rohit Gupta, Ravinder Agarwal

Abstract:

Surface electromyographic (sEMG) signal has the potential to identify the human activities and intention. This potential is further exploited to control the artificial limbs using the sEMG signal from residual limbs of amputees. The paper deals with the development of multichannel cost efficient sEMG signal interface for research application, along with evaluation of proposed class dependent statistical approach of the feature selection method. The sEMG signal acquisition interface was developed using ADS1298 of Texas Instruments, which is a front-end interface integrated circuit for ECG application. Further, the sEMG signal is recorded from two lower limb muscles for three locomotions namely: Plane Walk (PW), Stair Ascending (SA), Stair Descending (SD). A class dependent statistical approach is proposed for feature selection and also its performance is compared with 12 preexisting feature vectors. To make the study more extensive, performance of five different types of classifiers are compared. The outcome of the current piece of work proves the suitability of the proposed feature selection algorithm for locomotion recognition, as compared to other existing feature vectors. The SVM Classifier is found as the outperformed classifier among compared classifiers with an average recognition accuracy of 97.40%. Feature vector selection emerges as the most dominant factor affecting the classification performance as it holds 51.51% of the total variance in classification accuracy. The results demonstrate the potentials of the developed sEMG signal acquisition interface along with the proposed feature selection algorithm.

Keywords: Classifiers, feature selection, locomotion, sEMG.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1491
570 Multi-Temporal Urban Land Cover Mapping Using Spectral Indices

Authors: Mst Ilme Faridatul, Bo Wu

Abstract:

Multi-temporal urban land cover mapping is of paramount importance for monitoring urban sprawl and managing the ecological environment. For diversified urban activities, it is challenging to map land covers in a complex urban environment. Spectral indices have proved to be effective for mapping urban land covers. To improve multi-temporal urban land cover classification and mapping, we evaluate the performance of three spectral indices, e.g. modified normalized difference bare-land index (MNDBI), tasseled cap water and vegetation index (TCWVI) and shadow index (ShDI). The MNDBI is developed to evaluate its performance of enhancing urban impervious areas by separating bare lands. A tasseled cap index, TCWVI is developed to evaluate its competence to detect vegetation and water simultaneously. The ShDI is developed to maximize the spectral difference between shadows of skyscrapers and water and enhance water detection. First, this paper presents a comparative analysis of three spectral indices using Landsat Enhanced Thematic Mapper (ETM), Thematic Mapper (TM) and Operational Land Imager (OLI) data. Second, optimized thresholds of the spectral indices are imputed to classify land covers, and finally, their performance of enhancing multi-temporal urban land cover mapping is assessed. The results indicate that the spectral indices are competent to enhance multi-temporal urban land cover mapping and achieves an overall classification accuracy of 93-96%.

Keywords: Land cover, mapping, multi-temporal, spectral indices.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1111
569 Statistical Feature Extraction Method for Wood Species Recognition System

Authors: Mohd Iz'aan Paiz Bin Zamri, Anis Salwa Mohd Khairuddin, Norrima Mokhtar, Rubiyah Yusof

Abstract:

Effective statistical feature extraction and classification are important in image-based automatic inspection and analysis. An automatic wood species recognition system is designed to perform wood inspection at custom checkpoints to avoid mislabeling of timber which will results to loss of income to the timber industry. The system focuses on analyzing the statistical pores properties of the wood images. This paper proposed a fuzzy-based feature extractor which mimics the experts’ knowledge on wood texture to extract the properties of pores distribution from the wood surface texture. The proposed feature extractor consists of two steps namely pores extraction and fuzzy pores management. The total number of statistical features extracted from each wood image is 38 features. Then, a backpropagation neural network is used to classify the wood species based on the statistical features. A comprehensive set of experiments on a database composed of 5200 macroscopic images from 52 tropical wood species was used to evaluate the performance of the proposed feature extractor. The advantage of the proposed feature extraction technique is that it mimics the experts’ interpretation on wood texture which allows human involvement when analyzing the wood texture. Experimental results show the efficiency of the proposed method.

Keywords: Classification, fuzzy, inspection system, image analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1743
568 Learning Classifier Systems Approach for Automated Discovery of Crisp and Fuzzy Hierarchical Production Rules

Authors: Suraiya Jabin, Kamal K. Bharadwaj

Abstract:

This research presents a system for post processing of data that takes mined flat rules as input and discovers crisp as well as fuzzy hierarchical structures using Learning Classifier System approach. Learning Classifier System (LCS) is basically a machine learning technique that combines evolutionary computing, reinforcement learning, supervised or unsupervised learning and heuristics to produce adaptive systems. A LCS learns by interacting with an environment from which it receives feedback in the form of numerical reward. Learning is achieved by trying to maximize the amount of reward received. Crisp description for a concept usually cannot represent human knowledge completely and practically. In the proposed Learning Classifier System initial population is constructed as a random collection of HPR–trees (related production rules) and crisp / fuzzy hierarchies are evolved. A fuzzy subsumption relation is suggested for the proposed system and based on Subsumption Matrix (SM), a suitable fitness function is proposed. Suitable genetic operators are proposed for the chosen chromosome representation method. For implementing reinforcement a suitable reward and punishment scheme is also proposed. Experimental results are presented to demonstrate the performance of the proposed system.

Keywords: Hierarchical Production Rule, Data Mining, Learning Classifier System, Fuzzy Subsumption Relation, Subsumption matrix, Reinforcement Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1456
567 Assessment of Urban Heat Island through Remote Sensing in Nagpur Urban Area Using Landsat 7 ETM+ Satellite Images

Authors: Meenal Surawar, Rajashree Kotharkar

Abstract:

Urban Heat Island (UHI) is found more pronounced as a prominent urban environmental concern in developing cities. To study the UHI effect in the Indian context, the Nagpur urban area has been explored in this paper using Landsat 7 ETM+ satellite images through Remote Sensing and GIS techniques. This paper intends to study the effect of LU/LC pattern on daytime Land Surface Temperature (LST) variation, contributing UHI formation within the Nagpur Urban area. Supervised LU/LC area classification was carried to study urban Change detection using ENVI 5. Change detection has been studied by carrying Normalized Difference Vegetation Index (NDVI) to understand the proportion of vegetative cover with respect to built-up ratio. Detection of spectral radiance from the thermal band of satellite images was processed to calibrate LST. Specific representative areas on the basis of urban built-up and vegetation classification were selected for observation of point LST. The entire Nagpur urban area shows that, as building density increases with decrease in vegetation cover, LST increases, thereby causing the UHI effect. UHI intensity has gradually increased by 0.7°C from 2000 to 2006; however, a drastic increase has been observed with difference of 1.8°C during the period 2006 to 2013. Within the Nagpur urban area, the UHI effect was formed due to increase in building density and decrease in vegetative cover.

Keywords: Land use, land cover, land surface temperature, remote sensing, urban heat island.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2608
566 Government of Ghana’s Budget: Its Functions, Coverage, Classification, and Integration with Chart of Accounts

Authors: Mohammed Sani Abdulai

Abstract:

Government budgets are the primary instruments for formulating and implementing a country’s fiscal policy objectives, development priorities, and the overall socio-economic aspirations of its people. Thus, in this paper, the author examined the Government of Ghana’s budgets with respect to their functions, coverage, classifications, and integration with the country’s chart of accounts. The author did so by amalgamating the research findings of extant literature with (a) the operational and procedural guidelines underpinning the formulation and execution of the government’s budgets; (b) the recommendations made by various development partners and thinktanks on reforming the country’s budgeting processes and procedures; and (c) the lessons Ghana could learn from the budget reform efforts of other countries. By way of research findings, the paper showed that the Government of Ghana’s budgets in terms of function are both eclectic and multidimensional. On coverage, the paper showed that the country’s budgets duly cover the revenues and expenditures of the general government (i.e., both the central and sub-national governments). Finally, on classifications, the paper noted with delight the Government of Ghana’s effort in providing classificatory codes to both its national development agenda and such international development goals as the AU’s Agenda 2063 and the UN’s Sustainable Development Goals. However, the paper found some significant lapses that require a complete overhaul and structuring on the integrations of its budget classifications with its chart of accounts. Thus, the paper concluded with a detailed examination of the challenges confronting the country’s current chart of accounts and recommendations for addressing them.

Keywords: Budget, budgetary transactions, budgetary governance, Chart of Accounts, classification, composition, coverage, Public Financial Management.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 515
565 Markov Random Field-Based Segmentation Algorithm for Detection of Land Cover Changes Using Uninhabited Aerial Vehicle Synthetic Aperture Radar Polarimetric Images

Authors: Mehrnoosh Omati, Mahmod Reza Sahebi

Abstract:

The information on land use/land cover changing plays an essential role for environmental assessment, planning and management in regional development. Remotely sensed imagery is widely used for providing information in many change detection applications. Polarimetric Synthetic aperture radar (PolSAR) image, with the discrimination capability between different scattering mechanisms, is a powerful tool for environmental monitoring applications. This paper proposes a new boundary-based segmentation algorithm as a fundamental step for land cover change detection. In this method, first, two PolSAR images are segmented using integration of marker-controlled watershed algorithm and coupled Markov random field (MRF). Then, object-based classification is performed to determine changed/no changed image objects. Compared with pixel-based support vector machine (SVM) classifier, this novel segmentation algorithm significantly reduces the speckle effect in PolSAR images and improves the accuracy of binary classification in object-based level. The experimental results on Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR) polarimetric images show a 3% and 6% improvement in overall accuracy and kappa coefficient, respectively. Also, the proposed method can correctly distinguish homogeneous image parcels.

Keywords: Coupled Markov random field, environment, object-based analysis, Polarimetric SAR images.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 863
564 Identification of Spam Keywords Using Hierarchical Category in C2C E-commerce

Authors: Shao Bo Cheng, Yong-Jin Han, Se Young Park, Seong-Bae Park

Abstract:

Consumer-to-Consumer (C2C) E-commerce has been growing at a very high speed in recent years. Since identical or nearly-same kinds of products compete one another by relying on keyword search in C2C E-commerce, some sellers describe their products with spam keywords that are popular but are not related to their products. Though such products get more chances to be retrieved and selected by consumers than those without spam keywords, the spam keywords mislead the consumers and waste their time. This problem has been reported in many commercial services like ebay and taobao, but there have been little research to solve this problem. As a solution to this problem, this paper proposes a method to classify whether keywords of a product are spam or not. The proposed method assumes that a keyword for a given product is more reliable if the keyword is observed commonly in specifications of products which are the same or the same kind as the given product. This is because that a hierarchical category of a product in general determined precisely by a seller of the product and so is the specification of the product. Since higher layers of the hierarchical category represent more general kinds of products, a reliable degree is differently determined according to the layers. Hence, reliable degrees from different layers of a hierarchical category become features for keywords and they are used together with features only from specifications for classification of the keywords. Support Vector Machines are adopted as a basic classifier using the features, since it is powerful, and widely used in many classification tasks. In the experiments, the proposed method is evaluated with a golden standard dataset from Yi-han-wang, a Chinese C2C E-commerce, and is compared with a baseline method that does not consider the hierarchical category. The experimental results show that the proposed method outperforms the baseline in F1-measure, which proves that spam keywords are effectively identified by a hierarchical category in C2C E-commerce.

Keywords: Spam Keyword, E-commerce, keyword features, spam filtering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2508
563 Classifying Students for E-Learning in Information Technology Course Using ANN

Authors: S. Areerachakul, N. Ployong, S. Na Songkla

Abstract:

This research’s objective is to select the model with most accurate value by using Neural Network Technique as a way to filter potential students who enroll in IT course by Electronic learning at Suan Suanadha Rajabhat University. It is designed to help students selecting the appropriate courses by themselves. The result showed that the most accurate model was 100 Folds Cross-validation which had 73.58% points of accuracy.

Keywords: Artificial neural network, classification, students.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1498
562 Evaluation of Ensemble Classifiers for Intrusion Detection

Authors: M. Govindarajan

Abstract:

One of the major developments in machine learning in the past decade is the ensemble method, which finds highly accurate classifier by combining many moderately accurate component classifiers. In this research work, new ensemble classification methods are proposed with homogeneous ensemble classifier using bagging and heterogeneous ensemble classifier using arcing and their performances are analyzed in terms of accuracy. A Classifier ensemble is designed using Radial Basis Function (RBF) and Support Vector Machine (SVM) as base classifiers. The feasibility and the benefits of the proposed approaches are demonstrated by the means of standard datasets of intrusion detection. The main originality of the proposed approach is based on three main parts: preprocessing phase, classification phase, and combining phase. A wide range of comparative experiments is conducted for standard datasets of intrusion detection. The performance of the proposed homogeneous and heterogeneous ensemble classifiers are compared to the performance of other standard homogeneous and heterogeneous ensemble methods. The standard homogeneous ensemble methods include Error correcting output codes, Dagging and heterogeneous ensemble methods include majority voting, stacking. The proposed ensemble methods provide significant improvement of accuracy compared to individual classifiers and the proposed bagged RBF and SVM performs significantly better than ECOC and Dagging and the proposed hybrid RBF-SVM performs significantly better than voting and stacking. Also heterogeneous models exhibit better results than homogeneous models for standard datasets of intrusion detection. 

Keywords: Data mining, ensemble, radial basis function, support vector machine, accuracy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1700
561 Predictors of Social Participation of Children with Cerebral Palsy in Primary Schools in Czech Republic

Authors: Marija Zulić, Vanda Hájková, Nina Brkić-Jovanović, Linda Rathousová, Sanja Tomić

Abstract:

Cerebral palsy is primarily reflected in the disorder of the development of movement and posture, which may be accompanied by sensory disturbances, disturbances of perception, cognition and communication, behavioural disorders and epilepsy. According to current inclusive attitudes towards people with disabilities implies that full social participation of children with cerebral palsy means inclusion in all activities in family, peer, school and leisure environments in the same scope and to the same extent as is the case with the children of proper development and without physical difficulties. Due to the fact that it has been established that the quality of children's participation in primary school is directly related to their social inclusion in future life, the aim of the paper is to identify predictors of social participation, respectively, and in particular, factors that could to improve the quality of social participation of children with cerebral palsy, in the primary school environment in Czech Republic. The study includes children with cerebral palsy (n = 75) in the Czech Republic, aged between six and 12 years who attend mainstream or special primary schools to the sixth grade. The main instrument used was the first and third part of the School function assessment questionnaire. It will also take into account the type of damage assessed according to a scale the Gross motor function classification system, five–level classification system for cerebral palsy. The research results will provide detailed insight into the degree of social participation of children with cerebral palsy and the factors that would be a potential cause of their levels of participation, in regular and special primary schools, in different socioeconomic environments in Czech Republic.

Keywords: Cerebral palsy, social participation, Czech Republic, school function assessment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1246
560 The Benefits of End-To-End Integrated Planning from the Mine to Client Supply for Minimizing Penalties

Authors: G. Martino, F. Silva, E. Marchal

Abstract:

The control over delivered iron ore blend characteristics is one of the most important aspects of the mining business. The iron ore price is a function of its composition, which is the outcome of the beneficiation process. So, end-to-end integrated planning of mine operations can reduce risks of penalties on the iron ore price. In a standard iron mining company, the production chain is composed of mining, ore beneficiation, and client supply. When mine planning and client supply decisions are made uncoordinated, the beneficiation plant struggles to deliver the best blend possible. Technological improvements in several fields allowed bridging the gap between departments and boosting integrated decision-making processes. Clusterization and classification algorithms over historical production data generate reasonable previsions for quality and volume of iron ore produced for each pile of run-of-mine (ROM) processed. Mathematical modeling can use those deterministic relations to propose iron ore blends that better-fit specifications within a delivery schedule. Additionally, a model capable of representing the whole production chain can clearly compare the overall impact of different decisions in the process. This study shows how flexibilization combined with a planning optimization model between the mine and the ore beneficiation processes can reduce risks of out of specification deliveries. The model capabilities are illustrated on a hypothetical iron ore mine with magnetic separation process. Finally, this study shows ways of cost reduction or profit increase by optimizing process indicators across the production chain and integrating the different plannings with the sales decisions.

Keywords: Clusterization and classification algorithms, integrated planning, optimization, mathematical modeling, penalty minimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 645
559 Information Retrieval: Improving Question Answering Systems by Query Reformulation and Answer Validation

Authors: Mohammad Reza Kangavari, Samira Ghandchi, Manak Golpour

Abstract:

Question answering (QA) aims at retrieving precise information from a large collection of documents. Most of the Question Answering systems composed of three main modules: question processing, document processing and answer processing. Question processing module plays an important role in QA systems to reformulate questions. Moreover answer processing module is an emerging topic in QA systems, where these systems are often required to rank and validate candidate answers. These techniques aiming at finding short and precise answers are often based on the semantic relations and co-occurrence keywords. This paper discussed about a new model for question answering which improved two main modules, question processing and answer processing which both affect on the evaluation of the system operations. There are two important components which are the bases of the question processing. First component is question classification that specifies types of question and answer. Second one is reformulation which converts the user's question into an understandable question by QA system in a specific domain. The objective of an Answer Validation task is thus to judge the correctness of an answer returned by a QA system, according to the text snippet given to support it. For validating answers we apply candidate answer filtering, candidate answer ranking and also it has a final validation section by user voting. Also this paper described new architecture of question and answer processing modules with modeling, implementing and evaluating the system. The system differs from most question answering systems in its answer validation model. This module makes it more suitable to find exact answer. Results show that, from total 50 asked questions, evaluation of the model, show 92% improving the decision of the system.

Keywords: Answer processing, answer validation, classification, question answering, query reformulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2847
558 A BERT-Based Model for Financial Social Media Sentiment Analysis

Authors: Josiel Delgadillo, Johnson Kinyua, Charles Mutigwe

Abstract:

The purpose of sentiment analysis is to determine the sentiment strength (e.g., positive, negative, neutral) from a textual source for good decision-making. Natural Language Processing (NLP) in domains such as financial markets requires knowledge of domain ontology, and pre-trained language models, such as BERT, have made significant breakthroughs in various NLP tasks by training on large-scale un-labeled generic corpora such as Wikipedia. However, sentiment analysis is a strong domain-dependent task. The rapid growth of social media has given users a platform to share their experiences and views about products, services, and processes, including financial markets. StockTwits and Twitter are social networks that allow the public to express their sentiments in real time. Hence, leveraging the success of unsupervised pre-training and a large amount of financial text available on social media platforms could potentially benefit a wide range of financial applications. This work is focused on sentiment analysis using social media text on platforms such as StockTwits and Twitter. To meet this need, SkyBERT, a domain-specific language model pre-trained and fine-tuned on financial corpora, has been developed. The results show that SkyBERT outperforms current state-of-the-art models in financial sentiment analysis. Extensive experimental results demonstrate the effectiveness and robustness of SkyBERT.

Keywords: BERT, financial markets, Twitter, sentiment analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 716
557 Quality Classification and Monitoring Using Adaptive Metric Distance and Neural Networks: Application in Pickling Process

Authors: S. Bouhouche, M. Lahreche, S. Ziani, J. Bast

Abstract:

Modern manufacturing facilities are large scale, highly complex, and operate with large number of variables under closed loop control. Early and accurate fault detection and diagnosis for these plants can minimise down time, increase the safety of plant operations, and reduce manufacturing costs. Fault detection and isolation is more complex particularly in the case of the faulty analog control systems. Analog control systems are not equipped with monitoring function where the process parameters are continually visualised. In this situation, It is very difficult to find the relationship between the fault importance and its consequences on the product failure. We consider in this paper an approach to fault detection and analysis of its effect on the production quality using an adaptive centring and scaling in the pickling process in cold rolling. The fault appeared on one of the power unit driving a rotary machine, this machine can not track a reference speed given by another machine. The length of metal loop is then in continuous oscillation, this affects the product quality. Using a computerised data acquisition system, the main machine parameters have been monitored. The fault has been detected and isolated on basis of analysis of monitored data. Normal and faulty situation have been obtained by an artificial neural network (ANN) model which is implemented to simulate the normal and faulty status of rotary machine. Correlation between the product quality defined by an index and the residual is used to quality classification.

Keywords: Modeling, fault detection and diagnosis, parameters estimation, neural networks, Fault Detection and Diagnosis (FDD), pickling process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1577
556 A Robust Method for Finding Nearest-Neighbor using Hexagon Cells

Authors: Ahmad Attiq Al-Ogaibi, Ahmad Sharieh, Moh’d Belal Al-Zoubi, R. Bremananth

Abstract:

In pattern clustering, nearest neighborhood point computation is a challenging issue for many applications in the area of research such as Remote Sensing, Computer Vision, Pattern Recognition and Statistical Imaging. Nearest neighborhood computation is an essential computation for providing sufficient classification among the volume of pixels (voxels) in order to localize the active-region-of-interests (AROI). Furthermore, it is needed to compute spatial metric relationships of diverse area of imaging based on the applications of pattern recognition. In this paper, we propose a new methodology for finding the nearest neighbor point, depending on making a virtually grid of a hexagon cells, then locate every point beneath them. An algorithm is suggested for minimizing the computation and increasing the turnaround time of the process. The nearest neighbor query points Φ are fetched by seeking fashion of hexagon holistic. Seeking will be repeated until an AROI Φ is to be expected. If any point Υ is located then searching starts in the nearest hexagons in a circular way. The First hexagon is considered be level 0 (L0) and the surrounded hexagons is level 1 (L1). If Υ is located in L1, then search starts in the next level (L2) to ensure that Υ is the nearest neighbor for Φ. Based on the result and experimental results, we found that the proposed method has an advantage over the traditional methods in terms of minimizing the time complexity required for searching the neighbors, in turn, efficiency of classification will be improved sufficiently.

Keywords: Hexagon cells, k-nearest neighbors, Nearest Neighbor, Pattern recognition, Query pattern, Virtually grid

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2802
555 Implementation of the Personal Emergency Response System

Authors: Ah-young Jeon, In-cheol Kim, Jae-hee Jung, Soo-young Ye, Jae-hyung Kim, Ki-gon Nam, Seoung-wan Baik, Jung-hoon Ro, Gye-rok Jeon

Abstract:

The aged are faced with increasing risk for falls. The aged have the easily fragile bones than others. When falls have occurred, it is important to detect this emergency state because such events often lead to more serious illness or even death. A implementation of PDA system, for detection of emergency situation, was developed using 3-axis accelerometer in this paper as follows. The signals were acquired from the 3-axis accelerometer, and then transmitted to the PDA through Bluetooth module. This system can classify the human activity, and also detect the emergency state like falls. When the fall occurs, the system generates the alarm on the PDA. If a subject does not respond to the alarm, the system determines whether the current situation is an emergency state or not, and then sends some information to the emergency center in the case of urgent situation. Three different studies were conducted on 12 experimental subjects, with results indicating a good accuracy. The first study was performed to detect the posture change of human daily activity. The second study was performed to detect the correct direction of fall. The third study was conducted to check the classification of the daily physical activity. Each test was lasted at least 1 min. in third study. The output of acceleration signal was compared and evaluated by changing a various posture after attaching a 3-axis accelerometer module on the chest. The newly developed system has some important features such as portability, convenience and low cost. One of the main advantages of this system is that it is available at home healthcare environment. Another important feature lies in low cost to manufacture device. The implemented system can detect the fall accurately, so will be widely used in emergency situation.

Keywords: Alarm System, Ambulatory monitoring, Emergency detection, Classification of activity, and 3-axis accelerometer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1596
554 Emotion Detection in Twitter Messages Using Combination of Long Short-Term Memory and Convolutional Deep Neural Networks

Authors: B. Golchin, N. Riahi

Abstract:

One of the most significant issues as attended a lot in recent years is that of recognizing the sentiments and emotions in social media texts. The analysis of sentiments and emotions is intended to recognize the conceptual information such as the opinions, feelings, attitudes and emotions of people towards the products, services, organizations, people, topics, events and features in the written text. These indicate the greatness of the problem space. In the real world, businesses and organizations are always looking for tools to gather ideas, emotions, and directions of people about their products, services, or events related to their own. This article uses the Twitter social network, one of the most popular social networks with about 420 million active users, to extract data. Using this social network, users can share their information and opinions about personal issues, policies, products, events, etc. It can be used with appropriate classification of emotional states due to the availability of its data. In this study, supervised learning and deep neural network algorithms are used to classify the emotional states of Twitter users. The use of deep learning methods to increase the learning capacity of the model is an advantage due to the large amount of available data. Tweets collected on various topics are classified into four classes using a combination of two Bidirectional Long Short Term Memory network and a Convolutional network. The results obtained from this study with an average accuracy of 93%, show good results extracted from the proposed framework and improved accuracy compared to previous work.

Keywords: emotion classification, sentiment analysis, social networks, deep neural networks

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 665
553 On Some Subspaces of Entire Sequence Space of Fuzzy Numbers

Authors: T. Balasubramanian, A. Pandiarani

Abstract:

In this paper we introduce some subspaces of fuzzy entire sequence space. Some general properties of these sequence spaces are discussed. Also some inclusion relation involving the spaces are obtained. Mathematics Subject Classification: 40A05, 40D25.

Keywords: Fuzzy Numbers, Entire sequences, completeness, Fuzzy entire sequences

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1241
552 Competitors’ Influence Analysis of a Retailer by Using Customer Value and Huff’s Gravity Model

Authors: Yepeng Cheng, Yasuhiko Morimoto

Abstract:

Customer relationship analysis is vital for retail stores, especially for supermarkets. The point of sale (POS) systems make it possible to record the daily purchasing behaviors of customers as an identification point of sale (ID-POS) database, which can be used to analyze customer behaviors of a supermarket. The customer value is an indicator based on ID-POS database for detecting the customer loyalty of a store. In general, there are many supermarkets in a city, and other nearby competitor supermarkets significantly affect the customer value of customers of a supermarket. However, it is impossible to get detailed ID-POS databases of competitor supermarkets. This study firstly focused on the customer value and distance between a customer's home and supermarkets in a city, and then constructed the models based on logistic regression analysis to analyze correlations between distance and purchasing behaviors only from a POS database of a supermarket chain. During the modeling process, there are three primary problems existed, including the incomparable problem of customer values, the multicollinearity problem among customer value and distance data, and the number of valid partial regression coefficients. The improved customer value, Huff’s gravity model, and inverse attractiveness frequency are considered to solve these problems. This paper presents three types of models based on these three methods for loyal customer classification and competitors’ influence analysis. In numerical experiments, all types of models are useful for loyal customer classification. The type of model, including all three methods, is the most superior one for evaluating the influence of the other nearby supermarkets on customers' purchasing of a supermarket chain from the viewpoint of valid partial regression coefficients and accuracy.

Keywords: Customer value, Huff's Gravity Model, POS, retailer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 612
551 A Survey of Baseband Architecture for Software Defined Radio

Authors: M. A. Fodha, H. Benfradj, A. Ghazel

Abstract:

This paper is a survey of recent works that proposes a baseband processor architecture for software defined radio. A classification of different approaches is proposed. The performance of each architecture is also discussed in order to clarify the suitable approaches that meet software-defined radio constraints.

Keywords: Multi-core architectures, reconfigurable architecture, software defined radio.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1459
550 Pruning Algorithm for the Minimum Rule Reduct Generation

Authors: Şahin Emrah Amrahov, Fatih Aybar, Serhat Doğan

Abstract:

In this paper we consider the rule reduct generation problem. Rule Reduct Generation (RG) and Modified Rule Generation (MRG) algorithms, that are used to solve this problem, are well-known. Alternative to these algorithms, we develop Pruning Rule Generation (PRG) algorithm. We compare the PRG algorithm with RG and MRG.

Keywords: Rough sets, Decision rules, Rule induction, Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2049