Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 6958

Search results for: user classification accuracy

6448 Interpretation of the Russia-Ukraine 2022 War via N-Gram Analysis

Authors: Elcin Timur Cakmak, Ayse Oguzlar

Abstract:

This study presents the results of the tweets sent by Twitter users on social media about the Russia-Ukraine war by bigram and trigram methods. On February 24, 2022, Russian President Vladimir Putin declared a military operation against Ukraine, and all eyes were turned to this war. Many people living in Russia and Ukraine reacted to this war and protested and also expressed their deep concern about this war as they felt the safety of their families and their futures were at stake. Most people, especially those living in Russia and Ukraine, express their views on the war in different ways. The most popular way to do this is through social media. Many people prefer to convey their feelings using Twitter, one of the most frequently used social media tools. Since the beginning of the war, it is seen that there have been thousands of tweets about the war from many countries of the world on Twitter. These tweets accumulated in data sources are extracted using various codes for analysis through Twitter API and analysed by Python programming language. The aim of the study is to find the word sequences in these tweets by the n-gram method, which is known for its widespread use in computational linguistics and natural language processing. The tweet language used in the study is English. The data set consists of the data obtained from Twitter between February 24, 2022, and April 24, 2022. The tweets obtained from Twitter using the #ukraine, #russia, #war, #putin, #zelensky hashtags together were captured as raw data, and the remaining tweets were included in the analysis stage after they were cleaned through the preprocessing stage. In the data analysis part, the sentiments are found to present what people send as a message about the war on Twitter. Regarding this, negative messages make up the majority of all the tweets as a ratio of %63,6. Furthermore, the most frequently used bigram and trigram word groups are found. Regarding the results, the most frequently used word groups are “he, is”, “I, do”, “I, am” for bigrams. Also, the most frequently used word groups are “I, do, not”, “I, am, not”, “I, can, not” for trigrams. In the machine learning phase, the accuracy of classifications is measured by Classification and Regression Trees (CART) and Naïve Bayes (NB) algorithms. The algorithms are used separately for bigrams and trigrams. We gained the highest accuracy and F-measure values by the NB algorithm and the highest precision and recall values by the CART algorithm for bigrams. On the other hand, the highest values for accuracy, precision, and F-measure values are achieved by the CART algorithm, and the highest value for the recall is gained by NB for trigrams.

Keywords: classification algorithms, machine learning, sentiment analysis, Twitter

Procedia PDF Downloads 56

6447 On the Network Packet Loss Tolerance of SVM Based Activity Recognition

Authors: Gamze Uslu, Sebnem Baydere, Alper K. Demir

Abstract:

In this study, data loss tolerance of Support Vector Machines (SVM) based activity recognition model and multi activity classification performance when data are received over a lossy wireless sensor network is examined. Initially, the classification algorithm we use is evaluated in terms of resilience to random data loss with 3D acceleration sensor data for sitting, lying, walking and standing actions. The results show that the proposed classification method can recognize these activities successfully despite high data loss. Secondly, the effect of differentiated quality of service performance on activity recognition success is measured with activity data acquired from a multi hop wireless sensor network, which introduces high data loss. The effect of number of nodes on the reliability and multi activity classification success is demonstrated in simulation environment. To the best of our knowledge, the effect of data loss in a wireless sensor network on activity detection success rate of an SVM based classification algorithm has not been studied before.

Keywords: activity recognition, support vector machines, acceleration sensor, wireless sensor networks, packet loss

Procedia PDF Downloads 454

6446 The Application of Fuzzy Set Theory to Mobile Internet Advertisement Fraud Detection

Authors: Jinming Ma, Tianbing Xia, Janusz Getta

Abstract:

This paper presents the application of fuzzy set theory to implement of mobile advertisement anti-fraud systems. Mobile anti-fraud is a method aiming to identify mobile advertisement fraudsters. One of the main problems of mobile anti-fraud is the lack of evidence to prove a user to be a fraudster. In this paper, we implement an application by using fuzzy set theory to demonstrate how to detect cheaters. The advantage of our method is that the hardship in detecting fraudsters in small data samples has been avoided. We achieved this by giving each user a suspicious degree showing how likely the user is cheating and decide whether a group of users (like all users of a certain APP) together to be fraudsters according to the average suspicious degree. This makes the process more accurate as the data of a single user is too small to be predictable.

Keywords: mobile internet, advertisement, anti-fraud, fuzzy set theory

Procedia PDF Downloads 153

6445 Deep Learning Based-Object-classes Semantic Classification of Arabic Texts

Authors: Imen Elleuch, Wael Ouarda, Gargouri Bilel

Abstract:

We proposes in this paper a Deep Learning based approach to classify text in order to enrich an Arabic ontology based on the objects classes of Gaston Gross. Those object classes are defined by taking into account the syntactic and semantic features of the treated language. Thus, our proposed approach is a hybrid one. In fact, it is based on the one hand on the object classes that represents a knowledge based-approach on classification of text and in the other hand it uses the deep learning approach that use the word embedding-based-approach to classify text. We have applied our proposed approach on a corpus constructed from an Arabic dictionary. The obtained semantic classification of text will enrich the Arabic objects classes ontology. In fact, new classes can be added to the ontology or an expansion of the features that characterizes each object class can be updated. The obtained results are compared to a similar work that treats the same object with a classical linguistic approach for the semantic classification of text. This comparison highlight our hybrid proposed approach that can be ameliorated by broaden the dataset used in the deep learning process.

Keywords: deep-learning approach, object-classes, semantic classification, Arabic

Procedia PDF Downloads 55

6444 Non-Uniform Filter Banks-based Minimum Distance to Riemannian Mean Classifition in Motor Imagery Brain-Computer Interface

Authors: Ping Tan, Xiaomeng Su, Yi Shen

Abstract:

The motion intention in the motor imagery braincomputer interface is identified by classifying the event-related desynchronization (ERD) and event-related synchronization ERS characteristics of sensorimotor rhythm (SMR) in EEG signals. When the subject imagines different limbs or different parts moving, the rhythm components and bandwidth will change, which varies from person to person. How to find the effective sensorimotor frequency band of subjects is directly related to the classification accuracy of brain-computer interface. To solve this problem, this paper proposes a Minimum Distance to Riemannian Mean Classification method based on Non-Uniform Filter Banks. During the training phase, the EEG signals are decomposed into multiple different bandwidt signals by using multiple band-pass filters firstly; Then the spatial covariance characteristics of each frequency band signal are computered to be as the feature vectors. these feature vectors will be classified by the MDRM (Minimum Distance to Riemannian Mean) method, and cross validation is employed to obtain the effective sensorimotor frequency bands. During the test phase, the test signals are filtered by the bandpass filter of the effective sensorimotor frequency bands, and the extracted spatial covariance feature vectors will be classified by using the MDRM. Experiments on the BCI competition IV 2a dataset show that the proposed method is superior to other classification methods.

Keywords: non-uniform filter banks, motor imagery, brain-computer interface, minimum distance to Riemannian mean

Procedia PDF Downloads 95

6443 Immersive and Interactive Storytelling: Exploring Narratives and Online Multisensory Experience for Cultural Memory and Collective Awareness through Graphic Novel

Authors: Cristina Greco

Abstract:

The spread of the digital and we-based technologies has led to a transformation process, which has coincided with an increase in the number of cases who are beyond the mainstream storytelling and its codes on the interaction with the user. On the base of a previous research on i-docs and virtual museums, this study analyses interactive and immersive online Graphic Novel – one-page, animated, illustrated, and hybrid – to reflect on the transformational implications of this expressive form on the user perception, remembrance, and awareness. The way in which the user experiences a certain level of interaction with the story and immersion in the semantic and figurative universe would bring user’s attention, activating introspection and self-reflection processes, perception, imagination, and creativity. This would have to do with the involvement of different senses – visual, proprioceptive, tactile, auditory, and vestibular – and the activation of a phenomenon of synaesthesia (involuntary cross-modal sensory association) – where, for example, the aural reconnect the user to another sense, providing a multisensory experience. The case studies show specific forms of interactive and immersive graphic novel and reflect on application that has sought to engage innovative ways to communicate different messages and stimulate cultural memory and collective awareness. The visual semiotic and narrative analysis of the distinctive traits of such a complex textuality, along with a study of the user’s experience through observation in naturalistic settings and interviews, allows us to question the functioning of these configurations, with regard to the relationships between the figurative dimension, the perceptive activity, and their impact on the user’s engagement.

Keywords: collective awareness, cultural memory, graphic novel, interactive and immersive storytelling

Procedia PDF Downloads 127

6442 Smart Product-Service System Innovation with User Experience: A Case Study of Chunmi

Authors: Ying Yu, Wen-Chi Kuo, Tung-Jung Sung

Abstract:

The Product-Service System (PSS) has received widespread attention due to the increasing global competition in manufacturing and service markets. Today’s smart products and services are driven by Internet of things (IoT) technologies which will promote the transformation from traditional PSS to smart PSS. Although the smart PSS has some of technological achievements in businesses, it often ignores the real demands of target users when using products and services. Therefore, designers should know and learn the User Experience (UX) of smart products, services and systems. However, both of academia and industry still lack relevant development experience of smart PSS since it is an emerging field. In doing so, this is a case study of Xiaomi’s Chunmi, the largest IoT platform in the world, and addresses the two major issues: (1) why Chunmi should develop smart PSS strategies with UX; and (2) how Chunmi could successfully implement the strategic objectives of smart PSS through the design. The case study results indicated that: (1) the smart PSS can distinguish competitors by their unique UX which is difficult to duplicate; (2) early user engagement is crucial for the success of smart PSS; and (3) interaction, expectation, and enjoyment can be treated as a three-dimensional evaluation of UX design for smart PSS innovation. In conclusion, the smart PSS can gain competitive advantages through good UX design in the market.

Keywords: design, smart PSS, user experience, user engagement

Procedia PDF Downloads 117

6441 Measurement of Susceptibility Users Using Email Phishing Attack

Authors: Cindy Sahera, Sarwono Sutikno

Abstract:

Rapid technological developments also have negative impacts, namely the increasing criminal cases based on technology or cybercrime. One technique that can be used to conduct cybercrime attacks are phishing email. The issue is whether the user is aware that email can be misused by others so that it can harm the user's own? This research was conducted to measure the susceptibility of selected targets against email abuse. The objectives of this research are measurement of targets’ susceptibility and find vulnerability in email recipient. There are three steps being taken in this research, (1) the information gathering phase, (2) the design phase, and (3) the execution phase. The first step includes the collection of the information necessary to carry out an attack on a target. The next step is to make the design of an attack against a target. The last step is to send phishing emails to the target. The levels of susceptibility are three: level 1, level 2 and level 3. Level 1 indicates a low level of targets’ susceptibility, level 2 indicates the intermediate level of targets’ susceptibility, and level 3 indicates a high level of targets’ susceptibility. The results showed that users who are on level 1 and level 2 more that level 3, which means the user is not too careless. However, it does not mean the user to be safe. There are still vulnerabilities that may occur, such as automatic location detection when opening emails and automatic downloaded malware as user clicks a link in the email.

Keywords: cybercrime, email phishing, susceptibility, vulnerability

Procedia PDF Downloads 263

6440 Human Action Retrieval System Using Features Weight Updating Based Relevance Feedback Approach

Authors: Munaf Rashid

Abstract:

For content-based human action retrieval systems, search accuracy is often inferior because of the following two reasons 1) global information pertaining to videos is totally ignored, only low level motion descriptors are considered as a significant feature to match the similarity between query and database videos, and 2) the semantic gap between the high level user concept and low level visual features. Hence, in this paper, we propose a method that will address these two issues and in doing so, this paper contributes in two ways. Firstly, we introduce a method that uses both global and local information in one framework for an action retrieval task. Secondly, to minimize the semantic gap, a user concept is involved by incorporating features weight updating (FWU) Relevance Feedback (RF) approach. We use statistical characteristics to dynamically update weights of the feature descriptors so that after every RF iteration feature space is modified accordingly. For testing and validation purpose two human action recognition datasets have been utilized, namely Weizmann and UCF. Results show that even with a number of visual challenges the proposed approach performs well.

Keywords: relevance feedback (RF), action retrieval, semantic gap, feature descriptor, codebook

Procedia PDF Downloads 446

6439 Incorporating Multiple Supervised Learning Algorithms for Effective Intrusion Detection

Authors: Umar Albalawi, Sang C. Suh, Jinoh Kim

Abstract:

As internet continues to expand its usage with an enormous number of applications, cyber-threats have significantly increased accordingly. Thus, accurate detection of malicious traffic in a timely manner is a critical concern in today’s Internet for security. One approach for intrusion detection is to use Machine Learning (ML) techniques. Several methods based on ML algorithms have been introduced over the past years, but they are largely limited in terms of detection accuracy and/or time and space complexity to run. In this work, we present a novel method for intrusion detection that incorporates a set of supervised learning algorithms. The proposed technique provides high accuracy and outperforms existing techniques that simply utilizes a single learning method. In addition, our technique relies on partial flow information (rather than full information) for detection, and thus, it is light-weight and desirable for online operations with the property of early identification. With the mid-Atlantic CCDC intrusion dataset publicly available, we show that our proposed technique yields a high degree of detection rate over 99% with a very low false alarm rate (0.4%).

Keywords: intrusion detection, supervised learning, traffic classification, computer networks

Procedia PDF Downloads 330

6438 A Hybrid Recommendation System Based on Association Rules

Authors: Ahmed Mohammed Alsalama

Abstract:

Recommendation systems are widely used in e-commerce applications. The engine of a current recommendation system recommends items to a particular user based on user preferences and previous high ratings. Various recommendation schemes such as collaborative filtering and content-based approaches are used to build a recommendation system. Most of the current recommendation systems were developed to fit a certain domain such as books, articles, and movies. We propose a hybrid framework recommendation system to be applied on two-dimensional spaces (User x Item) with a large number of Users and a small number of Items. Moreover, our proposed framework makes use of both favorite and non-favorite items of a particular user. The proposed framework is built upon the integration of association rules mining and the content-based approach. The results of experiments show that our proposed framework can provide accurate recommendations to users.

Keywords: data mining, association rules, recommendation systems, hybrid systems

Procedia PDF Downloads 438

6437 Scheduling Method for Electric Heater in HEMS considering User’s Comfort

Authors: Yong-Sung Kim, Je-Seok Shin, Ho-Jun Jo, Jin-O Kim

Abstract:

Home Energy Management System (HEMS) which makes the residential consumers contribute to the demand response is attracting attention in recent years. An aim of HEMS is to minimize their electricity cost by controlling the use of their appliances according to electricity price. The use of appliances in HEMS may be affected by some conditions such as external temperature and electricity price. Therefore, the user’s usage pattern of appliances should be modeled according to the external conditions, and the resultant usage pattern is related to the user’s comfortability on use of each appliances. This paper proposes a methodology to model the usage pattern based on the historical data with the copula function. Through copula function, the usage range of each appliance can be obtained and is able to satisfy the appropriate user’s comfort according to the external conditions for next day. Within the usage range, an optimal scheduling for appliances would be conducted so as to minimize an electricity cost with considering user’s comfort. Among the home appliance, electric heater (EH) is a representative appliance which is affected by the external temperature. In this paper, an optimal scheduling algorithm for an electric heater (EH) is addressed based on the method of branch and bound. As a result, scenarios for the EH usage are obtained according to user’s comfort levels and then the residential consumer would select the best scenario. The case study shows the effects of the proposed algorithm compared with the traditional operation of the EH, and it also represents impacts of the comfort level on the scheduling result.

Keywords: load scheduling, usage pattern, user’s comfort, copula function, branch and bound, electric heater

Procedia PDF Downloads 562

6436 Deep Learning for Recommender System: Principles, Methods and Evaluation

Authors: Basiliyos Tilahun Betru, Charles Awono Onana, Bernabe Batchakui

Abstract:

Recommender systems have become increasingly popular in recent years, and are utilized in numerous areas. Nowadays many web services provide several information for users and recommender systems have been developed as critical element of these web applications to predict choice of preference and provide significant recommendations. With the help of the advantage of deep learning in modeling different types of data and due to the dynamic change of user preference, building a deep model can better understand users demand and further improve quality of recommendation. In this paper, deep neural network models for recommender system are evaluated. Most of deep neural network models in recommender system focus on the classical collaborative filtering user-item setting. Deep learning models demonstrated high level features of complex data can be learned instead of using metadata which can significantly improve accuracy of recommendation. Even though deep learning poses a great impact in various areas, applying the model to a recommender system have not been fully exploited and still a lot of improvements can be done both in collaborative and content-based approach while considering different contextual factors.

Keywords: big data, decision making, deep learning, recommender system

Procedia PDF Downloads 454

6435 Automated Tracking and Statistics of Vehicles at the Signalized Intersection

Authors: Qiang Zhang, Xiaojian Hu1

Abstract:

Intersection is the place where vehicles and pedestrians must pass through, turn and evacuate. Obtaining the motion data of vehicles near the intersection is of great significance for transportation research. Since there are usually many targets and there are more conflicts between targets, this makes it difficult to obtain vehicle motion parameters in traffic videos of intersections. According to the characteristics of traffic videos, this paper applies video technology to realize the automated track, count and trajectory extraction of vehicles to collect traffic data by roadside surveillance cameras installed near the intersections. Based on the video recognition method, the vehicles in each lane near the intersection are tracked with extracting trajectory and counted respectively in various degrees of occlusion and visibility. The performances are compared with current recognized CPU-based algorithms of real-time tracking-by-detection. The speed of the presented system is higher than the others and the system has a better real-time performance. The accuracy of direction has reached about 94.99% on average, and the accuracy of classification and statistics has reached about 75.12% on average.

Keywords: tracking and statistics, vehicle, signalized intersection, motion parameter, trajectory

Procedia PDF Downloads 199

6434 A Survey of Recognizing of Daily Living Activities in Multi-User Smart Home Environments

Authors: Kulsoom S. Bughio, Naeem K. Janjua, Gordana Dermody, Leslie F. Sikos, Shamsul Islam

Abstract:

The advancement in information and communication technologies (ICT) and wireless sensor networks have played a pivotal role in the design and development of real-time healthcare solutions, mainly targeting the elderly living in health-assistive smart homes. Such smart homes are equipped with sensor technologies to detect and record activities of daily living (ADL). This survey reviews and evaluates existing approaches and techniques based on real-time sensor-based modeling and reasoning in single-user and multi-user environments. It classifies the approaches into three main categories: learning-based, knowledge-based, and hybrid, and evaluates how they handle temporal relations, granularity, and uncertainty. The survey also highlights open challenges across various disciplines (including computer and information sciences and health sciences) to encourage interdisciplinary research for the detection and recognition of ADLs and discusses future directions.

Keywords: daily living activities, smart homes, single-user environment, multi-user environment

Procedia PDF Downloads 118

6433 A Study on the Performance of 2-PC-D Classification Model

Authors: Nurul Aini Abdul Wahab, Nor Syamim Halidin, Sayidatina Aisah Masnan, Nur Izzati Romli

Abstract:

There are many applications of principle component method for reducing the large set of variables in various fields. Fisher’s Discriminant function is also a popular tool for classification. In this research, the researcher focuses on studying the performance of Principle Component-Fisher’s Discriminant function in helping to classify rice kernels to their defined classes. The data were collected on the smells or odour of the rice kernel using odour-detection sensor, Cyranose. 32 variables were captured by this electronic nose (e-nose). The objective of this research is to measure how well a combination model, between principle component and linear discriminant, to be as a classification model. Principle component method was used to reduce all 32 variables to a smaller and manageable set of components. Then, the reduced components were used to develop the Fisher’s Discriminant function. In this research, there are 4 defined classes of rice kernel which are Aromatic, Brown, Ordinary and Others. Based on the output from principle component method, the 32 variables were reduced to only 2 components. Based on the output of classification table from the discriminant analysis, 40.76% from the total observations were correctly classified into their classes by the PC-Discriminant function. Indirectly, it gives an idea that the classification model developed has committed to more than 50% of misclassifying the observations. As a conclusion, the Fisher’s Discriminant function that was built on a 2-component from PCA (2-PC-D) is not satisfying to classify the rice kernels into its defined classes.

Keywords: classification model, discriminant function, principle component analysis, variable reduction

Procedia PDF Downloads 314

6432 A New Approach of Preprocessing with SVM Optimization Based on PSO for Bearing Fault Diagnosis

Authors: Tawfik Thelaidjia, Salah Chenikher

Abstract:

Bearing fault diagnosis has attracted significant attention over the past few decades. It consists of two major parts: vibration signal feature extraction and condition classification for the extracted features. In this paper, feature extraction from faulty bearing vibration signals is performed by a combination of the signal’s Kurtosis and features obtained through the preprocessing of the vibration signal samples using Db2 discrete wavelet transform at the fifth level of decomposition. In this way, a 7-dimensional vector of the vibration signal feature is obtained. After feature extraction from vibration signal, the support vector machine (SVM) was applied to automate the fault diagnosis procedure. To improve the classification accuracy for bearing fault prediction, particle swarm optimization (PSO) is employed to simultaneously optimize the SVM kernel function parameter and the penalty parameter. The results have shown feasibility and effectiveness of the proposed approach

Keywords: condition monitoring, discrete wavelet transform, fault diagnosis, kurtosis, machine learning, particle swarm optimization, roller bearing, rotating machines, support vector machine, vibration measurement

Procedia PDF Downloads 415

6431 Probabilistic Crash Prediction and Prevention of Vehicle Crash

Authors: Lavanya Annadi, Fahimeh Jafari

Abstract:

Transportation brings immense benefits to society, but it also has its costs. Costs include such as the cost of infrastructure, personnel and equipment, but also the loss of life and property in traffic accidents on the road, delays in travel due to traffic congestion and various indirect costs in terms of air transport. More research has been done to identify the various factors that affect road accidents, such as road infrastructure, traffic, sociodemographic characteristics, land use, and the environment. The aim of this research is to predict the probabilistic crash prediction of vehicles using machine learning due to natural and structural reasons by excluding spontaneous reasons like overspeeding etc., in the United States. These factors range from weather factors, like weather conditions, precipitation, visibility, wind speed, wind direction, temperature, pressure, and humidity to human made structures like road structure factors like bump, roundabout, no exit, turning loop, give away, etc. Probabilities are dissected into ten different classes. All the predictions are based on multiclass classification techniques, which are supervised learning. This study considers all crashes that happened in all states collected by the US government. To calculate the probability, multinomial expected value was used and assigned a classification label as the crash probability. We applied three different classification models, including multiclass Logistic Regression, Random Forest and XGBoost. The numerical results show that XGBoost achieved a 75.2% accuracy rate which indicates the part that is being played by natural and structural reasons for the crash. The paper has provided in-deep insights through exploratory data analysis.

Keywords: road safety, crash prediction, exploratory analysis, machine learning

Procedia PDF Downloads 91

6430 A Laser Instrument Rapid-E+ for Real-Time Measurements of Airborne Bioaerosols Such as Bacteria, Fungi, and Pollen

Authors: Minghui Zhang, Sirine Fkaier, Sabri Fernana, Svetlana Kiseleva, Denis Kiselev

Abstract:

The real-time identification of bacteria and fungi is difficult because they emit much weaker signals than pollen. In 2020, Plair developed Rapid-E+, which extends abilities of Rapid-E to detect smaller bioaerosols such as bacteria and fungal spores with diameters down to 0.3 µm, while keeping the similar or even better capability for measurements of large bioaerosols like pollen. Rapid-E+ enables simultaneous measurements of (1) time-resolved, polarization and angle dependent Mie scattering patterns, (2) fluorescence spectra resolved in 16 channels, and (3) fluorescence lifetime of individual particles. Moreover, (4) it provides 2D Mie scattering images which give the full information on particle morphology. The parameters of every single bioaerosol aspired into the instrument are subsequently analysed by machine learning. Firstly, pure species of microbes, e.g., Bacillus subtilis (a species of bacteria), and Penicillium chrysogenum (a species of fungal spores), were aerosolized in a bioaerosol chamber for Rapid-E+ training. Afterwards, we tested microbes under different concentrations. We used several steps of data analysis to classify and identify microbes. All single particles were analysed by the parameters of light scattering and fluorescence in the following steps. (1) They were treated with a smart filter block to get rid of non-microbes. (2) By classification algorithm, we verified the filtered particles were microbes based on the calibration data. (3) The probability threshold (defined by the user) step provides the probability of being microbes ranging from 0 to 100%. We demonstrate how Rapid-E+ identified simultaneously microbes based on the results of Bacillus subtilis (bacteria) and Penicillium chrysogenum (fungal spores). By using machine learning, Rapid-E+ achieved identification precision of 99% against the background. The further classification suggests the precision of 87% and 89% for Bacillus subtilis and Penicillium chrysogenum, respectively. The developed algorithm was subsequently used to evaluate the performance of microbe classification and quantification in real-time. The bacteria and fungi were aerosolized again in the chamber with different concentrations. Rapid-E+ can classify different types of microbes and then quantify them in real-time. Rapid-E+ enables classifying different types of microbes and quantifying them in real-time. Rapid-E+ can identify pollen down to species with similar or even better performance than the previous version (Rapid-E). Therefore, Rapid-E+ is an all-in-one instrument which classifies and quantifies not only pollen, but also bacteria and fungi. Based on the machine learning platform, the user can further develop proprietary algorithms for specific microbes (e.g., virus aerosols) and other aerosols (e.g., combustion-related particles that contain polycyclic aromatic hydrocarbons).

Keywords: bioaerosols, laser-induced fluorescence, Mie-scattering, microorganisms

Procedia PDF Downloads 70

6429 Masked Candlestick Model: A Pre-Trained Model for Trading Prediction

Authors: Ling Qi, Matloob Khushi, Josiah Poon

Abstract:

This paper introduces a pre-trained Masked Candlestick Model (MCM) for trading time-series data. The pre-trained model is based on three core designs. First, we convert trading price data at each data point as a set of normalized elements and produce embeddings of each element. Second, we generate a masked sequence of such embedded elements as inputs for self-supervised learning. Third, we use the encoder mechanism from the transformer to train the inputs. The masked model learns the contextual relations among the sequence of embedded elements, which can aid downstream classification tasks. To evaluate the performance of the pre-trained model, we fine-tune MCM for three different downstream classification tasks to predict future price trends. The fine-tuned models achieved better accuracy rates for all three tasks than the baseline models. To better analyze the effectiveness of MCM, we test the same architecture for three currency pairs, namely EUR/GBP, AUD/USD, and EUR/JPY. The experimentation results demonstrate MCM’s effectiveness on all three currency pairs and indicate the MCM’s capability for signal extraction from trading data.

Keywords: masked language model, transformer, time series prediction, trading prediction, embedding, transfer learning, self-supervised learning

Procedia PDF Downloads 106

6428 Detecting Venomous Files in IDS Using an Approach Based on Data Mining Algorithm

Authors: Sukhleen Kaur

Abstract:

In security groundwork, Intrusion Detection System (IDS) has become an important component. The IDS has received increasing attention in recent years. IDS is one of the effective way to detect different kinds of attacks and malicious codes in a network and help us to secure the network. Data mining techniques can be implemented to IDS, which analyses the large amount of data and gives better results. Data mining can contribute to improving intrusion detection by adding a level of focus to anomaly detection. So far the study has been carried out on finding the attacks but this paper detects the malicious files. Some intruders do not attack directly, but they hide some harmful code inside the files or may corrupt those file and attack the system. These files are detected according to some defined parameters which will form two lists of files as normal files and harmful files. After that data mining will be performed. In this paper a hybrid classifier has been used via Naive Bayes and Ripper classification methods. The results show how the uploaded file in the database will be tested against the parameters and then it is characterised as either normal or harmful file and after that the mining is performed. Moreover, when a user tries to mine on harmful file it will generate an exception that mining cannot be made on corrupted or harmful files.

Keywords: data mining, association, classification, clustering, decision tree, intrusion detection system, misuse detection, anomaly detection, naive Bayes, ripper

Procedia PDF Downloads 397

6427 Usability Guidelines for Arab E-Government Websites

Authors: Omyma Alosaimi, Asma Alsumait

Abstract:

The website developer and designer should follow usability guidelines to provide a user-friendly interface. Many guidelines and heuristics have been developed by previous studies to help both the developer and designer in this task, but E-government websites are special cases that require specialized guidelines. This paper introduces a set of eighteen guidelines for evaluating the usability of e-government websites in general and Arabic e-government websites specifically, along with a check list of how to apply them. The validity and effectiveness of these guidelines were evaluated against a variety of user characteristics. The results indicated that the proposed set of guidelines can be used to identify qualitative similarities and differences with user testing and that the new set is best suited for evaluating general and e-governmental usability.

Keywords: e-government, human computer interaction, usability evaluation, usability guidelines

Procedia PDF Downloads 375

6426 The Design of the Multi-Agent Classification System (MACS)

Authors: Mohamed R. Mhereeg

Abstract:

The paper discusses the design of a .NET Windows Service based agent system called MACS (Multi-Agent Classification System). MACS is a system aims to accurately classify spread-sheet developers competency over a network. It is designed to automatically and autonomously monitor spread-sheet users and gather their development activities based on the utilization of the software Multi-Agent Technology (MAS). This is accomplished in such a way that makes management capable to efficiently allow for precise tailor training activities for future spread-sheet development. The monitoring agents of MACS are intended to be distributed over the WWW in order to satisfy the monitoring and classification of the multiple developer aspect. The Prometheus methodology is used for the design of the agents of MACS. Prometheus has been used to undertake this phase of the system design because it is developed specifically for specifying and designing agent-oriented systems. Additionally, Prometheus specifies also the communication needed between the agents in order to coordinate to achieve their delegated tasks.

Keywords: classification, design, MACS, MAS, prometheus

Procedia PDF Downloads 380

6425 The Best Prediction Data Mining Model for Breast Cancer Probability in Women Residents in Kabul

Authors: Mina Jafari, Kobra Hamraee, Saied Hossein Hosseini

Abstract:

The prediction of breast cancer disease is one of the challenges in medicine. In this paper we collected 528 records of women’s information who live in Kabul including demographic, life style, diet and pregnancy data. There are many classification algorithm in breast cancer prediction and tried to find the best model with most accurate result and lowest error rate. We evaluated some other common supervised algorithms in data mining to find the best model in prediction of breast cancer disease among afghan women living in Kabul regarding to momography result as target variable. For evaluating these algorithms we used Cross Validation which is an assured method for measuring the performance of models. After comparing error rate and accuracy of three models: Decision Tree, Naive Bays and Rule Induction, Decision Tree with accuracy of 94.06% and error rate of %15 is found the best model to predicting breast cancer disease based on the health care records.

Keywords: decision tree, breast cancer, probability, data mining

Procedia PDF Downloads 122

6424 A Case Study of Deep Learning for Disease Detection in Crops

Authors: Felipe A. Guth, Shane Ward, Kevin McDonnell

Abstract:

In the precision agriculture area, one of the main tasks is the automated detection of diseases in crops. Machine Learning algorithms have been studied in recent decades for such tasks in view of their potential for improving economic outcomes that automated disease detection may attain over crop fields. The latest generation of deep learning convolution neural networks has presented significant results in the area of image classification. In this way, this work has tested the implementation of an architecture of deep learning convolution neural network for the detection of diseases in different types of crops. A data augmentation strategy was used to meet the requirements of the algorithm implemented with a deep learning framework. Two test scenarios were deployed. The first scenario implemented a neural network under images extracted from a controlled environment while the second one took images both from the field and the controlled environment. The results evaluated the generalisation capacity of the neural networks in relation to the two types of images presented. Results yielded a general classification accuracy of 59% in scenario 1 and 96% in scenario 2.

Keywords: convolutional neural networks, deep learning, disease detection, precision agriculture

Procedia PDF Downloads 236

6423 Easily Memorable Strong Password Generation and Retrieval

Authors: Shatadru Das, Natarajan Vijayarangan

Abstract:

In this paper, a system and method for generating and recovering an authorization code has been designed and analyzed. The system creates an authorization code by accepting a base-sentence from a user. Based on the characters present in this base-sentence, the system computes a base-sentence matrix. The system also generates a plurality of patterns. The user can either select the pattern from the multiple patterns suggested by the system or can create his/her own pattern. The system then performs multiplications between the base-sentence matrix and the selected pattern matrix at different stages in the path forward, for obtaining a strong authorization code. In case the user forgets the base sentence, the system has a provision to manage and retrieve 'forgotten authorization code'. This is done by fragmenting the base sentence into different matrices and storing the fragmented matrices into a repository after computing matrix multiplication with a security question-answer approach and with a secret key provided by the user.

Keywords: easy authentication, key retrieval, memorable passwords, strong password generation

Procedia PDF Downloads 376

6422 Machine Learning-Driven Prediction of Cardiovascular Diseases: A Supervised Approach

Authors: Thota Sai Prakash, B. Yaswanth, Jhade Bhuvaneswar, Marreddy Divakar Reddy, Shyam Ji Gupta

Abstract:

Across the globe, there are a lot of chronic diseases, and heart disease stands out as one of the most perilous. Sadly, many lives are lost to this condition, even though early intervention could prevent such tragedies. However, identifying heart disease in its initial stages is not easy. To address this challenge, we propose an automated system aimed at predicting the presence of heart disease using advanced techniques. By doing so, we hope to empower individuals with the knowledge needed to take proactive measures against this potentially fatal illness. Our approach towards this problem involves meticulous data preprocessing and the development of predictive models utilizing classification algorithms such as Support Vector Machines (SVM), Decision Tree, and Random Forest. We assess the efficiency of every model based on metrics like accuracy, ensuring that we select the most reliable option. Additionally, we conduct thorough data analysis to reveal the importance of different attributes. Among the models considered, Random Forest emerges as the standout performer with an accuracy rate of 96.04% in our study.

Keywords: support vector machines, decision tree, random forest

Procedia PDF Downloads 20

6421 Hybrid Knowledge Approach for Determining Health Care Provider Specialty from Patient Diagnoses

Authors: Erin Lynne Plettenberg, Jeremy Vickery

Abstract:

In an access-control situation, the role of a user determines whether a data request is appropriate. This paper combines vetted web mining and logic modeling to build a lightweight system for determining the role of a health care provider based only on their prior authorized requests. The model identifies provider roles with 100% recall from very little data. This shows the value of vetted web mining in AI systems, and suggests the impact of the ICD classification on medical practice.

Keywords: electronic medical records, information extraction, logic modeling, ontology, vetted web mining

Procedia PDF Downloads 155

6420 Comparison of Extended Kalman Filter and Unscented Kalman Filter for Autonomous Orbit Determination of Lagrangian Navigation Constellation

Authors: Youtao Gao, Bingyu Jin, Tanran Zhao, Bo Xu

Abstract:

The history of satellite navigation can be dated back to the 1960s. From the U.S. Transit system and the Russian Tsikada system to the modern Global Positioning System (GPS) and the Globalnaya Navigatsionnaya Sputnikovaya Sistema (GLONASS), performance of satellite navigation has been greatly improved. Nowadays, the navigation accuracy and coverage of these existing systems have already fully fulfilled the requirement of near-Earth users, but these systems are still beyond the reach of deep space targets. Due to the renewed interest in space exploration, a novel high-precision satellite navigation system is becoming even more important. The increasing demand for such a deep space navigation system has contributed to the emergence of a variety of new constellation architectures, such as the Lunar Global Positioning System. Apart from a Walker constellation which is similar to the one adopted by GPS on Earth, a novel constellation architecture which consists of libration point satellites in the Earth-Moon system is also available to construct the lunar navigation system, which can be called accordingly, the libration point satellite navigation system. The concept of using Earth-Moon libration point satellites for lunar navigation was first proposed by Farquhar and then followed by many other researchers. Moreover, due to the special characteristics of Libration point orbits, an autonomous orbit determination technique, which is called ‘Liaison navigation’, can be adopted by the libration point satellites. Using only scalar satellite-to-satellite tracking data, both the orbits of the user and libration point satellites can be determined autonomously. In this way, the extensive Earth-based tracking measurement can be eliminated, and an autonomous satellite navigation system can be developed for future space exploration missions. The method of state estimate is an unnegligible factor which impacts on the orbit determination accuracy besides type of orbit, initial state accuracy and measurement accuracy. We apply the extended Kalman filter(EKF) and the unscented Kalman filter(UKF) to determinate the orbits of Lagrangian navigation satellites. The autonomous orbit determination errors are compared. The simulation results illustrate that UKF can improve the accuracy and z-axis convergence to some extent.

Keywords: extended Kalman filter, autonomous orbit determination, unscented Kalman filter, navigation constellation

Procedia PDF Downloads 264

6419 A Hierarchical Method for Multi-Class Probabilistic Classification Vector Machines

Authors: P. Byrnes, F. A. DiazDelaO

Abstract:

The Support Vector Machine (SVM) has become widely recognised as one of the leading algorithms in machine learning for both regression and binary classification. It expresses predictions in terms of a linear combination of kernel functions, referred to as support vectors. Despite its popularity amongst practitioners, SVM has some limitations, with the most significant being the generation of point prediction as opposed to predictive distributions. Stemming from this issue, a probabilistic model namely, Probabilistic Classification Vector Machines (PCVM), has been proposed which respects the original functional form of SVM whilst also providing a predictive distribution. As physical system designs become more complex, an increasing number of classification tasks involving industrial applications consist of more than two classes. Consequently, this research proposes a framework which allows for the extension of PCVM to a multi class setting. Additionally, the original PCVM framework relies on the use of type II maximum likelihood to provide estimates for both the kernel hyperparameters and model evidence. In a high dimensional multi class setting, however, this approach has been shown to be ineffective due to bad scaling as the number of classes increases. Accordingly, we propose the application of Markov Chain Monte Carlo (MCMC) based methods to provide a posterior distribution over both parameters and hyperparameters. The proposed framework will be validated against current multi class classifiers through synthetic and real life implementations.

Keywords: probabilistic classification vector machines, multi class classification, MCMC, support vector machines

Procedia PDF Downloads 208