Search results for: bayesian classifier
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 634

Search results for: bayesian classifier

124 A Framework Based on Dempster-Shafer Theory of Evidence Algorithm for the Analysis of the TV-Viewers’ Behaviors

Authors: Hamdi Amroun, Yacine Benziani, Mehdi Ammi

Abstract:

In this paper, we propose an approach of detecting the behavior of the viewers of a TV program in a non-controlled environment. The experiment we propose is based on the use of three types of connected objects (smartphone, smart watch, and a connected remote control). 23 participants were observed while watching their TV programs during three phases: before, during and after watching a TV program. Their behaviors were detected using an approach based on The Dempster Shafer Theory (DST) in two phases. The first phase is to approximate dynamically the mass functions using an approach based on the correlation coefficient. The second phase is to calculate the approximate mass functions. To approximate the mass functions, two approaches have been tested: the first approach was to divide each features data space into cells; each one has a specific probability distribution over the behaviors. The probability distributions were computed statistically (estimated by empirical distribution). The second approach was to predict the TV-viewing behaviors through the use of classifiers algorithms and add uncertainty to the prediction based on the uncertainty of the model. Results showed that mixing the fusion rule with the computation of the initial approximate mass functions using a classifier led to an overall of 96%, 95% and 96% success rate for the first, second and third TV-viewing phase respectively. The results were also compared to those found in the literature. This study aims to anticipate certain actions in order to maintain the attention of TV viewers towards the proposed TV programs with usual connected objects, taking into account the various uncertainties that can be generated.

Keywords: Iot, TV-viewing behaviors identification, automatic classification, unconstrained environment

Procedia PDF Downloads 212
123 Evaluation of Vehicle Classification Categories: Florida Case Study

Authors: Ren Moses, Jaqueline Masaki

Abstract:

This paper addresses the need for accurate and updated vehicle classification system through a thorough evaluation of vehicle class categories to identify errors arising from the existing system and proposing modifications. The data collected from two permanent traffic monitoring sites in Florida were used to evaluate the performance of the existing vehicle classification table. The vehicle data were collected and classified by the automatic vehicle classifier (AVC), and a video camera was used to obtain ground truth data. The Federal Highway Administration (FHWA) vehicle classification definitions were used to define vehicle classes from the video and compare them to the data generated by AVC in order to identify the sources of misclassification. Six types of errors were identified. Modifications were made in the classification table to improve the classification accuracy. The results of this study include the development of updated vehicle classification table with a reduction in total error by 5.1%, a step by step procedure to use for evaluation of vehicle classification studies and recommendations to improve FHWA 13-category rule set. The recommendations for the FHWA 13-category rule set indicate the need for the vehicle classification definitions in this scheme to be updated to reflect the distribution of current traffic. The presented results will be of interest to States’ transportation departments and consultants, researchers, engineers, designers, and planners who require accurate vehicle classification information for planning, designing and maintenance of transportation infrastructures.

Keywords: vehicle classification, traffic monitoring, pavement design, highway traffic

Procedia PDF Downloads 165
122 Molecular Identification and Evolutionary Status of Lucilia bufonivora: An Obligate Parasite of Amphibians in Europe

Authors: Gerardo Arias, Richard Wall, Jamie Stevens

Abstract:

Lucilia bufonivora Moniez, is an obligate parasite of toads and frogs widely distributed in Europe. Its sister taxon Lucilia silvarum Meigen behaves mainly as a carrion breeder in Europe, however it has been reported as a facultative parasite of amphibians. These two closely related species are morphologically almost identical, which has led to misidentification, and in fact, it has been suggested that the amphibian myiasis cases by L. silvarum reported in Europe should be attributed to L. bufonivora. Both species remain poorly studied and their taxonomic relationships are still unclear. The identification of the larval specimens involved in amphibian myiasis with molecular tools and phylogenetic analysis of these two closely related species may resolve this problem. In this work seventeen unidentified larval specimens extracted from toad myiasis cases of the UK, the Netherlands and Switzerland were obtained, their COX1 (mtDNA) and EF1-α (Nuclear DNA) gene regions were amplified and then sequenced. The 17 larval samples were identified with both molecular markers as L. bufonivora. Phylogenetic analysis was carried out with 10 other blowfly species, including L. silvarum samples from the UK and USA. Bayesian Inference trees of COX1 and a combined-gene dataset suggested that L. silvarum and L. bufonivora are separate sister species. However, the nuclear gene EF1-α does not appear to resolve their relationships, suggesting that the rates of evolution of the mtDNA are much faster than those of the nuclear DNA. This work provides the molecular evidence for successful identification of L. bufonivora and a molecular analysis of the populations of this obligate parasite from different locations across Europe. The relationships with L. silvarum are discussed.

Keywords: calliphoridae, molecular evolution, myiasis, obligate parasitism

Procedia PDF Downloads 213
121 Speech Detection Model Based on Deep Neural Networks Classifier for Speech Emotions Recognition

Authors: A. Shoiynbek, K. Kozhakhmet, P. Menezes, D. Kuanyshbay, D. Bayazitov

Abstract:

Speech emotion recognition has received increasing research interest all through current years. There was used emotional speech that was collected under controlled conditions in most research work. Actors imitating and artificially producing emotions in front of a microphone noted those records. There are four issues related to that approach, namely, (1) emotions are not natural, and it means that machines are learning to recognize fake emotions. (2) Emotions are very limited by quantity and poor in their variety of speaking. (3) There is language dependency on SER. (4) Consequently, each time when researchers want to start work with SER, they need to find a good emotional database on their language. In this paper, we propose the approach to create an automatic tool for speech emotion extraction based on facial emotion recognition and describe the sequence of actions of the proposed approach. One of the first objectives of the sequence of actions is a speech detection issue. The paper gives a detailed description of the speech detection model based on a fully connected deep neural network for Kazakh and Russian languages. Despite the high results in speech detection for Kazakh and Russian, the described process is suitable for any language. To illustrate the working capacity of the developed model, we have performed an analysis of speech detection and extraction from real tasks.

Keywords: deep neural networks, speech detection, speech emotion recognition, Mel-frequency cepstrum coefficients, collecting speech emotion corpus, collecting speech emotion dataset, Kazakh speech dataset

Procedia PDF Downloads 78
120 A Deep Learning Approach to Online Social Network Account Compromisation

Authors: Edward K. Boahen, Brunel E. Bouya-Moko, Changda Wang

Abstract:

The major threat to online social network (OSN) users is account compromisation. Spammers now spread malicious messages by exploiting the trust relationship established between account owners and their friends. The challenge in detecting a compromised account by service providers is validating the trusted relationship established between the account owners, their friends, and the spammers. Another challenge is the increase in required human interaction with the feature selection. Research available on supervised learning (machine learning) has limitations with the feature selection and accounts that cannot be profiled, like application programming interface (API). Therefore, this paper discusses the various behaviours of the OSN users and the current approaches in detecting a compromised OSN account, emphasizing its limitations and challenges. We propose a deep learning approach that addresses and resolve the constraints faced by the previous schemes. We detailed our proposed optimized nonsymmetric deep auto-encoder (OPT_NDAE) for unsupervised feature learning, which reduces the required human interaction levels in the selection and extraction of features. We evaluated our proposed classifier using the NSL-KDD and KDDCUP'99 datasets in a graphical user interface enabled Weka application. The results obtained indicate that our proposed approach outperformed most of the traditional schemes in OSN compromised account detection with an accuracy rate of 99.86%.

Keywords: computer security, network security, online social network, account compromisation

Procedia PDF Downloads 97
119 Convolutional Neural Networks versus Radiomic Analysis for Classification of Breast Mammogram

Authors: Mehwish Asghar

Abstract:

Breast Cancer (BC) is a common type of cancer among women. Its screening is usually performed using different imaging modalities such as magnetic resonance imaging, mammogram, X-ray, CT, etc. Among these modalities’ mammogram is considered a powerful tool for diagnosis and screening of breast cancer. Sophisticated machine learning approaches have shown promising results in complementing human diagnosis. Generally, machine learning methods can be divided into two major classes: one is Radiomics analysis (RA), where image features are extracted manually; and the other one is the concept of convolutional neural networks (CNN), in which the computer learns to recognize image features on its own. This research aims to improve the incidence of early detection, thus reducing the mortality rate caused by breast cancer through the latest advancements in computer science, in general, and machine learning, in particular. It has also been aimed to ease the burden of doctors by improving and automating the process of breast cancer detection. This research is related to a relative analysis of different techniques for the implementation of different models for detecting and classifying breast cancer. The main goal of this research is to provide a detailed view of results and performances between different techniques. The purpose of this paper is to explore the potential of a convolutional neural network (CNN) w.r.t feature extractor and as a classifier. Also, in this research, it has been aimed to add the module of Radiomics for comparison of its results with deep learning techniques.

Keywords: breast cancer (BC), machine learning (ML), convolutional neural network (CNN), radionics, magnetic resonance imaging, artificial intelligence

Procedia PDF Downloads 200
118 Hindi Speech Synthesis by Concatenation of Recognized Hand Written Devnagri Script Using Support Vector Machines Classifier

Authors: Saurabh Farkya, Govinda Surampudi

Abstract:

Optical Character Recognition is one of the current major research areas. This paper is focussed on recognition of Devanagari script and its sound generation. This Paper consists of two parts. First, Optical Character Recognition of Devnagari handwritten Script. Second, speech synthesis of the recognized text. This paper shows an implementation of support vector machines for the purpose of Devnagari Script recognition. The Support Vector Machines was trained with Multi Domain features; Transform Domain and Spatial Domain or Structural Domain feature. Transform Domain includes the wavelet feature of the character. Structural Domain consists of Distance Profile feature and Gradient feature. The Segmentation of the text document has been done in 3 levels-Line Segmentation, Word Segmentation, and Character Segmentation. The pre-processing of the characters has been done with the help of various Morphological operations-Otsu's Algorithm, Erosion, Dilation, Filtration and Thinning techniques. The Algorithm was tested on the self-prepared database, a collection of various handwriting. Further, Unicode was used to convert recognized Devnagari text into understandable computer document. The document so obtained is an array of codes which was used to generate digitized text and to synthesize Hindi speech. Phonemes from the self-prepared database were used to generate the speech of the scanned document using concatenation technique.

Keywords: Character Recognition (OCR), Text to Speech (TTS), Support Vector Machines (SVM), Library of Support Vector Machines (LIBSVM)

Procedia PDF Downloads 474
117 Feature Analysis of Predictive Maintenance Models

Authors: Zhaoan Wang

Abstract:

Research in predictive maintenance modeling has improved in the recent years to predict failures and needed maintenance with high accuracy, saving cost and improving manufacturing efficiency. However, classic prediction models provide little valuable insight towards the most important features contributing to the failure. By analyzing and quantifying feature importance in predictive maintenance models, cost saving can be optimized based on business goals. First, multiple classifiers are evaluated with cross-validation to predict the multi-class of failures. Second, predictive performance with features provided by different feature selection algorithms are further analyzed. Third, features selected by different algorithms are ranked and combined based on their predictive power. Finally, linear explainer SHAP (SHapley Additive exPlanations) is applied to interpret classifier behavior and provide further insight towards the specific roles of features in both local predictions and global model behavior. The results of the experiments suggest that certain features play dominant roles in predictive models while others have significantly less impact on the overall performance. Moreover, for multi-class prediction of machine failures, the most important features vary with type of machine failures. The results may lead to improved productivity and cost saving by prioritizing sensor deployment, data collection, and data processing of more important features over less importance features.

Keywords: automated supply chain, intelligent manufacturing, predictive maintenance machine learning, feature engineering, model interpretation

Procedia PDF Downloads 111
116 Reliability Qualification Test Plan Derivation Method for Weibull Distributed Products

Authors: Ping Jiang, Yunyan Xing, Dian Zhang, Bo Guo

Abstract:

The reliability qualification test (RQT) is widely used in product development to qualify whether the product meets predetermined reliability requirements, which are mainly described in terms of reliability indices, for example, MTBF (Mean Time Between Failures). It is widely exercised in product development. In engineering practices, RQT plans are mandatorily referred to standards, such as MIL-STD-781 or GJB899A-2009. But these conventional RQT plans in standards are not preferred, as the test plans often require long test times or have high risks for both producer and consumer due to the fact that the methods in the standards only use the test data of the product itself. And the standards usually assume that the product is exponentially distributed, which is not suitable for a complex product other than electronics. So it is desirable to develop an RQT plan derivation method that safely shortens test time while keeping the two risks under control. To meet this end, for the product whose lifetime follows Weibull distribution, an RQT plan derivation method is developed. The merit of the method is that expert judgment is taken into account. This is implemented by applying the Bayesian method, which translates the expert judgment into prior information on product reliability. Then producer’s risk and the consumer’s risk are calculated accordingly. The procedures to derive RQT plans are also proposed in this paper. As extra information and expert judgment are added to the derivation, the derived test plans have the potential to shorten the required test time and have satisfactory low risks for both producer and consumer, compared with conventional test plans. A case study is provided to prove that when using expert judgment in deriving product test plans, the proposed method is capable of finding ideal test plans that not only reduce the two risks but also shorten the required test time as well.

Keywords: expert judgment, reliability qualification test, test plan derivation, producer’s risk, consumer’s risk

Procedia PDF Downloads 112
115 Change Detection Analysis on Support Vector Machine Classifier of Land Use and Land Cover Changes: Case Study on Yangon

Authors: Khin Mar Yee, Mu Mu Than, Kyi Lint, Aye Aye Oo, Chan Mya Hmway, Khin Zar Chi Winn

Abstract:

The dynamic changes of Land Use and Land Cover (LULC) changes in Yangon have generally resulted the improvement of human welfare and economic development since the last twenty years. Making map of LULC is crucially important for the sustainable development of the environment. However, the exactly data on how environmental factors influence the LULC situation at the various scales because the nature of the natural environment is naturally composed of non-homogeneous surface features, so the features in the satellite data also have the mixed pixels. The main objective of this study is to the calculation of accuracy based on change detection of LULC changes by Support Vector Machines (SVMs). For this research work, the main data was satellite images of 1996, 2006 and 2015. Computing change detection statistics use change detection statistics to compile a detailed tabulation of changes between two classification images and Support Vector Machines (SVMs) process was applied with a soft approach at allocation as well as at a testing stage and to higher accuracy. The results of this paper showed that vegetation and cultivated area were decreased (average total 29 % from 1996 to 2015) because of conversion to the replacing over double of the built up area (average total 30 % from 1996 to 2015). The error matrix and confidence limits led to the validation of the result for LULC mapping.

Keywords: land use and land cover change, change detection, image processing, support vector machines

Procedia PDF Downloads 112
114 A Communication Signal Recognition Algorithm Based on Holder Coefficient Characteristics

Authors: Hui Zhang, Ye Tian, Fang Ye, Ziming Guo

Abstract:

Communication signal modulation recognition technology is one of the key technologies in the field of modern information warfare. At present, communication signal automatic modulation recognition methods are mainly divided into two major categories. One is the maximum likelihood hypothesis testing method based on decision theory, the other is a statistical pattern recognition method based on feature extraction. Now, the most commonly used is a statistical pattern recognition method, which includes feature extraction and classifier design. With the increasingly complex electromagnetic environment of communications, how to effectively extract the features of various signals at low signal-to-noise ratio (SNR) is a hot topic for scholars in various countries. To solve this problem, this paper proposes a feature extraction algorithm for the communication signal based on the improved Holder cloud feature. And the extreme learning machine (ELM) is used which aims at the problem of the real-time in the modern warfare to classify the extracted features. The algorithm extracts the digital features of the improved cloud model without deterministic information in a low SNR environment, and uses the improved cloud model to obtain more stable Holder cloud features and the performance of the algorithm is improved. This algorithm addresses the problem that a simple feature extraction algorithm based on Holder coefficient feature is difficult to recognize at low SNR, and it also has a better recognition accuracy. The results of simulations show that the approach in this paper still has a good classification result at low SNR, even when the SNR is -15dB, the recognition accuracy still reaches 76%.

Keywords: communication signal, feature extraction, Holder coefficient, improved cloud model

Procedia PDF Downloads 131
113 Artificial Intelligence Techniques for Enhancing Supply Chain Resilience: A Systematic Literature Review, Holistic Framework, and Future Research

Authors: Adane Kassa Shikur

Abstract:

Today’s supply chains (SC) have become vulnerable to unexpected and ever-intensifying disruptions from myriad sources. Consequently, the concept of supply chain resilience (SCRes) has become crucial to complement the conventional risk management paradigm, which has failed to cope with unexpected SC disruptions, resulting in severe consequences affecting SC performances and making business continuity questionable. Advancements in cutting-edge technologies like artificial intelligence (AI) and their potential to enhance SCRes by improving critical antecedents in the different phases have attracted the attention of scholars and practitioners. The research from academia and the practical interest of the industry have yielded significant publications at the nexus of AI and SCRes during the last two decades. However, the applications and examinations have been primarily conducted independently, and the extant literature is dispersed into research streams despite the complex nature of SCRes. To close this research gap, this study conducts a systematic literature review of 106 peer-reviewed articles by curating, synthesizing, and consolidating up-to-date literature and presents the state-of-the-art development from 2010 to 2022. Bayesian networks are the most topical ones among the 13 AI techniques evaluated. Concerning the critical antecedents, visibility is the first ranking to be realized by the techniques. The study revealed that AI techniques support only the first 3 phases of SCRes (readiness, response, and recovery), and readiness is the most popular one, while no evidence has been found for the growth phase. The study proposed an AI-SCRes framework to inform research and practice to approach SCRes holistically. It also provided implications for practice, policy, and theory as well as gaps for impactful future research.

Keywords: ANNs, risk, Bauesian networks, vulnerability, resilience

Procedia PDF Downloads 57
112 Forecasting Lake Malawi Water Level Fluctuations Using Stochastic Models

Authors: M. Mulumpwa, W. W. L. Jere, M. Lazaro, A. H. N. Mtethiwa

Abstract:

The study considered Seasonal Autoregressive Integrated Moving Average (SARIMA) processes to select an appropriate stochastic model to forecast the monthly data from the Lake Malawi water levels for the period 1986 through 2015. The appropriate model was chosen based on SARIMA (p, d, q) (P, D, Q)S. The Autocorrelation function (ACF), Partial autocorrelation (PACF), Akaike Information Criteria (AIC), Bayesian Information Criterion (BIC), Box–Ljung statistics, correlogram and distribution of residual errors were estimated. The SARIMA (1, 1, 0) (1, 1, 1)12 was selected to forecast the monthly data of the Lake Malawi water levels from August, 2015 to December, 2021. The plotted time series showed that the Lake Malawi water levels are decreasing since 2010 to date but not as much as was the case in 1995 through 1997. The future forecast of the Lake Malawi water levels until 2021 showed a mean of 474.47 m ranging from 473.93 to 475.02 meters with a confidence interval of 80% and 90% against registered mean of 473.398 m in 1997 and 475.475 m in 1989 which was the lowest and highest water levels in the lake respectively since 1986. The forecast also showed that the water levels of Lake Malawi will drop by 0.57 meters as compared to the mean water levels recorded in the previous years. These results suggest that the Lake Malawi water level may not likely go lower than that recorded in 1997. Therefore, utilisation and management of water-related activities and programs among others on the lake should provide room for such scenarios. The findings suggest a need to manage the Lake Malawi jointly and prudently with other stakeholders starting from the catchment area. This will reduce impacts of anthropogenic activities on the lake’s water quality, water level, aquatic and adjacent terrestrial ecosystems thereby ensuring its resilience to climate change impacts.

Keywords: forecasting, Lake Malawi, water levels, water level fluctuation, climate change, anthropogenic activities

Procedia PDF Downloads 208
111 Localization of Geospatial Events and Hoax Prediction in the UFO Database

Authors: Harish Krishnamurthy, Anna Lafontant, Ren Yi

Abstract:

Unidentified Flying Objects (UFOs) have been an interesting topic for most enthusiasts and hence people all over the United States report such findings online at the National UFO Report Center (NUFORC). Some of these reports are a hoax and among those that seem legitimate, our task is not to establish that these events confirm that they indeed are events related to flying objects from aliens in outer space. Rather, we intend to identify if the report was a hoax as was identified by the UFO database team with their existing curation criterion. However, the database provides a wealth of information that can be exploited to provide various analyses and insights such as social reporting, identifying real-time spatial events and much more. We perform analysis to localize these time-series geospatial events and correlate with known real-time events. This paper does not confirm any legitimacy of alien activity, but rather attempts to gather information from likely legitimate reports of UFOs by studying the online reports. These events happen in geospatial clusters and also are time-based. We look at cluster density and data visualization to search the space of various cluster realizations to decide best probable clusters that provide us information about the proximity of such activity. A random forest classifier is also presented that is used to identify true events and hoax events, using the best possible features available such as region, week, time-period and duration. Lastly, we show the performance of the scheme on various days and correlate with real-time events where one of the UFO reports strongly correlates to a missile test conducted in the United States.

Keywords: time-series clustering, feature extraction, hoax prediction, geospatial events

Procedia PDF Downloads 356
110 A Neurofeedback Learning Model Using Time-Frequency Analysis for Volleyball Performance Enhancement

Authors: Hamed Yousefi, Farnaz Mohammadi, Niloufar Mirian, Navid Amini

Abstract:

Investigating possible capacities of visual functions where adapted mechanisms can enhance the capability of sports trainees is a promising area of research, not only from the cognitive viewpoint but also in terms of unlimited applications in sports training. In this paper, the visual evoked potential (VEP) and event-related potential (ERP) signals of amateur and trained volleyball players in a pilot study were processed. Two groups of amateur and trained subjects are asked to imagine themselves in the state of receiving a ball while they are shown a simulated volleyball field. The proposed method is based on a set of time-frequency features using algorithms such as Gabor filter, continuous wavelet transform, and a multi-stage wavelet decomposition that are extracted from VEP signals that can be indicative of being amateur or trained. The linear discriminant classifier achieves the accuracy, sensitivity, and specificity of 100% when the average of the repetitions of the signal corresponding to the task is used. The main purpose of this study is to investigate the feasibility of a fast, robust, and reliable feature/model determination as a neurofeedback parameter to be utilized for improving the volleyball players’ performance. The proposed measure has potential applications in brain-computer interface technology where a real-time biomarker is needed.

Keywords: visual evoked potential, time-frequency feature extraction, short-time Fourier transform, event-related spectrum potential classification, linear discriminant analysis

Procedia PDF Downloads 117
109 Iris Cancer Detection System Using Image Processing and Neural Classifier

Authors: Abdulkader Helwan

Abstract:

Iris cancer, so called intraocular melanoma is a cancer that starts in the iris; the colored part of the eye that surrounds the pupil. There is a need for an accurate and cost-effective iris cancer detection system since the available techniques used currently are still not efficient. The combination of the image processing and artificial neural networks has a great efficiency for the diagnosis and detection of the iris cancer. Image processing techniques improve the diagnosis of the cancer by enhancing the quality of the images, so the physicians diagnose properly. However, neural networks can help in making decision; whether the eye is cancerous or not. This paper aims to develop an intelligent system that stimulates a human visual detection of the intraocular melanoma, so called iris cancer. The suggested system combines both image processing techniques and neural networks. The images are first converted to grayscale, filtered, and then segmented using prewitt edge detection algorithm to detect the iris, sclera circles and the cancer. The principal component analysis is used to reduce the image size and for extracting features. Those features are considered then as inputs for a neural network which is capable of deciding if the eye is cancerous or not, throughout its experience adopted by many training iterations of different normal and abnormal eye images during the training phase. Normal images are obtained from a public database available on the internet, “Mile Research”, while the abnormal ones are obtained from another database which is the “eyecancer”. The experimental results for the proposed system show high accuracy 100% for detecting cancer and making the right decision.

Keywords: iris cancer, intraocular melanoma, cancerous, prewitt edge detection algorithm, sclera

Procedia PDF Downloads 481
108 Machine Learning Classification of Fused Sentinel-1 and Sentinel-2 Image Data Towards Mapping Fruit Plantations in Highly Heterogenous Landscapes

Authors: Yingisani Chabalala, Elhadi Adam, Khalid Adem Ali

Abstract:

Mapping smallholder fruit plantations using optical data is challenging due to morphological landscape heterogeneity and crop types having overlapped spectral signatures. Furthermore, cloud covers limit the use of optical sensing, especially in subtropical climates where they are persistent. This research assessed the effectiveness of Sentinel-1 (S1) and Sentinel-2 (S2) data for mapping fruit trees and co-existing land-use types by using support vector machine (SVM) and random forest (RF) classifiers independently. These classifiers were also applied to fused data from the two sensors. Feature ranks were extracted using the RF mean decrease accuracy (MDA) and forward variable selection (FVS) to identify optimal spectral windows to classify fruit trees. Based on RF MDA and FVS, the SVM classifier resulted in relatively high classification accuracy with overall accuracy (OA) = 0.91.6% and kappa coefficient = 0.91% when applied to the fused satellite data. Application of SVM to S1, S2, S2 selected variables and S1S2 fusion independently produced OA = 27.64, Kappa coefficient = 0.13%; OA= 87%, Kappa coefficient = 86.89%; OA = 69.33, Kappa coefficient = 69. %; OA = 87.01%, Kappa coefficient = 87%, respectively. Results also indicated that the optimal spectral bands for fruit tree mapping are green (B3) and SWIR_2 (B10) for S2, whereas for S1, the vertical-horizontal (VH) polarization band. Including the textural metrics from the VV channel improved crop discrimination and co-existing land use cover types. The fusion approach proved robust and well-suited for accurate smallholder fruit plantation mapping.

Keywords: smallholder agriculture, fruit trees, data fusion, precision agriculture

Procedia PDF Downloads 31
107 Local Interpretable Model-agnostic Explanations (LIME) Approach to Email Spam Detection

Authors: Rohini Hariharan, Yazhini R., Blessy Maria Mathew

Abstract:

The task of detecting email spam is a very important one in the era of digital technology that needs effective ways of curbing unwanted messages. This paper presents an approach aimed at making email spam categorization algorithms transparent, reliable and more trustworthy by incorporating Local Interpretable Model-agnostic Explanations (LIME). Our technique assists in providing interpretable explanations for specific classifications of emails to help users understand the decision-making process by the model. In this study, we developed a complete pipeline that incorporates LIME into the spam classification framework and allows creating simplified, interpretable models tailored to individual emails. LIME identifies influential terms, pointing out key elements that drive classification results, thus reducing opacity inherent in conventional machine learning models. Additionally, we suggest a visualization scheme for displaying keywords that will improve understanding of categorization decisions by users. We test our method on a diverse email dataset and compare its performance with various baseline models, such as Gaussian Naive Bayes, Multinomial Naive Bayes, Bernoulli Naive Bayes, Support Vector Classifier, K-Nearest Neighbors, Decision Tree, and Logistic Regression. Our testing results show that our model surpasses all other models, achieving an accuracy of 96.59% and a precision of 99.12%.

Keywords: text classification, LIME (local interpretable model-agnostic explanations), stemming, tokenization, logistic regression.

Procedia PDF Downloads 26
106 Automatic Staging and Subtype Determination for Non-Small Cell Lung Carcinoma Using PET Image Texture Analysis

Authors: Seyhan Karaçavuş, Bülent Yılmaz, Ömer Kayaaltı, Semra İçer, Arzu Taşdemir, Oğuzhan Ayyıldız, Kübra Eset, Eser Kaya

Abstract:

In this study, our goal was to perform tumor staging and subtype determination automatically using different texture analysis approaches for a very common cancer type, i.e., non-small cell lung carcinoma (NSCLC). Especially, we introduced a texture analysis approach, called Law’s texture filter, to be used in this context for the first time. The 18F-FDG PET images of 42 patients with NSCLC were evaluated. The number of patients for each tumor stage, i.e., I-II, III or IV, was 14. The patients had ~45% adenocarcinoma (ADC) and ~55% squamous cell carcinoma (SqCCs). MATLAB technical computing language was employed in the extraction of 51 features by using first order statistics (FOS), gray-level co-occurrence matrix (GLCM), gray-level run-length matrix (GLRLM), and Laws’ texture filters. The feature selection method employed was the sequential forward selection (SFS). Selected textural features were used in the automatic classification by k-nearest neighbors (k-NN) and support vector machines (SVM). In the automatic classification of tumor stage, the accuracy was approximately 59.5% with k-NN classifier (k=3) and 69% with SVM (with one versus one paradigm), using 5 features. In the automatic classification of tumor subtype, the accuracy was around 92.7% with SVM one vs. one. Texture analysis of FDG-PET images might be used, in addition to metabolic parameters as an objective tool to assess tumor histopathological characteristics and in automatic classification of tumor stage and subtype.

Keywords: cancer stage, cancer cell type, non-small cell lung carcinoma, PET, texture analysis

Procedia PDF Downloads 305
105 Evaluation of Machine Learning Algorithms and Ensemble Methods for Prediction of Students’ Graduation

Authors: Soha A. Bahanshal, Vaibhav Verdhan, Bayong Kim

Abstract:

Graduation rates at six-year colleges are becoming a more essential indicator for incoming fresh students and for university rankings. Predicting student graduation is extremely beneficial to schools and has a huge potential for targeted intervention. It is important for educational institutions since it enables the development of strategic plans that will assist or improve students' performance in achieving their degrees on time (GOT). A first step and a helping hand in extracting useful information from these data and gaining insights into the prediction of students' progress and performance is offered by machine learning techniques. Data analysis and visualization techniques are applied to understand and interpret the data. The data used for the analysis contains students who have graduated in 6 years in the academic year 2017-2018 for science majors. This analysis can be used to predict the graduation of students in the next academic year. Different Predictive modelings such as logistic regression, decision trees, support vector machines, Random Forest, Naïve Bayes, and KNeighborsClassifier are applied to predict whether a student will graduate. These classifiers were evaluated with k folds of 5. The performance of these classifiers was compared based on accuracy measurement. The results indicated that Ensemble Classifier achieves better accuracy, about 91.12%. This GOT prediction model would hopefully be useful to university administration and academics in developing measures for assisting and boosting students' academic performance and ensuring they graduate on time.

Keywords: prediction, decision trees, machine learning, support vector machine, ensemble model, student graduation, GOT graduate on time

Procedia PDF Downloads 59
104 Improving 99mTc-tetrofosmin Myocardial Perfusion Images by Time Subtraction Technique

Authors: Yasuyuki Takahashi, Hayato Ishimura, Masao Miyagawa, Teruhito Mochizuki

Abstract:

Quantitative measurement of myocardium perfusion is possible with single photon emission computed tomography (SPECT) using a semiconductor detector. However, accumulation of 99mTc-tetrofosmin in the liver may make it difficult to assess that accurately in the inferior myocardium. Our idea is to reduce the high accumulation in the liver by using dynamic SPECT imaging and a technique called time subtraction. We evaluated the performance of a new SPECT system with a cadmium-zinc-telluride solid-state semi- conductor detector (Discovery NM 530c; GE Healthcare). Our system acquired list-mode raw data over 10 minutes for a typical patient. From the data, ten SPECT images were reconstructed, one for every minute of acquired data. Reconstruction with the semiconductor detector was based on an implementation of a 3-D iterative Bayesian reconstruction algorithm. We studied 20 patients with coronary artery disease (mean age 75.4 ± 12.1 years; range 42-86; 16 males and 4 females). In each subject, 259 MBq of 99mTc-tetrofosmin was injected intravenously. We performed both a phantom and a clinical study using dynamic SPECT. An approximation to a liver-only image is obtained by reconstructing an image from the early projections during which time the liver accumulation dominates (0.5~2.5 minutes SPECT image-5~10 minutes SPECT image). The extracted liver-only image is then subtracted from a later SPECT image that shows both the liver and the myocardial uptake (5~10 minutes SPECT image-liver-only image). The time subtraction of liver was possible in both a phantom and the clinical study. The visualization of the inferior myocardium was improved. In past reports, higher accumulation in the myocardium due to the overlap of the liver is un-diagnosable. Using our time subtraction method, the image quality of the 99mTc-tetorofosmin myocardial SPECT image is considerably improved.

Keywords: 99mTc-tetrofosmin, dynamic SPECT, time subtraction, semiconductor detector

Procedia PDF Downloads 312
103 Identification of Spam Keywords Using Hierarchical Category in C2C E-Commerce

Authors: Shao Bo Cheng, Yong-Jin Han, Se Young Park, Seong-Bae Park

Abstract:

Consumer-to-Consumer (C2C) E-commerce has been growing at a very high speed in recent years. Since identical or nearly-same kinds of products compete one another by relying on keyword search in C2C E-commerce, some sellers describe their products with spam keywords that are popular but are not related to their products. Though such products get more chances to be retrieved and selected by consumers than those without spam keywords, the spam keywords mislead the consumers and waste their time. This problem has been reported in many commercial services like e-bay and taobao, but there have been little research to solve this problem. As a solution to this problem, this paper proposes a method to classify whether keywords of a product are spam or not. The proposed method assumes that a keyword for a given product is more reliable if the keyword is observed commonly in specifications of products which are the same or the same kind as the given product. This is because that a hierarchical category of a product in general determined precisely by a seller of the product and so is the specification of the product. Since higher layers of the hierarchical category represent more general kinds of products, a reliable degree is differently determined according to the layers. Hence, reliable degrees from different layers of a hierarchical category become features for keywords and they are used together with features only from specifications for classification of the keywords. Support Vector Machines are adopted as a basic classifier using the features, since it is powerful, and widely used in many classification tasks. In the experiments, the proposed method is evaluated with a golden standard dataset from Yi-han-wang, a Chinese C2C e-commerce, and is compared with a baseline method that does not consider the hierarchical category. The experimental results show that the proposed method outperforms the baseline in F1-measure, which proves that spam keywords are effectively identified by a hierarchical category in C2C e-commerce.

Keywords: spam keyword, e-commerce, keyword features, spam filtering

Procedia PDF Downloads 274
102 Comparison of Different Artificial Intelligence-Based Protein Secondary Structure Prediction Methods

Authors: Jamerson Felipe Pereira Lima, Jeane Cecília Bezerra de Melo

Abstract:

The difficulty and cost related to obtaining of protein tertiary structure information through experimental methods, such as X-ray crystallography or NMR spectroscopy, helped raising the development of computational methods to do so. An approach used in these last is prediction of tridimensional structure based in the residue chain, however, this has been proved an NP-hard problem, due to the complexity of this process, explained by the Levinthal paradox. An alternative solution is the prediction of intermediary structures, such as the secondary structure of the protein. Artificial Intelligence methods, such as Bayesian statistics, artificial neural networks (ANN), support vector machines (SVM), among others, were used to predict protein secondary structure. Due to its good results, artificial neural networks have been used as a standard method to predict protein secondary structure. Recent published methods that use this technique, in general, achieved a Q3 accuracy between 75% and 83%, whereas the theoretical accuracy limit for protein prediction is 88%. Alternatively, to achieve better results, support vector machines prediction methods have been developed. The statistical evaluation of methods that use different AI techniques, such as ANNs and SVMs, for example, is not a trivial problem, since different training sets, validation techniques, as well as other variables can influence the behavior of a prediction method. In this study, we propose a prediction method based on artificial neural networks, which is then compared with a selected SVM method. The chosen SVM protein secondary structure prediction method is the one proposed by Huang in his work Extracting Physico chemical Features to Predict Protein Secondary Structure (2013). The developed ANN method has the same training and testing process that was used by Huang to validate his method, which comprises the use of the CB513 protein data set and three-fold cross-validation, so that the comparative analysis of the results can be made comparing directly the statistical results of each method.

Keywords: artificial neural networks, protein secondary structure, protein structure prediction, support vector machines

Procedia PDF Downloads 594
101 Developing a DNN Model for the Production of Biogas From a Hybrid BO-TPE System in an Anaerobic Wastewater Treatment Plant

Authors: Hadjer Sadoune, Liza Lamini, Scherazade Krim, Amel Djouadi, Rachida Rihani

Abstract:

Deep neural networks are highly regarded for their accuracy in predicting intricate fermentation processes. Their ability to learn from a large amount of datasets through artificial intelligence makes them particularly effective models. The primary obstacle in improving the performance of these models is to carefully choose the suitable hyperparameters, including the neural network architecture (number of hidden layers and hidden units), activation function, optimizer, learning rate, and other relevant factors. This study predicts biogas production from real wastewater treatment plant data using a sophisticated approach: hybrid Bayesian optimization with a tree-structured Parzen estimator (BO-TPE) for an optimised deep neural network (DNN) model. The plant utilizes an Upflow Anaerobic Sludge Blanket (UASB) digester that treats industrial wastewater from soft drinks and breweries. The digester has a working volume of 1574 m3 and a total volume of 1914 m3. Its internal diameter and height were 19 and 7.14 m, respectively. The data preprocessing was conducted with meticulous attention to preserving data quality while avoiding data reduction. Three normalization techniques were applied to the pre-processed data (MinMaxScaler, RobustScaler and StandardScaler) and compared with the Non-Normalized data. The RobustScaler approach has strong predictive ability for estimating the volume of biogas produced. The highest predicted biogas volume was 2236.105 Nm³/d, with coefficient of determination (R2), mean absolute error (MAE), and root mean square error (RMSE) values of 0.712, 164.610, and 223.429, respectively.

Keywords: anaerobic digestion, biogas production, deep neural network, hybrid bo-tpe, hyperparameters tuning

Procedia PDF Downloads 20
100 Markov Random Field-Based Segmentation Algorithm for Detection of Land Cover Changes Using Uninhabited Aerial Vehicle Synthetic Aperture Radar Polarimetric Images

Authors: Mehrnoosh Omati, Mahmod Reza Sahebi

Abstract:

The information on land use/land cover changing plays an essential role for environmental assessment, planning and management in regional development. Remotely sensed imagery is widely used for providing information in many change detection applications. Polarimetric Synthetic aperture radar (PolSAR) image, with the discrimination capability between different scattering mechanisms, is a powerful tool for environmental monitoring applications. This paper proposes a new boundary-based segmentation algorithm as a fundamental step for land cover change detection. In this method, first, two PolSAR images are segmented using integration of marker-controlled watershed algorithm and coupled Markov random field (MRF). Then, object-based classification is performed to determine changed/no changed image objects. Compared with pixel-based support vector machine (SVM) classifier, this novel segmentation algorithm significantly reduces the speckle effect in PolSAR images and improves the accuracy of binary classification in object-based level. The experimental results on Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR) polarimetric images show a 3% and 6% improvement in overall accuracy and kappa coefficient, respectively. Also, the proposed method can correctly distinguish homogeneous image parcels.

Keywords: coupled Markov random field (MRF), environment, object-based analysis, polarimetric SAR (PolSAR) images

Procedia PDF Downloads 201
99 Vehicle Speed Estimation Using Image Processing

Authors: Prodipta Bhowmik, Poulami Saha, Preety Mehra, Yogesh Soni, Triloki Nath Jha

Abstract:

In India, the smart city concept is growing day by day. So, for smart city development, a better traffic management and monitoring system is a very important requirement. Nowadays, road accidents increase due to more vehicles on the road. Reckless driving is mainly responsible for a huge number of accidents. So, an efficient traffic management system is required for all kinds of roads to control the traffic speed. The speed limit varies from road to road basis. Previously, there was a radar system but due to high cost and less precision, the radar system is unable to become favorable in a traffic management system. Traffic management system faces different types of problems every day and it has become a researchable topic on how to solve this problem. This paper proposed a computer vision and machine learning-based automated system for multiple vehicle detection, tracking, and speed estimation of vehicles using image processing. Detection of vehicles and estimating their speed from a real-time video is tough work to do. The objective of this paper is to detect vehicles and estimate their speed as accurately as possible. So for this, a real-time video is first captured, then the frames are extracted from that video, then from that frames, the vehicles are detected, and thereafter, the tracking of vehicles starts, and finally, the speed of the moving vehicles is estimated. The goal of this method is to develop a cost-friendly system that can able to detect multiple types of vehicles at the same time.

Keywords: OpenCV, Haar Cascade classifier, DLIB, YOLOV3, centroid tracker, vehicle detection, vehicle tracking, vehicle speed estimation, computer vision

Procedia PDF Downloads 63
98 Intrusion Detection in Cloud Computing Using Machine Learning

Authors: Faiza Babur Khan, Sohail Asghar

Abstract:

With an emergence of distributed environment, cloud computing is proving to be the most stimulating computing paradigm shift in computer technology, resulting in spectacular expansion in IT industry. Many companies have augmented their technical infrastructure by adopting cloud resource sharing architecture. Cloud computing has opened doors to unlimited opportunities from application to platform availability, expandable storage and provision of computing environment. However, from a security viewpoint, an added risk level is introduced from clouds, weakening the protection mechanisms, and hardening the availability of privacy, data security and on demand service. Issues of trust, confidentiality, and integrity are elevated due to multitenant resource sharing architecture of cloud. Trust or reliability of cloud refers to its capability of providing the needed services precisely and unfailingly. Confidentiality is the ability of the architecture to ensure authorization of the relevant party to access its private data. It also guarantees integrity to protect the data from being fabricated by an unauthorized user. So in order to assure provision of secured cloud, a roadmap or model is obligatory to analyze a security problem, design mitigation strategies, and evaluate solutions. The aim of the paper is twofold; first to enlighten the factors which make cloud security critical along with alleviation strategies and secondly to propose an intrusion detection model that identifies the attackers in a preventive way using machine learning Random Forest classifier with an accuracy of 99.8%. This model uses less number of features. A comparison with other classifiers is also presented.

Keywords: cloud security, threats, machine learning, random forest, classification

Procedia PDF Downloads 304
97 Iris Recognition Based on the Low Order Norms of Gradient Components

Authors: Iman A. Saad, Loay E. George

Abstract:

Iris pattern is an important biological feature of human body; it becomes very hot topic in both research and practical applications. In this paper, an algorithm is proposed for iris recognition and a simple, efficient and fast method is introduced to extract a set of discriminatory features using first order gradient operator applied on grayscale images. The gradient based features are robust, up to certain extents, against the variations may occur in contrast or brightness of iris image samples; the variations are mostly occur due lightening differences and camera changes. At first, the iris region is located, after that it is remapped to a rectangular area of size 360x60 pixels. Also, a new method is proposed for detecting eyelash and eyelid points; it depends on making image statistical analysis, to mark the eyelash and eyelid as a noise points. In order to cover the features localization (variation), the rectangular iris image is partitioned into N overlapped sub-images (blocks); then from each block a set of different average directional gradient densities values is calculated to be used as texture features vector. The applied gradient operators are taken along the horizontal, vertical and diagonal directions. The low order norms of gradient components were used to establish the feature vector. Euclidean distance based classifier was used as a matching metric for determining the degree of similarity between the features vector extracted from the tested iris image and template features vectors stored in the database. Experimental tests were performed using 2639 iris images from CASIA V4-Interival database, the attained recognition accuracy has reached up to 99.92%.

Keywords: iris recognition, contrast stretching, gradient features, texture features, Euclidean metric

Procedia PDF Downloads 313
96 Fake News Detection Based on Fusion of Domain Knowledge and Expert Knowledge

Authors: Yulan Wu

Abstract:

The spread of fake news on social media has posed significant societal harm to the public and the nation, with its threats spanning various domains, including politics, economics, health, and more. News on social media often covers multiple domains, and existing models studied by researchers and relevant organizations often perform well on datasets from a single domain. However, when these methods are applied to social platforms with news spanning multiple domains, their performance significantly deteriorates. Existing research has attempted to enhance the detection performance of multi-domain datasets by adding single-domain labels to the data. However, these methods overlook the fact that a news article typically belongs to multiple domains, leading to the loss of domain knowledge information contained within the news text. To address this issue, research has found that news records in different domains often use different vocabularies to describe their content. In this paper, we propose a fake news detection framework that combines domain knowledge and expert knowledge. Firstly, it utilizes an unsupervised domain discovery module to generate a low-dimensional vector for each news article, representing domain embeddings, which can retain multi-domain knowledge of the news content. Then, a feature extraction module uses the domain embeddings discovered through unsupervised domain knowledge to guide multiple experts in extracting news knowledge for the total feature representation. Finally, a classifier is used to determine whether the news is fake or not. Experiments show that this approach can improve multi-domain fake news detection performance while reducing the cost of manually labeling domain labels.

Keywords: fake news, deep learning, natural language processing, multiple domains

Procedia PDF Downloads 47
95 Roof and Road Network Detection through Object Oriented SVM Approach Using Low Density LiDAR and Optical Imagery in Misamis Oriental, Philippines

Authors: Jigg L. Pelayo, Ricardo G. Villar, Einstine M. Opiso

Abstract:

The advances of aerial laser scanning in the Philippines has open-up entire fields of research in remote sensing and machine vision aspire to provide accurate timely information for the government and the public. Rapid mapping of polygonal roads and roof boundaries is one of its utilization offering application to disaster risk reduction, mitigation and development. The study uses low density LiDAR data and high resolution aerial imagery through object-oriented approach considering the theoretical concept of data analysis subjected to machine learning algorithm in minimizing the constraints of feature extraction. Since separating one class from another in distinct regions of a multi-dimensional feature-space, non-trivial computing for fitting distribution were implemented to formulate the learned ideal hyperplane. Generating customized hybrid feature which were then used in improving the classifier findings. Supplemental algorithms for filtering and reshaping object features are develop in the rule set for enhancing the final product. Several advantages in terms of simplicity, applicability, and process transferability is noticeable in the methodology. The algorithm was tested in the different random locations of Misamis Oriental province in the Philippines demonstrating robust performance in the overall accuracy with greater than 89% and potential to semi-automation. The extracted results will become a vital requirement for decision makers, urban planners and even the commercial sector in various assessment processes.

Keywords: feature extraction, machine learning, OBIA, remote sensing

Procedia PDF Downloads 340