Search results for: support vector machine classifier
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9863

Search results for: support vector machine classifier

9623 Application of Model Tree in the Prediction of TBM Rate of Penetration with Synthetic Minority Oversampling Technique

Authors: Ehsan Mehryaar

Abstract:

The rate of penetration is (RoP) one of the vital factors in the cost and time of tunnel boring projects; therefore, predicting it can lead to a substantial increase in the efficiency of the project. RoP is heavily dependent geological properties of the project site and TBM properties. In this study, 151-point data from Queen’s water tunnel is collected, which includes unconfined compression strength, peak slope index, angle with weak planes, and distance between planes of weaknesses. Since the size of the data is small, it was observed that it is imbalanced. To solve that problem synthetic minority oversampling technique is utilized. The model based on the model tree is proposed, where each leaf consists of a support vector machine model. Proposed model performance is then compared to existing empirical equations in the literature.

Keywords: Model tree, SMOTE, rate of penetration, TBM(tunnel boring machine), SVM

Procedia PDF Downloads 164
9622 Fraud Detection in Credit Cards with Machine Learning

Authors: Anjali Chouksey, Riya Nimje, Jahanvi Saraf

Abstract:

Online transactions have increased dramatically in this new ‘social-distancing’ era. With online transactions, Fraud in online payments has also increased significantly. Frauds are a significant problem in various industries like insurance companies, baking, etc. These frauds include leaking sensitive information related to the credit card, which can be easily misused. Due to the government also pushing online transactions, E-commerce is on a boom. But due to increasing frauds in online payments, these E-commerce industries are suffering a great loss of trust from their customers. These companies are finding credit card fraud to be a big problem. People have started using online payment options and thus are becoming easy targets of credit card fraud. In this research paper, we will be discussing machine learning algorithms. We have used a decision tree, XGBOOST, k-nearest neighbour, logistic-regression, random forest, and SVM on a dataset in which there are transactions done online mode using credit cards. We will test all these algorithms for detecting fraud cases using the confusion matrix, F1 score, and calculating the accuracy score for each model to identify which algorithm can be used in detecting frauds.

Keywords: machine learning, fraud detection, artificial intelligence, decision tree, k nearest neighbour, random forest, XGBOOST, logistic regression, support vector machine

Procedia PDF Downloads 132
9621 A New Instrumented Drop-Weight Test Machine for Studying the Impact Behaviour of Reinforced Concrete Beams

Authors: M. Al-Farttoosi, M. Y. Rafiq, J. Summerscales, C. Williams

Abstract:

Structures can be subjected to impact loading from various sources like earthquake, tsunami, missiles and explosions. The impact loading can cause different degrees of damage to concrete structures. The demand for strengthening and rehabilitation of damaged structures is increasing. In recent years, Car0bon Fibre Reinforced Polymer (CFRP) matrix composites has gain more attention for strengthening and repairing these structures. To study the impact behaviour of the reinforced concrete (RC) beams strengthened or repaired using CFRP, a heavy impact test machine was designed and manufactured .The machine included a newly designed support system for beams together with various instrumentation. This paper describes the support design configuration of the impact test machine, instrumentation and dynamic analysis of the concrete beams. To evaluate the efficiency of the new impact test machine, experimental impact tests were conducted on simple supported reinforced concrete beam. Different methods were used to determine the impact force and impact response of the RC beams in terms of inertia force, maximum deflection, reaction force and fracture energy. The manufactured impact test machine was successfully used in testing RC beams under impact loading and used successfully to test the reinforced concrete beams strengthened or repaired using CFRP under impact loading.

Keywords: beam, concrete, impact, machine

Procedia PDF Downloads 401
9620 CRISPR-DT: Designing gRNAs for the CRISPR-Cpf1 System with Improved Target Efficiency and Specificity

Authors: Houxiang Zhu, Chun Liang

Abstract:

The CRISPR-Cpf1 system has been successfully applied in genome editing. However, target efficiency of the CRISPR-Cpf1 system varies among different gRNA sequences. The published CRISPR-Cpf1 gRNA data was reanalyzed. Many sequences and structural features of gRNAs (e.g., the position-specific nucleotide composition, position-nonspecific nucleotide composition, GC content, minimum free energy, and melting temperature) correlated with target efficiency were found. Using machine learning technology, a support vector machine (SVM) model was created to predict target efficiency for any given gRNAs. The first web service application, CRISPR-DT (CRISPR DNA Targeting), has been developed to help users design optimal gRNAs for the CRISPR-Cpf1 system by considering both target efficiency and specificity. CRISPR-DT will empower researchers in genome editing.

Keywords: CRISPR-Cpf1, genome editing, target efficiency, target specificity

Procedia PDF Downloads 249
9619 Fusion Models for Cyber Threat Defense: Integrating Clustering, Random Forests, and Support Vector Machines to Against Windows Malware

Authors: Azita Ramezani, Atousa Ramezani

Abstract:

In the ever-escalating landscape of windows malware the necessity for pioneering defense strategies turns into undeniable this study introduces an avant-garde approach fusing the capabilities of clustering random forests and support vector machines SVM to combat the intricate web of cyber threats our fusion model triumphs with a staggering accuracy of 98.67 and an equally formidable f1 score of 98.68 a testament to its effectiveness in the realm of windows malware defense by deciphering the intricate patterns within malicious code our model not only raises the bar for detection precision but also redefines the paradigm of cybersecurity preparedness this breakthrough underscores the potential embedded in the fusion of diverse analytical methodologies and signals a paradigm shift in fortifying against the relentless evolution of windows malicious threats as we traverse through the dynamic cybersecurity terrain this research serves as a beacon illuminating the path toward a resilient future where innovative fusion models stand at the forefront of cyber threat defense.

Keywords: fusion models, cyber threat defense, windows malware, clustering, random forests, support vector machines (SVM), accuracy, f1-score, cybersecurity, malicious code detection

Procedia PDF Downloads 49
9618 Power Control of a Doubly-Fed Induction Generator Used in Wind Turbine by RST Controller

Authors: A. Boualouch, A. Frigui, T. Nasser, A. Essadki, A.Boukhriss

Abstract:

This work deals with the vector control of the active and reactive powers of a Double-Fed Induction generator DFIG used as a wind generator by the polynomial RST controller. The control of the statoric power transfer between the machine and the grid is achieved by acting on the rotor parameters and control is provided by the polynomial controller RST. The performance and robustness of the controller are compared with PI controller and evaluated by simulation results in MATLAB/simulink.

Keywords: DFIG, RST, vector control, wind turbine

Procedia PDF Downloads 638
9617 Tibyan Automated Arabic Correction Using Machine-Learning in Detecting Syntactical Mistakes

Authors: Ashwag O. Maghraby, Nida N. Khan, Hosnia A. Ahmed, Ghufran N. Brohi, Hind F. Assouli, Jawaher S. Melibari

Abstract:

The Arabic language is one of the most important languages. Learning it is so important for many people around the world because of its religious and economic importance and the real challenge lies in practicing it without grammatical or syntactical mistakes. This research focused on detecting and correcting the syntactic mistakes of Arabic syntax according to their position in the sentence and focused on two of the main syntactical rules in Arabic: Dual and Plural. It analyzes each sentence in the text, using Stanford CoreNLP morphological analyzer and machine-learning approach in order to detect the syntactical mistakes and then correct it. A prototype of the proposed system was implemented and evaluated. It uses support vector machine (SVM) algorithm to detect Arabic grammatical errors and correct them using the rule-based approach. The prototype system has a far accuracy 81%. In general, it shows a set of useful grammatical suggestions that the user may forget about while writing due to lack of familiarity with grammar or as a result of the speed of writing such as alerting the user when using a plural term to indicate one person.

Keywords: Arabic language acquisition and learning, natural language processing, morphological analyzer, part-of-speech

Procedia PDF Downloads 134
9616 Performance Analysis and Optimization for Diagonal Sparse Matrix-Vector Multiplication on Machine Learning Unit

Authors: Qiuyu Dai, Haochong Zhang, Xiangrong Liu

Abstract:

Diagonal sparse matrix-vector multiplication is a well-studied topic in the fields of scientific computing and big data processing. However, when diagonal sparse matrices are stored in DIA format, there can be a significant number of padded zero elements and scattered points, which can lead to a degradation in the performance of the current DIA kernel. This can also lead to excessive consumption of computational and memory resources. In order to address these issues, the authors propose the DIA-Adaptive scheme and its kernel, which leverages the parallel instruction sets on MLU. The researchers analyze the effect of allocating a varying number of threads, clusters, and hardware architectures on the performance of SpMV using different formats. The experimental results indicate that the proposed DIA-Adaptive scheme performs well and offers excellent parallelism.

Keywords: adaptive method, DIA, diagonal sparse matrices, MLU, sparse matrix-vector multiplication

Procedia PDF Downloads 108
9615 A Comparative Study of k-NN and MLP-NN Classifiers Using GA-kNN Based Feature Selection Method for Wood Recognition System

Authors: Uswah Khairuddin, Rubiyah Yusof, Nenny Ruthfalydia Rosli

Abstract:

This paper presents a comparative study between k-Nearest Neighbour (k-NN) and Multi-Layer Perceptron Neural Network (MLP-NN) classifier using Genetic Algorithm (GA) as feature selector for wood recognition system. The features have been extracted from the images using Grey Level Co-Occurrence Matrix (GLCM). The use of GA based feature selection is mainly to ensure that the database used for training the features for the wood species pattern classifier consists of only optimized features. The feature selection process is aimed at selecting only the most discriminating features of the wood species to reduce the confusion for the pattern classifier. This feature selection approach maintains the ‘good’ features that minimizes the inter-class distance and maximizes the intra-class distance. Wrapper GA is used with k-NN classifier as fitness evaluator (GA-kNN). The results shows that k-NN is the best choice of classifier because it uses a very simple distance calculation algorithm and classification tasks can be done in a short time with good classification accuracy.

Keywords: feature selection, genetic algorithm, optimization, wood recognition system

Procedia PDF Downloads 527
9614 Landslide Susceptibility Mapping Using Soft Computing in Amhara Saint

Authors: Semachew M. Kassa, Africa M Geremew, Tezera F. Azmatch, Nandyala Darga Kumar

Abstract:

Frequency ratio (FR) and analytical hierarchy process (AHP) methods are developed based on past landslide failure points to identify the landslide susceptibility mapping because landslides can seriously harm both the environment and society. However, it is still difficult to select the most efficient method and correctly identify the main driving factors for particular regions. In this study, we used fourteen landslide conditioning factors (LCFs) and five soft computing algorithms, including Random Forest (RF), Support Vector Machine (SVM), Logistic Regression (LR), Artificial Neural Network (ANN), and Naïve Bayes (NB), to predict the landslide susceptibility at 12.5 m spatial scale. The performance of the RF (F1-score: 0.88, AUC: 0.94), ANN (F1-score: 0.85, AUC: 0.92), and SVM (F1-score: 0.82, AUC: 0.86) methods was significantly better than the LR (F1-score: 0.75, AUC: 0.76) and NB (F1-score: 0.73, AUC: 0.75) method, according to the classification results based on inventory landslide points. The findings also showed that around 35% of the study region was made up of places with high and very high landslide risk (susceptibility greater than 0.5). The very high-risk locations were primarily found in the western and southeastern regions, and all five models showed good agreement and similar geographic distribution patterns in landslide susceptibility. The towns with the highest landslide risk include Amhara Saint Town's western part, the Northern part, and St. Gebreal Church villages, with mean susceptibility values greater than 0.5. However, rainfall, distance to road, and slope were typically among the top leading factors for most villages. The primary contributing factors to landslide vulnerability were slightly varied for the five models. Decision-makers and policy planners can use the information from our study to make informed decisions and establish policies. It also suggests that various places should take different safeguards to reduce or prevent serious damage from landslide events.

Keywords: artificial neural network, logistic regression, landslide susceptibility, naïve Bayes, random forest, support vector machine

Procedia PDF Downloads 53
9613 Efficient Antenna Array Beamforming with Robustness against Random Steering Mismatch

Authors: Ju-Hong Lee, Ching-Wei Liao, Kun-Che Lee

Abstract:

This paper deals with the problem of using antenna sensors for adaptive beamforming in the presence of random steering mismatch. We present an efficient adaptive array beamformer with robustness to deal with the considered problem. The robustness of the proposed beamformer comes from the efficient designation of the steering vector. Using the received array data vector, we construct an appropriate correlation matrix associated with the received array data vector and a correlation matrix associated with signal sources. Then, the eigenvector associated with the largest eigenvalue of the constructed signal correlation matrix is designated as an appropriate estimate of the steering vector. Finally, the adaptive weight vector required for adaptive beamforming is obtained by using the estimated steering vector and the constructed correlation matrix of the array data vector. Simulation results confirm the effectiveness of the proposed method.

Keywords: adaptive beamforming, antenna array, linearly constrained minimum variance, robustness, steering vector

Procedia PDF Downloads 184
9612 Intelligent Decision Support for Wind Park Operation: Machine-Learning Based Detection and Diagnosis of Anomalous Operating States

Authors: Angela Meyer

Abstract:

The operation and maintenance cost for wind parks make up a major fraction of the park’s overall lifetime cost. To minimize the cost and risk involved, an optimal operation and maintenance strategy requires continuous monitoring and analysis. In order to facilitate this, we present a decision support system that automatically scans the stream of telemetry sensor data generated from the turbines. By learning decision boundaries and normal reference operating states using machine learning algorithms, the decision support system can detect anomalous operating behavior in individual wind turbines and diagnose the involved turbine sub-systems. Operating personal can be alerted if a normal operating state boundary is exceeded. The presented decision support system and method are applicable for any turbine type and manufacturer providing telemetry data of the turbine operating state. We demonstrate the successful detection and diagnosis of anomalous operating states in a case study at a German onshore wind park comprised of Vestas V112 turbines.

Keywords: anomaly detection, decision support, machine learning, monitoring, performance optimization, wind turbines

Procedia PDF Downloads 155
9611 Local Interpretable Model-agnostic Explanations (LIME) Approach to Email Spam Detection

Authors: Rohini Hariharan, Yazhini R., Blessy Maria Mathew

Abstract:

The task of detecting email spam is a very important one in the era of digital technology that needs effective ways of curbing unwanted messages. This paper presents an approach aimed at making email spam categorization algorithms transparent, reliable and more trustworthy by incorporating Local Interpretable Model-agnostic Explanations (LIME). Our technique assists in providing interpretable explanations for specific classifications of emails to help users understand the decision-making process by the model. In this study, we developed a complete pipeline that incorporates LIME into the spam classification framework and allows creating simplified, interpretable models tailored to individual emails. LIME identifies influential terms, pointing out key elements that drive classification results, thus reducing opacity inherent in conventional machine learning models. Additionally, we suggest a visualization scheme for displaying keywords that will improve understanding of categorization decisions by users. We test our method on a diverse email dataset and compare its performance with various baseline models, such as Gaussian Naive Bayes, Multinomial Naive Bayes, Bernoulli Naive Bayes, Support Vector Classifier, K-Nearest Neighbors, Decision Tree, and Logistic Regression. Our testing results show that our model surpasses all other models, achieving an accuracy of 96.59% and a precision of 99.12%.

Keywords: text classification, LIME (local interpretable model-agnostic explanations), stemming, tokenization, logistic regression.

Procedia PDF Downloads 28
9610 Early Diagnosis of Myocardial Ischemia Based on Support Vector Machine and Gaussian Mixture Model by Using Features of ECG Recordings

Authors: Merve Begum Terzi, Orhan Arikan, Adnan Abaci, Mustafa Candemir

Abstract:

Acute myocardial infarction is a major cause of death in the world. Therefore, its fast and reliable diagnosis is a major clinical need. ECG is the most important diagnostic methodology which is used to make decisions about the management of the cardiovascular diseases. In patients with acute myocardial ischemia, temporary chest pains together with changes in ST segment and T wave of ECG occur shortly before the start of myocardial infarction. In this study, a technique which detects changes in ST/T sections of ECG is developed for the early diagnosis of acute myocardial ischemia. For this purpose, a database of real ECG recordings that contains a set of records from 75 patients presenting symptoms of chest pain who underwent elective percutaneous coronary intervention (PCI) is constituted. 12-lead ECG’s of the patients were recorded before and during the PCI procedure. Two ECG epochs, which are the pre-inflation ECG which is acquired before any catheter insertion and the occlusion ECG which is acquired during balloon inflation, are analyzed for each patient. By using pre-inflation and occlusion recordings, ECG features that are critical in the detection of acute myocardial ischemia are identified and the most discriminative features for the detection of acute myocardial ischemia are extracted. A classification technique based on support vector machine (SVM) approach operating with linear and radial basis function (RBF) kernels to detect ischemic events by using ST-T derived joint features from non-ischemic and ischemic states of the patients is developed. The dataset is randomly divided into training and testing sets and the training set is used to optimize SVM hyperparameters by using grid-search method and 10fold cross-validation. SVMs are designed specifically for each patient by tuning the kernel parameters in order to obtain the optimal classification performance results. As a result of implementing the developed classification technique to real ECG recordings, it is shown that the proposed technique provides highly reliable detections of the anomalies in ECG signals. Furthermore, to develop a detection technique that can be used in the absence of ECG recording obtained during healthy stage, the detection of acute myocardial ischemia based on ECG recordings of the patients obtained during ischemia is also investigated. For this purpose, a Gaussian mixture model (GMM) is used to represent the joint pdf of the most discriminating ECG features of myocardial ischemia. Then, a Neyman-Pearson type of approach is developed to provide detection of outliers that would correspond to acute myocardial ischemia. Neyman – Pearson decision strategy is used by computing the average log likelihood values of ECG segments and comparing them with a range of different threshold values. For different discrimination threshold values and number of ECG segments, probability of detection and probability of false alarm values are computed, and the corresponding ROC curves are obtained. The results indicate that increasing number of ECG segments provide higher performance for GMM based classification. Moreover, the comparison between the performances of SVM and GMM based classification showed that SVM provides higher classification performance results over ECG recordings of considerable number of patients.

Keywords: ECG classification, Gaussian mixture model, Neyman–Pearson approach, support vector machine

Procedia PDF Downloads 142
9609 Using the Smith-Waterman Algorithm to Extract Features in the Classification of Obesity Status

Authors: Rosa Figueroa, Christopher Flores

Abstract:

Text categorization is the problem of assigning a new document to a set of predetermined categories, on the basis of a training set of free-text data that contains documents whose category membership is known. To train a classification model, it is necessary to extract characteristics in the form of tokens that facilitate the learning and classification process. In text categorization, the feature extraction process involves the use of word sequences also known as N-grams. In general, it is expected that documents belonging to the same category share similar features. The Smith-Waterman (SW) algorithm is a dynamic programming algorithm that performs a local sequence alignment in order to determine similar regions between two strings or protein sequences. This work explores the use of SW algorithm as an alternative to feature extraction in text categorization. The dataset used for this purpose, contains 2,610 annotated documents with the classes Obese/Non-Obese. This dataset was represented in a matrix form using the Bag of Word approach. The score selected to represent the occurrence of the tokens in each document was the term frequency-inverse document frequency (TF-IDF). In order to extract features for classification, four experiments were conducted: the first experiment used SW to extract features, the second one used unigrams (single word), the third one used bigrams (two word sequence) and the last experiment used a combination of unigrams and bigrams to extract features for classification. To test the effectiveness of the extracted feature set for the four experiments, a Support Vector Machine (SVM) classifier was tuned using 20% of the dataset. The remaining 80% of the dataset together with 5-Fold Cross Validation were used to evaluate and compare the performance of the four experiments of feature extraction. Results from the tuning process suggest that SW performs better than the N-gram based feature extraction. These results were confirmed by using the remaining 80% of the dataset, where SW performed the best (accuracy = 97.10%, weighted average F-measure = 97.07%). The second best was obtained by the combination of unigrams-bigrams (accuracy = 96.04, weighted average F-measure = 95.97) closely followed by the bigrams (accuracy = 94.56%, weighted average F-measure = 94.46%) and finally unigrams (accuracy = 92.96%, weighted average F-measure = 92.90%).

Keywords: comorbidities, machine learning, obesity, Smith-Waterman algorithm

Procedia PDF Downloads 278
9608 Comparison of Support Vector Machines and Artificial Neural Network Classifiers in Characterizing Threatened Tree Species Using Eight Bands of WorldView-2 Imagery in Dukuduku Landscape, South Africa

Authors: Galal Omer, Onisimo Mutanga, Elfatih M. Abdel-Rahman, Elhadi Adam

Abstract:

Threatened tree species (TTS) play a significant role in ecosystem functioning and services, land use dynamics, and other socio-economic aspects. Such aspects include ecological, economic, livelihood, security-based, and well-being benefits. The development of techniques for mapping and monitoring TTS is thus critical for understanding the functioning of ecosystems. The advent of advanced imaging systems and supervised learning algorithms has provided an opportunity to classify TTS over fragmenting landscape. Recently, vegetation maps have been produced using advanced imaging systems such as WorldView-2 (WV-2) and robust classification algorithms such as support vectors machines (SVM) and artificial neural network (ANN). However, delineation of TTS in a fragmenting landscape using high resolution imagery has widely remained elusive due to the complexity of the species structure and their distribution. Therefore, the objective of the current study was to examine the utility of the advanced WV-2 data for mapping TTS in the fragmenting Dukuduku indigenous forest of South Africa using SVM and ANN classification algorithms. The results showed the robustness of the two machine learning algorithms with an overall accuracy (OA) of 77.00% (total disagreement = 23.00%) for SVM and 75.00% (total disagreement = 25.00%) for ANN using all eight bands of WV-2 (8B). This study concludes that SVM and ANN classification algorithms with WV-2 8B have the potential to classify TTS in the Dukuduku indigenous forest. This study offers relatively accurate information that is important for forest managers to make informed decisions regarding management and conservation protocols of TTS.

Keywords: artificial neural network, threatened tree species, indigenous forest, support vector machines

Procedia PDF Downloads 495
9607 Prediction of Formation Pressure Using Artificial Intelligence Techniques

Authors: Abdulmalek Ahmed

Abstract:

Formation pressure is the main function that affects drilling operation economically and efficiently. Knowing the pore pressure and the parameters that affect it will help to reduce the cost of drilling process. Many empirical models reported in the literature were used to calculate the formation pressure based on different parameters. Some of these models used only drilling parameters to estimate pore pressure. Other models predicted the formation pressure based on log data. All of these models required different trends such as normal or abnormal to predict the pore pressure. Few researchers applied artificial intelligence (AI) techniques to predict the formation pressure by only one method or a maximum of two methods of AI. The objective of this research is to predict the pore pressure based on both drilling parameters and log data namely; weight on bit, rotary speed, rate of penetration, mud weight, bulk density, porosity and delta sonic time. A real field data is used to predict the formation pressure using five different artificial intelligence (AI) methods such as; artificial neural networks (ANN), radial basis function (RBF), fuzzy logic (FL), support vector machine (SVM) and functional networks (FN). All AI tools were compared with different empirical models. AI methods estimated the formation pressure by a high accuracy (high correlation coefficient and low average absolute percentage error) and outperformed all previous. The advantage of the new technique is its simplicity, which represented from its estimation of pore pressure without the need of different trends as compared to other models which require a two different trend (normal or abnormal pressure). Moreover, by comparing the AI tools with each other, the results indicate that SVM has the advantage of pore pressure prediction by its fast processing speed and high performance (a high correlation coefficient of 0.997 and a low average absolute percentage error of 0.14%). In the end, a new empirical correlation for formation pressure was developed using ANN method that can estimate pore pressure with a high precision (correlation coefficient of 0.998 and average absolute percentage error of 0.17%).

Keywords: Artificial Intelligence (AI), Formation pressure, Artificial Neural Networks (ANN), Fuzzy Logic (FL), Support Vector Machine (SVM), Functional Networks (FN), Radial Basis Function (RBF)

Procedia PDF Downloads 136
9606 Predictive Models of Ruin Probability in Retirement Withdrawal Strategies

Authors: Yuanjin Liu

Abstract:

Retirement withdrawal strategies are very important to minimize the probability of ruin in retirement. The ruin probability is modeled as a function of initial withdrawal age, gender, asset allocation, inflation rate, and initial withdrawal rate. The ruin probability is obtained based on the 2019 period life table for the Social Security, IRS Required Minimum Distribution (RMD) Worksheets, US historical bond and equity returns, and inflation rates using simulation. Several popular machine learning algorithms of the generalized additive model, random forest, support vector machine, extreme gradient boosting, and artificial neural network are built. The model validation and selection are based on the test errors using hyperparameter tuning and train-test split. The optimal model is recommended for retirees to monitor the ruin probability. The optimal withdrawal strategy can be obtained based on the optimal predictive model.

Keywords: ruin probability, retirement withdrawal strategies, predictive models, optimal model

Procedia PDF Downloads 59
9605 Markov Random Field-Based Segmentation Algorithm for Detection of Land Cover Changes Using Uninhabited Aerial Vehicle Synthetic Aperture Radar Polarimetric Images

Authors: Mehrnoosh Omati, Mahmod Reza Sahebi

Abstract:

The information on land use/land cover changing plays an essential role for environmental assessment, planning and management in regional development. Remotely sensed imagery is widely used for providing information in many change detection applications. Polarimetric Synthetic aperture radar (PolSAR) image, with the discrimination capability between different scattering mechanisms, is a powerful tool for environmental monitoring applications. This paper proposes a new boundary-based segmentation algorithm as a fundamental step for land cover change detection. In this method, first, two PolSAR images are segmented using integration of marker-controlled watershed algorithm and coupled Markov random field (MRF). Then, object-based classification is performed to determine changed/no changed image objects. Compared with pixel-based support vector machine (SVM) classifier, this novel segmentation algorithm significantly reduces the speckle effect in PolSAR images and improves the accuracy of binary classification in object-based level. The experimental results on Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR) polarimetric images show a 3% and 6% improvement in overall accuracy and kappa coefficient, respectively. Also, the proposed method can correctly distinguish homogeneous image parcels.

Keywords: coupled Markov random field (MRF), environment, object-based analysis, polarimetric SAR (PolSAR) images

Procedia PDF Downloads 201
9604 Innovative Predictive Modeling and Characterization of Composite Material Properties Using Machine Learning and Genetic Algorithms

Authors: Hamdi Beji, Toufik Kanit, Tanguy Messager

Abstract:

This study aims to construct a predictive model proficient in foreseeing the linear elastic and thermal characteristics of composite materials, drawing on a multitude of influencing parameters. These parameters encompass the shape of inclusions (circular, elliptical, square, triangle), their spatial coordinates within the matrix, orientation, volume fraction (ranging from 0.05 to 0.4), and variations in contrast (spanning from 10 to 200). A variety of machine learning techniques are deployed, including decision trees, random forests, support vector machines, k-nearest neighbors, and an artificial neural network (ANN), to facilitate this predictive model. Moreover, this research goes beyond the predictive aspect by delving into an inverse analysis using genetic algorithms. The intent is to unveil the intrinsic characteristics of composite materials by evaluating their thermomechanical responses. The foundation of this research lies in the establishment of a comprehensive database that accounts for the array of input parameters mentioned earlier. This database, enriched with this diversity of input variables, serves as a bedrock for the creation of machine learning and genetic algorithm-based models. These models are meticulously trained to not only predict but also elucidate the mechanical and thermal conduct of composite materials. Remarkably, the coupling of machine learning and genetic algorithms has proven highly effective, yielding predictions with remarkable accuracy, boasting scores ranging between 0.97 and 0.99. This achievement marks a significant breakthrough, demonstrating the potential of this innovative approach in the field of materials engineering.

Keywords: machine learning, composite materials, genetic algorithms, mechanical and thermal proprieties

Procedia PDF Downloads 44
9603 Early Gastric Cancer Prediction from Diet and Epidemiological Data Using Machine Learning in Mizoram Population

Authors: Brindha Senthil Kumar, Payel Chakraborty, Senthil Kumar Nachimuthu, Arindam Maitra, Prem Nath

Abstract:

Gastric cancer is predominantly caused by demographic and diet factors as compared to other cancer types. The aim of the study is to predict Early Gastric Cancer (ECG) from diet and lifestyle factors using supervised machine learning algorithms. For this study, 160 healthy individual and 80 cases were selected who had been followed for 3 years (2016-2019), at Civil Hospital, Aizawl, Mizoram. A dataset containing 11 features that are core risk factors for the gastric cancer were extracted. Supervised machine algorithms: Logistic Regression, Naive Bayes, Support Vector Machine (SVM), Multilayer perceptron, and Random Forest were used to analyze the dataset using Python Jupyter Notebook Version 3. The obtained classified results had been evaluated using metrics parameters: minimum_false_positives, brier_score, accuracy, precision, recall, F1_score, and Receiver Operating Characteristics (ROC) curve. Data analysis results showed Naive Bayes - 88, 0.11; Random Forest - 83, 0.16; SVM - 77, 0.22; Logistic Regression - 75, 0.25 and Multilayer perceptron - 72, 0.27 with respect to accuracy and brier_score in percent. Naive Bayes algorithm out performs with very low false positive rates as well as brier_score and good accuracy. Naive Bayes algorithm classification results in predicting ECG showed very satisfactory results using only diet cum lifestyle factors which will be very helpful for the physicians to educate the patients and public, thereby mortality of gastric cancer can be reduced/avoided with this knowledge mining work.

Keywords: Early Gastric cancer, Machine Learning, Diet, Lifestyle Characteristics

Procedia PDF Downloads 139
9602 Liver Tumor Detection by Classification through FD Enhancement of CT Image

Authors: N. Ghatwary, A. Ahmed, H. Jalab

Abstract:

In this paper, an approach for the liver tumor detection in computed tomography (CT) images is represented. The detection process is based on classifying the features of target liver cell to either tumor or non-tumor. Fractional differential (FD) is applied for enhancement of Liver CT images, with the aim of enhancing texture and edge features. Later on, a fusion method is applied to merge between the various enhanced images and produce a variety of feature improvement, which will increase the accuracy of classification. Each image is divided into NxN non-overlapping blocks, to extract the desired features. Support vector machines (SVM) classifier is trained later on a supplied dataset different from the tested one. Finally, the block cells are identified whether they are classified as tumor or not. Our approach is validated on a group of patients’ CT liver tumor datasets. The experiment results demonstrated the efficiency of detection in the proposed technique.

Keywords: fractional differential (FD), computed tomography (CT), fusion, aplha, texture features.

Procedia PDF Downloads 342
9601 Investigation of Topic Modeling-Based Semi-Supervised Interpretable Document Classifier

Authors: Dasom Kim, William Xiu Shun Wong, Yoonjin Hyun, Donghoon Lee, Minji Paek, Sungho Byun, Namgyu Kim

Abstract:

There have been many researches on document classification for classifying voluminous documents automatically. Through document classification, we can assign a specific category to each unlabeled document on the basis of various machine learning algorithms. However, providing labeled documents manually requires considerable time and effort. To overcome the limitations, the semi-supervised learning which uses unlabeled document as well as labeled documents has been invented. However, traditional document classifiers, regardless of supervised or semi-supervised ones, cannot sufficiently explain the reason or the process of the classification. Thus, in this paper, we proposed a methodology to visualize major topics and class components of each document. We believe that our methodology for visualizing topics and classes of each document can enhance the reliability and explanatory power of document classifiers.

Keywords: data mining, document classifier, text mining, topic modeling

Procedia PDF Downloads 383
9600 Parallel Vector Processing Using Multi Level Orbital DATA

Authors: Nagi Mekhiel

Abstract:

Many applications use vector operations by applying single instruction to multiple data that map to different locations in conventional memory. Transferring data from memory is limited by access latency and bandwidth affecting the performance gain of vector processing. We present a memory system that makes all of its content available to processors in time so that processors need not to access the memory, we force each location to be available to all processors at a specific time. The data move in different orbits to become available to other processors in higher orbits at different time. We use this memory to apply parallel vector operations to data streams at first orbit level. Data processed in the first level move to upper orbit one data element at a time, allowing a processor in that orbit to apply another vector operation to deal with serial code limitations inherited in all parallel applications and interleaved it with lower level vector operations.

Keywords: Memory Organization, Parallel Processors, Serial Code, Vector Processing

Procedia PDF Downloads 252
9599 Emotion Recognition with Occlusions Based on Facial Expression Reconstruction and Weber Local Descriptor

Authors: Jadisha Cornejo, Helio Pedrini

Abstract:

Recognition of emotions based on facial expressions has received increasing attention from the scientific community over the last years. Several fields of applications can benefit from facial emotion recognition, such as behavior prediction, interpersonal relations, human-computer interactions, recommendation systems. In this work, we develop and analyze an emotion recognition framework based on facial expressions robust to occlusions through the Weber Local Descriptor (WLD). Initially, the occluded facial expressions are reconstructed following an extension approach of Robust Principal Component Analysis (RPCA). Then, WLD features are extracted from the facial expression representation, as well as Local Binary Patterns (LBP) and Histogram of Oriented Gradients (HOG). The feature vector space is reduced using Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Finally, K-Nearest Neighbor (K-NN) and Support Vector Machine (SVM) classifiers are used to recognize the expressions. Experimental results on three public datasets demonstrated that the WLD representation achieved competitive accuracy rates for occluded and non-occluded facial expressions compared to other approaches available in the literature.

Keywords: emotion recognition, facial expression, occlusion, fiducial landmarks

Procedia PDF Downloads 162
9598 Image Multi-Feature Analysis by Principal Component Analysis for Visual Surface Roughness Measurement

Authors: Wei Zhang, Yan He, Yan Wang, Yufeng Li, Chuanpeng Hao

Abstract:

Surface roughness is an important index for evaluating surface quality, needs to be accurately measured to ensure the performance of the workpiece. The roughness measurement based on machine vision involves various image features, some of which are redundant. These redundant features affect the accuracy and speed of the visual approach. Previous research used correlation analysis methods to select the appropriate features. However, this feature analysis is independent and cannot fully utilize the information of data. Besides, blindly reducing features lose a lot of useful information, resulting in unreliable results. Therefore, the focus of this paper is on providing a redundant feature removal approach for visual roughness measurement. In this paper, the statistical methods and gray-level co-occurrence matrix(GLCM) are employed to extract the texture features of machined images effectively. Then, the principal component analysis(PCA) is used to fuse all extracted features into a new one, which reduces the feature dimension and maintains the integrity of the original information. Finally, the relationship between new features and roughness is established by the support vector machine(SVM). The experimental results show that the approach can effectively solve multi-feature information redundancy of machined surface images and provides a new idea for the visual evaluation of surface roughness.

Keywords: feature analysis, machine vision, PCA, surface roughness, SVM

Procedia PDF Downloads 199
9597 0.13-µm Complementary Metal-Oxide Semiconductor Vector Modulator for Beamforming System

Authors: J. S. Kim

Abstract:

This paper presents a 0.13-µm Complementary Metal-Oxide Semiconductor (CMOS) vector modulator for beamforming system. The vector modulator features a 360° phase and gain range of -10 dB to 10 dB with a root mean square phase and amplitude error of only 2.2° and 0.45 dB, respectively. These features make it a suitable for wireless backhaul system in the 5 GHz industrial, scientific, and medical (ISM) bands. It draws a current of 20.4 mA from a 1.2 V supply. The total chip size is 1.87x1.34 mm².

Keywords: CMOS, vector modulator, beamforming, 802.11ac

Procedia PDF Downloads 191
9596 Random Forest Classification for Population Segmentation

Authors: Regina Chua

Abstract:

To reduce the costs of re-fielding a large survey, a Random Forest classifier was applied to measure the accuracy of classifying individuals into their assigned segments with the fewest possible questions. Given a long survey, one needed to determine the most predictive ten or fewer questions that would accurately assign new individuals to custom segments. Furthermore, the solution needed to be quick in its classification and usable in non-Python environments. In this paper, a supervised Random Forest classifier was modeled on a dataset with 7,000 individuals, 60 questions, and 254 features. The Random Forest consisted of an iterative collection of individual decision trees that result in a predicted segment with robust precision and recall scores compared to a single tree. A random 70-30 stratified sampling for training the algorithm was used, and accuracy trade-offs at different depths for each segment were identified. Ultimately, the Random Forest classifier performed at 87% accuracy at a depth of 10 with 20 instead of 254 features and 10 instead of 60 questions. With an acceptable accuracy in prioritizing feature selection, new tools were developed for non-Python environments: a worksheet with a formulaic version of the algorithm and an embedded function to predict the segment of an individual in real-time. Random Forest was determined to be an optimal classification model by its feature selection, performance, processing speed, and flexible application in other environments.

Keywords: machine learning, supervised learning, data science, random forest, classification, prediction, predictive modeling

Procedia PDF Downloads 79
9595 The Diminished Online Persona: A Semantic Change of Chinese Classifier Mei on Weibo

Authors: Hui Shi

Abstract:

This study investigates a newly emerged usage of Chinese numeral classifier mei (枚) in the cyberspace. In modern Chinese grammar, mei as a classifier should occupy the pre-nominal position, and its valid accompanying nouns are restricted to small, flat, fragile inanimate objects rather than humans. To examine the semantic change of mei, two types of data from Weibo.com were collected. First, 500 mei-included Weibo posts constructed a corpus for analyzing this classifier's word order distribution (post-nominal or pre-nominal) as well as its accompanying nouns' semantics (inanimate or human). Second, considering that mei accompanies a remarkable number of human nouns in the first corpus, the second corpus is composed of mei-involved Weibo IDs from users located in first and third-tier cities (n=8 respectively). The findings show that in the cyber community, mei frequently classifies human-related neologisms at the archaic post-normal position. Besides, the 23 to 29-year-old females as well as Weibo users from third-tier cities are the major populations who adopt mei in their user IDs for self-description and identity expression. This paper argues that the creative usage of mei gains popularity in the Chinese internet due to a humor effect. The marked word order switch and semantic misapplication combined to trigger incongruity and jocularity. This study has significance for research on Chinese cyber neologism. It may also lay a foundation for further studies on Chinese classifier change and Chinese internet communication.

Keywords: Chinese classifier, humor, neologism, semantic change

Procedia PDF Downloads 240
9594 Breast Cancer Survivability Prediction via Classifier Ensemble

Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia

Abstract:

This paper presents a classifier ensemble approach for predicting the survivability of the breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components; features selection and classifier ensemble components. The features selection component divides the features in SEER database into four groups. After that it tries to find the most important features among the four groups that maximizes the weighted average F-score of a certain classification algorithm. The ensemble component uses three different classifiers, each of which models different set of features from SEER through the features selection module. On top of them, another classifier is used to give the final decision based on the output decisions and confidence scores from each of the underlying classifiers. Different classification algorithms have been examined; the best setup found is by using the decision tree, Bayesian network, and Na¨ıve Bayes algorithms for the underlying classifiers and Na¨ıve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same data of SEER (period of 1973-2002). It gives 87.39% weighted average F-score compared to 85.82% and 81.34% of the other published systems. By increasing the data size to cover the whole database (period of 1973-2014), the overall weighted average F-score jumps to 92.4% on the held out unseen test set.

Keywords: classifier ensemble, breast cancer survivability, data mining, SEER

Procedia PDF Downloads 304