Search results for: classifier algorithms
1595 Distributed Real-time Framework for Experimental Multi Aerial Robotic Systems
Authors: Samuel Knox, Verdon Crann, Peyman Amiri, William Crowther
Abstract:
There exists a shortage of open-source firmware for allowing researchers to focus on implementing high-level planning and control strategies for multi aerial robotic systems in simulation and experiment. Within this body of work, practical firmware is presented, which performs all supplementary tasks, including communications, pre and post-experiment procedures, and emergency safety measures. This allows researchers to implement high-level planning and control algorithms for path planning, traffic management, flight formation and swarming of aerial robots. The framework is built in Python using the MAVSDK library, which is compatible with flight controllers running PX4 firmware and onboard computers based on Linux. Communication is performed using Wi-Fi and the MQTT protocol, currently implemented using a centralized broker. Finally, a graphical user interface (GUI) has been developed to send general commands and monitor the agents. This framework enables researchers to prepare customized planning and control algorithms in a modular manner. Studies can be performed experimentally and in simulation using PX4 software in the loop (SITL) and the Gazebo simulator. An example experimental use case of the framework is presented using novel distributed planning and control strategies. The demonstration is performed using off-the-shelf components and minimal setup.Keywords: aerial robotics, distributed framework, experimental, planning and control
Procedia PDF Downloads 1121594 Medical Neural Classifier Based on Improved Genetic Algorithm
Authors: Fadzil Ahmad, Noor Ashidi Mat Isa
Abstract:
This study introduces an improved genetic algorithm procedure that focuses search around near optimal solution corresponded to a group of elite chromosome. This is achieved through a novel crossover technique known as Segmented Multi Chromosome Crossover. It preserves the highly important information contained in a gene segment of elite chromosome and allows an offspring to carry information from gene segment of multiple chromosomes. In this way the algorithm has better possibility to effectively explore the solution space. The improved GA is applied for the automatic and simultaneous parameter optimization and feature selection of artificial neural network in pattern recognition of medical problem, the cancer and diabetes disease. The experimental result shows that the average classification accuracy of the cancer and diabetes dataset has improved by 0.1% and 0.3% respectively using the new algorithm.Keywords: genetic algorithm, artificial neural network, pattern clasification, classification accuracy
Procedia PDF Downloads 4741593 Localization of Buried People Using Received Signal Strength Indication Measurement of Wireless Sensor
Authors: Feng Tao, Han Ye, Shaoyi Liao
Abstract:
City constructions collapse after earthquake and people will be buried under ruins. Search and rescue should be conducted as soon as possible to save them. Therefore, according to the complicated environment, irregular aftershocks and rescue allow of no delay, a kind of target localization method based on RSSI (Received Signal Strength Indication) is proposed in this article. The target localization technology based on RSSI with the features of low cost and low complexity has been widely applied to nodes localization in WSN (Wireless Sensor Networks). Based on the theory of RSSI transmission and the environment impact to RSSI, this article conducts the experiments in five scenes, and multiple filtering algorithms are applied to original RSSI value in order to establish the signal propagation model with minimum test error respectively. Target location can be calculated from the distance, which can be estimated from signal propagation model, through improved centroid algorithm. Result shows that the localization technology based on RSSI is suitable for large-scale nodes localization. Among filtering algorithms, mixed filtering algorithm (average of average, median and Gaussian filtering) performs better than any other single filtering algorithm, and by using the signal propagation model, the minimum error of distance between known nodes and target node in the five scene is about 3.06m.Keywords: signal propagation model, centroid algorithm, localization, mixed filtering, RSSI
Procedia PDF Downloads 3001592 Improvement of the Robust Proportional–Integral–Derivative (PID) Controller Parameters for Controlling the Frequency in the Intelligent Multi-Zone System at the Present of Wind Generation Using the Seeker Optimization Algorithm
Authors: Roya Ahmadi Ahangar, Hamid Madadyari
Abstract:
The seeker optimization algorithm (SOA) is increasingly gaining popularity among the researchers society due to its effectiveness in solving some real-world optimization problems. This paper provides the load-frequency control method based on the SOA for removing oscillations in the power system. A three-zone power system includes a thermal zone, a hydraulic zone and a wind zone equipped with robust proportional-integral-differential (PID) controllers. The result of simulation indicates that load-frequency changes in the wind zone for the multi-zone system are damped in a short period of time. Meanwhile, in the oscillation period, the oscillations amplitude is not significant. The result of simulation emphasizes that the PID controller designed using the seeker optimization algorithm has a robust function and a better performance for oscillations damping compared to the traditional PID controller. The proposed controller’s performance has been compared to the performance of PID controller regulated with Particle Swarm Optimization (PSO) and. Genetic Algorithm (GA) and Artificial Bee Colony (ABC) algorithms in order to show the superior capability of the proposed SOA in regulating the PID controller. The simulation results emphasize the better performance of the optimized PID controller based on SOA compared to the PID controller optimized with PSO, GA and ABC algorithms.Keywords: load-frequency control, multi zone, robust PID controller, wind generation
Procedia PDF Downloads 3031591 Machine Learning and Deep Learning Approach for People Recognition and Tracking in Crowd for Safety Monitoring
Authors: A. Degale Desta, Cheng Jian
Abstract:
Deep learning application in computer vision is rapidly advancing, giving it the ability to monitor the public and quickly identify potentially anomalous behaviour from crowd scenes. Therefore, the purpose of the current work is to improve the performance of safety of people in crowd events from panic behaviour through introducing the innovative idea of Aggregation of Ensembles (AOE), which makes use of the pre-trained ConvNets and a pool of classifiers to find anomalies in video data with packed scenes. According to the theory of algorithms that applied K-means, KNN, CNN, SVD, and Faster-CNN, YOLOv5 architectures learn different levels of semantic representation from crowd videos; the proposed approach leverages an ensemble of various fine-tuned convolutional neural networks (CNN), allowing for the extraction of enriched feature sets. In addition to the above algorithms, a long short-term memory neural network to forecast future feature values and a handmade feature that takes into consideration the peculiarities of the crowd to understand human behavior. On well-known datasets of panic situations, experiments are run to assess the effectiveness and precision of the suggested method. Results reveal that, compared to state-of-the-art methodologies, the system produces better and more promising results in terms of accuracy and processing speed.Keywords: action recognition, computer vision, crowd detecting and tracking, deep learning
Procedia PDF Downloads 1611590 Global Based Histogram for 3D Object Recognition
Authors: Somar Boubou, Tatsuo Narikiyo, Michihiro Kawanishi
Abstract:
In this work, we address the problem of 3D object recognition with depth sensors such as Kinect or Structure sensor. Compared with traditional approaches based on local descriptors, which depends on local information around the object key points, we propose a global features based descriptor. Proposed descriptor, which we name as Differential Histogram of Normal Vectors (DHONV), is designed particularly to capture the surface geometric characteristics of the 3D objects represented by depth images. We describe the 3D surface of an object in each frame using a 2D spatial histogram capturing the normalized distribution of differential angles of the surface normal vectors. The object recognition experiments on the benchmark RGB-D object dataset and a self-collected dataset show that our proposed descriptor outperforms two others descriptors based on spin-images and histogram of normal vectors with linear-SVM classifier.Keywords: vision in control, robotics, histogram, differential histogram of normal vectors
Procedia PDF Downloads 2791589 Morphological Features Fusion for Identifying INBREAST-Database Masses Using Neural Networks and Support Vector Machines
Authors: Nadia el Atlas, Mohammed el Aroussi, Mohammed Wahbi
Abstract:
In this paper a novel technique of mass characterization based on robust features-fusion is presented. The proposed method consists of mainly four stages: (a) the first phase involves segmenting the masses using edge information’s. (b) The second phase is to calculate and fuse the most relevant morphological features. (c) The last phase is the classification step which allows us to classify the images into benign and malignant masses. In this step we have implemented Support Vectors Machines (SVM) and Artificial Neural Networks (ANN), which were evaluated with the following performance criteria: confusion matrix, accuracy, sensitivity, specificity, receiver operating characteristic ROC, and error histogram. The effectiveness of this new approach was evaluated by a recently developed database: INBREAST database. The fusion of the most appropriate morphological features provided very good results. The SVM gives accuracy to within 64.3%. Whereas the ANN classifier gives better results with an accuracy of 97.5%.Keywords: breast cancer, mammography, CAD system, features, fusion
Procedia PDF Downloads 5991588 Artificial Intelligence for Generative Modelling
Authors: Shryas Bhurat, Aryan Vashistha, Sampreet Dinakar Nayak, Ayush Gupta
Abstract:
As the technology is advancing more towards high computational resources, there is a paradigm shift in the usage of these resources to optimize the design process. This paper discusses the usage of ‘Generative Design using Artificial Intelligence’ to build better models that adapt the operations like selection, mutation, and crossover to generate results. The human mind thinks of the simplest approach while designing an object, but the intelligence learns from the past & designs the complex optimized CAD Models. Generative Design takes the boundary conditions and comes up with multiple solutions with iterations to come up with a sturdy design with the most optimal parameter that is given, saving huge amounts of time & resources. The new production techniques that are at our disposal allow us to use additive manufacturing, 3D printing, and other innovative manufacturing techniques to save resources and design artistically engineered CAD Models. Also, this paper discusses the Genetic Algorithm, the Non-Domination technique to choose the right results using biomimicry that has evolved for current habitation for millions of years. The computer uses parametric models to generate newer models using an iterative approach & uses cloud computing to store these iterative designs. The later part of the paper compares the topology optimization technology with Generative Design that is previously being used to generate CAD Models. Finally, this paper shows the performance of algorithms and how these algorithms help in designing resource-efficient models.Keywords: genetic algorithm, bio mimicry, generative modeling, non-dominant techniques
Procedia PDF Downloads 1491587 Melanoma and Non-Melanoma, Skin Lesion Classification, Using a Deep Learning Model
Authors: Shaira L. Kee, Michael Aaron G. Sy, Myles Joshua T. Tan, Hezerul Abdul Karim, Nouar AlDahoul
Abstract:
Skin diseases are considered the fourth most common disease, with melanoma and non-melanoma skin cancer as the most common type of cancer in Caucasians. The alarming increase in Skin Cancer cases shows an urgent need for further research to improve diagnostic methods, as early diagnosis can significantly improve the 5-year survival rate. Machine Learning algorithms for image pattern analysis in diagnosing skin lesions can dramatically increase the accuracy rate of detection and decrease possible human errors. Several studies have shown the diagnostic performance of computer algorithms outperformed dermatologists. However, existing methods still need improvements to reduce diagnostic errors and generate efficient and accurate results. Our paper proposes an ensemble method to classify dermoscopic images into benign and malignant skin lesions. The experiments were conducted using the International Skin Imaging Collaboration (ISIC) image samples. The dataset contains 3,297 dermoscopic images with benign and malignant categories. The results show improvement in performance with an accuracy of 88% and an F1 score of 87%, outperforming other existing models such as support vector machine (SVM), Residual network (ResNet50), EfficientNetB0, EfficientNetB4, and VGG16.Keywords: deep learning - VGG16 - efficientNet - CNN – ensemble – dermoscopic images - melanoma
Procedia PDF Downloads 811586 Improved Classification Procedure for Imbalanced and Overlapped Situations
Authors: Hankyu Lee, Seoung Bum Kim
Abstract:
The issue with imbalance and overlapping in the class distribution becomes important in various applications of data mining. The imbalanced dataset is a special case in classification problems in which the number of observations of one class (i.e., major class) heavily exceeds the number of observations of the other class (i.e., minor class). Overlapped dataset is the case where many observations are shared together between the two classes. Imbalanced and overlapped data can be frequently found in many real examples including fraud and abuse patients in healthcare, quality prediction in manufacturing, text classification, oil spill detection, remote sensing, and so on. The class imbalance and overlap problem is the challenging issue because this situation degrades the performance of most of the standard classification algorithms. In this study, we propose a classification procedure that can effectively handle imbalanced and overlapped datasets by splitting data space into three parts: nonoverlapping, light overlapping, and severe overlapping and applying the classification algorithm in each part. These three parts were determined based on the Hausdorff distance and the margin of the modified support vector machine. An experiments study was conducted to examine the properties of the proposed method and compared it with other classification algorithms. The results showed that the proposed method outperformed the competitors under various imbalanced and overlapped situations. Moreover, the applicability of the proposed method was demonstrated through the experiment with real data.Keywords: classification, imbalanced data with class overlap, split data space, support vector machine
Procedia PDF Downloads 3081585 Automated Feature Detection and Matching Algorithms for Breast IR Sequence Images
Authors: Chia-Yen Lee, Hao-Jen Wang, Jhih-Hao Lai
Abstract:
In recent years, infrared (IR) imaging has been considered as a potential tool to assess the efficacy of chemotherapy and early detection of breast cancer. Regions of tumor growth with high metabolic rate and angiogenesis phenomenon lead to the high temperatures. Observation of differences between the heat maps in long term is useful to help assess the growth of breast cancer cells and detect breast cancer earlier, wherein the multi-time infrared image alignment technology is a necessary step. Representative feature points detection and matching are essential steps toward the good performance of image registration and quantitative analysis. However, there is no clear boundary on the infrared images and the subject's posture are different for each shot. It cannot adhesive markers on a body surface for a very long period, and it is hard to find anatomic fiducial markers on a body surface. In other words, it’s difficult to detect and match features in an IR sequence images. In this study, automated feature detection and matching algorithms with two type of automatic feature points (i.e., vascular branch points and modified Harris corner) are developed respectively. The preliminary results show that the proposed method could identify the representative feature points on the IR breast images successfully of 98% accuracy and the matching results of 93% accuracy.Keywords: Harris corner, infrared image, feature detection, registration, matching
Procedia PDF Downloads 3041584 Early Stage Suicide Ideation Detection Using Supervised Machine Learning and Neural Network Classifier
Authors: Devendra Kr Tayal, Vrinda Gupta, Aastha Bansal, Khushi Singh, Sristi Sharma, Hunny Gaur
Abstract:
In today's world, suicide is a serious problem. In order to save lives, early suicide attempt detection and prevention should be addressed. A good number of at-risk people utilize social media platforms to talk about their issues or find knowledge on related chores. Twitter and Reddit are two of the most common platforms that are used for expressing oneself. Extensive research has already been done in this field. Through supervised classification techniques like Nave Bayes, Bernoulli Nave Bayes, and Multiple Layer Perceptron on a Reddit dataset, we demonstrate the early recognition of suicidal ideation. We also performed comparative analysis on these approaches and used accuracy, recall score, F1 score, and precision score for analysis.Keywords: machine learning, suicide ideation detection, supervised classification, natural language processing
Procedia PDF Downloads 901583 Seismic Perimeter Surveillance System (Virtual Fence) for Threat Detection and Characterization Using Multiple ML Based Trained Models in Weighted Ensemble Voting
Authors: Vivek Mahadev, Manoj Kumar, Neelu Mathur, Brahm Dutt Pandey
Abstract:
Perimeter guarding and protection of critical installations require prompt intrusion detection and assessment to take effective countermeasures. Currently, visual and electronic surveillance are the primary methods used for perimeter guarding. These methods can be costly and complicated, requiring careful planning according to the location and terrain. Moreover, these methods often struggle to detect stealthy and camouflaged insurgents. The object of the present work is to devise a surveillance technique using seismic sensors that overcomes the limitations of existing systems. The aim is to improve intrusion detection, assessment, and characterization by utilizing seismic sensors. Most of the similar systems have only two types of intrusion detection capability viz., human or vehicle. In our work we could even categorize further to identify types of intrusion activity such as walking, running, group walking, fence jumping, tunnel digging and vehicular movements. A virtual fence of 60 meters at GCNEP, Bahadurgarh, Haryana, India, was created by installing four underground geophones at a distance of 15 meters each. The signals received from these geophones are then processed to find unique seismic signatures called features. Various feature optimization and selection methodologies, such as LightGBM, Boruta, Random Forest, Logistics, Recursive Feature Elimination, Chi-2 and Pearson Ratio were used to identify the best features for training the machine learning models. The trained models were developed using algorithms such as supervised support vector machine (SVM) classifier, kNN, Decision Tree, Logistic Regression, Naïve Bayes, and Artificial Neural Networks. These models were then used to predict the category of events, employing weighted ensemble voting to analyze and combine their results. The models were trained with 1940 training events and results were evaluated with 831 test events. It was observed that using the weighted ensemble voting increased the efficiency of predictions. In this study we successfully developed and deployed the virtual fence using geophones. Since these sensors are passive, do not radiate any energy and are installed underground, it is impossible for intruders to locate and nullify them. Their flexibility, quick and easy installation, low costs, hidden deployment and unattended surveillance make such systems especially suitable for critical installations and remote facilities with difficult terrain. This work demonstrates the potential of utilizing seismic sensors for creating better perimeter guarding and protection systems using multiple machine learning models in weighted ensemble voting. In this study the virtual fence achieved an intruder detection efficiency of over 97%.Keywords: geophone, seismic perimeter surveillance, machine learning, weighted ensemble method
Procedia PDF Downloads 781582 Recognition of Grocery Products in Images Captured by Cellular Phones
Authors: Farshideh Einsele, Hassan Foroosh
Abstract:
In this paper, we present a robust algorithm to recognize extracted text from grocery product images captured by mobile phone cameras. Recognition of such text is challenging since text in grocery product images varies in its size, orientation, style, illumination, and can suffer from perspective distortion. Pre-processing is performed to make the characters scale and rotation invariant. Since text degradations can not be appropriately defined using wellknown geometric transformations such as translation, rotation, affine transformation and shearing, we use the whole character black pixels as our feature vector. Classification is performed with minimum distance classifier using the maximum likelihood criterion, which delivers very promising Character Recognition Rate (CRR) of 89%. We achieve considerably higher Word Recognition Rate (WRR) of 99% when using lower level linguistic knowledge about product words during the recognition process.Keywords: camera-based OCR, feature extraction, document, image processing, grocery products
Procedia PDF Downloads 4061581 Commuters Trip Purpose Decision Tree Based Model of Makurdi Metropolis, Nigeria and Strategic Digital City Project
Authors: Emmanuel Okechukwu Nwafor, Folake Olubunmi Akintayo, Denis Alcides Rezende
Abstract:
Decision tree models are versatile and interpretable machine learning algorithms widely used for both classification and regression tasks, which can be related to cities, whether physical or digital. The aim of this research is to assess how well decision tree algorithms can predict trip purposes in Makurdi, Nigeria, while also exploring their connection to the strategic digital city initiative. The research methodology involves formalizing household demographic and trips information datasets obtained from extensive survey process. Modelling and Prediction were achieved using Python Programming Language and the evaluation metrics like R-squared and mean absolute error were used to assess the decision tree algorithm's performance. The results indicate that the model performed well, with accuracies of 84% and 68%, and low MAE values of 0.188 and 0.314, on training and validation data, respectively. This suggests the model can be relied upon for future prediction. The conclusion reiterates that This model will assist decision-makers, including urban planners, transportation engineers, government officials, and commuters, in making informed decisions on transportation planning and management within the framework of a strategic digital city. Its application will enhance the efficiency, sustainability, and overall quality of transportation services in Makurdi, Nigeria.Keywords: decision tree algorithm, trip purpose, intelligent transport, strategic digital city, travel pattern, sustainable transport
Procedia PDF Downloads 201580 Optimal Pricing Based on Real Estate Demand Data
Authors: Vanessa Kummer, Maik Meusel
Abstract:
Real estate demand estimates are typically derived from transaction data. However, in regions with excess demand, transactions are driven by supply and therefore do not indicate what people are actually looking for. To estimate the demand for housing in Switzerland, search subscriptions from all important Swiss real estate platforms are used. These data do, however, suffer from missing information—for example, many users do not specify how many rooms they would like or what price they would be willing to pay. In economic analyses, it is often the case that only complete data is used. Usually, however, the proportion of complete data is rather small which leads to most information being neglected. Also, the data might have a strong distortion if it is complete. In addition, the reason that data is missing might itself also contain information, which is however ignored with that approach. An interesting issue is, therefore, if for economic analyses such as the one at hand, there is an added value by using the whole data set with the imputed missing values compared to using the usually small percentage of complete data (baseline). Also, it is interesting to see how different algorithms affect that result. The imputation of the missing data is done using unsupervised learning. Out of the numerous unsupervised learning approaches, the most common ones, such as clustering, principal component analysis, or neural networks techniques are applied. By training the model iteratively on the imputed data and, thereby, including the information of all data into the model, the distortion of the first training set—the complete data—vanishes. In a next step, the performances of the algorithms are measured. This is done by randomly creating missing values in subsets of the data, estimating those values with the relevant algorithms and several parameter combinations, and comparing the estimates to the actual data. After having found the optimal parameter set for each algorithm, the missing values are being imputed. Using the resulting data sets, the next step is to estimate the willingness to pay for real estate. This is done by fitting price distributions for real estate properties with certain characteristics, such as the region or the number of rooms. Based on these distributions, survival functions are computed to obtain the functional relationship between characteristics and selling probabilities. Comparing the survival functions shows that estimates which are based on imputed data sets do not differ significantly from each other; however, the demand estimate that is derived from the baseline data does. This indicates that the baseline data set does not include all available information and is therefore not representative for the entire sample. Also, demand estimates derived from the whole data set are much more accurate than the baseline estimation. Thus, in order to obtain optimal results, it is important to make use of all available data, even though it involves additional procedures such as data imputation.Keywords: demand estimate, missing-data imputation, real estate, unsupervised learning
Procedia PDF Downloads 2851579 The Classification Accuracy of Finance Data through Holder Functions
Authors: Yeliz Karaca, Carlo Cattani
Abstract:
This study focuses on the local Holder exponent as a measure of the function regularity for time series related to finance data. In this study, the attributes of the finance dataset belonging to 13 countries (India, China, Japan, Sweden, France, Germany, Italy, Australia, Mexico, United Kingdom, Argentina, Brazil, USA) located in 5 different continents (Asia, Europe, Australia, North America and South America) have been examined.These countries are the ones mostly affected by the attributes with regard to financial development, covering a period from 2012 to 2017. Our study is concerned with the most important attributes that have impact on the development of finance for the countries identified. Our method is comprised of the following stages: (a) among the multi fractal methods and Brownian motion Holder regularity functions (polynomial, exponential), significant and self-similar attributes have been identified (b) The significant and self-similar attributes have been applied to the Artificial Neuronal Network (ANN) algorithms (Feed Forward Back Propagation (FFBP) and Cascade Forward Back Propagation (CFBP)) (c) the outcomes of classification accuracy have been compared concerning the attributes that have impact on the attributes which affect the countries’ financial development. This study has enabled to reveal, through the application of ANN algorithms, how the most significant attributes are identified within the relevant dataset via the Holder functions (polynomial and exponential function).Keywords: artificial neural networks, finance data, Holder regularity, multifractals
Procedia PDF Downloads 2461578 Nondestructive Prediction and Classification of Gel Strength in Ethanol-Treated Kudzu Starch Gels Using Near-Infrared Spectroscopy
Authors: John-Nelson Ekumah, Selorm Yao-Say Solomon Adade, Mingming Zhong, Yufan Sun, Qiufang Liang, Muhammad Safiullah Virk, Xorlali Nunekpeku, Nana Adwoa Nkuma Johnson, Bridget Ama Kwadzokpui, Xiaofeng Ren
Abstract:
Enhancing starch gel strength and stability is crucial. However, traditional gel property assessment methods are destructive, time-consuming, and resource-intensive. Thus, understanding ethanol treatment effects on kudzu starch gel strength and developing a rapid, nondestructive gel strength assessment method is essential for optimizing the treatment process and ensuring product quality consistency. This study investigated the effects of different ethanol concentrations on the microstructure of kudzu starch gels using a comprehensive microstructural analysis. We also developed a nondestructive method for predicting gel strength and classifying treatment levels using near-infrared (NIR) spectroscopy, and advanced data analytics. Scanning electron microscopy revealed progressive network densification and pore collapse with increasing ethanol concentration, correlating with enhanced mechanical properties. NIR spectroscopy, combined with various variable selection methods (CARS, GA, and UVE) and modeling algorithms (PLS, SVM, and ELM), was employed to develop predictive models for gel strength. The UVE-SVM model demonstrated exceptional performance, with the highest R² values (Rc = 0.9786, Rp = 0.9688) and lowest error rates (RMSEC = 6.1340, RMSEP = 6.0283). Pattern recognition algorithms (PCA, LDA, and KNN) successfully classified gels based on ethanol treatment levels, achieving near-perfect accuracy. This integrated approach provided a multiscale perspective on ethanol-induced starch gel modification, from molecular interactions to macroscopic properties. Our findings demonstrate the potential of NIR spectroscopy, coupled with advanced data analysis, as a powerful tool for rapid, nondestructive quality assessment in starch gel production. This study contributes significantly to the understanding of starch modification processes and opens new avenues for research and industrial applications in food science, pharmaceuticals, and biomaterials.Keywords: kudzu starch gel, near-infrared spectroscopy, gel strength prediction, support vector machine, pattern recognition algorithms, ethanol treatment
Procedia PDF Downloads 361577 General Architecture for Automation of Machine Learning Practices
Authors: U. Borasi, Amit Kr. Jain, Rakesh, Piyush Jain
Abstract:
Data collection, data preparation, model training, model evaluation, and deployment are all processes in a typical machine learning workflow. Training data needs to be gathered and organised. This often entails collecting a sizable dataset and cleaning it to remove or correct any inaccurate or missing information. Preparing the data for use in the machine learning model requires pre-processing it after it has been acquired. This often entails actions like scaling or normalising the data, handling outliers, selecting appropriate features, reducing dimensionality, etc. This pre-processed data is then used to train a model on some machine learning algorithm. After the model has been trained, it needs to be assessed by determining metrics like accuracy, precision, and recall, utilising a test dataset. Every time a new model is built, both data pre-processing and model training—two crucial processes in the Machine learning (ML) workflow—must be carried out. Thus, there are various Machine Learning algorithms that can be employed for every single approach to data pre-processing, generating a large set of combinations to choose from. Example: for every method to handle missing values (dropping records, replacing with mean, etc.), for every scaling technique, and for every combination of features selected, a different algorithm can be used. As a result, in order to get the optimum outcomes, these tasks are frequently repeated in different combinations. This paper suggests a simple architecture for organizing this largely produced “combination set of pre-processing steps and algorithms” into an automated workflow which simplifies the task of carrying out all possibilities.Keywords: machine learning, automation, AUTOML, architecture, operator pool, configuration, scheduler
Procedia PDF Downloads 571576 Generating Music with More Refined Emotions
Authors: Shao-Di Feng, Von-Wun Soo
Abstract:
To generate symbolic music with specific emotions is a challenging task due to symbolic music datasets that have emotion labels are scarce and incomplete. This research aims to generate more refined emotions based on the training datasets that are only labeled with four quadrants in Russel’s 2D emotion model. We focus on the theory of Music Fadernet and map arousal and valence to the low-level attributes, and build a symbolic music generation model by combining transformer and GM-VAE. We adopt an in-attention mechanism for the model and improve it by allowing modulation by conditional information. And we show the music generation model could control the generation of music according to the emotions specified by users in terms of high-level linguistic expression and by manipulating their corresponding low-level musical attributes. Finally, we evaluate the model performance using a pre-trained emotion classifier against a pop piano midi dataset called EMOPIA, and by subjective listening evaluation, we demonstrate that the model could generate music with more refined emotions correctly.Keywords: music generation, music emotion controlling, deep learning, semi-supervised learning
Procedia PDF Downloads 891575 Incorporating Information Gain in Regular Expressions Based Classifiers
Authors: Rosa L. Figueroa, Christopher A. Flores, Qing Zeng-Treitler
Abstract:
A regular expression consists of sequence characters which allow describing a text path. Usually, in clinical research, regular expressions are manually created by programmers together with domain experts. Lately, there have been several efforts to investigate how to generate them automatically. This article presents a text classification algorithm based on regexes. The algorithm named REX was designed, and then, implemented as a simplified method to create regexes to classify Spanish text automatically. In order to classify ambiguous cases, such as, when multiple labels are assigned to a testing example, REX includes an information gain method Two sets of data were used to evaluate the algorithm’s effectiveness in clinical text classification tasks. The results indicate that the regular expression based classifier proposed in this work performs statically better regarding accuracy and F-measure than Support Vector Machine and Naïve Bayes for both datasets.Keywords: information gain, regular expressions, smith-waterman algorithm, text classification
Procedia PDF Downloads 3201574 An Adiabatic Quantum Optimization Approach for the Mixed Integer Nonlinear Programming Problem
Authors: Maxwell Henderson, Tristan Cook, Justin Chan Jin Le, Mark Hodson, YoungJung Chang, John Novak, Daniel Padilha, Nishan Kulatilaka, Ansu Bagchi, Sanjoy Ray, John Kelly
Abstract:
We present a method of using adiabatic quantum optimization (AQO) to solve a mixed integer nonlinear programming (MINLP) problem instance. The MINLP problem is a general form of a set of NP-hard optimization problems that are critical to many business applications. It requires optimizing a set of discrete and continuous variables with nonlinear and potentially nonconvex constraints. Obtaining an exact, optimal solution for MINLP problem instances of non-trivial size using classical computation methods is currently intractable. Current leading algorithms leverage heuristic and divide-and-conquer methods to determine approximate solutions. Creating more accurate and efficient algorithms is an active area of research. Quantum computing (QC) has several theoretical benefits compared to classical computing, through which QC algorithms could obtain MINLP solutions that are superior to current algorithms. AQO is a particular form of QC that could offer more near-term benefits compared to other forms of QC, as hardware development is in a more mature state and devices are currently commercially available from D-Wave Systems Inc. It is also designed for optimization problems: it uses an effect called quantum tunneling to explore all lowest points of an energy landscape where classical approaches could become stuck in local minima. Our work used a novel algorithm formulated for AQO to solve a special type of MINLP problem. The research focused on determining: 1) if the problem is possible to solve using AQO, 2) if it can be solved by current hardware, 3) what the currently achievable performance is, 4) what the performance will be on projected future hardware, and 5) when AQO is likely to provide a benefit over classical computing methods. Two different methods, integer range and 1-hot encoding, were investigated for transforming the MINLP problem instance constraints into a mathematical structure that can be embedded directly onto the current D-Wave architecture. For testing and validation a D-Wave 2X device was used, as well as QxBranch’s QxLib software library, which includes a QC simulator based on simulated annealing. Our results indicate that it is mathematically possible to formulate the MINLP problem for AQO, but that currently available hardware is unable to solve problems of useful size. Classical general-purpose simulated annealing is currently able to solve larger problem sizes, but does not scale well and such methods would likely be outperformed in the future by improved AQO hardware with higher qubit connectivity and lower temperatures. If larger AQO devices are able to show improvements that trend in this direction, commercially viable solutions to the MINLP for particular applications could be implemented on hardware projected to be available in 5-10 years. Continued investigation into optimal AQO hardware architectures and novel methods for embedding MINLP problem constraints on to those architectures is needed to realize those commercial benefits.Keywords: adiabatic quantum optimization, mixed integer nonlinear programming, quantum computing, NP-hard
Procedia PDF Downloads 5251573 Decision Support: How Explainable A.I. Can Improve Transparency and Trust with Human Users
Authors: Devon Brown, Liu Chunmei
Abstract:
This paper will present an analysis as part of the researchers dissertation topic focusing on the intersection of affective and analytical directed acyclic graphs (DAGs) in the context of Decision Support Systems (DSS). The researcher’s work involves analyzing decision theory models like Affective and Bayesian Decision theory models and how they could be implemented under an Affective Computing Framework using Information Fusion and Human-Centered Design. Additionally, the researcher is beginning research on an Affective-Analytic Decision Framework (AADF) model for their dissertation research and are looking to merge logic and analytic models with empathetic insights into affective DAGs. Data-collection efforts begin Fall 2024 and in preparation for the efforts this paper looks to analyze previous research in this area and introduce the AADF framework and propose conceptual models for consideration. For this paper, the research emphasis is placed on analyzing Bayesian networks and Markov models which offer probabilistic techniques during uncertainty in decision-making. Ideally, including affect into analytic models will ensure algorithms can increase user trust with algorithms by including emotional states and the user’s experience with the goal of developing emotionally intelligent A.I. systems that can start to navigate the complex fabric of human emotion during decision-making.Keywords: decision support systems, explainable AI, HCAI techniques, affective-analytical decision framework
Procedia PDF Downloads 201572 A t-SNE and UMAP Based Neural Network Image Classification Algorithm
Authors: Shelby Simpson, William Stanley, Namir Naba, Xiaodi Wang
Abstract:
Both t-SNE and UMAP are brand new state of art tools to predominantly preserve the local structure that is to group neighboring data points together, which indeed provides a very informative visualization of heterogeneity in our data. In this research, we develop a t-SNE and UMAP base neural network image classification algorithm to embed the original dataset to a corresponding low dimensional dataset as a preprocessing step, then use this embedded database as input to our specially designed neural network classifier for image classification. We use the fashion MNIST data set, which is a labeled data set of images of clothing objects in our experiments. t-SNE and UMAP are used for dimensionality reduction of the data set and thus produce low dimensional embeddings. Furthermore, we use the embeddings from t-SNE and UMAP to feed into two neural networks. The accuracy of the models from the two neural networks is then compared to a dense neural network that does not use embedding as an input to show which model can classify the images of clothing objects more accurately.Keywords: t-SNE, UMAP, fashion MNIST, neural networks
Procedia PDF Downloads 1981571 Hybrid Intelligent Optimization Methods for Optimal Design of Horizontal-Axis Wind Turbine Blades
Authors: E. Tandis, E. Assareh
Abstract:
Designing the optimal shape of MW wind turbine blades is provided in a number of cases through evolutionary algorithms associated with mathematical modeling (Blade Element Momentum Theory). Evolutionary algorithms, among the optimization methods, enjoy many advantages, particularly in stability. However, they usually need a large number of function evaluations. Since there are a large number of local extremes, the optimization method has to find the global extreme accurately. The present paper introduces a new population-based hybrid algorithm called Genetic-Based Bees Algorithm (GBBA). This algorithm is meant to design the optimal shape for MW wind turbine blades. The current method employs crossover and neighborhood searching operators taken from the respective Genetic Algorithm (GA) and Bees Algorithm (BA) to provide a method with good performance in accuracy and speed convergence. Different blade designs, twenty-one to be exact, were considered based on the chord length, twist angle and tip speed ratio using GA results. They were compared with BA and GBBA optimum design results targeting the power coefficient and solidity. The results suggest that the final shape, obtained by the proposed hybrid algorithm, performs better compared to either BA or GA. Furthermore, the accuracy and speed convergence increases when the GBBA is employedKeywords: Blade Design, Optimization, Genetic Algorithm, Bees Algorithm, Genetic-Based Bees Algorithm, Large Wind Turbine
Procedia PDF Downloads 3161570 Interpretation of the Russia-Ukraine 2022 War via N-Gram Analysis
Authors: Elcin Timur Cakmak, Ayse Oguzlar
Abstract:
This study presents the results of the tweets sent by Twitter users on social media about the Russia-Ukraine war by bigram and trigram methods. On February 24, 2022, Russian President Vladimir Putin declared a military operation against Ukraine, and all eyes were turned to this war. Many people living in Russia and Ukraine reacted to this war and protested and also expressed their deep concern about this war as they felt the safety of their families and their futures were at stake. Most people, especially those living in Russia and Ukraine, express their views on the war in different ways. The most popular way to do this is through social media. Many people prefer to convey their feelings using Twitter, one of the most frequently used social media tools. Since the beginning of the war, it is seen that there have been thousands of tweets about the war from many countries of the world on Twitter. These tweets accumulated in data sources are extracted using various codes for analysis through Twitter API and analysed by Python programming language. The aim of the study is to find the word sequences in these tweets by the n-gram method, which is known for its widespread use in computational linguistics and natural language processing. The tweet language used in the study is English. The data set consists of the data obtained from Twitter between February 24, 2022, and April 24, 2022. The tweets obtained from Twitter using the #ukraine, #russia, #war, #putin, #zelensky hashtags together were captured as raw data, and the remaining tweets were included in the analysis stage after they were cleaned through the preprocessing stage. In the data analysis part, the sentiments are found to present what people send as a message about the war on Twitter. Regarding this, negative messages make up the majority of all the tweets as a ratio of %63,6. Furthermore, the most frequently used bigram and trigram word groups are found. Regarding the results, the most frequently used word groups are “he, is”, “I, do”, “I, am” for bigrams. Also, the most frequently used word groups are “I, do, not”, “I, am, not”, “I, can, not” for trigrams. In the machine learning phase, the accuracy of classifications is measured by Classification and Regression Trees (CART) and Naïve Bayes (NB) algorithms. The algorithms are used separately for bigrams and trigrams. We gained the highest accuracy and F-measure values by the NB algorithm and the highest precision and recall values by the CART algorithm for bigrams. On the other hand, the highest values for accuracy, precision, and F-measure values are achieved by the CART algorithm, and the highest value for the recall is gained by NB for trigrams.Keywords: classification algorithms, machine learning, sentiment analysis, Twitter
Procedia PDF Downloads 731569 Tomato-Weed Classification by RetinaNet One-Step Neural Network
Authors: Dionisio Andujar, Juan lópez-Correa, Hugo Moreno, Angela Ri
Abstract:
The increased number of weeds in tomato crops highly lower yields. Weed identification with the aim of machine learning is important to carry out site-specific control. The last advances in computer vision are a powerful tool to face the problem. The analysis of RGB (Red, Green, Blue) images through Artificial Neural Networks had been rapidly developed in the past few years, providing new methods for weed classification. The development of the algorithms for crop and weed species classification looks for a real-time classification system using Object Detection algorithms based on Convolutional Neural Networks. The site study was located in commercial corn fields. The classification system has been tested. The procedure can detect and classify weed seedlings in tomato fields. The input to the Neural Network was a set of 10,000 RGB images with a natural infestation of Cyperus rotundus l., Echinochloa crus galli L., Setaria italica L., Portulaca oeracea L., and Solanum nigrum L. The validation process was done with a random selection of RGB images containing the aforementioned species. The mean average precision (mAP) was established as the metric for object detection. The results showed agreements higher than 95 %. The system will provide the input for an online spraying system. Thus, this work plays an important role in Site Specific Weed Management by reducing herbicide use in a single step.Keywords: deep learning, object detection, cnn, tomato, weeds
Procedia PDF Downloads 1031568 Comparative Study and Parallel Implementation of Stochastic Models for Pricing of European Options Portfolios using Monte Carlo Methods
Authors: Vinayak Bassi, Rajpreet Singh
Abstract:
Over the years, with the emergence of sophisticated computers and algorithms, finance has been quantified using computational prowess. Asset valuation has been one of the key components of quantitative finance. In fact, it has become one of the embryonic steps in determining risk related to a portfolio, the main goal of quantitative finance. This study comprises a drawing comparison between valuation output generated by two stochastic dynamic models, namely Black-Scholes and Dupire’s bi-dimensionality model. Both of these models are formulated for computing the valuation function for a portfolio of European options using Monte Carlo simulation methods. Although Monte Carlo algorithms have a slower convergence rate than calculus-based simulation techniques (like FDM), they work quite effectively over high-dimensional dynamic models. A fidelity gap is analyzed between the static (historical) and stochastic inputs for a sample portfolio of underlying assets. In order to enhance the performance efficiency of the model, the study emphasized the use of variable reduction methods and customizing random number generators to implement parallelization. An attempt has been made to further implement the Dupire’s model on a GPU to achieve higher computational performance. Furthermore, ideas have been discussed around the performance enhancement and bottleneck identification related to the implementation of options-pricing models on GPUs.Keywords: monte carlo, stochastic models, computational finance, parallel programming, scientific computing
Procedia PDF Downloads 1611567 Spatial Object-Oriented Template Matching Algorithm Using Normalized Cross-Correlation Criterion for Tracking Aerial Image Scene
Authors: Jigg Pelayo, Ricardo Villar
Abstract:
Leaning on the development of aerial laser scanning in the Philippine geospatial industry, researches about remote sensing and machine vision technology became a trend. Object detection via template matching is one of its application which characterized to be fast and in real time. The paper purposely attempts to provide application for robust pattern matching algorithm based on the normalized cross correlation (NCC) criterion function subjected in Object-based image analysis (OBIA) utilizing high-resolution aerial imagery and low density LiDAR data. The height information from laser scanning provides effective partitioning order, thus improving the hierarchal class feature pattern which allows to skip unnecessary calculation. Since detection is executed in the object-oriented platform, mathematical morphology and multi-level filter algorithms were established to effectively avoid the influence of noise, small distortion and fluctuating image saturation that affect the rate of recognition of features. Furthermore, the scheme is evaluated to recognized the performance in different situations and inspect the computational complexities of the algorithms. Its effectiveness is demonstrated in areas of Misamis Oriental province, achieving an overall accuracy of 91% above. Also, the garnered results portray the potential and efficiency of the implemented algorithm under different lighting conditions.Keywords: algorithm, LiDAR, object recognition, OBIA
Procedia PDF Downloads 2441566 Machine Learning Model to Predict TB Bacteria-Resistant Drugs from TB Isolates
Authors: Rosa Tsegaye Aga, Xuan Jiang, Pavel Vazquez Faci, Siqing Liu, Simon Rayner, Endalkachew Alemu, Markos Abebe
Abstract:
Tuberculosis (TB) is a major cause of disease globally. In most cases, TB is treatable and curable, but only with the proper treatment. There is a time when drug-resistant TB occurs when bacteria become resistant to the drugs that are used to treat TB. Current strategies to identify drug-resistant TB bacteria are laboratory-based, and it takes a longer time to identify the drug-resistant bacteria and treat the patient accordingly. But machine learning (ML) and data science approaches can offer new approaches to the problem. In this study, we propose to develop an ML-based model to predict the antibiotic resistance phenotypes of TB isolates in minutes and give the right treatment to the patient immediately. The study has been using the whole genome sequence (WGS) of TB isolates as training data that have been extracted from the NCBI repository and contain different countries’ samples to build the ML models. The reason that different countries’ samples have been included is to generalize the large group of TB isolates from different regions in the world. This supports the model to train different behaviors of the TB bacteria and makes the model robust. The model training has been considering three pieces of information that have been extracted from the WGS data to train the model. These are all variants that have been found within the candidate genes (F1), predetermined resistance-associated variants (F2), and only resistance-associated gene information for the particular drug. Two major datasets have been constructed using these three information. F1 and F2 information have been considered as two independent datasets, and the third information is used as a class to label the two datasets. Five machine learning algorithms have been considered to train the model. These are Support Vector Machine (SVM), Random forest (RF), Logistic regression (LR), Gradient Boosting, and Ada boost algorithms. The models have been trained on the datasets F1, F2, and F1F2 that is the F1 and the F2 dataset merged. Additionally, an ensemble approach has been used to train the model. The ensemble approach has been considered to run F1 and F2 datasets on gradient boosting algorithm and use the output as one dataset that is called F1F2 ensemble dataset and train a model using this dataset on the five algorithms. As the experiment shows, the ensemble approach model that has been trained on the Gradient Boosting algorithm outperformed the rest of the models. In conclusion, this study suggests the ensemble approach, that is, the RF + Gradient boosting model, to predict the antibiotic resistance phenotypes of TB isolates by outperforming the rest of the models.Keywords: machine learning, MTB, WGS, drug resistant TB
Procedia PDF Downloads 51