Search results for: euclidean classifier
Paper Count: 423

63 Local Directional Encoded Derivative Binary Pattern Based Coral Image Classification Using Weighted Distance Gray Wolf Optimization Algorithm

Authors: Annalakshmi G., Sakthivel Murugan S.

Abstract:

This paper presents a local directional encoded derivative binary pattern (LDEDBP) feature extraction method that can be applied to the classification of submarine coral reef images. The classification of coral reef images using texture features is difficult due to the dissimilarities among class samples. In coral reef image classification, texture features are extracted using the proposed method called the local directional encoded derivative binary pattern (LDEDBP). The proposed approach extracts the complete structural arrangement of the local region using the local binary pattern (LBP) and also extracts the edge information using the local directional pattern (LDP) from the edge responses available in a particular region, thereby achieving a more discriminative feature value. Typically, the LDP extracts the edge details in all eight directions. The process of integrating edge responses along with the local binary pattern yields a more robust texture descriptor than the other descriptors used in texture feature extraction methods. Finally, the proposed technique is applied to an extreme learning machine (ELM) with a meta-heuristic algorithm known as the weighted distance grey wolf optimizer (WDGWO) to optimize the input weights and biases of single-hidden-layer feed-forward neural networks (SLFN). In the empirical results, ELM-WDGWO demonstrated better performance in terms of accuracy on all coral datasets, namely RSMAS, EILAT, EILAT2, and MLC, compared with other state-of-the-art algorithms. The proposed method achieves the highest overall classification accuracy of 94% compared to the other state-of-the-art methods.
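
As a point of reference for the texture-descriptor step, the sketch below computes only the plain 8-neighbour LBP codes and histogram that LDEDBP builds on; the directional derivative encoding and the ELM-WDGWO classifier from the abstract are not reproduced, and the input patch is a random placeholder rather than a coral reef tile.

```python
import numpy as np

def basic_lbp(gray):
    """Basic 8-neighbour local binary pattern codes for a 2-D grayscale array."""
    g = gray.astype(np.float64)
    center = g[1:-1, 1:-1]
    # Offsets of the 8 neighbours, ordered clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=int)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        # Set the corresponding bit when the neighbour is at least as bright.
        codes = codes + ((neighbour >= center).astype(int) << bit)
    return codes

def lbp_histogram(gray, bins=256):
    """Normalised LBP histogram used as a texture feature vector."""
    codes = basic_lbp(gray)
    hist, _ = np.histogram(codes, bins=bins, range=(0, bins))
    return hist / max(hist.sum(), 1)

# Example on a random patch standing in for one coral reef image tile.
patch = np.random.randint(0, 256, (64, 64))
features = lbp_histogram(patch)
print(features.shape)
```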

Keywords: feature extraction, local directional pattern, ELM classifier, GWO optimization

Procedia PDF Downloads 164
62 A Machine Learning Approach for Detecting and Locating Hardware Trojans

Authors: Kaiwen Zheng, Wanting Zhou, Nan Tang, Lei Li, Yuanhang He

Abstract:

The integrated circuit industry has become a cornerstone of the information society, finding widespread application in areas such as industry, communication, medicine, and aerospace. However, with the increasing complexity of integrated circuits, Hardware Trojans (HTs) implanted by attackers have become a significant threat to their security. In this paper, we propose a hardware Trojan detection method for large-scale circuits. As HTs, being additional redundant circuits, introduce changes in physical characteristics such as structure, area, and power consumption, we propose a machine-learning-based hardware Trojan detection method based on the physical characteristics of gate-level netlists. This method transforms the hardware Trojan detection problem into a machine-learning binary classification problem based on physical characteristics, greatly improving detection speed. To address the problem of imbalanced data, where the number of pure circuit samples is far less than that of HT circuit samples, we used the SMOTETomek algorithm to expand the dataset and further improve the performance of the classifier. We used three machine learning algorithms, K-Nearest Neighbors, Random Forest, and Support Vector Machine, to train and validate on benchmark circuits from Trust-Hub, and all achieved good results. In our case studies based on AES encryption circuits provided by Trust-Hub, the test results showed the effectiveness of the proposed method. To further validate the method’s effectiveness for detecting variant HTs, we designed variant HTs using open-source HTs. The proposed method guarantees robust detection accuracy with millisecond-level detection time for IC and FPGA design flows and has good detection performance for library variant HTs.
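
A minimal sketch of the rebalancing-plus-classification stage described above, assuming the gate-level physical features have already been extracted; the feature values, the imbalance ratio, and the random-forest settings below are placeholders, not the paper's configuration.

```python
import numpy as np
from imblearn.combine import SMOTETomek              # combined over/under-sampling
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Placeholder: rows are nets described by physical features (e.g. fan-in,
# logic depth, switching activity); 1 = Trojan-related net, 0 = normal net.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 8))
y = (rng.random(2000) < 0.05).astype(int)            # heavily imbalanced labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=0)

# Rebalance only the training split, as described in the abstract.
X_res, y_res = SMOTETomek(random_state=0).fit_resample(X_tr, y_tr)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_res, y_res)
print(classification_report(y_te, clf.predict(X_te)))
```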

Keywords: hardware trojans, physical properties, machine learning, hardware security

Procedia PDF Downloads 148
61 An Unsupervised Domain-Knowledge Discovery Framework for Fake News Detection

Authors: Yulan Wu

Abstract:

With the rapid development of social media, the issue of fake news has gained considerable prominence, drawing the attention of both the public and governments. The widespread dissemination of false information poses a tangible threat across multiple domains of society, including politics, economy, and health. However, much research has concentrated on supervised training models within specific domains, and their effectiveness diminishes when applied to identifying fake news across multiple domains. To solve this problem, some approaches based on domain labels have been proposed. By assigning news to its specific domain in advance, fake news judgments made within the corresponding domain may be more accurate. However, these approaches disregard the fact that news records can pertain to multiple domains, resulting in a significant loss of valuable information. In addition, the datasets used for training must all be domain-labeled, which creates unnecessary complexity. To solve these problems, an unsupervised domain knowledge discovery framework for fake news detection is proposed. Firstly, to effectively retain the multi-domain knowledge of the text, a low-dimensional domain-embedding vector is generated for each news text. Subsequently, a feature extraction module utilizing the unsupervisedly discovered domain embeddings is used to extract the comprehensive features of the news. Finally, a classifier is employed to determine the authenticity of the news. To verify the proposed framework, a test is conducted on existing widely used datasets, and the experimental results demonstrate that this method is able to improve the detection performance for fake news across multiple domains. Moreover, even in datasets that lack domain labels, this method can still effectively transfer domain knowledge, which can reduce the time consumed by tagging without sacrificing detection accuracy.

Keywords: fake news, deep learning, natural language processing, multiple domains

Procedia PDF Downloads 101
60 Modeling Visual Memorability Assessment with Autoencoders Reveals Characteristics of Memorable Images

Authors: Elham Bagheri, Yalda Mohsenzadeh

Abstract:

Image memorability refers to the phenomenon where certain images are more likely to be remembered by humans than others. It is a quantifiable and intrinsic attribute of an image. Understanding how visual perception and memory interact is important in both cognitive science and artificial intelligence. It reveals the complex processes that support human cognition and helps to improve machine learning algorithms by mimicking the brain's efficient data processing and storage mechanisms. To explore the computational underpinnings of image memorability, this study examines the relationship between an image's reconstruction error, its distinctiveness in latent space, and its memorability score. A trained autoencoder is used to replicate human-like memorability assessment, inspired by the visual memory game employed in memorability estimations. This study leverages a VGG-based autoencoder that is pre-trained on the vast ImageNet dataset, enabling it to recognize patterns and features that are common to a wide and diverse range of images. An empirical analysis is conducted using the MemCat dataset, which includes 10,000 images from five broad categories: animals, sports, food, landscapes, and vehicles, along with their corresponding memorability scores. The memorability score assigned to each image represents the probability of that image being remembered by participants after a single exposure. The autoencoder is fine-tuned for one epoch with a batch size of one, attempting to create a scenario similar to human memorability experiments, where memorability is quantified by the likelihood of an image being remembered after being seen only once. The reconstruction error, which is quantified as the difference between the original and reconstructed images, serves as a measure of how well the autoencoder has learned to represent the data. The reconstruction error of each image, the error reduction, and its distinctiveness in latent space are calculated and correlated with the memorability score. Distinctiveness is measured as the Euclidean distance between each image's latent representation and its nearest neighbor within the autoencoder's latent space. Different structural and perceptual loss functions are considered to quantify the reconstruction error. The results indicate that there is a strong correlation between the reconstruction error and the distinctiveness of images and their memorability scores. This suggests that images with more unique, distinctive features that challenge the autoencoder's compressive capacities are inherently more memorable. There is also a negative correlation between memorability and the reduction in reconstruction error relative to the autoencoder pre-trained on ImageNet, which suggests that highly memorable images are harder to reconstruct, probably because they have features that are more difficult for the autoencoder to learn. These insights suggest a new pathway for evaluating image memorability, which could potentially impact industries reliant on visual content and mark a step forward in merging the fields of artificial intelligence and cognitive science. The current research opens avenues for utilizing neural representations as instruments for understanding and predicting visual memory.
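
The three image-level measures discussed above (reconstruction MSE, error reduction after fine-tuning, and Euclidean nearest-neighbour distinctiveness in latent space) can be sketched as below, assuming the latent codes and reconstructions have already been produced by the autoencoder; all arrays here are random placeholders, not MemCat data or the paper's VGG-based model.

```python
import numpy as np
from scipy.stats import pearsonr
from scipy.spatial.distance import cdist

def reconstruction_error(images, reconstructions):
    """Per-image mean squared error between originals and reconstructions."""
    diff = images.reshape(len(images), -1) - reconstructions.reshape(len(images), -1)
    return np.mean(diff ** 2, axis=1)

def latent_distinctiveness(latents):
    """Euclidean distance from each latent code to its nearest neighbour."""
    d = cdist(latents, latents)                      # pairwise Euclidean distances
    np.fill_diagonal(d, np.inf)                      # ignore self-distance
    return d.min(axis=1)

# Placeholders standing in for images passed through the autoencoder.
n = 500
images = np.random.rand(n, 64, 64, 3)
recons_pre = images + 0.05 * np.random.rand(n, 64, 64, 3)   # pre-trained AE output
recons_ft = images + 0.04 * np.random.rand(n, 64, 64, 3)    # after fine-tuning
latents = np.random.rand(n, 128)
memorability = np.random.rand(n)                             # human memorability scores

err_ft = reconstruction_error(images, recons_ft)
err_reduction = reconstruction_error(images, recons_pre) - err_ft
distinct = latent_distinctiveness(latents)

for name, values in [("MSE", err_ft), ("error reduction", err_reduction),
                     ("distinctiveness", distinct)]:
    r, p = pearsonr(values, memorability)
    print(f"{name}: r = {r:.3f}, p = {p:.3g}")
```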

Keywords: autoencoder, computational vision, image memorability, image reconstruction, memory retention, reconstruction error, visual perception

Procedia PDF Downloads 92
59 Planning for Location and Distribution of Regional Facilities Using Central Place Theory and Location-Allocation Model

Authors: Danjuma Bawa

Abstract:

This paper aimed at exploring the capabilities of the location-allocation model in complementing the strides of the existing physical planning models in the location and distribution of facilities for regional consumption. The paper was designed to provide a blueprint to the Nigerian government and other donor agencies, especially for the Fertilizer Distribution Initiative (FDI) by the federal government, for the revitalization of the terrorism-ravaged regions. Theoretical underpinnings of central place theory related to spatial distribution, interrelationships, and threshold prerequisites were reviewed. The study showcased how the Location-Allocation Model (L-AM) alongside Central Place Theory (CPT) was applied in a Geographic Information System (GIS) environment to map and analyze the spatial distribution of settlements, exploit their physical and economic interrelationships, and explore their hierarchical and opportunistic influences. The study was purely spatial qualitative research which largely used secondary data such as the spatial location and distribution of settlements, population figures of settlements, the network of roads linking them, and other landform features. These were sourced from government ministries and open-source consortia. GIS was used as a tool for processing and analyzing such spatial features within the dictum of CPT and L-AM to produce a comprehensive spatial digital plan for equitable and judicious location and distribution of fertilizer depots in the study area in an optimal way. A population threshold was used as a yardstick for selecting suitable settlements that could stand as service centers to other hinterlands; this was accomplished using the query syntax in ArcMap. The ArcGIS Network Analyst was used in conducting location-allocation analysis for apportioning groups of settlements around such service centers within a given threshold distance. Most of the techniques and models ever used by utility planners have been centered on straight-line distance to settlements using Euclidean distances. Such models neglect impedance cutoffs and the routing capabilities of networks. CPT and L-AM take into consideration both the influential characteristics of settlements and their routing connectivity. The study was undertaken in two terrorism-ravaged Local Government Areas of Adamawa state. Four (4) existing depots in the study area were identified. 20 more depots in 20 villages were proposed using suitability analysis. Out of the 300 settlements mapped in the study area, about 280 were optimally grouped and allocated to the selected service centers within a 2 km impedance cutoff. This study complements the giant strides by the federal government of Nigeria by providing a blueprint for ensuring proper distribution of these public goods in the spirit of bringing succor to the terrorism-ravaged populace. This will at the same time help in boosting agricultural activities, thereby lowering food shortages and raising per capita income, as espoused by the government.

Keywords: central place theory, GIS, location-allocation, network analysis, urban and regional planning, welfare economics

Procedia PDF Downloads 148
58 Lung Cancer Detection and Multi Level Classification Using Discrete Wavelet Transform Approach

Authors: V. Veeraprathap, G. S. Harish, G. Narendra Kumar

Abstract:

Uncontrolled growth of abnormal cells in the lung in the form of a tumor can be either benign (non-cancerous) or malignant (cancerous). Patients with Lung Cancer (LC) have an average life expectancy of five years provided there is early diagnosis, detection and prediction, which reduces reliance on treatment options that carry the risk of invasive surgery and increases the survival rate. Computed Tomography (CT), Positron Emission Tomography (PET), and Magnetic Resonance Imaging (MRI) are commonly used for earlier detection of cancer. A Gaussian filter along with a median filter is used for smoothing and noise removal, and Histogram Equalization (HE) is used for image enhancement, which gives the best results. Lung cavities are extracted, the background portion other than the two lung cavities is completely removed, and the right and left lungs are segmented separately. Region property measurements (area, perimeter, diameter, centroid and eccentricity) are computed for the segmented tumor image, while texture is characterized by Gray-Level Co-occurrence Matrix (GLCM) functions; feature extraction provides the Region of Interest (ROI) given as input to the classifier. Two levels of classification are employed: K-Nearest Neighbor (KNN) is used for determining the patient condition as normal or abnormal, while an Artificial Neural Network (ANN) is used for identifying the cancer stage. The Discrete Wavelet Transform (DWT) algorithm is used for the main feature extraction, leading to the best efficiency. The developed technique shows encouraging results for real-time information and online detection in future research.
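
A minimal sketch of the GLCM-texture-plus-KNN stage of the pipeline, assuming the lung ROIs have already been segmented; it uses scikit-image's graycomatrix/graycoprops helpers (these names apply to recent scikit-image versions), and the ROIs, labels and property list are placeholders. The DWT features, the region-property measurements and the second-stage ANN are not reproduced.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops   # GLCM texture descriptors
from sklearn.neighbors import KNeighborsClassifier

def glcm_features(gray_uint8):
    """Texture descriptors from a grey-level co-occurrence matrix of one ROI."""
    glcm = graycomatrix(gray_uint8, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

# Placeholder ROIs standing in for segmented lung regions (8-bit grayscale).
rois = [np.random.randint(0, 256, (64, 64), dtype=np.uint8) for _ in range(40)]
labels = np.random.randint(0, 2, 40)                 # 0 = normal, 1 = abnormal

X = np.vstack([glcm_features(r) for r in rois])
knn = KNeighborsClassifier(n_neighbors=3).fit(X, labels)
print(knn.predict(X[:5]))
```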

Keywords: artificial neural networks, ANN, discrete wavelet transform, DWT, gray-level co-occurrence matrix, GLCM, k-nearest neighbor, KNN, region of interest, ROI

Procedia PDF Downloads 155
57 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis

Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan

Abstract:

Tourism is a booming industry with huge future potential for global wealth and employment. Countless data are generated over social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of Big Data technology into the tourism industry will allow companies to conclude where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centers or hotels, and tourists can get a clear idea of places before visiting. Natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We have constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The text of the sentences was first classified through the VADER and RoBERTa models to get the polarity of the reviews. In this paper, we have studied methods for feature extraction, such as Count Vectorization and TF-IDF Vectorization, and implemented a Convolutional Neural Network (CNN) classifier algorithm for sentiment analysis to decide whether the tourist's attitude towards the destinations is positive, negative, or simply neutral based on the review text posted online. The results demonstrated that with the CNN algorithm, after pre-processing and cleaning the dataset, we received an accuracy of 96.12% for the positive and negative sentiment analysis.
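
As an illustration of the vectorization step named above, the sketch below feeds TF-IDF features into a simple linear classifier; the CNN/LSTM models and the VADER/RoBERTa polarity labelling from the abstract are not reproduced, and the review texts and labels are invented placeholders (a plain CountVectorizer could be swapped in for the TfidfVectorizer).

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder review texts with polarity labels (1 = positive, 0 = negative).
reviews = ["loved the lake views and friendly staff",
           "the hotel was dirty and the food was awful",
           "amazing destination, would visit again",
           "terrible experience, long queues everywhere"]
labels = [1, 0, 1, 0]

# TF-IDF down-weights very common terms before the classifier sees them.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(reviews, labels)
print(model.predict(["the staff were friendly and the views amazing"]))
```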

Keywords: count vectorization, convolutional neural network, crawler, data technology, long short-term memory, web scraping, sentiment analysis

Procedia PDF Downloads 88
56 A Geoprocessing Tool for Early Civil Work Notification to Optimize Fiber Optic Cable Installation Cost

Authors: Hussain Adnan Alsalman, Khalid Alhajri, Humoud Alrashidi, Abdulkareem Almakrami, Badie Alguwaisem, Said Alshahrani, Abdullah Alrowaished

Abstract:

Most of the cost of installing a new fiber optic cable is attributed to the civil work (trenching) cost. In many cases, information technology departments receive project proposals in their eReview system, but not all projects are visible to everyone. Additionally, if there is no IT scope in the proposed project, it is not likely to be visible to IT. Sometimes it is too late to add IT scope after project budgets have been finalized. Finally, the eReview system is a repository of PDF files for each project, which commits the reviewer to manual work and limits automation potential. This paper details a solution to address the late notification of the eReview system by integrating IT sites GIS data (site locations) with land use permit (LUP) data (civil work activity); an LUP request is the first step before securing the required land usage authorizations, which means there are no detailed designs for any relevant project before an approved LUP request. To address the manual nature of the eReview system, both the LUP system and IT data use ArcGIS Desktop, which enables the creation of a geoprocessing tool with either Python or ModelBuilder to automate finding and evaluating potentially usable LUP requests to reduce trenching between two sites in need of a new FOC. To achieve this, a weekly dump was taken from the LUP system production data and loaded manually into ArcMap Desktop. Then a custom tool was developed in ModelBuilder, which takes a two-column table containing all the pairs of sites in need of new fiber connectivity. The tool then iterates over all rows of this table, taking the site pairs one at a time and finding potential LUPs between them that satisfy the provided search radius. If a group of LUPs is found, an iterator goes through each LUP to find the required civil work between the two sites, the LUP polyline feature, and the distance along the line, which would be counted as cost avoidance if an IT scope had been added. Finally, the tool exports an Excel file named after the site pair, containing as many rows as the number of LUPs that met the search radius, with trenching and pulling information and cost. As a result, multiple projects have been identified – historical, missed opportunity, and proposed projects. For the proposed project, the savings were about 75% ($750,000) for installing a new fiber over the Euclidean distance between the Abqaiq GOSP2 and GOSP3 DCOs. In conclusion, the current tool setup identifies opportunities to bundle civil work on single projects at a time and between two sites. More work is needed to allow the bundling of multiple projects between two sites to achieve even more cost avoidance in both capital cost and carbon footprint.

Keywords: GIS, fiber optic cable installation optimization, eliminate redundant civil work, reduce carbon footprint for fiber optic cable installation

Procedia PDF Downloads 220
55 Spectral Mixture Model Applied to Cannabis Parcel Determination

Authors: Levent Basayigit, Sinan Demir, Yusuf Ucar, Burhan Kara

Abstract:

Many research projects require accurate delineation of the different land cover types of an agricultural area. This is especially critical for the identification of specific plants like cannabis. However, the complexity of vegetation stand structure, the abundance of vegetation species, and the smooth transition between different secondary succession stages make vegetation classification difficult when using traditional approaches such as the maximum likelihood classifier. Most of the time, classification distinguishes only between trees and annual crops or grain, and it has been difficult to accurately determine cannabis mixed with other plants. In this paper, a mixed distribution model approach is applied to classify pure and mixed cannabis parcels using Worldview-2 imagery in the Lakes region of Turkey. Five different land use types (i.e., sunflower, maize, bare soil, and cannabis) were identified in the image. A constrained Gaussian mixture discriminant analysis (GMDA) was used to unmix the image. In the study, 255 reflectance ratios derived from spectral signatures of seven bands (Blue, Green, Yellow, Red, Red Edge, NIR1, NIR2) were randomly arranged as 80% training and 20% test data. The Gaussian mixture distribution model approach proved to be an effective and convenient way to use very high spatial resolution imagery for distinguishing cannabis vegetation. Based on the overall accuracies of the classification, the Gaussian mixture distribution model was found to be very successful in image classification tasks. This approach is sensitive enough to capture illegal cannabis planting areas in large plains and can also be used for monitoring and detecting illegal cannabis planting areas from their spectral reflectance.
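
A hedged sketch of the Gaussian mixture discriminant idea described above: one mixture model is fitted per land-use class and samples are assigned to the class with the highest posterior score. The constrained variant used in the paper, the Worldview-2 band ratios and the actual class labels are not reproduced; the features below are random placeholders.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

class GMDAClassifier:
    """One Gaussian mixture per class; prediction by maximum posterior score."""

    def __init__(self, n_components=2):
        self.n_components = n_components
        self.models, self.priors, self.classes_ = {}, {}, None

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        for c in self.classes_:
            Xc = X[y == c]
            self.models[c] = GaussianMixture(self.n_components,
                                             covariance_type="full",
                                             random_state=0).fit(Xc)
            self.priors[c] = len(Xc) / len(X)
        return self

    def predict(self, X):
        # Log posterior up to a constant: log p(x|c) + log p(c).
        scores = np.column_stack([self.models[c].score_samples(X) +
                                  np.log(self.priors[c]) for c in self.classes_])
        return self.classes_[scores.argmax(axis=1)]

# Placeholder band-ratio features for a handful of land-use classes.
rng = np.random.default_rng(1)
X = rng.normal(size=(600, 10))
y = rng.integers(0, 4, 600)
clf = GMDAClassifier(n_components=2).fit(X, y)
print(clf.predict(X[:10]))
```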

Keywords: Gaussian mixture discriminant analysis, spectral mixture model, Worldview-2, land parcels

Procedia PDF Downloads 197
54 Applying Multiplicative Weight Update to Skin Cancer Classifiers

Authors: Animish Jain

Abstract:

This study deals with using Multiplicative Weight Update within artificial intelligence and machine learning to create models that can diagnose skin cancer using microscopic images of cancer samples. In this study, the multiplicative weight update method is used to combine the predictions of multiple models to try to acquire more accurate results. Logistic Regression, Convolutional Neural Network (CNN), and Support Vector Machine Classifier (SVMC) models are employed within the Multiplicative Weight Update system. These models are trained on pictures of skin cancer from the ISIC-Archive to look for patterns to label unseen scans as either benign or malignant. These models are utilized in a multiplicative weight update algorithm which takes into account the precision and accuracy of each model through each successive guess to apply weights to its guesses. These guesses and weights are then analyzed together to try to obtain the correct predictions. The research hypothesis for this study stated that there would be a significant difference in the accuracy of the three models and the Multiplicative Weight Update system. The SVMC model had an accuracy of 77.88%. The CNN model had an accuracy of 85.30%. The Logistic Regression model had an accuracy of 79.09%. Using Multiplicative Weight Update, the algorithm received an accuracy of 72.27%. The final conclusion drawn was that there was a significant difference in the accuracy of the three models and the Multiplicative Weight Update system. The conclusion was made that using a CNN model would be the best option for this problem rather than a Multiplicative Weight Update system. This is due to the possibility that Multiplicative Weight Update is not effective in a binary setting where there are only two possible classifications. In a categorical setting with multiple classes and groupings, a Multiplicative Weight Update system might become more proficient, as it takes into account the strengths of multiple different models to classify images into multiple categories rather than only two categories, as shown in this study. This experimentation and computer science project can help to create better algorithms and models for the future of artificial intelligence in the medical imaging field.
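
A minimal sketch of the weighted-majority / multiplicative-weights combination described above: each base model keeps a weight that is multiplied down whenever it errs, and the ensemble output is the weighted vote. The base-model outputs, their accuracies and the learning rate eta are invented placeholders, not the study's actual SVMC, CNN and logistic regression classifiers.

```python
import numpy as np

def multiplicative_weights_vote(predictions, labels, eta=0.3):
    """Combine several classifiers with a multiplicative weight update.

    predictions: array (n_models, n_samples) of 0/1 guesses (benign/malignant).
    labels:      array (n_samples,) of true 0/1 labels, revealed one at a time.
    eta:         penalty rate for a wrong guess (an assumed value).
    """
    n_models, n_samples = predictions.shape
    weights = np.ones(n_models)
    ensemble = np.empty(n_samples, dtype=int)
    for t in range(n_samples):
        votes = predictions[:, t]
        # Weighted majority vote of the current experts.
        ensemble[t] = int(np.dot(weights, votes) >= weights.sum() / 2)
        # Penalise the models that guessed wrong on this sample.
        wrong = votes != labels[t]
        weights[wrong] *= (1.0 - eta)
    return ensemble, weights

# Toy run with three stand-ins for base classifiers of differing accuracy.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, 200)
preds = np.vstack([np.where(rng.random(200) < acc, labels, 1 - labels)
                   for acc in (0.78, 0.85, 0.79)])
combined, final_w = multiplicative_weights_vote(preds, labels)
print("ensemble accuracy:", (combined == labels).mean(), "weights:", final_w)
```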

Keywords: artificial intelligence, machine learning, multiplicative weight update, skin cancer

Procedia PDF Downloads 80
53 Preliminary Study of Hand Gesture Classification in Upper-Limb Prosthetics Using Machine Learning with EMG Signals

Authors: Linghui Meng, James Atlas, Deborah Munro

Abstract:

There is an increasing demand for prosthetics capable of mimicking natural limb movements and hand gestures, but precise movement control of prosthetics using only electrode signals continues to be challenging. This study considers the implementation of machine learning as a means of improving accuracy and presents an initial investigation into hand gesture recognition using models based on electromyographic (EMG) signals. EMG signals, which capture muscle activity, are used as inputs to machine learning algorithms to improve prosthetic control accuracy, functionality and adaptivity. Using logistic regression, a machine learning classifier, this study evaluates the accuracy of classifying two hand gestures from the publicly available Ninapro dataset using two time-series feature extraction algorithms: Time Series Feature Extraction (TSFE) and Convolutional Neural Networks (CNNs). Trials were conducted using varying numbers of EMG channels, from one to eight, to determine the impact of channel quantity on classification accuracy. The results suggest that although both algorithms can successfully distinguish between hand gesture EMG signals, CNNs outperform TSFE in extracting useful information for both accuracy and computational efficiency. In addition, although more channels of EMG signals provide more useful information, they also require more complex and computationally intensive feature extractors and consequently do not perform as well as lower numbers of channels. The findings also underscore the potential of machine learning techniques in developing more effective and adaptive prosthetic control systems.

Keywords: EMG, machine learning, prosthetic control, electromyographic prosthetics, hand gesture classification, CNN, convolutional neural networks, TSFE, time series feature extraction, channel count, logistic regression, Ninapro, classifiers

Procedia PDF Downloads 38
52 Innovative Screening Tool Based on Physical Properties of Blood

Authors: Basant Singh Sikarwar, Mukesh Roy, Ayush Goyal, Priya Ranjan

Abstract:

This work combines two bodies of knowledge: the biomedical basis of blood stain formation and the fluid mechanics community's wisdom that such blood stain formation depends heavily on physical properties. Moreover, biomedical research shows that different patterns in blood stains are a robust indicator of the blood donor's health or lack thereof. Based on these valuable insights, an innovative screening tool is proposed which can act as an aid in the diagnosis of diseases such as Anemia, Hyperlipidaemia, Tuberculosis, Blood cancer, Leukemia, Malaria, etc., with enhanced confidence in the proposed analysis. To realize this powerful technique, simple, robust and low-cost micro-fluidic devices, a micro-capillary viscometer and a pendant drop tensiometer, are designed and proposed to be fabricated to measure the viscosity, surface tension and wettability of various blood samples. Once prognosis and diagnosis data have been generated, automated linear and nonlinear classifiers are applied for automated reasoning and presentation of results. A support vector machine (SVM) classifies data in a linear fashion. Discriminant analysis and nonlinear embeddings are coupled with nonlinear manifold detection in the data, and decisions are made accordingly. In this way, physical properties can be used, with linear and non-linear classification techniques, for screening of various diseases in humans and cattle. Experiments are carried out to validate the physical property measurement devices. This framework can be further developed towards a real-life portable disease screening cum diagnostics tool. Small-scale production of screening cum diagnostic devices is proposed to carry out independent tests.

Keywords: blood, physical properties, diagnostic, nonlinear, classifier, device, surface tension, viscosity, wettability

Procedia PDF Downloads 376
51 Selection of Optimal Reduced Feature Sets of Brain Signal Analysis Using Heuristically Optimized Deep Autoencoder

Authors: Souvik Phadikar, Nidul Sinha, Rajdeep Ghosh

Abstract:

In brainwave research using electroencephalogram (EEG) signals, finding the most relevant and effective feature set for identification of activities in the human brain remains a big challenge because of the random nature of the signals. The feature extraction method is a key issue in solving this problem. Finding features that give distinctive pictures for different activities and similar pictures for the same activity is very difficult, especially as the number of activities increases. The classification accuracy depends on the quality of this feature set. Further, more features result in high computational complexity, while fewer features compromise performance. In this paper, a novel idea for the selection of an optimal feature set using a heuristically optimized deep autoencoder is presented. Using various feature extraction methods, a vast number of features are extracted from the EEG signals and fed to the autoencoder deep neural network. The autoencoder encodes the input features into a small set of codes. To avoid the vanishing gradient problem and the need for normalization of the dataset, a meta-heuristic search algorithm is used to minimize the mean square error (MSE) between the encoder input and the decoder output. To reduce the feature set into a smaller one, 4 hidden layers are considered in the autoencoder network; hence it is called the Heuristically Optimized Deep Autoencoder (HO-DAE). In this method, no features are rejected; all the features are combined into the responses of the hidden layers. The results reveal that higher accuracy can be achieved using the optimal reduced features. The proposed HO-DAE is also compared with a regular autoencoder to test the performance of both. The performance of the proposed method is validated and compared with two other methods recently reported in the literature, which reveals that the proposed method is far better than the other two methods in terms of classification accuracy.

Keywords: autoencoder, brainwave signal analysis, electroencephalogram, feature extraction, feature selection, optimization

Procedia PDF Downloads 114
50 A Proposed Optimized and Efficient Intrusion Detection System for Wireless Sensor Network

Authors: Abdulaziz Alsadhan, Naveed Khan

Abstract:

In recent years, intrusions on computer networks have become a major security threat. Hence, it is important to impede such intrusions. The hindrance of such intrusions entirely relies on their detection, which is the primary concern of any security tool like an Intrusion Detection System (IDS). Therefore, it is imperative to accurately detect network attacks. Numerous intrusion detection techniques are available, but the main issue is their performance. The performance of an IDS can be improved by increasing the accurate detection rate and reducing false positives. The existing intrusion detection techniques have the limitation of using the raw dataset for classification. The classifier may get confused due to redundancy, which results in incorrect classification. To minimize this problem, Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Local Binary Pattern (LBP) can be applied to transform raw features into the principal feature space and select the features based on their sensitivity. Eigenvalues can be used to determine the sensitivity. To further refine the selected features, greedy search, backward elimination, and Particle Swarm Optimization (PSO) can be used to obtain a subset of features with optimal sensitivity and the highest discriminatory power. This optimal feature subset is used to perform classification. For classification purposes, a Support Vector Machine (SVM) and a Multilayer Perceptron (MLP) are used due to their proven ability in classification. The Knowledge Discovery and Data Mining (KDD’99) cup dataset was considered as a benchmark for evaluating security detection mechanisms. The proposed approach can provide an optimal intrusion detection mechanism that outperforms the existing approaches and has the capability to minimize the number of features and maximize the detection rates.
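
A minimal sketch of one projection-plus-classification path described above (PCA feeding an SVM); the LDA/LBP transforms, the PSO/greedy feature search, and the MLP are not reproduced, and the data below are random placeholders standing in for KDD'99-style records.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Placeholder for KDD'99-style records: 41 raw features, binary attack label.
rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 41))
y = rng.integers(0, 2, 3000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Scale, project the raw features onto principal components, then classify.
model = make_pipeline(StandardScaler(),
                      PCA(n_components=10),
                      SVC(kernel="rbf", C=1.0))
model.fit(X_tr, y_tr)
print("detection accuracy:", accuracy_score(y_te, model.predict(X_te)))
```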

Keywords: Particle Swarm Optimization (PSO), Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Local Binary Pattern (LBP), Support Vector Machine (SVM), Multilayer Perceptron (MLP)

Procedia PDF Downloads 367
49 Using Machine Learning to Classify Different Body Parts and Determine Healthiness

Authors: Zachary Pan

Abstract:

Our general mission is to solve the problem of classifying images into different body part types and deciding if each of them is healthy or not. However, for now, we will determine healthiness for only one-sixth of the body parts, specifically the chest. We will detect pneumonia in X-ray scans of those chest images. With this type of AI, doctors can use it as a second opinion when they are taking CT or X-ray scans of their patients. Another advantage of using this machine learning classifier is that it has no human weaknesses like fatigue. The overall approach to this problem is to split the problem into two parts: first, classify the image, then determine if it is healthy. In order to classify the image into a specific body part class, the body parts dataset must be split into test and training sets. We can then use many models, like neural networks or logistic regression models, and fit them using the training set. Now, using the test set, we can obtain a realistic estimate of the accuracy the models will have on images in the real world, since these testing images have never been seen by the models before. In order to increase this testing accuracy, we can also apply many complex algorithms to the models, like multiplicative weight update. For the second part of the problem, to determine if the body part is healthy, we can have another dataset consisting of healthy and non-healthy images of the specific body part and once again split that into the test and training sets. We then use another neural network to train on those training set images and use the testing set to figure out its accuracy. We will do this process only for the chest images. A major conclusion reached is that convolutional neural networks are the most reliable and accurate at image classification. In classifying the images, the logistic regression model, the neural network, neural networks with multiplicative weight update, neural networks with the black box algorithm, and the convolutional neural network achieved 96.83 percent accuracy, 97.33 percent accuracy, 97.83 percent accuracy, 96.67 percent accuracy, and 98.83 percent accuracy, respectively. On the other hand, the overall accuracy of the model that determines if the images are healthy or not is around 78.37 percent.

Keywords: body part, healthcare, machine learning, neural networks

Procedia PDF Downloads 109
48 Data Mining Model for Predicting the Status of HIV Patients during Drug Regimen Change

Authors: Ermias A. Tegegn, Million Meshesha

Abstract:

Human Immunodeficiency Virus and Acquired Immunodeficiency Syndrome (HIV/AIDS) is a major cause of death in most African countries. Ethiopia is one of the seriously affected countries in sub-Saharan Africa. Previously in Ethiopia, having HIV/AIDS was almost equivalent to a death sentence. With the introduction of Antiretroviral Therapy (ART), HIV/AIDS has become a chronic but manageable disease. The study focused on a data mining technique to predict the future living status of HIV/AIDS patients at the time of drug regimen change, when the patients develop toxicity to the ART drug combination they are currently taking. The data are taken from the University of Gondar Hospital ART program database. A hybrid methodology is followed to explore the application of data mining to the ART program dataset. Data cleaning, handling of missing values and data transformation were used for preprocessing the data. WEKA 3.7.9 data mining tools, classification algorithms, and expertise are utilized as means to address the research problem. By using four different classification algorithms (i.e., J48 classifier, PART rule induction, Naïve Bayes and neural network) and by adjusting their parameters, thirty-two models were built on the pre-processed University of Gondar ART program dataset. The performances of the models were evaluated using the standard metrics of accuracy, precision, recall, and F-measure. The most effective model to predict the status of HIV patients with drug regimen substitution is a pruned J48 decision tree with a classification accuracy of 98.01%. This study extracts interesting attributes such as Ever taking Cotrim, Ever taking TbRx, CD4 count, Age, Weight, and Gender so as to predict the status of drug regimen substitution. The outcome of this study can be used as an assistant tool for clinicians to help them make more appropriate drug regimen substitutions. Future research directions are suggested to come up with an applicable system in the study area.

Keywords: HIV drug regimen, data mining, hybrid methodology, predictive model

Procedia PDF Downloads 142
47 Speech Emotion Recognition: A DNN and LSTM Comparison in Single and Multiple Feature Application

Authors: Thiago Spilborghs Bueno Meyer, Plinio Thomaz Aquino Junior

Abstract:

Through speech, which privileges the functional and interactive nature of the text, it is possible to ascertain the spatiotemporal circumstances, the conditions of production and reception of the discourse, and explicit purposes such as informing, explaining, convincing, etc. These conditions allow bringing interaction between humans closer to human-robot interaction, making it natural and sensitive to information. However, it is not enough to understand what is said; it is necessary to recognize emotions for the desired interaction. The validity of the use of neural networks for feature selection and emotion recognition was verified. For this purpose, the use of neural networks and a comparison of models, such as recurrent neural networks and deep neural networks, are proposed in order to classify emotions from speech signals and verify the quality of recognition. This is expected to enable the implementation of robots in a domestic environment, such as the HERA robot from the RoboFEI@Home team, which focuses on autonomous service robots for the domestic environment. Tests were performed using only the Mel-Frequency Cepstral Coefficients, as well as tests with several features: Delta-MFCC, spectral contrast, and the Mel spectrogram. To carry out the training, validation and testing of the neural networks, the eNTERFACE’05 database was used, which has 42 speakers from 14 different nationalities speaking the English language. The data from the chosen database are videos that, for use in the neural networks, were converted into audio. As a result, a classification accuracy of 51.969% of correct answers was found when using the deep neural network, while the recurrent neural network achieved a classification accuracy equal to 44.09%. The results are more accurate when only the Mel-Frequency Cepstral Coefficients are used for classification with the deep neural network classifier, and in only one case is it possible to observe greater accuracy from the recurrent neural network, which occurs when the various features are used with a batch size of 73 and 100 training epochs.
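
The feature set named above (MFCC, Delta-MFCC, spectral contrast, Mel spectrogram) can be extracted with librosa roughly as sketched below; the eNTERFACE'05 audio extraction and the DNN/LSTM classifiers are not reproduced, and the file path, sampling rate and summary statistics are assumptions.

```python
import numpy as np
import librosa

def speech_features(path, sr=16000, n_mfcc=13):
    """MFCC, delta-MFCC, spectral contrast and Mel-spectrogram statistics
    for one utterance, concatenated into a single feature vector."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    delta = librosa.feature.delta(mfcc)
    contrast = librosa.feature.spectral_contrast(y=y, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr)
    # Summarise each time series by its mean and standard deviation per band.
    parts = [np.hstack([f.mean(axis=1), f.std(axis=1)])
             for f in (mfcc, delta, contrast, librosa.power_to_db(mel))]
    return np.hstack(parts)

# Usage (the path is a placeholder for an audio track extracted from a video):
# features = speech_features("enterface_sample.wav")
```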

Keywords: emotion recognition, speech, deep learning, human-robot interaction, neural networks

Procedia PDF Downloads 171
46 Heart Rate Variability Analysis for Early Stage Prediction of Sudden Cardiac Death

Authors: Reeta Devi, Hitender Kumar Tyagi, Dinesh Kumar

Abstract:

In the present scenario, cardiovascular problems are a growing challenge for researchers and physiologists. As heart disease has no geographic, gender or socioeconomic specificity, detecting cardiac irregularities at an early stage followed by quick and correct treatment is very important. The electrocardiogram is the finest tool for continuous monitoring of heart activity. Heart rate variability (HRV) is used to measure the naturally occurring oscillations between consecutive cardiac cycles. Analysis of this variability is carried out using time domain, frequency domain and non-linear parameters. This paper presents HRV analysis of an online dataset for normal sinus rhythm (taken as the healthy subjects) and sudden cardiac death (SCD subjects) using all three methods, computing values for parameters such as the standard deviation of normal-to-normal intervals (SDNN), the square root of the mean of the squared differences between adjacent RR intervals (RMSSD) and the mean of R-to-R intervals (mean RR) in the time domain; very low-frequency (VLF), low-frequency (LF), high-frequency (HF) power and the ratio of low to high frequency (LF/HF ratio) in the frequency domain; and the Poincaré plot for non-linear analysis. To differentiate the HRV of healthy subjects from that of subjects who died of SCD, a k-nearest neighbor (k-NN) classifier has been used because of its high accuracy. Results show highly reduced values of all stated parameters for SCD subjects as compared to healthy ones. As the dataset used for SCD patients is a recording of their ECG signal one hour prior to death, it is therefore verified, with an accuracy of 95%, that the proposed algorithm can identify the mortality risk of a patient one hour before death. The identification of a patient's mortality risk at such an early stage may prevent sudden death if timely and correct treatment is given by the doctor.
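
A minimal sketch of the time-domain HRV measures named above (mean RR, SDNN, RMSSD) computed from RR-interval series and fed to a k-NN classifier; the RR series, group sizes and k value are synthetic placeholders, and the frequency-domain and Poincaré analyses are not reproduced.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def time_domain_hrv(rr_ms):
    """Mean RR, SDNN and RMSSD from a series of RR intervals in milliseconds."""
    rr = np.asarray(rr_ms, dtype=float)
    mean_rr = rr.mean()
    sdnn = rr.std(ddof=1)                          # SD of all NN intervals
    rmssd = np.sqrt(np.mean(np.diff(rr) ** 2))     # RMS of successive differences
    return np.array([mean_rr, sdnn, rmssd])

# Placeholder RR series: healthy-like records vs. records with reduced variability.
rng = np.random.default_rng(0)
healthy = [800 + 50 * rng.standard_normal(300) for _ in range(20)]
scd_like = [780 + 15 * rng.standard_normal(300) for _ in range(20)]

X = np.vstack([time_domain_hrv(r) for r in healthy + scd_like])
y = np.array([0] * 20 + [1] * 20)                  # 0 = NSR group, 1 = SCD group

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict(X[:5]))
```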

Keywords: early stage prediction, heart rate variability, linear and non-linear analysis, sudden cardiac death

Procedia PDF Downloads 342
45 Sound Analysis of Young Broilers Reared under Different Stocking Densities in Intensive Poultry Farming

Authors: Xiaoyang Zhao, Kaiying Wang

Abstract:

The choice of stocking density in poultry farming is a potential way of determining the welfare level of poultry. However, it is difficult to measure stocking densities in poultry farming because of many variables, such as species, age and weight, feeding method, house structure and geographical location, across different broiler houses. A method is proposed in this paper to measure the differences among young broilers reared under different stocking densities by sound analysis. Vocalisations of broilers were recorded and analysed under different stocking densities to identify the relationship between sounds and stocking densities. Recordings were made continuously for three-week-old chickens in order to evaluate the variation of sounds emitted by the animals at the beginning. The experimental trial was carried out on an indoor-reared broiler farm; the audio recording procedures lasted for 5 days. Broilers were divided into 5 groups; the stocking density treatments were 8/m², 10/m², 12/m² (96 birds/pen), 14/m² and 16/m², and all conditions, including ventilation and feed, were kept the same in every group except for stocking density. The recordings and analysis of the chickens' sounds were made noninvasively. Sound recordings were manually analysed and labelled using sound analysis software: GoldWave Digital Audio Editor. After the sound acquisition process, Mel Frequency Cepstrum Coefficients (MFCC) were extracted from the sound data, and a Support Vector Machine (SVM) was used as an early detector and classifier. This preliminary study, conducted on an indoor-reared broiler farm, shows that this method can be used to classify the sounds of chickens under different densities economically (only a cheap microphone and recorder are needed); the classification accuracy is 85.7%. This method can predict the optimum stocking density of broilers when complemented with animal welfare indicators, animal productivity indicators and so on.

Keywords: broiler, stocking density, poultry farming, sound monitoring, Mel Frequency Cepstrum Coefficients (MFCC), Support Vector Machine (SVM)

Procedia PDF Downloads 162
44 Lithuanian Sign Language Literature: Metaphors at the Phonological Level

Authors: Anželika Teresė

Abstract:

In order to solve issues in sign language linguistics, address matters pertaining to maintaining a high quality of sign language (SL) translation, contribute to dispelling misconceptions about SL and deaf people, and raise awareness and understanding of the deaf community heritage, this presentation discusses literature in Lithuanian Sign Language (LSL) and the inherent metaphors that are created by using the phonological parameters: handshape, location, movement, palm orientation and nonmanual features. The study covered in this presentation is twofold, involving both the micro-level analysis of metaphors in terms of phonological parameters as a sub-lexical feature and the macro-level analysis of the poetic context. Cognitive theories underlie research on metaphors in sign language literature across a range of SLs. The study follows this practice. The presentation covers the qualitative analysis of 34 pieces of LSL literature. The analysis employs the ELAN software widely used in SL research. The target is to examine how specific types of each phonological parameter are used for the creation of metaphors in LSL literature and what metaphors are created. The results of the study show that LSL literature employs a range of metaphors created by using classifier signs and by modifying established signs. The study also reveals that LSL literature tends to create reference metaphors indicating status and power. As the study shows, LSL poets metaphorically encode status by encoding another meaning in the same sign, which results in double metaphors. A metaphor of identity has also been determined. Notably, the poetic context has revealed that the latter metaphor can also be identified as a metaphor for life. The study goes on to note that deaf poets create metaphors related to the importance of various phenomena significant to the lyrical subject. Notably, the study has allowed the detection of locations, nonmanual features, etc., never mentioned in previous SL research as being used for the creation of metaphors.

Keywords: Lithuanian sign language, sign language literature, sign language metaphor, metaphor at the phonological level, cognitive linguistics

Procedia PDF Downloads 137
43 Resisting Adversarial Assaults: A Model-Agnostic Autoencoder Solution

Authors: Massimo Miccoli, Luca Marangoni, Alberto Aniello Scaringi, Alessandro Marceddu, Alessandro Amicone

Abstract:

The susceptibility of deep neural networks (DNNs) to adversarial manipulations is a recognized challenge within the computer vision domain. Adversarial examples, crafted by adding subtle yet malicious alterations to benign images, exploit this vulnerability. Various defense strategies have been proposed to safeguard DNNs against such attacks, stemming from diverse research hypotheses. Building upon prior work, our approach involves the utilization of autoencoder models. Autoencoders, a type of neural network, are trained to learn representations of training data and reconstruct inputs from these representations, typically minimizing reconstruction errors like the mean squared error (MSE). Our autoencoder was trained on a dataset of benign examples, learning features specific to them. Consequently, when presented with significantly perturbed adversarial examples, the autoencoder exhibited high reconstruction errors. The architecture of the autoencoder was tailored to the dimensions of the images under evaluation. We considered various image sizes, constructing models differently for 256x256 and 512x512 images. Moreover, the choice of the computer vision model is crucial, as most adversarial attacks are designed with specific AI structures in mind. To mitigate this, we proposed a method to replace image-specific dimensions with a structure independent of both dimensions and neural network models, thereby enhancing robustness. Our multi-modal autoencoder reconstructs the spectral representation of images across the red-green-blue (RGB) color channels. To validate our approach, we conducted experiments using diverse datasets and subjected them to adversarial attacks using models such as ResNet50 and ViT_L_16 from the torchvision library. The autoencoder extracted features used in a classification model, resulting in an MSE (RGB) of 0.014, a classification accuracy of 97.33%, and a precision of 99%.
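
The detection logic implied above, reconstruction error from a benign-only autoencoder used as an adversarial flag, can be sketched as follows; the `autoencoder` callable, the threshold percentile and the image sizes are stand-ins, and the paper's multimodal spectral autoencoder and downstream classifier are not reproduced.

```python
import numpy as np

def reconstruction_mse(x, x_hat):
    """Per-sample mean squared error across all pixels and RGB channels."""
    return np.mean((x - x_hat) ** 2, axis=tuple(range(1, x.ndim)))

def fit_threshold(benign, autoencoder, percentile=99):
    """Choose a threshold from the benign-only reconstruction-error distribution."""
    errors = reconstruction_mse(benign, autoencoder(benign))
    return np.percentile(errors, percentile)

def is_adversarial(batch, autoencoder, threshold):
    """Flag inputs whose reconstruction error exceeds the benign threshold."""
    return reconstruction_mse(batch, autoencoder(batch)) > threshold

# Toy stand-in for the trained autoencoder: benign-like inputs reconstruct well.
fake_autoencoder = lambda x: np.clip(x + 0.01 * np.random.rand(*x.shape), 0, 1)
benign = np.random.rand(100, 64, 64, 3)            # placeholder clean images
threshold = fit_threshold(benign, fake_autoencoder)
perturbed = np.clip(benign[:10] + 0.2 * np.random.rand(10, 64, 64, 3), 0, 1)
print(is_adversarial(perturbed, fake_autoencoder, threshold))
```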

Keywords: adversarial attacks, malicious images detector, binary classifier, multimodal transformer autoencoder

Procedia PDF Downloads 114
42 Electroencephalography Correlates of Memorability While Viewing Advertising Content

Authors: Victor N. Anisimov, Igor E. Serov, Ksenia M. Kolkova, Natalia V. Galkina

Abstract:

The problem of the memorability of advertising content is closely connected with the key issues of neuromarketing. The memorability of advertising content contributes to the marketing effectiveness of the promoted product. Significant directions in studying the phenomenon of memorability are the memorability of the brand (detected through the memorability of the logo) and the memorability of the product offer (detected through the memorization of dynamic audiovisual advertising content - a commercial). The aim of this work is to reveal the predictors of memorization of static and dynamic audiovisual stimuli (logos and commercials). An important direction of the research was revealing differences in the psychophysiological correlates of memorability between static and dynamic audiovisual stimuli. We assumed that static and dynamic images are perceived in different ways and may differ in the memorization process. Objective methods of recording psychophysiological parameters while watching static and dynamic audiovisual materials are well suited to achieve this aim. The electroencephalography (EEG) method was employed with the aim of identifying correlates of the memorability of various stimuli in the electrical activity of the cerebral cortex. All stimuli (in the static and dynamic groups separately) were divided into 2 groups – remembered and not remembered – based on the results of the questionnaire method. The questionnaires were filled out by survey participants not immediately after viewing the stimuli, but after a time interval (to detect stimuli recorded through long-term memorization). Using statistical methods, we developed a classifier (statistical model) that predicts which group (remembered or not remembered) a stimulus falls into, based on its psychophysiological perception. The result of the statistical model was compared with the results of the questionnaire. Conclusions: predictors of the memorability of static and dynamic stimuli have been identified, which allows prediction of which stimuli will have a higher probability of being remembered. A further development of this study will be the creation of a stimulus memory model with the possibility of recognizing a stimulus as previously seen or new. Thus, in the process of remembering the stimulus, it is planned to take into account the stimulus recognition factor, which is one of the most important tasks for neuromarketing.

Keywords: memory, commercials, neuromarketing, EEG, branding

Procedia PDF Downloads 251
41 Identification and Classification of Medicinal Plants of Indian Himalayan Region Using Hyperspectral Remote Sensing and Machine Learning Techniques

Authors: Kishor Chandra Kandpal, Amit Kumar

Abstract:

The Indian Himalaya region harbours approximately 1748 plants of medicinal importance, and as per the International Union for Conservation of Nature (IUCN), 112 plant species among these are threatened or endangered. To ease the pressure on these plants, the government of India is encouraging their in-situ cultivation. Saussurea costus, Valeriana jatamansi, and Picrorhiza kurroa have also been prioritized for large-scale cultivation owing to their market demand, conservation value and medicinal properties. These species are found in the 1000 m to 4000 m elevation range in the Indian Himalaya. Identification of these plants in the field requires taxonomic skills, which is one of the major bottlenecks in the conservation and management of these plants. In recent years, hyperspectral remote sensing techniques have been used to precisely discriminate plant species with the help of their unique spectral signatures. Against this background, a spectral library of the above three medicinal plants was prepared by collecting spectral data using a handheld spectroradiometer (325 to 1075 nm) from farmers' fields in the Himachal Pradesh and Uttarakhand states of the Indian Himalaya. The random forest (RF) model was applied to the spectral data for the classification of the medicinal plants. The standard 80:20 split ratio was followed for training and validation of the RF model, which resulted in a training accuracy of 84.39% (kappa coefficient = 0.72) and a testing accuracy of 85.29% (kappa coefficient = 0.77). This RF classifier identified the green (555 to 598 nm), red (605 nm), and near-infrared (725 to 840 nm) wavelength regions as suitable for the discrimination of these species. The findings of this study provide a technique for rapid and onsite identification of the above medicinal plants in the field. This will also be a key input for the classification of hyperspectral remote sensing images for mapping these species in farmers' fields on a regional scale. This is a pioneering study on medicinal plants in the Indian Himalaya region in which the applicability of hyperspectral remote sensing has been explored.
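
A minimal sketch of the 80:20 random-forest workflow with accuracy and kappa reporting and a band-importance readout; the spectra, class names, band count and forest size below are placeholders rather than the actual spectroradiometer library.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, cohen_kappa_score

# Placeholder spectra (325-1075 nm resampled to 150 bands) for the three species.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 150))
y = rng.choice(["S. costus", "V. jatamansi", "P. kurroa"], size=300)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)
rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
pred = rf.predict(X_te)
print("accuracy:", accuracy_score(y_te, pred),
      "kappa:", cohen_kappa_score(y_te, pred))

# Band importances hint at discriminative wavelength regions (e.g. green, red, NIR).
top_bands = np.argsort(rf.feature_importances_)[::-1][:10]
print("most informative band indices:", top_bands)
```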

Keywords: Himalaya, hyperspectral remote sensing, machine learning, medicinal plants, random forests

Procedia PDF Downloads 204
40 Geospatial Analysis for Predicting Sinkhole Susceptibility in Greene County, Missouri

Authors: Shishay Kidanu, Abdullah Alhaj

Abstract:

Sinkholes in the karst terrain of Greene County, Missouri, pose significant geohazards, imposing challenges on construction and infrastructure development, with potential threats to lives and property. To address these issues, understanding the influencing factors and modeling sinkhole susceptibility is crucial for effective mitigation through strategic changes in land use planning and practices. This study utilizes geographic information system (GIS) software to collect and process diverse data, including topographic, geologic, hydrogeologic, and anthropogenic information. Nine key sinkhole-influencing factors, ranging from slope characteristics to proximity to geological structures, were carefully analyzed. The Frequency Ratio method establishes relationships between the attribute classes of these factors and sinkhole events, deriving class weights to indicate their relative importance. Weighted integration of these factors is accomplished using the Analytic Hierarchy Process (AHP) and the Weighted Linear Combination (WLC) method in a GIS environment, resulting in a comprehensive sinkhole susceptibility index (SSI) model for the study area. Employing the Jenks natural breaks classification method, the SSI values are categorized into five distinct sinkhole susceptibility zones: very low, low, moderate, high, and very high. Validation of the model, conducted through the Area Under Curve (AUC) and Sinkhole Density Index (SDI) methods, demonstrates a robust correlation with the sinkhole inventory data. The prediction rate curve yields an AUC value of 74%, indicating a 74% validation accuracy. The SDI result further supports the success of the sinkhole susceptibility model. This model offers reliable predictions for the future distribution of sinkholes, providing valuable insights for planners and engineers in the formulation of development plans and land-use strategies. Its application extends to enhancing preparedness and minimizing the impact of sinkhole-related geohazards on both infrastructure and the community.
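
A hedged sketch of the frequency-ratio scoring and weighted linear combination described above; the two factor rasters, the sinkhole mask and the AHP weights are toy placeholders, and GIS raster I/O, the full nine-factor set and the Jenks classification are omitted.

```python
import numpy as np

def frequency_ratio(factor_classes, sinkhole_mask):
    """FR per class of one factor: (% of sinkholes in class) / (% of area in class)."""
    fr = {}
    total_cells = factor_classes.size
    total_sinks = sinkhole_mask.sum()
    for c in np.unique(factor_classes):
        in_class = factor_classes == c
        pct_sinks = sinkhole_mask[in_class].sum() / max(total_sinks, 1)
        pct_area = in_class.sum() / total_cells
        fr[c] = pct_sinks / pct_area if pct_area > 0 else 0.0
    return fr

def susceptibility_index(factor_rasters, ahp_weights, sinkhole_mask):
    """Weighted linear combination of FR-scored factors into an SSI surface."""
    ssi = np.zeros(sinkhole_mask.shape, dtype=float)
    for raster, w in zip(factor_rasters, ahp_weights):
        fr = frequency_ratio(raster, sinkhole_mask)
        ssi += w * np.vectorize(fr.get)(raster)
    return ssi

# Toy rasters: two classified factors (e.g. slope class, distance-to-fault class).
rng = np.random.default_rng(0)
slope_cls = rng.integers(1, 5, (100, 100))
fault_cls = rng.integers(1, 4, (100, 100))
sinks = rng.random((100, 100)) < 0.01              # known sinkhole cells
ssi = susceptibility_index([slope_cls, fault_cls], [0.6, 0.4], sinks)
print("SSI range:", ssi.min(), ssi.max())
```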

Keywords: sinkhole, GIS, analytical hierarchy process, frequency ratio, susceptibility, Missouri

Procedia PDF Downloads 74
39 Pattern Recognition Approach Based on Metabolite Profiling Using In vitro Cancer Cell Line

Authors: Amanina Iymia Jeffree, Reena Thriumani, Mohammad Iqbal Omar, Ammar Zakaria, Yumi Zuhanis Has-Yun Hashim, Ali Yeon Md Shakaff

Abstract:

Metabolite profiling is the strategy adopted in this pattern recognition study, which focuses on in vitro cell lines of the three cancer types responsible for the most deaths: lung, breast, and colon cancer. The purpose of this study was to discriminate the VOC patterns of the cancerous and control groups based on metabolite profiling. Sampling was carried out using a cell culture technique. All culture flasks were incubated for up to 72 hours, and data collection started after 24 hours. Each sample run took 24 minutes to complete. The comparative metabolite patterns were identified using headspace solid-phase micro-extraction (HS-SPME) sampling coupled with gas chromatography-mass spectrometry (GC-MS). The main experimental variables, such as oven temperature and time, were optimized by response surface methodology (RSM) to obtain the optimal conditions. Volatiles were identified using the National Institute of Standards and Technology (NIST) mass spectral database and retention time libraries. To improve the reliability of the results, background noise was reduced by selecting only the data from the 3rd to the 17th minute for statistical analysis. Targeted metabolites, annotated as known compounds with a peak area greater than 0.5 percent, were highlighted and subsequently treated statistically. Because the volatiles produced contain hundreds to thousands of compounds, the data were reduced by chemometric analysis, namely principal component analysis (PCA), as a preliminary step before being subjected to a pattern classifier for identification of the VOC samples. The volatile organic compound profiles were shown to be significantly distinguishable between the cancerous and control groups based on metabolite profiling.
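A minimal sketch of the preliminary chemometric step follows, assuming a samples-by-VOCs peak-area matrix (synthetic placeholders here) already restricted to annotated peaks in the selected retention-time window: autoscaling followed by PCA before any pattern classifier is applied.

```python
# Hedged sketch: PCA as a preliminary chemometric step on a peak-area matrix
# (samples x annotated VOCs) before pattern classification. Data are
# synthetic placeholders, not GC-MS measurements from the study.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n_samples, n_vocs = 24, 60                    # e.g. cancer cell lines + controls
peak_areas = rng.random((n_samples, n_vocs))  # placeholder GC-MS peak areas

# Autoscaling is a common chemometric pre-treatment before PCA.
X = StandardScaler().fit_transform(peak_areas)

pca = PCA(n_components=2)
scores = pca.fit_transform(X)
print("explained variance ratio:", pca.explained_variance_ratio_)
print("PC scores shape:", scores.shape)       # inspect group separation in PC space
```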

Keywords: in vitro cancer cell line, metabolite profiling, pattern recognition, volatile organic compounds

Procedia PDF Downloads 368
38 An ANOVA-based Sequential Forward Channel Selection Framework for Brain-Computer Interface Application based on EEG Signals Driven by Motor Imagery

Authors: Forouzan Salehi Fergeni

Abstract:

A brain-computer interface (BCI) system converts a person's movement intentions into commands for action using brain signals such as the electroencephalogram (EEG). When left- or right-hand movements are imagined, different patterns of brain activity appear, which can be employed as BCI control signals. To improve BCI systems, effective and accurate techniques for increasing the classification precision of motor imagery (MI) based on EEG are greatly needed. Subject dependency and non-stationarity are two characteristics of EEG signals, so EEG signals must be processed effectively before being used in BCI applications. In the present study, after applying an 8 to 30 Hz band-pass filter, a common average reference (CAR) spatial filter is applied for denoising, and an analysis-of-variance method is then used to select the more appropriate and informative channels from a large set of channels. After ordering the channels by their efficiency, sequential forward channel selection is employed to choose just a few reliable ones. Features from the time and wavelet domains are extracted and shortlisted with the help of a statistical technique, namely the t-test. Finally, the selected features are classified with different machine learning and neural network classifiers, namely k-nearest neighbors, probabilistic neural network, support vector machine, extreme learning machine, decision tree, multi-layer perceptron, and linear discriminant analysis, in order to compare their performance in this application. Using a ten-fold cross-validation approach, tests are performed on a motor imagery dataset from BCI Competition III. The outcomes demonstrate that the SVM classifier achieved the greatest classification precision of 97% compared with the other approaches. The overall findings confirm that the suggested framework is reliable and computationally efficient for constructing BCI systems and surpasses existing methods.
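The channel-selection idea can be sketched as follows, using synthetic EEG trials in place of the BCI Competition III data: channels are ranked by an ANOVA F-score on a per-channel band-power feature and then added greedily while cross-validated SVM accuracy improves. The band-power feature and all hyperparameters are illustrative choices, not the paper's exact pipeline.

```python
# Hedged sketch: ANOVA-based channel ranking followed by sequential forward
# channel selection with an SVM. EEG data below are synthetic placeholders.
import numpy as np
from sklearn.feature_selection import f_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(2)
n_trials, n_channels, n_samples = 120, 22, 250
eeg = rng.standard_normal((n_trials, n_channels, n_samples))   # band-passed, CAR-filtered trials
labels = rng.integers(0, 2, n_trials)                          # left / right hand imagery

# One feature per channel: log band power of the trial.
power = np.log((eeg ** 2).mean(axis=2))                        # (trials, channels)

# Rank channels by ANOVA F-score.
f_scores, _ = f_classif(power, labels)
order = np.argsort(f_scores)[::-1]

# Sequential forward channel selection: keep a channel only if it improves
# ten-fold cross-validated accuracy.
selected, best_acc = [], 0.0
for ch in order:
    candidate = selected + [int(ch)]
    acc = cross_val_score(SVC(kernel="rbf"), power[:, candidate], labels, cv=10).mean()
    if acc > best_acc:
        selected, best_acc = candidate, acc

print("selected channels:", selected, "cv accuracy:", round(best_acc, 3))
```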

Keywords: brain-computer interface, channel selection, motor imagery, support-vector-machine

Procedia PDF Downloads 52
37 Detection of Phoneme [S] Mispronunciation for Sigmatism Diagnosis in Adults

Authors: Michal Krecichwost, Zauzanna Miodonska, Pawel Badura

Abstract:

The diagnosis of sigmatism is mostly based on observation of the articulatory organs. It is, however, not always possible to precisely observe the vocal apparatus, in particular inside the oral cavity of the patient. Speech processing can help objectify the therapy and simplify the verification of its progress. In the described study, a methodology for the classification of the incorrectly pronounced phoneme [s] is proposed. The recordings come from adults and were registered with a speech recorder at a sampling rate of 44.1 kHz and a resolution of 16 bits. A database of pathological and normative speech was collected for the study, including reference assessments provided by speech therapy experts. Ten adult subjects were asked to simulate a certain type of sigmatism under the supervision of a speech therapy expert. In the recordings, the analyzed phone [s] was surrounded by vowels, viz.: ASA, ESE, ISI, SPA, USU, YSY. Thirteen MFCCs (mel-frequency cepstral coefficients) and the RMS (root mean square) value are calculated within each frame belonging to the analyzed phoneme. Additionally, three fricative formants, along with their corresponding amplitudes, are determined for the entire segment. To aggregate the information within a segment, the average value of each MFCC coefficient is calculated, and all features of other types are aggregated by means of their 75th percentile. The proposed feature aggregation reduces the size of the feature vector used in classification. A binary SVM (support vector machine) classifier is employed at the phoneme recognition stage: one class consists of the pathological phones and the other of the normative ones. The proposed feature vector yields classification sensitivity and specificity above the 90% level for individual logo phones. Employing fricative-formant-based information improves the MFCC-only classification results by an average of 5 percentage points. The study shows that employing parameters specific to the selected phones improves the efficiency of pathology detection compared with traditional methods of speech signal parameterization.
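A minimal sketch of the described feature aggregation follows, using placeholder audio segments and omitting the fricative-formant features: frame-wise MFCCs are averaged, RMS is aggregated by its 75th percentile, and a binary SVM is trained on the resulting vectors.

```python
# Hedged sketch of the per-segment feature aggregation for [s] segments.
# The audio arrays here are random placeholders, not real recordings.
import numpy as np
import librosa
from sklearn.svm import SVC

def segment_features(y, sr=44100):
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # (13, frames)
    rms = librosa.feature.rms(y=y)                        # (1, frames)
    mfcc_mean = mfcc.mean(axis=1)                         # average per coefficient
    rms_p75 = np.percentile(rms, 75)                      # 75th percentile aggregation
    return np.concatenate([mfcc_mean, [rms_p75]])

# Placeholder "recordings" of the [s] segment (0.3 s of noise each).
rng = np.random.default_rng(3)
X = np.array([segment_features(rng.standard_normal(13230)) for _ in range(40)])
y_labels = np.array([0] * 20 + [1] * 20)                  # normative vs pathological

clf = SVC(kernel="rbf").fit(X, y_labels)
print("training accuracy:", clf.score(X, y_labels))
```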

Keywords: computer-aided pronunciation evaluation, sibilants, sigmatism diagnosis, speech processing

Procedia PDF Downloads 284
36 Exploring Pre-Trained Automatic Speech Recognition Model HuBERT for Early Alzheimer’s Disease and Mild Cognitive Impairment Detection in Speech

Authors: Monica Gonzalez Machorro

Abstract:

Dementia is hard to diagnose because of the lack of early physical symptoms, and early recognition is key to improving the living conditions of patients. Speech is considered a valuable biomarker for this challenge. Recent works have utilized conventional acoustic features and machine learning methods to detect dementia in speech, with BERT-like classifiers reporting the most promising performance. One constraint, nonetheless, is that these studies are based either on human transcripts or on transcripts produced by automatic speech recognition (ASR) systems. The contribution of this research is to explore a method that does not require transcriptions to detect early Alzheimer's disease (AD) and mild cognitive impairment (MCI). This is achieved by fine-tuning a pre-trained ASR model for the downstream early AD and MCI detection tasks. To do so, a subset of the thoroughly studied Pitt Corpus is customized; the subset is balanced for class, age, and gender, and data processing also involves cropping the samples into 10-second segments. For comparison purposes, a baseline model is defined by training and testing a Random Forest on 20 acoustic features extracted with the librosa library in Python: zero-crossing rate, MFCCs, spectral bandwidth, spectral centroid, root mean square, and short-time Fourier transform. The baseline model achieved 58% accuracy. To fine-tune HuBERT as a classifier, an average pooling strategy is employed to merge the 3D audio representations into 2D representations, and a linear layer is added. The pre-trained model used is 'hubert-large-ls960-ft'. Empirically, the number of epochs selected is 5 and the batch size is 1. Experiments show that the proposed method reaches 69% balanced accuracy. This suggests that the linguistic and speech information encoded in the self-supervised ASR-based model is able to learn acoustic cues of AD and MCI.
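A minimal sketch of the classification head described above is given below, assuming the Hugging Face checkpoint 'facebook/hubert-large-ls960-ft' (the abstract names 'hubert-large-ls960-ft') and a two-class linear head: HuBERT's (batch, time, hidden) outputs are mean-pooled over time and passed through a linear layer.

```python
# Hedged sketch of a HuBERT encoder with average pooling and a linear
# classification head. Checkpoint prefix, batch size, and class count are
# illustrative assumptions, not confirmed details of the study.
import torch
import torch.nn as nn
from transformers import HubertModel

class HubertClassifier(nn.Module):
    def __init__(self, num_classes=2, checkpoint="facebook/hubert-large-ls960-ft"):
        super().__init__()
        self.encoder = HubertModel.from_pretrained(checkpoint)
        self.head = nn.Linear(self.encoder.config.hidden_size, num_classes)

    def forward(self, input_values):
        hidden = self.encoder(input_values).last_hidden_state   # (batch, time, hidden)
        pooled = hidden.mean(dim=1)                              # average pooling over time
        return self.head(pooled)                                 # (batch, num_classes)

model = HubertClassifier()
waveform = torch.randn(1, 160000)        # one 10-second segment at 16 kHz
logits = model(waveform)
print(logits.shape)                      # torch.Size([1, 2])
```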

Keywords: automatic speech recognition, early Alzheimer’s recognition, mild cognitive impairment, speech impairment

Procedia PDF Downloads 127
35 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class-Unbalanced Data of Diabetes Risk Groups

Authors: Lily Ingsrisawang, Tasanee Nacharoen

Abstract:

Introduction: Problems with unbalanced data sets commonly appear in real-world applications. Due to unequal class distributions, many research papers have found that the performance of existing classifiers tends to be biased towards the majority class. The k-nearest neighbors nonparametric discriminant analysis is one method that has been proposed for classifying unbalanced classes with good performance. Hence, discriminant analysis methods are of interest for investigating misclassification error rates for class-imbalanced data of three diabetes risk groups. Objective: The purpose of this study was to compare the classification performance of parametric and nonparametric discriminant analysis in a three-class classification application on class-imbalanced data of diabetes risk groups. Methods: Data on 599 staff members from a health project in a government hospital in Bangkok were obtained for the classification problem. The staff members were diagnosed into one of three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data, with the variables diabetes risk group, age, gender, cholesterol, and BMI, were analyzed and bootstrapped into 50 and 100 samples of 599 observations each for additional estimation of the misclassification error rate. Each data set was examined for departure from multivariate normality and for equality of the covariance matrices of the three risk groups; both the original data and the bootstrap samples show non-normality and unequal covariance matrices. The parametric linear discriminant function, the quadratic discriminant function, and the nonparametric k-nearest neighbors discriminant function were applied to the 50 and 100 bootstrap samples and to the original data. In finding the optimal classification rule, prior probabilities were set to equal proportions (0.33: 0.33: 0.33) and to unequal proportions with three choices: (0.90: 0.05: 0.05), (0.80: 0.10: 0.10), or (0.70: 0.15: 0.15). Results: The results from the 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach with k = 3 or k = 4 and prior probabilities of {non-risk: risk: diabetic} set to {0.90: 0.05: 0.05} or {0.80: 0.10: 0.10} gave the smallest misclassification error rate. Conclusion: The k-nearest neighbors approach would be suggested for classifying three-class imbalanced data of diabetes risk groups.
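The comparison can be sketched as follows on synthetic stand-in data mimicking the 90:5:5 imbalance: LDA and QDA with unequal priors versus a 3-nearest-neighbors rule, fitted on one bootstrap sample and scored on the original data. Variables, sample generation, and proportions are illustrative only.

```python
# Hedged sketch comparing parametric (LDA, QDA with unequal priors) and
# nonparametric (k-NN, k = 3) rules on a bootstrapped, class-imbalanced
# three-class sample. Data are synthetic stand-ins, not the hospital data.
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(4)
n = 599
y = rng.choice([0, 1, 2], size=n, p=[0.90, 0.05, 0.05])     # non-risk / risk / diabetic
X = rng.standard_normal((n, 4)) + y[:, None]                # 4 predictors, shifted by class

# One bootstrap sample of the same size (sampling with replacement).
idx = rng.integers(0, n, n)
X_boot, y_boot = X[idx], y[idx]

priors = [0.90, 0.05, 0.05]
models = {
    "LDA": LinearDiscriminantAnalysis(priors=priors),
    "QDA": QuadraticDiscriminantAnalysis(priors=priors),
    "3-NN": KNeighborsClassifier(n_neighbors=3),
}
for name, model in models.items():
    model.fit(X_boot, y_boot)
    error = 1.0 - model.score(X, y)          # apparent misclassification error on original data
    print(f"{name}: error rate = {error:.3f}")
```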

Keywords: error rate, bootstrap, diabetes risk groups, k-nearest neighbors

Procedia PDF Downloads 436
34 Automated Feature Extraction and Object-Based Detection from High-Resolution Aerial Photos Based on Machine Learning and Artificial Intelligence

Authors: Mohammed Al Sulaimani, Hamad Al Manhi

Abstract:

With the development of remote sensing technology, the resolution of optical remote sensing images has greatly improved, and such images have become widely available. Numerous detectors have been developed for detecting different types of objects. In the past few years, remote sensing has benefited greatly from deep learning, particularly deep convolutional neural networks (CNNs). Deep learning holds great promise for meeting the challenging needs of remote sensing and solving various problems across different fields and applications. Unmanned aerial systems (UAS) have become the preferred means for most organizations to acquire aerial photos in support of their activities because of their high resolution and accuracy, which make the identification and detection of very small features much easier than with satellite images. This has opened a new era of deep learning in different applications, not only in feature extraction and prediction but also in analysis. This work addresses the capacity of machine learning and deep learning to detect and extract oil leaks from onshore flowlines using high-resolution aerial photos acquired by a UAS fitted with an RGB sensor, in order to support early detection of these leaks and protect the company from the losses they cause and, most importantly, from environmental damage. Two different approaches using different deep learning methods are demonstrated. The first approach focuses on detecting the oil leaks from the raw (unprocessed) aerial photos using a deep learning model called the Single Shot Detector (SSD). The model draws bounding boxes around the leaks, and the results were extremely good. The second approach focuses on detecting the oil leaks from the ortho-mosaicked (georeferenced) images by developing three deep learning models (Mask R-CNN, U-Net, and PSP-Net classifiers). Post-processing is then performed to combine the results of these three models to achieve better detection results and improved accuracy. Although a relatively small amount of data was available for training, the trained models have shown good results in extracting the extent of the oil leaks and obtaining excellent and accurate detection.
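The exact rule used to combine the three segmentation outputs is not specified in the abstract; one plausible post-processing rule, sketched below with random placeholder masks, is a per-pixel majority vote across the three models.

```python
# Hedged sketch: per-pixel majority vote over three binary leak masks
# (stand-ins for Mask R-CNN, U-Net, and PSP-Net predictions). The actual
# combination rule in the study may differ.
import numpy as np

rng = np.random.default_rng(5)
h, w = 512, 512

mask_rcnn = rng.random((h, w)) > 0.5     # placeholder binary leak masks
mask_unet = rng.random((h, w)) > 0.5
mask_psp = rng.random((h, w)) > 0.5

stacked = np.stack([mask_rcnn, mask_unet, mask_psp]).astype(np.uint8)
combined = stacked.sum(axis=0) >= 2      # a pixel is "leak" if at least 2 of 3 models agree

print("leak pixels after voting:", int(combined.sum()))
```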

Keywords: GIS, remote sensing, oil leak detection, machine learning, aerial photos, unmanned aerial systems

Procedia PDF Downloads 34