Search results for: Kazakh speech dataset
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1937

Search results for: Kazakh speech dataset

647 A Multi-Science Study of Modern Synergetic War and Its Information Security Component

Authors: Alexander G. Yushchenko

Abstract:

From a multi-science point of view, we analyze threats to security resulting from globalization of international information space and information and communication aggression of Russia. A definition of Ruschism is formulated as an ideology supporting aggressive actions of modern Russia against the Euro-Atlantic community. Stages of the hybrid war Russia is leading against Ukraine are described, including the elements of subversive activity of the special services, the activation of the military phase and the gradual shift of the focus of confrontation to the realm of information and communication technologies. We reveal an emergence of a threat for democratic states resulting from the destabilizing impact of a target state’s mass media and social networks being exploited by Russian secret services under freedom-of-speech disguise. Thus, we underline the vulnerability of cyber- and information security of the network society in regard of hybrid war. We propose to define the latter a synergetic war. Our analysis is supported with a long-term qualitative monitoring of representation of top state officials on popular TV channels and Facebook. From the memetics point of view, we have detected a destructive psycho-information technology used by the Kremlin, a kind of information catastrophe, the essence of which is explained in detail. In the conclusion, a comprehensive plan for information protection of the public consciousness and mentality of Euro-Atlantic citizens from the aggression of the enemy is proposed.

Keywords: cyber and information security, hybrid war, psycho-information technology, synergetic war, Ruschism

Procedia PDF Downloads 134
646 Parkinson’s Disease Detection Analysis through Machine Learning Approaches

Authors: Muhtasim Shafi Kader, Fizar Ahmed, Annesha Acharjee

Abstract:

Machine learning and data mining are crucial in health care, as well as medical information and detection. Machine learning approaches are now being utilized to improve awareness of a variety of critical health issues, including diabetes detection, neuron cell tumor diagnosis, COVID 19 identification, and so on. Parkinson’s disease is basically a disease for our senior citizens in Bangladesh. Parkinson's Disease indications often seem progressive and get worst with time. People got affected trouble walking and communicating with the condition advances. Patients can also have psychological and social vagaries, nap problems, hopelessness, reminiscence loss, and weariness. Parkinson's disease can happen in both men and women. Though men are affected by the illness at a proportion that is around partial of them are women. In this research, we have to get out the accurate ML algorithm to find out the disease with a predictable dataset and the model of the following machine learning classifiers. Therefore, nine ML classifiers are secondhand to portion study to use machine learning approaches like as follows, Naive Bayes, Adaptive Boosting, Bagging Classifier, Decision Tree Classifier, Random Forest classifier, XBG Classifier, K Nearest Neighbor Classifier, Support Vector Machine Classifier, and Gradient Boosting Classifier are used.

Keywords: naive bayes, adaptive boosting, bagging classifier, decision tree classifier, random forest classifier, XBG classifier, k nearest neighbor classifier, support vector classifier, gradient boosting classifier

Procedia PDF Downloads 129
645 Vision-Based Daily Routine Recognition for Healthcare with Transfer Learning

Authors: Bruce X. B. Yu, Yan Liu, Keith C. C. Chan

Abstract:

We propose to record Activities of Daily Living (ADLs) of elderly people using a vision-based system so as to provide better assistive and personalization technologies. Current ADL-related research is based on data collected with help from non-elderly subjects in laboratory environments and the activities performed are predetermined for the sole purpose of data collection. To obtain more realistic datasets for the application, we recorded ADLs for the elderly with data collected from real-world environment involving real elderly subjects. Motivated by the need to collect data for more effective research related to elderly care, we chose to collect data in the room of an elderly person. Specifically, we installed Kinect, a vision-based sensor on the ceiling, to capture the activities that the elderly subject performs in the morning every day. Based on the data, we identified 12 morning activities that the elderly person performs daily. To recognize these activities, we created a HARELCARE framework to investigate into the effectiveness of existing Human Activity Recognition (HAR) algorithms and propose the use of a transfer learning algorithm for HAR. We compared the performance, in terms of accuracy, and training progress. Although the collected dataset is relatively small, the proposed algorithm has a good potential to be applied to all daily routine activities for healthcare purposes such as evidence-based diagnosis and treatment.

Keywords: daily activity recognition, healthcare, IoT sensors, transfer learning

Procedia PDF Downloads 132
644 Grammatical Forms and Functions in Selected Political Interviews of Nigerian Presidential Aspirants in 2015 General Election

Authors: Temitope Abiodun Balogun

Abstract:

Political interviews are one of the ways by which political office-seekers in Nigeria sell themselves to the electorates. Extant studies have examined the discourse of political interviews from conversational, philosophical, rhetorical, stylistic and pragmatic perspectives with insufficient attention paid to grammatical forms and communicative intentions of the interviews granted by the two presidential aspirants in the 2015 Nigerian general election. This study fills this scholarly gap to unmask their grammatical forms and communicative styles, intention and credibility. The paper adopts Halliday’s Systemic Functional Grammar, specifically interpersonal function coupled with Searle’s Model of Speech Acts Theory as a theoretical framework. A total of six interviews granted by the two presidential aspirants in media serve as the source of data. It is discovered that, in most cases, politicians’ communicative intention is to “pull-down” their political opponents. While declarative and interrogatives are simple, direct and straightforward, the intention is to condemn, lambast and castigate their opponents. This communicative style does not allow the general populace to decipher the political manifestoes of the political aspirants and the party they represent. The paper recommends that before Nigeria can boast of any sustainable growth and development, there is the need for her political office-seekers to adopt effective communication strategies and styles to unveil their intention and manifestoes so that electorates can evaluate their performance after their tenure of office.

Keywords: general election, grammatical forms and function, political interviews, presidential aspirants

Procedia PDF Downloads 161
643 Autism Disease Detection Using Transfer Learning Techniques: Performance Comparison between Central Processing Unit vs. Graphics Processing Unit Functions for Neural Networks

Authors: Mst Shapna Akter, Hossain Shahriar

Abstract:

Neural network approaches are machine learning methods used in many domains, such as healthcare and cyber security. Neural networks are mostly known for dealing with image datasets. While training with the images, several fundamental mathematical operations are carried out in the Neural Network. The operation includes a number of algebraic and mathematical functions, including derivative, convolution, and matrix inversion and transposition. Such operations require higher processing power than is typically needed for computer usage. Central Processing Unit (CPU) is not appropriate for a large image size of the dataset as it is built with serial processing. While Graphics Processing Unit (GPU) has parallel processing capabilities and, therefore, has higher speed. This paper uses advanced Neural Network techniques such as VGG16, Resnet50, Densenet, Inceptionv3, Xception, Mobilenet, XGBOOST-VGG16, and our proposed models to compare CPU and GPU resources. A system for classifying autism disease using face images of an autistic and non-autistic child was used to compare performance during testing. We used evaluation matrices such as Accuracy, F1 score, Precision, Recall, and Execution time. It has been observed that GPU runs faster than the CPU in all tests performed. Moreover, the performance of the Neural Network models in terms of accuracy increases on GPU compared to CPU.

Keywords: autism disease, neural network, CPU, GPU, transfer learning

Procedia PDF Downloads 118
642 Traffic Prediction with Raw Data Utilization and Context Building

Authors: Zhou Yang, Heli Sun, Jianbin Huang, Jizhong Zhao, Shaojie Qiao

Abstract:

Traffic prediction is essential in a multitude of ways in modern urban life. The researchers of earlier work in this domain carry out the investigation chiefly with two major focuses: (1) the accurate forecast of future values in multiple time series and (2) knowledge extraction from spatial-temporal correlations. However, two key considerations for traffic prediction are often missed: the completeness of raw data and the full context of the prediction timestamp. Concentrating on the two drawbacks of earlier work, we devise an approach that can address these issues in a two-phase framework. First, we utilize the raw trajectories to a greater extent through building a VLA table and data compression. We obtain the intra-trajectory features with graph-based encoding and the intertrajectory ones with a grid-based model and the technique of back projection that restore their surrounding high-resolution spatial-temporal environment. To the best of our knowledge, we are the first to study direct feature extraction from raw trajectories for traffic prediction and attempt the use of raw data with the least degree of reduction. In the prediction phase, we provide a broader context for the prediction timestamp by taking into account the information that are around it in the training dataset. Extensive experiments on several well-known datasets have verified the effectiveness of our solution that combines the strength of raw trajectory data and prediction context. In terms of performance, our approach surpasses several state-of-the-art methods for traffic prediction.

Keywords: traffic prediction, raw data utilization, context building, data reduction

Procedia PDF Downloads 128
641 Divergences in Interpreters’ Oral Interpretation among Pentecostal Churches: Sermonic Reflections

Authors: Rufus Olufemi Adebayo, Sylvia Phiwani Zulu

Abstract:

Interpreting in the setting of diverse language and multicultural congregants, is often understood as integrating the content of the message. Preaching, similar to any communication, takes seriously people’s multiple contexts. The one who provides the best insight into understanding “the other”, traditionally speaking could be an interpreter in a multilingual context. Nonetheless, there are reflections in the loss of spiritual communication, translation and interpretive dialogue. No matter how eloquent the preacher is, an interpreter can make or mere the sermon (speech). The sermon that the preacher preaches is not always the one the congregation hears from the interpreter. In other occurrences, however, interpreting can lead not only to distort messages but also to dissatisfied audiences and preacher being overshadowed by the pranks of the interpreter. Using qualitative methodology, this paper explores the challenges and the conventional assumptions about preachers’ interpreter as influenced by spirituality, culture, and language in empirical and theoretical perspectives. An emphasis on the bias translation and the basis of reality that suppresses or devalues the spiritual communication is examined. The result indicates that interpretation of the declaration of guilt, history of congregation, spirituality, attitudes, morals, customs, specific practices of a preacher, education, and the environment form an entangled and misinterpretation. The article concludes by re-examining these qualities and rearticulating them into a preliminary theory for practice, as distinguished from theory, which could possibly enhance the development of more sustainable multilingual interpretation in the South African Pentecostal churches.

Keywords: congregants, divergences, interpreting/translation, language & communication, sermon/preaching

Procedia PDF Downloads 166
640 Grey Wolf Optimization Technique for Predictive Analysis of Products in E-Commerce: An Adaptive Approach

Authors: Shital Suresh Borse, Vijayalaxmi Kadroli

Abstract:

E-commerce industries nowadays implement the latest AI, ML Techniques to improve their own performance and prediction accuracy. This helps to gain a huge profit from the online market. Ant Colony Optimization, Genetic algorithm, Particle Swarm Optimization, Neural Network & GWO help many e-commerce industries for up-gradation of their predictive performance. These algorithms are providing optimum results in various applications, such as stock price prediction, prediction of drug-target interaction & user ratings of similar products in e-commerce sites, etc. In this study, customer reviews will play an important role in prediction analysis. People showing much interest in buying a lot of services& products suggested by other customers. This ultimately increases net profit. In this work, a convolution neural network (CNN) is proposed which further is useful to optimize the prediction accuracy of an e-commerce website. This method shows that CNN is used to optimize hyperparameters of GWO algorithm using an appropriate coding scheme. Accurate model results are verified by comparing them to PSO results whose hyperparameters have been optimized by CNN in Amazon's customer review dataset. Here, experimental outcome proves that this proposed system using the GWO algorithm achieves superior execution in terms of accuracy, precision, recovery, etc. in prediction analysis compared to the existing systems.

Keywords: prediction analysis, e-commerce, machine learning, grey wolf optimization, particle swarm optimization, CNN

Procedia PDF Downloads 113
639 Accentuation Moods of Blaming Utterances in Egyptian Arabic: A Pragmatic Study of Prosodic Focus

Authors: Reda A. H. Mahmoud

Abstract:

This paper investigates the pragmatic meaning of prosodic focus through four accentuation moods of blaming utterances in Egyptian Arabic. Prosodic focus results in various pragmatic meanings when the speaker utters the same blaming expression in different emotional moods: the angry, the mocking, the frustrated, and the informative moods. The main objective of this study is to interpret the meanings of these four accentuation moods in relation to their illocutionary forces and pre-locutionary effects, the integrated features of prosodic focus (e.g., tone movement distributions, pitch accents, lengthening of vowels, deaccentuation of certain syllables/words, and tempo), and the consonance between the former prosodic features and certain lexico-grammatical components to communicate the intentions of the speaker. The data on blaming utterances has been collected via elicitation and pre-recorded material, and the selection of blaming utterances is based on the criteria of lexical and prosodic regularity to be processed and verified by three computer programs, Praat, Speech Analyzer, and Spectrogram Freeware. A dual pragmatic approach is established to interpret expressive blaming utterance and their lexico-grammatical distributions into intonational focus structure units. The pragmatic component of this approach explains the variable psychological attitudes through the expressions of blaming and their effects whereas the analysis of prosodic focus structure is used to describe the intonational contours of blaming utterances and other prosodic features. The study concludes that every accentuation mood has its different prosodic configuration which influences the listener’s interpretation of the pragmatic meanings of blaming utterances.

Keywords: pragmatics, pragmatic interpretation, prosody, prosodic focus

Procedia PDF Downloads 153
638 AutoML: Comprehensive Review and Application to Engineering Datasets

Authors: Parsa Mahdavi, M. Amin Hariri-Ardebili

Abstract:

The development of accurate machine learning and deep learning models traditionally demands hands-on expertise and a solid background to fine-tune hyperparameters. With the continuous expansion of datasets in various scientific and engineering domains, researchers increasingly turn to machine learning methods to unveil hidden insights that may elude classic regression techniques. This surge in adoption raises concerns about the adequacy of the resultant meta-models and, consequently, the interpretation of the findings. In response to these challenges, automated machine learning (AutoML) emerges as a promising solution, aiming to construct machine learning models with minimal intervention or guidance from human experts. AutoML encompasses crucial stages such as data preparation, feature engineering, hyperparameter optimization, and neural architecture search. This paper provides a comprehensive overview of the principles underpinning AutoML, surveying several widely-used AutoML platforms. Additionally, the paper offers a glimpse into the application of AutoML on various engineering datasets. By comparing these results with those obtained through classical machine learning methods, the paper quantifies the uncertainties inherent in the application of a single ML model versus the holistic approach provided by AutoML. These examples showcase the efficacy of AutoML in extracting meaningful patterns and insights, emphasizing its potential to revolutionize the way we approach and analyze complex datasets.

Keywords: automated machine learning, uncertainty, engineering dataset, regression

Procedia PDF Downloads 61
637 Multi-Atlas Segmentation Based on Dynamic Energy Model: Application to Brain MR Images

Authors: Jie Huo, Jonathan Wu

Abstract:

Segmentation of anatomical structures in medical images is essential for scientific inquiry into the complex relationships between biological structure and clinical diagnosis, treatment and assessment. As a method of incorporating the prior knowledge and the anatomical structure similarity between a target image and atlases, multi-atlas segmentation has been successfully applied in segmenting a variety of medical images, including the brain, cardiac, and abdominal images. The basic idea of multi-atlas segmentation is to transfer the labels in atlases to the coordinate of the target image by matching the target patch to the atlas patch in the neighborhood. However, this technique is limited by the pairwise registration between target image and atlases. In this paper, a novel multi-atlas segmentation approach is proposed by introducing a dynamic energy model. First, the target is mapped to each atlas image by minimizing the dynamic energy function, then the segmentation of target image is generated by weighted fusion based on the energy. The method is tested on MICCAI 2012 Multi-Atlas Labeling Challenge dataset which includes 20 target images and 15 atlases images. The paper also analyzes the influence of different parameters of the dynamic energy model on the segmentation accuracy and measures the dice coefficient by using different feature terms with the energy model. The highest mean dice coefficient obtained with the proposed method is 0.861, which is competitive compared with the recently published method.

Keywords: brain MRI segmentation, dynamic energy model, multi-atlas segmentation, energy minimization

Procedia PDF Downloads 336
636 GeneNet: Temporal Graph Data Visualization for Gene Nomenclature and Relationships

Authors: Jake Gonzalez, Tommy Dang

Abstract:

This paper proposes a temporal graph approach to visualize and analyze the evolution of gene relationships and nomenclature over time. An interactive web-based tool implements this temporal graph, enabling researchers to traverse a timeline and observe coupled dynamics in network topology and naming conventions. Analysis of a real human genomic dataset reveals the emergence of densely interconnected functional modules over time, representing groups of genes involved in key biological processes. For example, the antimicrobial peptide DEFA1A3 shows increased connections to related alpha-defensins involved in infection response. Tracking degree and betweenness centrality shifts over timeline iterations also quantitatively highlight the reprioritization of certain genes’ topological importance as knowledge advances. Examination of the CNR1 gene encoding the cannabinoid receptor CB1 demonstrates changing synonymous relationships and consolidating naming patterns over time, reflecting its unique functional role discovery. The integrated framework interconnecting these topological and nomenclature dynamics provides richer contextual insights compared to isolated analysis methods. Overall, this temporal graph approach enables a more holistic study of knowledge evolution to elucidate complex biology.

Keywords: temporal graph, gene relationships, nomenclature evolution, interactive visualization, biological insights

Procedia PDF Downloads 61
635 A Mathematical Agent-Based Model to Examine Two Patterns of Language Change

Authors: Gareth Baxter

Abstract:

We use a mathematical model of language change to examine two recently observed patterns of language change: one in which most speakers change gradually, following the mean of the community change, and one in which most individuals use predominantly one variant or another, and change rapidly if they change at all. The model is based on Croft’s Utterance Selection account of language change, which views language change as an evolutionary process, in which different variants (different ‘ways of saying the same thing’) compete for usage in a population of speakers. Language change occurs when a new variant replaces an older one as the convention within a given population. The present model extends a previous simpler model to include effects related to speaker aging and interspeaker variation in behaviour. The two patterns of individual change (one more centralized and the other more polarized) were recently observed in historical language changes, and it was further observed that slower changes were more associated with the centralized pattern, while quicker changes were more polarized. Our model suggests that the two patterns of change can be explained by different balances between the preference of speakers to use one variant over another and the degree of accommodation to (propensity to adapt towards) other speakers. The correlation with the rate of change appears naturally in our model, and results from the fact that both differential weighting of variants and the degree of accommodation affect the time for change to occur, while also determining the patterns of change. This work represents part of an ongoing effort to examine phenomena in language change through the use of mathematical models. This offers another way to evaluate qualitative explanations that cannot be practically tested (or cannot be tested at all) in a real-world, large-scale speech community.

Keywords: agent based modeling, cultural evolution, language change, social behavior modeling, social influence

Procedia PDF Downloads 235
634 Study on the Relationship between the Urban Geography and Urban Agglomeration to the Effects of Carbon Emissions

Authors: Peng-Shao Chen, Yen-Jong Chen

Abstract:

In recent years, global warming, the dramatic change in energy prices and the exhaustion of natural resources illustrated that energy-related topic cannot be ignored. Despite the relationship between the cities and CO₂ emissions has been extensively studied in recent years, little attention has been paid to differences in the geographical location of the city. However, the geographical climate has a great impact on lifestyle from city to city, such as the type of buildings, the major industry of the city, etc. Therefore, the paper instigates empirically the effects of kinds of urban factors and CO₂ emissions with consideration of the different geographic, climatic zones which cities are located. Using the regression model and a dataset of urban agglomeration in East Asia cities with over one million population, including 2005, 2010, and 2015 three years, the findings suggest that the impact of urban factors on CO₂ emissions vary with the latitude of the cities. Surprisingly, all kinds of urban factors, including the urban population, the share of GDP in service industry, per capita income, and others, have different level of impact on the cities locate in the tropical climate zone and temperate climate zone. The results of the study analyze the impact of different urban factors on CO₂ emissions in urban area with different geographical climate zones. These findings will be helpful for the formulation of relevant policies for urban planners and policy makers in different regions.

Keywords: carbon emissions, urban agglomeration, urban factor, urban geography

Procedia PDF Downloads 267
633 Burnout Recognition for Call Center Agents by Using Skin Color Detection with Hand Poses

Authors: El Sayed A. Sharara, A. Tsuji, K. Terada

Abstract:

Call centers have been expanding and they have influence on activation in various markets increasingly. A call center’s work is known as one of the most demanding and stressful jobs. In this paper, we propose the fatigue detection system in order to detect burnout of call center agents in the case of a neck pain and upper back pain. Our proposed system is based on the computer vision technique combined skin color detection with the Viola-Jones object detector. To recognize the gesture of hand poses caused by stress sign, the YCbCr color space is used to detect the skin color region including face and hand poses around the area related to neck ache and upper back pain. A cascade of clarifiers by Viola-Jones is used for face recognition to extract from the skin color region. The detection of hand poses is given by the evaluation of neck pain and upper back pain by using skin color detection and face recognition method. The system performance is evaluated using two groups of dataset created in the laboratory to simulate call center environment. Our call center agent burnout detection system has been implemented by using a web camera and has been processed by MATLAB. From the experimental results, our system achieved 96.3% for upper back pain detection and 94.2% for neck pain detection.

Keywords: call center agents, fatigue, skin color detection, face recognition

Procedia PDF Downloads 294
632 Quality Parameters of Offset Printing Wastewater

Authors: Kiurski S. Jelena, Kecić S. Vesna, Aksentijević M. Snežana

Abstract:

Samples of tap and wastewater were collected in three offset printing facilities in Novi Sad, Serbia. Ten physicochemical parameters were analyzed within all collected samples: pH, conductivity, m - alkalinity, p - alkalinity, acidity, carbonate concentration, hydrogen carbonate concentration, active oxygen content, chloride concentration and total alkali content. All measurements were conducted using the standard analytical and instrumental methods. Comparing the obtained results for tap water and wastewater, a clear quality difference was noticeable, since all physicochemical parameters were significantly higher within wastewater samples. The study also involves the application of simple linear regression analysis on the obtained dataset. By using software package ORIGIN 5 the pH value was mutually correlated with other physicochemical parameters. Based on the obtained values of Pearson coefficient of determination a strong positive correlation between chloride concentration and pH (r = -0.943), as well as between acidity and pH (r = -0.855) was determined. In addition, statistically significant difference was obtained only between acidity and chloride concentration with pH values, since the values of parameter F (247.634 and 182.536) were higher than Fcritical (5.59). In this way, results of statistical analysis highlighted the most influential parameter of water contamination in offset printing, in the form of acidity and chloride concentration. The results showed that variable dependence could be represented by the general regression model: y = a0 + a1x+ k, which further resulted with matching graphic regressions.

Keywords: pollution, printing industry, simple linear regression analysis, wastewater

Procedia PDF Downloads 235
631 A Bibliometric Analysis on Filter Bubble

Authors: Misbah Fatma, Anam Saiyeda

Abstract:

This analysis charts the introduction and expansion of research into the filter bubble phenomena over the last 10 years using a large dataset of academic publications. This bibliometric study demonstrates how interdisciplinary filter bubble research is. The identification of key authors and organizations leading the filter bubble study sheds information on collaborative networks and knowledge transfer. Relevant papers are organized based on themes including algorithmic bias, polarisation, social media, and ethical implications through a systematic examination of the literature. In order to shed light on how these patterns have changed over time, the study plots their historical history. The study also looks at how research is distributed globally, showing geographic patterns and discrepancies in scholarly output. The results of this bibliometric analysis let us fully comprehend the development and reach of filter bubble research. This study offers insights into the ongoing discussion surrounding information personalization and its implications for societal discourse, democratic participation, and the potential risks to an informed citizenry by exposing dominant themes, interdisciplinary collaborations, and geographic patterns. In order to solve the problems caused by filter bubbles and to advance a more diverse and inclusive information environment, this analysis is essential for scholars and researchers.

Keywords: bibliometric analysis, social media, social networking, algorithmic personalization, self-selection, content moderation policies and limited access to information, recommender system and polarization

Procedia PDF Downloads 118
630 Voice Liveness Detection Using Kolmogorov Arnold Networks

Authors: Arth J. Shah, Madhu R. Kamble

Abstract:

Voice biometric liveness detection is customized to certify an authentication process of the voice data presented is genuine and not a recording or synthetic voice. With the rise of deepfakes and other equivalently sophisticated spoofing generation techniques, it’s becoming challenging to ensure that the person on the other end is a live speaker or not. Voice Liveness Detection (VLD) system is a group of security measures which detect and prevent voice spoofing attacks. Motivated by the recent development of the Kolmogorov-Arnold Network (KAN) based on the Kolmogorov-Arnold theorem, we proposed KAN for the VLD task. To date, multilayer perceptron (MLP) based classifiers have been used for the classification tasks. We aim to capture not only the compositional structure of the model but also to optimize the values of univariate functions. This study explains the mathematical as well as experimental analysis of KAN for VLD tasks, thereby opening a new perspective for scientists to work on speech and signal processing-based tasks. This study emerges as a combination of traditional signal processing tasks and new deep learning models, which further proved to be a better combination for VLD tasks. The experiments are performed on the POCO and ASVSpoof 2017 V2 database. We used Constant Q-transform, Mel, and short-time Fourier transform (STFT) based front-end features and used CNN, BiLSTM, and KAN as back-end classifiers. The best accuracy is 91.26 % on the POCO database using STFT features with the KAN classifier. In the ASVSpoof 2017 V2 database, the lowest EER we obtained was 26.42 %, using CQT features and KAN as a classifier.

Keywords: Kolmogorov Arnold networks, multilayer perceptron, pop noise, voice liveness detection

Procedia PDF Downloads 40
629 Hand Gesture Recognition for Sign Language: A New Higher Order Fuzzy HMM Approach

Authors: Saad M. Darwish, Magda M. Madbouly, Murad B. Khorsheed

Abstract:

Sign Languages (SL) are the most accomplished forms of gestural communication. Therefore, their automatic analysis is a real challenge, which is interestingly implied to their lexical and syntactic organization levels. Hidden Markov models (HMM’s) have been used prominently and successfully in speech recognition and, more recently, in handwriting recognition. Consequently, they seem ideal for visual recognition of complex, structured hand gestures such as are found in sign language. In this paper, several results concerning static hand gesture recognition using an algorithm based on Type-2 Fuzzy HMM (T2FHMM) are presented. The features used as observables in the training as well as in the recognition phases are based on Singular Value Decomposition (SVD). SVD is an extension of Eigen decomposition to suit non-square matrices to reduce multi attribute hand gesture data to feature vectors. SVD optimally exposes the geometric structure of a matrix. In our approach, we replace the basic HMM arithmetic operators by some adequate Type-2 fuzzy operators that permits us to relax the additive constraint of probability measures. Therefore, T2FHMMs are able to handle both random and fuzzy uncertainties existing universally in the sequential data. Experimental results show that T2FHMMs can effectively handle noise and dialect uncertainties in hand signals besides a better classification performance than the classical HMMs. The recognition rate of the proposed system is 100% for uniform hand images and 86.21% for cluttered hand images.

Keywords: hand gesture recognition, hand detection, type-2 fuzzy logic, hidden Markov Model

Procedia PDF Downloads 462
628 Melanoma and Non-Melanoma, Skin Lesion Classification, Using a Deep Learning Model

Authors: Shaira L. Kee, Michael Aaron G. Sy, Myles Joshua T. Tan, Hezerul Abdul Karim, Nouar AlDahoul

Abstract:

Skin diseases are considered the fourth most common disease, with melanoma and non-melanoma skin cancer as the most common type of cancer in Caucasians. The alarming increase in Skin Cancer cases shows an urgent need for further research to improve diagnostic methods, as early diagnosis can significantly improve the 5-year survival rate. Machine Learning algorithms for image pattern analysis in diagnosing skin lesions can dramatically increase the accuracy rate of detection and decrease possible human errors. Several studies have shown the diagnostic performance of computer algorithms outperformed dermatologists. However, existing methods still need improvements to reduce diagnostic errors and generate efficient and accurate results. Our paper proposes an ensemble method to classify dermoscopic images into benign and malignant skin lesions. The experiments were conducted using the International Skin Imaging Collaboration (ISIC) image samples. The dataset contains 3,297 dermoscopic images with benign and malignant categories. The results show improvement in performance with an accuracy of 88% and an F1 score of 87%, outperforming other existing models such as support vector machine (SVM), Residual network (ResNet50), EfficientNetB0, EfficientNetB4, and VGG16.

Keywords: deep learning - VGG16 - efficientNet - CNN – ensemble – dermoscopic images - melanoma

Procedia PDF Downloads 81
627 When Sex Matters: A Comparative Generalized Structural Equation Model (GSEM) for the Determinants of Stunting Amongst Under-fives in Uganda

Authors: Vallence Ngabo M., Leonard Atuhaire, Peter Clever Rutayisire

Abstract:

The main aim of this study was to establish the differences in both the determinants of stunting and the causal mechanism through which the identified determinants influence stunting amongst male and female under-fives in Uganda. Literature shows that male children below the age of five years are at a higher risk of being stunted than their female counterparts. Specifically, studies in Uganda indicate that being a male child is positively associated with stunting, while being a female is negatively associated with stunting. Data for 904 males and 829 females under-fives was extracted form UDHS-2016 survey dataset. Key variables for this study were identified and used in generating relevant models and paths. Structural equation modeling techniques were used in their generalized form (GSEM). The generalized nature necessitated specifying both the family and link functions for each response variable in the system of the model. The sex of the child (b4) was used as a grouping factor and the height for age (HAZ) scores were used to construct the status for stunting of under-fives. The estimated models and path clearly indicated that the set of underlying factors that influence male and female under-fives respectively was different and the path through which they influence stunting was different. However, some of the determinants that influenced stunting amongst male under-fives also influenced stunting amongst the female under-fives. To reduce the stunting problem to the desirable state, it is important to consider the multifaceted and complex nature of the risk factors that influence stunting amongst the under-fives but, more importantly, consider the different sex-specific factors and their causal mechanism or paths through which they influence stunting.

Keywords: stunting, underfives, sex of the child, GSEM, causal mechanism

Procedia PDF Downloads 140
626 Influence of Hearing Aids on Non-medically Treatable Deafness

Authors: Donatien Niragira

Abstract:

The progress of technology creates new expectations for patients. The world of deafness is no exception. In recent years, there have been considerable advances in the field of technologies aimed at assisting failing hearing. According to the usual medical vocabulary, hearing aids are actually orthotics. They do not replace an organ but compensate for a functional impairment. The Amplifier Hearing amplification is useful for a large number of people with hearing loss. Hearing aids restore speech audibility. However, their benefits vary depending on the quality of residual hearing. The hearing aid is not a "cure" for deafness. It cannot correct all affected hearing abilities. It should be considered as an aid to communication. The urge to judge from the audiogram alone should be resisted here, as audiometry only indicates the ability to detect non-verbal sounds. To prevent hearing aids from ending up in the drawer, it is important to ensure that the patient's disability situations justify the use of this type of orthosis. If the problems of receptive Pre-fitting counseling are crucial: the person with hearing loss must be informed of the advantages and disadvantages of amplification in his or her case. Their expectations must be realistic. They also need to be aware that the adaptation process requires a good deal of patience and perseverance. They should be informed about the various models and types of hearing aids, including all the aesthetic, functional and financial considerations. If the person's motivation "survives" pre-fitting counseling, we are in the presence of a good candidate for amplification. In addition to its relevance, it shows that the results found in this study significantly improve the quality of audibility in the patient, from where this technology must be made accessible everywhere in the world.

Keywords: auditives protheses, hearing, aids, no medicaly treatable deafnes

Procedia PDF Downloads 57
625 Unattended Crowdsensing Method to Monitor the Quality Condition of Dirt Roads

Authors: Matias Micheletto, Rodrigo Santos, Sergio F. Ochoa

Abstract:

In developing countries, the most roads in rural areas are dirt road. They require frequent maintenance since are affected by erosive events, such as rain or wind, and the transit of heavy-weight trucks and machinery. Early detection of damages on the road condition is a key aspect, since it allows to reduce the main-tenance time and cost, and also the limitations for other vehicles to travel through. Most proposals that help address this problem require the explicit participation of drivers, a permanent internet connection, or important instrumentation in vehicles or roads. These constraints limit the suitability of these proposals when applied into developing regions, like in Latin America. This paper proposes an alternative method, based on unattended crowdsensing, to determine the quality of dirt roads in rural areas. This method involves the use of a mobile application that complements the road condition surveys carried out by organizations in charge of the road network maintenance, giving them early warnings about road areas that could be requiring maintenance. Drivers can also take advantage of the early warnings while they move through these roads. The method was evaluated using information from a public dataset. Although they are preliminary, the results indicate the proposal is potentially suitable to provide awareness about dirt roads condition to drivers, transportation authority and road maintenance companies.

Keywords: dirt roads automatic quality assessment, collaborative system, unattended crowdsensing method, roads quality awareness provision

Procedia PDF Downloads 199
624 Teaching Children with Autism Spectrum Disorder Using Virtual Reality: Exploratory Study

Authors: Abdiwahab Guled

Abstract:

Autism spectrum disorder (ASD) is a neurodevelopmental disorder that emanates from a broad range of conditions, which affect the communication skills, social skills. It causes restrictive and repetitive behaviors to individuals. The number of children with ASD is an increasing prevalence around the world. Virtual reality (VR) is an assistive technology, which puts the learner in an immersive learning environment. It allows the learner to interact with that environment in a seemingly real or physical way using special electronic equipment, such as headsets. This exploratory study examines the potential benefits that VR may provide to improving the communication skills of children with ASD. Educating a child with ASD is challenging because access to services, resources, and support for autistic children is inadequate. Therefore, this study intends to investigate the challenges of teaching children with ASD and how VR might help teachers to improve the communication skills of these children with ASD. Online research and literature review were used as a method to gather previously published studies to identify the research gap and provide the groundwork for future studies. Results show that VR offers potential benefits to improving the communication skills of children with ASD but there is a gap in our understanding of the functionalities of all the features of VR technology and how we can utilize it to improve the communication skills of children with ASD. Communication is a broad subject and it is impossible for one study to evidently define the speech challenges of autistic children and provide an irrefutable solution. Therefore, this study proposes further research to dissect how can VR be used to improve the different communication challenges that impede the everyday functioning of autistic children.

Keywords: Autism spectrum disorder (ASD), autistic, Asperger, Disorder-Not Otherwise Specified (PDD-NOS), virtual reality (VR).

Procedia PDF Downloads 115
623 The Application of AI in Developing Assistive Technologies for Non-Verbal Individuals with Autism

Authors: Ferah Tesfaye Admasu

Abstract:

Autism Spectrum Disorder (ASD) often presents significant communication challenges, particularly for non-verbal individuals who struggle to express their needs and emotions effectively. Assistive technologies (AT) have emerged as vital tools in enhancing communication abilities for this population. Recent advancements in artificial intelligence (AI) hold the potential to revolutionize the design and functionality of these technologies. This study explores the application of AI in developing intelligent, adaptive, and user-centered assistive technologies for non-verbal individuals with autism. Through a review of current AI-driven tools, including speech-generating devices, predictive text systems, and emotion-recognition software, this research investigates how AI can bridge communication gaps, improve engagement, and support independence. Machine learning algorithms, natural language processing (NLP), and facial recognition technologies are examined as core components in creating more personalized and responsive communication aids. The study also discusses the challenges and ethical considerations involved in deploying AI-based AT, such as data privacy and the risk of over-reliance on technology. Findings suggest that integrating AI into assistive technologies can significantly enhance the quality of life for non-verbal individuals with autism, providing them with greater opportunities for social interaction and participation in daily activities. However, continued research and development are needed to ensure these technologies are accessible, affordable, and culturally sensitive.

Keywords: artificial intelligence, autism spectrum disorder, non-verbal communication, assistive technology, machine learning

Procedia PDF Downloads 19
622 A Machine Learning Approach for Earthquake Prediction in Various Zones Based on Solar Activity

Authors: Viacheslav Shkuratskyy, Aminu Bello Usman, Michael O’Dea, Saifur Rahman Sabuj

Abstract:

This paper examines relationships between solar activity and earthquakes; it applied machine learning techniques: K-nearest neighbour, support vector regression, random forest regression, and long short-term memory network. Data from the SILSO World Data Center, the NOAA National Center, the GOES satellite, NASA OMNIWeb, and the United States Geological Survey were used for the experiment. The 23rd and 24th solar cycles, daily sunspot number, solar wind velocity, proton density, and proton temperature were all included in the dataset. The study also examined sunspots, solar wind, and solar flares, which all reflect solar activity and earthquake frequency distribution by magnitude and depth. The findings showed that the long short-term memory network model predicts earthquakes more correctly than the other models applied in the study, and solar activity is more likely to affect earthquakes of lower magnitude and shallow depth than earthquakes of magnitude 5.5 or larger with intermediate depth and deep depth.

Keywords: k-nearest neighbour, support vector regression, random forest regression, long short-term memory network, earthquakes, solar activity, sunspot number, solar wind, solar flares

Procedia PDF Downloads 73
621 Velocity Profiles of Vowel Perception by Javanese and Sundanese English Language Learners

Authors: Arum Perwitasari

Abstract:

Learning L2 sounds is influenced by the first language (L1) sound system. This current study seeks to examine how the listeners with a different L1 vowel system perceive L2 sounds. The fact that English has a bigger number of vowel inventory than Javanese and Sundanese L1 might cause problems for Javanese and Sundanese English language learners perceiving English sounds. To reveal the L2 sound perception over time, we measured the mouse trajectories related to the hand movements made by Javanese and Sundanese language learners, two of Indonesian local languages. Do the Javanese and Sundanese listeners show higher velocity than the English listeners when they perceive English vowels which are similar and new to their L1 system? The study aims to map the patterns of real-time processing through compatible hand movements to reveal any uncertainties when making selections. The results showed that the Javanese listeners exhibited significantly slower velocity values than the English listeners for similar vowels /I, ɛ, ʊ/ in the 826-1200ms post stimulus. Unlike the Javanese, the Sundanese listeners showed slow velocity values except for similar vowel /ʊ/. For the perception of new vowels /i:, æ, ɜ:, ʌ, ɑː, u:, ɔ:/, the Javanese listeners showed slower velocity in making the lexical decision. In contrast, the Sundanese listeners showed slow velocity only for vowels /ɜ:, ɔ:, æ, I/ indicating that these vowels are hard to perceive. Our results fit well with the second language model representing how the L1 vowel system influences the L2 sound perception.

Keywords: velocity profiles, EFL learners, speech perception, experimental linguistics

Procedia PDF Downloads 217
620 Defect Classification of Hydrogen Fuel Pressure Vessels using Deep Learning

Authors: Dongju Kim, Youngjoo Suh, Hyojin Kim, Gyeongyeong Kim

Abstract:

Acoustic Emission Testing (AET) is widely used to test the structural integrity of an operational hydrogen storage container, and clustering algorithms are frequently used in pattern recognition methods to interpret AET results. However, the interpretation of AET results can vary from user to user as the tuning of the relevant parameters relies on the user's experience and knowledge of AET. Therefore, it is necessary to use a deep learning model to identify patterns in acoustic emission (AE) signal data that can be used to classify defects instead. In this paper, a deep learning-based model for classifying the types of defects in hydrogen storage tanks, using AE sensor waveforms, is proposed. As hydrogen storage tanks are commonly constructed using carbon fiber reinforced polymer composite (CFRP), a defect classification dataset is collected through a tensile test on a specimen of CFRP with an AE sensor attached. The performance of the classification model, using one-dimensional convolutional neural network (1-D CNN) and synthetic minority oversampling technique (SMOTE) data augmentation, achieved 91.09% accuracy for each defect. It is expected that the deep learning classification model in this paper, used with AET, will help in evaluating the operational safety of hydrogen storage containers.

Keywords: acoustic emission testing, carbon fiber reinforced polymer composite, one-dimensional convolutional neural network, smote data augmentation

Procedia PDF Downloads 93
619 Multimodal Deep Learning for Human Activity Recognition

Authors: Ons Slimene, Aroua Taamallah, Maha Khemaja

Abstract:

In recent years, human activity recognition (HAR) has been a key area of research due to its diverse applications. It has garnered increasing attention in the field of computer vision. HAR plays an important role in people’s daily lives as it has the ability to learn advanced knowledge about human activities from data. In HAR, activities are usually represented by exploiting different types of sensors, such as embedded sensors or visual sensors. However, these sensors have limitations, such as local obstacles, image-related obstacles, sensor unreliability, and consumer concerns. Recently, several deep learning-based approaches have been proposed for HAR and these approaches are classified into two categories based on the type of data used: vision-based approaches and sensor-based approaches. This research paper highlights the importance of multimodal data fusion from skeleton data obtained from videos and data generated by embedded sensors using deep neural networks for achieving HAR. We propose a deep multimodal fusion network based on a twostream architecture. These two streams use the Convolutional Neural Network combined with the Bidirectional LSTM (CNN BILSTM) to process skeleton data and data generated by embedded sensors and the fusion at the feature level is considered. The proposed model was evaluated on a public OPPORTUNITY++ dataset and produced a accuracy of 96.77%.

Keywords: human activity recognition, action recognition, sensors, vision, human-centric sensing, deep learning, context-awareness

Procedia PDF Downloads 101
618 Glaucoma Detection in Retinal Tomography Using the Vision Transformer

Authors: Sushish Baral, Pratibha Joshi, Yaman Maharjan

Abstract:

Glaucoma is a chronic eye condition that causes vision loss that is irreversible. Early detection and treatment are critical to prevent vision loss because it can be asymptomatic. For the identification of glaucoma, multiple deep learning algorithms are used. Transformer-based architectures, which use the self-attention mechanism to encode long-range dependencies and acquire extremely expressive representations, have recently become popular. Convolutional architectures, on the other hand, lack knowledge of long-range dependencies in the image due to their intrinsic inductive biases. The aforementioned statements inspire this thesis to look at transformer-based solutions and investigate the viability of adopting transformer-based network designs for glaucoma detection. Using retinal fundus images of the optic nerve head to develop a viable algorithm to assess the severity of glaucoma necessitates a large number of well-curated images. Initially, data is generated by augmenting ocular pictures. After that, the ocular images are pre-processed to make them ready for further processing. The system is trained using pre-processed images, and it classifies the input images as normal or glaucoma based on the features retrieved during training. The Vision Transformer (ViT) architecture is well suited to this situation, as it allows the self-attention mechanism to utilise structural modeling. Extensive experiments are run on the common dataset, and the results are thoroughly validated and visualized.

Keywords: glaucoma, vision transformer, convolutional architectures, retinal fundus images, self-attention, deep learning

Procedia PDF Downloads 191