Search results for: collecting speech emotion dataset
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2791

2521 A Semantic Analysis of Modal Verbs in Barak Obama’s 2012 Presidential Campaign Speech

Authors: Kais A. Kadhim

Abstract:

This paper is a semantic analysis of the English modals in Obama’s speeches. The main objective of this study is to analyze modal auxiliaries identified in selected speeches of Obama’s campaign based on Coates’ (1983) semantic clusters. A total of fifteen speeches from Obama’s campaign were selected as the primary data, and the modal auxiliaries selected for analysis include will, would, can, could, should, must, ought, shall, may and might. All the modal auxiliaries taken from the speeches of Barack Obama were analyzed based on the framework of Coates’ semantic clusters. This analytical framework was applied to examine how modal auxiliaries are used to persuade people in Obama’s campaign speeches. The findings reveal that modals of intention, prediction and futurity and modals of possibility, ability and permission are the most frequently used in Obama’s campaign speeches.

Keywords: modals, meaning, persuasion, speech

Procedia PDF Downloads 379
2520 Cross-Cultural Pragmatics: Apology Strategies by Libyans

Authors: Ahmed Elgadri

Abstract:

In the last thirty years, studies on cross-cultural pragmatics in general and apology strategies in particular have focused on Western and East Asian societies. Little research has investigated speech act production by speakers of Arabic dialects. Therefore, this study investigated the apology strategies used by Libyan Arabic speakers using an online Discourse Completion Task (DCT) questionnaire. The DCT consisted of six situations covering different social contexts. The survey was written in the Libyan Arabic dialect to elicit vernacular speech as much as possible. The participants were 25 Libyan nationals, 12 females and 13 males. Also, to get a deeper understanding of the motivation behind the use of certain strategies, the researcher interviewed four participants, again in the Libyan Arabic dialect. The results revealed a high use of IFID, offer of repair, and explanation. Although this might support the universality claim of speech act strategies, it was clear that cultural norms and religion significantly determined the choice of apology strategies. This led to the discovery of new culture-specific strategies, as outlined later in this paper. This study gives an insight into politeness strategies in Libyan society, and it is hoped that it will contribute to the field of cross-cultural pragmatics.

Keywords: apologies, cross-cultural pragmatics, language and culture, Libyan Arabic, politeness, pragmatics, socio-pragmatics, speech acts

Procedia PDF Downloads 126
2519 An Investigation of the Effects of Emotional Experience Induction on Mirror Neurons System Activity with Regard to Spectrum of Depressive Symptoms

Authors: Elyas Akbari, Jafar Hasani, Newsha Dehestani, Mohammad Khaleghi, Alireza Moradi

Abstract:

The aim of the present study was to assess the effect of emotional experience induction on mirror neuron system (MNS) activity with regard to the spectrum of depressive symptoms. For this purpose, in the first stage, 449 students of Kharazmi University of Tehran were selected randomly and completed the second version of the Beck Depression Inventory (BDI-II). Then, 36 students with standard Z-scores equal to or above +1.5 or equal to or below -1.5 were selected to form two groups with high and low spectra of depressive symptoms. In the next stage, the baseline activity of the MNS (mu wave) was recorded by electroencephalography (EEG) before the positive and negative emotional video clips were presented. The findings related to emotion induction (neutral, negative and positive emotion) demonstrated that the activity of the recorded mirror neuron areas differed significantly between the depressive and non-depressive groups. These findings suggest that the processing of negative emotions in depressive individuals probably arises because the mirror neurons in the motor cortex match the activity of cognitive regions to the person’s schema. Considering the results of the present study, it could be said that the MNS provides a substrate where emotional disorders can be studied and evaluated.

Keywords: emotional experiences, mirror neurons, depressive symptoms, negative and positive emotion

Procedia PDF Downloads 333
2518 OPEN-EmoRec-II-A Multimodal Corpus of Human-Computer Interaction

Authors: Stefanie Rukavina, Sascha Gruss, Steffen Walter, Holger Hoffmann, Harald C. Traue

Abstract:

OPEN-EmoRecII is an open multimodal corpus with experimentally induced emotions. In the first half of the experiment, emotions were induced with standardized picture material, and in the second half during a human-computer interaction (HCI), realized with a wizard-of-oz design. The induced emotions are based on the dimensional theory of emotions (valence, arousal and dominance). With these emotional sequences, recorded as multimodal data (mimic reactions, speech, audio and physiological reactions) in a naturalistic-like HCI environment, one can improve classification methods on a multimodal level. This database is the result of an HCI experiment, for which 30 subjects in total agreed to a publication of their data, including the video material, for research purposes. The now available open corpus contains the following sensor signals: video, audio, physiology (SCL, respiration, BVP, EMG Corrugator supercilii, EMG Zygomaticus Major) and mimic annotations.

Keywords: open multimodal emotion corpus, annotated labels, intelligent interaction

Procedia PDF Downloads 386
2517 Design and Implementation a Platform for Adaptive Online Learning Based on Fuzzy Logic

Authors: Budoor Al Abid

Abstract:

Educational systems are increasingly provided as open online services, providing guidance and support for individual learners. To adapt the learning systems, a proper evaluation must be made. This paper builds the evaluation model Fuzzy C Means Adaptive System (FCMAS), based on data mining techniques, to assess the difficulty of the questions. The following steps are implemented. First, a dataset from an international online learning system (slepemapy.cz) is used; it contains over 1,300,000 records with 9 features covering student, question and answer information with feedback evaluation. Next, normalization is applied as a preprocessing step. Then the FCM clustering algorithm is used to adapt the difficulty of the questions. The result is data labeled with three clusters (easy, intermediate, difficult), each question being assigned to the cluster in which it has the highest weight. The FCM algorithm labels all the questions one by one. Then a Random Forest (RF) classifier is built on the clustered dataset, using 70% of the dataset for training and 30% for testing; the model achieves a 99.9% accuracy rate. This approach improves the adaptive e-learning system because it depends on student behavior and gives more accurate evaluation results than a system that depends on feedback only.
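The clustering step described in this abstract can be illustrated with a minimal sketch. It is not the FCMAS implementation: the one-dimensional `error_rates` values, the initial centers and the fuzzifier `m=2.0` are assumptions chosen for illustration, and the Random Forest stage is omitted.

```python
def fcm_1d(values, centers, m=2.0, iters=50):
    """Fuzzy C-Means on 1-D data; returns (centers, memberships)."""
    centers = list(centers)
    u = []
    for _ in range(iters):
        # Membership update: u_k = 1 / sum_j (d_k / d_j)^(2 / (m - 1))
        u = []
        for x in values:
            d = [abs(x - c) + 1e-12 for c in centers]  # avoid division by zero
            row = [
                1.0 / sum((d[k] / d[j]) ** (2.0 / (m - 1.0)) for j in range(len(d)))
                for k in range(len(d))
            ]
            u.append(row)
        # Center update: mean of the data weighted by membership^m
        for k in range(len(centers)):
            weights = [row[k] ** m for row in u]
            centers[k] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return centers, u


def hard_labels(u, names):
    """Assign each item to the cluster in which it has the highest membership."""
    return [names[max(range(len(row)), key=row.__getitem__)] for row in u]


# Hypothetical per-question error rates (not taken from the slepemapy.cz data).
error_rates = [0.05, 0.08, 0.10, 0.45, 0.50, 0.55, 0.90, 0.92, 0.95]
centers, u = fcm_1d(error_rates, centers=[0.0, 0.5, 1.0])
labels = hard_labels(u, ["easy", "intermediate", "difficult"])
```

Each row of `u` holds the three membership weights of one question, and `hard_labels` keeps only the largest, which is one plausible reading of labeling by highest weight.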

Keywords: machine learning, adaptive, fuzzy logic, data mining

Procedia PDF Downloads 166
2516 Robust Features for Impulsive Noisy Speech Recognition Using Relative Spectral Analysis

Authors: Hajer Rahali, Zied Hajaiej, Noureddine Ellouze

Abstract:

The goal of speech parameterization is to extract the relevant information about what is being spoken from the audio signal. In speech recognition systems, Mel-Frequency Cepstral Coefficients (MFCC) and Relative Spectral Mel-Frequency Cepstral Coefficients (RASTA-MFCC) are the two main techniques used. This paper presents some modifications to the original MFCC method. The effectiveness of the proposed changes to MFCC, called Modified Function Cepstral Coefficients (MODFCC), was tested and compared against the original MFCC and RASTA-MFCC features. Prosodic features such as jitter and shimmer are added to the baseline spectral features. The above-mentioned techniques were tested with impulsive signals under various noisy conditions within the AURORA databases.
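As a rough illustration of the jitter and shimmer features mentioned in this abstract, the sketch below uses the common "local" definition: the mean absolute difference between consecutive cycles divided by the mean value. The abstract does not say which definition the authors used, so this choice, the function name and the sample values are all assumptions.

```python
def relative_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


# Hypothetical glottal cycle measurements, for illustration only.
periods_ms = [10.0, 12.0, 10.0, 12.0]       # cycle lengths -> jitter
peak_amplitudes = [0.5, 0.5, 0.5, 0.5]      # cycle peak amplitudes -> shimmer

jitter = relative_perturbation(periods_ms)
shimmer = relative_perturbation(peak_amplitudes)
```

Applied to cycle lengths the ratio is jitter, and applied to cycle peak amplitudes it is shimmer; a perfectly steady voice gives zero for both.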

Keywords: auditory filter, impulsive noise, MFCC, prosodic features, RASTA filter

Procedia PDF Downloads 397
2515 A New Dual Forward Affine Projection Adaptive Algorithm for Speech Enhancement in Airplane Cockpits

Authors: Djendi Mohmaed

Abstract:

In this paper, we propose a dual adaptive algorithm based on the combination of the forward blind source separation (FBSS) structure and the affine projection algorithm (APA). The proposed algorithm combines the source separation properties of the FBSS structure with the fast convergence characteristics of the APA algorithm. It needs two noisy observations to provide an enhanced speech signal. This process is done in a blind manner without the need for any a priori information about the source signals. The proposed dual forward blind source separation affine projection algorithm is denoted DFAPA and is used for the first time in an airplane cockpit context to enhance communication to and from the airplane. Intensive experiments were carried out to evaluate the performance of the proposed DFAPA algorithm.
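To make the affine projection building block concrete, here is a minimal single-channel sketch of an order-2 APA update used for FIR system identification. It is not the DFAPA algorithm from the abstract: the two-channel FBSS structure is omitted, and the filter length, step size `mu`, regularization `delta` and the synthetic test signal are assumptions.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def apa_identify(x, d, taps=3, mu=0.5, delta=1e-6):
    """Order-2 affine projection estimate of an FIR filter mapping x to d."""
    w = [0.0] * taps
    pad = [0.0] * taps + list(x)  # zero history before the first sample
    for n in range(1, len(x)):
        # Regressor vectors for times n and n-1 (newest sample first).
        r0 = [pad[taps + n - k] for k in range(taps)]
        r1 = [pad[taps + n - 1 - k] for k in range(taps)]
        e0 = d[n] - dot(w, r0)
        e1 = d[n - 1] - dot(w, r1)
        # Solve (X X^T + delta I) g = e using the closed-form 2x2 inverse.
        a, b, c = dot(r0, r0) + delta, dot(r0, r1), dot(r1, r1) + delta
        det = a * c - b * b
        g0 = (c * e0 - b * e1) / det
        g1 = (a * e1 - b * e0) / det
        # Weight update: w <- w + mu * X^T g
        w = [wk + mu * (g0 * x0 + g1 * x1) for wk, x0, x1 in zip(w, r0, r1)]
    return w


# Synthetic, noiseless test: recover a known (hypothetical) 3-tap filter.
rng = random.Random(0)
true_filter = [0.5, -0.3, 0.1]
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
d = [
    sum(h * (x[n - k] if n >= k else 0.0) for k, h in enumerate(true_filter))
    for n in range(len(x))
]
w_hat = apa_identify(x, d)
```

Reusing the two most recent input vectors in each update is what gives APA faster convergence than a single-vector (NLMS-style) update on correlated inputs such as speech.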

Keywords: adaptive algorithm, speech enhancement, system mismatch, SNR

Procedia PDF Downloads 113
2514 Searching Linguistic Synonyms through Parts of Speech Tagging

Authors: Faiza Hussain, Usman Qamar

Abstract:

Synonym-based searching is recognized to be a complicated problem, as text mining from unstructured web data is challenging. Finding useful information that matches the user's need in a bulk of web pages is a cumbersome task. In this paper, a novel and practical synonym retrieval technique is proposed to address this problem. For the replacement of semantics, user intent is taken into consideration to realize the technique. Parts-of-speech tagging is applied to generate patterns from the query, and a thesaurus was built and used for this experiment. Compared with non-context-based searching, context-based searching proved to be a more efficient approach when dealing with linguistic semantics. This approach is very beneficial for intent-based searching. Finally, results and future directions are presented.

Keywords: natural language processing, text mining, information retrieval, parts-of-speech tagging, grammar, semantics

Procedia PDF Downloads 281
2513 Hindi Speech Synthesis by Concatenation of Recognized Hand Written Devnagri Script Using Support Vector Machines Classifier

Authors: Saurabh Farkya, Govinda Surampudi

Abstract:

Optical Character Recognition is one of the current major research areas. This paper is focussed on recognition of Devanagari script and its sound generation. The paper consists of two parts: first, optical character recognition of handwritten Devnagari script; second, speech synthesis of the recognized text. It shows an implementation of support vector machines for the purpose of Devnagari script recognition. The support vector machine was trained with multi-domain features: transform domain and spatial (structural) domain features. The transform domain includes the wavelet feature of the character. The structural domain consists of the distance profile feature and the gradient feature. The segmentation of the text document was done at three levels: line segmentation, word segmentation, and character segmentation. The pre-processing of the characters was done with the help of various morphological operations: Otsu's algorithm, erosion, dilation, filtration and thinning techniques. The algorithm was tested on a self-prepared database, a collection of various handwriting samples. Further, Unicode was used to convert the recognized Devnagari text into a computer-readable document. The document so obtained is an array of codes, which was used to generate digitized text and to synthesize Hindi speech. Phonemes from the self-prepared database were used to generate the speech of the scanned document using a concatenation technique.

Keywords: Character Recognition (OCR), Text to Speech (TTS), Support Vector Machines (SVM), Library of Support Vector Machines (LIBSVM)

Procedia PDF Downloads 470
2512 Variation of Lexical Choice and Changing Need of Identity Expression

Authors: Thapasya J., Rajesh Kumar

Abstract:

Language plays complex roles in society. Previous studies on language and society explain their interconnected, complementary and complex interactions, and those studies focused primarily on variation in language. Variation being the fundamental nature of languages, the question of personal and social identity has been explored through language variation, establishing that there is an interconnection between language variation and identity. This paper analyses sociolinguistic variation in language at the lexical level and how the lexical choice of the speaker(s) shapes their identity. It obtains primary data from the lexicon of the Mappila dialect of Malayalam spoken by the members of the Mappila (Muslim) community of Kerala. The variation in lexical choice is analysed by collecting 15-minute speech samples from four different age groups of Mappila dialect speakers. Various contexts were analysed, and the frequency of borrowed words in each instance was calculated to reach a conclusion on how the variation is happening in the speech community. The paper shows how the lexical choice of the speakers could be socially motivated and involved in shaping and changing identities. Lexical items or vocabulary clearly signal group identity and personal identity. The Mappila dialect of Malayalam was rich in borrowed words from Arabic, Persian and Urdu. There was a deliberate attempt by speakers to show their identity as Mappila community members, which was derived from the socio-political situation of those days. This created a clear surface-level variation between the Mappila dialect and other dialects of Malayalam, motivated by the need to create and establish the identity of a person as a member of the Mappila community.
Historically, these kinds of linguistic variation were highly motivated by socio-political factors and intertwined with the historical facts about the origin and spread of Islamism in the region; people from the Mappila community were highly motivated to project their identity as Mappila because of the social insecurities they had to face before accepting that religion. Thus the deliberate inclusion of Arabic, Persian and Urdu words in their speech helped in showing their identity. However, the socio-political situations and factors at the origin of the Mappila community have changed over time. The social motivation for indicating their identity as Mappila no longer exists, and thus the frequency of borrowed words from Arabic, Persian and Urdu in their speech has decreased. Apart from religious terms, the borrowed words from these languages are very few at present. The analysis examines the changes in people's language according to their age; it finds significant variation between generations, and literacy plays a major role in this variation process. The need to project a specific identity varies with changes in the socio-political scenario, and variation in language can shape identity to fit the varying socio-political situation in any language.

Keywords: borrowings, dialect, identity, lexical choice, literacy, variation

Procedia PDF Downloads 212
2511 A Proposed Approach for Emotion Lexicon Enrichment

Authors: Amr Mansour Mohsen, Hesham Ahmed Hassan, Amira M. Idrees

Abstract:

Document analysis is an important research field that aims to gather information by analyzing the data in documents. As understanding what people actually want is an important goal in many fields, sentiment analysis has become a vital field closely related to document analysis. This research focuses on analyzing text documents to classify each document according to its opinion. The aim of this research is to detect emotions in text documents by enriching the lexicon, adapting its content through the extraction of semantic patterns. The proposed approach is presented, and experiments from different perspectives are conducted to reveal the positive impact of the proposed approach on the classification results.

Keywords: document analysis, sentimental analysis, emotion detection, WEKA tool, NRC lexicon

Procedia PDF Downloads 400
2510 A Simple Adaptive Atomic Decomposition Voice Activity Detector Implemented by Matching Pursuit

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

A simple adaptive voice activity detector (VAD) is implemented using Gabor and gammatone atomic decomposition of speech for high Gaussian noise environments. Matching pursuit is used for atomic decomposition, and is shown to achieve optimal speech detection capability at high data compression rates for low signal-to-noise ratios. The most active dictionary elements found by matching pursuit are used for the signal reconstruction so that the algorithm adapts to the individual speaker's dominant time-frequency characteristics. Speech has a high peak-to-average ratio, enabling the greedy matching pursuit heuristic of highest inner products to isolate high-energy speech components in high-noise environments. Gabor and gammatone atoms are both investigated with identical logarithmically spaced center frequencies and similar bandwidths. The algorithm performs equally well for both Gabor and gammatone atoms, with no statistically significant differences. The algorithm achieves 70% accuracy at a 0 dB SNR, 90% accuracy at a 5 dB SNR and 98% accuracy at a 20 dB SNR, using 30 dB SNR as a reference for voice activity.
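The greedy "highest inner product" heuristic named in this abstract can be sketched in a few lines. The example below is a generic matching pursuit over a toy dictionary of four unit-norm atoms; the Gabor and gammatone dictionaries and the voice-activity decision rule of the paper are not reproduced, and the signal is a made-up combination of two atoms.

```python
def matching_pursuit(signal, atoms, n_iter):
    """Greedy decomposition: repeatedly pick the unit-norm atom with the largest |inner product|."""
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        products = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(products[i]))
        coef = products[best]
        picks.append((best, coef))
        # Remove the selected atom's contribution from the residual.
        residual = [r - coef * a for r, a in zip(residual, atoms[best])]
    return picks, residual


# Toy orthonormal dictionary (an assumption, standing in for Gabor/gammatone atoms).
atoms = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, 0.5, -0.5],
    [0.5, -0.5, -0.5, 0.5],
]
signal = [0.5, 2.5, 0.5, 2.5]  # equals 3 * atoms[0] - 2 * atoms[2]
picks, residual = matching_pursuit(signal, atoms, n_iter=2)
```

Here `picks` lists the selected atom indices with their coefficients, strongest first, which mirrors the idea of keeping only the most active dictionary elements for reconstruction.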

Keywords: atomic decomposition, gabor, gammatone, matching pursuit, voice activity detection

Procedia PDF Downloads 271
2509 Computerized Scoring System: A Stethoscope to Understand Consumer's Emotion through His or Her Feedback

Authors: Chen Yang, Jun Hu, Ping Li, Lili Xue

Abstract:

Most companies pay careful attention to collecting consumer feedback, so a ‘feedback’ button is commonly found in all kinds of mobile apps. Yet it is much more challenging to analyze these feedback texts and to capture the true feelings of the consumer who submits the feedback, whether about a problem or a compliment. Especially for Chinese content, the same feedback may be positive in one context but negative in another. For example, in Chinese, the feedback 'operating with loudness' can apply to both a refrigerator and a stereo system. This feedback is clearly negative for a refrigerator; however, the same feedback is positive for a stereo system. By introducing Bradley, M. and Lang, P.'s Affective Norms for English Text (ANET) theory and Bucci W.’s Referential Activity (RA) theory, we, usability researchers at Pingan, are able to decipher the feedback and to find the hidden feelings behind the content. We take two of the three ANET dimensions, ‘valence’ and ‘dominance’, and two of the four RA dimensions, ‘concreteness’ and ‘specificity’, to build our own rating system with a scale of 1 to 5 points. This rating system enables us to judge the feelings/emotion behind each feedback, and it works well with both a single word/phrase and a whole paragraph. The result of the rating reflects the strength of the feeling/emotion of the consumer when he/she is typing the feedback. In our daily work, we first require a consumer to answer the net promoter score (NPS) before writing the feedback, so we can determine whether the feedback is positive or negative. Secondly, we code the feedback content according to the company's list of problematic items, which contains 200 items. In this way, we are able to count how many of the feedbacks left by consumers belong to one typical problem.
Thirdly, we rate each feedback based on the rating system mentioned above to illustrate the strength of the feeling/emotion when our consumer writes the feedback. In this way, we actually obtain two kinds of data: 1) the portion, meaning how many feedbacks are ascribed to one problematic item, and 2) the severity, meaning how strong the negative feeling/emotion is when the consumer is writing the feedback. By crossing these two, with the portion on the X-axis and the severity on the Y-axis, we are able to find which typical problem scores high in both portion and severity. The higher a problem's score, the more urgently it should be solved, as it means more people express stronger negative feelings in feedback regarding this problem. Moreover, by using a hidden Markov model to program our rating system, we are able to computerize the scoring system and process thousands of feedbacks in a short period of time, which is efficient and accurate enough for industrial purposes.
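The portion and severity bookkeeping described in this abstract can be sketched as follows. The problem names and ratings are invented, and combining the two axes by multiplying them is an assumption made here for ranking; the abstract only says that a problem scoring high on both is the most urgent. The hidden Markov model stage is not shown.

```python
from collections import defaultdict


def rank_problems(coded_feedback):
    """coded_feedback: iterable of (problem_item, rating on the 1..5 scale) pairs."""
    ratings = defaultdict(list)
    for problem, rating in coded_feedback:
        ratings[problem].append(rating)
    rows = []
    for problem, rs in ratings.items():
        portion = len(rs)              # how many feedbacks fall under this item
        severity = sum(rs) / len(rs)   # mean strength of the negative feeling
        rows.append((problem, portion, severity))
    # Assumed priority score: portion * severity (not specified in the abstract).
    return sorted(rows, key=lambda row: row[1] * row[2], reverse=True)


# Hypothetical coded feedback, for illustration only.
coded_feedback = [
    ("login", 4), ("login", 5), ("payment", 2),
    ("login", 3), ("payment", 5), ("ui", 1),
]
ranking = rank_problems(coded_feedback)
```

Plotting `portion` against `severity` for each row would give the two-axis view the authors describe, with the top-right problems being the ones to fix first.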

Keywords: computerized scoring system, feeling/emotion of consumer feedback, referential activity, text mining

Procedia PDF Downloads 145
2508 Using Satellite Images Datasets for Road Intersection Detection in Route Planning

Authors: Fatma El-Zahraa El-Taher, Ayman Taha, Jane Courtney, Susan Mckeever

Abstract:

Understanding road networks plays an important role in navigation applications such as self-driving vehicles and route planning for individual journeys. Intersections of roads are essential components of road networks. Understanding the features of an intersection, from a simple T-junction to larger multi-road junctions, is critical to decisions such as crossing roads or selecting the safest routes. The identification and profiling of intersections from satellite images is a challenging task. While deep learning approaches offer the state of the art in image classification and detection, the availability of training datasets is a bottleneck in this approach. In this paper, a labelled satellite image dataset for the intersection recognition problem is presented. It consists of 14,692 satellite images of Washington DC, USA. To support other users of the dataset, an automated download and labelling script is provided for dataset replication. The challenges of construction and fine-grained feature labelling of a satellite image dataset are examined, including the issue of how to address features that are spread across multiple images. Finally, the accuracy of the detection of intersections in satellite images is evaluated.

Keywords: satellite images, remote sensing images, data acquisition, autonomous vehicles

Procedia PDF Downloads 116
2507 Some Theoretical Approaches on the Style of Lyrical Subject of the Confessional Poetry

Authors: Lemac Tin

Abstract:

This paper deals with the lyrical subject of confessional poetry, which is the main part of its stylistic structure. We identify two types of this subject in classical confessional poetic discourse: the reflexive and the authentic subject. We offer a model of their genesis, textual features and appearance realisations. Genesis is related to the theories of deriving poetry from emotion and magic and their similar position in primitive lyrics and the lyrics of ancient civilizations. Textual features are related to the emotive and semiotic analysis of each type. Appearance realisations of these two types are the I-subject, We-subject, transvocal subject and objectified subject. We test these approaches on some poems from world literature.

Keywords: confessional poetry, confessional lyrical subject, magic, emotion, emotive analysis, semiotic analysis

Procedia PDF Downloads 246
2506 Articles, Delimitation of Speech and Perception

Authors: Nataliya L. Ogurechnikova

Abstract:

The paper aims to clarify the function of articles in English speech and specify their place and role in the English language, taking into account the use of articles for delimitation of speech. A focus of the paper is the use of the definite and the indefinite articles with different types of noun phrases which comprise either one noun with or without attributes, such as the King, the Queen, the Lion, the Unicorn, a dimple, a smile, a new language, an unknown dialect, or several nouns with or without attributes, such as the King and Queen of Hearts, the Lion and Unicorn, a dimple or smile, a completely isolated language or dialect. It is stated that the function of delimitation is related to perception: the number of speech units in a text correlates with the way the speaker perceives and segments the denotation. The two word combinations the house and garden and the house and the garden contain different numbers of speech units, one and two respectively, and reveal two different perception modes which correspond to the use of the definite article in the examples given. Thus, the function of delimitation is twofold: it is related to perception and cognition, on the one hand, and to grammar, on the other hand, if the subject of grammar is the structure of speech. Analysis of speech units in the paper is not limited to noun phrases and is extended by discussion of peripheral phenomena which are nevertheless important because they make it possible to qualify articles as a syntactic phenomenon, whereas they are not infrequently described in terms of noun morphology. In this regard, attention is given to the history of linguistic studies, specifically to the description of English articles by Niels Haislund, a disciple of Otto Jespersen.
A discrepancy is noted between the initial plan of Jespersen who intended to describe articles as a syntactic phenomenon in ‘A Modern English Grammar on Historical Principles’ and the interpretation of articles in terms of noun morphology, finally given by Haislund. Another issue of the paper is correlation between description and denotation, being a traditional aspect of linguistic studies focused on articles. An overview of relevant studies, given in the paper, goes back to the works of G. Frege, which gave rise to a series of scientific works where the meaning of articles was described within the scope of logical semantics. Correlation between denotation and description is treated in the paper as the meaning of article, i.e. a component in its semantic structure, which differs from the function of delimitation and is similar to the meaning of other quantifiers. The paper further explains why the relation between description and denotation, i.e. the meaning of English article, is irrelevant for noun morphology and has nothing to do with nominal categories of the English language.

Keywords: delimitation of speech, denotation, description, perception, speech units, syntax

Procedia PDF Downloads 219
2505 A Psychophysiological Evaluation of an Effective Recognition Technique Using Interactive Dynamic Virtual Environments

Authors: Mohammadhossein Moghimi, Robert Stone, Pia Rotshtein

Abstract:

Recording psychological and physiological correlates of human performance within virtual environments and interpreting their impacts on human engagement, ‘immersion’ and related emotional or ‘affective’ states is both academically and technologically challenging. By exposing participants to an affective, real-time (game-like) virtual environment, designed and evaluated in an earlier study, a psychophysiological database containing the EEG, GSR and heart rate of 30 male and female gamers, exposed to 10 games, was constructed. Some 174 features were subsequently identified and extracted from a number of windows with 28 different lengths (e.g. 2, 3, 5, etc. seconds). After the number of features was reduced to 30 using a feature selection technique, K-Nearest Neighbour (KNN) and Support Vector Machine (SVM) methods were employed for the classification process. The classifiers categorised the psychophysiological database into four affective clusters (defined based on a 3-dimensional space: valence, arousal and dominance) and eight emotion labels (relaxed, content, happy, excited, angry, afraid, sad, and bored). The KNN and SVM classifiers achieved average cross-validation accuracies of 97.01% (±1.3%) and 92.84% (±3.67%), respectively. However, no significant differences were found in the classification process based on affective clusters or emotion labels.

Keywords: virtual reality, affective computing, affective VR, emotion-based affective physiological database

Procedia PDF Downloads 209
2504 Transcultural Study on Social Intelligence

Authors: Martha Serrano-Arias, Martha Frías-Armenta

Abstract:

Significant results have been found supporting both the universality of emotion recognition and the influence of cultural background. Thus, the aim of this research was to test a Mexican version of the MTSI in different cultures to find differences in performance. The MTSI-Mx uses a scenario approach in which subjects must evaluate real persons. Two target persons were used for the construction, a man (FS) and a woman (AD). The items were grouped into four variables: Picture, Video, and FS and AD scenarios. The test was applied to 201 students from Mexico and Germany. T-tests for the Picture and FS scenario variables showed no significant difference. The Video and AD scenario variables differed significantly at the 5% level. The results show slight differences between cultures, although more comprehensive research is needed to conclude which culture performs better in this kind of assessment.

Keywords: emotion recognition, MTSI, social intelligence, transcultural study

Procedia PDF Downloads 299
2503 Adaptive Swarm Balancing Algorithms for Rare-Event Prediction in Imbalanced Healthcare Data

Authors: Jinyan Li, Simon Fong, Raymond Wong, Mohammed Sabah, Fiaidhi Jinan

Abstract:

Clinical data analysis and forecasting have made great contributions to disease control, prevention and detection. However, such data usually suffer from highly imbalanced class distributions. In this paper, we target binary imbalanced datasets, where the positive samples are the minority. We investigate two different meta-heuristic algorithms, particle swarm optimization and the bat-inspired algorithm, and combine both of them with the synthetic minority over-sampling technique (SMOTE) for processing the datasets. One approach is to process the full dataset as a whole. The other is to split up the dataset and adaptively process it one segment at a time. The experimental results reveal that while the performance improvements obtained by the former method do not scale to larger datasets, the latter, which we call Adaptive Swarm Balancing Algorithms, leads to significant efficiency and effectiveness improvements on large datasets. We also find it more consistent with the practice of typical large imbalanced medical datasets. We further use the meta-heuristic algorithms to optimize two key parameters of SMOTE, leading to more credible classifier performance and a shorter running time compared with the brute-force method.
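For readers unfamiliar with SMOTE, the sketch below shows its core interpolation step in plain Python: each synthetic sample lies on the segment between a minority sample and one of its `k` nearest minority neighbours. The swarm-based parameter search from the abstract is not shown; `n_new` and `k` are simply the tunable inputs of this sketch, and the data points are invented.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        # k nearest minority neighbours of x, excluding x itself.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to the neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, neighbour)))
    return synthetic


# Hypothetical minority-class samples with two features each.
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1)]
synthetic = smote(minority, n_new=4)
```

Because every synthetic point is a convex combination of two real minority points, the new samples stay inside the region the minority class already occupies.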

Keywords: Imbalanced dataset, meta-heuristic algorithm, SMOTE, big data

Procedia PDF Downloads 418
2502 Motor Speech Profile of Marathi Speaking Adults and Children

Authors: Anindita Banik, Anjali Kant, Aninda Duti Banik, Arun Banik

Abstract:

Speech is a complex, dynamic, unique motor activity through which we express thoughts and emotions and respond to and control our environment. The aim was to compare selected motor speech parameters and their sub-parameters across typical Marathi-speaking adults and children. A total of 300 subjects, including males and females, were divided into Groups I, II and III. The subjects reported no significant medical history and had a rating of 0-1 on the GRBAS scale. The recordings were obtained using three stimuli for the acoustic analysis of diadochokinetic rate (DDK), second formant transition, and voice and tremor, together with their sub-parameters. These parameters were acoustically analyzed with the Motor Speech Profile software in VisiPitch IV. The statistical analyses applied descriptive statistics and two-way ANOVA. The results showed statistically significant differences across age groups and gender for the aforementioned parameters and their sub-parameters. In DDK, for avp (ms) there was a significant difference only across age groups. However, for avr (/s) there was a significant difference across age groups and gender. It was observed that there was an increase in rate with an increase in age groups. The second formant transition sub-parameter F2 magn (Hz) also showed a statistically significant difference across both age groups and gender. There was an increase in mean value with an increase in age. Females had a higher mean when compared to males. For F2 rate (/s) a statistically significant difference was observed across age groups. There was an increase in mean value with increase in age. For the Voice and Tremor sub-parameter MFTR (%), a statistically significant difference was present across age groups and gender. Also for RATR (Hz) there was a statistically significant difference across both age groups and gender. In other words, the values of MFTR and RATR increased with an increase in age.
Thus, this study highlights the variation of the motor speech parameters amongst the typical population which would be beneficial for comparison with the individuals with motor speech disorders for assessment and management.

Keywords: adult, children, diadochokinetic rate, second formant transition, tremor, voice

Procedia PDF Downloads 278
2501 Unsupervised Part-of-Speech Tagging for Amharic Using K-Means Clustering

Authors: Zelalem Fantahun

Abstract:

Part-of-speech tagging is the process of assigning a part-of-speech or other lexical class marker to each word in naturally occurring text. Part-of-speech tagging is the most fundamental task in almost all natural language processing. In natural language processing, providing a large amount of manually annotated data is a knowledge acquisition bottleneck. Since Amharic is an under-resourced language, the limited availability of tagged corpora is a bottleneck for natural language processing, especially for POS tagging. A promising direction to tackle this problem is to provide a system that does not require manually tagged data. In unsupervised learning, the learner is not provided with classifications. Unsupervised algorithms seek out similarity between pieces of data in order to determine whether they can be characterized as forming a group. This paper describes the development of an unsupervised part-of-speech tagger for the Amharic language using K-means clustering, since a large amount of data is produced in day-to-day activities. The tagger is developed in the following phases. First, the unlabeled data (raw text) is divided into 10 folds and tokenized; at this level, the raw text is split into sentences and then into words. The second phase is feature extraction, which includes word frequency and the syntactic and morphological features of a word. The third phase is clustering. Among different clustering algorithms, K-means is selected and implemented in this study to bring groups of similar words together. The fourth phase is mapping, in which each cluster is examined carefully and the most common tag is assigned to the group. This study identifies two features that are capable of distinguishing one part of speech from the others, namely morphological features and positional information, and shows that it is possible to use unsupervised learning for Amharic POS tagging.
To increase the performance of the unsupervised part-of-speech tagger, other features that are not included in this study, such as semantic information, need to be incorporated. Finally, based on the experimental results, the system achieves a maximum accuracy of 81%.
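The clustering and mapping phases described in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the two numeric features (a suffix flag and a relative sentence position), the toy vectors, and the function names `kmeans` and `map_clusters_to_tags` are all assumptions made for illustration.

```python
import random
from collections import Counter


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on lists of floats; returns one cluster index per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        for i, p in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def map_clusters_to_tags(assignment, sample_tags):
    """Mapping phase: give each cluster the most common tag among its checked words."""
    votes = {}
    for cluster, tag in zip(assignment, sample_tags):
        if tag is not None:  # None marks a word nobody inspected
            votes.setdefault(cluster, Counter())[tag] += 1
    return {c: counter.most_common(1)[0][0] for c, counter in votes.items()}


# Hypothetical feature vectors: [has_noun_like_suffix, relative_position_in_sentence].
features = [
    [1.0, 0.10], [0.9, 0.20], [1.0, 0.15],   # noun-like words
    [0.0, 0.90], [0.1, 1.00], [0.05, 0.95],  # verb-like words
]
inspected_tags = ["N", "N", None, "V", None, "V"]

clusters = kmeans(features, k=2)
cluster_tags = map_clusters_to_tags(clusters, inspected_tags)
predicted = [cluster_tags[c] for c in clusters]
```

Here only four of the six words carry a hand-checked tag, and the remaining two inherit the majority tag of their cluster, which mirrors the mapping step at a very small scale.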

Keywords: POS tagging, Amharic, unsupervised learning, k-means

Procedia PDF Downloads 416
2500 Detection of Phoneme [S] Mispronounciation for Sigmatism Diagnosis in Adults

Authors: Michal Krecichwost, Zauzanna Miodonska, Pawel Badura

Abstract:

The diagnosis of sigmatism is mostly based on the observation of articulatory organs. It is, however, not always possible to precisely observe the vocal apparatus, in particular in the oral cavity of the patient. Speech processing can make the therapy more objective and simplify the verification of its progress. In the described study, a methodology for the classification of incorrectly pronounced phoneme [s] is proposed. The recordings come from adults. They were registered with a speech recorder at a sampling rate of 44.1 kHz and a resolution of 16 bits. A database of pathological and normative speech has been collected for the study, including reference assessments provided by speech therapy experts. Ten adult subjects were asked to simulate a certain type of sigmatism under the supervision of a speech therapy expert. In the recordings, the analyzed phone [s] was surrounded by vowels, viz: ASA, ESE, ISI, SPA, USU, YSY. Thirteen MFCC (mel-frequency cepstral coefficients) and RMS (root mean square) values are calculated within each frame that is part of the analyzed phoneme. Additionally, 3 fricative formants along with their corresponding amplitudes are determined for the entire segment. In order to aggregate the information within the segment, the average value of each MFCC coefficient is calculated. All features of other types are aggregated by means of their 75th percentile. The proposed method of feature aggregation reduces the size of the feature vector used in the classification. A binary SVM (support vector machine) classifier is employed at the phoneme recognition stage. The first group consists of pathological phones, and the other of normative ones. The proposed feature vector yields classification sensitivity and specificity above the 90% level in the case of individual logo phones. The use of fricative formant-based information improves the MFCC-only classification results by an average of 5 percentage points.
The study shows that the use of specific parameters for the selected phones improves the efficiency of pathology detection compared with traditional methods of speech signal parameterization.
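The aggregation step described in this abstract (mean of each MFCC coefficient, 75th percentile for the other per-frame features) can be sketched as follows. This is an illustration under assumptions, not the authors' code: the percentile uses linear interpolation between sorted samples, which the abstract does not specify, and the toy frames use two coefficients instead of thirteen to keep the example short.

```python
from statistics import mean


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation (an assumed convention)."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    rank = (len(ordered) - 1) * q / 100.0
    low = int(rank)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)


def aggregate_segment(mfcc_frames, other_frames):
    """Collapse per-frame features into one fixed-length vector per segment.

    mfcc_frames:  one list of MFCC values per frame
    other_frames: one list of non-MFCC values per frame (for example RMS)
    """
    mfcc_part = [mean(column) for column in zip(*mfcc_frames)]
    other_part = [percentile(column, 75) for column in zip(*other_frames)]
    return mfcc_part + other_part


# Toy segment of five frames: two MFCC-like values and one RMS-like value per frame.
mfcc_frames = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
rms_frames = [[0.1], [0.2], [0.3], [0.4], [0.5]]

segment_vector = aggregate_segment(mfcc_frames, rms_frames)
```

Whatever the number of frames in the segment, the resulting vector has one entry per feature, which is what keeps the classifier input small.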

Keywords: computer-aided pronunciation evaluation, sibilants, sigmatism diagnosis, speech processing

Procedia PDF Downloads 258
2499 Recognition of Noisy Words Using the Time Delay Neural Networks Approach

Authors: Khenfer-Koummich Fatima, Mesbahi Larbi, Hendel Fatiha

Abstract:

This paper presents a recognition system for isolated words such as robot commands, implemented with Time Delay Neural Networks (TDNN). The aim is to teleoperate a robot for specific tasks (turn, close, etc.) in an industrial environment, taking into account the noise coming from the machines. The choice of TDNN is based on its generalization in terms of accuracy; in addition, it acts as a filter that passes certain desirable frequency characteristics of speech. The goal is to determine the parameters of this filter so that the system adapts to the variability of the speech signal and especially to noise; for this, the backpropagation technique was used in the learning phase. The approach was applied to commands pronounced in two languages separately: French and Arabic. The results for two test bases of 300 spoken words each are 87% and 97.6% in a neutral environment, and 77.67% and 92.67% when white Gaussian noise was added at an SNR of 35 dB.
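The noisy test condition mentioned above, white Gaussian noise added at a fixed SNR, follows from the definition SNR(dB) = 10 log10(P_signal / P_noise). The sketch below is only an illustration of that definition; the 440 Hz test tone, the 16 kHz sampling rate, and the function names are assumptions and do not come from the paper.

```python
import math
import random


def add_white_gaussian_noise(signal, snr_db, seed=0):
    """Return signal plus zero-mean Gaussian noise scaled to the requested SNR."""
    rng = random.Random(seed)
    signal_power = sum(s * s for s in signal) / len(signal)
    # Rearranging SNR(dB) = 10*log10(Ps/Pn) gives the target noise power.
    noise_power = signal_power / (10 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR actually obtained from the clean and noisy signals."""
    signal_power = sum(s * s for s in clean) / len(clean)
    noise_power = sum((n - s) ** 2 for s, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical stand-in for a spoken command: one second of a 440 Hz tone at 16 kHz.
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noisy = add_white_gaussian_noise(clean, snr_db=35.0)
achieved = measured_snr_db(clean, noisy)
```

Because the noise is random, the achieved SNR only approximates the requested value; with a one-second signal the difference is a small fraction of a decibel.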

Keywords: TDNN, neural networks, noise, speech recognition

Procedia PDF Downloads 256
2498 Grammatical Interference in Russian-Spanish Bilingualism

Authors: Olga A. Gnatyuk

Abstract:

The article is devoted to the phenomenon of interference that occurs in Russian-Spanish language contact. The definition of the term, the levels of interference, and the prerequisites for its occurrence are considered. Interference, which is an essential part of bilingualism, may become apparent at different linguistic levels and is especially evident in oral speech. The article reviews some examples of grammatical interference in the Russian-Spanish bilingualism of Russian immigrants living in Spain. The research revealed some cases of mother-tongue interference in the speech of Russian-speaking learners of Spanish. Special attention is paid to such key areas of grammatical interference as articles, personal pronouns, gender, and number of nouns. The omission of a linking verb, as well as its use in an incorrect form, is observed in Russian immigrants’ speech. The conclusion is drawn that interference errors in Spanish arise both from the absence in Russian of certain phenomena and categories of the Spanish language and from the discrepancy between the linguistic systems of the two languages.

Keywords: bilingualism, interference, grammatical interference, Russian language, Spanish language

Procedia PDF Downloads 136
2497 Role of Speech Language Pathologists in Vocational Rehabilitation

Authors: Marlyn Mathew

Abstract:

Communication is the key factor in any vocational or job set-up. However, many persons with disabilities suffer a deficit in this very area in terms of comprehension, expression, and cognitive skills, making it difficult for them to find appropriate employment or stay employed. Vocational rehabilitation is a continuous and coordinated process that involves the provision of vocation-related services designed to enable a person with a disability to obtain and maintain employment. The role of the speech language pathologist is therefore crucial in assessing the communication deficits and needs of the individual at the various phases of employment, from the time of seeking a job and attending interviews with suitable employers to regular intervals during employment. This article discusses the various communication deficits and obstacles faced by individuals with special needs, including but not limited to cognitive-linguistic deficits, executive function deficits, and speech and language processing difficulties, and the strategies that can be introduced in the workplace to overcome these obstacles, including the use of visual cues, checklists, and flow charts. The paper also highlights the importance of educating colleagues and work partners about the communication difficulties faced by the individual. This would help to reduce communication barriers in the workplace, help colleagues develop an empathetic approach, and reduce misunderstandings that can arise from the communication impairment.

Keywords: vocational rehabilitation, disability, speech language pathologist, cognitive, linguistics

Procedia PDF Downloads 114
2496 Performance Analysis of Traffic Classification with Machine Learning

Authors: Htay Htay Yi, Zin May Aye

Abstract:

Network security plays a key role in the ICT environment because the number of malicious users is continually growing in education, business, and other ICT-related areas. Network security violations are typically described and examined centrally by a security event management system. Firewalls, Intrusion Detection Systems (IDS), and Intrusion Prevention Systems are becoming essential to monitor or prevent potential violations, attacks, and imminent threats. In this system, firewall rules are set only where the system policies require them. The dataset used in this system is derived from the testbed environment. DoS and PortScan traffic is applied in the testbed with the firewall and IDS implementation. Network traffic is classified as normal or attack in the existing testbed environment using six machine learning classification methods. The testbed is exercised to obtain the datasets for DoS and PortScan traffic. The dataset is based on CICIDS2017, with some features added. The system tested 26 features from the applied dataset. The system aims to reduce false positive rates and improve accuracy in the implemented testbed design. The system also shows good performance by selecting important features and by comparison against an existing dataset using machine learning classifiers.

Keywords: false negative rate, intrusion detection system, machine learning methods, performance

Procedia PDF Downloads 98
2495 Drone Classification Using Classification Methods Using Conventional Model With Embedded Audio-Visual Features

Authors: Hrishi Rakshit, Pooneh Bagheri Zadeh

Abstract:

This paper investigates the performance of drone classification methods using conventional DCNNs with different hyperparameters when additional drone audio data is embedded in the dataset for training and further classification. First, a custom dataset is created using different images of drones from University of Southern California (USC) datasets and Leeds Beckett University datasets, with an embedded drone audio signal. Three well-known DCNN architectures, namely Resnet50, Darknet53, and Shufflenet, are applied to the created dataset, tuning hyperparameters such as learning rate, maximum epochs, and mini-batch size with different optimizers. Precision-recall curves and F1 score-threshold curves are used to evaluate the performance of these classification algorithms. Experimental results show that Resnet50 has the highest efficiency compared to the other DCNN methods.
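The evaluation curves named in this abstract are built by sweeping a decision threshold over classifier scores and recomputing precision, recall, and F1 at each step. The sketch below shows that computation on made-up scores; the data, the function names, and the convention of reporting precision as 1.0 when nothing is predicted positive are assumptions for illustration.

```python
def precision_recall_f1(scores, labels, threshold):
    """Precision, recall and F1 when scores >= threshold count as positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    # Assumed convention: precision is 1.0 when there are no positive predictions.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def sweep_thresholds(scores, labels, thresholds):
    """One (threshold, precision, recall, f1) row per threshold: the curve points."""
    return [(t,) + precision_recall_f1(scores, labels, t) for t in thresholds]


# Hypothetical "is a drone" scores for four images, with ground-truth labels.
scores = [0.9, 0.8, 0.4, 0.2]
labels = [1, 0, 1, 0]

curve = sweep_thresholds(scores, labels, [0.3, 0.5, 0.95])
```

Plotting recall against precision from these rows gives the precision-recall curve, and plotting threshold against F1 gives the F1 score-threshold curve.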

Keywords: drone classifications, deep convolutional neural network, hyperparameters, drone audio signal

Procedia PDF Downloads 62
2494 Self-Supervised Learning for Hate-Speech Identification

Authors: Shrabani Ghosh

Abstract:

Automatic offensive language detection in social media has become a pressing task in today's NLP. Manual offensive language detection is tedious and laborious work, for which automatic methods based on machine learning are the only alternative. Previous works have performed sentiment analysis over social media in supervised, semi-supervised, and unsupervised ways. Domain adaptation in a semi-supervised way has also been explored in NLP, where the source domain and the target domain are different. In domain adaptation, the source domain usually has a large amount of labeled data, while only a limited amount of labeled data is available in the target domain. Pretrained transformers such as BERT and RoBERTa are fine-tuned for text classification after further unsupervised pre-training on masked language modeling (MLM) tasks. In previous work, hate speech detection has been explored on Gab.ai, a free-speech platform described as a platform for extremists of varying degrees in online social media. In the domain adaptation process, Twitter data is used as the source domain, and Gab data is used as the target domain. The performance of domain adaptation also depends on the cross-domain similarity. Different distance measures such as L2 distance, cosine distance, Maximum Mean Discrepancy (MMD), Fisher Linear Discriminant (FLD), and CORAL have been used to estimate domain similarity. In-domain distances are small, and between-domain distances are expected to be large. Previous findings show that a pretrained masked language model (MLM) fine-tuned on a mixture of posts from the source and target domains gives higher accuracy. However, the in-domain accuracy of the hate classifier on Twitter data is 71.78%, and its out-of-domain accuracy on Gab data drops to 56.53%. Recently, self-supervised learning has received a lot of attention, as it is more applicable when labeled data are scarce.
A few works have already explored applying self-supervised learning to NLP tasks such as sentiment classification. The self-supervised language representation model ALBERT focuses on modeling inter-sentence coherence and helps downstream tasks with multi-sentence inputs. The self-supervised attention learning approach shows better performance, as it exploits extracted context words in the training process. In this work, a self-supervised attention mechanism is proposed to detect hate speech on Gab.ai. The framework first classifies the Gab dataset in an attention-based self-supervised manner. In the next step, a semi-supervised classifier is trained on the combination of labeled data from the first step and unlabeled data. The performance of the proposed framework will be compared with the results described earlier and also with optimized outcomes obtained from different optimization techniques.
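Three of the domain-similarity measures listed in this abstract can be sketched on toy embeddings. This is an illustration only: the two-dimensional vectors stand in for real post embeddings, the distances are taken between domain centroids, and MMD is shown with a linear kernel, for which the (biased) squared estimate reduces to the squared L2 distance between the domain means. None of these choices is confirmed by the abstract.

```python
import math


def centroid(vectors):
    """Mean embedding of a domain."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def l2_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def linear_mmd_squared(source, target):
    """Biased squared MMD with a linear kernel: ||mean(source) - mean(target)||^2."""
    return l2_distance(centroid(source), centroid(target)) ** 2


# Hypothetical post embeddings for a source domain and a target domain.
source_posts = [[1.0, 0.0], [1.0, 0.0]]
target_posts = [[0.0, 1.0], [0.0, 1.0]]

between_l2 = l2_distance(centroid(source_posts), centroid(target_posts))
between_cos = cosine_distance(centroid(source_posts), centroid(target_posts))
between_mmd = linear_mmd_squared(source_posts, target_posts)
within_mmd = linear_mmd_squared(source_posts, source_posts)
```

In this toy case the within-domain value is zero and the between-domain values are large, matching the expectation stated in the abstract that in-domain distances are small and between-domain distances are large.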

Keywords: attention learning, language model, offensive language detection, self-supervised learning

Procedia PDF Downloads 83
2493 Mother-Child Conversations about Emotions and Socio-Emotional Education in Children with Autism Spectrum Disorder

Authors: Beaudoin Marie-Joelle, Poirier Nathalie

Abstract:

Introduction: Children with autism spectrum disorder (ASD) tend to lack socio-emotional skills (e.g., emotional regulation and theory of mind). Eisenberg’s theoretical model on emotion-related socialization behaviors suggests that mothers of children with ASD could play a central role in fostering the acquisition of socio-emotional skills by engaging in frequent educational conversations about emotions. However, mothers’ perceptions of their own emotional skills and of their child’s personality traits and social deficits could mitigate the benefit of their educative role. Objective: Our study aims to explore the association between mother-child conversations about emotions and the socio-emotional skills of their children when accounting for the moderating role of the mothers’ perceptions. Forty-nine mothers completed five questionnaires about emotion-related conversations, self-openness to emotions, and perceptions of the personality and socio-emotional skills of their children with ASD. Results: Regression analyses showed that frequent mother-child conversations about emotions predicted better emotional regulation and theory of mind skills in children with ASD (p < 0.01). The children’s theory of mind was moderated by mothers’ perceptions of their own emotional openness (p < 0.05) and their perceptions of their children’s openness to experience (p < 0.01) and conscientiousness (p < 0.05). Conclusion: Mothers likely play an important role in the socio-emotional education of children with ASD. Further, mothers may be most helpful when they perceive that their interventions improve their child’s behaviors. Our findings corroborate the Eisenberg model, which claims that mother-child conversations about emotions predict socio-emotional development skills in children with ASD. Our results also help clarify the moderating role of mothers’ perceptions, which could mitigate their willingness to engage in educational conversations about emotions with their children.
Therefore, in the education of children with special needs, school professionals could collaborate with mothers to increase the frequency of emotion-related conversations for students with ASD who have emotion dysregulation or theory of mind problems.

Keywords: autism, parental socialization of emotion, emotional regulation, theory of mind

Procedia PDF Downloads 59
2492 Attention-based Adaptive Convolution with Progressive Learning in Speech Enhancement

Authors: Tian Lan, Yixiang Wang, Wenxin Tai, Yilan Lyu, Zufeng Wu

Abstract:

The monaural speech enhancement task in the time-frequency domain has a myriad of approaches, with the stacked convolutional neural network (CNN) demonstrating superior ability in feature extraction and selection. However, using the stacked single convolutions method limits feature representation capability and generalization ability. In order to solve the aforementioned problem, we propose an attention-based adaptive convolutional network that integrates the multi-scale convolutional operations into an operation-specific block via input-dependent attention to adapt to complex auditory scenes. In addition, we introduce a two-stage progressive learning method to enlarge the receptive field without a dramatic increase in computation burden. We conduct a series of experiments based on the TIMIT corpus, and the experimental results prove that our proposed model is better than the state-of-the-art models on all metrics.

Keywords: speech enhancement, adaptive convolution, progressive learning, time-frequency domain

Procedia PDF Downloads 91