Search results for: speech to text
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1892

1682 Towards a Deconstructive Text: Beyond Language and the Politics of Absences in Samuel Beckett’s Waiting for Godot

Authors: Afia Shahid

Abstract:

The writing of Samuel Beckett is associated with meaning in meaninglessness and with the production of what he calls a ‘literature of the unword’. The casual escape from the world of words in the form of silences and pauses in his play Waiting for Godot prompts questions about why they exist and ultimately leads to an investigation of the theory behind their use in the play. This paper proposes that these absences (silence and pause) in Beckett’s play force the reader to think ‘beyond’ language. It asks how silence and pause in Beckett’s text speak for the emergence of the poststructuralist text. It aims to identify the significant features of the philosophy of deconstruction in Beckett’s play in order to demystify the hostile complicity between literature and philosophy. Working within the interpretive paradigm of poststructuralism, this research treats the text as its research data. It attempts to delineate the relationship between poststructuralist theoretical concerns and Beckett’s text. Keeping in view the theoretical concerns of the poststructuralist theorist Jacques Derrida, the discussion is directed towards the notion of going ‘beyond’ language into the absences that aim at silencing the existing discourse with the ‘radical irony’ of this anti-formal art, which contains its own denial and thus represents the idea of ceaseless questioning and radical contradiction in art and in any text. This article asks how Beckett’s text vibrates with loud silence and disrupts language to demonstrate the emptiness of words, thereby exploring the limitless void of absences. Beckett’s text resonates with silence and pause that are neither negation nor affirmation but rather a poststructuralist suspension of reality that is ever changing with the undecidability of all meanings. Within the theoretical notion of Derrida’s Différance, this study interprets silence and pause in Beckett’s art. The silence and pause behave like Derrida’s Différance and question their own existence in the text in order to deconstruct any definiteness and finality of reality, extending an undecidable poststructuralist threshold that aims to evade the ‘labyrinth of language’.

Keywords: Différance, language, pause, poststructuralism, silence, text

Procedia PDF Downloads 173
1681 The Platform for Digitization of Georgian Documents

Authors: Erekle Magradze, Davit Soselia, Levan Shughliashvili, Irakli Koberidze, Shota Tsiskaridze, Victor Kakhniashvili, Tamar Chaghiashvili

Abstract:

Since the beginning of active publishing in Georgia, a large volume of printed material has accumulated, and digitizing it is an important task. Digitized materials will be available to the public, and it will be possible to search their text and conduct various factual research. Digitizing printed documents means scanning them, extracting text from the scans, and processing the text with a corresponding language model to detect inaccuracies and grammatical errors. Implementing these stages requires a unified, scalable, and automated platform in which the digital service developed for each stage performs its assigned task; at the same time, it must be possible to develop these services dynamically so that the work of the platform is not interrupted.

Keywords: NLP, OCR, BERT, Kubernetes, transformers

Procedia PDF Downloads 108
1680 In-Context Meta Learning for Automatic Designing Pretext Tasks for Self-Supervised Image Analysis

Authors: Toktam Khatibi

Abstract:

Self-supervised learning (SSL) includes machine learning models that are trained on one aspect and/or one part of the input to learn other aspects and/or part of it. SSL models are divided into two different categories, including pre-text task-based models and contrastive learning ones. Pre-text tasks are some auxiliary tasks learning pseudo-labels, and the trained models are further fine-tuned for downstream tasks. However, one important disadvantage of SSL using pre-text task solving is defining an appropriate pre-text task for each image dataset with a variety of image modalities. Therefore, it is required to design an appropriate pretext task automatically for each dataset and each downstream task. To the best of our knowledge, the automatic designing of pretext tasks for image analysis has not been considered yet. In this paper, we present a framework based on In-context learning that describes each task based on its input and output data using a pre-trained image transformer. Our proposed method combines the input image and its learned description for optimizing the pre-text task design and its hyper-parameters using Meta-learning models. The representations learned from the pre-text tasks are fine-tuned for solving the downstream tasks. We demonstrate that our proposed framework outperforms the compared ones on unseen tasks and image modalities in addition to its superior performance for previously known tasks and datasets.

Keywords: in-context learning (ICL), meta learning, self-supervised learning (SSL), vision-language domain, transformers

Procedia PDF Downloads 40
1679 Reconstructed Phase Space Features for Estimating Post Traumatic Stress Disorder

Authors: Andre Wittenborn, Jarek Krajewski

Abstract:

Trauma-related sadness can alter the voice in several ways. The generation of non-linear aerodynamic phenomena within the vocal tract is crucial when analyzing trauma-influenced speech production. These phenomena include non-laminar flow and the formation of jets rather than well-behaved laminar flow. In particular, state-space reconstruction methods based on chaotic dynamics and fractal theory have been suggested to describe these turbulence-related aerodynamic phenomena of the speech production system. To extract the non-linear properties of the speech signal, we used the time delay embedding method to reconstruct the phase space from a scalar time series (reconstructed phase space, RPS). This approach results in the extraction of 7238 features per .wav file (N = 47, 32 m, 15 f). The speech material was elicited by asking participants to recount sadness-inducing autobiographical experiences (sampling rate 16 kHz, 8-bit resolution). After combining these features in a support vector machine based machine learning approach (leave-one-sample-out validation), we achieved a correlation of r = .41 with the well-established self-report ground truth measure (RATS) of post-traumatic stress disorder (PTSD).
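The time delay embedding step named in this abstract can be illustrated with a minimal Python sketch. The embedding dimension, the delay, the sample values, and the function name `delay_embed` are illustrative assumptions; the abstract does not state the parameters or feature definitions the authors used.

```python
def delay_embed(signal, dim, tau):
    """Return time-delay vectors (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_vectors = len(signal) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("signal is too short for this dim and tau")
    return [
        tuple(signal[i + k * tau] for k in range(dim))
        for i in range(n_vectors)
    ]


# Toy scalar series standing in for speech samples (hypothetical values).
samples = [0, 1, 2, 3, 4, 5]
points = delay_embed(samples, dim=3, tau=2)
print(points)
```

Each tuple is one point of the reconstructed phase space; features would then be computed from the geometry of this point cloud.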

Keywords: non-linear dynamics features, post traumatic stress disorder, reconstructed phase space, support vector machine

Procedia PDF Downloads 76
1678 AI-Based Techniques for Online Social Media Network Sentiment Analysis: A Methodical Review

Authors: A. M. John-Otumu, M. M. Rahman, O. C. Nwokonkwo, M. C. Onuoha

Abstract:

Online social media networks have long served as a primary arena for group conversations, gossip, and text-based information sharing and distribution. Natural language processing techniques are widely used for text classification and unbiased decision-making, yet proper classification of this textual information in a given context remains very difficult. As a result, we conducted a systematic review of previous literature on sentiment classification and the AI-based techniques used, in order to better understand the process of designing and developing a robust and more accurate sentiment classifier that can correctly classify social media text in a given context between hate speech and inverted compliments with a high level of accuracy. We evaluated over 250 articles from digital sources such as ScienceDirect, ACM, Google Scholar, and IEEE Xplore and narrowed the number of studies to 31. Findings revealed that deep learning approaches such as CNN, RNN, BERT, and LSTM outperformed various machine learning techniques in terms of accuracy. A large dataset is also necessary for developing a robust sentiment classifier and can be obtained from sources such as Twitter, movie reviews, Kaggle, SST, and SemEval Task4. Hybrid deep learning techniques such as CNN+LSTM, CNN+GRU, and CNN+BERT outperformed single deep learning techniques and machine learning techniques. The Python programming language outperformed the Java programming language for sentiment analyzer development due to its simplicity and AI-based library functionality. Based on some of the important findings from this study, we made a recommendation for future research.

Keywords: artificial intelligence, natural language processing, sentiment analysis, social network, text

Procedia PDF Downloads 87
1677 Speech Perception by Video Hosting Services Actors: Urban Planning Conflicts

Authors: M. Pilgun

Abstract:

The report presents the results of a study of how actors on video hosting services perceive speech, using material on urban planning conflicts. A multimodal approach using neural network technologies is employed to analyze the content. Analysis of word associations and associative networks of relevant stimuli revealed the evaluative reactions of the actors. Analysis of the data identified key topics that generated negative and positive perceptions among the participants. Calculating social stress and social well-being indices from user-generated content made it possible to rank road transport construction objects by the degree of negative and positive perception by actors.

Keywords: social media, speech perception, video hosting, networks

Procedia PDF Downloads 117
1676 Functions and Pragmatic Aspects of English Nonsense

Authors: Natalia V. Ursul

Abstract:

In linguistic studies, the question of nonsense is attracting increasing interest. Nonsense is usually defined as spoken or written words that have no meaning. However, this definition is likely to be outdated as any speech act is generated due to the speaker’s pragmatic reasons, thus it cannot be purely illogical or meaningless. In the current paper a new working definition of nonsense as a linguistic medium will be formulated; moreover, the pragmatic peculiarities of newly coined linguistic patterns and possible ways of their interpretation will be discussed.

Keywords: nonsense, nonsense verse, pragmatics, speech act

Procedia PDF Downloads 482
1675 Preliminary Study of the Phonological Development in Three and Four Year Old Bulgarian Children

Authors: Tsvetomira Braynova, Miglena Simonska

Abstract:

The article presents the results of research on phonological processes in three- and four-year-old children. For the purpose of the study, an author-developed test was administered to 120 children. The study covered three areas: the word level (96 words), sentence repetition (10 sentences), and generation of the child's own speech from a picture (15 pictures). The test also provides additional information about the articulation errors of the assessed children. The main purpose of the study is to analyze all phonological processes that occur at this age in Bulgarian children and to identify which are typical and atypical for this age. The results show that the most common phonological errors children make are sound substitution, elision of a sound, metathesis of sounds, elision of a syllable, and elision of consonants clustered in a syllable. All examined children were identified as having an articulation disorder of the bilabial lambdacism type. Measuring the correlation between the average length of repeated speech and the average length of generated speech, the analysis shows that the more words a child can repeat in the “repeated speech” part, the more words they can be expected to generate in the “generating sentence” part. The results of this study show that the word-naming task provides sufficient and representative information to assess the child's phonology.

Keywords: assessment, phonology, articulation, speech-language development

Procedia PDF Downloads 140
1674 Effects of Therapeutic Horseback Riding in Speech and Communication Skills of Children with Autism

Authors: Aristi Alopoudi, Sofia Beloka, Vassiliki Pliogou

Abstract:

Autism is a complex neurodevelopmental disorder involving difficulties in many areas, such as social interaction, communication skills and verbal communication (speech). The aim of this study was to examine the impact of therapeutic horseback riding on the verbal and communication skills of children diagnosed with autism over 16 sessions. The researcher examined whether the expression of speech, the use of vocabulary, semantics, pragmatics, echolalia and communication skills were influenced by therapeutic horseback riding when the frequency of the sessions was increased. The researcher observed two primary-school-aged subjects with autism, in a two-case observation design, during 16 therapeutic horseback riding sessions (one riding session per week). Compared to baseline, at the end of the 16th session, therapeutic horseback riding had increased both verbal skills, such as vocabulary, semantics, pragmatics and formation of sentences, and communication skills, such as eye contact, greeting, participation in dialogue and spontaneous speech. Echolalia remained stable. Increased frequency of therapeutic horseback riding was beneficial for significant improvement in verbal and communication skills. More specifically, from the first to the last riding session there was a great increase in vocabulary, semantics, and formation of sentences. Pragmatics reached a lower level than semantics, but the same level as correct use of the first person (for example, "I make a hug") and the echolalia used for that. A great increase in spontaneous speech was noticed. Eye contact was present at a lower level, and there was a slow but important rise in greeting as well as in participation in dialogue. Last but not least, this is a first study of therapeutic horseback riding that examines verbal communication and communication skills in autistic children. According to the literature, therapeutic horseback riding is a therapy with a variety of benefits; this research makes clear that the improvement of verbal speech and communication should be counted among them.

Keywords: Autism, communication skills, speech, therapeutic horseback riding

Procedia PDF Downloads 234
1673 Co-Design of Accessible Speech Recognition for Users with Dysarthric Speech

Authors: Elizabeth Howarth, Dawn Green, Sean Connolly, Geena Vabulas, Sara Smolley

Abstract:

Through the EU Horizon 2020 Nuvoic Project, the project team recruited 70 individuals in the UK and Ireland to test the Voiceitt speech recognition app and provide user feedback to developers. The app is designed for people with dysarthric speech, to support communication with unfamiliar people and access to speech-driven technologies such as smart home equipment and smart assistants. Participants with atypical speech, due to a range of conditions such as cerebral palsy, acquired brain injury, Down syndrome, stroke and hearing impairment, were recruited, primarily through organisations supporting disabled people. Most had physical or learning disabilities in addition to dysarthric speech. The project team worked with individuals, their families and local support teams, to provide access to the app, including through additional assistive technologies where needed. Testing was user-led, with participants asked to identify and test use cases most relevant to their daily lives over a period of three months or more. Ongoing technical support and training were provided remotely and in-person throughout the testing period. Structured interviews were used to collect feedback on users' experiences, with delivery adapted to individuals' needs and preferences. Informal feedback was collected through ongoing contact between participants, their families and support teams and the project team. Focus groups were held to collect feedback on specific design proposals. User feedback shared with developers has led to improvements to the user interface and functionality, including faster voice training, simplified navigation, the introduction of gamification elements and of switch access as an alternative to touchscreen access, with other feature requests from users still in development. This work offers a case-study in successful and inclusive co-design with the disabled community.

Keywords: co-design, assistive technology, dysarthria, inclusive speech recognition

Procedia PDF Downloads 73
1672 Low-Income African-American Fathers' Gendered Relationships with Their Children: A Study Examining the Impact of Child Gender on Father-Child Interactions

Authors: M. Lim Haslip

Abstract:

This quantitative study explores the correlation between child gender and father-child interactions. The author analyzes data from videotaped interactions between African-American fathers and their boy or girl toddler to explain how African-American fathers and toddlers interact with each other and whether these interactions differ by child gender. The purpose of this study is to investigate the research question: 'How, if at all, do fathers’ speech and gestures differ when interacting with their two-year-old sons versus daughters during free play?' The objectives of this study are to describe how child gender impacts African-American fathers’ verbal communication, examine how fathers gesture and speak to their toddler by gender, and to guide interventions for low-income African-American families and their children in early language development. This study involves a sample of 41 low-income African-American fathers and their 24-month-old toddlers. The videotape data will be used to observe 10-minute father-child interactions during free play. This study uses the already transcribed and coded data provided by Dr. Meredith Rowe, who did her study on the impact of African-American fathers’ verbal input on their children’s language development. The Child Language Data Exchange System (CHILDES program), created to study conversational interactions, was used for transcription and coding of the videotape data. The findings focus on the quantity of speech, diversity of speech, complexity of speech, and the quantity of gesture to inform the vocabulary usage, number of spoken words, length of speech, and the number of object pointings observed during father-toddler interactions in a free play setting. This study will help intervention and prevention scientists understand early language development in the African-American population. It will contribute to knowledge of the role of African-American fathers’ interactions on their children’s language development. 
It will guide interventions for the early language development of African-American children.

Keywords: parental engagement, early language development, African-American families, quantity of speech, diversity of speech, complexity of speech and the quantity of gesture

Procedia PDF Downloads 77
1671 Influence of Loudness Compression on Hearing with Bone Anchored Hearing Implants

Authors: Anja Kurz, Marc Flynn, Tobias Good, Marco Caversaccio, Martin Kompis

Abstract:

Bone Anchored Hearing Implants (BAHI) are routinely used in patients with conductive or mixed hearing loss, e.g. if conventional air conduction hearing aids cannot be used. New sound processors and new fitting software now allow parameters such as loudness compression ratios or maximum power output to be adjusted separately. It is currently unclear how the choice of these parameters influences aided speech understanding in BAHI users. In this prospective experimental study, the effects of varying the compression ratio and lowering the maximum power output in a BAHI were investigated. Twelve experienced adult subjects with a mixed hearing loss participated in this study. Four different compression ratios (1.0; 1.3; 1.6; 2.0) were tested along with two different maximum power output settings, resulting in a total of eight different programs. Each participant tested each program for two weeks. A blinded Latin square design was used to minimize bias. For each of the eight programs, speech understanding in quiet and in noise was assessed. For speech in quiet, the Freiburg number test and the Freiburg monosyllabic word test at 50, 65, and 80 dB SPL were used. For speech in noise, the Oldenburg sentence test was administered. Speech understanding in quiet and in noise was improved significantly in the aided condition with every program when compared to the unaided condition. However, no significant differences were found between any of the eight programs. In contrast, on a subjective level there was a significant preference for medium compression ratios of 1.3 to 1.6 and higher maximum power output.
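The eight programs and a Latin square ordering of them can be sketched in Python as follows. This is only an illustration: the cyclic construction is one simple way to build a Latin square, the labels `"standard"` and `"reduced"` for the two maximum power output settings are hypothetical, and the abstract does not say which square was used or how its rows were assigned to the twelve participants.

```python
from itertools import product

COMPRESSION_RATIOS = [1.0, 1.3, 1.6, 2.0]   # values given in the abstract
MPO_SETTINGS = ["standard", "reduced"]      # hypothetical labels

# 4 compression ratios x 2 maximum power output settings = 8 programs.
programs = list(product(COMPRESSION_RATIOS, MPO_SETTINGS))


def cyclic_latin_square(n):
    """Build an n x n Latin square: each index occurs once per row and column."""
    return [[(row + col) % n for col in range(n)] for row in range(n)]


square = cyclic_latin_square(len(programs))

# Each row is one possible test order; each column is one test period.
schedule = [[programs[index] for index in row] for row in square]
for order in schedule:
    print(order)
```

Because every program appears exactly once in each row and each column, order effects are spread evenly across programs.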

Keywords: Bone Anchored Hearing Implant, baha, compression, maximum power output, speech understanding

Procedia PDF Downloads 347
1670 Emotional Analysis for Text Search Queries on Internet

Authors: Gemma García López

Abstract:

The goal of this study is to analyze whether search queries submitted to search engines such as Google can offer emotional information about the user who performs them. Knowing the emotional state of the Internet user can be key to achieving maximum personalization of content and to detecting worrying behaviors. To this end, two studies were carried out using tools with advanced natural language processing techniques. The first study determines whether a query can be classified as positive, negative or neutral, while the second study extracts emotional content from words and applies the categorical and dimensional models for the representation of emotions. In addition, we use search queries in Spanish and English to establish similarities and differences between the two languages. The results revealed that text search queries performed by users on the Internet can be classified emotionally. This allows us to better understand the emotional state of the user at the time of the search, which could involve adapting the technology and personalizing the responses to different emotional states.

Keywords: emotion classification, text search queries, emotional analysis, sentiment analysis in text, natural language processing

Procedia PDF Downloads 110
1669 Encryption and Decryption of Nucleic Acid Using Deoxyribonucleic Acid Algorithm

Authors: Iftikhar A. Tayubi, Aabdulrahman Alsubhi, Abdullah Althrwi

Abstract:

The deoxyribonucleic acid text tool provides structural biologists with a single source of high-quality cryptography for deoxyribonucleic acid sequences. We will provide an intuitive, well-organized and user-friendly web interface that allows users to encrypt and decrypt deoxyribonucleic acid sequence text. It secures the sequence by using an algorithm to encrypt and decrypt it. The utility of this deoxyribonucleic acid sequence text tool is that it provides a user-friendly interface for users to encrypt, decrypt and store information about a deoxyribonucleic acid sequence. The interfaces created in this project will satisfy the demands of the scientific community by providing full encryption of deoxyribonucleic acid sequences through this website. We adopted a methodology that uses C# and ASP.NET for programming, which is smart and secure. The deoxyribonucleic acid sequence text tool is well suited to encrypting large quantities of data efficiently. The users can thus navigate from one encoding and store orange text, depending on the field for user’s interest. Algorithm classification allows a user to protect the deoxyribonucleic acid sequence from change, whether an alteration or an error occurred during the transfer of the sequence data. It will check the integrity of the deoxyribonucleic acid sequence data during access.

Keywords: algorithm, ASP.NET, DNA, encrypt, decrypt

Procedia PDF Downloads 195
1668 Forensic Speaker Verification in Noisy Environmental by Enhancing the Speech Signal Using ICA Approach

Authors: Ahmed Kamil Hasan Al-Ali, Bouchra Senadji, Ganesh Naik

Abstract:

We propose a system that is robust to real environmental noise and channel mismatch for forensic speaker verification. The method is based on suppressing various types of real environmental noise by using the independent component analysis (ICA) algorithm. The enhanced speech signal is passed to mel frequency cepstral coefficients (MFCC) or MFCC feature warping to extract the essential characteristics of the speech signal. Channel effects are reduced using an intermediate vector (i-vector) and probabilistic linear discriminant analysis (PLDA) approach for classification. The proposed algorithm is evaluated using an Australian forensic voice comparison database, combined with car, street and home noises from QUT-NOISE at signal to noise ratios (SNR) ranging from -10 dB to 10 dB. Experimental results indicate that MFCC feature warping-ICA achieves a reduction in equal error rate of about 48.22%, 44.66%, and 50.07% over MFCC feature warping alone when the test speech signals are corrupted with random sessions of street, car, and home noises, respectively, at -10 dB SNR.
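Corrupting clean speech with noise at a chosen SNR, as in the evaluation described here, can be sketched in Python using the standard definition SNR(dB) = 10 log10(P_speech / P_noise). The function names and sample values are hypothetical, and this is not claimed to be the authors' mixing procedure.

```python
import math


def mean_power(samples):
    """Average squared amplitude of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Scale noise so that speech + noise has the requested SNR in dB."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # Solve p_speech / (gain**2 * p_noise) = 10 ** (snr_db / 10) for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    mixed = [s + gain * n for s, n in zip(speech, noise)]
    return mixed, gain


# Toy signals (hypothetical values), mixed at the lowest SNR in the abstract.
speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]
mixed, gain = mix_at_snr(speech, noise, snr_db=-10)

scaled_noise = [gain * n for n in noise]
achieved_snr = 10 * math.log10(mean_power(speech) / mean_power(scaled_noise))
print(round(achieved_snr, 6))
```

At -10 dB the scaled noise carries ten times the power of the speech, which is why enhancement before feature extraction matters in this setting.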

Keywords: noisy forensic speaker verification, ICA algorithm, MFCC, MFCC feature warping

Procedia PDF Downloads 376
1667 Speech Recognition Performance by Adults: A Proposal for a Battery for Marathi

Authors: S. B. Rathna Kumar, Pranjali A Ujwane, Panchanan Mohanty

Abstract:

The present study aimed to develop a battery for assessing speech recognition performance by adults in Marathi. A total of four word lists were developed by considering word frequency, word familiarity, words in common use, and phonemic balance. Each word list consists of 25 words (15 monosyllabic words in CVC structure and 10 monosyllabic words in CVCV structure). Equivalence analysis and performance-intensity function testing were carried out using the four word lists on a total of 150 native speakers of Marathi from different regions of Maharashtra (Vidarbha, Marathwada, Khandesh and Northern Maharashtra, Pune, and Konkan). The subjects were divided equally into five groups based on these regions. There was no significant difference (p > 0.05) in speech recognition performance between groups for each word list or between word lists for each group. Hence, the four word lists were equally difficult for all the groups and can be used interchangeably. The performance-intensity (PI) function curve was semi-linear, and the groups’ mean slope of the linear portion of the curve indicated an average increase in word recognition score of 4.64%, 4.73%, 4.68%, and 4.85% per dB for list 1, list 2, list 3 and list 4, respectively. Although no data are available on speech recognition tests for adults in Marathi, most of the findings of the study are in line with research reports on other languages. The four word lists thus developed were found to have sufficient reliability and validity in assessing speech recognition performance by adults in Marathi.
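The slope of the linear portion of a performance-intensity curve, in percent per dB, can be estimated with an ordinary least-squares line fit. The Python sketch below assumes that approach and uses made-up data points; the abstract does not describe how the authors computed their slopes.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical points from the linear portion of a PI curve:
# presentation level in dB and word recognition score in percent.
levels_db = [10, 15, 20, 25]
scores_pct = [12.0, 35.0, 59.0, 82.0]

slope, intercept = fit_line(levels_db, scores_pct)
print(f"slope: {slope:.2f} % per dB")
```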

Keywords: speech recognition performance, phonemic balance, equivalence analysis, performance-intensity function testing, reliability, validity

Procedia PDF Downloads 323
1666 Improved Processing Speed for Text Watermarking Algorithm in Color Images

Authors: Hamza A. Al-Sewadi, Akram N. A. Aldakari

Abstract:

Copyright protection and proof of ownership of digital multimedia are achieved nowadays by digital watermarking techniques. This paper proposes a text watermarking algorithm for protecting the property rights and ownership judgment of color images. Embedding is achieved by inserting text elements randomly into the color image as noise. The YIQ image processing model is found to be faster than other image processing methods, and hence it is adopted for the embedding process. Optionally, the text watermark can be encrypted before embedding (in case some applications require it) using any enciphering technique, which adds more difficulty for hackers. Experiments showed an embedding speed more than double that of the other systems considered (such as the least significant bit method and separate color code methods), and a fairly acceptable level of peak signal to noise ratio (PSNR) with low mean square error values for watermarking purposes.
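The two quality measures named in this abstract follow standard definitions: MSE is the mean squared pixel difference, and PSNR = 10 log10(MAX² / MSE). The Python sketch below assumes 8-bit pixel values (MAX = 255) and uses hypothetical pixel data; it illustrates the metrics only, not the authors' embedding algorithm.

```python
import math


def mean_squared_error(original, marked):
    """Mean squared difference between two equal-length pixel sequences."""
    if len(original) != len(marked) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(original, marked)) / len(original)


def psnr(original, marked, max_value=255):
    """Peak signal to noise ratio in dB; infinite for identical images."""
    mse = mean_squared_error(original, marked)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)


# Flattened pixel values of a tiny image before and after embedding
# (hypothetical data: one pixel changed by 2 grey levels).
original_pixels = [52, 55, 61, 59]
marked_pixels = [52, 55, 61, 61]

mse_value = mean_squared_error(original_pixels, marked_pixels)
psnr_value = psnr(original_pixels, marked_pixels)
print(mse_value, round(psnr_value, 2))
```

A higher PSNR and a lower MSE both indicate that the watermark disturbs the image less.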

Keywords: steganography, watermarking, time complexity measurements, private keys

Procedia PDF Downloads 113
1665 A Comparative Study on Vowel Articulation in Malayalam Speaking Children Using Cochlear Implant

Authors: Deepthy Ann Joy, N. Sreedevi

Abstract:

Hearing impairment (HI) at an early age, identified before the onset of language development can reduce the negative effect on speech and language development of children. Early rehabilitation is very important in the improvement of speech production in children with HI. Other than conventional hearing aids, Cochlear Implants are being used in the rehabilitation of children with HI. However, delay in acquisition of speech and language milestones persist in children with Cochlear Implant (CI). Delay in speech milestones are reflected through speech sound errors. These errors reflect the temporal and spectral characteristics of speech. Hence, acoustical analysis of the speech sounds will provide a better representation of speech production skills in children with CI. The present study aimed at investigating the acoustic characteristics of vowels in Malayalam speaking children with a cochlear implant. The participants of the study consisted of 20 Malayalam speaking children in the age range of four and seven years. The experimental group consisted of 10 children with CI, and the control group consisted of 10 typically developing children. Acoustic analysis was carried out for 5 short (/a/, /i/, /u/, /e/, /o/) and 5 long vowels (/a:/, /i:/, /u:/, /e:/, /o:/) in word-initial position. The responses were recorded and analyzed for acoustic parameters such as Vowel duration, Ratio of the duration of a short and long vowel, Formant frequencies (F₁ and F₂) and Formant Centralization Ratio (FCR) computed using the formula (F₂u+F₂a+F₁i+F₁u)/(F₂i+F₁a). Findings of the present study indicated that the values for vowel duration were higher in experimental group compared to the control group for all the vowels except for /u/. Ratio of duration of short and long vowel was also found to be higher in experimental group compared to control group except for /i/. Further F₁ for all vowels was found to be higher in experimental group with variability noticed in F₂ values. 
FCR was found to be higher in the experimental group, indicating vowel centralization. Further, the results of the independent t-test revealed no significant difference across the parameters between the two groups. It was found that the spectral and temporal measures in children with CI moved towards the normal range. The result emphasizes the significance of early rehabilitation in children with hearing impairment. Rehabilitation-related aspects are also discussed in detail and can be clinically incorporated to improve speech therapy services for children with CI.
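The Formant Centralization Ratio formula quoted in this abstract, (F₂u+F₂a+F₁i+F₁u)/(F₂i+F₁a), translates directly into a short Python sketch. The function name and the formant values below are hypothetical and are not data from the study.

```python
def formant_centralization_ratio(f1, f2):
    """FCR = (F2u + F2a + F1i + F1u) / (F2i + F1a), formants in Hz."""
    numerator = f2["u"] + f2["a"] + f1["i"] + f1["u"]
    denominator = f2["i"] + f1["a"]
    if denominator == 0:
        raise ValueError("F2 of /i/ plus F1 of /a/ must be non-zero")
    return numerator / denominator


# Hypothetical mean formant frequencies (Hz) for the corner vowels.
f1_hz = {"a": 800.0, "i": 300.0, "u": 350.0}
f2_hz = {"a": 1300.0, "i": 2300.0, "u": 900.0}

fcr = formant_centralization_ratio(f1_hz, f2_hz)
print(round(fcr, 3))
```

The numerator collects formants that rise as vowels centralize and the denominator those that fall, so a larger FCR corresponds to a more centralized vowel space.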

Keywords: acoustics, cochlear implant, Malayalam, vowels

Procedia PDF Downloads 108
1664 The Arabic Literary Text, between Proficiency and Pedagogy

Authors: Abdul Rahman M. Chamseddine, Mahmoud El-ashiri

Abstract:

In the field of language teaching, communication skills are essential for the learner to achieve. However, these skills in general might not support the comprehension of some texts of a literary or artistic nature, such as poetry. Understanding sentences and expressions is not enough to understand a poem; other skills are needed in order to understand the special structure of a text whose literary meaning is inapprehensible even when the lingual meaning is well comprehended. There is also a need for many other components that extend beyond one text to other similar texts, which can be understood through solid traditions that do not form an obstacle in the face of change and progress. This is not exclusive to texts that are classified as literary; it also holds for some short everyday phrases and indicatively charged expressions that can be classified as literary or that bear a literary flavor. These can be found in newspaper headlines, TV news reports, and perhaps football commentaries. Understanding this special lingual use, described as literary, is highly important for understanding discourse that would generally be classified as very far from literature. This work explores the role of the literary text in the language class and the way it is covered or dealt with throughout all levels of acquiring proficiency. It also attempts to survey the position of the literary text in some of the most important books for teaching Arabic around the world. In the same way that grammar is needed to understand the language, another (literary) grammar is needed for understanding literature.

Keywords: language teaching, Arabic, literature, pedagogy, language proficiency

Procedia PDF Downloads 238
1663 Semi-Supervised Learning for Spanish Speech Recognition Using Deep Neural Networks

Authors: B. R. Campomanes-Alvarez, P. Quiros, B. Fernandez

Abstract:

Automatic Speech Recognition (ASR) is a machine-based process of decoding and transcribing oral speech. A typical ASR system receives acoustic input from a speaker or an audio file, analyzes it using algorithms, and produces output in the form of text. Some speech recognition systems use Hidden Markov Models (HMMs) to deal with the temporal variability of speech and Gaussian Mixture Models (GMMs) to determine how well each state of each HMM fits a short window of frames of coefficients that represents the acoustic input. Another way to evaluate the fit is to use a feed-forward neural network that takes several frames of coefficients as input and produces posterior probabilities over HMM states as output. Deep neural networks (DNNs) that have many hidden layers and are trained using new methods have been shown to outperform GMMs on a variety of speech recognition systems. Acoustic models for state-of-the-art ASR systems are usually trained on massive amounts of data. However, audio files with their corresponding transcriptions can be difficult to obtain, especially in the Spanish language. Hence, in these low-resource scenarios, building an ASR model is considered a complex task due to the lack of labeled data, which results in an under-trained system. Semi-supervised learning approaches become necessary given the high cost of transcribing audio data. The main goal of this proposal is to develop a procedure based on acoustic semi-supervised learning for Spanish ASR systems by using DNNs. This semi-supervised learning approach consists of: (a) Training a seed ASR model with a DNN using a set of audios and their respective transcriptions. The DNN was initialized with one hidden layer, and the number of hidden layers was increased during training to five. A refinement, which consisted of the weight matrix plus bias term, and Stochastic Gradient Descent (SGD) training were also performed. The objective function was the cross-entropy criterion. (b) Decoding/testing a set of unlabeled data with the obtained seed model. (c) Selecting a suitable subset of the validated data to retrain the seed model, thereby improving its performance on the target test set. To choose the most precise transcriptions, three lattice-based confidence scores or metrics (based on the graph cost, the acoustic cost, and a combination of both) were used as the selection technique. The performance of the ASR system was measured by means of the Word Error Rate (WER). The test dataset was renewed in order to extract the new transcriptions added to the training dataset. Some experiments were carried out in order to select the best ASR results. A comparison between a GMM-based model without retraining and the proposed DNN system was also made under the same conditions. Results showed that the semi-supervised ASR model based on DNNs outperformed the GMM model, in terms of WER, in all tested cases. The best result was a 6% relative improvement in WER. Hence, these promising results suggest that the proposed technique could be suitable for building ASR models in low-resource environments.
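
The evaluation metric named in this abstract, the Word Error Rate, and the notion of a relative WER improvement can be illustrated with a minimal sketch. It assumes the usual definition of WER as the word-level edit distance divided by the number of reference words; the function names, sentences, and WER values are hypothetical and are not taken from the paper.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_improvement(baseline_wer, new_wer):
    """Fraction by which new_wer is lower than baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


# One substitution ("gato" -> "pato") and one deletion ("pescado").
wer = word_error_rate("el gato come pescado", "el pato come")
print(wer)

# Hypothetical WER values chosen to give a 6% relative improvement.
print(round(relative_wer_improvement(0.200, 0.188), 3))
```

A relative improvement compares the two error rates as a ratio, so a 6% relative improvement on a baseline WER of 20% corresponds to an absolute reduction of 1.2 percentage points.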

Keywords: automatic speech recognition, deep neural networks, machine learning, semi-supervised learning

Procedia PDF Downloads 310
1662 Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques

Authors: R. B. Knudsen, O. T. Rasmussen, R. A. Alphinas

Abstract:

The aim of this research is to develop an algorithm that is capable of classifying news articles from the automobile industry according to the competitive actions that they entail, using Text Mining (TM) methods. To do so, it is necessary to test how to preprocess the data properly by preparing the pipelines that best fit each algorithm. The pipelines are tested along with nine different classification algorithms in the realm of regression, support vector machines, and neural networks. Preliminary testing to identify the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines. The two algorithms are Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are optimized further by testing several parameters of each algorithm. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%.
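
The four evaluation metrics reported above can be sketched for a multi-class problem as follows. The abstract does not state how the scores were averaged across classes, so macro-averaging is an assumption here, and the class labels and predictions are hypothetical.

```python
def per_class_scores(y_true, y_pred, label):
    """Precision, recall and F1 for one class, treated as one-vs-rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_scores(y_true, y_pred):
    """Unweighted mean of the per-class precision, recall and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = [per_class_scores(y_true, y_pred, label) for label in labels]
    return tuple(sum(s[k] for s in scores) / len(labels) for k in range(3))


def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical competitive-action labels, for illustration only.
y_true = ["pricing", "pricing", "launch", "launch", "alliance", "alliance"]
y_pred = ["pricing", "launch", "launch", "launch", "alliance", "pricing"]

acc = accuracy(y_true, y_pred)
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
print(round(acc, 3), round(macro_p, 3), round(macro_r, 3), round(macro_f1, 3))
```

Under macro-averaging, the F1 score is the mean of the per-class F1 values rather than the harmonic mean of the averaged precision and recall, so it can fall below both of them, as it does in this toy example.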

Keywords: Artificial Neural network, Competitive dynamics, Logistic Regression, Text classification, Text mining

Procedia PDF Downloads 88
1661 Exploring Pre-Trained Automatic Speech Recognition Model HuBERT for Early Alzheimer’s Disease and Mild Cognitive Impairment Detection in Speech

Authors: Monica Gonzalez Machorro

Abstract:

Dementia is hard to diagnose because of the lack of early physical symptoms. Early dementia recognition is key to improving the living conditions of patients. Speech technology is considered a valuable biomarker for this challenge. Recent works have utilized conventional acoustic features and machine learning methods to detect dementia in speech. BERT-like classifiers have reported the most promising performance. One constraint, nonetheless, is that these studies are based either on human transcripts or on transcripts produced by automatic speech recognition (ASR) systems. The contribution of this research is to explore a method that does not require transcriptions to detect early Alzheimer’s disease (AD) and mild cognitive impairment (MCI). This is achieved by fine-tuning a pre-trained ASR model for the downstream early AD and MCI tasks. To do so, a subset of the thoroughly studied Pitt Corpus is customized. The subset is balanced for class, age, and gender. Data processing also involves cropping the samples into 10-second segments. For comparison purposes, a baseline model is defined by training and testing a Random Forest with 20 acoustic features extracted using the librosa library in Python. These are: zero-crossing rate, MFCCs, spectral bandwidth, spectral centroid, root mean square, and short-time Fourier transform. The baseline model achieved 58% accuracy. To fine-tune HuBERT as a classifier, an average pooling strategy is employed to merge the 3D representations from audio into 2D representations, and a linear layer is added. The pre-trained model used is ‘hubert-large-ls960-ft’. Empirically, the number of epochs selected is 5, and the batch size is 1. Experiments show that the proposed method reaches 69% balanced accuracy. This suggests that the self-supervised ASR-based model, which encodes linguistic and speech information, is able to learn acoustic cues of AD and MCI.
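
The pooling step described above, which merges 3D representations (batch, time, hidden) into 2D representations (batch, hidden) before a linear layer, can be sketched in plain Python. The tiny dimensions, weights, and number of output classes are hypothetical; the real model's hidden size and classifier head are not specified here.

```python
def mean_pool(frames):
    """Average a (time, hidden) sequence over time into one (hidden,) vector."""
    n_frames = len(frames)
    return [sum(column) / n_frames for column in zip(*frames)]


def linear(x, weights, bias):
    """Apply a linear layer: one output per row of weights."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


# Hypothetical encoder output: 2 utterances, hidden size 2,
# with 2 and 3 time frames respectively.
batch = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]],
]

# 3D (batch, time, hidden) -> 2D (batch, hidden)
pooled = [mean_pool(utterance) for utterance in batch]

# Hypothetical linear layer mapping hidden size 2 to 3 class logits.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bias = [0.0, 0.0, 0.0]
logits = [linear(vector, weights, bias) for vector in pooled]

print(pooled)
print(logits)
```

Averaging over the time axis gives every utterance a fixed-size vector regardless of its length, which is what allows a single linear layer to act as the classifier.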

Keywords: automatic speech recognition, early Alzheimer’s recognition, mild cognitive impairment, speech impairment

Procedia PDF Downloads 89
1660 Evaluating 8D Reports Using Text-Mining

Authors: Benjamin Kuester, Bjoern Eilert, Malte Stonis, Ludger Overmeyer

Abstract:

Increasing quality requirements make reliable and effective quality management indispensable. This includes complaint handling, in which the 8D method is widely used. The 8D report, as the written documentation of the 8D method, is one of the key quality documents, as it internally secures the quality standards and acts as a communication medium to the customer. In practice, however, the 8D report is mostly faulty and of poor quality. There is no quality control of 8D reports today. This paper describes the use of natural language processing for the automated evaluation of 8D reports. Based on semantic analysis and text-mining algorithms, the presented system is able to uncover content and formal quality deficiencies and thus increases the quality of complaint processing in the long term.

Keywords: 8D report, complaint management, evaluation system, text-mining

Procedia PDF Downloads 272
1659 Detecting Elderly Abuse in US Nursing Homes Using Machine Learning and Text Analytics

Authors: Minh Huynh, Aaron Heuser, Luke Patterson, Chris Zhang, Mason Miller, Daniel Wang, Sandeep Shetty, Mike Trinh, Abigail Miller, Adaeze Enekwechi, Tenille Daniels, Lu Huynh

Abstract:

Machine learning and text analytics have been used to analyze child abuse, cyberbullying, domestic abuse and domestic violence, and hate speech. However, to the authors’ knowledge, no research to date has used these methods to study elder abuse in nursing homes or skilled nursing facilities from field inspection reports. We used machine learning and text analytics methods to analyze 356,000 inspection reports, which were extracted from CMS Form-2567 field inspections of US nursing homes and skilled nursing facilities between 2016 and 2021. Our algorithm detected occurrences of the various types of abuse, including physical abuse, psychological abuse, verbal abuse, sexual abuse, and passive and active neglect. For example, to detect physical abuse, our algorithm searches for combinations of phrases and words suggesting willful infliction of damage (hitting, pinching or burning, tethering, tying), or consciously ignoring an emergency. To detect occurrences of elder neglect, our algorithm looks for combinations of phrases and words suggesting both passive neglect (neglecting vital needs, allowing malnutrition and dehydration, allowing decubiti, deprivation of information, limitation of freedom, negligence toward safety precautions) and active neglect (intimidation and name-calling, tying the victim up to prevent falls without consent, consciously ignoring an emergency, not calling a physician in spite of indication, stopping important treatments, failure to provide essential care, deprivation of nourishment, leaving a person alone for an inappropriate amount of time, excessive demands in a situation of care). We further compare the prevalence of abuse before and after Covid-19-related restrictions on nursing home visits. We also identified the facilities with the highest number of cases of abuse with no abuse facilities within a 25-mile radius as the most likely candidates for additional inspections. We also built an interactive display to visualize the location of these facilities.
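
The phrase search described above can be illustrated with a minimal keyword-matching sketch. It is not the authors' algorithm, which the abstract does not specify: the term lists are a small selection from the examples given in the abstract, and the category keys, function name, and sample report are hypothetical.

```python
import re

# A small selection of indicator terms quoted in the abstract.
INDICATOR_TERMS = {
    "physical_abuse": ["hitting", "pinching", "burning", "tethering", "tying"],
    "passive_neglect": ["malnutrition", "dehydration", "decubiti"],
    "active_neglect": ["name-calling", "deprivation of nourishment",
                       "stopping important treatments"],
}


def detect_indicators(report_text):
    """Return {category: sorted matched terms} for categories with a match."""
    found = {}
    for category, terms in INDICATOR_TERMS.items():
        matches = []
        for term in terms:
            # Whole-word, case-insensitive match of the literal term.
            pattern = r"\b" + re.escape(term) + r"\b"
            if re.search(pattern, report_text, flags=re.IGNORECASE):
                matches.append(term)
        if matches:
            found[category] = sorted(matches)
    return found


# Hypothetical inspection-report excerpt, for illustration only.
report = ("Staff were observed tying the resident to the bed; "
          "records also noted dehydration.")
result = detect_indicators(report)
print(result)
```

Literal matching of this kind misses inflected forms such as "tied" or "hit" and cannot tell a reported incident from a negated one, which is one reason a production system would need combinations of phrases and further context rather than single keywords.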

Keywords: machine learning, text analytics, elder abuse, elder neglect, nursing home abuse

Procedia PDF Downloads 110
1658 A Semantic Analysis of Modal Verbs in Barak Obama’s 2012 Presidential Campaign Speech

Authors: Kais A. Kadhim

Abstract:

This paper is a semantic analysis of the English modals in Obama’s speeches. The main objective of this study is to analyze selected modal auxiliaries identified in selected speeches from Obama’s campaign, based on Coates’ (1983) semantic clusters. A total of fifteen speeches from Obama’s campaign were selected as the primary data, and the modal auxiliaries selected for analysis include will, would, can, could, should, must, ought, shall, may and might. All the modal auxiliaries taken from the speeches of Barack Obama were analyzed based on the framework of Coates’ semantic clusters. This analytical framework was applied to examine how modal auxiliaries are used to persuade people in Obama’s campaign speeches. The findings reveal that modals of intention, prediction, and futurity and modals of possibility, ability, and permission are the most frequently used in Obama’s campaign speeches.

Keywords: modals, meaning, persuasion, speech

Procedia PDF Downloads 372
1657 Cross-Cultural Pragmatics: Apology Strategies by Libyans

Authors: Ahmed Elgadri

Abstract:

In the last thirty years, studies on cross-cultural pragmatics in general and apology strategies in particular have focused on Western and East Asian societies. Little research has investigated the production of speech acts by speakers of Arabic dialects. Therefore, this study investigated the apology strategies used by Libyan Arabic speakers using an online Discourse Completion Task (DCT) questionnaire. The DCT consisted of six situations covering different social contexts. The survey was written in the Libyan Arabic dialect to help generate vernacular speech as much as possible. The participants were 25 Libyan nationals, 12 females and 13 males. To gain a deeper understanding of the motivation behind the use of certain strategies, the researcher also interviewed four participants in the Libyan Arabic dialect. The results revealed a high use of IFID, offer of repair, and explanation. Although this might support the universality claim of speech act strategies, it was clear that cultural norms and religion significantly determined the choice of apology strategies. This led to the discovery of new culture-specific strategies, as outlined later in this paper. This study gives an insight into politeness strategies in Libyan society, and it is hoped that it will contribute to the field of cross-cultural pragmatics.

Keywords: apologies, cross-cultural pragmatics, language and culture, Libyan Arabic, politeness, pragmatics, socio-pragmatics, speech acts

Procedia PDF Downloads 115
1656 Enframing the Smart City: Utilizing Heidegger's 'The Question Concerning Technology' as a Framework to Interpret Smart Urbanism

Authors: Will Brown

Abstract:

Martin Heidegger is considered to be one of the leading philosophical lights of the 20th century, and his lecture/essay 'The Question Concerning Technology' has proved to be an invaluable text in the study of technology and in understanding how technology influences the world it is set upon. However, this text has not yet been applied to the rapid rise and proliferation of ‘smart’ cities. This article applies the aforementioned text to the smart city in order to provide a fresh, if not critical, analysis and interpretation of this phenomenon. The first section provides a brief literature review of smart urbanism in order to lay the groundwork necessary to apply Heidegger’s work to the smart city, from which a framework is developed to interpret the infusion of digital sensing technologies into the urban milieu. This framework comprises four concepts put forward in Heidegger’s text: circumscribing, bringing-forth, challenging, and standing-reserve. A concluding chapter is based upon the notion of enframement, arguing that once the rubric of data collection is placed within the urban system, future systems will require the capability to harvest data, resulting in an ever-renewing smart city.

Keywords: air quality sensing, big data, Martin Heidegger, smart city

Procedia PDF Downloads 170
1655 Polycode Texts in Communication of Antisocial Groups: Functional and Pragmatic Aspects

Authors: Ivan Potapov

Abstract:

Background: The aim of this paper is to investigate polycode texts in the communication of youth antisocial groups. Nowadays, the notion of a text has numerous interpretations. Among all the approaches to defining a text, we must take into account the semiotic and cultural-semiotic ones. Rapidly developing IT, globalization, and new ways of coding information increase the role of the cultural-semiotic approach. However, the development of computer technologies also leads to changes in the text itself. Polycode texts play an increasingly important role in the everyday communication of the younger generation. Therefore, research into the functional and pragmatic aspects of both verbal and non-verbal content is important. Methods and Material: For this survey, we applied a combination of four methods of text investigation: intention analysis, content analysis, semantic analysis, and syntactic analysis. Using these methods provided us with information on general text properties, the content of transmitted messages, and each communicant’s intentions. In addition, during our research we established the social background, which allowed us to distinguish intertextual connections between certain types of polycode texts. As sources of research material, we used 20 public channels in the popular messenger Telegram and data extracted from smartphones that belonged to arrested members of antisocial groups. Findings: This investigation lets us assert that polycode texts can be characterized as highly intertextual language units. Moreover, we could outline a classification of these texts based on the communicants’ intentions. The most common types of antisocial polycode texts are calls to illegal actions and agitation. What is more, each type has its own semantic core, which depends on the sphere of communication. However, the syntactic structure is universal for most of the polycode texts.
Conclusion: Polycode texts play an important role in online communication. The results of this investigation demonstrate that, in some social groups, the use of these texts has a destructive influence on the younger generation and clearly requires further research.

Keywords: text, polycode text, internet linguistics, text analysis, context, semiotics, sociolinguistics

Procedia PDF Downloads 98
1654 Robust Features for Impulsive Noisy Speech Recognition Using Relative Spectral Analysis

Authors: Hajer Rahali, Zied Hajaiej, Noureddine Ellouze

Abstract:

The goal of speech parameterization is to extract the relevant information about what is being spoken from the audio signal. In speech recognition systems, Mel-Frequency Cepstral Coefficients (MFCC) and Relative Spectral Mel-Frequency Cepstral Coefficients (RASTA-MFCC) are the two main techniques used. This paper presents some modifications to the original MFCC method. In our work, the effectiveness of the proposed changes to MFCC, called Modified Function Cepstral Coefficients (MODFCC), was tested and compared against the original MFCC and RASTA-MFCC features. Prosodic features such as jitter and shimmer are added to the baseline spectral features. The above-mentioned techniques were tested with impulsive signals under various noisy conditions within the AURORA databases.
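
Jitter and shimmer, the prosodic features mentioned above, can be sketched using their common "local" definitions: the mean absolute difference between consecutive pitch periods (or peak amplitudes) divided by the mean period (or amplitude). The abstract does not say which variant the authors used, so this choice is an assumption, and the measurements below are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to their mean."""
    if len(values) < 2:
        raise ValueError("at least two values are required")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


# Hypothetical cycle-to-cycle measurements of a voiced segment.
periods_ms = [5.0, 5.1, 4.9, 5.0]   # pitch periods in milliseconds
amplitudes = [1.0, 0.9, 1.1, 1.0]   # peak amplitude of each cycle

jitter = local_perturbation(periods_ms)    # cycle-to-cycle period variation
shimmer = local_perturbation(amplitudes)   # cycle-to-cycle amplitude variation
print(round(jitter, 4), round(shimmer, 4))
```

Both values are dimensionless ratios, so they can be appended to a frame-level or utterance-level spectral feature vector without unit conversion.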

Keywords: auditory filter, impulsive noise, MFCC, prosodic features, RASTA filter

Procedia PDF Downloads 391
1653 A New Dual Forward Affine Projection Adaptive Algorithm for Speech Enhancement in Airplane Cockpits

Authors: Djendi Mohmaed

Abstract:

In this paper, we propose a dual adaptive algorithm based on the combination of the forward blind source separation (FBSS) structure and the affine projection algorithm (APA). The proposed algorithm combines the source separation properties of the FBSS structure with the fast convergence characteristics of the APA. It needs two noisy observations to provide an enhanced speech signal. This process is done in a blind manner, without the need for any a priori information about the source signals. The proposed dual forward blind source separation affine projection algorithm is denoted DFAPA and is used for the first time in an airplane cockpit context to enhance communication to and from the airplane. Intensive experiments were carried out to evaluate the performance of the proposed DFAPA algorithm.

Keywords: adaptive algorithm, speech enhancement, system mismatch, SNR

Procedia PDF Downloads 105