Search results for: automatic spontaneous speech analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 27992

27752 Identifying Missing Component in the Bechdel Test Using Principal Component Analysis Method

Authors: Raghav Lakhotia, Chandra Kanth Nagesh, Krishna Madgula

Abstract:

Much has been said about the rationale and significance of the Bechdel score. It became a digital sensation in 2013, when Swedish cinemas began to display a film's Bechdel test score alongside its rating. The test has drawn criticism from experts and the film fraternity regarding its use to rate the female presence in a movie. Critics believe that the score is too simplistic and that the underlying criteria for passing the test, namely that a film must include 1) at least two women, 2) who have at least one dialogue, 3) about something other than a man, are badly flawed. In this research, we consider additional parameters that highlight how women are represented in film, such as the number of female dialogues in a movie, the dialogue genre, and the part-of-speech tags in the dialogue. These parameters are missing from the existing criteria used to calculate the Bechdel score. The research analyzes 342 movie scripts to test the hypothesis that these extra parameters, together with the current Bechdel criteria, are significant in calculating a female representation score. The results of the Principal Component Analysis indicate that female dialogue content is a key component and should be considered when measuring the representation of women in a work of fiction.

Keywords: Bechdel test, dialogue genre, parts of speech tags, principal component analysis
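
The principal component analysis step mentioned in this abstract can be illustrated with a minimal sketch. The feature columns and the synthetic data below are invented for illustration only; they are not the authors' dataset or their exact procedure, and the helper name `pca` is hypothetical.

```python
import numpy as np

def pca(features, n_components=2):
    """Return (principal axes, explained-variance ratios) via SVD."""
    X = np.asarray(features, dtype=float)
    # Standardize each column so features on different scales are comparable.
    X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of vt are the principal axes; singular values give the variances.
    _, s, vt = np.linalg.svd(X, full_matrices=False)
    variance = s ** 2 / (X.shape[0] - 1)
    ratio = variance / variance.sum()
    return vt[:n_components], ratio[:n_components]

# Hypothetical per-film features (synthetic, for illustration only):
# [female dialogue count, share of female lines, noun share in female lines]
rng = np.random.default_rng(0)
films = rng.normal(size=(50, 3))
films[:, 1] = 0.8 * films[:, 0] + rng.normal(scale=0.3, size=50)

components, ratio = pca(films, n_components=2)
print(components.shape, ratio)
```

Large absolute loadings in the first row of `components` would indicate which features dominate the leading component, which is the kind of evidence the abstract appeals to.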

Procedia PDF Downloads 99
27751 Automatic Verification Technology of Virtual Machine Software Patch on IaaS Cloud

Authors: Yoji Yamato

Abstract:

In this paper, we propose an automatic verification technology for software patches in user virtual environments on IaaS clouds, with the aim of decreasing patch verification costs. IaaS services have become widespread, and many users can customize virtual machines on an IaaS cloud as if they were their own private servers. Users must apply and verify patches for the OS or middleware installed on their virtual machines themselves, which increases their operating costs. Our proposed method replicates user virtual environments, extracts verification test cases for those environments from a test case DB, distributes patches to virtual machines in the replicated environments, and runs the test cases automatically there. We have implemented the proposed method on OpenStack using Jenkins and confirmed its feasibility. Using the implementation, we confirmed that our proposed 2-tier abstraction of software functions and test cases effectively reduces test case creation effort. We also evaluated the automatic verification performance of environment replication, test case extraction, and test case execution.

Keywords: OpenStack, cloud computing, automatic verification, jenkins

Procedia PDF Downloads 455
27750 Automatic Classification for the Degree of Disc Narrowing from X-Ray Images Using CNN

Authors: Kwangmin Joo

Abstract:

A method for automatic detection and classification of lumbar vertebrae is proposed for evaluating the degree of disc narrowing. Prior to classification, deep learning based segmentation is applied to detect individual lumbar vertebrae. M-net is applied to segment the five lumbar vertebrae, and fine-tuning is employed to improve segmentation accuracy. Using the features extracted in the previous step, k-means clustering is applied to estimate the degree of disc space narrowing under a four-grade scoring system. As a preliminary study, the techniques proposed in this research could help build an automatic scoring system to diagnose the severity of disc narrowing from X-ray images.

Keywords: Disc space narrowing, Degenerative disc disorders, Deep learning based segmentation, Clustering technique
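
The clustering step can be sketched as plain k-means with four clusters, one per grade. The one-dimensional "disc height ratio" feature and its values are hypothetical stand-ins; the abstract does not say which features are extracted from the segmentation, so this is an illustration of the technique rather than the author's pipeline.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm; returns (labels, centers)."""
    X = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)]

    def assign(c):
        # Squared Euclidean distance from every point to every center.
        d = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        # Move each center to the mean of its points; keep empty clusters in place.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(centers), centers

# Hypothetical feature: one disc height ratio per disc (synthetic values).
rng = np.random.default_rng(1)
ratios = np.concatenate([
    rng.normal(loc=m, scale=0.02, size=10) for m in (0.9, 0.7, 0.5, 0.3)
]).reshape(-1, 1)

labels, centers = kmeans(ratios, k=4)
print(np.sort(centers.ravel()))
```

Because k-means labels are arbitrary, mapping clusters to grades would need an extra step, for example ordering the clusters by their center value.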

Procedia PDF Downloads 95
27749 Capnography for Detection of Return of Spontaneous Circulation Pseudo-Pea

Authors: Yiyuan David Hu, Alex Lindqwister, Samuel B. Klein, Karen Moodie, Norman A. Paradis

Abstract:

Introduction: Pseudo-Pulseless Electrical Activity (p-PEA) is a lifeless form of profound cardiac shock characterized by measurable cardiac mechanical activity without clinically detectable pulses. Patients in pseudo-PEA carry different prognoses than those in true PEA and may require different therapies. End-tidal carbon dioxide (ET-CO2) is a reliable indicator of the return of spontaneous circulation (ROSC) in ventricular fibrillation and true PEA but has not been studied in p-PEA. Hypothesis: ET-CO2 can be used as an independent indicator of ROSC in p-PEA resuscitation. Methods: 30 kg female swine (N = 14) under intravenous anesthesia were instrumented with aortic and right atrial micromanometer pressure. ECG and ET-CO2 were measured continuously. p-PEA was induced by ventilation with 6% oxygen in 94% nitrogen and was defined as a systolic Ao less than 40 mmHg. The statistical relationships between ET-CO2 and ROSC are reported. Results: ET-CO2 during resuscitation strongly correlated with ROSC (Figure 1). Mean ET-CO2 during p-PEA was 28.4 ± 8.4, while mean ET-CO2 in ROSC was 42.2 ± 12.6 for the 100% O2 cohort (p < 0.0001) and 33.0 ± 15.4 for the 100% O2 + CPR cohort (p < 0.0001). Analysis of slope was limited to one minute of resuscitation data to capture local linearity; assessment began 10 seconds after resuscitation started to allow the ventilator to mix 100% O2. Pigs that recovered with 100% O2 had a slope of 0.023 ± 0.001, those that recovered with oxygen + CPR had a slope of 0.018 ± 0.002, and those that recovered with oxygen + CPR + epinephrine had a slope of 0.0050 ± 0.0009. Conclusions: During resuscitation from porcine hypoxic p-PEA, a rise in ET-CO2 is indicative of ROSC.

Keywords: ET-CO2, resuscitation, capnography, pseudo-PEA
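
The slope analysis described in the Results can be sketched as a least-squares line fit over a fixed window. The sketch assumes the one-minute window starts at the 10-second mark, which the abstract does not state explicitly; the function name, the sampling rate, and the synthetic trace are all hypothetical, and the units of the slope are left unspecified because the abstract does not give them.

```python
import numpy as np

def etco2_slope(t_seconds, etco2, start=10.0, duration=60.0):
    """Least-squares slope of ET-CO2 over [start, start + duration] seconds."""
    t = np.asarray(t_seconds, dtype=float)
    y = np.asarray(etco2, dtype=float)
    # Keep only the samples inside the analysis window.
    mask = (t >= start) & (t <= start + duration)
    slope, _intercept = np.polyfit(t[mask], y[mask], deg=1)
    return slope

# Synthetic trace sampled once per second (illustration only).
t = np.arange(0.0, 120.0, 1.0)
trace = 28.0 + 0.02 * t
print(etco2_slope(t, trace))
```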

Procedia PDF Downloads 161
27748 Cross-Cultural Pragmatics: Apology Strategies by Libyans

Authors: Ahmed Elgadri

Abstract:

In the last thirty years, studies on cross-cultural pragmatics in general, and apology strategies in particular, have focused on Western and East Asian societies. Only a small volume of research has investigated speech act production by speakers of Arabic dialects. Therefore, this study investigated the apology strategies used by Libyan Arabic speakers using an online Discourse Completion Task (DCT) questionnaire. The DCT consisted of six situations covering different social contexts. The survey was written in the Libyan Arabic dialect to elicit speech that was as vernacular as possible. The participants were 25 Libyan nationals, 12 females and 13 males. In addition, to gain a deeper understanding of the motivation behind the use of certain strategies, the researcher interviewed four participants, also in the Libyan Arabic dialect. The results revealed a high use of IFID, offer of repair, and explanation. Although this might support the universality claim of speech act strategies, it was clear that cultural norms and religion significantly determined the choice of apology strategies. This led to the discovery of new culture-specific strategies, as outlined later in this paper. This study gives an insight into politeness strategies in Libyan society, and it is hoped that it will contribute to the field of cross-cultural pragmatics.

Keywords: apologies, cross-cultural pragmatics, language and culture, Libyan Arabic, politeness, pragmatics, socio-pragmatics, speech acts

Procedia PDF Downloads 123
27747 The Influence of Neural Synchrony on Auditory Middle Latency and Late Latency Responses and Its Correlation with Audiological Profile in Individuals with Auditory Neuropathy

Authors: P. Renjitha, P. Hari Prakash

Abstract:

Auditory neuropathy spectrum disorder (ANSD) is an auditory disorder with normal cochlear outer hair cell function and disrupted auditory nerve function. It results in unique clinical characteristics: absent auditory brainstem response (ABR), absent acoustic reflex, and the presence of otoacoustic emissions (OAE) and cochlear microphonics. The lesion site could be the cochlear inner hair cells, the synapse between the inner hair cells and type I auditory nerve fibers, and/or the auditory nerve itself. However, the literature on synchrony in the higher auditory system is sparse, and the topic is poorly understood. It would be interesting to see whether neural synchrony recovers at higher auditory centers, and whether the level at which the auditory system recovers enough synchrony to produce observable evoked response potentials (ERPs) can predict speech perception. In the current study, eight ANSD participants and healthy controls underwent detailed audiological assessment including ABR, auditory middle latency response (AMLR), and auditory late latency response (ALLR). AMLR was recorded for clicks, and ALLR was evoked using 500 Hz and 2 kHz tone bursts. Analysis revealed that the participants could be categorized into three groups: Group I (2/8), where ALLR was present only for the 2 kHz tone burst; Group II (4/8), where AMLR was absent and ALLR was seen for both stimuli; and Group III (2/8), which consisted of individuals with identifiable AMLR and ALLR for all stimuli. The highest speech identification score observed in the ANSD group was 30%, and hence the group was considered to have poor speech perception. Overall, the test results indicate that the site of neural synchrony recovery could vary across individuals with ANSD. Some individuals show recovery of neural synchrony at the thalamocortical level, while others show it only at the cortical level. Within ALLR itself there could be variation across stimuli, which again could be related to neural synchrony.
Nevertheless, none of these patterns could explain the speech perception ability of the individuals. Hence, it could be concluded that neural synchrony as measured by evoked potentials may not be a good clinical predictor of speech perception.

Keywords: auditory late latency response, auditory middle latency response, auditory neuropathy spectrum disorder, correlation with speech identification score

Procedia PDF Downloads 117
27746 Intertextuality in Choreography: Investigation of Text and Movements in Making Choreography

Authors: Muhammad Fairul Azreen Mohd Zahid

Abstract:

Speech, text, and movement intensify aspects of creating choreography by connecting with emotional entanglements, tradition, literature, and other texts. This research focuses on practice as research, prioritising the choreographic process as an approach to inquiry. Within this context, the study intervenes in critical conjunctions of choreographic theory, bringing together new reflections on the moving body, spaces of action, and the intertextuality between text and movement in making choreography. Throughout the process, the researcher will introduce a level of deliberation from speech through movement and text to express emotion within the narrative context of an “illocutionary act.” This practice as research will produce a different meaning in moving from the “utterance text” to “utterance movements” from the perspective of J. L. Austin's speech act theory, based on fragmented text from the “pidato adat” that has been used as the opening speech in Randai. Jacques Derrida's theory of deconstruction will also give the text a different meaning. The process of creating the choreography will also help to lay out the basic normative structure implicit in the “constative” (statement text/movement) and the “performative” (command text/movement). Through this process, the researcher will also look at the methods of using text in two works, Joseph Gonzales's “Becoming King-The Pakyung Revisited” and Crystal Pite's “The Statement,” as references for producing different methods of making choreography. A semiotic perspective will support reading occurrences within dance discourses as texts. The method used in this research is qualitative and includes an interview and a simulation of the concept to obtain an outcome.

Keywords: intertextuality, choreography, speech act, performative, deconstruction

Procedia PDF Downloads 65
27745 Referring to Jordanian Female Relatives in Public

Authors: Ibrahim Darwish, Noora Abu Ain

Abstract:

Referring to female relatives by male Jordanian speakers in public is governed by various linguistic and social constraints. Although Jordanian society is less conservative than it was a few decades ago, women are still considered the weaker link in society and men still believe that they need to protect them. Conservative Jordanians often avoid referring to their female relatives overtly, i.e., using their real names. Instead, they use covert names, such as pseudonyms, nicknames, pet names, etc. The reason behind such language use has to do with how Arab men, in general, see women as part of their honor. This study investigates the extent to which Jordanian males hide their female relatives’ names in public domains. The data were collected from spontaneous, informal voice-recorded interviews carried out in the village of Saham in the far north of Jordan. Saham’s dialect is part of a larger Horani dialect used by speakers across a wide area that stretches from Salt in the south to the Syrian border in the north of Jordan. The interviews were originally carried out in 2013 as an audio record of some customs and traditions in the village of Saham. During most of these interviews, the researchers observed how the male participants referred to their female relatives indirectly. Instead of using real names, the male speakers used broad terms to refer to their female relatives, such as al-Beit ‘the home,’ al-ciyaal ‘the kids,’ um-x ‘the mother of x,’ etc. All tokens related to the issue in question were collected, analyzed, and quantified across three age cohorts: young, middle-aged, and old speakers. The results show that young speakers are more direct in referring to their female relatives than the other two age groups. This can point to a possible change in progress in the speech community of Saham.
It is argued that, due to contact with urban speech communities, the young speakers in Saham do not feel the need to hide the real names of their female relatives, as they consider them equals. Indeed, the young generation is more open to the idea of women's rights and calls for expanding Jordanian women’s roles in Jordanian society.

Keywords: gender differences, Horan, proper names, social constraints

Procedia PDF Downloads 96
27744 A New Dual Forward Affine Projection Adaptive Algorithm for Speech Enhancement in Airplane Cockpits

Authors: Djendi Mohmaed

Abstract:

In this paper, we propose a dual adaptive algorithm based on the combination of the forward blind source separation (FBSS) structure and the affine projection algorithm (APA). The proposed algorithm combines the source separation properties of the FBSS structure with the fast convergence of the APA. It needs two noisy observations to provide an enhanced speech signal. This process is done in a blind manner, without the need for any a priori information about the source signals. The proposed dual forward blind source separation affine projection algorithm is denoted DFAPA and is used for the first time in an airplane cockpit context to enhance communication to and from the airplane. Intensive experiments were carried out to evaluate the performance of the proposed DFAPA algorithm.

Keywords: adaptive algorithm, speech enhancement, system mismatch, SNR
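
For readers unfamiliar with the affine projection algorithm, the sketch below shows the standard regularized APA update in a single-channel system identification setting. It is not the two-channel DFAPA proposed in the abstract, whose structure is not described here; the filter length, projection order, step size, and test signal are all arbitrary illustrative choices.

```python
import numpy as np

def apa_identify(x, d, filter_len=8, order=4, mu=0.5, delta=1e-6):
    """Estimate an FIR filter w such that d[n] ~ sum_k w[k] * x[n - k]."""
    w = np.zeros(filter_len)
    for n in range(filter_len + order - 2, len(x)):
        # Columns of X are the `order` most recent input vectors,
        # each ordered as [x[m], x[m-1], ..., x[m-filter_len+1]].
        X = np.column_stack([
            x[n - p - filter_len + 1 : n - p + 1][::-1] for p in range(order)
        ])
        # A priori errors for the same `order` time instants.
        e = d[n - np.arange(order)] - X.T @ w
        # Regularized affine projection update.
        w += mu * X @ np.linalg.solve(X.T @ X + delta * np.eye(order), e)
    return w

# Synthetic, noise-free identification problem (illustration only).
rng = np.random.default_rng(0)
h_true = rng.normal(size=8)
x = rng.normal(size=2000)
d = np.convolve(x, h_true)[: len(x)]

w_est = apa_identify(x, d)
print(np.max(np.abs(w_est - h_true)))
```

Reusing several past input vectors per update is what gives the APA faster convergence than NLMS on correlated inputs such as speech, at the cost of solving a small linear system per sample.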

Procedia PDF Downloads 108
27743 Increasing the Forecasting Fidelity of Current Collection System Operating Capability by Means of Contact Pressure Simulation Modelling

Authors: Anton Golubkov, Gleb Ermachkov, Aleksandr Smerdin, Oleg Sidorov, Victor Philippov

Abstract:

Current collection quality is one of the limiting factors in increasing train speeds in the rail sector. As speed grows, the impact forces on the current collector from the rolling stock and the aerodynamic influence increase, which leads to a spread in contact pressure values, separation of the current collector head from the contact wire, contact arcing, and excessive wear of the contact elements. An emerging trend in resolving this issue is the use of automatic control systems that stabilize the contact pressure. The present paper considers the features of contemporary automatic control systems for current collector pressure and states their major disadvantages. A scheme for automatic control of current collector pressure is proposed, distinguished by its proactive influence on undesirable effects. A mathematical model of contact strip wear is presented, obtained in accordance with a central composite rotatable design. The obtained dependencies are analyzed. The procedures for determining the optimal current collector pressure on the contact wire and the pressure control principle in the pneumatic drive are described.

Keywords: contact strip, current collector, high-speed running, program control, wear

Procedia PDF Downloads 116
27742 Searching Linguistic Synonyms through Parts of Speech Tagging

Authors: Faiza Hussain, Usman Qamar

Abstract:

Synonym-based searching is recognized to be a complicated problem, as mining text from unstructured web data is challenging. Finding useful information that matches the user's need in a large volume of web pages is a cumbersome task. In this paper, a novel and practical synonym retrieval technique is proposed to address this problem. When replacing terms with synonyms, user intent is taken into consideration. Parts-of-speech tagging is applied to generate a pattern for the query, and a thesaurus was built and used for this experiment. Compared with non-context-based searching, context-based searching proved to be a more efficient approach when dealing with linguistic semantics. This approach is very beneficial for intent-based searching. Finally, results and future directions are presented.

Keywords: natural language processing, text mining, information retrieval, parts-of-speech tagging, grammar, semantics

Procedia PDF Downloads 279
27741 Hindi Speech Synthesis by Concatenation of Recognized Hand Written Devnagri Script Using Support Vector Machines Classifier

Authors: Saurabh Farkya, Govinda Surampudi

Abstract:

Optical Character Recognition is one of the current major research areas. This paper focuses on the recognition of Devanagari script and its sound generation. The paper consists of two parts: first, optical character recognition of handwritten Devanagari script; second, speech synthesis of the recognized text. The paper presents an implementation of support vector machines for Devanagari script recognition. The support vector machine was trained with multi-domain features: transform domain features and spatial (structural) domain features. The transform domain includes the wavelet feature of the character. The structural domain consists of the distance profile feature and the gradient feature. The text document was segmented at three levels: line segmentation, word segmentation, and character segmentation. The characters were pre-processed using Otsu's algorithm and various morphological operations: erosion, dilation, filtration, and thinning. The algorithm was tested on a self-prepared database, a collection of various handwriting samples. Further, Unicode was used to convert the recognized Devanagari text into a computer-readable document. The document so obtained is an array of codes, which was used to generate digitized text and to synthesize Hindi speech. Phonemes from the self-prepared database were used to generate the speech of the scanned document using a concatenation technique.

Keywords: Character Recognition (OCR), Text to Speech (TTS), Support Vector Machines (SVM), Library of Support Vector Machines (LIBSVM)

Procedia PDF Downloads 468
27740 Thoughts Regarding Interprofessional Work between Nurses and Speech-Language-Hearing Therapists in Cancer Rehabilitation: An Approach for Dysphagia

Authors: Akemi Nasu, Keiko Matsumoto

Abstract:

Rehabilitation for cancer requires setting individual goals for each patient and, in practice, an approach that properly fits the stage of the cancer. In order to cope with daily changes in the patients' condition, a good cooperative relationship between nurses and physiotherapists, occupational therapists, and speech-language-hearing therapists (therapists) is essential. This study focuses on the present state of cooperation between nurses and therapists, especially speech-language-hearing therapists, and aims to elucidate what develops there. A semi-structured interview was conducted with a therapist who has practical experience of working in collaboration with nurses. The contents of the interview were transcribed and converted to data, and the data were encoded and categorized with sequentially increasing degrees of abstraction to conduct a qualitative explorative factor analysis. When providing ethical explanations, particular care was taken to ensure that participants would not be subjected to any disadvantages as a result of participating in the study. They were also informed that their privacy would be ensured, that they had the right to decline to participate in the study, and that the results of the study would be announced publicly at an applicable nursing academic conference. This study has been approved following application to the ethical committee of the university with which the researchers are affiliated. The survey participant is a female speech-language-hearing therapist in her forties.
As a result of the analysis, six categories were extracted: 'measures to address appetite and aspiration pneumonia prevention', 'limitation of the care a therapist alone could provide', 'the all-inclusive patient-supportive care provided by nurses', 'expand the beneficial cooperation with nurses', 'providing education for nurses on the swallowing function utilizing videofluoroscopic examination of swallowing', and 'enhancement of communication including conferences'. Mutual support is essential for improving team performance and for the teamwork competency necessary to provide safer care. As for the cooperation between nurses and therapists, this survey indicates that more mature cooperation between professionals, aimed at improving nursing professionals' knowledge and enhancing communication, will lead to an improvement in the quality of rehabilitation for cancer.

Keywords: cancer rehabilitation, nurses, speech-language-hearing therapists, interprofessional work

Procedia PDF Downloads 111
27739 The Automatic Transliteration Model of Images of the Book Hamong Tani Using Statistical Approach

Authors: Agustinus Rudatyo Himamunanto, Anastasia Rita Widiarti

Abstract:

Transliteration of Javanese manuscripts is one method of preserving the literary wealth of the past and passing it on to the present generation in Indonesia. Manual transliteration commonly requires philologists and takes a relatively long time. An automatic transliteration process is expected to shorten this time and so support the work of philologists. The preprocessing and segmentation stage is carried out first to prepare the document images, yielding script image units that are free from noise and similar in thickness, size, and slope. The next stage, feature extraction, finds unique characteristics that distinguish each Javanese script image. One of the characteristics used in this research is the number of black pixels in each image unit. Each Javanese script image in the training data undergoes the same process as the input characters. The system was tested with data from the book Hamong Tani, which was selected for its content, age, and number of pages; these were considered sufficient for an experimental model input. Based on the results of automatic transliteration tests on randomly chosen pages, the maximum percentage of correct results obtained was 81.53%. This success rate was obtained with a 32x32 pixel input image size and a 5x5 image window. With regard to these results, it can be concluded that the proposed automatic transliteration model is relatively good.

Keywords: Javanese script, character recognition, statistical, automatic transliteration

Procedia PDF Downloads 316
27738 A Simple Adaptive Atomic Decomposition Voice Activity Detector Implemented by Matching Pursuit

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

A simple adaptive voice activity detector (VAD) is implemented using Gabor and gammatone atomic decomposition of speech for high Gaussian noise environments. Matching pursuit is used for atomic decomposition and is shown to achieve optimal speech detection capability at high data compression rates for low signal-to-noise ratios. The most active dictionary elements found by matching pursuit are used for the signal reconstruction, so that the algorithm adapts to the individual speaker's dominant time-frequency characteristics. Speech has a high peak-to-average ratio, which enables the greedy matching pursuit heuristic of selecting the highest inner products to isolate high-energy speech components in high-noise environments. Gabor and gammatone atoms are both investigated with identical logarithmically spaced center frequencies and similar bandwidths. The algorithm performs equally well for both Gabor and gammatone atoms, with no statistically significant differences. The algorithm achieves 70% accuracy at a 0 dB SNR, 90% accuracy at a 5 dB SNR, and 98% accuracy at a 20 dB SNR, using 30 dB SNR as a reference for voice activity.

Keywords: atomic decomposition, gabor, gammatone, matching pursuit, voice activity detection
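
The greedy selection that the abstract describes can be sketched as a basic matching pursuit over a small Gabor dictionary. The dictionary layout (centers, frequencies, width) and the test signal are arbitrary illustrative choices, not the authors' logarithmically spaced design, and the sketch covers only the decomposition, not the voice activity decision.

```python
import numpy as np

def gabor_atom(n, center, freq, width):
    """Unit-norm Gaussian-windowed cosine; freq is in cycles per sample."""
    t = np.arange(n)
    atom = np.exp(-0.5 * ((t - center) / width) ** 2)
    atom = atom * np.cos(2 * np.pi * freq * (t - center))
    return atom / np.linalg.norm(atom)

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy decomposition; dictionary rows must be unit-norm atoms."""
    residual = np.asarray(signal, dtype=float).copy()
    picks = []
    for _ in range(n_atoms):
        # Inner product of the residual with every atom.
        products = dictionary @ residual
        best = int(np.argmax(np.abs(products)))
        picks.append((best, float(products[best])))
        # Remove the selected atom's contribution from the residual.
        residual -= products[best] * dictionary[best]
    return picks, residual

n = 256
dictionary = np.array([
    gabor_atom(n, center, freq, width=8.0)
    for center in range(0, n, 32)
    for freq in (0.05, 0.1, 0.2)
])

# Synthetic signal: one strong atom buried in Gaussian noise.
rng = np.random.default_rng(0)
signal = 3.0 * dictionary[5] + rng.normal(scale=0.1, size=n)

picks, residual = matching_pursuit(signal, dictionary, n_atoms=3)
print(picks[0], np.linalg.norm(residual))
```

The indices in `picks` play the role of the "most active dictionary elements": a reconstruction from only those atoms keeps the high-energy components and discards most of the noise.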

Procedia PDF Downloads 266
27737 Earphone Style Wearable Device for Automatic Guidance Service with Position Sensing

Authors: Dawei Cai

Abstract:

This paper describes the design of an earphone-style wearable device that can provide an automatic guidance service for visitors. With both position and orientation information obtained from NFC and a terrestrial magnetism sensor, a high-level automatic guide service can be realized. To realize the service, we developed an algorithm for position detection using packets from NFC tags and an algorithm for calculating the device orientation from the data of MEMS acceleration and terrestrial magnetism sensors. If a visitor wants an explanation of an exhibit in front of them, all they have to do is move to the object and stand there for a moment. The identification program automatically recognizes this status from the NFC and MEMS information and starts playing explanatory content about the exhibit. This service should be useful for improving visitors' understanding of the exhibition items and should provide a more satisfying visiting experience with less burden.

Keywords: wearable device, MEMS sensor, ubiquitous computing, NFC

Procedia PDF Downloads 218
27736 Efficient Subsurface Mapping: Automatic Integration of Ground Penetrating Radar with Geographic Information Systems

Authors: Rauf R. Hussein, Devon M. Ramey

Abstract:

Integrating Ground Penetrating Radar (GPR) with Geographic Information Systems (GIS) can provide valuable insights for various applications, such as archaeology, transportation, and utility locating. Although there has been progress toward automating the integration of GPR data with GIS, fully automatic integration has not yet been achieved. Additionally, manually integrating GPR data with GIS can be a time-consuming and error-prone process. In this study, real-world GPR applications are presented, and a software tool named GPR-GIS 10 is created to interactively extract subsurface targets from GPR radargrams and automatically integrate them into GIS. With this software, it is possible to quickly and reliably integrate the two techniques to create informative subsurface maps. The results indicated that automatic integration of GPR with GIS can be an efficient tool for mapping and viewing subsurface targets in their appropriate location in 3D space with the needed precision. The findings of this study could help GPR-GIS integrators save time and reduce errors in many GPR-GIS applications.

Keywords: GPR, GIS, GPR-GIS 10, drone technology, automation

Procedia PDF Downloads 56
27735 The Role of Media Relations in the Brand Image: Case Study in Three Brands of the Automobile Industry

Authors: Rosa Sobreira, Paula Arriscado

Abstract:

Marketers are aware that media relations is an important, and also cheaper, touch point for bringing their products and brands to the consumer. They recognize the role of journalists as moderators and transformers of public opinion, and they realize their influence on brand image. They also know that readers, listeners, viewers, and internet users "believe" more of what they read, hear, and see in the news than in an advertisement. The study focuses on the automotive industry and analyses the news published about three brands that share industrial facilities and components. We wanted to understand the role of the information created by the brand's media team in the journalists’ work, and its impact on the management, activation, and differentiation of brands and their products' attributes and benefits. Based on a qualitative methodology, the analysis focused on press news, comparing the media coverage and its “narratives” about the three cars from different brands. The results point to the fact that journalists readily integrate the brands' discourse about their products. In this study, we found that, apart from describing the many similarities between the three cars, the media discourse also "struggled" to reveal the attributes that differentiate them. This interpretation of the results helps us to understand the "marriage" between branding and media. We also believe this paper helps us to understand how journalists, through news, join the discourse of the brands.

Keywords: brand management, media relations, differentiation, positioning

Procedia PDF Downloads 198
27734 Bit Error Rate Monitoring for Automatic Bias Control of Quadrature Amplitude Modulators

Authors: Naji Ali Albakay, Abdulrahman Alothaim, Isa Barshushi

Abstract:

The most common quadrature amplitude modulator (QAM) uses two Mach-Zehnder Modulators (MZMs) and one phase shifter to generate high-order modulation formats. The bias of an MZM changes over time due to temperature, vibration, and aging. The change in bias distorts the generated QAM signal, which degrades bit error rate (BER) performance. Therefore, it is critical to be able to lock the MZM’s Q point to the required operating point for good performance. We propose a technique for automatic bias control (ABC) of a QAM transmitter using BER measurements and a gradient descent optimization algorithm. The proposed technique is attractive because it uses the pertinent metric, BER, and compensates for bias drift independently of other system variations such as laser source output power. The performance and operating principles of the proposed scheme are simulated using OptiSystem simulation software for 4-QAM and 16-QAM transmitters.

Keywords: automatic bias control, optical fiber communication, optical modulation, optical devices

Procedia PDF Downloads 160
27733 DEEPMOTILE: Motility Analysis of Human Spermatozoa Using Deep Learning in Sri Lankan Population

Authors: Chamika Chiran Perera, Dananjaya Perera, Chirath Dasanayake, Banuka Athuraliya

Abstract:

Male infertility is a major problem in the world, and it is a neglected and sensitive health issue in Sri Lanka. It can be determined by analyzing human semen samples. Sperm motility is one of many factors used to evaluate a male's fertility potential. In Sri Lanka, this analysis is performed manually. Manual methods are time-consuming and operator-dependent, although they are reliable when performed by an expert. Machine learning and deep learning technologies are currently being investigated to automate spermatozoa motility analysis, but these methods are still unreliable. These automatic methods tend to produce false positive results and false detections. Current automatic methods support different techniques, and some of them are very expensive. Due to the geographical variance in spermatozoa characteristics, current automatic methods are not reliable for motility analysis in Sri Lanka. The suggested system, DeepMotile, explores a method to analyze the motility of human spermatozoa automatically and present it to andrology laboratories to overcome current issues. DeepMotile is a novel deep learning method for analyzing spermatozoa motility parameters in the Sri Lankan population. To implement the proposed approach, Sri Lankan patient data were collected anonymously as a dataset, and glass slides were used as a low-cost technique to analyze semen samples. The problem was framed as microscopic object detection and tracking. YOLOv5 was customized and used as the object detector, and it achieved 94% mAP (mean average precision), 86% precision, and 90% recall with the gathered dataset. StrongSORT was used as the object tracker, and it was validated with andrology experts due to the unavailability of annotated ground truth data. Furthermore, this research has identified many potential avenues for further investigation, and andrology experts can use this system to analyze motility parameters with realistic accuracy.
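For readers unfamiliar with the detection metrics quoted above, the following is a minimal sketch of how precision and recall are computed from detection counts. The counts in the usage line are hypothetical and are not taken from the study; the function name is an illustrative assumption.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    detected = true_pos + false_pos      # everything the detector reported
    actual = true_pos + false_neg        # everything that was really there
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall

# Hypothetical counts for illustration only.
p, r = precision_recall(true_pos=45, false_pos=5, false_neg=10)
```

Precision falls when the detector reports objects that are not spermatozoa (false positives); recall falls when it misses real ones (false negatives).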

Keywords: computer vision, deep learning, convolutional neural networks, multi-target tracking, microscopic object detection and tracking, male infertility detection, motility analysis of human spermatozoa

Procedia PDF Downloads 69
27732 Comparing Deep Architectures for Selecting Optimal Machine Translation

Authors: Despoina Mouratidis, Katia Lida Kermanidis

Abstract:

Machine translation (MT) is a very important task in Natural Language Processing (NLP). MT evaluation is crucial in MT development, as it constitutes the means to assess the success of an MT system and also helps improve its performance. Several methods have been proposed for the evaluation of MT systems. Some of the most popular ones in automatic MT evaluation are score-based, such as the BLEU score, and others are based on lexical or syntactic similarity between the MT outputs and the reference, involving higher-level information like part-of-speech (POS) tagging. This paper presents a language-independent machine learning framework for classifying pairwise translations. This framework uses vector representations of two machine-produced translations, one from a statistical machine translation (SMT) model and one from a neural machine translation (NMT) model. The vector representations consist of automatically extracted word embeddings and string-like language-independent features. These vector representations are used as input to a multi-layer neural network (NN) that models the similarity between each MT output and the reference, as well as between the two MT outputs. To evaluate the proposed approach, a professional translation and a "ground-truth" annotation are used. The parallel corpora used are English-Greek (EN-GR) and English-Italian (EN-IT), in the educational domain and of informal genres (video lecture subtitles, course forum text, etc.) that are difficult to translate reliably. Three basic deep learning (DL) architectures were tested in this schema: (i) fully-connected dense, (ii) Convolutional Neural Network (CNN), and (iii) Long Short-Term Memory (LSTM). Experiments show that all tested architectures achieved better results than some well-known basic approaches, such as Random Forest (RF) and Support Vector Machine (SVM). Better accuracy results are obtained when LSTM layers are used in our schema. In terms of balance between the results, better results are obtained when dense layers are used, because the model then correctly classifies more sentences of the minority class (SMT). For a more integrated analysis of the accuracy results, a qualitative linguistic analysis is carried out. In this context, problems have been identified with some figures of speech, such as metaphors, and with certain etymology-related linguistic phenomena, such as paronyms. It is quite interesting to find out why all the classifiers led to worse accuracy results in Italian as compared to Greek, taking into account that the linguistic features employed are language independent.

Keywords: machine learning, machine translation evaluation, neural network architecture, pairwise classification

Procedia PDF Downloads 103
27731 Unsupervised Part-of-Speech Tagging for Amharic Using K-Means Clustering

Authors: Zelalem Fantahun

Abstract:

Part-of-speech tagging is the process of assigning a part-of-speech or other lexical class marker to each word in naturally occurring text. Part-of-speech tagging is the most fundamental and basic task in almost all natural language processing. In natural language processing, the problem of providing a large amount of manually annotated data is a knowledge acquisition bottleneck. Since Amharic is an under-resourced language, the availability of a tagged corpus is the bottleneck for natural language processing, especially for POS tagging. A promising direction to tackle this problem is to provide a system that does not require manually tagged data. In unsupervised learning, the learner is not provided with classifications. Unsupervised algorithms seek out similarity between pieces of data in order to determine whether they can be characterized as forming a group. This paper describes the development of an unsupervised part-of-speech tagger using K-Means clustering for the Amharic language, since a large amount of data is produced in day-to-day activities. In the development of the tagger, the following procedure is followed. First, the unlabeled data (raw text) is divided into 10 folds and the tokenization phase takes place; at this level, the raw text is chunked at sentence level and then into words. The second phase is feature extraction, which includes word frequency and the syntactic and morphological features of a word. The third phase is clustering. Among different clustering algorithms, K-means is selected and implemented in this study to bring groups of similar words together. The fourth phase is mapping, in which each cluster is inspected carefully and the most common tag is assigned to the group. This study identifies two features that are capable of distinguishing one part-of-speech from others, namely morphological features and positional information, and shows that it is possible to use unsupervised learning for Amharic POS tagging. In order to increase the performance of the unsupervised part-of-speech tagger, there is a need to incorporate other features that are not included in this study, such as semantic information. Finally, based on the experimental results, the system achieves a maximum accuracy of 81%.
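The clustering and mapping phases described above can be sketched as follows. This is a minimal illustration, not the authors' tagger: the two-dimensional feature vectors, the tag labels and the function names are hypothetical, and the k-means initialisation (first k points) is chosen only to keep the example deterministic.

```python
from collections import Counter

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; centroids start at the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def map_clusters_to_tags(labels, sample_tags):
    """Mapping phase: give each cluster the most common tag among inspected words."""
    votes = {}
    for lab, tag in zip(labels, sample_tags):
        if tag is not None:
            votes.setdefault(lab, Counter())[tag] += 1
    return {lab: counter.most_common(1)[0][0] for lab, counter in votes.items()}

# Hypothetical features per word: (suffix indicator, relative position in sentence).
features = [(1.0, 0.9), (0.0, 0.1), (0.9, 0.8), (0.1, 0.2), (1.0, 1.0), (0.0, 0.0)]
# Tags for the few words a human inspected; None means not inspected.
inspected = ["V", "N", "V", None, None, "N"]

labels = kmeans(features, k=2)
cluster_tags = map_clusters_to_tags(labels, inspected)
predicted = [cluster_tags[lab] for lab in labels]
```

Only a handful of words per cluster need a human-assigned tag, which is what makes the approach attractive when no tagged corpus is available.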

Keywords: POS tagging, Amharic, unsupervised learning, k-means

Procedia PDF Downloads 414
27730 Clinical and Radiological Outcome in 300 Patients with Non-Aneurysmal SAH

Authors: Ranjith Menon, Abathar Aladi, Hans-Christean Nahser, Maneesh Bhojak, Sacha Nevin, Paul Eldridge

Abstract:

Background: Spontaneous subarachnoid haemorrhage (SAH) accounts for approximately 5% of all strokes. Patients with spontaneous SAH (as shown by CT or lumbar puncture) undergo investigations to identify or exclude an underlying structural cause, typically cerebral aneurysm. However, in 10 - 20% of cases, no structural cause is found. This includes more than one imaging modality (intracranial MRA, CTA, 4DCTA and/or DSA) and, in some, spinal MRI. Objective: To determine: 1) If an underlying structural or vascular cause can be identified in non-aneurysmal SAH patients by comparing different imaging modalities at presentation and at follow-up. 2) If MRI spine in patients with non-aneurysmal SAH reveals an underlying SAH cause. 3) The functional outcome at discharge. Results: We performed a retrospective analysis of all non-traumatic SAH patients admitted to the Walton Centre from January 2009 to December 2015. There were 1457 patients with non-traumatic SAH admitted to the Walton Centre, of whom 21.8% (n=300) were diagnosed with non-aneurysmal SAH. Males were 65.6% and females were 43.3%. The presenting symptoms were sudden onset headache (93.6%), focal neurological deficit (12%), loss of consciousness (10.6%) and others (6%). About 285 patients received 2 modalities of imaging (CTA & DSA), 192 received 3 modalities of imaging (CTA, MRA & DSA) and 137 received MRI spine (51/137 whole spine). The modified Rankin Scores at discharge were: mRS 0 = 292 (97.33%), mRS 1-2 = 6, mRS 6 = 1 (cardiac arrest in IHD patient) and unknown in 1. Follow-up imaging at 3 to 6 months in 190 (63.3%) patients did not identify an underlying cause. Conclusion: This retrospective analysis concludes that non-aneurysmal SAH has a good functional outcome. A single imaging modality (CTA (4DCTA) or MRA or DSA) was adequate to exclude an underlying cause of SAH, and delayed imaging failed to identify a cause. Routinely performing MRI spine in this group of patients appears not to be necessary according to this evidence.

Keywords: stroke, non-aneurysmal subarachnoid haemorrhage, neuroimaging, modified rankin score

Procedia PDF Downloads 228
27729 Detection of Phoneme [S] Mispronunciation for Sigmatism Diagnosis in Adults

Authors: Michal Krecichwost, Zauzanna Miodonska, Pawel Badura

Abstract:

The diagnosis of sigmatism is mostly based on the observation of articulatory organs. It is, however, not always possible to precisely observe the vocal apparatus, in particular in the oral cavity of the patient. Speech processing can make the therapy more objective and simplify the verification of its progress. In the described study, a methodology for the classification of incorrectly pronounced phoneme [s] is proposed. The recordings come from adults. They were recorded with a speech recorder at a sampling rate of 44.1 kHz and a resolution of 16 bits. A database of pathological and normative speech has been collected for the study, including reference assessments provided by speech therapy experts. Ten adult subjects were asked to simulate a certain type of sigmatism under the supervision of a speech therapy expert. In the recordings, the analyzed phone [s] was surrounded by vowels, viz: ASA, ESE, ISI, SPA, USU, YSY. Thirteen MFCC (mel-frequency cepstral coefficients) and RMS (root mean square) values are calculated within each frame that is part of the analyzed phoneme. Additionally, 3 fricative formants along with corresponding amplitudes are determined for the entire segment. In order to aggregate the information within the segment, the average value of each MFCC coefficient is calculated. All features of other types are aggregated by means of their 75th percentile. The proposed method of feature aggregation reduces the size of the feature vector used in the classification. A binary SVM (support vector machine) classifier is employed at the phoneme recognition stage. The first class consists of pathological phones, the second of normative ones. The proposed feature vector yields classification sensitivity and specificity above the 90% level for individual logo phones. The use of fricative formant-based information improves the MFCC-only classification results by an average of 5 percentage points. The study shows that the use of specific parameters for the selected phones improves the efficiency of pathology detection compared with traditional methods of speech signal parameterization.
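The feature aggregation step described above (mean of each MFCC coefficient, 75th percentile of the other per-frame features) can be sketched as follows. This is a minimal illustration under stated assumptions: the frame values are hypothetical, only three coefficients are used instead of the thirteen in the study, and the percentile uses linear interpolation between sorted samples, which is one of several common conventions and may differ from the one the authors used.

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between sorted samples."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    low = int(rank)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)

def aggregate_segment(mfcc_frames, rms_frames):
    """Collapse per-frame features into one fixed-length vector per segment."""
    n_coeffs = len(mfcc_frames[0])
    # Mean of each MFCC coefficient across all frames of the segment.
    mfcc_means = [
        sum(frame[i] for frame in mfcc_frames) / len(mfcc_frames)
        for i in range(n_coeffs)
    ]
    # 75th percentile of the per-frame RMS values.
    return mfcc_means + [percentile(rms_frames, 75)]

# Hypothetical segment: 4 frames, 3 coefficients each.
mfcc = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]
rms = [0.1, 0.2, 0.3, 0.4]
vector = aggregate_segment(mfcc, rms)
```

The resulting vector has the same length regardless of how many frames the segment contains, which is what allows segments of different durations to be fed to a single classifier.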

Keywords: computer-aided pronunciation evaluation, sibilants, sigmatism diagnosis, speech processing

Procedia PDF Downloads 256
27728 Leadership Effectiveness Compared among Three Cultures Using Voice Pitches

Authors: Asena Biber, Ates Gul Ergun, Seda Bulut

Abstract:

Based on the literature, a large number of studies have investigated the relationship between culture and leadership effectiveness. Although giving effective speeches is a vital characteristic for a leader to be perceived as effective, to our knowledge, no research has studied the determinants of perceived effective leader speech. The aim of this study is to find the effects of both culture and voice pitch on perceptions of the effectiveness of a leader's speech. Our hypothesis is that people from high power distance countries will perceive a leader's speech as effective when the leader's voice pitch is high, compared with people from relatively low power distance countries. The participants of the study were 36 undergraduate students (12 Pakistanis, 12 Nigerians, and 12 Turks) who are studying in Turkey. In national power distance scores, Nigerians ranked first, Turks second, and Pakistanis third. There are two independent variables in this study: nationality, with three groups representing three levels of power distance, and the leader's voice pitch, which is manipulated at high and low levels. The researchers prepared an audio recording to manipulate the high and low voice pitch conditions. A professional whose native language is English read the predetermined speech in the high and low voice pitch conditions. Voice pitch was measured in Hertz (Hz) and decibels (dB). Each nationality group (Pakistan, Nigeria, and Turkey) was divided into groups of six students who listened to either the low or the high pitch condition in the cubicles of the laboratory. Participants were expected to listen to the audio and fill in a questionnaire that measured leadership effectiveness on a response scale ranging from 1 to 5. To determine the effects of nationality and voice pitch on the perceived effectiveness of the leader's voice pitch, a 3 (Pakistani, Nigerian, and Turk) x 2 (low voice pitch and high voice pitch) two-way between-subjects analysis of variance was carried out. The results indicated that there was no significant main effect of voice pitch and no significant interaction effect on the perceived effectiveness of the leader's voice pitch. However, there was a significant main effect of nationality on the perceived effectiveness of the leader's voice pitch. Based on the results of Tukey's HSD post-hoc test, only the difference in perceived effectiveness of the leader's speech between Pakistanis and Nigerians was statistically significant. The results show that the hypothesis of this study was not supported. As a limitation of the study, it is important to mention that the sample size should be bigger. Also, in further studies the questionnaire and speech should be in the participants' native language.

Keywords: culture, leadership effectiveness, power distance, voice pitch

Procedia PDF Downloads 157
27727 Speech and Language Therapists’ Advice for Multilingual Children with Developmental Language Disorders

Authors: Rudinë Fetahaj, Flaka Isufi, Kristina Hansson

Abstract:

While evidence shows that multilingualism is rising in most European countries, unfortunately, the focus of Speech and Language Therapy (SLT) is still monolingualism. Furthermore, there is sparse information on how the needs of multilingual children with language disorders such as Developmental Language Disorder (DLD) are being met and which factors affect the intervention approach of SLTs when treating DLD. This study aims to examine the relationship between the number of languages SLTs speak, their years of experience, and their length of education and the advice they give to parents of multilingual children with DLD regarding which language should be spoken. This is a cross-sectional study in which a survey was completed online by 2608 SLTs across Europe; the data come from a 2017 COST-action project. IBM-SPSS-28 was used to perform descriptive analysis, correlation, and the Kruskal-Wallis test. SLTs mainly advise the parents of multilingual children with DLD to speak their native language at home. Besides years of experience, language status and the level of education showed no association with the type of advice SLTs give. Results showed a non-significant moderate positive correlation between SLTs' years of experience and their advice regarding the native language, whereas language status and length of education showed no correlation with the advice SLTs give to parents.

Keywords: quantitative study, developmental language disorders, multilingualism, speech and language therapy, children, European context

Procedia PDF Downloads 55
27726 Recognition of Noisy Words Using the Time Delay Neural Networks Approach

Authors: Khenfer-Koummich Fatima, Mesbahi Larbi, Hendel Fatiha

Abstract:

This paper presents a recognition system for isolated words such as robot commands. It is implemented with Time Delay Neural Networks (TDNN). The aim is to teleoperate a robot for specific tasks such as turn, close, etc. in an industrial environment, taking into account the noise coming from the machine. The choice of TDNN is based on its generalization in terms of accuracy; moreover, it acts as a filter that passes certain desirable frequency characteristics of speech. The goal is to determine the parameters of this filter so as to make the system adaptable to the variability of the speech signal and especially to noise; for this, the backpropagation technique was used in the learning phase. The approach was applied to commands pronounced in two languages separately: French and Arabic. The results for two test bases of 300 spoken words each are 87% and 97.6% in a neutral environment, and 77.67% and 92.67% when white Gaussian noise was added with an SNR of 35 dB.
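The noisy test condition mentioned above, white Gaussian noise at a given signal-to-noise ratio, can be reproduced with a short sketch. The 440 Hz test tone, the 16 kHz sampling rate and the function names are illustrative assumptions; the abstract does not describe how the authors generated their noise.

```python
import math
import random

def add_white_gaussian_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise scaled so the result has the requested SNR."""
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10 * log10(signal_power / noise_power), solved for noise_power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]

def measured_snr_db(clean, noisy):
    """Empirical SNR of the noisy signal relative to the clean one."""
    signal_power = sum(x * x for x in clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10 * math.log10(signal_power / noise_power)

rng = random.Random(0)  # fixed seed so the example is repeatable
# Assumed test signal: one second of a 440 Hz tone sampled at 16 kHz.
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noisy = add_white_gaussian_noise(clean, snr_db=35, rng=rng)
```

Because the noise is random, the measured SNR of any one realisation is close to, but not exactly, the requested value.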

Keywords: TDNN, neural networks, noise, speech recognition

Procedia PDF Downloads 252
27725 Spontaneous Eruption of Impacted Teeth While Awaiting Surgical Intervention

Authors: Alison Ryan, Himani Chhabra, Mohammed Dungarwalla, Judith Jones

Abstract:

Background: Impacted and ectopic teeth present in 1-2% of orthodontic patients and often require joint surgical and orthodontic management. The authors present two patients undergoing orthodontic treatment, where the impacted teeth, in a hopeless position, spontaneously erupted during the period of cessation of general anaesthetic lists during the COVID-19 pandemic. Patient information: A healthy 11-year-old boy was referred to the Department of Oral and Maxillofacial Surgery for the management of a mesioangular impacted LR7. The patient was seen by the joint oral surgery/orthodontic team, who planned for the removal of the LR7 under general anaesthetic. A healthy 13-year-old boy was referred to the same Department and team for surgical extraction of unerupted and buccally impacted UL3 and UR3 under general anaesthetic. Management and outcome: The majority of elective dentoalveolar work ceased as a result of the global pandemic. On resumption of activity, the first patient was reviewed in July 2021. The LR7 had spontaneously erupted in a favourable position, and following MDT review, a decision was made to forgo any further surgical intervention. The second patient was reviewed in July 2021. The UL3 had clinically erupted, and there was radiographic evidence of favourable movement of UR3. Due to the nature of the patient’s malocclusion, the decision was made to proceed with the extractions as previously planned. Key Learning Points: Severely impacted teeth do have a prospect of spontaneous eruption or alignment without clinical intervention, and current literature states that the initial location, axial inclination, degree of root formation, and relation of the impacted tooth to adjacent teeth roots may influence spontaneous eruption. There is potential to introduce a period of observation to account for this possibility in the developing dentition, with the aim of reducing unnecessary surgical intervention. This could help prevent episodes of general anaesthetic and allocate theatre space more appropriately.

Keywords: spontaneous eruption, impaction, observation, hopeless position, surgical, orthodontic, change in treatment plan

Procedia PDF Downloads 50
27724 Grammatical Interference in Russian-Spanish Bilingualism

Authors: Olga A. Gnatyuk

Abstract:

The article is devoted to the phenomenon of interference that occurs in the case of Russian-Spanish language contact. The definition of the term, its levels, and the prerequisites for the occurrence of interference are considered. Interference, which is an essential part of bilingualism, may become apparent at different linguistic levels. Interference is especially evident in oral speech. The article reviews some examples of grammatical interference in the Russian-Spanish bilingualism of Russian immigrants living in Spain. The research revealed some cases of mother-tongue interference in the speech of Russian-speaking learners of Spanish. Special attention is paid to such key spheres of grammatical interference as articles, personal pronouns, and the gender and number of nouns. In the research, the omission of a link-verb, as well as its usage in an incorrect form, is observed in Russian immigrants’ speech. It is concluded that, in Spanish, interference errors appear as a consequence of both the absence in Russian of certain phenomena and categories of Spanish and the discrepancy between the linguistic systems of the two languages.

Keywords: bilingualism, interference, grammatical interference, Russian language, Spanish language

Procedia PDF Downloads 133
27723 Comparison Study of Machine Learning Classifiers for Speech Emotion Recognition

Authors: Aishwarya Ravindra Fursule, Shruti Kshirsagar

Abstract:

At the intersection of artificial intelligence and human-centered computing, this paper delves into speech emotion recognition (SER). It presents a comparative analysis of machine learning models such as K-Nearest Neighbors (KNN), logistic regression, support vector machines (SVM), decision trees, ensemble classifiers, and random forests, applied to SER. The research employs four datasets: Crema D, SAVEE, TESS, and RAVDESS. It focuses on extracting salient audio signal features like Zero Crossing Rate (ZCR), Chroma_stft, Mel Frequency Cepstral Coefficients (MFCC), root mean square (RMS) value, and Mel spectrogram. These features are used to train and evaluate the models’ ability to recognize eight types of emotions from speech: happy, sad, neutral, angry, calm, disgust, fear, and surprise. Among the models, the Random Forest algorithm demonstrated superior performance, achieving approximately 79% accuracy. This suggests its suitability for SER within the parameters of this study. The research contributes to SER by showcasing the effectiveness of various machine learning algorithms and feature extraction techniques. The findings hold promise for the development of more precise emotion recognition systems in the future. This abstract provides a succinct overview of the paper’s content, methods, and results.
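Two of the simpler features named above, zero crossing rate and RMS, can be computed per frame as in the following sketch. The definitions are the textbook ones and are an assumption here; audio libraries may differ in details such as framing, padding, and how a sample equal to zero is treated. The frame values are hypothetical.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def rms(frame):
    """Root mean square amplitude of the frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

# Hypothetical frame that changes sign at every sample.
frame = [0.5, -0.5, 0.5, -0.5, 0.5]
zcr_value = zero_crossing_rate(frame)
rms_value = rms(frame)
```

A high zero crossing rate tends to indicate noisy or unvoiced sound, while RMS tracks loudness, which is why both are common inputs to emotion classifiers.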

Keywords: comparison, ML classifiers, KNN, decision tree, SVM, random forest, logistic regression, ensemble classifiers

Procedia PDF Downloads 14