Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 9935

Search results for: voice activity detection

9905 Features Dimensionality Reduction and Multi-Dimensional Voice-Processing Program to Parkinson Disease Discrimination

Authors: Djamila Meghraoui, Bachir Boudraa, Thouraya Meksen, M.Boudraa

Abstract:

Parkinson's disease is a pathology that involves characteristic perturbations in patients’ voices. This paper describes a proposed method that aims to diagnose persons with Parkinson's (PWP) by analyzing their voice signals online. First, thresholds of signal alterations are determined by the Multi-Dimensional Voice Program (MDVP). Principal Component Analysis (PCA) is then used to select the main principal components of the voice that are significantly affected in a patient. The decision phase is carried out by a Multinomial Naive Bayes (MNB) classifier that assigns an analyzed voice to one of two classes: healthy or PWP. The prediction accuracy achieved, 98.8%, is very promising.

Keywords: Parkinson’s disease recognition, PCA, MDVP, multinomial Naive Bayes

Procedia PDF Downloads 278

9904 Patient-Friendly Hand Gesture Recognition Using AI

Authors: K. Prabhu, K. Dinesh, M. Ranjani, M. Suhitha

Abstract:

During the difficult period of COVID, hospitalized people often found it hard to convey what they wanted or needed to their attendants. Sometimes the attendants might not be present. In such cases, patients can use simple hand gestures to control electrical appliances (in this project, a zero-watt bulb) and three other gestures to trigger voice notes. In this AI-based hand recognition project, a NodeMCU is used for the control action of the relay; it is connected to Firebase for storing the value in the cloud and is interfaced with the Python code via a Raspberry Pi. For three hand gestures, a voice clip is added to notify the attendant. This is done with the help of Google’s text-to-speech and the built-in audio file option of the Raspberry Pi 4. All five gestures are detected when shown by hand to the webcam, which is placed for gesture detection. The personal computer is used for displaying the gestures and for running the code in the Raspberry Pi imager.

Keywords: nodeMCU, AI technology, gesture, patient

Procedia PDF Downloads 166

9903 A Comprehensive Methodology for Voice Segmentation of Large Sets of Speech Files Recorded in Naturalistic Environments

Authors: Ana Londral, Burcu Demiray, Marcus Cheetham

Abstract:

Speech recording is a methodology used in many different studies related to cognitive and behaviour research. Modern advances in digital equipment brought the possibility of continuously recording hours of speech in naturalistic environments and building rich sets of sound files. Speech analysis can then extract from these files multiple features for different scopes of research in Language and Communication. However, tools for analysing a large set of sound files and automatically extracting relevant features from these files are often inaccessible to researchers who are not familiar with programming languages. Manual analysis is a common alternative, with a high time and efficiency cost. In the analysis of long sound files, the first step is voice segmentation, i.e. detecting and labelling segments containing speech. We present a comprehensive methodology aiming to support researchers in voice segmentation, as the first step for data analysis of a big set of sound files. Praat, an open source software, is suggested as a tool to run a voice detection algorithm, label segments and files and extract other quantitative features on a structure of folders containing a large number of sound files. We present the validation of our methodology with a set of 5000 sound files that were collected in the daily life of a group of voluntary participants aged over 65. A smartphone device was used to collect sound using the Electronically Activated Recorder (EAR): an app programmed to record 30-second sound samples that were randomly distributed throughout the day. Results demonstrated that automatic segmentation and labelling of files containing speech segments was 74% faster when compared to a manual analysis performed with two independent coders. Furthermore, the methodology presented allows manual adjustments of voiced segments with visualisation of the sound signal and the automatic extraction of quantitative information on speech.
In conclusion, we propose a comprehensive methodology for voice segmentation, to be used by researchers that have to work with large sets of sound files and are not familiar with programming tools.
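The segmentation step described in this abstract can be illustrated with a minimal energy-threshold detector. This is only a sketch under stated assumptions: it is not Praat's algorithm and not the authors' method, and the function names, the 160-sample frame length, the 0.01 energy threshold and the 8 kHz sampling rate are all hypothetical choices made for illustration.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def speech_segments(samples, frame_len=160, threshold=0.01, min_frames=2):
    """Return (start_sample, end_sample) spans whose frame energy exceeds the threshold.

    Runs of active frames shorter than min_frames are discarded.
    """
    energies = frame_energies(samples, frame_len)
    segments = []
    start = None  # index of the first frame of the current active run
    for idx, energy in enumerate(energies):
        active = energy > threshold
        if active and start is None:
            start = idx
        elif not active and start is not None:
            if idx - start >= min_frames:
                segments.append((start * frame_len, idx * frame_len))
            start = None
    # Close a run that is still active at the end of the signal.
    if start is not None and len(energies) - start >= min_frames:
        segments.append((start * frame_len, len(energies) * frame_len))
    return segments


# Synthetic check (assumed 8 kHz sampling): silence, a 200 Hz tone, silence.
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(480)]
signal = [0.0] * 480 + tone + [0.0] * 480
print(speech_segments(signal))  # [(480, 960)]
```

A fixed threshold is the simplest possible choice; recordings from naturalistic environments would likely need a threshold adapted to the background noise level, which this sketch does not attempt.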

Keywords: automatic speech analysis, behavior analysis, naturalistic environments, voice segmentation

Procedia PDF Downloads 281

9902 Comparing Nonverbal Deception Detection of Police Officers and Human Resources Students in the Czech Republic

Authors: Lenka Mynaříková, Hedvika Boukalová

Abstract:

The study looks at the ability to detect nonverbal deception among police officers and management students in the Czech Republic. Respondents from police departments (n=197) and university students of human resources (n=161) completed a deception detection task and evaluated the veracity of the statements of suspects in 21 video clips from real crime investigations. Their evaluations were based on nonverbal behavior. Voices in the video clips were modified so that words were not recognizable, yet paraverbal voice characteristics were preserved. Results suggest that respondents tend toward a lie bias depending on their profession. In the evaluation of video clips, stereotypes also played a significant role. The statements of suspects of a different ethnicity, younger age or specific visual features were considered deceitful more often. The research might be beneficial for training in professions that need deception detection techniques.

Keywords: deception detection, police officers, human resources, forensic psychology, forensic studies, organizational psychology

Procedia PDF Downloads 431

9901 Acoustic Analysis for Comparison and Identification of Normal and Disguised Speech of Individuals

Authors: Surbhi Mathur, J. M. Vyas

Abstract:

Although forensic speaker recognition technology has developed rapidly, many problems remain to be solved. The biggest problem arises when cases involving disguised voice samples are submitted for examination and identification. Such voice samples from anonymous callers are frequently encountered in crimes involving kidnapping, blackmailing, hoax extortion and many more, where the speaker makes a deliberate effort to manipulate their natural voice in order to conceal their identity for fear of being caught. Voice disguise seriously alters the natural vocal parameters of the speaker and thus complicates the process of identification. The sole objective of this doctoral project is to determine whether definite opinions can be rendered in cases involving disguised speech. To this end, it experimentally determines the effects of different disguise forms on personal identification and the percentage rate of speaker recognition for various voice disguise techniques, such as raised pitch, lowered pitch, increased nasality, covering the mouth, constricting the vocal tract and an obstacle in the mouth, by analyzing and comparing the amount of phonetic and acoustic variation between the artificial (disguised) and natural samples of an individual through auditory as well as spectrographic analysis.

Keywords: forensic, speaker recognition, voice, speech, disguise, identification

Procedia PDF Downloads 368

9900 Design and Development of Automatic Onion Harvester

Authors: P. Revathi, T. Mrunalini, K. Padma Priya, P. Ramya, R. Saranya

Abstract:

During the difficult period of COVID, hospitalized people often found it hard to convey what they wanted or needed to their attendants. Sometimes the attendants might not be present. In such cases, patients can use simple hand gestures to control electrical appliances (in this project, a zero-watt bulb) and three other gestures to trigger voice notes. In this AI-based hand recognition project, a NodeMCU is used for the control action of the relay; it is connected to Firebase for storing the value in the cloud and is interfaced with the Python code via a Raspberry Pi. For three hand gestures, a voice clip is added to notify the attendant. This is done with the help of Google’s text-to-speech and the built-in audio file option of the Raspberry Pi 4. All five gestures are detected when shown by hand to a webcam, which is placed for gesture detection. A personal computer is used for displaying the gestures and for running the code in the Raspberry Pi imager.

Keywords: onion harvesting, automatic pluging, camera, raspberry pi

Procedia PDF Downloads 198

9899 Analysis of Vocal Fold Vibrations from High-Speed Digital Images Based on Dynamic Time Warping

Authors: A. I. A. Rahman, Sh-Hussain Salleh, K. Ahmad, K. Anuar

Abstract:

Analysis of vocal fold vibration is essential for understanding the mechanism of voice production and for improving clinical assessment of voice disorders. This paper presents a Dynamic Time Warping (DTW) based approach to analyze and objectively classify vocal fold vibration patterns. The proposed technique was designed and implemented on a Glottal Area Waveform (GAW) extracted from high-speed laryngeal images by delineating the glottal edges for each image frame. Feature extraction from the GAW was performed using Linear Predictive Coding (LPC). Several types of voice reference templates from simulations of clear, breathy, fry, pressed and hyperfunctional voice productions were used. The patterns of the reference templates were first verified using the analytic signal generated through Hilbert transformation of the GAW. Samples from normal speakers’ voice recordings were then used to evaluate and test the effectiveness of this approach. Classification of the voice patterns using LPC and DTW achieved an accuracy of 81%.
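The template-matching idea in this abstract can be sketched with the textbook dynamic-programming form of DTW followed by a nearest-template decision. This is an illustrative sketch, not the authors' implementation: it compares raw one-dimensional sequences rather than LPC features, and the template values and the `classify` helper are hypothetical; only the labels echo the voice types named above.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW cost between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(sample, templates):
    """Return the label of the reference template with the smallest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))


# Hypothetical toy templates standing in for reference waveforms.
templates = {
    "clear": [0.0, 1.0, 0.0, 1.0],
    "breathy": [0.0, 0.5, 0.5, 0.0],
}
print(classify([0.0, 1.0, 1.0, 0.0, 1.0], templates))  # clear
```

Because DTW allows one sequence to be stretched against the other, the sample above matches the "clear" template at zero cost even though it is one element longer.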

Keywords: dynamic time warping, glottal area waveform, linear predictive coding, high-speed laryngeal images, Hilbert transform

Procedia PDF Downloads 239

9898 Environmentally Adaptive Acoustic Echo Suppression for Barge-in Speech Recognition

Authors: Jong Han Joo, Jung Hoon Lee, Young Sun Kim, Jae Young Kang, Seung Ho Choi

Abstract:

In this study, we propose a novel technique for acoustic echo suppression (AES) during speech recognition under barge-in conditions. Conventional AES methods based on spectral subtraction apply fixed weights to the estimated echo path transfer function (EPTF) at the current signal segment and to the EPTF estimated until the previous time interval. We propose a new approach that adaptively updates weight parameters in response to abrupt changes in the acoustic environment due to background noise or double-talk. Furthermore, we devised a voice activity detector and an initial time-delay estimator for barge-in speech recognition in communication networks. The initial time delay is estimated using a log-spectral distance measure as well as cross-correlation coefficients. The experimental results show that the developed techniques can be successfully applied in barge-in speech recognition systems.
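The cross-correlation part of the initial time-delay estimation can be sketched as a search for the lag that maximizes the correlation coefficient between the reference (loudspeaker) signal and the microphone signal. This is a simplified illustration under assumptions: it omits the log-spectral distance measure the abstract also mentions, and the function names, the `max_lag` search range and the synthetic signals are hypothetical.

```python
import math
import random


def normalized_cross_correlation(x, y, lag):
    """Correlation coefficient between x[n] and y[n + lag] over their overlap."""
    pairs = [(x[n], y[n + lag]) for n in range(len(x)) if 0 <= n + lag < len(y)]
    if len(pairs) < 2:
        return 0.0
    mean_x = sum(a for a, _ in pairs) / len(pairs)
    mean_y = sum(b for _, b in pairs) / len(pairs)
    num = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    den = math.sqrt(
        sum((a - mean_x) ** 2 for a, _ in pairs)
        * sum((b - mean_y) ** 2 for _, b in pairs)
    )
    return num / den if den > 0 else 0.0


def estimate_delay(reference, microphone, max_lag):
    """Return the lag in [0, max_lag] with the highest correlation coefficient."""
    return max(
        range(max_lag + 1),
        key=lambda lag: normalized_cross_correlation(reference, microphone, lag),
    )


# Synthetic check: the "echo" is the reference attenuated and delayed by 5 samples.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(200)]
microphone = [0.0] * 5 + [0.6 * x for x in reference]
print(estimate_delay(reference, microphone, max_lag=20))
```

The exhaustive lag search is quadratic in signal length; a practical system would typically compute the correlation in the frequency domain instead.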

Keywords: acoustic echo suppression, barge-in, speech recognition, echo path transfer function, initial delay estimator, voice activity detector

Procedia PDF Downloads 372

9897 Leadership Effectiveness Compared among Three Cultures Using Voice Pitches

Authors: Asena Biber, Ates Gul Ergun, Seda Bulut

Abstract:

Based on the literature, a large number of studies have investigated the relationship between culture and leadership effectiveness. Although giving effective speeches is a vital characteristic for a leader to be perceived as effective, to our knowledge, no research has studied the determinants of perceived effective leader speech. The aim of this study is to find the effects of both culture and voice pitch on perceptions of a leader's speech effectiveness. Our hypothesis is that people from high power distance countries will perceive a leader's speech as more effective when the leader's voice pitch is high, compared with people from relatively low power distance countries. The participants of the study were 36 undergraduate students (12 Pakistanis, 12 Nigerians, and 12 Turks) who are studying in Turkey. In national power distance scores, Nigerians ranked first, Turks second and Pakistanis third. There are two independent variables in this study: three nationality groups representing three levels of power distance, and the voice pitch of the leader, which is manipulated at high and low levels. The researchers prepared an audio recording to manipulate the high and low voice pitch conditions. A professional whose native language is English read the predetermined speech in high and low voice pitch conditions. Voice pitch was measured using Hertz (Hz) and Decibel (dB). Each nationality group (Pakistan, Nigeria, and Turkey) was divided into groups of six students who listened to either the low or the high pitch condition in the cubicles of the laboratory. Participants were expected to listen to the audio and fill in a questionnaire measuring leadership effectiveness on a response scale ranging from 1 to 5. To determine the effects of nationality and voice pitch on the perceived effectiveness of the leader's voice pitch, a 3 (Pakistani, Nigerian, and Turk) x 2 (low voice pitch and high voice pitch) two-way between-subjects analysis of variance was carried out.
The results indicated that there was no significant main effect of voice pitch and no significant interaction effect on perceived effectiveness of the leader’s voice pitch. However, there was a significant main effect of nationality on perceived effectiveness of the leader's voice pitch. Based on the results of Tukey’s HSD post-hoc test, only the difference in perceived effectiveness of the leader's speech between Pakistanis and Nigerians was statistically significant. The results show that the hypothesis of this study was not supported. As a limitation of the study, it is important to mention that the sample size should be larger. Also, the questionnaire and speech should be in the participants’ native languages in further studies.

Keywords: culture, leadership effectiveness, power distance, voice pitch

Procedia PDF Downloads 182

9896 Wolof Voice Response Recognition System: A Deep Learning Model for Wolof Audio Classification

Authors: Krishna Mohan Bathula, Fatou Bintou Loucoubar, FNU Kaleemunnisa, Christelle Scharff, Mark Anthony De Castro

Abstract:

Voice recognition algorithms such as automatic speech recognition and text-to-speech systems for African languages can play an important role in bridging the digital divide of Artificial Intelligence in Africa, contributing to the establishment of a fully inclusive information society. This paper proposes a Deep Learning model that can classify user responses as inputs for an interactive voice response system. A dataset of the Wolof language words ‘yes’ and ‘no’ is collected as audio recordings. A two-stage Data Augmentation approach is adopted to enlarge the dataset to the size required by the deep neural network. Data preprocessing and feature engineering with Mel-Frequency Cepstral Coefficients are implemented. Convolutional Neural Networks (CNNs) have proven to be very powerful in image classification and are promising for audio processing when sounds are transformed into spectra. For voice response classification, the recordings are transformed into sound frequency feature spectra, and an image classification methodology is then applied using a deep CNN model. The inference model of this trained and reusable Wolof voice response recognition system can be integrated with many applications on both web and mobile platforms.

Keywords: automatic speech recognition, interactive voice response, voice response recognition, wolof word classification

Procedia PDF Downloads 116

9895 A Machine Learning Pipeline for Real-Time Activity Detection on Low Computational Power Devices for Metaverse Applications

Authors: Amit Kumar, Amanpreet Chander, Ashish Sahani

Abstract:

This paper presents our recent work on real-time human activity detection based on the MediaPipe pipeline and machine learning algorithms. The proposed system can detect human activities, including running, jumping, squatting, bending to the left or right, and standing still. This is a robust solution for developing yoga, dance, metaverse, and fitness applications that check the correctness of a pose without any additional monitor such as a personal trainer. MediaPipe is an open-source, cross-platform solution that uses a two-step detector-tracker ML pipeline for live detection of key landmarks on the body, which can be used for motion data collection. The prediction of real-time poses uses a variety of machine learning techniques and different types of analysis. Without primarily relying on powerful desktop environments for inference, our method achieves real-time performance on the majority of contemporary mobile phones, desktops/laptops, Python, or even the web. Experimental results show that our method outperforms the existing method in terms of accuracy and real-time capability, achieving an accuracy of 99.92% on testing datasets.

Keywords: human activity detection, media pipe, machine learning, metaverse applications

Procedia PDF Downloads 179

9894 Work with Children's Music Group: Important Aspects of Didactic and Artistic Performance

Authors: Eudjen Cinc

Abstract:

Work with a human voice, especially with a child’s voice, and cultivating the sound of the choir, presents an area of crucial importance for a conductor. We use the term conductor because it needs to be understood that regardless of whether we have in front of us an amateur or a professional choir, whether they are singers with a wealth of experience or children who are still developing and educating their inner ear so that in the future they could contribute to the development of choir music, the person who stands in front of the group and works with them needs to have the characteristics of a conductor. Voice formation is a long-term process, without which there is no success in either solo or collective music performance.

Keywords: music group, conductor, collective, performance

Procedia PDF Downloads 219

9893 Lovely, Lyrical, Lilting: Kubrick’s Translation of Lolita’s Voice

Authors: Taylor La Carriere

Abstract:

“What I had madly possessed was not she, but my own creation, another, fanciful Lolita perhaps, more real than Lolita; overlapping, encasing her and having no will, no consciousness indeed, no life of her own,” Vladimir Nabokov writes in his seminal work, Lolita. Throughout Nabokov’s novel, the eponymous character is rendered nonexistent through unreliable narrator Humbert Humbert’s impenetrable narrative, infused with lyrical rationalization. Instead, Lolita is “safely solipsised,” as Humbert muses, solidifying the potential for the erasure of Lolita’s agency and identity. In this literary work, Lolita’s voice is reduced to a nearly invisible presence, only seen through the eyes of her captor. However, in Stanley Kubrick’s film adaptation of Lolita (1962), the “nymphet,” as Nabokov coins, reemerges with a voice of her own, fueled by a lyric impulse, that displaces Humbert’s first-person narration. The lyric, as defined by Catherine Ing, is the voice of the invisible; it is also characterized by performance, the concentrated utterance of individual emotion, and the appearance of spontaneity. The novel’s lyricism is largely in the service of Humbert’s “seductive” voice, while the film reorients it more to Lolita’s subjectivity. Through a close analysis of Kubrick’s cinematic techniques, this paper examines the emergence and translation of Lolita’s voice in contrast with Humbert’s attempts to silence her in Nabokov’s Lolita, hypothesizing that Kubrick translates Lolita’s presence into a visual and aural voice with lyrical attributes, exemplified through the establishment of an altered power dynamic, Sue Lyon’s transformative performance as the titular character, Nelson Riddle and Bob Harris’ musical score, and the omission of Humbert’s first-person point-of-view.
In doing so, the film reclaims Lolita’s agency by taking instances of Lolita’s voice in the novel as depicted in the last half of the work and expanding upon them in a way only cinematic depictions could allow. The results of this study suggest that Lolita’s voice in Kubrick’s adaptation functions without disrupting the lyricism present in Nabokov’s source text, materializing through the actions, expressions, and performance of Sue Lyon in the film. This voice, fueled by a lyric impulse of its own, refutes the silence bestowed upon the titular character and enables its ultimate reclamation upon the silver screen.

Keywords: cinema, adaptation, Lolita, lyric voice

Procedia PDF Downloads 193

9892 Reconceptualising the Voice of Children in Child Protection

Authors: Sharon Jackson, Lynn Kelly

Abstract:

This paper proposes a conceptual review of the interdisciplinary literature which has theorised the concept of ‘children’s voices’. The primary aim is to identify and consider the theoretical relevance of conceptual thought on ‘children’s voices’ for research and practice in child protection contexts. Attending to the ‘voice of the child’ has become a core principle of social work practice in contemporary child protection contexts. Discourses of voice permeate the legislative, policy and practice frameworks of child protection practices within the UK and internationally. Voice is positioned within a ‘child-centred’ moral imperative to ‘hear the voices’ of children and take their preferences and perspectives into account. This practice is now considered to be central to working in a child-centered way. The genesis of this call to voice is revealed through sociological analysis of twentieth-century child welfare reform as rooted inter alia in intersecting political, social and cultural discourses which have situated children and childhood as sites of state intervention as enshrined in the 1989 United Nations Convention on the Rights of the Child ratified by the UK government in 1991 and more specifically Article 12 of the convention. From a policy and practice perspective, the professional ‘capturing’ of children’s voices has come to saturate child protection practice. This has incited a stream of directives, resources, advisory publications and ‘how-to’ guides which attempt to articulate practice methods to ‘listen’, ‘hear’ and above all – ‘capture’ the ‘voice of the child’. The idiom ‘capturing the voice of the child’ is frequently invoked within the literature to express the requirements of the child-centered practice task to be accomplished.
Despite the centrality of voice, and an obsession with ‘capturing’ voices, evidence from research, inspection processes, serious case reviews, child abuse and death inquiries has consistently highlighted professional neglect of ‘the voice of the child’. Notable research studies have highlighted the relative absence of the child’s voice in social work assessment practices, a troubling lack of meaningful engagement with children and the need to more thoroughly examine communicative practices in child protection contexts. As a consequence, the project of capturing ‘the voice of the child’ has intensified, and there has been an increasing focus on developing methods and professional skills to attend to voice. This has been guided by a recognition that professionals often lack the skills and training to engage with children in age-appropriate ways. We argue however that the problem with ‘capturing’ and [re]representing ‘voice’ in child protection contexts is, more fundamentally, a failure to adequately theorise the concept of ‘voice’ in the ‘voice of the child’. For the most part, ‘The voice of the child’ incorporates psychological conceptions of child development. While these concepts are useful in the context of direct work with children, they fail to consider other strands of sociological thought, which position ‘the voice of the child’ within an agentic paradigm to emphasise the active agency of the child.

Keywords: child-centered, child protection, views of the child, voice of the child

Procedia PDF Downloads 136

9891 The Effect of Speech-Shaped Noise and Speaker’s Voice Quality on First-Grade Children’s Speech Perception and Listening Comprehension

Authors: I. Schiller, D. Morsomme, A. Remacle

Abstract:

Children’s ability to process spoken language develops until the late teenage years. At school, where efficient spoken language processing is key to academic achievement, listening conditions are often unfavorable. High background noise and poor teacher’s voice represent typical sources of interference. It can be assumed that these factors particularly affect primary school children, because their language and literacy skills are still low. While it is generally accepted that background noise and impaired voice impede spoken language processing, there is an increasing need for analyzing impacts within specific linguistic areas. Against this background, the aim of the study was to investigate the effect of speech-shaped noise and imitated dysphonic voice on first-grade primary school children’s speech perception and sentence comprehension. Via headphones, 5 to 6-year-old children, recruited within the French-speaking community of Belgium, listened to and performed a minimal-pair discrimination task and a sentence-picture matching task. Stimuli were randomly presented according to four experimental conditions: (1) normal voice / no noise, (2) normal voice / noise, (3) impaired voice / no noise, and (4) impaired voice / noise. The primary outcome measure was task score. How did performance vary with respect to listening condition? Preliminary results will be presented with respect to speech perception and sentence comprehension and carefully interpreted in the light of past findings. This study helps to support our understanding of children’s language processing skills under adverse conditions. Results shall serve as a starting point for probing new measures to optimize children’s learning environment.

Keywords: impaired voice, sentence comprehension, speech perception, speech-shaped noise, spoken language processing

Procedia PDF Downloads 192

9890 Violence Detection and Tracking on Moving Surveillance Video Using Machine Learning Approach

Authors: Abe Degale D., Cheng Jian

Abstract:

When creating automated video surveillance systems, violent action recognition is crucial. In recent years, hand-crafted feature detectors have been the primary method for achieving violence detection, such as the recognition of fighting activity. Researchers have also looked into learning-based representational models. On benchmark datasets created especially for the detection of violent sequences in sports and movies, these methods produced good accuracy results. The Hockey dataset's videos with surveillance camera motion present challenges for these algorithms for learning discriminating features. Image recognition and human activity detection challenges have shown success with deep representation-based methods. For the purpose of detecting violent images and identifying aggressive human behaviours, this research suggested a deep representation-based model using the transfer learning idea. The results show that the suggested approach outperforms state-of-the-art accuracy levels by learning the most discriminating features, attaining 99.34% and 99.98% accuracy levels on the Hockey and Movies datasets, respectively.

Keywords: violence detection, faster RCNN, transfer learning, surveillance video

Procedia PDF Downloads 106

9889 Passive Voice in SLA: Armenian Learners’ Case Study

Authors: Emma Nemishalyan

Abstract:

It is believed that learners’ mother tongue (L1 hereafter) has a huge impact on their second language acquisition (L2 hereafter). This hypothesis has been exposed to both positive and negative criticism. Based on research results of a wide range of learners’ corpora (Chinese, Japanese, Spanish among others) the hypothesis has either been proved or disproved. However, no such study has been conducted on Armenian learners. The aim of this paper is to understand the implication of the hypothesis on the Armenian learners’ corpus in terms of the use of the passive voice. To this end, the method of Contrastive Interlanguage Analysis (hereafter CIA) has been used on a native speakers’ corpus (Louvain Corpus of Native English Essays (LOCNESS)) and an Armenian learners’ corpus, which we compiled in compliance with International Corpus of Learner English (ICLE) guidelines. CIA compares the interlanguage (the language produced by learners) with the one produced by native speakers. With the help of this method, it is possible not only to highlight the mistakes that learners make, but also to underline underuse or overuse. The choice of the grammar issue (passive voice) is conditioned by the fact that typologically Armenian and English are drastically different as they belong to different branches. Moreover, the passive voice is considered to be one of the most problematic grammar topics to be acquired by learners of the English language. Based on this difference, we hypothesized that Armenian learners would either overuse or underuse some types of the passive voice. With the help of Lancsbox software, we have identified the frequency rates of passive voice usage in LOCNESS and the Armenian learners’ corpus to understand whether the latter have the same usage pattern of the passive voice as the native speakers. Secondly, we have identified the types of the passive voice used by the Armenian learners, trying to trace the reasons to their mother tongue.
The results of the study showed that Armenian learners underused the passive voice in contrast to native speakers. Furthermore, the hypothesis that learners’ L1 has an impact on learners’ L2 acquisition and production was proved.

Keywords: corpus linguistics, applied linguistics, second language acquisition, corpus compilation

Procedia PDF Downloads 108

9888 Vocal Training and Practice Methods: A Glimpse on the South Indian Carnatic Music

Authors: Raghavi Janaswamy, Saraswathi K. Vasudev

Abstract:

Music is one of the supreme arts of expression, next to speech itself. Its evolution over centuries has paved the way for a variety of training protocols and performing methods. Indian classical music is one of the most elaborate and refined systems, with immense emphasis on the voice culture related to range, breath control, quality of the tone, flexibility and diction. Several exercises, namely saraliswaram, jantaswaram, dhatuswaram, upper stayi swaram, alamkaras and varnams, lay the required foundation to gain the voice culture and a deeper understanding of voice development and, further, of the intricacies of the raga system. This article narrates a few of the Carnatic music training methods with an emphasis on the advanced practice methods for articulating the vocal skills, continuity in the voice, the ability to produce gamakams, and command of multiple speeds of rendering with reasonable volume. The creativity in these exercises and their impact on voice production are discussed. The articulation of the outlined conscious practice methods and vocal exercises enables optimum use of the natural human vocal system, not only to enhance the singing quality but also to gain health benefits.

Keywords: Carnatic music, Saraliswaram, Varnam, vocal training

Procedia PDF Downloads 176

9887 Minimum Data of a Speech Signal as Special Indicators of Identification in Phonoscopy

Authors: Nazaket Gazieva

Abstract:

Voice biometric data associated with physiological, psychological and other factors are widely used in forensic phonoscopy. There are various methods for identifying and verifying a person by voice. This article explores the minimum speech signal data as individual parameters of a speech signal. Monozygotic twins are believed to be genetically identical. Using the minimum data of the speech signal, we found that the voice imprint of each monozygotic twin is individual. The experiment leads us to conclude that the minimum indicators of the speech signal are more stable and reliable for phonoscopic examinations.

Keywords: phonogram, speech signal, temporal characteristics, fundamental frequency, biometric fingerprints

Procedia PDF Downloads 142

9886 Independent Encryption Technique for Mobile Voice Calls

Authors: Nael Hirzalla

Abstract:

The legality of spying on the public's personal phone calls by some countries or agencies has become a hot topic among many social groups. Such spying is widely considered an invasion of privacy. It may be justified when it singles out specific cases, but spying without limits is unacceptable. This paper discusses the need for a technique to secure mobile voice calls that is not only simple and lightweight but also independent of any encryption standard or library. It then presents and tests an encryption algorithm based on a frequency scrambling technique, showing a fair and delay-free process that can be used to protect phone calls from such spying.
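
The abstract does not specify the scrambling algorithm, so the following is only a minimal sketch of one classic form of frequency scrambling, spectral inversion, and not the paper's method. Multiplying sample n by (-1)^n mirrors the spectrum of a real signal about a quarter of the sampling rate, and applying the same operation again restores the original. The function name and the sample values are hypothetical, and plain spectral inversion is not a secure cipher on its own.

```python
def invert_spectrum(samples):
    """Multiply sample n by (-1)**n, which mirrors the spectrum about fs/4."""
    return [s if n % 2 == 0 else -s for n, s in enumerate(samples)]


# Hypothetical frame of voice samples (made-up values for illustration).
frame = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]

scrambled = invert_spectrum(frame)     # what would be sent over the call
restored = invert_spectrum(scrambled)  # the receiver applies the same step
```

Because the operation is sample-by-sample, it adds no buffering delay, which is the property the abstract emphasizes.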

Keywords: frequency scrambling, mobile applications, real-time voice encryption, spying on calls

Procedia PDF Downloads 479

9885 Empowering Leadership and Constructive Voice: A Sequential Mediation Analysis

Authors: Umamaheswara Rao Jada, Susmita Mukhopadhyay

Abstract:

In today's highly complex, dynamic and interdependent organizational environment, employees' ideas, opinions and suggestions, technically referred to as ‘constructive employee voice’, are increasingly being recognized and valued. The literature has consistently demonstrated the relevance of leadership to employee voicing behavior; however, the new form of leadership, ‘empowering leadership’, has not been given much attention. This study therefore explores the impact of this new form of leadership on employee voice behavior, with leader member exchange (LMX) and psychological safety as mediators. The study uses structural equation modeling to analyze data collected from 310 Indian service industry employees through a questionnaire developed for the study. The findings demonstrate a significant impact of empowering leadership on employees’ constructive voice behavior. Additionally, the results support the mediating role of LMX and psychological safety between empowering leadership and employees’ constructive voice behavior. The results provide insights into the intervening mechanisms linking leaders’ empowering behavior with employees’ constructive voice, while also highlighting the potential importance of the LMX relationship in organizations and of psychological safety in the context of constructive voice behavior. The study brings forth the relevance of ‘empowering leadership’ for fostering a better exchange of ideas, opinions, and suggestions between leaders and followers, which tends to benefit the organization, and provides empirical evidence of the sequential mediation of LMX and psychological safety. The work is expected to benefit leaders in organizations by giving them a basis for adopting an empowering form of leadership in light of the results.

Keywords: constructive voice, empowering leadership, leader member exchange (LMX), psychological safety, sequential mediation, structural equation modeling

Procedia PDF Downloads 304

9884 Gesture-Controlled Interface Using Computer Vision and Python

Authors: Vedant Vardhan Rathour, Anant Agrawal

Abstract:

The project aims to provide a touchless, intuitive interface for human-computer interaction, enabling users to control their computer using hand gestures and voice commands. The system uses computer vision techniques from the MediaPipe framework and OpenCV to detect and interpret real-time hand gestures, transforming them into mouse actions such as clicking, dragging, and scrolling. Additionally, a voice assistant powered by the Speech Recognition library allows tasks such as web searches, location navigation and gesture control to be executed on the system through voice commands.
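
As a rough illustration of how detected hand landmarks might be turned into a mouse action, the sketch below classifies a "pinch" from the distance between the thumb tip and index fingertip. It assumes the 21-point hand landmark layout used by MediaPipe Hands (index 4 for the thumb tip, 8 for the index fingertip) with normalized coordinates. The threshold, the function name and the action labels are hypothetical and are not taken from the project.

```python
import math

THUMB_TIP = 4           # landmark index in the MediaPipe Hands layout
INDEX_FINGER_TIP = 8    # landmark index in the MediaPipe Hands layout
PINCH_THRESHOLD = 0.05  # hypothetical cutoff in normalized image units


def classify_gesture(landmarks):
    """Return "click" when thumb and index tips nearly touch, else "move".

    `landmarks` is a list of 21 (x, y) pairs in normalized coordinates.
    """
    tx, ty = landmarks[THUMB_TIP]
    ix, iy = landmarks[INDEX_FINGER_TIP]
    distance = math.hypot(tx - ix, ty - iy)
    return "click" if distance < PINCH_THRESHOLD else "move"
```

In a full pipeline the landmark list would come from the hand tracker on every video frame, and the returned label would be passed to whatever library drives the mouse.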

Keywords: gesture recognition, hand tracking, machine learning, convolutional neural networks

Procedia PDF Downloads 11

9883 The Oppressive Boss and Employees' Authoritarianism: The Relation between Suppression of Voice by Employers and Employees' Preferences for Authoritarian Political Leadership

Authors: Antonia Stanojević, Agnes Akkerman

Abstract:

In contemporary society, economically active people typically spend most of their waking hours doing their job. Having that in mind, this research examines how socialization at the workplace shapes political preferences. Innovatively, it examines, in particular, the possible relationship between employees’ voice suppression by the employer and the formation of their political preferences. Since the employer is perceived as an authority figure, their behavior might induce spillovers to attitudes about political authorities and authoritarian governance. Therefore, a positive effect of suppression of voice by employers on employees' preference for authoritarian governance is expected. Furthermore, this relation is expected to be mediated by two mechanisms: system justification and power distance. Namely, it is expected that suppression of voice would create a power distance organizational climate and increase employees’ acceptance of unequal distribution of power, as well as evoke attempts of oppression rationalization through system justification. The hypotheses will be tested on the data gathered within the first wave of Work and Politics Dataset 2017 (N=6000), which allows for a wide range of demographic and psychological control variables. Although a cross-sectional analysis to be used at this point does not allow for causal inferences, the confirmation of expected relationships would encourage and justify further longitudinal research on the same panel dataset, in order to get a clearer image of the causal relationship between employers' suppression of voice and workers' political preferences.

Keywords: authoritarian values, political preferences, power distance, system justification, voice suppression

Procedia PDF Downloads 268

9882 Obstacle Detection and Path Tracking Application for Disables

Authors: Aliya Ashraf, Mehreen Sirshar, Fatima Akhtar, Farwa Kazmi, Jawaria Wazir

Abstract:

Vision, the basis for performing navigational tasks, is absent or greatly reduced in visually impaired people, who therefore face many hurdles. To increase the navigational capabilities of visually impaired people, a desktop application, ODAPTA, is presented in this paper. The application uses a camera to capture video of the surroundings, applies various image processing algorithms to obtain information about the path and obstacles, tracks them, and delivers that information to the user through voice commands. Experimental results show that the application works effectively for straight paths in daylight.

Keywords: visually impaired, ODAPTA, Region of Interest (ROI), driver fatigue, face detection, expression recognition, CCD camera, artificial intelligence

Procedia PDF Downloads 549

9881 Efficient Signal Detection Using QRD-M Based on Channel Condition in MIMO-OFDM System

Authors: Jae-Jeong Kim, Ki-Ro Kim, Hyoung-Kyu Song

Abstract:

In this paper, we propose an efficient signal detector for MIMO-OFDM systems that switches the M parameter of the QRD-M detection scheme. The proposed detection scheme calculates a threshold from the 1-norm condition number and then switches the M parameter of the QRD-M detection scheme according to the channel information. If the channel condition is bad, M is set to a high value to increase the accuracy of detection. If the channel condition is good, M is set to a low value to reduce the complexity of detection. The proposed detection scheme therefore achieves a better trade-off between BER performance and complexity than the conventional detection scheme. The simulation results show that the complexity of the proposed detection scheme is lower than that of the QRD-M detection scheme with similar BER performance.
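
The switching rule described above can be sketched as follows for a 2x2 channel matrix. This is a minimal illustration under stated assumptions, not the authors' implementation: the threshold and the two M values are hypothetical placeholders, and the closed-form 2x2 inverse is used only to keep the example dependency-free.

```python
def one_norm(m):
    """Matrix 1-norm: the largest absolute column sum."""
    return max(sum(abs(row[c]) for row in m) for c in range(len(m[0])))


def inverse_2x2(h):
    """Closed-form inverse of a 2x2 matrix (raises ZeroDivisionError if singular)."""
    (a, b), (c, d) = h
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def condition_number_1(h):
    """1-norm condition number: ||H||_1 * ||H^-1||_1."""
    return one_norm(h) * one_norm(inverse_2x2(h))


def select_m(h, threshold=10.0, m_low=2, m_high=8):
    """Pick a large M for an ill-conditioned channel, a small M otherwise.

    threshold, m_low and m_high are hypothetical values for illustration.
    """
    return m_high if condition_number_1(h) > threshold else m_low
```

A well-conditioned channel such as the identity matrix has condition number 1 and selects the low M, while a nearly singular channel selects the high M. The entries may be complex, since only `abs` and basic arithmetic are used.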

Keywords: MIMO-OFDM, QRD-M, channel condition, BER

Procedia PDF Downloads 370

9880 From Script to Film: The Fading Voice of the Screenwriter

Authors: Ana Sofia Torres Pereira

Abstract:

On January 15th 2015, Peter Bart, editor in chief of Variety Magazine, published an article in the aforementioned magazine posing the following question: “Are screenwriters becoming obsolete in Hollywood?” Is Hollywood losing its interest in well plotted, well written scripts crafted by professionals? That screenwriters have been undervalued, forgotten and left behind since the beginning of film is a well-known fact, but are they now on the brink of extinction? If fiction films are about people and stories, and so, simply put, all about the script, what does it mean to say that the screenwriter is becoming obsolete? What will be the consequences of the possible death of the screenwriter for the cinema world? All of these questions lead us to an ultimate one: What is the true importance of a screenwriter? What can a screenwriter do that a director, for instance, can’t? How should a script be written and read in order not to become obsolete? And what about those countries, like Portugal, for example, in which the figure of the screenwriter is yet to be heard and known? How can screenwriters find their voice in a world driven by the tyrannical voice of the Director? In a demanding cinema world where the Director is considered the author of a film, it’s important to know where we can find the voice of the screenwriter, the true language of the screenplay and the importance this voice and specific language might have for the future of storytelling and of film. In a paper that admittedly poses more questions than answers, I will try to unveil the importance a screenplay might have in Hollywood, in Portugal and in the cinema and communication world in general.

Keywords: cinema, communication, director, language, screenplay, screenwriting, story

Procedia PDF Downloads 316

9879 The Use of Voice in Online Public Access Catalog as Faster Searching Device

Authors: Maisyatus Suadaa Irfana, Nove Eka Variant Anna, Dyah Puspitasari Sri Rahayu

Abstract:

Technological developments provide convenience to all people. Today, humans communicate with computers mostly via text. As technology develops, human-computer communication can also be conducted by voice, like communication between human beings. This makes computers easier to use for many people, especially those who have special needs. Voice search technology is applied here to searching book collections in the OPAC (Online Public Access Catalog), so that library visitors can find the books they need faster and more easily. Integration with Google is needed to convert the voice into text. To optimize the time and the results of searching, the server downloads all the book data available in the server database. The data are then converted into JSON format. In addition, several algorithms are incorporated, including decomposition (parsing) of the JSON format into an array, index creation, and an analyzer for the results. The aim is to make searching much faster than the usual OPAC search, in which the data are fetched directly from the database for every search request. A Data Update Menu is provided to enable users to perform their own data updates and get the latest data.
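
The parse-then-index step described above can be sketched as a small in-memory inverted index. This is only an illustration under stated assumptions: the catalog records, field names and function names are made up, and the abstract does not say which index structure or analyzer the authors use.

```python
import json
from collections import defaultdict

# Hypothetical catalog export; the records and field names are invented.
CATALOG_JSON = '''[
  {"id": 1, "title": "Introduction to Signal Processing"},
  {"id": 2, "title": "Digital Libraries and Search"},
  {"id": 3, "title": "Signal Detection Theory"}
]'''


def build_index(books):
    """Map each lowercase title word to the set of book ids containing it."""
    index = defaultdict(set)
    for book in books:
        for word in book["title"].lower().split():
            index[word].add(book["id"])
    return index


def search(index, query):
    """Return ids of books whose titles contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


books = json.loads(CATALOG_JSON)  # parse the JSON array into Python dicts
index = build_index(books)        # built once, reused for every query
```

Here the `query` string stands in for the text returned by the speech-to-text step. Each lookup touches only the index in memory, which is the reason such a design can answer faster than querying the database on every search.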

Keywords: OPAC, voice, searching, faster

Procedia PDF Downloads 344

9878 Reduced Complexity of ML Detection Combined with DFE

Authors: Jae-Hyun Ro, Yong-Jun Kim, Chang-Bin Ha, Hyoung-Kyu Song

Abstract:

In multiple input multiple output-orthogonal frequency division multiplexing (MIMO-OFDM) systems, many detection schemes have been developed to improve the error performance and to reduce the complexity. Maximum likelihood (ML) detection has optimal error performance, but its complexity is very high. This paper therefore proposes a reduced-complexity ML detection scheme combined with a decision feedback equalizer (DFE). The error performance of the proposed detection scheme is better than that of the conventional DFE, while its complexity is lower than that of conventional ML detection.
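
To show why conventional ML detection is costly, the sketch below implements plain exhaustive ML detection, not the reduced-complexity scheme proposed in the paper. It tests every combination of constellation points, so the number of candidates grows as the constellation size raised to the number of transmit antennas. The BPSK constellation and the function name are assumptions chosen for brevity.

```python
from itertools import product

BPSK = (-1.0, 1.0)  # assumed constellation, chosen to keep the example small


def ml_detect(h, y, constellation=BPSK):
    """Exhaustive ML detection: minimize ||y - H x||^2 over all candidates x.

    h is a list of rows (receive antennas x transmit antennas) and
    y is the received vector.
    """
    n_tx = len(h[0])
    best, best_metric = None, float("inf")
    for candidate in product(constellation, repeat=n_tx):
        metric = 0.0
        for row, y_i in zip(h, y):
            estimate = sum(h_ij * x_j for h_ij, x_j in zip(row, candidate))
            metric += abs(y_i - estimate) ** 2
        if metric < best_metric:
            best, best_metric = candidate, metric
    return best
```

With two transmit antennas and BPSK there are only 4 candidates, but 16-QAM with four antennas already requires 65536 metric evaluations per received vector, which motivates hybrid schemes such as the one described above.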

Keywords: detection, DFE, MIMO-OFDM, ML

Procedia PDF Downloads 610

9877 Machine Learning Approach for Stress Detection Using Wireless Physical Activity Tracker

Authors: B. Padmaja, V. V. Rama Prasad, K. V. N. Sunitha, E. Krishna Rao Patro

Abstract:

Stress is a psychological condition that reduces the quality of sleep and affects every facet of life. Constant exposure to stress is detrimental not only to the mind but also to the body. To cope with stress, however, one must first identify it. This paper provides an effective method for detecting cognitive stress levels using data from a physical activity tracker device, Fitbit. This device records people’s daily food intake, weight, sleep, heart rate, and physical activities. In this paper, four major stressors, namely physical activities, sleep patterns, working hours and change in heart rate, are used to assess the stress levels of individuals. The main motive of this system is to use a machine learning approach to stress detection with the help of smartphone sensor technology. The effect of each stressor is first evaluated individually using logistic regression, and then a combined model is built and assessed using variants of ordinal logistic regression models such as logit, probit and complementary log-log. The quality of each model is then evaluated using the Akaike Information Criterion (AIC), and probit is assessed as the most suitable model for our dataset. The system is tested and evaluated in a real-time environment using data from adults working in IT and other sectors in India. The novelty of this work lies in the requirement that a stress detection system be as minimally invasive as possible for its users.
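
The three model variants named above differ only in the link function, and AIC is computed as 2k minus twice the maximized log-likelihood, with lower values preferred. The sketch below shows the three inverse links and an AIC-based selection helper. It is an illustration only: the function names are hypothetical, and no parameter counts or log-likelihood values from the paper are reproduced.

```python
import math


def logit_inv(eta):
    """Inverse logit link: logistic CDF."""
    return 1.0 / (1.0 + math.exp(-eta))


def probit_inv(eta):
    """Inverse probit link: standard normal CDF."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def cloglog_inv(eta):
    """Inverse complementary log-log link."""
    return 1.0 - math.exp(-math.exp(eta))


def aic(num_params, log_likelihood):
    """Akaike Information Criterion: 2k - 2 ln(L); lower is better."""
    return 2 * num_params - 2 * log_likelihood


def best_by_aic(fits):
    """Return the model name with the lowest AIC.

    `fits` maps a model name to a (num_params, log_likelihood) pair.
    """
    return min(fits, key=lambda name: aic(*fits[name]))
```

In an ordinal model each inverse link is applied to a cutpoint minus the linear predictor to give a cumulative probability. `best_by_aic` would be called with the fitted values for the logit, probit and complementary log-log models.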

Keywords: physical activity tracker, sleep pattern, working hours, heart rate, smartphone sensor

Procedia PDF Downloads 256

9876 Music in the Early Stages of Life: Considerations from Working with Groups of Mothers and Babies

Authors: Ana Paula Melchiors Stahlschmidt

Abstract:

This paper discusses the role of music as a ludic activity and constituent element of voice in the construction and consolidation of the relationship between the baby and his/her mother or caretaker, evaluating its implications for his/her psychic structure and constitution as a subject. The work was based on research developed as part of the author’s doctoral activities, carried out through her participation in a project of the Music Department of the Federal University of Rio Grande do Sul - UFRGS, whose objective was the development of musical activities with groups of babies from 0 to 24 months old and their caretakers. Observations, video recordings of the meetings, audio testimonies, and evaluation tools applied to group participants were used as instruments for this research. Information was collected on the participation of 195 babies, 8 of whom were studied in greater depth through interviews with their mothers or caretakers. These interviews were analyzed with reference to French Discourse Analysis, Psychoanalysis, Psychology of Development and Musical Education. The results of the research were complemented by later experiences that the author developed with similar groups in the context of a private clinic. The information collected allowed the observation of the ludic and structural functions of musical activities, when developed in a structured environment, as well as the importance of the musicality of the mother’s voice to the psychical structuring of the baby, allowing his/her insertion into language and his/her constitution as a subject.

Keywords: music and babies, maternal voice, Psychoanalysis and music, psychology and music

Procedia PDF Downloads 453