Search results for: automatic speech recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3068

Search results for: automatic speech recognition

2888 Speech Perception by Monolingual and Bilingual Dravidian Speakers under Adverse Listening Conditions

Authors: S. B. Rathna Kumar, Sale Kranthi, Sandya K. Varudhini

Abstract:

The precise perception of spoken language is influenced by several variables, including the listeners’ native language, distance between speaker and listener, reverberation and background noise. When noise is present in an acoustic environment, it masks the speech signal resulting in reduction in the redundancy of the acoustic and linguistic cues of speech. There is strong evidence that bilinguals face difficulty in speech perception for their second language compared with monolingual speakers under adverse listening conditions such as presence of background noise. This difficulty persists even for speakers who are highly proficient in their second language and is greater in those who have learned the second language later in life. The present study aimed to assess the performance of monolingual (Telugu speaking) and bilingual (Tamil as first language and Telugu as second language) speakers on Telugu speech perception task under quiet and noisy environments. The results indicated that both the groups performed similar in both quiet and noisy environments. The findings of the present study are not in accordance with the findings of previous studies which strongly report poorer speech perception in adverse listening conditions such as noise with bilingual speakers for their second language compared with monolinguals.

Keywords: monolingual, bilingual, second language, speech perception, quiet, noise

Procedia PDF Downloads 389
2887 Long Short-Term Memory Based Model for Modeling Nicotine Consumption Using an Electronic Cigarette and Internet of Things Devices

Authors: Hamdi Amroun, Yacine Benziani, Mehdi Ammi

Abstract:

In this paper, we want to determine whether the accurate prediction of nicotine concentration can be obtained by using a network of smart objects and an e-cigarette. The approach consists of, first, the recognition of factors influencing smoking cessation such as physical activity recognition and participant’s behaviors (using both smartphone and smartwatch), then the prediction of the configuration of the e-cigarette (in terms of nicotine concentration, power, and resistance of e-cigarette). The study uses a network of commonly connected objects; a smartwatch, a smartphone, and an e-cigarette transported by the participants during an uncontrolled experiment. The data obtained from sensors carried in the three devices were trained by a Long short-term memory algorithm (LSTM). Results show that our LSTM-based model allows predicting the configuration of the e-cigarette in terms of nicotine concentration, power, and resistance with a root mean square error percentage of 12.9%, 9.15%, and 11.84%, respectively. This study can help to better control consumption of nicotine and offer an intelligent configuration of the e-cigarette to users.

Keywords: Iot, activity recognition, automatic classification, unconstrained environment

Procedia PDF Downloads 224
2886 An Event-Related Potential Investigation of Speech-in-Noise Recognition in Native and Nonnative Speakers of English

Authors: Zahra Fotovatnia, Jeffery A. Jones, Alexandra Gottardo

Abstract:

Speech communication often occurs in environments where noise conceals part of a message. Listeners should compensate for the lack of auditory information by picking up distinct acoustic cues and using semantic and sentential context to recreate the speaker’s intended message. This situation seems to be more challenging in a nonnative than native language. On the other hand, early bilinguals are expected to show an advantage over the late bilingual and monolingual speakers of a language due to their better executive functioning components. In this study, English monolingual speakers were compared with early and late nonnative speakers of English to understand speech in noise processing (SIN) and the underlying neurobiological features of this phenomenon. Auditory mismatch negativities (MMNs) were recorded using a double-oddball paradigm in response to a minimal pair that differed in their middle vowel (beat/bit) at Wilfrid Laurier University in Ontario, Canada. The results did not show any significant structural and electroneural differences across groups. However, vocabulary knowledge correlated positively with performance on tests that measured SIN processing in participants who learned English after age 6. Moreover, their performance on the test negatively correlated with the integral area amplitudes in the left superior temporal gyrus (STG). In addition, the STG was engaged before the inferior frontal gyrus (IFG) in noise-free and low-noise test conditions in all groups. We infer that the pre-attentive processing of words engages temporal lobes earlier than the fronto-central areas and that vocabulary knowledge helps the nonnative perception of degraded speech.

Keywords: degraded speech perception, event-related brain potentials, mismatch negativities, brain regions

Procedia PDF Downloads 107
2885 Dual-Channel Multi-Band Spectral Subtraction Algorithm Dedicated to a Bilateral Cochlear Implant

Authors: Fathi Kallel, Ahmed Ben Hamida, Christian Berger-Vachon

Abstract:

In this paper, a Speech Enhancement Algorithm based on Multi-Band Spectral Subtraction (MBSS) principle is evaluated for Bilateral Cochlear Implant (BCI) users. Specifically, dual-channel noise power spectral estimation algorithm using Power Spectral Densities (PSD) and Cross Power Spectral Densities (CPSD) of the observed signals is studied. The enhanced speech signal is obtained using Dual-Channel Multi-Band Spectral Subtraction ‘DC-MBSS’ algorithm. For performance evaluation, objective speech assessment test relying on Perceptual Evaluation of Speech Quality (PESQ) score is performed to fix the optimal number of frequency bands needed in DC-MBSS algorithm. In order to evaluate the speech intelligibility, subjective listening tests are assessed with 3 deafened BCI patients. Experimental results obtained using French Lafon database corrupted by an additive babble noise at different Signal-to-Noise Ratios (SNR) showed that DC-MBSS algorithm improves speech understanding for single and multiple interfering noise sources.

Keywords: speech enhancement, spectral substracion, noise estimation, cochlear impalnt

Procedia PDF Downloads 549
2884 Attendance Management System Implementation Using Face Recognition

Authors: Zainab S. Abdullahi, Zakariyya H. Abdullahi, Sahnun Dahiru

Abstract:

Student attendance in schools is a very important aspect in school management record. In recent years, security systems have become one of the most demanding systems in school. Every institute have its own method of taking attendance, many schools in Nigeria use the old fashion way of taking attendance. That is writing the students name and registration number in a paper and submitting it to the lecturer at the end of the lecture which is time-consuming and insecure, because some students can write for their friends without the lecturer’s knowledge. In this paper, we propose a system that takes attendance using face recognition. There are many automatic methods available for this purpose i.e. biometric attendance, but they all waste time, because the students have to follow a queue to put their thumbs on a scanner which is time-consuming. This attendance is recorded by using a camera attached in front of the class room and capturing the student images, detect the faces in the image and compare the detected faces with database and mark the attendance. The principle component analysis was used to recognize the faces detected with a high accuracy rate. The paper reviews the related work in the field of attendance system, then describe the system architecture, software algorithm and result.

Keywords: attendance system, face detection, face recognition, PCA

Procedia PDF Downloads 364
2883 Freedom of Speech, Dissent and the Right to be Governed By Consensus are Inherent Rights Under Classical Islamic Law

Authors: Ziyad Motala

Abstract:

It is often proclaimed by leasers in Muslim majority countries that Islamic Law does not permit dissent against a ruler. This paper will evaluate and discuss freedom of speech and dissent as found in concrete prophetic examples during the time of the Prophet Muhammad. It will further look at the examples and practices during the time of the four Noble Caliphs, the immediate successors to the Prophet Muhammad. It will argue that the positivist position of absolute obedience to a ruler is inconsistent with the prophetic tradition. The examples of the Prophet and his immediate four successors (whose lessons Sunni Islam considers to be a source of Islamic Law) demonstrates among the earliest example of freedom of speech and dissent in human history. That tradition frowned upon an inert and uninvolved citizenry. It will conclude with lessons for modern day Muslim majority countries arguing with empirical evidence that freedom of speech, dissent and the right to be governed by consensus versus coercion are fundamental requisites of Islamic law.

Keywords: islamic law, demoracy, freedom of speech, right to dissent

Procedia PDF Downloads 74
2882 Features of Normative and Pathological Realizations of Sibilant Sounds for Computer-Aided Pronunciation Evaluation in Children

Authors: Zuzanna Miodonska, Michal Krecichwost, Pawel Badura

Abstract:

Sigmatism (lisping) is a speech disorder in which sibilant consonants are mispronounced. The diagnosis of this phenomenon is usually based on the auditory assessment. However, the progress in speech analysis techniques creates a possibility of developing computer-aided sigmatism diagnosis tools. The aim of the study is to statistically verify whether specific acoustic features of sibilant sounds may be related to pronunciation correctness. Such knowledge can be of great importance while implementing classifiers and designing novel tools for automatic sibilants pronunciation evaluation. The study covers analysis of various speech signal measures, including features proposed in the literature for the description of normative sibilants realization. Amplitudes and frequencies of three fricative formants (FF) are extracted based on local spectral maxima of the friction noise. Skewness, kurtosis, four normalized spectral moments (SM) and 13 mel-frequency cepstral coefficients (MFCC) with their 1st and 2nd derivatives (13 Delta and 13 Delta-Delta MFCC) are included in the analysis as well. The resulting feature vector contains 51 measures. The experiments are performed on the speech corpus containing words with selected sibilant sounds (/ʃ, ʒ/) pronounced by 60 preschool children with proper pronunciation or with natural pathologies. In total, 224 /ʃ/ segments and 191 /ʒ/ segments are employed in the study. The Mann-Whitney U test is employed for the analysis of stigmatism and normative pronunciation. Statistically, significant differences are obtained in most of the proposed features in children divided into these two groups at p < 0.05. All spectral moments and fricative formants appear to be distinctive between pathology and proper pronunciation. These metrics describe the friction noise characteristic for sibilants, which makes them particularly promising for the use in sibilants evaluation tools. Correspondences found between phoneme feature values and an expert evaluation of the pronunciation correctness encourage to involve speech analysis tools in diagnosis and therapy of sigmatism. Proposed feature extraction methods could be used in a computer-assisted stigmatism diagnosis or therapy systems.

Keywords: computer-aided pronunciation evaluation, sigmatism diagnosis, speech signal analysis, statistical verification

Procedia PDF Downloads 301
2881 Research and Development of Intelligent Cooling Channels Design System

Authors: Q. Niu, X. H. Zhou, W. Liu

Abstract:

The cooling channels of injection mould play a crucial role in determining the productivity of moulding process and the product quality. It’s not a simple task to design high quality cooling channels. In this paper, an intelligent cooling channels design system including automatic layout of cooling channels, interference checking and assembly of accessories is studied. Automatic layout of cooling channels using genetic algorithm is analyzed. Through integrating experience criteria of designing cooling channels, considering the factors such as the mould temperature and interference checking, the automatic layout of cooling channels is implemented. The method of checking interference based on distance constraint algorithm and the function of automatic and continuous assembly of accessories are developed and integrated into the system. Case studies demonstrate the feasibility and practicality of the intelligent design system.

Keywords: injection mould, cooling channel, intelligent design, automatic layout, interference checking

Procedia PDF Downloads 439
2880 Enhanced Face Recognition with Daisy Descriptors Using 1BT Based Registration

Authors: Sevil Igit, Merve Meric, Sarp Erturk

Abstract:

In this paper, it is proposed to improve Daisy descriptor based face recognition using a novel One-Bit Transform (1BT) based pre-registration approach. The 1BT based pre-registration procedure is fast and has low computational complexity. It is shown that the face recognition accuracy is improved with the proposed approach. The proposed approach can facilitate highly accurate face recognition using DAISY descriptor with simple matching and thereby facilitate a low-complexity approach.

Keywords: face recognition, Daisy descriptor, One-Bit Transform, image registration

Procedia PDF Downloads 367
2879 Formulating a Definition of Hate Speech: From Divergence to Convergence

Authors: Avitus A. Agbor

Abstract:

Numerous incidents, ranging from trivial to catastrophic, do come to mind when one reflects on hate. The victims of these belong to specific identifiable groups within communities. These experiences evoke discussions on Islamophobia, xenophobia, homophobia, anti-Semitism, racism, ethnic hatred, atheism, and other brutal forms of bigotry. Common to all these is an invisible but portent force that drives all of them: hatred. Such hatred is usually fueled by a profound degree of intolerance (to diversity) and the zeal to impose on others their beliefs and practices which they consider to be the conventional norm. More importantly, the perpetuation of these hateful acts is the unfortunate outcome of an overplay of invectives and hate speech which, to a greater extent, cannot be divorced from hate. From a legal perspective, acknowledging the existence of an undeniable link between hate speech and hate is quite easy. However, both within and without legal scholarship, the notion of “hate speech” remains a conundrum: a phrase that is quite easily explained through experiences than propounding a watertight definition that captures the entire essence and nature of what it is. The problem is further compounded by a few factors: first, within the international human rights framework, the notion of hate speech is not used. In limiting the right to freedom of expression, the ICCPR simply excludes specific kinds of speeches (but does not refer to them as hate speech). Regional human rights instruments are not so different, except for the subsequent developments that took place in the European Union in which the notion has been carefully delineated, and now a much clearer picture of what constitutes hate speech is provided. The legal architecture in domestic legal systems clearly shows differences in approaches and regulation: making it more difficult. In short, what may be hate speech in one legal system may very well be acceptable legal speech in another legal system. Lastly, the cornucopia of academic voices on the issue of hate speech exude the divergence thereon. Yet, in the absence of a well-formulated and universally acceptable definition, it is important to consider how hate speech can be defined. Taking an evidence-based approach, this research looks into the issue of defining hate speech in legal scholarship and how and why such a formulation is of critical importance in the prohibition and prosecution of hate speech.

Keywords: hate speech, international human rights law, international criminal law, freedom of expression

Procedia PDF Downloads 76
2878 Conversational Assistive Technology of Visually Impaired Person for Social Interaction

Authors: Komal Ghafoor, Tauqir Ahmad, Murtaza Hanif, Hira Zaheer

Abstract:

Assistive technology has been developed to support visually impaired people in their social interactions. Conversation assistive technology is designed to enhance communication skills, facilitate social interaction, and improve the quality of life of visually impaired individuals. This technology includes speech recognition, text-to-speech features, and other communication devices that enable users to communicate with others in real time. The technology uses natural language processing and machine learning algorithms to analyze spoken language and provide appropriate responses. It also includes features such as voice commands and audio feedback to provide users with a more immersive experience. These technologies have been shown to increase the confidence and independence of visually impaired individuals in social situations and have the potential to improve their social skills and relationships with others. Overall, conversation-assistive technology is a promising tool for empowering visually impaired people and improving their social interactions. One of the key benefits of conversation-assistive technology is that it allows visually impaired individuals to overcome communication barriers that they may face in social situations. It can help them to communicate more effectively with friends, family, and colleagues, as well as strangers in public spaces. By providing a more seamless and natural way to communicate, this technology can help to reduce feelings of isolation and improve overall quality of life. The main objective of this research is to give blind users the capability to move around in unfamiliar environments through a user-friendly device by face, object, and activity recognition system. This model evaluates the accuracy of activity recognition. This device captures the front view of the blind, detects the objects, recognizes the activities, and answers the blind query. It is implemented using the front view of the camera. The local dataset is collected that includes different 1st-person human activities. The results obtained are the identification of the activities that the VGG-16 model was trained on, where Hugging, Shaking Hands, Talking, Walking, Waving video, etc.

Keywords: dataset, visually impaired person, natural language process, human activity recognition

Procedia PDF Downloads 58
2877 Effect Analysis of an Improved Adaptive Speech Noise Reduction Algorithm in Online Communication Scenarios

Authors: Xingxing Peng

Abstract:

With the development of society, there are more and more online communication scenarios such as teleconference and online education. In the process of conference communication, the quality of voice communication is a very important part, and noise may cause the communication effect of participants to be greatly reduced. Therefore, voice noise reduction has an important impact on scenarios such as voice calls. This research focuses on the key technologies of the sound transmission process. The purpose is to maintain the audio quality to the maximum so that the listener can hear clearer and smoother sound. Firstly, to solve the problem that the traditional speech enhancement algorithm is not ideal when dealing with non-stationary noise, an adaptive speech noise reduction algorithm is studied in this paper. Traditional noise estimation methods are mainly used to deal with stationary noise. In this chapter, we study the spectral characteristics of different noise types, especially the characteristics of non-stationary Burst noise, and design a noise estimator module to deal with non-stationary noise. Noise features are extracted from non-speech segments, and the noise estimation module is adjusted in real time according to different noise characteristics. This adaptive algorithm can enhance speech according to different noise characteristics, improve the performance of traditional algorithms to deal with non-stationary noise, so as to achieve better enhancement effect. The experimental results show that the algorithm proposed in this chapter is effective and can better adapt to different types of noise, so as to obtain better speech enhancement effect.

Keywords: speech noise reduction, speech enhancement, self-adaptation, Wiener filter algorithm

Procedia PDF Downloads 57
2876 Statistical Feature Extraction Method for Wood Species Recognition System

Authors: Mohd Iz'aan Paiz Bin Zamri, Anis Salwa Mohd Khairuddin, Norrima Mokhtar, Rubiyah Yusof

Abstract:

Effective statistical feature extraction and classification are important in image-based automatic inspection and analysis. An automatic wood species recognition system is designed to perform wood inspection at custom checkpoints to avoid mislabeling of timber which will results to loss of income to the timber industry. The system focuses on analyzing the statistical pores properties of the wood images. This paper proposed a fuzzy-based feature extractor which mimics the experts’ knowledge on wood texture to extract the properties of pores distribution from the wood surface texture. The proposed feature extractor consists of two steps namely pores extraction and fuzzy pores management. The total number of statistical features extracted from each wood image is 38 features. Then, a backpropagation neural network is used to classify the wood species based on the statistical features. A comprehensive set of experiments on a database composed of 5200 macroscopic images from 52 tropical wood species was used to evaluate the performance of the proposed feature extractor. The advantage of the proposed feature extraction technique is that it mimics the experts’ interpretation on wood texture which allows human involvement when analyzing the wood texture. Experimental results show the efficiency of the proposed method.

Keywords: classification, feature extraction, fuzzy, inspection system, image analysis, macroscopic images

Procedia PDF Downloads 425
2875 Analysis of Interleaving Scheme for Narrowband VoIP System under Pervasive Environment

Authors: Monica Sharma, Harjit Pal Singh, Jasbinder Singh, Manju Bala

Abstract:

In Voice over Internet Protocol (VoIP) system, the speech signal is degraded when passed through the network layers. The speech signal is processed through the best effort policy based IP network, which leads to the network degradations including delay, packet loss and jitter. The packet loss is the major issue of the degradation in the VoIP signal quality; even a single lost packet may generate audible distortion in the decoded speech signal. In addition to these network degradations, the quality of the speech signal is also affected by the environmental noises and coder distortions. The signal quality of the VoIP system is improved through the interleaving technique. The performance of the system is evaluated for various types of noises at different network conditions. The performance of the enhanced VoIP signal is evaluated using perceptual evaluation of speech quality (PESQ) measurement for narrow band signal.

Keywords: VoIP, interleaving, packet loss, packet size, background noise

Procedia PDF Downloads 479
2874 Laser Data Based Automatic Generation of Lane-Level Road Map for Intelligent Vehicles

Authors: Zehai Yu, Hui Zhu, Linglong Lin, Huawei Liang, Biao Yu, Weixin Huang

Abstract:

With the development of intelligent vehicle systems, a high-precision road map is increasingly needed in many aspects. The automatic lane lines extraction and modeling are the most essential steps for the generation of a precise lane-level road map. In this paper, an automatic lane-level road map generation system is proposed. To extract the road markings on the ground, the multi-region Otsu thresholding method is applied, which calculates the intensity value of laser data that maximizes the variance between background and road markings. The extracted road marking points are then projected to the raster image and clustered using a two-stage clustering algorithm. Lane lines are subsequently recognized from these clusters by the shape features of their minimum bounding rectangle. To ensure the storage efficiency of the map, the lane lines are approximated to cubic polynomial curves using a Bayesian estimation approach. The proposed lane-level road map generation system has been tested on urban and expressway conditions in Hefei, China. The experimental results on the datasets show that our method can achieve excellent extraction and clustering effect, and the fitted lines can reach a high position accuracy with an error of less than 10 cm.

Keywords: curve fitting, lane-level road map, line recognition, multi-thresholding, two-stage clustering

Procedia PDF Downloads 128
2873 Speech Rhythm Variation in Languages and Dialects: F0, Natural and Inverted Speech

Authors: Imen Ben Abda

Abstract:

Languages have been classified into different rhythm classes. 'Stress-timed' languages are exemplified by English, 'syllable-timed' languages by French and 'mora-timed' languages by Japanese. However, to our best knowledge, acoustic studies have not been unanimous in strictly establishing which rhythm category a given language belongs to and failed to show empirical evidence for isochrony. Perception seems to be a good approach to categorize languages into different rhythm classes. This study, within the scope of experimental phonetics, includes an account of different perceptual experiments using cues from natural and inverted speech, as well as pitch extracted from speech data. It is an attempt to categorize speech rhythm over a large set of Arabic (Tunisian, Algerian, Lebanese and Moroccan) and English dialects (Welsh, Irish, Scottish and Texan) as well as other languages such as Chinese, Japanese, French, and German. Listeners managed to classify the different languages and dialects into different rhythm classes using suprasegmental cues mainly rhythm and pitch (F0). They also perceived rhythmic differences even among languages and dialects belonging to the same rhythm class. This may show that there are different subclasses within very broad rhythmic typologies.

Keywords: F0, inverted speech, mora-timing, rhythm variation, stress-timing, syllable-timing

Procedia PDF Downloads 526
2872 Printed Thai Character Recognition Using Particle Swarm Optimization Algorithm

Authors: Phawin Sangsuvan, Chutimet Srinilta

Abstract:

This Paper presents the applications of Particle Swarm Optimization (PSO) Method for Thai optical character recognition (OCR). OCR consists of the pre-processing, character recognition and post-processing. Before enter into recognition process. The Character must be “Prepped” by pre-processing process. The PSO is an optimization method that belongs to the swarm intelligence family based on the imitation of social behavior patterns of animals. Route of each particle is determined by an individual data among neighborhood particles. The interaction of the particles with neighbors is the advantage of Particle Swarm to determine the best solution. So PSO is interested by a lot of researchers in many difficult problems including character recognition. As the previous this research used a Projection Histogram to extract printed digits features and defined the simple Fitness Function for PSO. The results reveal that PSO gives 67.73% for testing dataset. So in the future there can be explored enhancement the better performance of PSO with improve the Fitness Function.

Keywords: character recognition, histogram projection, particle swarm optimization, pattern recognition techniques

Procedia PDF Downloads 477
2871 Effects of Exposing Learners to Speech Acts in the German Teaching Material Schritte International: The Case of Requests

Authors: Wan-Lin Tsai

Abstract:

Speech act of requests is an important issue in the field of language learning and teaching because we cannot avoid making requesting in our daily life. This study examined whether or not the subjects who were freshmen and majored in German at Wenzao University of Languages were able to use the linguistic forms which they had learned from their course book Schritte International to make appropriate requests through dialogue completed tasks (DCT). The results revealed that the majority of the subjects were unable to use the forms to make appropriate requests in German due to the lack of explicit instructions. Furthermore, Chinese interference was observed in students' productions. Explicit instructions in speech acts are strongly recommended.

Keywords: Chinese interference, German pragmatics, German teaching, make appropriate requests in German, speech act of requesting

Procedia PDF Downloads 465
2870 Enhanced Thai Character Recognition with Histogram Projection Feature Extraction

Authors: Benjawan Rangsikamol, Chutimet Srinilta

Abstract:

This research paper deals with extraction of Thai character features using the proposed histogram projection so as to improve the recognition performance. The process starts with transformation of image files into binary files before thinning. After character thinning, the skeletons are entered into the proposed extraction using histogram projection (horizontal and vertical) to extract unique features which are inputs of the subsequent recognition step. The recognition rate with the proposed extraction technique is as high as 97 percent since the technique works very well with the idiosyncrasies of Thai characters.

Keywords: character recognition, histogram projection, multilayer perceptron, Thai character features extraction

Procedia PDF Downloads 464
2869 The Speech Acts of Selected Classroom Encounters: Analyzing the Speech Acts of a Career Technology Lesson

Authors: Michael Amankwaa Adu

Abstract:

This study investigates the speech acts employed by a Career Technology teacher during classroom interactions in a junior high school. While much research exists on speech acts in language teaching, little attention has been given to technical subjects. This has created a gap in understanding how teachers of non-language subjects utilize speech acts in classroom communication. This study aims to analyze the types and frequencies of speech acts used by a Career Technology teacher during three key classroom encounters: lesson introduction, content delivery, and classroom management. Using a mixed-methods approach, the study examines 113 utterances from the teacher's lesson, categorizing them into four primary speech act types: directives, assertives, expressives, and commissives. Directives emerged as the most dominant form, accounting for 59.3% of the utterances, followed by assertives (20.4%), expressives (14.2%), and commissives (6.2%). No declarations were observed. The study demonstrates how the teacher uses directives to manage student behavior and assertives to reinforce information. Expressives are used sparingly but play a role in motivating or disciplining students, while commissives help establish classroom rules and set expectations. The findings contribute to understanding classroom interaction strategies in non-language subjects, offering insights that could inform teacher training and curriculum development. The study underscores the importance of effective communication in technical subjects and suggests ways in which language teaching techniques might be integrated into other subject areas.

Keywords: classroom management, directives, speech acts, technical subjects., assertives

Procedia PDF Downloads 21
2868 Design of an Augmented Automatic Choosing Control with Constrained Input by Lyapunov Functions Using Gradient Optimization Automatic Choosing Functions

Authors: Toshinori Nawata

Abstract:

In this paper a nonlinear feedback control called augmented automatic choosing control (AACC) for a class of nonlinear systems with constrained input is presented. When designing the control, a constant term which arises from linearization of a given nonlinear system is treated as a coefficient of a stable zero dynamics. Parameters of the control are suboptimally selected by maximizing the stable region in the sense of Lyapunov with the aid of a genetic algorithm. This approach is applied to a field excitation control problem of power system to demonstrate the splendidness of the AACC. Simulation results show that the new controller can improve performance remarkably well.

Keywords: augmented automatic choosing control, nonlinear control, genetic algorithm, zero dynamics

Procedia PDF Downloads 478
2867 Childhood Apraxia of Speech and Autism: Interaction Influences and Treatment

Authors: Elad Vashdi

Abstract:

It is common to find speech deficit among children diagnosed with Autism. It can be found in the clinical field and recently in research. One of the DSM-V criteria suggests a speech delay (Delay in, or total lack of, the development of spoken language), but doesn't explain the cause of it. A common perception among professionals and families is that the inability to talk results from the autism. Autism is a name for a syndrome which just describes a phenomenon and is defined behaviorally. Since it is not based yet on a physiological gold standard, one can not conclude the nature of a deficit based on the name of the syndrome. A wide retrospective research (n=270) which included children with motor speech difficulties was conducted in Israel. The study analyzed entry evaluations in a private clinic during the years 2006-2013. The data was extracted from the reports. High percentage of children diagnosed with Autism (60%) was found. This result demonstrates the high relationship between Autism and motor speech problem. It also supports recent findings in research of Childhood apraxia of speech (CAS) occurrence among children with ASD. Only small percentage of the participants in this research (10%) were diagnosed with CAS even though their verbal deficits well fitted the guidelines for CAS diagnosis set by ASHA in 2007. This fact raises questions regarding the diagnostic procedure in Israel. The understanding that CAS might highly exist within Autism and can have a remarkable influence on the course of early development should be a guiding tool within the diagnosis procedure. CAS can explain the nature of the speech problem among some of the autistic children and guide the treatment in a more accurate way. Calculating the prevalence of CAS which includes the comorbidity with ASD reveals new numbers and suggests treating differently the CAS population.

Keywords: childhood apraxia of speech, Autism, treatment, speech

Procedia PDF Downloads 275
2866 Speaker Recognition Using LIRA Neural Networks

Authors: Nestor A. Garcia Fragoso, Tetyana Baydyk, Ernst Kussul

Abstract:

This article contains information from our investigation in the field of voice recognition. For this purpose, we created a voice database that contains different phrases in two languages, English and Spanish, for men and women. As a classifier, the LIRA (Limited Receptive Area) grayscale neural classifier was selected. The LIRA grayscale neural classifier was developed for image recognition tasks and demonstrated good results. Therefore, we decided to develop a recognition system using this classifier for voice recognition. From a specific set of speakers, we can recognize the speaker’s voice. For this purpose, the system uses spectrograms of the voice signals as input to the system, extracts the characteristics and identifies the speaker. The results are described and analyzed in this article. The classifier can be used for speaker identification in security system or smart buildings for different types of intelligent devices.

Keywords: extreme learning, LIRA neural classifier, speaker identification, voice recognition

Procedia PDF Downloads 177
2865 New Approaches for the Handwritten Digit Image Features Extraction for Recognition

Authors: U. Ravi Babu, Mohd Mastan

Abstract:

The present paper proposes a novel approach for handwritten digit recognition system. The present paper extract digit image features based on distance measure and derives an algorithm to classify the digit images. The distance measure can be performing on the thinned image. Thinning is the one of the preprocessing technique in image processing. The present paper mainly concentrated on an extraction of features from digit image for effective recognition of the numeral. To find the effectiveness of the proposed method tested on MNIST database, CENPARMI, CEDAR, and newly collected data. The proposed method is implemented on more than one lakh digit images and it gets good comparative recognition results. The percentage of the recognition is achieved about 97.32%.

Keywords: handwritten digit recognition, distance measure, MNIST database, image features

Procedia PDF Downloads 461
2864 Emotion Recognition in Video and Images in the Wild

Authors: Faizan Tariq, Moayid Ali Zaidi

Abstract:

Facial emotion recognition algorithms are expanding rapidly now a day. People are using different algorithms with different combinations to generate best results. There are six basic emotions which are being studied in this area. Author tried to recognize the facial expressions using object detector algorithms instead of traditional algorithms. Two object detection algorithms were chosen which are Faster R-CNN and YOLO. For pre-processing we used image rotation and batch normalization. The dataset I have chosen for the experiments is Static Facial Expression in Wild (SFEW). Our approach worked well but there is still a lot of room to improve it, which will be a future direction.

Keywords: face recognition, emotion recognition, deep learning, CNN

Procedia PDF Downloads 187
2863 An Improved Face Recognition Algorithm Using Histogram-Based Features in Spatial and Frequency Domains

Authors: Qiu Chen, Koji Kotani, Feifei Lee, Tadahiro Ohmi

Abstract:

In this paper, we propose an improved face recognition algorithm using histogram-based features in spatial and frequency domains. For adding spatial information of the face to improve recognition performance, a region-division (RD) method is utilized. The facial area is firstly divided into several regions, then feature vectors of each facial part are generated by Binary Vector Quantization (BVQ) histogram using DCT coefficients in low frequency domains, as well as Local Binary Pattern (LBP) histogram in spatial domain. Recognition results with different regions are first obtained separately and then fused by weighted averaging. Publicly available ORL database is used for the evaluation of our proposed algorithm, which is consisted of 40 subjects with 10 images per subject containing variations in lighting, posing, and expressions. It is demonstrated that face recognition using RD method can achieve much higher recognition rate.

Keywords: binary vector quantization (BVQ), DCT coefficients, face recognition, local binary patterns (LBP)

Procedia PDF Downloads 349
2862 Multivariate Data Analysis for Automatic Atrial Fibrillation Detection

Authors: Zouhair Haddi, Stephane Delliaux, Jean-Francois Pons, Ismail Kechaf, Jean-Claude De Haro, Mustapha Ouladsine

Abstract:

Atrial fibrillation (AF) has been considered as the most common cardiac arrhythmia, and a major public health burden associated with significant morbidity and mortality. Nowadays, telemedical approaches targeting cardiac outpatients situate AF among the most challenged medical issues. The automatic, early, and fast AF detection is still a major concern for the healthcare professional. Several algorithms based on univariate analysis have been developed to detect atrial fibrillation. However, the published results do not show satisfactory classification accuracy. This work was aimed at resolving this shortcoming by proposing multivariate data analysis methods for automatic AF detection. Four publicly-accessible sets of clinical data (AF Termination Challenge Database, MIT-BIH AF, Normal Sinus Rhythm RR Interval Database, and MIT-BIH Normal Sinus Rhythm Databases) were used for assessment. All time series were segmented in 1 min RR intervals window and then four specific features were calculated. Two pattern recognition methods, i.e., Principal Component Analysis (PCA) and Learning Vector Quantization (LVQ) neural network were used to develop classification models. PCA, as a feature reduction method, was employed to find important features to discriminate between AF and Normal Sinus Rhythm. Despite its very simple structure, the results show that the LVQ model performs better on the analyzed databases than do existing algorithms, with high sensitivity and specificity (99.19% and 99.39%, respectively). The proposed AF detection holds several interesting properties, and can be implemented with just a few arithmetical operations which make it a suitable choice for telecare applications.

Keywords: atrial fibrillation, multivariate data analysis, automatic detection, telemedicine

Procedia PDF Downloads 267
2861 Deep-Learning Based Approach to Facial Emotion Recognition through Convolutional Neural Network

Authors: Nouha Khediri, Mohammed Ben Ammar, Monji Kherallah

Abstract:

Recently, facial emotion recognition (FER) has become increasingly essential to understand the state of the human mind. Accurately classifying emotion from the face is a challenging task. In this paper, we present a facial emotion recognition approach named CV-FER, benefiting from deep learning, especially CNN and VGG16. First, the data is pre-processed with data cleaning and data rotation. Then, we augment the data and proceed to our FER model, which contains five convolutions layers and five pooling layers. Finally, a softmax classifier is used in the output layer to recognize emotions. Based on the above contents, this paper reviews the works of facial emotion recognition based on deep learning. Experiments show that our model outperforms the other methods using the same FER2013 database and yields a recognition rate of 92%. We also put forward some suggestions for future work.

Keywords: CNN, deep-learning, facial emotion recognition, machine learning

Procedia PDF Downloads 95
2860 An Automatic Large Classroom Attendance Conceptual Model Using Face Counting

Authors: Sirajdin Olagoke Adeshina, Haidi Ibrahim, Akeem Salawu

Abstract:

large lecture theatres cannot be covered by a single camera but rather by a multicamera setup because of their size, shape, and seating arrangements. Although, classroom capture is achievable through a single camera. Therefore, a design and implementation of a multicamera setup for a large lecture hall were considered. Researchers have shown emphasis on the impact of class attendance taken on the academic performance of students. However, the traditional method of carrying out this exercise is below standard, especially for large lecture theatres, because of the student population, the time required, sophistication, exhaustiveness, and manipulative influence. An automated large classroom attendance system is, therefore, imperative. The common approach in this system is face detection and recognition, where known student faces are captured and stored for recognition purposes. This approach will require constant face database updates due to constant changes in the facial features. Alternatively, face counting can be performed by cropping the localized faces on the video or image into a folder and then count them. This research aims to develop a face localization-based approach to detect student faces in classroom images captured using a multicamera setup. A selected Haar-like feature cascade face detector trained with an asymmetric goal to minimize the False Rejection Rate (FRR) relative to the False Acceptance Rate (FAR) was applied on Raspberry Pi 4B. A relationship between the two factors (FRR and FAR) was established using a constant (λ) as a trade-off between the two factors for automatic adjustment during training. An evaluation of the proposed approach and the conventional AdaBoost on classroom datasets shows an improvement of 8% TPR (output result of low FRR) and 7% minimization of the FRR. The average learning speed of the proposed approach was improved with 1.19s execution time per image compared to 2.38s of the improved AdaBoost. Consequently, the proposed approach achieved 97% TPR with an overhead constraint time of 22.9s compared to 46.7s of the improved Adaboost when evaluated on images obtained from a large lecture hall (DK5) USM.

Keywords: automatic attendance, face detection, haar-like cascade, manual attendance

Procedia PDF Downloads 71
2859 Speech Motor Processing and Animal Sound Communication

Authors: Ana Cleide Vieira Gomes Guimbal de Aquino

Abstract:

Sound communication is present in most vertebrates, from fish, mainly in species that live in murky waters, to some species of reptiles, anuran amphibians, birds, and mammals, including primates. There are, in fact, relevant similarities between human language and animal sound communication, and among these similarities are the vocalizations called calls. The first specific call in human babies is crying, which has a characteristic prosodic contour and is motivated most of the time by the need for food and by affecting the puppy-caregiver interaction, with a view to communicating the necessities and food requests and guaranteeing the survival of the species. The present work aims to articulate speech processing in the motor context with aspects of the project entitled emotional states and vocalization: a comparative study of the prosodic contours of crying in human and non-human animals. First, concepts of speech motor processing and general aspects of speech evolution will be presented to relate these two approaches to animal sound communication.

Keywords: speech motor processing, animal communication, animal behaviour, language acquisition

Procedia PDF Downloads 89