Search results for: part of speech
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2013

1893 Using Speech Emotion Recognition as a Longitudinal Biomarker for Alzheimer’s Disease

Authors: Yishu Gong, Liangliang Yang, Jianyu Zhang, Zhengyu Chen, Sihong He, Xusheng Zhang, Wei Zhang

Abstract:

Alzheimer’s disease (AD) is a progressive neurodegenerative disorder that affects millions of people worldwide and is characterized by cognitive decline and behavioral changes. People living with Alzheimer’s disease often find it hard to complete routine tasks. However, there are limited objective assessments that aim to quantify the difficulty of certain tasks for AD patients compared to non-AD people. In this study, we propose to use speech emotion recognition (SER), and in particular the detected frustration level, as a potential biomarker for quantifying the difficulty patients experience when describing a picture. We build an SER model using data from the IEMOCAP dataset and apply the model to the DementiaBank data to detect the AD/non-AD group difference and perform longitudinal analysis to track AD progression. Our results show that the frustration level detected by the SER model can possibly be used as a cost-effective tool for objective tracking of AD progression in addition to the Mini-Mental State Examination (MMSE) score.

Keywords: Alzheimer’s disease, Speech Emotion Recognition, longitudinal biomarker, machine learning.

Downloads: 174
1892 Spectral Analysis of Speech: A New Technique

Authors: Neeta Awasthy, J.P.Saini, D.S.Chauhan

Abstract:

Independent component analysis (ICA), which is generally used for blind source separation, has been tested for feature extraction in a speech recognition system to replace the phoneme-based approach of MFCC. Feeding the generated cepstral coefficients to ICA as a preprocessing step yields a new signal processing approach. This gives much better results than MFCC or ICA alone, for both word and speaker recognition. As expected, the mixing matrix A differs before and after MFCC, because Mel is a nonlinear scale. However, cepstral coefficients generated from linear predictive coefficients, being independent, prove to be the right candidates for ICA. Matlab is the tool used for all comparisons. The database consists of samples from ISOLET.

Keywords: Cepstral Coefficient, Distance measures, Independent Component Analysis, Linear Predictive Coefficients.

Downloads: 1915
1891 Multi Switched Split Vector Quantization of Narrowband Speech Signals

Authors: M. Satya Sai Ram, P. Siddaiah, M. Madhavi Latha

Abstract:

Vector quantization is a powerful tool for speech coding applications. This paper deals with LPC coding of speech signals using a new technique called Multi Switched Split Vector Quantization (MSSVQ), which is a hybrid of the multi stage, switched, and split vector quantization techniques. The spectral distortion performance, computational complexity, and memory requirements of MSSVQ are compared to those of split vector quantization (SVQ), multi stage vector quantization (MSVQ), and switched split vector quantization (SSVQ). The results show that MSSVQ has better spectral distortion performance, lower computational complexity, and lower memory requirements than all of the above-mentioned product code vector quantization techniques. Computational complexity is measured in floating point operations (flops), and memory requirements are measured in floats.

Keywords: Linear predictive coding, Multi stage vector quantization, Switched split vector quantization, Split vector quantization, Line Spectral Frequencies (LSF).

Downloads: 1621
1890 Formant Tracking Linear Prediction Model using HMMs for Noisy Speech Processing

Authors: Zaineb Ben Messaoud, Dorra Gargouri, Saida Zribi, Ahmed Ben Hamida

Abstract:

This paper presents a formant-tracking linear prediction (FTLP) model for speech processing in noise. The main focus of this work is the detection of formant trajectories based on Hidden Markov Models (HMM), for improved formant estimation in noise. The proposed approach provides a systematic framework for modelling and utilizing a time sequence of peaks which satisfies continuity constraints on the parameters; the within peaks are modelled by the LP parameters. The formant-tracking LP model estimation is composed of three stages: (1) a pre-cleaning multi-band spectral subtraction stage to reduce the effect of residual noise on formants; (2) an estimation stage where an initial estimate of the LP model of speech for each frame is obtained; (3) a formant classification stage using probability models of formants and Viterbi decoders. The evaluation results for the estimation of the formant-tracking LP model, tested in a Gaussian white noise background, demonstrate that the proposed combination of the initial noise reduction stage with formant tracking and variable-order LPC analysis results in a significant reduction in errors and distortions. The performance was evaluated with noisy natural vowels extracted from international French and English vocabulary speech signals at an SNR of 10 dB. In each case, the estimated formants are compared to reference formants.

Keywords: Formants Estimation, HMM, Multi Band Spectral Subtraction, Variable order LPC coding, White Gaussian Noise.

Downloads: 1928
1889 The Analysis of Deceptive and Truthful Speech: A Computational Linguistic Based Method

Authors: Seham El Kareh, Miramar Etman

Abstract:

Recently, detecting liars and extracting features which distinguish them from truth-tellers have been the focus of a wide range of disciplines. To the authors’ best knowledge, most of the work has been done on facial expressions and body gestures, but only a few works have addressed the language used by both liars and truth-tellers. This paper sheds light on four axes. The first axis deals with building an audio corpus of deceptive and truthful speech for Egyptian Arabic speakers. The second axis focuses on examining the human perception of lies and demonstrating the need for computational linguistic-based methods to extract features which characterize truthful and deceptive speech. The third axis is concerned with building a linguistic analysis program that can extract from the corpus the inter- and intra-linguistic cues for deceptive and truthful speech. The program built here is based on selected categories from the Linguistic Inquiry and Word Count program. Our results demonstrated that, when lying, Egyptian Arabic speakers preferred to use first-person pronouns and the present tense rather than the past tense, and their lies lacked second-person pronouns; when telling the truth, they preferred to use verbs related to motion and nouns related to time. The results also showed that more data are needed to establish the significance of words related to emotions and numbers.

Keywords: Egyptian Arabic corpus, computational analysis, deceptive features, forensic linguistics, human perception, truthful features.

Downloads: 1141
1888 Conspiracy Theory in Discussions of the Coronavirus Pandemic in the Gulf Region

Authors: Rasha Salameh

Abstract:

In light of the tense relationship between Saudi Arabia and Iran, this research paper sheds some light on how the Saudi-owned television network Al-Arabiya reported on the Coronavirus in the Gulf region. Particularly because most of the cases in the beginning were coming from Iran, some programs of this Saudi channel embraced a conspiracy theory. Hate speech was used in talks and discussions about the topic. The results of these discussions will be detailed in this paper in percentages with regard to the research sample, which includes five programs on the Al-Arabiya channel: ‘DNA’, ‘Marraya’ (Mirrors), ‘Panorama’, ‘Tafaolcom’ (Your Interaction) and ‘Diplomatic Street’, in the period between January 19, that is, the date of the first case in Iran, and April 10, 2020. The research shows the use of a conspiracy theory in the programs, in addition to some professional violations. The surveyed sample also shows that the matter receded due to the Arab Gulf states' preoccupation with the successively increasing cases that have appeared there since the start of the pandemic. The results indicate that hate speech was present in the sample at a rate of 98.1%, and that most of the programs that dealt with the Iranian issue under the Coronavirus pandemic on Al-Arabiya used the conspiracy theory at a rate of 75.5%.

Keywords: Al-Arabiya, Iran, COVID-19, hate speech, conspiracy theory, politicization of the pandemic

Downloads: 412
1887 Solving Part Type Selection and Loading Problem in Flexible Manufacturing System Using Real Coded Genetic Algorithms – Part I: Modeling

Authors: Wayan F. Mahmudy, Romeo M. Marian, Lee H. S. Luong

Abstract:

This paper and its companion (Part 2) deal with modeling and optimization of two NP-hard problems in production planning of flexible manufacturing systems (FMS): the part type selection problem and the loading problem. The part type selection problem and the loading problem are strongly related and heavily influence the system's efficiency and productivity. The complexity of the problems increases when flexibilities of operations, such as the possibility of processing an operation on alternative machines with alternative tools, are considered. These problems have been modeled and solved simultaneously by using real coded genetic algorithms (RCGA), which use an array of real numbers as the chromosome representation. These real numbers can be converted into a part type sequence and the machines that are used to process the part types. This first part of the paper focuses on modeling the problems and discusses how the novel chromosome representation can be applied to solve them. The second part will discuss the effectiveness of the RCGA in solving various test bed problems.
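The abstract does not say how the real numbers are converted into a part type sequence and machine choices. As an illustration only, the sketch below assumes a common random-key style decoding: the first half of the genes is ranked to give a part type order, and each gene in the second half is scaled to a machine index. The function name `decode_chromosome`, the gene layout, and the value range [0, 1) are all hypothetical and are not taken from the paper.

```python
def decode_chromosome(genes, n_parts, n_machines):
    """Decode a real-valued chromosome (hypothetical layout, not the paper's).

    genes[:n_parts]  -> ranking keys that give the part type sequence
    genes[n_parts:]  -> values in [0, 1) scaled to a machine index per part
    """
    if len(genes) != 2 * n_parts:
        raise ValueError("expected two genes per part type")
    order_keys = genes[:n_parts]
    machine_keys = genes[n_parts:]
    # Part types sorted by ascending key value form the processing sequence.
    sequence = sorted(range(n_parts), key=lambda p: order_keys[p])
    # Scale each key to an integer machine index, clamped to the valid range.
    machines = [min(int(g * n_machines), n_machines - 1) for g in machine_keys]
    return sequence, machines


# Toy chromosome for 3 part types and 2 machines (made-up values).
genes = [0.72, 0.15, 0.40, 0.05, 0.99, 0.50]
sequence, machines = decode_chromosome(genes, n_parts=3, n_machines=2)
print(sequence, machines)
```

One reason such decodings are popular is that any real vector maps to a valid sequence, so standard real-valued crossover and mutation never produce an infeasible permutation.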

Keywords: Flexible manufacturing system, production planning, part type selection problem, loading problem, real-coded genetic algorithm

Downloads: 2067
1886 Improved Text-Independent Speaker Identification using Fused MFCC and IMFCC Feature Sets based on Gaussian Filter

Authors: Sandipan Chakroborty, Goutam Saha

Abstract:

A state-of-the-art Speaker Identification (SI) system requires a robust feature extraction unit followed by a speaker modeling scheme for generalized representation of these features. Over the years, Mel-Frequency Cepstral Coefficients (MFCC), modeled on the human auditory system, have been used as a standard acoustic feature set for speech-related applications. In a recent contribution by the authors, it was shown that the Inverted Mel-Frequency Cepstral Coefficients (IMFCC) are a useful feature set for SI, containing complementary information present in the high-frequency region. This paper introduces the Gaussian-shaped filter (GF) for calculating MFCC and IMFCC in place of the typical triangular-shaped bins. The objective is to introduce a higher amount of correlation between subband outputs. The performance of both MFCC and IMFCC improves with GF over the conventional triangular filter (TF) based implementation, individually as well as in combination. With GMM as the speaker modeling paradigm, the performance of the proposed GF-based MFCC and IMFCC in individual and fused mode has been verified on two standard databases, YOHO (microphone speech) and POLYCOST (telephone speech), each of which has more than 130 speakers.

Keywords: Gaussian Filter, Triangular Filter, Subbands, Correlation, MFCC, IMFCC, GMM.

Downloads: 2391
1885 Investigation on an Innovative Way to Connect RC Beam and Steel Column

Authors: Ahmed H. El-Masry, Mohamed A. Dabaon, Tarek F. El-Shafiey, Abd El-Hakim A. Khalil

Abstract:

An experimental study was performed to investigate the behavior and strength of a proposed technique for connecting a reinforced concrete (RC) beam to steel or composite columns. This approach can practically be used in several types of building construction. In this technique, the main beam of the frame consists of a transfer part (part of beam; Tr.P) and a common reinforced concrete beam. The transfer part of the beam is connected to the column, whereas the rest of the beam is connected to the transfer part from each side. Four full-scale beam-column connections were tested under static loading. The test parameters were the length of the transfer part and the column properties. The test results show that the transfer part technique modifies the deformation capabilities of the RC beam and hence increases its resistance against failure. An increase in the length of the transfer part did not necessarily indicate enhanced behavior. The test results contribute to the characterization of the connection behavior between an RC beam and a steel column and can be used to calibrate numerical models for the simulation of this type of connection.

Keywords: Composite column, reinforced concrete beam, Steel Column, Transfer Part.

Downloads: 5256
1884 Speaker Independent Quranic Recognizer Based on Maximum Likelihood Linear Regression

Authors: Ehab Mourtaga, Ahmad Sharieh, Mousa Abdallah

Abstract:

An automatic speech recognition system for the formal Arabic language is needed. The Quran is the most formal spoken book in Arabic, and it is recited all over the world. In this research, a speaker-independent automatic speech recognizer for the Quran was developed and tested. The system was developed based on the tri-phone Hidden Markov Model and Maximum Likelihood Linear Regression (MLLR). MLLR computes a set of transformations which reduces the mismatch between an initial model set and the adaptation data. It uses a regression class tree and estimates a set of linear transformations for the mean and variance parameters of a Gaussian mixture HMM system. The 30th Chapter of the Quran, recited by five of the most famous readers of the Quran, was used for training and testing. The chapter includes about 2000 distinct words. The advantages of using the Quranic verses as the database for this recognizer are the uniqueness of the words and the high level of orderliness between verses. The accuracy on the test data ranged from 68% to 85%.

Keywords: Hidden Markov Model (HMM), Maximum Likelihood Linear Regression (MLLR), Quran, Regression Class Tree, Speech Recognition, Speaker-independent.

Downloads: 1874
1883 Face Localization Using Illumination-dependent Face Model for Visual Speech Recognition

Authors: Robert E. Hursig, Jane X. Zhang

Abstract:

A robust still image face localization algorithm capable of operating in an unconstrained visual environment is proposed. First, construction of a robust skin classifier within a shifted HSV color space is described. Then various filtering operations are performed to better isolate face candidates and mitigate the effect of substantial non-skin regions. Finally, a novel Bhattacharyya-based face detection algorithm is used to compare candidate regions of interest with a unique illumination-dependent face model probability distribution function approximation. Experimental results show a 90% face detection success rate despite the demands of the visually noisy environment.
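The comparison step mentioned above rests on the Bhattacharyya coefficient, which measures the overlap between two discrete distributions. The following is a minimal sketch of that coefficient only; the histograms are made-up 4-bin examples, and nothing here reproduces the paper's illumination-dependent face model or its color space.

```python
import math


def bhattacharyya_coefficient(p, q):
    """Return sum_i sqrt(p_i * q_i) after normalizing both histograms to sum to 1."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive mass")
    return sum(math.sqrt((a / total_p) * (b / total_q)) for a, b in zip(p, q))


# Hypothetical histograms: a face model and two candidate regions.
face_model = [0.1, 0.4, 0.4, 0.1]
candidate_a = [0.1, 0.4, 0.4, 0.1]  # same distribution as the model
candidate_b = [1.0, 0.0, 0.0, 0.0]  # all mass in a single bin

score_a = bhattacharyya_coefficient(face_model, candidate_a)
score_b = bhattacharyya_coefficient(face_model, candidate_b)
print(score_a, score_b)
```

The coefficient lies between 0 (no overlap) and 1 (identical distributions), so a candidate region can be accepted when its score exceeds a chosen threshold.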

Keywords: Audio-visual speech recognition, Bhattacharyya coefficient, face detection,

Downloads: 1587
1882 Assessment of the Occupancy’s Effect on Speech Intelligibility in Al-Madinah Holy Mosque

Authors: Wasim Orfali, Hesham Tolba

Abstract:

This research investigates the acoustical characteristics of Al-Madinah Holy Mosque. Extensive field measurements were conducted in different locations of Al-Madinah Holy Mosque to characterize its acoustic characteristics. The acoustical characteristics are usually evaluated by the use of objective parameters in unoccupied rooms due to practical considerations. However, under normal conditions, the room occupancy can vary such characteristics due to the effect of the additional sound absorption present in the room or by the change in signal-to-noise ratio. Based on the acoustic measurements carried out in Al-Madinah Holy Mosque with and without occupancy, and the analysis of such measurements, the existence of acoustical deficiencies has been confirmed.

Keywords: Worship sound, Al-Madinah Holy Mosque, mosque acoustics, speech intelligibility.

Downloads: 660
1881 A Sociolinguistic Study of the Outcomes of Arabic-French Contact in the Algerian Dialect: Tlemcen Speech Community as a Case Study

Authors: R. Rahmoun-Mrabet

Abstract:

It is acknowledged that our style of speaking changes according to a wide range of variables such as gender, setting, the age of both the addresser and the addressee, the conversation topic, and the aim of the interaction. These differences in style are noticeable in monolingual and multilingual speech communities. Yet, they are more observable in speech communities where two or more codes coexist. The linguistic situation in Algeria reflects a state of bilingualism because of the coexistence of Arabic and French. Nevertheless, like all Arab countries, it is characterized by diglossia, i.e. the concomitance of Modern Standard Arabic (MSA) and Algerian Arabic (AA), the former standing for the ‘high variety’ and the latter for the ‘low variety’. The two varieties are derived from the same source but are used to fulfil distinct functions: MSA is used in the domains of religion, literature, education and formal settings, whereas AA is used in informal settings, in everyday speech. French has strongly affected the Algerian language and culture because of the historical background of Algeria; thus, what can easily be noticed in Algeria is that everyday speech is characterized by code-switching between dialectal Arabic and French or by the use of borrowings. Tamazight is also very present in many regions of Algeria and is the mother tongue of many Algerians. Yet, it is not used in the west of Algeria, where the study has been conducted. The present work, which was conducted in the speech community of Tlemcen, Algeria, aims at depicting some of the outcomes of the contact of Arabic with French, such as code-switching, borrowing and interference. The question that has been asked is whether Algerians are aware of their use of borrowings or not. Three steps are followed in this research; the first one is to depict the sociolinguistic situation in Algeria and to describe the linguistic characteristics of the dialect of Tlemcen, which are specific to this city. 
The second one is concerned with data collection. Data have been collected from 57 informants who were given questionnaires and who have then been classified according to their age, gender and level of education. Information has also been collected through observation, and note taking. The third step is devoted to analysis. The results obtained reveal that most Algerians are aware of their use of borrowings. The present work clarifies how words are borrowed from French, and then adapted to Arabic. It also illustrates the way in which singular words inflect into plural. The results expose the main characteristics of borrowing as opposed to code-switching. The study also clarifies how interference occurs at the level of nouns, verbs and adjectives.

Keywords: Bilingualism, borrowing, code-switching, interference, language contact.

Downloads: 888
1880 Employment Discrimination on Civil Servant Recruitment

Authors: Li Lei, Jia Jidong

Abstract:

The right to employment is linked to the people’s livelihood in our society. As one of the most important and representative parts of the labor market, the employment of public servants always attracts much attention. But discrimination in the employment of public servants has always existed and has become a controversy in our society. The paper discusses this problem in four parts: First, the employment of public servants has a representative status in our labor market. The second part is about discrimination in the employment of public servants. The third part is about the right of equality and its significance. The last part analyzes the legal predicament concerning discrimination in the employment of public servants in China.

Keywords: Discrimination, Employment of public servants, Right of labor.

Downloads: 2061
1879 On a Pitch Duration Technique for Prosody Control

Authors: JongKuk Kim, HernSoo Hahn, Uei-Joong Yoo, MyungJin Bae

Abstract:

In this paper, we propose a method for altering duration in the frequency domain that controls prosody in real time after pitch alteration. A method that can freely alter duration, among other prosodic information, could be used in several fields, such as pronunciation correction for people with speech impediments or language study. The pitch alteration used for prosody control is performed with the PSOLA synthesis method, which operates in the time domain. The duration of the pitch-altered speech, however, is changed in the frequency domain: we alter the duration using the Fast Fourier Transform. As a result, intelligibility when both pitch and duration are controlled is slightly lower than when only pitch is changed, but the proposed algorithm obtained a higher MOS score for naturalness.

Keywords: PSOLA, Pitch Alteration, Duration Control.

Downloads: 1636
1878 DHT-LMS Algorithm for Sensorineural Loss Patients

Authors: Sunitha S. L., V. Udayashankara

Abstract:

Hearing impairment is the number one chronic disability affecting many people in the world. Background noise is particularly damaging to speech intelligibility for people with hearing loss, especially for sensorineural loss patients. Several investigations on speech intelligibility have demonstrated that sensorineural loss patients need 5-15 dB higher SNR than normal-hearing subjects. This paper describes the Discrete Hartley Transform Power Normalized Least Mean Square algorithm (DHT-LMS) to improve the SNR and to reduce the convergence time of the Least Mean Square (LMS) algorithm for sensorineural loss patients. The DHT transforms n real numbers to n real numbers and has the convenient property of being its own inverse (up to a normalization factor). It can be effectively used for noise cancellation with less convergence time. The simulated results show superior characteristics, improving the SNR by at least 9 dB for an input SNR of 0 dB, with a faster convergence rate (eigenvalue ratio 12) compared to the time-domain method and DFT-LMS.
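The self-inverse property of the DHT can be checked numerically. The sketch below uses the textbook definition H[k] = sum over n of x[n] * (cos(2*pi*n*k/N) + sin(2*pi*n*k/N)) in a direct O(N^2) form; applying it twice returns the input scaled by N. It illustrates the transform only, not the DHT-LMS filter described in the abstract, and the test signal is arbitrary.

```python
import math


def dht(x):
    """Discrete Hartley transform, direct O(N^2) evaluation of the definition."""
    n_pts = len(x)
    out = []
    for k in range(n_pts):
        acc = 0.0
        for n, xn in enumerate(x):
            angle = 2.0 * math.pi * n * k / n_pts
            # cas(angle) = cos(angle) + sin(angle) is the Hartley kernel.
            acc += xn * (math.cos(angle) + math.sin(angle))
        out.append(acc)
    return out


def idht(h):
    """Inverse DHT: the same transform followed by division by N."""
    return [v / len(h) for v in dht(h)]


signal = [1.0, 2.0, 0.0, -1.0, 3.0]  # arbitrary real-valued test signal
recovered = idht(dht(signal))
print(recovered)
```

Because the forward and inverse transforms share one real-valued kernel, a single routine serves both directions.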

Keywords: Hearing Impairment, DHT-LMS, Convergence rate, SNR improvement.

Downloads: 1677
1877 Optimized Brain Computer Interface System for Unspoken Speech Recognition: Role of Wernicke Area

Authors: Nassib Abdallah, Pierre Chauvet, Abd El Salam Hajjar, Bassam Daya

Abstract:

In this paper, we propose an optimized brain computer interface (BCI) system for unspoken speech recognition, based on the fact that the construction of unspoken words relies strongly on the Wernicke area, situated in the temporal lobe. Our BCI system has four modules: (i) the EEG Acquisition module based on a non-invasive headset with 14 electrodes; (ii) the Preprocessing module to remove noise and artifacts, using the Common Average Reference method; (iii) the Feature Extraction module, using the Wavelet Packet Transform (WPT); (iv) the Classification module based on a one-hidden-layer artificial neural network. The present study compares the recognition accuracy of 5 Arabic words when using all the headset electrodes or only the 4 electrodes situated near the Wernicke area, as well as the effect of selecting the subbands produced by the WPT module. After applying the artificial neural network to the produced database, we obtain, on the test dataset, an accuracy of 83.4% with all the electrodes and all the subbands of 8 levels of the WPT decomposition. However, by using only the 4 electrodes near the Wernicke area and the 6 middle subbands of the WPT, we obtain a high reduction of the dataset size, equal to approximately 19% of the total dataset, with an accuracy of 67.5%. This reduction appears particularly important for improving the design of a low-cost and simple-to-use BCI, trained for several words.
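The Common Average Reference step in module (ii) can be illustrated with a few lines: at every time sample, the mean across all electrodes is subtracted from each electrode. This is a minimal sketch of the general technique with made-up values and 3 electrodes rather than the 14 of the headset; it does not reproduce the paper's preprocessing pipeline.

```python
def common_average_reference(samples):
    """Subtract the across-electrode mean from every electrode, per time sample.

    samples: list of time samples, each a list with one value per electrode.
    """
    referenced = []
    for frame in samples:
        mean = sum(frame) / len(frame)
        referenced.append([value - mean for value in frame])
    return referenced


# Two time samples from three hypothetical electrodes (arbitrary units).
eeg = [[10.0, 12.0, 14.0], [5.0, 5.0, 8.0]]
car = common_average_reference(eeg)
print(car)
```

After re-referencing, the values at each time sample sum to zero, which removes any component shared equally by all electrodes.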

Keywords: Brain-computer interface, speech recognition, electroencephalography EEG, Wernicke area, artificial neural network.

Downloads: 846
1876 Autistic Children and Different Tense Forms

Authors: Ameneh Zare, Shahin Nematzadeh, Shahla Raghibdoust, Iran Kalbassi

Abstract:

Autism spectrum disorder is characterized by abnormalities in social communication, language abilities and repetitive behaviors. The present study focused on some grammatical deficits in autistic children. We evaluated the impairment of correct use of different Persian verb tenses in autistic children's speech. Two standardized language tests were administered, and the gathered data were then analyzed. The main result of this study was a significant difference between the mean scores of correct responses to the present tense and to the past tense in Persian. This study demonstrated that tense is severely impaired in autistic children's speech. Our findings indicated that autistic children's production of the simple present/past tense opposition is better than their production of future and past periphrastic forms (past perfect, present perfect, past progressive).

Keywords: Autism, Past, Persian Language, Present, Tense

Downloads: 2710
1875 A Review in Advanced Digital Signal Processing Systems

Authors: Roza Dastres, Mohsen Soori

Abstract:

Digital Signal Processing (DSP) is the use of digital processing systems by computers in order to perform a variety of signal processing operations. It is the mathematical manipulation of a digital signal's numerical values in order to increase the quality as well as the effects of signals. DSP can include linear or nonlinear operators in order to process and analyze the input signals. Nonlinear DSP processing is closely related to nonlinear system detection and can be implemented in the time, frequency and space-time domains. Applications of DSP include control systems, digital image processing, biomedical engineering, speech recognition systems, industrial engineering, health care systems, radar signal processing and telecommunication systems. In this study, advanced methods and different applications of DSP are reviewed in order to move this interesting research field forward.

Keywords: Digital signal processing, advanced telecommunication, nonlinear signal processing, speech recognition systems.

Downloads: 947
1874 Fast Factored DCT-LMS Speech Enhancement for Performance Enhancement of Digital Hearing Aid

Authors: Sunitha. S.L., V. Udayashankara

Abstract:

Background noise is particularly damaging to speech intelligibility for people with hearing loss, especially for sensorineural loss patients. Several investigations on speech intelligibility have demonstrated that sensorineural loss patients need 5-15 dB higher SNR than normal-hearing subjects. This paper describes the Discrete Cosine Transform Power Normalized Least Mean Square algorithm to improve the SNR and to reduce the convergence time of the LMS for sensorineural loss patients. Since it requires only real arithmetic, it achieves a faster convergence rate compared to time-domain LMS, and the transformation also improves the eigenvalue distribution of the input autocorrelation matrix of the LMS filter. The DCT has good orthonormality, separability, and energy compaction properties. Although the DCT does not separate frequencies, it is a powerful signal decorrelator. It is a real-valued function and thus can be effectively used in real-time operation. The advantages of DCT-LMS compared to the standard LMS algorithm are shown via SNR and eigenvalue ratio computations. Exploiting the symmetry of the basis functions, the DCT transform matrix [AN] can be factored into a series of ±1 butterflies and rotation angles. This factorization results in one of the fastest DCT implementations. There are different ways to obtain factorizations. This work uses the fast factored DCT algorithm developed by Chen and colleagues. The computer simulation results show superior convergence characteristics of the proposed algorithm, improving the SNR by at least 10 dB for input SNR less than or equal to 0 dB, with faster convergence speed and better time and frequency characteristics.
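A transform-domain LMS filter with power normalization can be sketched as follows. This is a generic illustration, not the paper's implementation: it uses a plain orthonormal DCT-II matrix rather than the fast factored form, and the filter length, step size `mu`, smoothing factor `beta`, the toy AR(1) input, and the unknown system `h` are all assumptions chosen for the example.

```python
import math
import random


def dct_matrix(n):
    """Orthonormal DCT-II matrix; each row is one basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def dct_lms(x, d, n_taps=4, mu=0.02, beta=0.9, eps=1e-8):
    """Identify d from x with DCT-domain LMS; returns (weights, errors)."""
    t = dct_matrix(n_taps)
    w = [0.0] * n_taps        # weights in the transform domain
    power = [1.0] * n_taps    # running power estimate per transform bin
    buf = [0.0] * n_taps      # most recent input samples, newest first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        u = [sum(t[k][i] * buf[i] for i in range(n_taps)) for k in range(n_taps)]
        e = dn - sum(wk * uk for wk, uk in zip(w, u))
        for k in range(n_taps):
            # Normalizing each bin by its own power evens out convergence speed.
            power[k] = beta * power[k] + (1.0 - beta) * u[k] * u[k]
            w[k] += mu * e * u[k] / (power[k] + eps)
        errors.append(e)
    return w, errors


# Toy system identification with a correlated AR(1) input (assumed setup).
rng = random.Random(0)
x, prev = [], 0.0
for _ in range(4000):
    prev = 0.9 * prev + rng.gauss(0.0, 1.0)
    x.append(prev)
h = [0.5, -0.3, 0.2, 0.1]  # hypothetical unknown system
d = [sum(h[i] * x[n - i] for i in range(len(h)) if n - i >= 0)
     for n in range(len(x))]
weights, errors = dct_lms(x, d)
head = sum(abs(e) for e in errors[:200]) / 200
tail = sum(abs(e) for e in errors[-200:]) / 200
print(head, tail)
```

The per-bin normalization is what distinguishes this from time-domain LMS: the DCT roughly decorrelates the input, and dividing by each bin's power then reduces the effective eigenvalue spread seen by the update.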

Keywords: Hearing Impairment, DCT Adaptive filter, Sensorineural loss patients, Convergence rate.

Downloads: 2134
1873 On-line Speech Enhancement by Time-Frequency Masking under Prior Knowledge of Source Location

Authors: Min Ah Kang, Sangbae Jeong, Minsoo Hahn

Abstract:

This paper presents a source extraction system that can extract only the target signals, under constraints on source localization, in on-line systems. The proposed system is a method for enhancing a target signal and suppressing other interfering signals. Its performance is reported to be superior to that of other methods, and the extraction of the target source is comparatively complete. The method has a beamforming concept and uses an improved time-frequency (TF) mask-based BSS algorithm to separate a target signal from multiple noise sources. The target sources are assumed to be in front, and the test data were recorded in a reverberant room. The proposed method was evaluated by the PESQ score of real-recorded sentences and showed a noticeable speech enhancement.

Keywords: Beam forming, Non-stationary noise reduction, Source separation, TF mask.

Downloads: 1983
1872 Grammatically Coded Corpus of Spoken Lithuanian: Methodology and Development

Authors: L. Kamandulytė-Merfeldienė

Abstract:

The paper deals with the main methodological issues of the Corpus of Spoken Lithuanian, whose development began in 2006. At present, the corpus consists of 300,000 grammatically annotated word forms. The creation of the corpus consists of three main stages: collecting the data, transcribing the recorded data, and grammatical annotation. Data collection was based on the principles of balance and naturalness. The recorded speech was transcribed according to the CHAT requirements of CHILDES. The transcripts were double-checked and annotated grammatically using CHILDES. The development of the Corpus of Spoken Lithuanian has led to a constant increase in studies on spontaneous communication, and various papers have dealt with the distribution of parts of speech, the use of different grammatical forms, the variation of inflectional paradigms, the distribution of fillers, the syntactic functions of adjectives, and the mean length of utterances.

Keywords: CHILDES, Corpus of Spoken Lithuanian, grammatical annotation, grammatical disambiguation, lexicon, Lithuanian.

Downloads 906
1871 Investigating Medical Students’ Perspectives toward University Teachers’ Talking Features in an English as a Foreign Language Context in Urmia, Iran

Authors: Ismail Baniadam, Nafisa Tadayyon, Javid Fereidoni

Abstract:

This study aimed to investigate medical students’ attitudes toward some features of teachers’ talk with regard to the students’ gender in the Iranian context. To do so, 60 male and 60 female medical students of Urmia University of Medical Sciences (UMSU) participated in the research. A researcher-made Likert-type questionnaire, which was initially piloted, was used to gather the data. A comparison of the four factors of teacher talk revealed that, from the highest to the lowest mean score, they ranked as follows: visual and extra-linguistic information, lexical and syntactic familiarity, speed of speech, and the use of the Persian language. Female students were also significantly more in favor of speed of speech and lexical and syntactic familiarity than male students.

Keywords: Attitude, gender, medical student, teacher talk.

Downloads 758
1870 The Haar Wavelet Transform of the DNA Signal Representation

Authors: Abdelkader Magdy, Magdy Saeb, A. Baith Mohamed, Ahmed Khadragi

Abstract:

Deoxyribonucleic acid (DNA), a double-stranded helix of nucleotides, consists of adenine (A), cytosine (C), guanine (G) and thymine (T). In this work, we convert this genetic code into an equivalent digital signal representation. By applying a wavelet transform, such as the Haar wavelet, we are able to extract details that are not clear in the original genetic code. We compare different organisms using the results of the Haar wavelet transform. This is achieved by using the trend part of the signal, since the trend part carries most of the energy of the digital signal representation. Consequently, we are able to quantitatively reconstruct different biological families.
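The idea of a trend part and a fluctuation part can be illustrated with a minimal sketch of one level of the Haar transform. The abstract does not say how nucleotides are mapped to numbers, so the mapping `NUCLEOTIDE_VALUES`, the helper names, and the sample sequence below are all assumptions made for illustration only.

```python
import math

# Hypothetical nucleotide-to-number mapping; the abstract does not specify one.
NUCLEOTIDE_VALUES = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def dna_to_signal(sequence):
    """Convert a DNA string into a list of numbers using the assumed mapping."""
    return [NUCLEOTIDE_VALUES[base] for base in sequence.upper()]


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (trend, fluctuation): scaled pairwise sums and differences.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    trend = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    fluctuation = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return trend, fluctuation


def energy(values):
    """Sum of squared values."""
    return sum(v * v for v in values)


signal = dna_to_signal("ACGTTGCA")  # toy sequence, not real data
trend, fluctuation = haar_step(signal)
print("signal energy:     ", energy(signal))
print("trend energy:      ", energy(trend))
print("fluctuation energy:", energy(fluctuation))
```

Because this form of the transform is orthonormal, the trend and fluctuation energies add up to the energy of the original signal, and for a slowly varying signal such as this toy example the trend part holds most of it.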

Keywords: Digital Signal, DNA, Fluctuation part, Haar wavelet, Nucleotides, Trend part.

Downloads 1882
1869 The Code-Mixing of Japanese, English and Thai in Line Chat

Authors: Premvadee Na Nakornpanom

Abstract:

Code-mixing in spontaneous speech has been widely discussed, but not in virtual situations, especially in the context of students learning a third language. Thus, this study is an attempt to explore the linguistic characteristics of the mixing of Japanese, English and Thai in a mobile Line chat room by students with English as L2, Japanese as L3 and Thai as their mother tongue. The results show that the insertion of Thai content words into sentences in the other two languages is a very common linguistic phenomenon. Because chatting is ‘relational’ or ‘interactional’, lexical choices tended to be speech-like, more personal and emotionally related. Japanese personal pronouns were often mixed into the sentences. The Japanese sentence-final question particle か “ka” was added to the end of sentences based on Thai grammar rules. Some unique characteristics were created while chatting.

Keywords: Code-mixing, Japanese, English, Thai, Line chat.

Downloads 3406
1868 Freedom with Limitations: The Nature of Free Expression in the European Case-Law

Authors: Laszlo Vari

Abstract:

In the digital age, the spread of the mobile world and the nature of cyberspace offer many new opportunities for the prevalence of the fundamental right to free expression, and therefore for free speech and freedom of the press; however, these new information and communication technologies also bring many new challenges. Defamation, censorship, fake news, misleading information, hate speech, breach of copyright, etc., are only some of the violations that can be derived from the harmful exercise of freedom of expression, and all of them become more salient on the internet. This raises the question: how can we eliminate these problems and exercise our fundamental freedom rightfully? To answer this question, we should understand the elements and characteristics of freedom of expression, and the role of the actors whose duties and responsibilities are crucial to the prevalence of this fundamental freedom. To achieve this goal, this paper explores European practice in order to understand the guidance found in the case-law of the European Court of Human Rights for the rightful exercise of freedom of expression.

Keywords: Collision of rights, European case-law, freedom of opinion and expression, media law, freedom of information, online expression.

Downloads 870
1867 Hand Gesture Recognition: Sign to Voice System (S2V)

Authors: Oi Mean Foong, Tan Jung Low, Satrio Wibowo

Abstract:

Hand gestures are one of the typical methods used in sign language for non-verbal communication. They are most commonly used by people who have hearing or speech problems to communicate among themselves or with normal people. Various sign language systems have been developed by manufacturers around the globe, but they are neither flexible nor cost-effective for end users. This paper presents a system prototype that automatically recognizes sign language to help normal people communicate more effectively with hearing- or speech-impaired people. The Sign to Voice system prototype, S2V, was developed using a Feed Forward Neural Network for two-sequence sign detection. Different sets of universal hand gestures were captured from a video camera and used to train the neural network for classification. The experimental results show that the neural network achieved satisfactory results for sign-to-voice translation.

Keywords: Hand gesture detection, neural network, sign language, sequence detection.

Downloads 1801
1866 A New Vector Quantization Front-End Process for Discrete HMM Speech Recognition System

Authors: M. Debyeche, J.P Haton, A. Houacine

Abstract:

The paper presents a complete discrete statistical framework based on a novel vector quantization (VQ) front-end process. This new VQ approach performs an optimal distribution of VQ codebook components on HMM states. This technique, which we named the distributed vector quantization (DVQ) of hidden Markov models, succeeds in unifying acoustic micro-structure and phonetic macro-structure when the HMM parameters are estimated. The DVQ technique is implemented through two variants. The first variant uses the K-means algorithm (K-means-DVQ) to optimize the VQ, while the second variant exploits the classification behavior of neural networks (NN-DVQ) for the same purpose. The proposed variants are compared with the HMM-based baseline system in recognition experiments on specific Arabic consonants. The results show that the distributed vector quantization technique increases the performance of the discrete HMM system.
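As background for the front-end described above, the following is a minimal sketch of conventional vector quantization with a K-means-trained codebook, which turns continuous feature vectors into the discrete symbols a discrete HMM consumes. It is not the DVQ technique of the paper: the function names, the evenly spaced initialization, and the toy two-dimensional "frames" are assumptions chosen for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vector, codebook):
    """Return the index of the codeword nearest to the vector."""
    return min(range(len(codebook)), key=lambda k: squared_distance(vector, codebook[k]))


def train_codebook(vectors, codebook_size, iterations=20):
    """Plain K-means codebook training (assumes len(vectors) >= codebook_size)."""
    # Deterministic initialization: pick evenly spaced training vectors.
    step = len(vectors) // codebook_size
    codebook = [list(vectors[i * step]) for i in range(codebook_size)]
    for _ in range(iterations):
        # Assignment step: group vectors by their nearest codeword.
        clusters = [[] for _ in range(codebook_size)]
        for v in vectors:
            clusters[quantize(v, codebook)].append(v)
        # Update step: move each codeword to the mean of its cluster.
        for k, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous codeword
                dim = len(members[0])
                codebook[k] = [sum(m[d] for m in members) / len(members) for d in range(dim)]
    return codebook


# Toy feature frames standing in for acoustic vectors (made-up values).
frames = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
codebook = train_codebook(frames, codebook_size=2)
symbols = [quantize(f, codebook) for f in frames]
print(symbols)  # discrete observation sequence for a discrete HMM
```

In a real recognizer the frames would be acoustic feature vectors and the codebook would be much larger; the sketch only shows the quantization step itself.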

Keywords: Hidden Markov Model, Vector Quantization, Neural Network, Speech Recognition, Arabic Language

Downloads 2010
1865 An Intelligent Text Independent Speaker Identification Using VQ-GMM Model Based Multiple Classifier System

Authors: Cheima Ben Soltane, Ittansa Yonas Kelbesa

Abstract:

Speaker Identification (SI) is the task of establishing the identity of an individual based on his/her voice characteristics. The SI task is typically achieved by two-stage signal processing: training and testing. The training process calculates speaker-specific feature parameters from the speech and generates speaker models accordingly. In the testing phase, speech samples from unknown speakers are compared with the models and classified. Even though the performance of speaker identification systems has improved due to recent advances in speech processing techniques, there is still a need for improvement. In this paper, a Closed-Set Text-Independent Speaker Identification System (CISI) based on a Multiple Classifier System (MCS) is proposed, using Mel Frequency Cepstrum Coefficients (MFCC) for feature extraction and a suitable combination of vector quantization (VQ) and a Gaussian Mixture Model (GMM), together with the Expectation Maximization (EM) algorithm, for speaker modeling. The use of a Voice Activity Detector (VAD) with a hybrid approach based on Short Time Energy (STE) and statistical modeling of background noise in the pre-processing step of the feature extraction yields a better and more robust automatic speaker identification system. In addition, using the Linde-Buzo-Gray (LBG) clustering algorithm to initialize the GMM parameters estimated in the EM step improved the convergence rate and the system's performance. The system also uses a relative index as a confidence measure when the GMM and VQ identification results contradict each other. Simulation results obtained on the voxforge.org speech database using MATLAB highlight the efficacy of the proposed method compared to earlier work.
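The Short Time Energy part of the voice activity detection step can be sketched as follows. This covers only simple energy thresholding, not the hybrid approach with statistical background-noise modeling described in the abstract. The frame length, hop length, threshold ratio, and sample values are hypothetical choices for illustration.

```python
def short_time_energy(samples, frame_length, hop_length):
    """Mean squared amplitude of each frame of the signal."""
    energies = []
    for start in range(0, len(samples) - frame_length + 1, hop_length):
        frame = samples[start:start + frame_length]
        energies.append(sum(s * s for s in frame) / frame_length)
    return energies


def detect_voice(samples, frame_length=4, hop_length=4, threshold_ratio=0.1):
    """Flag frames whose energy exceeds a fraction of the maximum frame energy.

    The relative threshold is an assumption; a real system would estimate
    the noise floor instead.
    """
    energies = short_time_energy(samples, frame_length, hop_length)
    threshold = threshold_ratio * max(energies) if energies else 0.0
    return [e > threshold for e in energies]


# Toy signal: quiet frame, loud frame, quiet frame (made-up values).
samples = [0.01, -0.02, 0.01, 0.0,
           0.8, -0.7, 0.9, -0.6,
           0.02, 0.0, -0.01, 0.01]
flags = detect_voice(samples)
print(flags)  # True marks frames kept for feature extraction
```

Only the frames flagged as speech would then be passed on to feature extraction such as MFCC computation.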

Keywords: Feature Extraction, Speaker Modeling, Feature Matching, Mel Frequency Cepstrum Coefficient (MFCC), Gaussian mixture model (GMM), Vector Quantization (VQ), Linde-Buzo-Gray (LBG), Expectation Maximization (EM), pre-processing, Voice Activity Detection (VAD), Short Time Energy (STE), Background Noise Statistical Modeling, Closed-Set Text-Independent Speaker Identification System (CISI).

Downloads 1828
1864 Structural Parsing of Natural Language Text in Tamil Using Phrase Structure Hybrid Language Model

Authors: Selvam M, Natarajan. A M, Thangarajan R

Abstract:

Parsing is important in linguistics and natural language processing for understanding the syntax and semantics of a natural language grammar. Parsing natural language text is challenging because of problems such as ambiguity and inefficiency. In addition, the interpretation of natural language text depends on context-based techniques. A probabilistic component is essential to resolve ambiguity in both syntax and semantics, thereby increasing the accuracy and efficiency of the parser. The Tamil language has some inherent features that make parsing more challenging. To address them, a lexicalized and statistical approach is applied to parsing with the aid of a language model. Statistical models mainly focus on the semantics of the language and are suitable for large-vocabulary tasks, whereas structural methods focus on syntax and suit small-vocabulary tasks. A statistical language model based on trigrams for the Tamil language, with a medium vocabulary of 5000 words, has been built. Though statistical parsing gives better performance through trigram probabilities and a large vocabulary size, it has some disadvantages, such as its focus on semantics rather than syntax and its lack of support for free word order and long-term relationships. To overcome these disadvantages, a structural component is incorporated into statistical language models, which leads to hybrid language models. This paper attempts to build a phrase-structured hybrid language model that resolves the above-mentioned disadvantages. In developing the hybrid language model, a new part-of-speech tag set for the Tamil language has been developed, with more than 500 tags and wider coverage. A phrase-structured Treebank has been developed with 326 Tamil sentences covering more than 5000 words. A hybrid language model has been trained with the phrase-structured Treebank using the immediate head parsing technique.
A lexicalized and statistical parser that employs this hybrid language model and the immediate head parsing technique gives better results than pure grammar-based and trigram-based models.
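The statistical side of such a model, a trigram language model, can be sketched with simple maximum-likelihood counts. This is a minimal illustration and not the hybrid model of the paper: the function names, the sentence-boundary markers, and the tiny English toy corpus (used in place of Tamil data) are assumptions, and no smoothing is applied.

```python
from collections import Counter


def train_trigram_counts(sentences):
    """Count trigrams and their two-word contexts over whitespace-tokenized sentences."""
    trigrams, contexts = Counter(), Counter()
    for sentence in sentences:
        # Pad with start markers so the first words also have a two-word context.
        tokens = ["<s>", "<s>"] + sentence.split() + ["</s>"]
        for i in range(2, len(tokens)):
            context = (tokens[i - 2], tokens[i - 1])
            contexts[context] += 1
            trigrams[context + (tokens[i],)] += 1
    return trigrams, contexts


def trigram_probability(word, context, trigrams, contexts):
    """Maximum-likelihood estimate of P(word | context); 0.0 for unseen contexts."""
    if contexts[context] == 0:
        return 0.0
    return trigrams[context + (word,)] / contexts[context]


corpus = ["the dog barks", "the dog sleeps", "the cat sleeps"]  # toy data
trigrams, contexts = train_trigram_counts(corpus)
p = trigram_probability("barks", ("the", "dog"), trigrams, contexts)
print(p)
```

Because each probability depends only on the two preceding words, a model of this kind cannot capture long-distance relationships or free word order, which is the limitation the structural component is meant to address.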

Keywords: Hybrid Language Model, Immediate Head Parsing, Lexicalized and Statistical Parsing, Natural Language Processing, Parts of Speech, Probabilistic Context Free Grammar, Tamil Language, Tree Bank.

Downloads 3594