Search results for: speech to text
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1981

Search results for: speech to text

1891 Articles, Delimitation of Speech and Perception

Authors: Nataliya L. Ogurechnikova

Abstract:

The paper aims to clarify the function of articles in the English speech and specify their place and role in the English language, taking into account the use of articles for delimitation of speech. A focus of the paper is the use of the definite and the indefinite articles with different types of noun phrases which comprise either one noun with or without attributes, such as the King, the Queen, the Lion, the Unicorn, a dimple, a smile, a new language, an unknown dialect, or several nouns with or without attributes, such as the King and Queen of Hearts, the Lion and Unicorn, a dimple or smile, a completely isolated language or dialect. It is stated that the function of delimitation is related to perception: the number of speech units in a text correlates with the way the speaker perceives and segments the denotation. The two following combinations of words the house and garden and the house and the garden contain different numbers of speech units, one and two respectively, and reveal two different perception modes which correspond to the use of the definite article in the examples given. Thus, the function of delimitation is twofold, it is related to perception and cognition, on the one hand, and, on the other hand, to grammar, if the subject of grammar is the structure of speech. Analysis of speech units in the paper is not limited by noun phrases and is amplified by discussion of peripheral phenomena which are nevertheless important because they enable to qualify articles as a syntactic phenomenon whereas they are not infrequently described in terms of noun morphology. With this regard attention is given to the history of linguistic studies, specifically to the description of English articles by Niels Haislund, a disciple of Otto Jespersen. A discrepancy is noted between the initial plan of Jespersen who intended to describe articles as a syntactic phenomenon in ‘A Modern English Grammar on Historical Principles’ and the interpretation of articles in terms of noun morphology, finally given by Haislund. Another issue of the paper is correlation between description and denotation, being a traditional aspect of linguistic studies focused on articles. An overview of relevant studies, given in the paper, goes back to the works of G. Frege, which gave rise to a series of scientific works where the meaning of articles was described within the scope of logical semantics. Correlation between denotation and description is treated in the paper as the meaning of article, i.e. a component in its semantic structure, which differs from the function of delimitation and is similar to the meaning of other quantifiers. The paper further explains why the relation between description and denotation, i.e. the meaning of English article, is irrelevant for noun morphology and has nothing to do with nominal categories of the English language.

Keywords: delimitation of speech, denotation, description, perception, speech units, syntax

Procedia PDF Downloads 240
1890 Developing an Intonation Labeled Dataset for Hindi

Authors: Esha Banerjee, Atul Kumar Ojha, Girish Nath Jha

Abstract:

This study aims to develop an intonation labeled database for Hindi. Although no single standard for prosody labeling exists in Hindi, researchers in the past have employed perceptual and statistical methods in literature to draw inferences about the behavior of prosody patterns in Hindi. Based on such existing research and largely agreed upon intonational theories in Hindi, this study attempts to develop a manually annotated prosodic corpus of Hindi speech data, which can be used for training speech models for natural-sounding speech in the future. 100 sentences ( 500 words) each for declarative and interrogative types have been labeled using Praat.

Keywords: speech dataset, Hindi, intonation, labeled corpus

Procedia PDF Downloads 197
1889 Distant Speech Recognition Using Laser Doppler Vibrometer

Authors: Yunbin Deng

Abstract:

Most existing applications of automatic speech recognition relies on cooperative subjects at a short distance to a microphone. Standoff speech recognition using microphone arrays can extend the subject to sensor distance somewhat, but it is still limited to only a few feet. As such, most deployed applications of standoff speech recognitions are limited to indoor use at short range. Moreover, these applications require air passway between the subject and the sensor to achieve reasonable signal to noise ratio. This study reports long range (50 feet) automatic speech recognition experiments using a Laser Doppler Vibrometer (LDV) sensor. This study shows that the LDV sensor modality can extend the speech acquisition standoff distance far beyond microphone arrays to hundreds of feet. In addition, LDV enables 'listening' through the windows for uncooperative subjects. This enables new capabilities in automatic audio and speech intelligence, surveillance, and reconnaissance (ISR) for law enforcement, homeland security and counter terrorism applications. The Polytec LDV model OFV-505 is used in this study. To investigate the impact of different vibrating materials, five parallel LDV speech corpora, each consisting of 630 speakers, are collected from the vibrations of a glass window, a metal plate, a plastic box, a wood slate, and a concrete wall. These are the common materials the application could encounter in a daily life. These data were compared with the microphone counterpart to manifest the impact of various materials on the spectrum of the LDV speech signal. State of the art deep neural network modeling approaches is used to conduct continuous speaker independent speech recognition on these LDV speech datasets. Preliminary phoneme recognition results using time-delay neural network, bi-directional long short term memory, and model fusion shows great promise of using LDV for long range speech recognition. To author’s best knowledge, this is the first time an LDV is reported for long distance speech recognition application.

Keywords: covert speech acquisition, distant speech recognition, DSR, laser Doppler vibrometer, LDV, speech intelligence surveillance and reconnaissance, ISR

Procedia PDF Downloads 179
1888 The Philippines’ War on Drugs: a Pragmatic Analysis on Duterte's Commemorative Speeches

Authors: Ericson O. Alieto, Aprillete C. Devanadera

Abstract:

The main objective of the study is to determine the dominant speech acts in five commemorative speeches of President Duterte. This study employed Speech Act Theory and Discourse analysis to determine how the speech acts features connote the pragmatic meaning of Duterte’s speeches. Identifying the speech acts is significant in elucidating the underlying message or the pragmatic meaning of the speeches. From the 713 sentences or utterances from the speeches, assertive with 208 occurrences from the corpus or 29% is the dominant speech acts. It was followed by expressive with 177 or 25% occurrences, directive accounts for 152 or 15% occurrences. While commisive accounts for 104 or 15% occurrences and declarative got the lowest percentage of occurrences with 72 or 10% only. These sentences when uttered by Duterte carry a certain power of language to move or influence people. Thus, the present study shows the fundamental message perceived by the listeners. Moreover, the frequent use of assertive and expressive not only explains the pragmatic message of the speeches but also reflects the personality of President Duterte.

Keywords: commemorative speech, discourse analysis, duterte, pragmatics

Procedia PDF Downloads 287
1887 Excitation Modeling for Hidden Markov Model-Based Speech Synthesis Based on Wavelet Analysis

Authors: M. Kiran Reddy, K. Sreenivasa Rao

Abstract:

The conventional Hidden Markov Model (HMM)-based speech synthesis system (HTS) uses only a pulse excitation model, which significantly differs from natural excitation signal. Hence, buzziness can be perceived in the speech generated using HTS. This paper proposes an efficient excitation modeling method that can significantly reduce the buzziness, and improve the quality of HMM-based speech synthesis. The proposed approach models the pitch-synchronous residual frames extracted from the residual excitation signal. Each pitch synchronous residual frame is parameterized using 30 wavelet coefficients. These 30 wavelet coefficients are found to accurately capture the perceptually important information present in the residual waveform. In synthesis phase, the residual frames are reconstructed from the generated wavelet coefficients and are pitch-synchronously overlap-added to generate the excitation signal. The proposed excitation modeling method is integrated into HMM-based speech synthesis system. Evaluation results indicate that the speech synthesized by the proposed excitation model is significantly better than the speech generated using state-of-the-art excitation modeling methods.

Keywords: excitation modeling, hidden Markov models, pitch-synchronous frames, speech synthesis, wavelet coefficients

Procedia PDF Downloads 248
1886 The Untranslatability of the Qur’an

Authors: Mina Elhjouji

Abstract:

The aim of this paper is to raise awareness of the untranslatability of the Qur’an and to suggest some solutions that can help the translator in the process of transferring the meaning from the source text to the target text as much as possible. After the introduction, the miraculous character of the Qur’an shall be illustrated. Then, the difficulty of translating religious texts will be shown in terms of different causes; thematic, cultural, and linguistic. Some examples shall illustrate each type of these difficulties. Finally, some strategies that can help translate the Quran’s meanings will be suggested.

Keywords: translation, religious text, untranslatability, The Qur’an miracle, communicative theory

Procedia PDF Downloads 12
1885 Theory and Practice of Wavelets in Signal Processing

Authors: Jalal Karam

Abstract:

The methods of Fourier, Laplace, and Wavelet Transforms provide transfer functions and relationships between the input and the output signals in linear time invariant systems. This paper shows the equivalence among these three methods and in each case presenting an application of the appropriate (Fourier, Laplace or Wavelet) to the convolution theorem. In addition, it is shown that the same holds for a direct integration method. The Biorthogonal wavelets Bior3.5 and Bior3.9 are examined and the zeros distribution of their polynomials associated filters are located. This paper also presents the significance of utilizing wavelets as effective tools in processing speech signals for common multimedia applications in general, and for recognition and compression in particular. Theoretically and practically, wavelets have proved to be effective and competitive. The practical use of the Continuous Wavelet Transform (CWT) in processing and analysis of speech is then presented along with explanations of how the human ear can be thought of as a natural wavelet transformer of speech. This generates a variety of approaches for applying the (CWT) to many paradigms analysing speech, sound and music. For perception, the flexibility of implementation of this transform allows the construction of numerous scales and we include two of them. Results for speech recognition and speech compression are then included.

Keywords: continuous wavelet transform, biorthogonal wavelets, speech perception, recognition and compression

Procedia PDF Downloads 416
1884 Hate Speech Detection Using Machine Learning: A Survey

Authors: Edemealem Desalegn Kingawa, Kafte Tasew Timkete, Mekashaw Girmaw Abebe, Terefe Feyisa, Abiyot Bitew Mihretie, Senait Teklemarkos Haile

Abstract:

Currently, hate speech is a growing challenge for society, individuals, policymakers, and researchers, as social media platforms make it easy to anonymously create and grow online friends and followers and provide an online forum for debate about specific issues of community life, culture, politics, and others. Despite this, research on identifying and detecting hate speech is not satisfactory performance, and this is why future research on this issue is constantly called for. This paper provides a systematic review of the literature in this field, with a focus on approaches like word embedding techniques, machine learning, deep learning technologies, hate speech terminology, and other state-of-the-art technologies with challenges. In this paper, we have made a systematic review of the last six years of literature from Research Gate and Google Scholar. Furthermore, limitations, along with algorithm selection and use challenges, data collection, and cleaning challenges, and future research directions, are discussed in detail.

Keywords: Amharic hate speech, deep learning approach, hate speech detection review, Afaan Oromo hate speech detection

Procedia PDF Downloads 177
1883 Challenges of Teaching and Learning English Speech Sounds in Five Selected Secondary Schools in Bauchi, Bauchi State, Nigeria

Authors: Mairo Musa Galadima, Phoebe Mshelia

Abstract:

In Nigeria, the national policy of education stipulates that the kindergarten primary schools and the legislature are to use the three popular Nigerian Languages namely: Hausa, Igbo and Yoruba. However, the English language seems to be preferred and this calls for this paper. Attempts were made to draw out the challenges faced by learners in understanding English speech sounds and using them to communicate effectively in English; using 5(five) selected secondary school in Bauchi. It was discover that challenges abound in the wrong use of stress and intonation, transfer of phonetic features from their first language. Others are inadequate qualified teachers and relevant materials including text-books. It is recommended that teachers of English should lay more emphasis on the teaching of supra-segmental features and should be encouraged to go for further studies, seminars and refresher courses.

Keywords: kindergarten, stress, phonetic and intonation, Nigeria

Procedia PDF Downloads 300
1882 An Automatic Speech Recognition Tool for the Filipino Language Using the HTK System

Authors: John Lorenzo Bautista, Yoon-Joong Kim

Abstract:

This paper presents the development of a Filipino speech recognition tool using the HTK System. The system was trained from a subset of the Filipino Speech Corpus developed by the DSP Laboratory of the University of the Philippines-Diliman. The speech corpus was both used in training and testing the system by estimating the parameters for phonetic HMM-based (Hidden-Markov Model) acoustic models. Experiments on different mixture-weights were incorporated in the study. The phoneme-level word-based recognition of a 5-state HMM resulted in an average accuracy rate of 80.13 for a single-Gaussian mixture model, 81.13 after implementing a phoneme-alignment, and 87.19 for the increased Gaussian-mixture weight model. The highest accuracy rate of 88.70% was obtained from a 5-state model with 6 Gaussian mixtures.

Keywords: Filipino language, Hidden Markov Model, HTK system, speech recognition

Procedia PDF Downloads 480
1881 Short Text Classification Using Part of Speech Feature to Analyze Students' Feedback of Assessment Components

Authors: Zainab Mutlaq Ibrahim, Mohamed Bader-El-Den, Mihaela Cocea

Abstract:

Students' textual feedback can hold unique patterns and useful information about learning process, it can hold information about advantages and disadvantages of teaching methods, assessment components, facilities, and other aspects of teaching. The results of analysing such a feedback can form a key point for institutions’ decision makers to advance and update their systems accordingly. This paper proposes a data mining framework for analysing end of unit general textual feedback using part of speech feature (PoS) with four machine learning algorithms: support vector machines, decision tree, random forest, and naive bays. The proposed framework has two tasks: first, to use the above algorithms to build an optimal model that automatically classifies the whole data set into two subsets, one subset is tailored to assessment practices (assessment related), and the other one is the non-assessment related data. Second task to use the same algorithms to build an optimal model for whole data set, and the new data subsets to automatically detect their sentiment. The significance of this paper is to compare the performance of the above four algorithms using part of speech feature to the performance of the same algorithms using n-grams feature. The paper follows Knowledge Discovery and Data Mining (KDDM) framework to construct the classification and sentiment analysis models, which is understanding the assessment domain, cleaning and pre-processing the data set, selecting and running the data mining algorithm, interpreting mined patterns, and consolidating the discovered knowledge. The results of this paper experiments show that both models which used both features performed very well regarding first task. But regarding the second task, models that used part of speech feature has underperformed in comparison with models that used unigrams and bigrams.

Keywords: assessment, part of speech, sentiment analysis, student feedback

Procedia PDF Downloads 142
1880 Automatic Speech Recognition Systems Performance Evaluation Using Word Error Rate Method

Authors: João Rato, Nuno Costa

Abstract:

The human verbal communication is a two-way process which requires a mutual understanding that will result in some considerations. This kind of communication, also called dialogue, besides the supposed human agents it can also be performed between human agents and machines. The interaction between Men and Machines, by means of a natural language, has an important role concerning the improvement of the communication between each other. Aiming at knowing the performance of some speech recognition systems, this document shows the results of the accomplished tests according to the Word Error Rate evaluation method. Besides that, it is also given a set of information linked to the systems of Man-Machine communication. After this work has been made, conclusions were drawn regarding the Speech Recognition Systems, among which it can be mentioned their poor performance concerning the voice interpretation in noisy environments.

Keywords: automatic speech recognition, man-machine conversation, speech recognition, spoken dialogue systems, word error rate

Procedia PDF Downloads 322
1879 Multi-Granularity Feature Extraction and Optimization for Pathological Speech Intelligibility Evaluation

Authors: Chunying Fang, Haifeng Li, Lin Ma, Mancai Zhang

Abstract:

Speech intelligibility assessment is an important measure to evaluate the functional outcomes of surgical and non-surgical treatment, speech therapy and rehabilitation. The assessment of pathological speech plays an important role in assisting the experts. Pathological speech usually is non-stationary and mutational, in this paper, we describe a multi-granularity combined feature schemes, and which is optimized by hierarchical visual method. First of all, the difference granularity level pathological features are extracted which are BAFS (Basic acoustics feature set), local spectral characteristics MSCC (Mel s-transform cepstrum coefficients) and nonlinear dynamic characteristics based on chaotic analysis. Latterly, radar chart and F-score are proposed to optimize the features by the hierarchical visual fusion. The feature set could be optimized from 526 to 96-dimensions.The experimental results denote that new features by support vector machine (SVM) has the best performance, with a recognition rate of 84.4% on NKI-CCRT corpus. The proposed method is thus approved to be effective and reliable for pathological speech intelligibility evaluation.

Keywords: pathological speech, multi-granularity feature, MSCC (Mel s-transform cepstrum coefficients), F-score, radar chart

Procedia PDF Downloads 283
1878 Status of Communication and Swallowing Therapy in Patient with a Tracheostomy

Authors: Ya-Hui Wang

Abstract:

Lower speech therapy rate of tracheostomized patient was noted in comparison with previous researches. This study is aim to shed light on the referral status of speech therapy in those patients in Taiwan. This study developed an analysis for the size and key characteristics of the population of tracheostomized in-patient in the Taiwan. Method: We analyzed National Healthcare Insurance data (The Collaboration Center of Health Information Application, CCHIA) from Jan 1 2010 to Dec 31 2010. Result: over ages 3, number of tracheostomized in-patient is directly proportional to age. A high service loading was observed in North region in comparison with other regions. Only 4.87% of the tracheostomized in-patients were referred for speech therapy, and 1.9% for swallow examination, 2.5% for communication evaluation.

Keywords: refer, speech therapy, training, rehabilitation

Procedia PDF Downloads 440
1877 Visual Speech Perception of Arabic Emphatics

Authors: Maha Saliba Foster

Abstract:

Speech perception has been recognized as a bi-sensory process involving the auditory and visual channels. Compared to the auditory modality, the contribution of the visual signal to speech perception is not very well understood. Studying how the visual modality affects speech recognition can have pedagogical implications in second language learning, as well as clinical application in speech therapy. The current investigation explores the potential effect of speech visual cues on the perception of Arabic emphatics (AEs). The corpus consists of 36 minimal pairs each containing two contrasting consonants, an AE versus a non-emphatic (NE). Movies of four Lebanese speakers were edited to allow perceivers to have partial view of facial regions: lips only, lips-cheeks, lips-chin, lips-cheeks-chin, lips-cheeks-chin-neck. In the absence of any auditory information and relying solely on visual speech, perceivers were above chance at correctly identifying AEs or NEs across vowel contexts; moreover, the models were able to predict the probability of perceivers’ accuracy in identifying some of the COIs produced by certain speakers; additionally, results showed an overlap between the measurements selected by the computer and those selected by human perceivers. The lack of significant face effect on the perception of AEs seems to point to the lips, present in all of the videos, as the most important and often sufficient facial feature for emphasis recognition. Future investigations will aim at refining the analyses of visual cues used by perceivers by using Principal Component Analysis and including time evolution of facial feature measurements.

Keywords: Arabic emphatics, machine learning, speech perception, visual speech perception

Procedia PDF Downloads 306
1876 Identification of Text Domains and Register Variation through the Analysis of Lexical Distribution in a Bangla Mass Media Text Corpus

Authors: Mahul Bhattacharyya, Niladri Sekhar Dash

Abstract:

The present research paper is an experimental attempt to investigate the nature of variation in the register in three major text domains, namely, social, cultural, and political texts collected from the corpus of Bangla printed mass media texts. This present study uses a corpus of a moderate amount of Bangla mass media text that contains nearly one million words collected from different media sources like newspapers, magazines, advertisements, periodicals, etc. The analysis of corpus data reveals that each text has certain lexical properties that not only control their identity but also mark their uniqueness across the domains. At first, the subject domains of the texts are classified into two parameters namely, ‘Genre' and 'Text Type'. Next, some empirical investigations are made to understand how the domains vary from each other in terms of lexical properties like both function and content words. Here the method of comparative-cum-contrastive matching of lexical load across domains is invoked through word frequency count to track how domain-specific words and terms may be marked as decisive indicators in the act of specifying the textual contexts and subject domains. The study shows that the common lexical stock that percolates across all text domains are quite dicey in nature as their lexicological identity does not have any bearing in the act of specifying subject domains. Therefore, it becomes necessary for language users to anchor upon certain domain-specific lexical items to recognize a text that belongs to a specific text domain. The eventual findings of this study confirm that texts belonging to different subject domains in Bangla news text corpus clearly differ on the parameters of lexical load, lexical choice, lexical clustering, lexical collocation. In fact, based on these parameters, along with some statistical calculations, it is possible to classify mass media texts into different types to mark their relation with regard to the domains they should actually belong. The advantage of this analysis lies in the proper identification of the linguistic factors which will give language users a better insight into the method they employ in text comprehension, as well as construct a systemic frame for designing text identification strategy for language learners. The availability of huge amount of Bangla media text data is useful for achieving accurate conclusions with a certain amount of reliability and authenticity. This kind of corpus-based analysis is quite relevant for a resource-poor language like Bangla, as no attempt has ever been made to understand how the structure and texture of Bangla mass media texts vary due to certain linguistic and extra-linguistic constraints that are actively operational to specific text domains. Since mass media language is assumed to be the most 'recent representation' of the actual use of the language, this study is expected to show how the Bangla news texts reflect the thoughts of the society and how they leave a strong impact on the thought process of the speech community.

Keywords: Bangla, corpus, discourse, domains, lexical choice, mass media, register, variation

Procedia PDF Downloads 174
1875 The Acquisition of Case in Biological Domain Based on Text Mining

Authors: Shen Jian, Hu Jie, Qi Jin, Liu Wei Jie, Chen Ji Yi, Peng Ying Hong

Abstract:

In order to settle the problem of acquiring case in biological related to design problems, a biometrics instance acquisition method based on text mining is presented. Through the construction of corpus text vector space and knowledge mining, the feature selection, similarity measure and case retrieval method of text in the field of biology are studied. First, we establish a vector space model of the corpus in the biological field and complete the preprocessing steps. Then, the corpus is retrieved by using the vector space model combined with the functional keywords to obtain the biological domain examples related to the design problems. Finally, we verify the validity of this method by taking the example of text.

Keywords: text mining, vector space model, feature selection, biologically inspired design

Procedia PDF Downloads 261
1874 Speech Perception by Monolingual and Bilingual Dravidian Speakers under Adverse Listening Conditions

Authors: S. B. Rathna Kumar, Sale Kranthi, Sandya K. Varudhini

Abstract:

The precise perception of spoken language is influenced by several variables, including the listeners’ native language, distance between speaker and listener, reverberation and background noise. When noise is present in an acoustic environment, it masks the speech signal resulting in reduction in the redundancy of the acoustic and linguistic cues of speech. There is strong evidence that bilinguals face difficulty in speech perception for their second language compared with monolingual speakers under adverse listening conditions such as presence of background noise. This difficulty persists even for speakers who are highly proficient in their second language and is greater in those who have learned the second language later in life. The present study aimed to assess the performance of monolingual (Telugu speaking) and bilingual (Tamil as first language and Telugu as second language) speakers on Telugu speech perception task under quiet and noisy environments. The results indicated that both the groups performed similar in both quiet and noisy environments. The findings of the present study are not in accordance with the findings of previous studies which strongly report poorer speech perception in adverse listening conditions such as noise with bilingual speakers for their second language compared with monolinguals.

Keywords: monolingual, bilingual, second language, speech perception, quiet, noise

Procedia PDF Downloads 389
1873 Dual-Channel Multi-Band Spectral Subtraction Algorithm Dedicated to a Bilateral Cochlear Implant

Authors: Fathi Kallel, Ahmed Ben Hamida, Christian Berger-Vachon

Abstract:

In this paper, a Speech Enhancement Algorithm based on Multi-Band Spectral Subtraction (MBSS) principle is evaluated for Bilateral Cochlear Implant (BCI) users. Specifically, dual-channel noise power spectral estimation algorithm using Power Spectral Densities (PSD) and Cross Power Spectral Densities (CPSD) of the observed signals is studied. The enhanced speech signal is obtained using Dual-Channel Multi-Band Spectral Subtraction ‘DC-MBSS’ algorithm. For performance evaluation, objective speech assessment test relying on Perceptual Evaluation of Speech Quality (PESQ) score is performed to fix the optimal number of frequency bands needed in DC-MBSS algorithm. In order to evaluate the speech intelligibility, subjective listening tests are assessed with 3 deafened BCI patients. Experimental results obtained using French Lafon database corrupted by an additive babble noise at different Signal-to-Noise Ratios (SNR) showed that DC-MBSS algorithm improves speech understanding for single and multiple interfering noise sources.

Keywords: speech enhancement, spectral substracion, noise estimation, cochlear impalnt

Procedia PDF Downloads 549
1872 The Combination of the Mel Frequency Cepstral Coefficients, Perceptual Linear Prediction, Jitter and Shimmer Coefficients for the Improvement of Automatic Recognition System for Dysarthric Speech

Authors: Brahim Fares Zaidi

Abstract:

Our work aims to improve our Automatic Recognition System for Dysarthria Speech based on the Hidden Models of Markov and the Hidden Markov Model Toolkit to help people who are sick. With pronunciation problems, we applied two techniques of speech parameterization based on Mel Frequency Cepstral Coefficients and Perceptual Linear Prediction and concatenated them with JITTER and SHIMMER coefficients in order to increase the recognition rate of a dysarthria speech. For our tests, we used the NEMOURS database that represents speakers with dysarthria and normal speakers.

Keywords: ARSDS, HTK, HMM, MFCC, PLP

Procedia PDF Downloads 108
1871 Freedom of Speech, Dissent and the Right to be Governed By Consensus are Inherent Rights Under Classical Islamic Law

Authors: Ziyad Motala

Abstract:

It is often proclaimed by leasers in Muslim majority countries that Islamic Law does not permit dissent against a ruler. This paper will evaluate and discuss freedom of speech and dissent as found in concrete prophetic examples during the time of the Prophet Muhammad. It will further look at the examples and practices during the time of the four Noble Caliphs, the immediate successors to the Prophet Muhammad. It will argue that the positivist position of absolute obedience to a ruler is inconsistent with the prophetic tradition. The examples of the Prophet and his immediate four successors (whose lessons Sunni Islam considers to be a source of Islamic Law) demonstrates among the earliest example of freedom of speech and dissent in human history. That tradition frowned upon an inert and uninvolved citizenry. It will conclude with lessons for modern day Muslim majority countries arguing with empirical evidence that freedom of speech, dissent and the right to be governed by consensus versus coercion are fundamental requisites of Islamic law.

Keywords: islamic law, demoracy, freedom of speech, right to dissent

Procedia PDF Downloads 75
1870 Text Similarity in Vector Space Models: A Comparative Study

Authors: Omid Shahmirzadi, Adam Lugowski, Kenneth Younge

Abstract:

Automatic measurement of semantic text similarity is an important task in natural language processing. In this paper, we evaluate the performance of different vector space models to perform this task. We address the real-world problem of modeling patent-to-patent similarity and compare TFIDF (and related extensions), topic models (e.g., latent semantic indexing), and neural models (e.g., paragraph vectors). Contrary to expectations, the added computational cost of text embedding methods is justified only when: 1) the target text is condensed; and 2) the similarity comparison is trivial. Otherwise, TFIDF performs surprisingly well in other cases: in particular for longer and more technical texts or for making finer-grained distinctions between nearest neighbors. Unexpectedly, extensions to the TFIDF method, such as adding noun phrases or calculating term weights incrementally, were not helpful in our context.

Keywords: big data, patent, text embedding, text similarity, vector space model

Procedia PDF Downloads 175
1869 Information Disclosure And Financial Sentiment Index Using a Machine Learning Approach

Authors: Alev Atak

Abstract:

In this paper, we aim to create a financial sentiment index by investigating the company’s voluntary information disclosures. We retrieve structured content from BIST 100 companies’ financial reports for the period 1998-2018 and extract relevant financial information for sentiment analysis through Natural Language Processing. We measure strategy-related disclosures and their cross-sectional variation and classify report content into generic sections using synonym lists divided into four main categories according to their liquidity risk profile, risk positions, intra-annual information, and exposure to risk. We use Word Error Rate and Cosin Similarity for comparing and measuring text similarity and derivation in sets of texts. In addition to performing text extraction, we will provide a range of text analysis options, such as the readability metrics, word counts using pre-determined lists (e.g., forward-looking, uncertainty, tone, etc.), and comparison with reference corpus (word, parts of speech and semantic level). Therefore, we create an adequate analytical tool and a financial dictionary to depict the importance of granular financial disclosure for investors to identify correctly the risk-taking behavior and hence make the aggregated effects traceable.

Keywords: financial sentiment, machine learning, information disclosure, risk

Procedia PDF Downloads 94
1868 Formulating a Definition of Hate Speech: From Divergence to Convergence

Authors: Avitus A. Agbor

Abstract:

Numerous incidents, ranging from trivial to catastrophic, do come to mind when one reflects on hate. The victims of these belong to specific identifiable groups within communities. These experiences evoke discussions on Islamophobia, xenophobia, homophobia, anti-Semitism, racism, ethnic hatred, atheism, and other brutal forms of bigotry. Common to all these is an invisible but portent force that drives all of them: hatred. Such hatred is usually fueled by a profound degree of intolerance (to diversity) and the zeal to impose on others their beliefs and practices which they consider to be the conventional norm. More importantly, the perpetuation of these hateful acts is the unfortunate outcome of an overplay of invectives and hate speech which, to a greater extent, cannot be divorced from hate. From a legal perspective, acknowledging the existence of an undeniable link between hate speech and hate is quite easy. However, both within and without legal scholarship, the notion of “hate speech” remains a conundrum: a phrase that is quite easily explained through experiences than propounding a watertight definition that captures the entire essence and nature of what it is. The problem is further compounded by a few factors: first, within the international human rights framework, the notion of hate speech is not used. In limiting the right to freedom of expression, the ICCPR simply excludes specific kinds of speeches (but does not refer to them as hate speech). Regional human rights instruments are not so different, except for the subsequent developments that took place in the European Union in which the notion has been carefully delineated, and now a much clearer picture of what constitutes hate speech is provided. The legal architecture in domestic legal systems clearly shows differences in approaches and regulation: making it more difficult. In short, what may be hate speech in one legal system may very well be acceptable legal speech in another legal system. Lastly, the cornucopia of academic voices on the issue of hate speech exude the divergence thereon. Yet, in the absence of a well-formulated and universally acceptable definition, it is important to consider how hate speech can be defined. Taking an evidence-based approach, this research looks into the issue of defining hate speech in legal scholarship and how and why such a formulation is of critical importance in the prohibition and prosecution of hate speech.

Keywords: hate speech, international human rights law, international criminal law, freedom of expression

Procedia PDF Downloads 76
1867 Effect Analysis of an Improved Adaptive Speech Noise Reduction Algorithm in Online Communication Scenarios

Authors: Xingxing Peng

Abstract:

With the development of society, there are more and more online communication scenarios such as teleconference and online education. In the process of conference communication, the quality of voice communication is a very important part, and noise may cause the communication effect of participants to be greatly reduced. Therefore, voice noise reduction has an important impact on scenarios such as voice calls. This research focuses on the key technologies of the sound transmission process. The purpose is to maintain the audio quality to the maximum so that the listener can hear clearer and smoother sound. Firstly, to solve the problem that the traditional speech enhancement algorithm is not ideal when dealing with non-stationary noise, an adaptive speech noise reduction algorithm is studied in this paper. Traditional noise estimation methods are mainly used to deal with stationary noise. In this chapter, we study the spectral characteristics of different noise types, especially the characteristics of non-stationary Burst noise, and design a noise estimator module to deal with non-stationary noise. Noise features are extracted from non-speech segments, and the noise estimation module is adjusted in real time according to different noise characteristics. This adaptive algorithm can enhance speech according to different noise characteristics, improve the performance of traditional algorithms to deal with non-stationary noise, so as to achieve better enhancement effect. The experimental results show that the algorithm proposed in this chapter is effective and can better adapt to different types of noise, so as to obtain better speech enhancement effect.

Keywords: speech noise reduction, speech enhancement, self-adaptation, Wiener filter algorithm

Procedia PDF Downloads 57
1866 A Relationship Extraction Method from Literary Fiction Considering Korean Linguistic Features

Authors: Hee-Jeong Ahn, Kee-Won Kim, Seung-Hoon Kim

Abstract:

The knowledge of the relationship between characters can help readers to understand the overall story or plot of the literary fiction. In this paper, we present a method for extracting the specific relationship between characters from a Korean literary fiction. Generally, methods for extracting relationships between characters in text are statistical or computational methods based on the sentence distance between characters without considering Korean linguistic features. Furthermore, it is difficult to extract the relationship with direction from text, such as one-sided love, because they consider only the weight of relationship, without considering the direction of the relationship. Therefore, in order to identify specific relationships between characters, we propose a statistical method considering linguistic features, such as syntactic patterns and speech verbs in Korean. The result of our method is represented by a weighted directed graph of the relationship between the characters. Furthermore, we expect that proposed method could be applied to the relationship analysis between characters of other content like movie or TV drama.

Keywords: data mining, Korean linguistic feature, literary fiction, relationship extraction

Procedia PDF Downloads 380
1865 Analysis of Interleaving Scheme for Narrowband VoIP System under Pervasive Environment

Authors: Monica Sharma, Harjit Pal Singh, Jasbinder Singh, Manju Bala

Abstract:

In Voice over Internet Protocol (VoIP) system, the speech signal is degraded when passed through the network layers. The speech signal is processed through the best effort policy based IP network, which leads to the network degradations including delay, packet loss and jitter. The packet loss is the major issue of the degradation in the VoIP signal quality; even a single lost packet may generate audible distortion in the decoded speech signal. In addition to these network degradations, the quality of the speech signal is also affected by the environmental noises and coder distortions. The signal quality of the VoIP system is improved through the interleaving technique. The performance of the system is evaluated for various types of noises at different network conditions. The performance of the enhanced VoIP signal is evaluated using perceptual evaluation of speech quality (PESQ) measurement for narrow band signal.

Keywords: VoIP, interleaving, packet loss, packet size, background noise

Procedia PDF Downloads 479
1864 Structural Analysis of Kamaluddin Behzad's Works Based on Roland Barthes' Theory of Communication, 'Text and Image'

Authors: Mahsa Khani Oushani, Mohammad Kazem Hasanvand

Abstract:

Text and image have always been two important components in Iranian layout. The interactive connection between text and image has shaped the art of book design with multiple patterns. In this research, first the structure and visual elements in the research data were analyzed and then the position of the text element and the image element in relation to each other based on Roland Barthes theory on the three theories of text and image, were studied and analyzed and the results were compared, and interpreted. The purpose of this study is to investigate the pattern of text and image in the works of Kamaluddin Behzad based on three Roland Barthes communication theories, 1. Descriptive communication, 2. Reference communication, 3. Matched communication. The questions of this research are what is the relationship between text and image in Behzad's works? And how is it defined according to Roland Barthes theory? The method of this research has been done with a structuralist approach with a descriptive-analytical method in a library collection method. The information has been collected in the form of documents (library) and is a tool for collecting online databases. Findings show that the dominant element in Behzad's drawings is with the image and has created a reference relationship in the layout of the drawings, but in some cases it achieves a different relationship that despite the preference of the image on the page, the text is dispersed proportionally on the page and plays a more active role, played within the image. The text and the image support each other equally on the page; Roland Barthes equates this connection.

Keywords: text, image, Kamaluddin Behzad, Roland Barthes, communication theory

Procedia PDF Downloads 192
1863 Voice Commands Recognition of Mentor Robot in Noisy Environment Using HTK

Authors: Khenfer-Koummich Fatma, Hendel Fatiha, Mesbahi Larbi

Abstract:

this paper presents an approach based on Hidden Markov Models (HMM: Hidden Markov Model) using HTK tools. The goal is to create a man-machine interface with a voice recognition system that allows the operator to tele-operate a mentor robot to execute specific tasks as rotate, raise, close, etc. This system should take into account different levels of environmental noise. This approach has been applied to isolated words representing the robot commands spoken in two languages: French and Arabic. The recognition rate obtained is the same in both speeches, Arabic and French in the neutral words. However, there is a slight difference in favor of the Arabic speech when Gaussian white noise is added with a Signal to Noise Ratio (SNR) equal to 30 db, the Arabic speech recognition rate is 69% and 80% for French speech recognition rate. This can be explained by the ability of phonetic context of each speech when the noise is added.

Keywords: voice command, HMM, TIMIT, noise, HTK, Arabic, speech recognition

Procedia PDF Downloads 382
1862 Speech Rhythm Variation in Languages and Dialects: F0, Natural and Inverted Speech

Authors: Imen Ben Abda

Abstract:

Languages have been classified into different rhythm classes. 'Stress-timed' languages are exemplified by English, 'syllable-timed' languages by French and 'mora-timed' languages by Japanese. However, to our best knowledge, acoustic studies have not been unanimous in strictly establishing which rhythm category a given language belongs to and failed to show empirical evidence for isochrony. Perception seems to be a good approach to categorize languages into different rhythm classes. This study, within the scope of experimental phonetics, includes an account of different perceptual experiments using cues from natural and inverted speech, as well as pitch extracted from speech data. It is an attempt to categorize speech rhythm over a large set of Arabic (Tunisian, Algerian, Lebanese and Moroccan) and English dialects (Welsh, Irish, Scottish and Texan) as well as other languages such as Chinese, Japanese, French, and German. Listeners managed to classify the different languages and dialects into different rhythm classes using suprasegmental cues mainly rhythm and pitch (F0). They also perceived rhythmic differences even among languages and dialects belonging to the same rhythm class. This may show that there are different subclasses within very broad rhythmic typologies.

Keywords: F0, inverted speech, mora-timing, rhythm variation, stress-timing, syllable-timing

Procedia PDF Downloads 526