Search results for: statistical models of speech recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 11690

Search results for: statistical models of speech recognition

11480 Emotion Recognition in Video and Images in the Wild

Authors: Faizan Tariq, Moayid Ali Zaidi

Abstract:

Facial emotion recognition algorithms are expanding rapidly now a day. People are using different algorithms with different combinations to generate best results. There are six basic emotions which are being studied in this area. Author tried to recognize the facial expressions using object detector algorithms instead of traditional algorithms. Two object detection algorithms were chosen which are Faster R-CNN and YOLO. For pre-processing we used image rotation and batch normalization. The dataset I have chosen for the experiments is Static Facial Expression in Wild (SFEW). Our approach worked well but there is still a lot of room to improve it, which will be a future direction.

Keywords: face recognition, emotion recognition, deep learning, CNN

Procedia PDF Downloads 157
11479 Analysis on Prediction Models of TBM Performance and Selection of Optimal Input Parameters

Authors: Hang Lo Lee, Ki Il Song, Hee Hwan Ryu

Abstract:

An accurate prediction of TBM(Tunnel Boring Machine) performance is very difficult for reliable estimation of the construction period and cost in preconstruction stage. For this purpose, the aim of this study is to analyze the evaluation process of various prediction models published since 2000 for TBM performance, and to select the optimal input parameters for the prediction model. A classification system of TBM performance prediction model and applied methodology are proposed in this research. Input and output parameters applied for prediction models are also represented. Based on these results, a statistical analysis is performed using the collected data from shield TBM tunnel in South Korea. By performing a simple regression and residual analysis utilizinFg statistical program, R, the optimal input parameters are selected. These results are expected to be used for development of prediction model of TBM performance.

Keywords: TBM performance prediction model, classification system, simple regression analysis, residual analysis, optimal input parameters

Procedia PDF Downloads 278
11478 Effect Analysis of an Improved Adaptive Speech Noise Reduction Algorithm in Online Communication Scenarios

Authors: Xingxing Peng

Abstract:

With the development of society, there are more and more online communication scenarios such as teleconference and online education. In the process of conference communication, the quality of voice communication is a very important part, and noise may cause the communication effect of participants to be greatly reduced. Therefore, voice noise reduction has an important impact on scenarios such as voice calls. This research focuses on the key technologies of the sound transmission process. The purpose is to maintain the audio quality to the maximum so that the listener can hear clearer and smoother sound. Firstly, to solve the problem that the traditional speech enhancement algorithm is not ideal when dealing with non-stationary noise, an adaptive speech noise reduction algorithm is studied in this paper. Traditional noise estimation methods are mainly used to deal with stationary noise. In this chapter, we study the spectral characteristics of different noise types, especially the characteristics of non-stationary Burst noise, and design a noise estimator module to deal with non-stationary noise. Noise features are extracted from non-speech segments, and the noise estimation module is adjusted in real time according to different noise characteristics. This adaptive algorithm can enhance speech according to different noise characteristics, improve the performance of traditional algorithms to deal with non-stationary noise, so as to achieve better enhancement effect. The experimental results show that the algorithm proposed in this chapter is effective and can better adapt to different types of noise, so as to obtain better speech enhancement effect.

Keywords: speech noise reduction, speech enhancement, self-adaptation, Wiener filter algorithm

Procedia PDF Downloads 26
11477 An Improved Face Recognition Algorithm Using Histogram-Based Features in Spatial and Frequency Domains

Authors: Qiu Chen, Koji Kotani, Feifei Lee, Tadahiro Ohmi

Abstract:

In this paper, we propose an improved face recognition algorithm using histogram-based features in spatial and frequency domains. For adding spatial information of the face to improve recognition performance, a region-division (RD) method is utilized. The facial area is firstly divided into several regions, then feature vectors of each facial part are generated by Binary Vector Quantization (BVQ) histogram using DCT coefficients in low frequency domains, as well as Local Binary Pattern (LBP) histogram in spatial domain. Recognition results with different regions are first obtained separately and then fused by weighted averaging. Publicly available ORL database is used for the evaluation of our proposed algorithm, which is consisted of 40 subjects with 10 images per subject containing variations in lighting, posing, and expressions. It is demonstrated that face recognition using RD method can achieve much higher recognition rate.

Keywords: binary vector quantization (BVQ), DCT coefficients, face recognition, local binary patterns (LBP)

Procedia PDF Downloads 316
11476 Deep-Learning Based Approach to Facial Emotion Recognition through Convolutional Neural Network

Authors: Nouha Khediri, Mohammed Ben Ammar, Monji Kherallah

Abstract:

Recently, facial emotion recognition (FER) has become increasingly essential to understand the state of the human mind. Accurately classifying emotion from the face is a challenging task. In this paper, we present a facial emotion recognition approach named CV-FER, benefiting from deep learning, especially CNN and VGG16. First, the data is pre-processed with data cleaning and data rotation. Then, we augment the data and proceed to our FER model, which contains five convolutions layers and five pooling layers. Finally, a softmax classifier is used in the output layer to recognize emotions. Based on the above contents, this paper reviews the works of facial emotion recognition based on deep learning. Experiments show that our model outperforms the other methods using the same FER2013 database and yields a recognition rate of 92%. We also put forward some suggestions for future work.

Keywords: CNN, deep-learning, facial emotion recognition, machine learning

Procedia PDF Downloads 61
11475 Analysis of Interleaving Scheme for Narrowband VoIP System under Pervasive Environment

Authors: Monica Sharma, Harjit Pal Singh, Jasbinder Singh, Manju Bala

Abstract:

In Voice over Internet Protocol (VoIP) system, the speech signal is degraded when passed through the network layers. The speech signal is processed through the best effort policy based IP network, which leads to the network degradations including delay, packet loss and jitter. The packet loss is the major issue of the degradation in the VoIP signal quality; even a single lost packet may generate audible distortion in the decoded speech signal. In addition to these network degradations, the quality of the speech signal is also affected by the environmental noises and coder distortions. The signal quality of the VoIP system is improved through the interleaving technique. The performance of the system is evaluated for various types of noises at different network conditions. The performance of the enhanced VoIP signal is evaluated using perceptual evaluation of speech quality (PESQ) measurement for narrow band signal.

Keywords: VoIP, interleaving, packet loss, packet size, background noise

Procedia PDF Downloads 452
11474 Speech Rhythm Variation in Languages and Dialects: F0, Natural and Inverted Speech

Authors: Imen Ben Abda

Abstract:

Languages have been classified into different rhythm classes. 'Stress-timed' languages are exemplified by English, 'syllable-timed' languages by French and 'mora-timed' languages by Japanese. However, to our best knowledge, acoustic studies have not been unanimous in strictly establishing which rhythm category a given language belongs to and failed to show empirical evidence for isochrony. Perception seems to be a good approach to categorize languages into different rhythm classes. This study, within the scope of experimental phonetics, includes an account of different perceptual experiments using cues from natural and inverted speech, as well as pitch extracted from speech data. It is an attempt to categorize speech rhythm over a large set of Arabic (Tunisian, Algerian, Lebanese and Moroccan) and English dialects (Welsh, Irish, Scottish and Texan) as well as other languages such as Chinese, Japanese, French, and German. Listeners managed to classify the different languages and dialects into different rhythm classes using suprasegmental cues mainly rhythm and pitch (F0). They also perceived rhythmic differences even among languages and dialects belonging to the same rhythm class. This may show that there are different subclasses within very broad rhythmic typologies.

Keywords: F0, inverted speech, mora-timing, rhythm variation, stress-timing, syllable-timing

Procedia PDF Downloads 479
11473 Effects of Exposing Learners to Speech Acts in the German Teaching Material Schritte International: The Case of Requests

Authors: Wan-Lin Tsai

Abstract:

Speech act of requests is an important issue in the field of language learning and teaching because we cannot avoid making requesting in our daily life. This study examined whether or not the subjects who were freshmen and majored in German at Wenzao University of Languages were able to use the linguistic forms which they had learned from their course book Schritte International to make appropriate requests through dialogue completed tasks (DCT). The results revealed that the majority of the subjects were unable to use the forms to make appropriate requests in German due to the lack of explicit instructions. Furthermore, Chinese interference was observed in students' productions. Explicit instructions in speech acts are strongly recommended.

Keywords: Chinese interference, German pragmatics, German teaching, make appropriate requests in German, speech act of requesting

Procedia PDF Downloads 437
11472 Emotion-Convolutional Neural Network for Perceiving Stress from Audio Signals: A Brain Chemistry Approach

Authors: Anup Anand Deshmukh, Catherine Soladie, Renaud Seguier

Abstract:

Emotion plays a key role in many applications like healthcare, to gather patients’ emotional behavior. Unlike typical ASR (Automated Speech Recognition) problems which focus on 'what was said', it is equally important to understand 'how it was said.' There are certain emotions which are given more importance due to their effectiveness in understanding human feelings. In this paper, we propose an approach that models human stress from audio signals. The research challenge in speech emotion detection is finding the appropriate set of acoustic features corresponding to an emotion. Another difficulty lies in defining the very meaning of emotion and being able to categorize it in a precise manner. Supervised Machine Learning models, including state of the art Deep Learning classification methods, rely on the availability of clean and labelled data. One of the problems in affective computation is the limited amount of annotated data. The existing labelled emotions datasets are highly subjective to the perception of the annotator. We address the first issue of feature selection by exploiting the use of traditional MFCC (Mel-Frequency Cepstral Coefficients) features in Convolutional Neural Network. Our proposed Emo-CNN (Emotion-CNN) architecture treats speech representations in a manner similar to how CNN’s treat images in a vision problem. Our experiments show that Emo-CNN consistently and significantly outperforms the popular existing methods over multiple datasets. It achieves 90.2% categorical accuracy on the Emo-DB dataset. We claim that Emo-CNN is robust to speaker variations and environmental distortions. The proposed approach achieves 85.5% speaker-dependant categorical accuracy for SAVEE (Surrey Audio-Visual Expressed Emotion) dataset, beating the existing CNN based approach by 10.2%. To tackle the second problem of subjectivity in stress labels, we use Lovheim’s cube, which is a 3-dimensional projection of emotions. Monoamine neurotransmitters are a type of chemical messengers in the brain that transmits signals on perceiving emotions. The cube aims at explaining the relationship between these neurotransmitters and the positions of emotions in 3D space. The learnt emotion representations from the Emo-CNN are mapped to the cube using three component PCA (Principal Component Analysis) which is then used to model human stress. This proposed approach not only circumvents the need for labelled stress data but also complies with the psychological theory of emotions given by Lovheim’s cube. We believe that this work is the first step towards creating a connection between Artificial Intelligence and the chemistry of human emotions.

Keywords: deep learning, brain chemistry, emotion perception, Lovheim's cube

Procedia PDF Downloads 124
11471 Childhood Apraxia of Speech and Autism: Interaction Influences and Treatment

Authors: Elad Vashdi

Abstract:

It is common to find speech deficit among children diagnosed with Autism. It can be found in the clinical field and recently in research. One of the DSM-V criteria suggests a speech delay (Delay in, or total lack of, the development of spoken language), but doesn't explain the cause of it. A common perception among professionals and families is that the inability to talk results from the autism. Autism is a name for a syndrome which just describes a phenomenon and is defined behaviorally. Since it is not based yet on a physiological gold standard, one can not conclude the nature of a deficit based on the name of the syndrome. A wide retrospective research (n=270) which included children with motor speech difficulties was conducted in Israel. The study analyzed entry evaluations in a private clinic during the years 2006-2013. The data was extracted from the reports. High percentage of children diagnosed with Autism (60%) was found. This result demonstrates the high relationship between Autism and motor speech problem. It also supports recent findings in research of Childhood apraxia of speech (CAS) occurrence among children with ASD. Only small percentage of the participants in this research (10%) were diagnosed with CAS even though their verbal deficits well fitted the guidelines for CAS diagnosis set by ASHA in 2007. This fact raises questions regarding the diagnostic procedure in Israel. The understanding that CAS might highly exist within Autism and can have a remarkable influence on the course of early development should be a guiding tool within the diagnosis procedure. CAS can explain the nature of the speech problem among some of the autistic children and guide the treatment in a more accurate way. Calculating the prevalence of CAS which includes the comorbidity with ASD reveals new numbers and suggests treating differently the CAS population.

Keywords: childhood apraxia of speech, Autism, treatment, speech

Procedia PDF Downloads 249
11470 Facial Emotion Recognition Using Deep Learning

Authors: Ashutosh Mishra, Nikhil Goyal

Abstract:

A 3D facial emotion recognition model based on deep learning is proposed in this paper. Two convolution layers and a pooling layer are employed in the deep learning architecture. After the convolution process, the pooling is finished. The probabilities for various classes of human faces are calculated using the sigmoid activation function. To verify the efficiency of deep learning-based systems, a set of faces. The Kaggle dataset is used to verify the accuracy of a deep learning-based face recognition model. The model's accuracy is about 65 percent, which is lower than that of other facial expression recognition techniques. Despite significant gains in representation precision due to the nonlinearity of profound image representations.

Keywords: facial recognition, computational intelligence, convolutional neural network, depth map

Procedia PDF Downloads 196
11469 Grid Pattern Recognition and Suppression in Computed Radiographic Images

Authors: Igor Belykh

Abstract:

Anti-scatter grids used in radiographic imaging for the contrast enhancement leave specific artifacts. Those artifacts may be visible or may cause Moiré effect when a digital image is resized on a diagnostic monitor. In this paper, we propose an automated grid artifacts detection and suppression algorithm which is still an actual problem. Grid artifacts detection is based on statistical approach in spatial domain. Grid artifacts suppression is based on Kaiser bandstop filter transfer function design and application avoiding ringing artifacts. Experimental results are discussed and concluded with description of advantages over existing approaches.

Keywords: grid, computed radiography, pattern recognition, image processing, filtering

Procedia PDF Downloads 254
11468 Its about Cortana, Microsoft’s Virtual Assistant

Authors: Aya Idriss, Esraa Othman, Lujain Malak

Abstract:

Artificial intelligence is the emulation of human intelligence processes by machines, particularly computer systems that act logically. Some of the specific applications of AI include natural language processing, speech recognition, and machine vision. Cortana is a virtual assistant and she’s an example of an AI Application. Microsoft made it possible for this app to be accessed not only on laptops and PCs but can be downloaded on mobile phones and used as a virtual assistant which was a huge success. Cortana can offer a lot apart from the basic orders such as setting alarms and marking the calendar. Its capabilities spread past that, for example, it provides us with listening to music and podcasts on the go, managing my to-do list and emails, connecting with my contacts hands-free by simply just telling the virtual assistant to call somebody, gives me instant answers and so on. A questionnaire was sent online to numerous friends and family members to perform the study, which is critical in evaluating Cortana's recognition capacity and the majority of the answers were in favor of Cortana’s capabilities. The results of the questionnaire assisted us in determining the level of Cortana's skills.

Keywords: artificial intelligence, Cortana, AI, abstract

Procedia PDF Downloads 154
11467 Small Text Extraction from Documents and Chart Images

Authors: Rominkumar Busa, Shahira K. C., Lijiya A.

Abstract:

Text recognition is an important area in computer vision which deals with detecting and recognising text from an image. The Optical Character Recognition (OCR) is a saturated area these days and with very good text recognition accuracy. However the same OCR methods when applied on text with small font sizes like the text data of chart images, the recognition rate is less than 30%. In this work, aims to extract small text in images using the deep learning model, CRNN with CTC loss. The text recognition accuracy is found to improve by applying image enhancement by super resolution prior to CRNN model. We also observe the text recognition rate further increases by 18% by applying the proposed method, which involves super resolution and character segmentation followed by CRNN with CTC loss. The efficiency of the proposed method shows that further pre-processing on chart image text and other small text images will improve the accuracy further, thereby helping text extraction from chart images.

Keywords: small text extraction, OCR, scene text recognition, CRNN

Procedia PDF Downloads 95
11466 Speech Motor Processing and Animal Sound Communication

Authors: Ana Cleide Vieira Gomes Guimbal de Aquino

Abstract:

Sound communication is present in most vertebrates, from fish, mainly in species that live in murky waters, to some species of reptiles, anuran amphibians, birds, and mammals, including primates. There are, in fact, relevant similarities between human language and animal sound communication, and among these similarities are the vocalizations called calls. The first specific call in human babies is crying, which has a characteristic prosodic contour and is motivated most of the time by the need for food and by affecting the puppy-caregiver interaction, with a view to communicating the necessities and food requests and guaranteeing the survival of the species. The present work aims to articulate speech processing in the motor context with aspects of the project entitled emotional states and vocalization: a comparative study of the prosodic contours of crying in human and non-human animals. First, concepts of speech motor processing and general aspects of speech evolution will be presented to relate these two approaches to animal sound communication.

Keywords: speech motor processing, animal communication, animal behaviour, language acquisition

Procedia PDF Downloads 58
11465 Naïve Bayes: A Classical Approach for the Epileptic Seizures Recognition

Authors: Bhaveek Maini, Sanjay Dhanka, Surita Maini

Abstract:

Electroencephalography (EEG) is used to classify several epileptic seizures worldwide. It is a very crucial task for the neurologist to identify the epileptic seizure with manual EEG analysis, as it takes lots of effort and time. Human error is always at high risk in EEG, as acquiring signals needs manual intervention. Disease diagnosis using machine learning (ML) has continuously been explored since its inception. Moreover, where a large number of datasets have to be analyzed, ML is acting as a boon for doctors. In this research paper, authors proposed two different ML models, i.e., logistic regression (LR) and Naïve Bayes (NB), to predict epileptic seizures based on general parameters. These two techniques are applied to the epileptic seizures recognition dataset, available on the UCI ML repository. The algorithms are implemented on an 80:20 train test ratio (80% for training and 20% for testing), and the performance of the model was validated by 10-fold cross-validation. The proposed study has claimed accuracy of 81.87% and 95.49% for LR and NB, respectively.

Keywords: epileptic seizure recognition, logistic regression, Naïve Bayes, machine learning

Procedia PDF Downloads 34
11464 Short Review on Models to Estimate the Risk in the Financial Area

Authors: Tiberiu Socaciu, Tudor Colomeischi, Eugenia Iancu

Abstract:

Business failure affects in various proportions shareholders, managers, lenders (banks), suppliers, customers, the financial community, government and society as a whole. In the era in which we have telecommunications networks, exists an interdependence of markets, the effect of a failure of a company is relatively instant. To effectively manage risk exposure is thus require sophisticated support systems, supported by analytical tools to measure, monitor, manage and control operational risks that may arise. As we know, bankruptcy is a phenomenon that managers do not want no matter what stage of life is the company they direct / lead. In the analysis made by us, by the nature of economic models that are reviewed (Altman, Conan-Holder etc.), estimating the risk of bankruptcy of a company corresponds to some extent with its own business cycle tracing of the company. Various models for predicting bankruptcy take into account direct / indirect aspects such as market position, company growth trend, competition structure, characteristics and customer retention, organization and distribution, location etc. From the perspective of our research we will now review the economic models known in theory and practice for estimating the risk of bankruptcy; such models are based on indicators drawn from major accounting firms.

Keywords: Anglo-Saxon models, continental models, national models, statistical models

Procedia PDF Downloads 377
11463 Evaluation of Parameters of Subject Models and Their Mutual Effects

Authors: A. G. Kovalenko, Y. N. Amirgaliyev, A. U. Kalizhanova, L. S. Balgabayeva, A. H. Kozbakova, Z. S. Aitkulov

Abstract:

It is known that statistical information on operation of the compound multisite system is often far from the description of actual state of the system and does not allow drawing any conclusions about the correctness of its operation. For example, from the world practice of operation of systems of water supply, water disposal, it is known that total measurements at consumers and at suppliers differ between 40-60%. It is connected with mathematical measure of inaccuracy as well as ineffective running of corresponding systems. Analysis of widely-distributed systems is more difficult, in which subjects, which are self-maintained in decision-making, carry out economic interaction in production, act of purchase and sale, resale and consumption. This work analyzed mathematical models of sellers, consumers, arbitragers and the models of their interaction in the provision of dispersed single-product market of perfect competition. On the basis of these models, the methods, allowing estimation of every subject’s operating options and systems as a whole are given.

Keywords: dispersed systems, models, hydraulic network, algorithms

Procedia PDF Downloads 261
11462 Localization of Frontal and Temporal Speech Areas in Brain Tumor Patients by Their Structural Connections with Probabilistic Tractography

Authors: B.Shukir, H.Woo, P.Barzo, D.Kis

Abstract:

Preoperative brain mapping in tumors involving the speech areas has an important role to reduce surgical risks. Functional magnetic resonance imaging (fMRI) is the gold standard method to localize cortical speech areas preoperatively, but its availability in clinical routine is difficult. Diffusion MRI based probabilistic tractography is available in head MRI. It’s used to segment cortical subregions by their structural connectivity. In our study, we used probabilistic tractography to localize the frontal and temporal cortical speech areas. 15 patients with left frontal tumor were enrolled to our study. Speech fMRI and diffusion MRI acquired preoperatively. The standard automated anatomical labelling atlas 3 (AAL3) cortical atlas used to define 76 left frontal and 118 left temporal potential speech areas. 4 types of tractography were run according to the structural connection of these regions to the left arcuate fascicle (FA) to localize those cortical areas which have speech functions: 1, frontal through FA; 2, frontal with FA; 3, temporal to FA; 4, temporal with FA connections were determined. Thresholds of 1%, 5%, 10% and 15% applied. At each level, the number of affected frontal and temporal regions by fMRI and tractography were defined, the sensitivity and specificity were calculated. At the level of 1% threshold showed the best results. Sensitivity was 61,631,4% and 67,1523,12%, specificity was 87,210,4% and 75,611,37% for frontal and temporal regions, respectively. From our study, we conclude that probabilistic tractography is a reliable preoperative technique to localize cortical speech areas. However, its results are not feasible that the neurosurgeon rely on during the operation.

Keywords: brain mapping, brain tumor, fMRI, probabilistic tractography

Procedia PDF Downloads 125
11461 Recognition and Protection of Indigenous Society in Indonesia

Authors: Triyanto, Rima Vien Permata Hartanto

Abstract:

Indonesia is a legal state. The consequence of this status is the recognition and protection of the existence of indigenous peoples. This paper aims to describe the dynamics of legal recognition and protection for indigenous peoples within the framework of Indonesian law. This paper is library research based on literature. The result states that although the constitution has normatively recognized the existence of indigenous peoples and their traditional rights, in reality, not all rights were recognized and protected. The protection and recognition for indigenous people need to be strengthened.

Keywords: indigenous peoples, customary law, state law, state of law

Procedia PDF Downloads 295
11460 Detecting Characters as Objects Towards Character Recognition on Licence Plates

Authors: Alden Boby, Dane Brown, James Connan

Abstract:

Character recognition is a well-researched topic across disciplines. Regardless, creating a solution that can cater to multiple situations is still challenging. Vehicle licence plates lack an international standard, meaning that different countries and regions have their own licence plate format. A problem that arises from this is that the typefaces and designs from different regions make it difficult to create a solution that can cater to a wide range of licence plates. The main issue concerning detection is the character recognition stage. This paper aims to create an object detection-based character recognition model trained on a custom dataset that consists of typefaces of licence plates from various regions. Given that characters have featured consistently maintained across an array of fonts, YOLO can be trained to recognise characters based on these features, which may provide better performance than OCR methods such as Tesseract OCR.

Keywords: computer vision, character recognition, licence plate recognition, object detection

Procedia PDF Downloads 88
11459 Relevant LMA Features for Human Motion Recognition

Authors: Insaf Ajili, Malik Mallem, Jean-Yves Didier

Abstract:

Motion recognition from videos is actually a very complex task due to the high variability of motions. This paper describes the challenges of human motion recognition, especially motion representation step with relevant features. Our descriptor vector is inspired from Laban Movement Analysis method. We propose discriminative features using the Random Forest algorithm in order to remove redundant features and make learning algorithms operate faster and more effectively. We validate our method on MSRC-12 and UTKinect datasets.

Keywords: discriminative LMA features, features reduction, human motion recognition, random forest

Procedia PDF Downloads 161
11458 Mood Choices and Modality Patterns in Donald Trump’s Inaugural Presidential Speech

Authors: Mary Titilayo Olowe

Abstract:

The controversies that trailed the political campaign and eventual choice of Donald Trump as the American president is so great that expectations are high as to what the content of his inaugural speech will portray. Given the fact that language is a dynamic vehicle of expressing intentions, the speech needs to be objectively assessed so as to access its content in the manner intended through the three strands of meaning postulated by the Systemic Functional Grammar (SFG): the ideational, the interpersonal and the textual. The focus of this paper, however, is on the interpersonal meaning which deals with how language exhibits social roles and relationship. This paper, therefore, attempts to analyse President Donald Trump’s inaugural speech to elicit interpersonal meaning in it. The analysis is done from the perspective of mood and modality which are housed in SFG. Results of the mood choice which is basically declarative, reveal an information-centered speech while the high option for the modal verb operator ‘will’ shows president Donald Trump’s ability to establish an equal and reliant relationship with his audience, i.e., the Americans. In conclusion, the appeal of the speech to different levels of Interpersonal meaning is largely responsible for its overall effectiveness. One can, therefore, understand the reason for the massive reaction it generates at the center of global discourse.

Keywords: interpersonal, modality, mood, systemic functional grammar

Procedia PDF Downloads 189
11457 Modelling High-Frequency Crude Oil Dynamics Using Affine and Non-Affine Jump-Diffusion Models

Authors: Katja Ignatieva, Patrick Wong

Abstract:

We investigated the dynamics of high frequency energy prices, including crude oil and electricity prices. The returns of underlying quantities are modelled using various parametric models such as stochastic framework with jumps and stochastic volatility (SVCJ) as well as non-parametric alternatives, which are purely data driven and do not require specification of the drift or the diffusion coefficient function. Using different statistical criteria, we investigate the performance of considered parametric and nonparametric models in their ability to forecast price series and volatilities. Our models incorporate possible seasonalities in the underlying dynamics and utilise advanced estimation techniques for the dynamics of energy prices.

Keywords: stochastic volatility, affine jump-diffusion models, high frequency data, model specification, markov chain monte carlo

Procedia PDF Downloads 68
11456 Effects of Reversible Watermarking on Iris Recognition Performance

Authors: Andrew Lock, Alastair Allen

Abstract:

Fragile watermarking has been proposed as a means of adding additional security or functionality to biometric systems, particularly for authentication and tamper detection. In this paper we describe an experimental study on the effect of watermarking iris images with a particular class of fragile algorithm, reversible algorithms, and the ability to correctly perform iris recognition. We investigate two scenarios, matching watermarked images to unmodified images, and matching watermarked images to watermarked images. We show that different watermarking schemes give very different results for a given capacity, highlighting the importance of investigation. At high embedding rates most algorithms cause significant reduction in recognition performance. However, in many cases, for low embedding rates, recognition accuracy is improved by the watermarking process.

Keywords: biometrics, iris recognition, reversible watermarking, vision engineering

Procedia PDF Downloads 421
11455 Speech Identification Test for Individuals with High-Frequency Sloping Hearing Loss in Telugu

Authors: S. B. Rathna Kumar, Sandya K. Varudhini, Aparna Ravichandran

Abstract:

Telugu is a south central Dravidian language spoken in Andhra Pradesh, a southern state of India. The available speech identification tests in Telugu have been developed to determine the communication problems of individuals having a flat frequency hearing loss. These conventional speech audiometric tests would provide redundant information when used on individuals with high-frequency sloping hearing loss because of better hearing sensitivity in the low- and mid-frequency regions. Hence, conventional speech identification tests do not indicate the true nature of the communication problem of individuals with high-frequency sloping hearing loss. It is highly possible that a person with a high-frequency sloping hearing loss may get maximum scores if conventional speech identification tests are used. Hence, there is a need to develop speech identification test materials that are specifically designed to assess the speech identification performance of individuals with high-frequency sloping hearing loss. The present study aimed to develop speech identification test for individuals with high-frequency sloping hearing loss in Telugu. Individuals with high-frequency sloping hearing loss have difficulty in perception of voiceless consonants whose spectral energy is above 1000 Hz. Hence, the word lists constructed with phonemes having mid- and high-frequency spectral energy will estimate speech identification performance better for such individuals. The phonemes /k/, /g/, /c/, /ṭ/ /t/, /p/, /s/, /ś/, /ṣ/ and /h/are preferred for the construction of words as these phonemes have spectral energy distributed in the frequencies above 1000 KHz predominantly. The present study developed two word lists in Telugu (each word list contained 25 words) for evaluating speech identification performance of individuals with high-frequency sloping hearing loss. The performance of individuals with high-frequency sloping hearing loss was evaluated using both conventional and high-frequency word lists under recorded voice condition. The results revealed that the developed word lists were found to be more sensitive in identifying the true nature of the communication problem of individuals with high-frequency sloping hearing loss.

Keywords: speech identification test, high-frequency sloping hearing loss, recorded voice condition, Telugu

Procedia PDF Downloads 394
11454 A Corpus-Based Contrastive Analysis of Directive Speech Act Verbs in English and Chinese Legal Texts

Authors: Wujian Han

Abstract:

In the process of human interaction and communication, speech act verbs are considered to be the most active component and the main means for information transmission, and are also taken as an indication of the structure of linguistic behavior. The theoretical value and practical significance of such everyday built-in metalanguage have long been recognized. This paper, which is part of a bigger study, is aimed to provide useful insights for a more precise and systematic application to speech act verbs translation between English and Chinese, especially with regard to the degree to which generic integrity is maintained in the practice of translation of legal documents. In this study, the corpus, i.e. Chinese legal texts and their English translations, English legal texts, ordinary Chinese texts, and ordinary English texts, serve as a testing ground for examining contrastively the usage of English and Chinese directive speech act verbs in legal genre. The scope of this paper is relatively wide and essentially covers all directive speech act verbs which are used in ordinary English and Chinese, such as order, command, request, prohibit, threat, advice, warn and permit. The researcher, by combining the corpus methodology with a contrastive perspective, explored a range of characteristics of English and Chinese directive speech act verbs including their semantic, syntactic and pragmatic features, and then contrasted them in a structured way. It has been found that there are similarities between English and Chinese directive speech act verbs in legal genre, such as similar semantic components between English speech act verbs and their translation equivalents in Chinese, formal and accurate usage of English and Chinese directive speech act verbs in legal contexts. But notable differences have been identified in areas of difference between their usage in the original Chinese and English legal texts such as valency patterns and frequency of occurrences. For example, the subjects of some directive speech act verbs are very frequently omitted in Chinese legal texts, but this is not the case in English legal texts. One of the practicable methods to achieve adequacy and conciseness in speech act verb translation from Chinese into English in legal genre is to repeat the subjects or the message with discrepancy, and vice versa. In addition, translation effects such as overuse and underuse of certain directive speech act verbs are also found in the translated English texts compared to the original English texts. Legal texts constitute a particularly valuable material for speech act verb study. Building up such a contrastive picture of the Chinese and English speech act verbs in legal language would yield results of value and interest to legal translators and students of language for legal purposes and have practical application to legal translation between English and Chinese.

Keywords: contrastive analysis, corpus-based, directive speech act verbs, legal texts, translation between English and Chinese

Procedia PDF Downloads 447
11453 Subband Coding and Glottal Closure Instant (GCI) Using SEDREAMS Algorithm

Authors: Harisudha Kuresan, Dhanalakshmi Samiappan, T. Rama Rao

Abstract:

In modern telecommunication applications, Glottal Closure Instants location finding is important and is directly evaluated from the speech waveform. Here, we study the GCI using Speech Event Detection using Residual Excitation and the Mean Based Signal (SEDREAMS) algorithm. Speech coding uses parameter estimation using audio signal processing techniques to model the speech signal combined with generic data compression algorithms to represent the resulting modeled in a compact bit stream. This paper proposes a sub-band coder SBC, which is a type of transform coding and its performance for GCI detection using SEDREAMS are evaluated. In SBCs code in the speech signal is divided into two or more frequency bands and each of these sub-band signal is coded individually. The sub-bands after being processed are recombined to form the output signal, whose bandwidth covers the whole frequency spectrum. Then the signal is decomposed into low and high-frequency components and decimation and interpolation in frequency domain are performed. The proposed structure significantly reduces error, and precise locations of Glottal Closure Instants (GCIs) are found using SEDREAMS algorithm.

Keywords: SEDREAMS, GCI, SBC, GOI

Procedia PDF Downloads 326
11452 The Advancements of Transformer Models in Part-of-Speech Tagging System for Low-Resource Tigrinya Language

Authors: Shamm Kidane, Ibrahim Abdella, Fitsum Gaim, Simon Mulugeta, Sirak Asmerom, Natnael Ambasager, Yoel Ghebrihiwot

Abstract:

The call for natural language processing (NLP) systems for low-resource languages has become more apparent than ever in the past few years, with the arduous challenges still present in preparing such systems. This paper presents an improved dataset version of the Nagaoka Tigrinya Corpus for Parts-of-Speech (POS) classification system in the Tigrinya language. The size of the initial Nagaoka dataset was incremented, totaling the new tagged corpus to 118K tokens, which comprised the 12 basic POS annotations used previously. The additional content was also annotated manually in a stringent manner, followed similar rules to the former dataset and was formatted in CONLL format. The system made use of the novel approach in NLP tasks and use of the monolingually pre-trained TiELECTRA, TiBERT and TiRoBERTa transformer models. The highest achieved score is an impressive weighted F1-score of 94.2%, which surpassed the previous systems by a significant measure. The system will prove useful in the progress of NLP-related tasks for Tigrinya and similarly related low-resource languages with room for cross-referencing higher-resource languages.

Keywords: Tigrinya POS corpus, TiBERT, TiRoBERTa, conditional random fields

Procedia PDF Downloads 59
11451 ICanny: CNN Modulation Recognition Algorithm

Authors: Jingpeng Gao, Xinrui Mao, Zhibin Deng

Abstract:

Aiming at the low recognition rate on the composite signal modulation in low signal to noise ratio (SNR), this paper proposes a modulation recognition algorithm based on ICanny-CNN. Firstly, the radar signal is transformed into the time-frequency image by Choi-Williams Distribution (CWD). Secondly, we propose an image processing algorithm using the Guided Filter and the threshold selection method, which is combined with the hole filling and the mask operation. Finally, the shallow convolutional neural network (CNN) is combined with the idea of the depth-wise convolution (Dw Conv) and the point-wise convolution (Pw Conv). The proposed CNN is designed to complete image classification and realize modulation recognition of radar signal. The simulation results show that the proposed algorithm can reach 90.83% at 0dB and 71.52% at -8dB. Therefore, the proposed algorithm has a good classification and anti-noise performance in radar signal modulation recognition and other fields.

Keywords: modulation recognition, image processing, composite signal, improved Canny algorithm

Procedia PDF Downloads 163