Search results for: speech recognition performance
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 14157

13977 Formulating a Definition of Hate Speech: From Divergence to Convergence

Authors: Avitus A. Agbor

Abstract:

Numerous incidents, ranging from trivial to catastrophic, come to mind when one reflects on hate. The victims of these incidents belong to specific identifiable groups within communities. These experiences evoke discussions on Islamophobia, xenophobia, homophobia, anti-Semitism, racism, ethnic hatred, atheism, and other brutal forms of bigotry. Common to all these is an invisible but potent force that drives them: hatred. Such hatred is usually fueled by a profound degree of intolerance (of diversity) and the zeal of perpetrators to impose on others the beliefs and practices they consider to be the conventional norm. More importantly, the perpetuation of these hateful acts is the unfortunate outcome of an overplay of invectives and hate speech which, to a great extent, cannot be divorced from hate. From a legal perspective, acknowledging the existence of an undeniable link between hate speech and hate is quite easy. However, both within and without legal scholarship, the notion of “hate speech” remains a conundrum: a phrase that is more easily explained through experiences than through a watertight definition that captures its entire essence and nature. The problem is further compounded by a few factors. First, within the international human rights framework, the notion of hate speech is not used. In limiting the right to freedom of expression, the ICCPR simply excludes specific kinds of speech (but does not refer to them as hate speech). Regional human rights instruments are not so different, except for the subsequent developments in the European Union, in which the notion has been carefully delineated and a much clearer picture of what constitutes hate speech is now provided. Second, the legal architecture in domestic legal systems clearly shows differences in approach and regulation, which makes the problem more difficult. In short, what may be hate speech in one legal system may very well be acceptable legal speech in another.
Lastly, the cornucopia of academic voices on the issue of hate speech exudes the divergence thereon. Yet, in the absence of a well-formulated and universally acceptable definition, it is important to consider how hate speech can be defined. Taking an evidence-based approach, this research looks into the issue of defining hate speech in legal scholarship and how and why such a formulation is of critical importance in the prohibition and prosecution of hate speech.

Keywords: hate speech, international human rights law, international criminal law, freedom of expression

Procedia PDF Downloads 39
13976 New Approaches for the Handwritten Digit Image Features Extraction for Recognition

Authors: U. Ravi Babu, Mohd Mastan

Abstract:

This paper proposes a novel approach for a handwritten digit recognition system. It extracts digit image features based on a distance measure and derives an algorithm to classify the digit images. The distance measure is computed on the thinned image; thinning is one of the preprocessing techniques in image processing. The paper concentrates mainly on extracting features from the digit image for effective recognition of the numeral. To assess its effectiveness, the proposed method is tested on the MNIST, CENPARMI, and CEDAR databases and on newly collected data. The proposed method is applied to more than one lakh (100,000) digit images and achieves good comparative recognition results, with a recognition rate of about 97.32%.

Keywords: handwritten digit recognition, distance measure, MNIST database, image features

Procedia PDF Downloads 434
13975 Emotion Recognition in Video and Images in the Wild

Authors: Faizan Tariq, Moayid Ali Zaidi

Abstract:

Facial emotion recognition algorithms are expanding rapidly nowadays. Researchers use different algorithms in different combinations to generate the best results. Six basic emotions are studied in this area. We tried to recognize facial expressions using object detection algorithms instead of traditional algorithms. Two object detection algorithms were chosen: Faster R-CNN and YOLO. For pre-processing, we used image rotation and batch normalization. The dataset chosen for the experiments is Static Facial Expressions in the Wild (SFEW). Our approach worked well, but there is still a lot of room to improve it, which will be a future direction.

Keywords: face recognition, emotion recognition, deep learning, CNN

Procedia PDF Downloads 157
13974 Deep-Learning Based Approach to Facial Emotion Recognition through Convolutional Neural Network

Authors: Nouha Khediri, Mohammed Ben Ammar, Monji Kherallah

Abstract:

Recently, facial emotion recognition (FER) has become increasingly essential to understanding the state of the human mind. Accurately classifying emotion from the face is a challenging task. In this paper, we present a facial emotion recognition approach named CV-FER, benefiting from deep learning, especially CNN and VGG16. First, the data is pre-processed with data cleaning and data rotation. Then, we augment the data and proceed to our FER model, which contains five convolutional layers and five pooling layers. Finally, a softmax classifier is used in the output layer to recognize emotions. Based on the above contents, this paper reviews the works of facial emotion recognition based on deep learning. Experiments show that our model outperforms the other methods using the same FER2013 database and yields a recognition rate of 92%. We also put forward some suggestions for future work.
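
As a rough illustration of the architecture described above, the following sketch traces the feature-map size through five convolution and pooling stages and applies a softmax to a set of final scores. It assumes 48×48 grayscale inputs (the FER2013 image size), size-preserving convolutions, and 2×2 pooling with stride 2. The abstract states none of these settings, and the function names are hypothetical.

```python
import math


def trace_spatial_size(size=48, stages=5, pool=2):
    """Return the feature-map side length after each conv+pool stage.

    Assumes size-preserving ('same'-padded) convolutions followed by
    non-overlapping pooling that divides the side length by `pool`.
    """
    sizes = []
    for _ in range(stages):
        size = size // pool  # only the pooling layer shrinks the map
        sizes.append(size)
    return sizes


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    print(trace_spatial_size())
    print(softmax([2.0, 1.0, 0.1]))
```

Under these assumptions, five pooling stages reduce a 48×48 input to a single spatial position, which is one plausible reason to stop at five stages.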

Keywords: CNN, deep-learning, facial emotion recognition, machine learning

Procedia PDF Downloads 61
13973 Speech Rhythm Variation in Languages and Dialects: F0, Natural and Inverted Speech

Authors: Imen Ben Abda

Abstract:

Languages have been classified into different rhythm classes. 'Stress-timed' languages are exemplified by English, 'syllable-timed' languages by French and 'mora-timed' languages by Japanese. However, to the best of our knowledge, acoustic studies have not been unanimous in strictly establishing which rhythm category a given language belongs to and have failed to show empirical evidence for isochrony. Perception seems to be a good approach to categorizing languages into different rhythm classes. This study, within the scope of experimental phonetics, includes an account of different perceptual experiments using cues from natural and inverted speech, as well as pitch extracted from speech data. It is an attempt to categorize speech rhythm over a large set of Arabic (Tunisian, Algerian, Lebanese and Moroccan) and English dialects (Welsh, Irish, Scottish and Texan) as well as other languages such as Chinese, Japanese, French, and German. Listeners managed to classify the different languages and dialects into different rhythm classes using suprasegmental cues, mainly rhythm and pitch (F0). They also perceived rhythmic differences even among languages and dialects belonging to the same rhythm class. This may show that there are different subclasses within very broad rhythmic typologies.

Keywords: F0, inverted speech, mora-timing, rhythm variation, stress-timing, syllable-timing

Procedia PDF Downloads 479
13972 Effects of Exposing Learners to Speech Acts in the German Teaching Material Schritte International: The Case of Requests

Authors: Wan-Lin Tsai

Abstract:

The speech act of requesting is an important issue in the field of language learning and teaching because we cannot avoid making requests in our daily life. This study examined whether the subjects, freshmen majoring in German at Wenzao University of Languages, were able to use the linguistic forms they had learned from their course book Schritte International to make appropriate requests in discourse completion tasks (DCT). The results revealed that the majority of the subjects were unable to use the forms to make appropriate requests in German due to the lack of explicit instruction. Furthermore, Chinese interference was observed in students' productions. Explicit instruction in speech acts is strongly recommended.

Keywords: Chinese interference, German pragmatics, German teaching, make appropriate requests in German, speech act of requesting

Procedia PDF Downloads 436
13971 Childhood Apraxia of Speech and Autism: Interaction Influences and Treatment

Authors: Elad Vashdi

Abstract:

Speech deficits are common among children diagnosed with Autism. They are observed in the clinical field and, more recently, in research. One of the DSM-V criteria describes a speech delay (delay in, or total lack of, the development of spoken language) but does not explain its cause. A common perception among professionals and families is that the inability to talk results from the autism. Autism is the name of a syndrome that only describes a phenomenon and is defined behaviorally. Since it is not yet based on a physiological gold standard, one cannot conclude the nature of a deficit from the name of the syndrome. A large retrospective study (n=270) of children with motor speech difficulties was conducted in Israel. The study analyzed entry evaluations in a private clinic during the years 2006-2013. The data were extracted from the reports. A high percentage of children diagnosed with Autism (60%) was found. This result demonstrates the strong relationship between Autism and motor speech problems. It also supports recent research findings on the occurrence of childhood apraxia of speech (CAS) among children with ASD. Only a small percentage of the participants in this study (10%) were diagnosed with CAS, even though their verbal deficits fitted well the guidelines for CAS diagnosis set by ASHA in 2007. This fact raises questions regarding the diagnostic procedure in Israel. The understanding that CAS might be highly prevalent within Autism and can have a remarkable influence on the course of early development should guide the diagnostic procedure. CAS can explain the nature of the speech problem among some autistic children and guide the treatment in a more accurate way. Calculating the prevalence of CAS including its comorbidity with ASD reveals new numbers and suggests treating the CAS population differently.

Keywords: childhood apraxia of speech, Autism, treatment, speech

Procedia PDF Downloads 249
13970 Defect Localization and Interaction on Surfaces with Projection Mapping and Gesture Recognition

Authors: Qiang Wang, Hongyang Yu, MingRong Lai, Miao Luo

Abstract:

This paper presents a method for accurately localizing and interacting with known surface defects by overlaying patterns onto real-world surfaces using a projection system. Given the world coordinates of the defects, we project corresponding patterns onto the surfaces, providing an intuitive visualization of the specific defect locations. To enable users to interact with and retrieve more information about individual defects, we implement a gesture recognition system based on a pruned and optimized version of YOLOv6. This lightweight model achieves an accuracy of 82.8% and is suitable for deployment on low-performance devices. Our approach demonstrates the potential for enhancing defect identification, inspection processes, and user interaction in various applications.

Keywords: defect localization, projection mapping, gesture recognition, YOLOv6

Procedia PDF Downloads 52
13969 A Communication Signal Recognition Algorithm Based on Holder Coefficient Characteristics

Authors: Hui Zhang, Ye Tian, Fang Ye, Ziming Guo

Abstract:

Communication signal modulation recognition is one of the key technologies in the field of modern information warfare. At present, automatic modulation recognition methods for communication signals fall into two major categories: the maximum likelihood hypothesis testing method based on decision theory, and the statistical pattern recognition method based on feature extraction. The most commonly used is the statistical pattern recognition method, which includes feature extraction and classifier design. With the increasingly complex electromagnetic environment of communications, how to effectively extract the features of various signals at low signal-to-noise ratio (SNR) is a hot topic for scholars in various countries. To solve this problem, this paper proposes a feature extraction algorithm for communication signals based on the improved Holder cloud feature. An extreme learning machine (ELM) is used to classify the extracted features, addressing the real-time requirements of modern warfare. The algorithm extracts the digital features of the improved cloud model without deterministic information in a low SNR environment, and uses the improved cloud model to obtain more stable Holder cloud features, which improves the performance of the algorithm. The algorithm addresses the problem that a simple feature extraction algorithm based on the Holder coefficient feature has difficulty recognizing signals at low SNR, and it also achieves better recognition accuracy. Simulation results show that the approach still classifies well at low SNR: even when the SNR is -15 dB, the recognition accuracy still reaches 76%.
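
For readers unfamiliar with the underlying quantity, the sketch below computes the plain Holder coefficient of two non-negative sequences, which follows from Holder's inequality and lies between 0 and 1. It does not reproduce the improved cloud-model features of the paper. The rectangular and triangular reference sequences, the sample spectrum, and all function names are assumptions chosen for illustration.

```python
def holder_coefficient(f, g, p=2.0):
    """Holder coefficient of two non-negative, equal-length sequences.

    H = sum(f*g) / (sum(f**p)**(1/p) * sum(g**q)**(1/q)), with 1/p + 1/q = 1.
    By Holder's inequality, 0 <= H <= 1 for non-negative inputs.
    """
    if p <= 1.0:
        raise ValueError("p must be greater than 1")
    if len(f) != len(g):
        raise ValueError("sequences must have the same length")
    q = p / (p - 1.0)  # conjugate exponent
    numerator = sum(a * b for a, b in zip(f, g))
    norm_f = sum(a ** p for a in f) ** (1.0 / p)
    norm_g = sum(b ** q for b in g) ** (1.0 / q)
    denominator = norm_f * norm_g
    return numerator / denominator if denominator > 0 else 0.0


def rectangular_reference(n):
    """Flat reference sequence (illustrative assumption)."""
    return [1.0] * n


def triangular_reference(n):
    """Reference sequence rising to 1 at the centre (illustrative assumption)."""
    if n == 1:
        return [1.0]
    return [1.0 - abs(2.0 * i / (n - 1) - 1.0) for i in range(n)]


if __name__ == "__main__":
    # Hypothetical magnitude spectrum of a received signal.
    spectrum = [0.1, 0.4, 1.0, 0.4, 0.1]
    features = (
        holder_coefficient(spectrum, rectangular_reference(len(spectrum))),
        holder_coefficient(spectrum, triangular_reference(len(spectrum))),
    )
    print(features)
```

The two coefficients form a small feature vector that a classifier such as an ELM could consume; how the paper stabilises these features at low SNR is not shown here.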

Keywords: communication signal, feature extraction, Holder coefficient, improved cloud model

Procedia PDF Downloads 120
13968 Facial Emotion Recognition Using Deep Learning

Authors: Ashutosh Mishra, Nikhil Goyal

Abstract:

A 3D facial emotion recognition model based on deep learning is proposed in this paper. Two convolution layers and a pooling layer are employed in the deep learning architecture, with pooling applied after the convolution process. The probabilities for the various classes of human faces are calculated using the sigmoid activation function. To verify the accuracy of the deep learning-based face recognition model, a set of faces from the Kaggle dataset is used. The model's accuracy is about 65 percent, which is lower than that of other facial expression recognition techniques, despite significant gains in representation precision due to the nonlinearity of deep image representations.

Keywords: facial recognition, computational intelligence, convolutional neural network, depth map

Procedia PDF Downloads 196
13967 Its about Cortana, Microsoft’s Virtual Assistant

Authors: Aya Idriss, Esraa Othman, Lujain Malak

Abstract:

Artificial intelligence is the emulation of human intelligence processes by machines, particularly computer systems that act logically. Specific applications of AI include natural language processing, speech recognition, and machine vision. Cortana is a virtual assistant and an example of an AI application. Microsoft made the app accessible not only on laptops and PCs but also as a download for mobile phones, where its use as a virtual assistant was a huge success. Cortana offers a lot beyond basic commands such as setting alarms and marking the calendar. For example, it lets users listen to music and podcasts on the go, manage their to-do lists and emails, connect with contacts hands-free by simply telling the virtual assistant to call somebody, get instant answers, and so on. To perform the study, a questionnaire was sent online to numerous friends and family members; it is critical in evaluating Cortana's recognition capacity, and the majority of the answers were in favor of Cortana’s capabilities. The results of the questionnaire assisted us in determining the level of Cortana's skills.

Keywords: artificial intelligence, Cortana, AI, abstract

Procedia PDF Downloads 154
13966 Interactive Shadow Play Animation System

Authors: Bo Wan, Xiu Wen, Lingling An, Xiaoling Ding

Abstract:

The paper describes a Chinese shadow play animation system based on Kinect. Users, without any professional training, can personally manipulate the shadow characters to complete a shadow play performance with their body actions and, if they want, obtain a shadow play video by giving the record command to our system. In our system, Kinect is responsible for capturing human movement and voice command data. The gesture recognition module is used to control the change of the shadow play scenes. After packaging the data from Kinect and the recognition result from the gesture recognition module, VRPN transmits them to the server side. Finally, the server side uses the information to control the motion of the shadow characters and the video recording. This system not only achieves human-computer interaction, but also realizes interaction between people. It brings an entertaining experience to users and is easy to operate for all ages. Even more important, the application background of Chinese shadow play embodies the protection of the art of shadow play animation.

Keywords: shadow play animation, Kinect, gesture recognition, VRPN, HCI

Procedia PDF Downloads 370
13965 Identity Verification Based on Multimodal Machine Learning on Red Green Blue (RGB) Red Green Blue-Depth (RGB-D) Voice Data

Authors: LuoJiaoyang, Yu Hongyang

Abstract:

In this paper, we experiment with a new approach to multimodal identification using RGB, RGB-D and voice data. The multimodal combination of RGB and voice data has been applied in tasks such as emotion recognition, where it has shown good results and stability, and the same holds in identity recognition tasks. We believe that data of different modalities can enhance the effect of the model through mutual reinforcement. We extend the dual-modality setup to three modalities and try to improve the effectiveness of the network by increasing the number of modalities. We also implemented single-modal identification systems separately, tested the data of these different modalities under clean and noisy conditions, and compared their performance with the multimodal model. In designing the multimodal model, we tried a variety of fusion strategies and finally chose the fusion method with the best performance. The experimental results show that the multimodal system performs better than any single modality, especially in dealing with noise, and that the multimodal system can achieve an average improvement of 5%.

Keywords: multimodal, three modalities, RGB-D, identity verification

Procedia PDF Downloads 47
13964 Hand Detection and Recognition for Malay Sign Language

Authors: Mohd Noah A. Rahman, Afzaal H. Seyal, Norhafilah Bara

Abstract:

Interest keeps growing in developing software applications that interface with computers and peripheral devices through gestures of the human body, such as hand movements. Hand gesture detection and recognition based on computer vision techniques remains a very challenging task. The aim is to provide a more natural, innovative and sophisticated means of non-verbal communication, such as sign language, in human-computer interaction. This paper explores hand detection and hand gesture recognition using a vision-based approach. Skin color spaces such as HSV and YCrCb are applied for hand detection and recognition. However, there are limitations that need to be considered. Almost all skin color space models are sensitive to quickly changing or mixed lighting conditions. Certain restrictions must also be met for the hand recognition to give better results, such as the distance of the user’s hand from the webcam and the posture and size of the hand.
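
A minimal sketch of skin-color segmentation in HSV space follows, to make the idea concrete. The threshold values and function names are hypothetical and are not taken from the paper. Fixed thresholds like these are exactly what makes such models sensitive to changing or mixed lighting.

```python
import colorsys

# Hypothetical thresholds, chosen only for illustration.
HUE_MAX_DEG = 50.0          # skin hues assumed to lie between 0 and 50 degrees
SAT_RANGE = (0.15, 0.75)    # exclude grey/white and fully saturated colors
VAL_MIN = 0.35              # exclude very dark pixels


def is_skin_hsv(r, g, b):
    """Return True if an 8-bit RGB pixel falls inside the assumed skin range."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return (
        h * 360.0 <= HUE_MAX_DEG
        and SAT_RANGE[0] <= s <= SAT_RANGE[1]
        and v >= VAL_MIN
    )


def skin_mask(pixels):
    """Build a boolean mask from rows of (r, g, b) tuples."""
    return [[is_skin_hsv(*px) for px in row] for row in pixels]


if __name__ == "__main__":
    frame = [[(224, 172, 105), (0, 0, 255)],
             [(255, 255, 255), (198, 134, 66)]]
    for row in skin_mask(frame):
        print(row)
```

In a real pipeline the mask would be cleaned up (for example with morphological filtering) before locating the hand region; that step is omitted here.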

Keywords: hand detection, hand gesture, hand recognition, sign language

Procedia PDF Downloads 276
13963 Small Text Extraction from Documents and Chart Images

Authors: Rominkumar Busa, Shahira K. C., Lijiya A.

Abstract:

Text recognition is an important area in computer vision that deals with detecting and recognising text in an image. Optical Character Recognition (OCR) is a saturated area these days, with very good text recognition accuracy. However, when the same OCR methods are applied to text with small font sizes, such as the text in chart images, the recognition rate is less than 30%. This work aims to extract small text in images using the deep learning model CRNN with CTC loss. The text recognition accuracy is found to improve when image enhancement by super resolution is applied prior to the CRNN model. We also observe that the text recognition rate increases by a further 18% with the proposed method, which involves super resolution and character segmentation followed by CRNN with CTC loss. The efficiency of the proposed method shows that further pre-processing of chart image text and other small text images will improve the accuracy further, thereby helping text extraction from chart images.
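
Models trained with CTC loss emit one label per time step, including a special blank label, so the output has to be collapsed into a character sequence at inference time. The sketch below shows greedy (best-path) CTC decoding only, not the loss itself and not the paper's model. The blank index of 0 and the function names are assumptions.

```python
def best_path(frame_scores):
    """Pick the highest-scoring label index for each time step."""
    return [max(range(len(scores)), key=scores.__getitem__)
            for scores in frame_scores]


def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse repeated labels, then drop blanks (standard CTC collapsing)."""
    decoded = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


if __name__ == "__main__":
    # Hypothetical per-frame scores over the labels [blank, 'a', 'b'].
    scores = [[0.1, 0.8, 0.1], [0.2, 0.7, 0.1], [0.9, 0.05, 0.05],
              [0.1, 0.1, 0.8], [0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]
    alphabet = {1: "a", 2: "b"}
    labels = ctc_greedy_decode(best_path(scores))
    print("".join(alphabet[i] for i in labels))
```

The blank between two identical labels is what allows doubled characters to survive the collapsing step.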

Keywords: small text extraction, OCR, scene text recognition, CRNN

Procedia PDF Downloads 95
13962 Speech Motor Processing and Animal Sound Communication

Authors: Ana Cleide Vieira Gomes Guimbal de Aquino

Abstract:

Sound communication is present in most vertebrates, from fish, mainly in species that live in murky waters, to some species of reptiles, anuran amphibians, birds, and mammals, including primates. There are, in fact, relevant similarities between human language and animal sound communication, and among these similarities are the vocalizations called calls. The first specific call in human babies is crying, which has a characteristic prosodic contour and is motivated most of the time by the need for food and by affecting the infant-caregiver interaction, with a view to communicating necessities and food requests and guaranteeing the survival of the species. The present work aims to articulate speech processing in the motor context with aspects of the project entitled "Emotional States and Vocalization: A Comparative Study of the Prosodic Contours of Crying in Human and Non-Human Animals". First, concepts of speech motor processing and general aspects of speech evolution will be presented to relate these two approaches to animal sound communication.

Keywords: speech motor processing, animal communication, animal behaviour, language acquisition

Procedia PDF Downloads 57
13961 Localization of Frontal and Temporal Speech Areas in Brain Tumor Patients by Their Structural Connections with Probabilistic Tractography

Authors: B. Shukir, H. Woo, P. Barzo, D. Kis

Abstract:

Preoperative brain mapping in tumors involving the speech areas plays an important role in reducing surgical risks. Functional magnetic resonance imaging (fMRI) is the gold standard method for localizing cortical speech areas preoperatively, but its availability in clinical routine is limited. Diffusion MRI-based probabilistic tractography is available in head MRI. It is used to segment cortical subregions by their structural connectivity. In our study, we used probabilistic tractography to localize the frontal and temporal cortical speech areas. Fifteen patients with left frontal tumors were enrolled in our study. Speech fMRI and diffusion MRI were acquired preoperatively. The standard automated anatomical labelling atlas 3 (AAL3) cortical atlas was used to define 76 left frontal and 118 left temporal potential speech areas. Four types of tractography were run according to the structural connection of these regions to the left arcuate fascicle (FA), to localize the cortical areas that have speech functions: (1) frontal through FA, (2) frontal with FA, (3) temporal to FA, and (4) temporal with FA connections. Thresholds of 1%, 5%, 10% and 15% were applied. At each level, the number of frontal and temporal regions identified by fMRI and by tractography was determined, and the sensitivity and specificity were calculated. The 1% threshold showed the best results. Sensitivity was 61,631,4% and 67,1523,12%, specificity was 87,210,4% and 75,611,37% for frontal and temporal regions, respectively. From our study, we conclude that probabilistic tractography is a reliable preoperative technique for localizing cortical speech areas. However, its results are not something the neurosurgeon can rely on during the operation.
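
The sensitivity and specificity mentioned above can be computed by treating the fMRI-positive regions as the reference and the tractography-positive regions as the prediction. The sketch below assumes that setup. The region identifiers and the function name are made up for illustration and do not come from the study.

```python
def sensitivity_specificity(reference_positive, predicted_positive, all_regions):
    """Compare predicted regions with a reference labelling.

    sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)
    """
    universe = set(all_regions)
    reference = set(reference_positive) & universe
    predicted = set(predicted_positive) & universe

    tp = len(reference & predicted)             # found by both methods
    fn = len(reference - predicted)             # missed by the prediction
    fp = len(predicted - reference)             # predicted but not in reference
    tn = len(universe - reference - predicted)  # negative in both

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical atlas region ids; not data from the study.
    regions = range(10)
    fmri_positive = {0, 1, 2, 3}
    tractography_positive = {2, 3, 4}
    print(sensitivity_specificity(fmri_positive, tractography_positive, regions))
```

Repeating the calculation per patient and per threshold, then averaging, would give summary figures of the kind the abstract reports.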

Keywords: brain mapping, brain tumor, fMRI, probabilistic tractography

Procedia PDF Downloads 125
13960 Hybrid Approach for Face Recognition Combining Gabor Wavelet and Linear Discriminant Analysis

Authors: A. Annis Fathima, V. Vaidehi, S. Ajitha

Abstract:

Face recognition systems find many applications in surveillance and human-computer interaction. Because these applications are important and demand high accuracy, the face recognition system is expected to be more robust while requiring less computation time. In this paper, a hybrid approach for face recognition combining Gabor Wavelet and Linear Discriminant Analysis (HGWLDA) is proposed. The normalized input grayscale image is approximated and reduced in dimension to lower the processing overhead for the Gabor filters. This image is convolved with a bank of Gabor filters with varying scales and orientations. LDA, a subspace analysis technique, is used to reduce the intra-class space and maximize the inter-class space. The variants used are 2-dimensional Linear Discriminant Analysis (2D-LDA), 2-dimensional bidirectional LDA ((2D)²LDA), and Weighted 2-dimensional bidirectional Linear Discriminant Analysis (Wt (2D)² LDA). LDA reduces the feature dimension by extracting the features with greater variance. A k-Nearest Neighbour (k-NN) classifier is used to classify and recognize the test image by comparing its features with each of the training set features. The HGWLDA approach is robust against illumination conditions as the Gabor features are illumination invariant. This approach also aims at a better recognition rate using fewer features for varying expressions. The performance of the proposed HGWLDA approaches is evaluated using the AT&T database, the MIT-India face database and the faces94 database. It is found that the proposed HGWLDA approach provides better results than the existing Gabor approach.
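
To illustrate what a bank of Gabor filters with varying scales and orientations looks like, the sketch below builds real-valued Gabor kernels from the standard formula (a Gaussian envelope multiplied by a cosine carrier). The kernel size, wavelengths, number of orientations, and the sigma-to-wavelength ratio are assumptions, not the paper's settings. The LDA and k-NN stages are not shown.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real part of a Gabor filter sampled on a size x size grid (size odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2.0 * sigma * sigma))
            carrier = math.cos(2.0 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_bank(size=9, wavelengths=(4.0, 8.0), orientations=4):
    """Kernels for every (scale, orientation) pair; parameters are illustrative."""
    bank = []
    for wavelength in wavelengths:
        for k in range(orientations):
            theta = k * math.pi / orientations
            # sigma tied to the wavelength: an assumed, commonly used ratio
            bank.append(gabor_kernel(size, wavelength, theta, sigma=0.56 * wavelength))
    return bank


if __name__ == "__main__":
    bank = gabor_bank()
    print(len(bank), "kernels of size", len(bank[0]), "x", len(bank[0][0]))
```

Convolving the downsampled face image with each kernel yields one response map per scale and orientation, which is the feature set a method like the one above would then reduce.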

Keywords: face recognition, Gabor wavelet, LDA, k-NN classifier

Procedia PDF Downloads 447
13959 Recognition and Protection of Indigenous Society in Indonesia

Authors: Triyanto, Rima Vien Permata Hartanto

Abstract:

Indonesia is a legal state. The consequence of this status is the recognition and protection of the existence of indigenous peoples. This paper aims to describe the dynamics of legal recognition and protection for indigenous peoples within the framework of Indonesian law. This paper is library research based on literature. The result states that although the constitution has normatively recognized the existence of indigenous peoples and their traditional rights, in reality, not all rights were recognized and protected. The protection and recognition for indigenous people need to be strengthened.

Keywords: indigenous peoples, customary law, state law, state of law

Procedia PDF Downloads 295
13958 Relevant LMA Features for Human Motion Recognition

Authors: Insaf Ajili, Malik Mallem, Jean-Yves Didier

Abstract:

Motion recognition from videos is a very complex task due to the high variability of motions. This paper describes the challenges of human motion recognition, especially the motion representation step with relevant features. Our descriptor vector is inspired by the Laban Movement Analysis method. We propose discriminative features using the Random Forest algorithm in order to remove redundant features and make learning algorithms operate faster and more effectively. We validate our method on the MSRC-12 and UTKinect datasets.

Keywords: discriminative LMA features, features reduction, human motion recognition, random forest

Procedia PDF Downloads 160
13957 Mood Choices and Modality Patterns in Donald Trump’s Inaugural Presidential Speech

Authors: Mary Titilayo Olowe

Abstract:

The controversies that trailed the political campaign and eventual choice of Donald Trump as the American president are so great that expectations are high as to what the content of his inaugural speech will portray. Given that language is a dynamic vehicle for expressing intentions, the speech needs to be objectively assessed so as to access its content in the manner intended through the three strands of meaning postulated by Systemic Functional Grammar (SFG): the ideational, the interpersonal and the textual. The focus of this paper, however, is on the interpersonal meaning, which deals with how language exhibits social roles and relationships. This paper, therefore, attempts to analyse President Donald Trump’s inaugural speech to elicit the interpersonal meaning in it. The analysis is done from the perspective of mood and modality, which are housed in SFG. Results of the mood choice, which is basically declarative, reveal an information-centered speech, while the frequent choice of the modal verb operator ‘will’ shows President Donald Trump’s ability to establish an equal and reliant relationship with his audience, i.e., the Americans. In conclusion, the appeal of the speech to different levels of interpersonal meaning is largely responsible for its overall effectiveness. One can, therefore, understand the reason for the massive reaction it generates at the center of global discourse.

Keywords: interpersonal, modality, mood, systemic functional grammar

Procedia PDF Downloads 189
13956 A Corpus-Based Contrastive Analysis of Directive Speech Act Verbs in English and Chinese Legal Texts

Authors: Wujian Han

Abstract:

In the process of human interaction and communication, speech act verbs are considered to be the most active component and the main means of information transmission, and are also taken as an indication of the structure of linguistic behavior. The theoretical value and practical significance of such everyday built-in metalanguage have long been recognized. This paper, which is part of a bigger study, aims to provide useful insights for a more precise and systematic approach to translating speech act verbs between English and Chinese, especially with regard to the degree to which generic integrity is maintained in the practice of translating legal documents. In this study, the corpus, i.e. Chinese legal texts and their English translations, English legal texts, ordinary Chinese texts, and ordinary English texts, serves as a testing ground for examining contrastively the usage of English and Chinese directive speech act verbs in legal genre. The scope of this paper is relatively wide and essentially covers all directive speech act verbs which are used in ordinary English and Chinese, such as order, command, request, prohibit, threaten, advise, warn and permit. The researcher, by combining the corpus methodology with a contrastive perspective, explored a range of characteristics of English and Chinese directive speech act verbs, including their semantic, syntactic and pragmatic features, and then contrasted them in a structured way. It has been found that there are similarities between English and Chinese directive speech act verbs in legal genre, such as similar semantic components between English speech act verbs and their translation equivalents in Chinese, and formal and accurate usage of English and Chinese directive speech act verbs in legal contexts. But notable differences have been identified between their usage in the original Chinese and English legal texts, in areas such as valency patterns and frequency of occurrence.
For example, the subjects of some directive speech act verbs are very frequently omitted in Chinese legal texts, but this is not the case in English legal texts. One of the practicable methods to achieve adequacy and conciseness in speech act verb translation from Chinese into English in legal genre is to repeat the subjects or the message with discrepancy, and vice versa. In addition, translation effects such as overuse and underuse of certain directive speech act verbs are also found in the translated English texts compared to the original English texts. Legal texts constitute particularly valuable material for speech act verb study. Building up such a contrastive picture of the Chinese and English speech act verbs in legal language would yield results of value and interest to legal translators and students of language for legal purposes and have practical application to legal translation between English and Chinese.

Keywords: contrastive analysis, corpus-based, directive speech act verbs, legal texts, translation between English and Chinese

Procedia PDF Downloads 447
13955 Dynamic Gabor Filter Facial Features-Based Recognition of Emotion in Video Sequences

Authors: T. Hari Prasath, P. Ithaya Rani

Abstract:

In the world of visual technology, recognizing emotions from face images is a challenging task. Several related methods have not utilized dynamic facial features effectively for high performance. This paper proposes a high-performance method for emotion recognition using dynamic facial features. Initially, local features are captured in each frame by Gabor filters with different scales and orientations in order to find the position and scale of the face region against different backgrounds. The Gabor features are sent to an ensemble classifier for detecting Gabor facial features. The regions of dynamic features are captured from the Gabor facial features in consecutive frames, which represent the dynamic variations of facial appearance. Each region of dynamic features is normalized using the Z-score normalization method and is then encoded into binary pattern features with the help of threshold values. The binary features are passed to a multi-class AdaBoost classifier, trained on a database containing happiness, sadness, surprise, fear, anger, disgust, and neutral expressions, to classify the discriminative dynamic features for emotion recognition. The developed method is evaluated on the Ryerson Multimedia Research Lab and Cohn-Kanade databases, where it shows a significant performance improvement over existing methods owing to its dynamic features.
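
The normalization and encoding step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `zscore_binary_pattern`, the single global threshold of 0.0, and the small `eps` guard are all assumptions, since the abstract does not state how the threshold values are chosen.

```python
import numpy as np

def zscore_binary_pattern(region, threshold=0.0, eps=1e-8):
    """Z-score normalize one feature region, then binarize it.

    `threshold` and `eps` are illustrative assumptions; the abstract
    only says that threshold values are used for the encoding.
    """
    region = np.asarray(region, dtype=np.float64)
    # Z-score: subtract the region mean and divide by its standard deviation.
    z = (region - region.mean()) / (region.std() + eps)
    # Binary pattern: 1 where the normalized response exceeds the threshold.
    bits = (z > threshold).astype(np.uint8)
    return z, bits

# Hypothetical 2x2 region of Gabor responses.
z, bits = zscore_binary_pattern([[1.0, 2.0], [3.0, 4.0]])
```

In a full pipeline, the resulting bit patterns would be flattened and passed to the classifier; that stage is omitted here.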

Keywords: detecting face, Gabor filter, multi-class AdaBoost classifier, Z-score normalization

Procedia PDF Downloads 246
13954 Real-Time Recognition of Dynamic Hand Postures on a Neuromorphic System

Authors: Qian Liu, Steve Furber

Abstract:

To explore how the brain may recognize objects in a general, accurate and energy-efficient manner, this paper proposes the use of a neuromorphic hardware system formed from a Dynamic Video Sensor (DVS) silicon retina in concert with the SpiNNaker real-time Spiking Neural Network (SNN) simulator. As a first step in the exploration on this platform, a recognition system for dynamic hand postures is developed, enabling the study of the methods used in the visual pathways of the brain. Inspired by the behaviours of the primary visual cortex, Convolutional Neural Networks (CNNs) are modeled using both linear perceptrons and spiking Leaky Integrate-and-Fire (LIF) neurons. In this study's largest configuration using these approaches, a network of 74,210 neurons and 15,216,512 synapses is created and operated in real-time using 290 SpiNNaker processor cores in parallel, with 93.0% accuracy. A smaller network using only 1/10th of the resources is also created, again operating in real-time, and it is able to recognize the postures with an accuracy of around 86.4%, only 6.6 percentage points lower than the much larger system. The recognition rate of the smaller network developed on this neuromorphic system is sufficient for a successful hand posture recognition system, and demonstrates a much-improved cost to performance trade-off in its approach.
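
For readers unfamiliar with the Leaky Integrate-and-Fire neuron mentioned above, the following is a minimal single-neuron sketch using a forward-Euler update. It is not the SpiNNaker implementation; the function name `simulate_lif` and all parameter values (time constant, threshold, reset, resistance) are illustrative assumptions.

```python
import numpy as np

def simulate_lif(input_current, dt=1.0, tau_m=20.0, v_rest=0.0,
                 v_thresh=1.0, v_reset=0.0, resistance=1.0):
    """Simulate one LIF neuron; all parameter values are assumed, not from the paper."""
    v = v_rest
    trace, spikes = [], []
    for t, i_in in enumerate(input_current):
        # Euler step of: tau_m * dv/dt = -(v - v_rest) + R * I
        v += dt / tau_m * (-(v - v_rest) + resistance * i_in)
        if v >= v_thresh:
            # Threshold crossed: record a spike and reset the membrane.
            spikes.append(t)
            v = v_reset
        trace.append(v)
    return np.array(trace), spikes

# A constant supra-threshold input produces a regular spike train.
trace, spikes = simulate_lif(np.full(100, 2.0))
```

With a constant input whose steady-state voltage lies below the threshold, the membrane potential settles without ever spiking, which is the "leaky" behaviour the model is named for.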

Keywords: spiking neural network (SNN), convolutional neural network (CNN), posture recognition, neuromorphic system

Procedia PDF Downloads 438
13953 Face Recognition Using Discrete Orthogonal Hahn Moments

Authors: Fatima Akhmedova, Simon Liao

Abstract:

One of the most critical decision points in the design of a face recognition system is the choice of an appropriate face representation. Effective feature descriptors are expected to convey sufficient, invariant and non-redundant facial information. In this work, we propose a set of Hahn moments as a new approach for feature description. Hahn moments have been widely used in image analysis due to their invariance, non-redundancy and the ability to extract features both globally and locally. To assess the applicability of Hahn moments to face recognition, we conduct two experiments on the Olivetti Research Laboratory (ORL) database and the University of Notre-Dame (UND) X1 biometric collection. A fusion of the global features with the features from local facial regions is used as the input to a conventional k-NN classifier. The method correctly recognizes 93% of subjects in the ORL database and 94% in the UND database.
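
The classification stage described above (fused global and local features fed to a k-NN classifier) can be sketched as below. This is an assumption-laden illustration: fusion is taken to be simple concatenation, the distance is Euclidean, and the names `fuse_features` and `knn_predict` are hypothetical. Computing the Hahn moments themselves is not shown.

```python
from collections import Counter

import numpy as np

def fuse_features(global_feats, local_feats):
    """Fuse by concatenation (an assumed fusion rule, not stated in the abstract)."""
    return np.concatenate([np.ravel(global_feats), np.ravel(local_feats)])

def knn_predict(train_x, train_y, query, k=3):
    """Plain k-NN with Euclidean distance and majority voting."""
    train_x = np.asarray(train_x, dtype=np.float64)
    query = np.asarray(query, dtype=np.float64)
    dists = np.linalg.norm(train_x - query, axis=1)
    # Indices of the k closest training vectors.
    nearest = np.argsort(dists, kind="stable")[:k]
    votes = Counter(train_y[int(i)] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two subjects, two-dimensional global and local features.
train_x = [fuse_features([0.0, 0.1], [0.0, 0.0]),
           fuse_features([0.1, 0.0], [0.1, 0.1]),
           fuse_features([1.0, 0.9], [1.0, 1.0]),
           fuse_features([0.9, 1.0], [0.9, 0.9])]
train_y = ["subject_a", "subject_a", "subject_b", "subject_b"]
label = knn_predict(train_x, train_y, fuse_features([0.95, 0.95], [1.0, 0.9]), k=3)
```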

Keywords: face recognition, Hahn moments, recognition-by-parts, time-lapse

Procedia PDF Downloads 340
13952 Implementing Text Using Political and Current Issues to Create Choreography: “The Pledge 2.0”

Authors: Muhammad Fairul Azreen bin Mohd Zahid, Melissa Querk, Aimi Nabila bt Anizaim

Abstract:

For this particular research, the focus is on practice as research, which will produce a choreography as the outcome. The ideas develop organically as an “epiphany” from meetings, brainstorming, or situations arising in the surroundings. In this study, the researchers approach the national pillar of Malaysia known as ‘Rukun Negara’ to develop a choreographic idea. The Speech Act theory of J. L. Austin is used to compose the choreography, with the national pillar ‘Rukun Negara’ as a guideline, for a contemporary work titled The Pledge 2.0, which also fosters the spirit of unity. These approaches will offer flexibility in creating a choreography piece. The Pledge has crossed boundaries by using texts and heavy issues in choreography development. It will emphasize the concept of delivering speech via verbal and nonverbal body language. Besides using the Theory of Speech Acts, the development process of creating this piece will lay bare the normative structure implicit in performance practice. Converging current issues into the final choreographic piece is vital, as this research will explore a few choreography methods from different perspectives. Hence, the audience will be able to see a world of dance that always revolves in line with the diachronic process in many ways. The method used in this research is qualitative and will be used to find the movement that fits the given facts.

Keywords: performing arts, speech act, performative, nationalism, choreography, politic in dance

Procedia PDF Downloads 60
13951 Topology-Based Character Recognition Method for Coin Date Detection

Authors: Xingyu Pan, Laure Tougne

Abstract:

For recognizing coins, the engraved release date is important information for identifying the monetary type precisely. However, reading characters on coins faces many more obstacles than traditional character recognition tasks in other fields, such as reading scanned documents or license plates. To address this challenging issue in a numismatic context, we propose a training-free approach dedicated to the detection and recognition of the release date of a coin. In the first step, the date zone is detected by comparing histogram features; in the second step, a topology-based algorithm is introduced to recognize coin numbers with various font types represented by a binary gradient map. Our method obtained a recognition rate of 92% on synthetic data and 44% on real noisy data.
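
The abstract does not specify which topological properties the algorithm uses. As one hedged illustration of why topology suits a training-free approach, the sketch below counts the holes in a binary digit image, a feature that does not depend on the font: a "0" has one hole, an "8" has two, a "1" has none. The function name `count_holes` and the choice of 4-connectivity for the background are assumptions, and this is not the authors' method.

```python
from collections import deque

def count_holes(binary):
    """Count background regions fully enclosed by foreground (1) pixels."""
    rows, cols = len(binary), len(binary[0])
    # Pad with a background border so the outer background is one region.
    padded = ([[0] * (cols + 2)]
              + [[0] + list(row) + [0] for row in binary]
              + [[0] * (cols + 2)])
    seen = [[False] * (cols + 2) for _ in range(rows + 2)]

    def flood(start_r, start_c):
        # Breadth-first fill over 4-connected background pixels.
        queue = deque([(start_r, start_c)])
        seen[start_r][start_c] = True
        while queue:
            r, c = queue.popleft()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows + 2 and 0 <= nc < cols + 2
                        and not seen[nr][nc] and padded[nr][nc] == 0):
                    seen[nr][nc] = True
                    queue.append((nr, nc))

    flood(0, 0)  # mark the outer background first
    holes = 0
    for r in range(rows + 2):
        for c in range(cols + 2):
            if padded[r][c] == 0 and not seen[r][c]:
                holes += 1  # an unvisited background region is a hole
                flood(r, c)
    return holes

# Hypothetical tiny glyphs.
ZERO = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
EIGHT = [[1, 1, 1], [1, 0, 1], [1, 1, 1], [1, 0, 1], [1, 1, 1]]
ONE = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```

A hole count alone cannot separate all ten digits, so a practical system would combine it with further features.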

Keywords: coin, detection, character recognition, topology

Procedia PDF Downloads 225
13950 A Survey on Speech Emotion-Based Music Recommendation System

Authors: Chirag Kothawade, Gourie Jagtap, PreetKaur Relusinghani, Vedang Chavan, Smitha S. Bhosale

Abstract:

Psychological research has proven that music relieves stress, elevates mood, and is responsible for the release of “feel-good” chemicals like oxytocin, serotonin, and dopamine. It comes as no surprise that music has been a popular tool in rehabilitation centers and in therapy for various disorders. With the steadily rising number of people facing mental health-related issues across the globe, addressing mental health concerns is more crucial than ever. Despite the existing music recommendation systems, there is a dearth of holistically curated algorithms that take care of the needs of users. Given that an undeniable majority of people turn to music on a regular basis, and that music has been proven to increase cognition, memory, and sleep quality while reducing anxiety, pain, and blood pressure, it is the need of the hour to fashion a product that extracts all the benefits of music in the most extensive and deployable way possible. Our project aims to improve our users’ mental state by building a comprehensive mood-based music recommendation system called “Viby”.

Keywords: language, communication, speech recognition, interaction

Procedia PDF Downloads 33
13949 Improved Dynamic Bayesian Networks Applied to Arabic On Line Characters Recognition

Authors: Redouane Tlemsani, Abdelkader Benyettou

Abstract:

This work addresses on-line Arabic character recognition; the principal motivation is to study Arabic handwriting with on-line technology. The system is Markovian and can be viewed as a Dynamic Bayesian Network (DBN). One of the major advantages of such systems is that the complete model (topology and parameters) can be trained from training data. Our approach is based on the dynamic Bayesian network formalism. DBN theory generalizes Bayesian networks to dynamic processes. One of our objectives is to find better parameters representing the links (dependencies) between the variables of the dynamic network. In pattern recognition applications, the structure is usually fixed in advance, which obliges us to accept some strong assumptions (for example, independence between some variables). Our application concerns the on-line recognition of isolated Arabic characters using our laboratory database, NOUN. A neural tester is proposed for external optimization of the DBN. The "DBN scores" and "DBN mixed" results are 70.24% and 62.50%, respectively, which leaves room for further development; other approaches that take time into account were considered and implemented until a significant recognition rate of 94.79% was obtained.

Keywords: Arabic on line character recognition, dynamic Bayesian network, pattern recognition, computer vision

Procedia PDF Downloads 402
13948 Compensatory Articulation of Pressure Consonants in Telugu Cleft Palate Speech: A Spectrographic Analysis

Authors: Indira Kothalanka

Abstract:

For individuals born with a cleft palate (CP), there is no separation between the nasal cavity and the oral cavity, so they cannot build up enough air pressure in the mouth for speech. It is therefore common for them to have speech problems. Common cleft-type speech errors include abnormal articulation (compensatory or obligatory) and abnormal resonance (hyper, hypo and mixed nasality). These are generally resolved after palate repair. However, in some individuals, articulation problems persist even after the palate repair. Such individuals develop variant articulations in an attempt to compensate for the inability to produce the target phonemes. A spectrographic analysis is used to investigate the compensatory articulatory behaviours of pressure consonants in the speech of 10 Telugu-speaking individuals aged 7 to 17 years with a history of cleft palate. Telugu is a Dravidian language spoken in the Indian states of Andhra Pradesh and Telangana. It has the third largest number of native speakers in India and is the most spoken Dravidian language. The speech of the informants is analysed using a single-word list, sentences, a passage and conversation. Spectrographic analysis is carried out using PRAAT, a speech analysis software package. The place and manner of articulation of consonant sounds are studied through spectrograms with the help of various acoustic cues. The types of compensatory articulation identified are glottal stops, palatal stops, uvular, velar stops and nasal fricatives, which are non-native in Telugu.

Keywords: cleft palate, compensatory articulation, spectrographic analysis, PRAAT

Procedia PDF Downloads 417