Search results for: multilingual automatic speech recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3168

2628 Cockpit Integration and Piloted Assessment of an Upset Detection and Recovery System

Authors: Hafid Smaili, Wilfred Rouwhorst, Paul Frost

Abstract:

The trend of recent accident and incident cases worldwide shows that state-of-the-art automation and operations, for current and future demanding operational environments, do not provide the desired level of operational safety under crew peak workload conditions, specifically in complex situations such as loss-of-control in-flight (LOC-I). Today, the short-term focus is on preparing crews to recognise and handle LOC-I situations through upset recovery training. This paper describes the cockpit integration aspects and piloted assessment of both a manually assisted and an automatic upset detection and recovery system that has been developed and demonstrated within the European Advanced Cockpit for Reduction Of StreSs and workload (ACROSS) programme. The proposed system is a function that continuously monitors the aircraft and intervenes when it enters an upset, either providing pilot-assisted manual guidance or taking over full control of the aircraft to recover from the upset. In order to mitigate the high physical and psychological impact of aircraft upset events, the system provides new cockpit functionalities to support the pilot in recovering from any upset, both manually assisted and automatically. A piloted simulator assessment was conducted in Oct-Nov 2015 with ten pilots in a representative civil large transport fly-by-wire aircraft to determine which of the tested upset detection and recovery system configurations pilots preferred for reducing workload, increasing situational awareness, and interacting safely with the manually assisted or automated modes. The piloted simulator evaluation of the upset detection and recovery system showed that the functionalities of the system are able to support pilots during an upset. The experiment showed that pilots are willing to rely on the guidance provided by the system during an upset. It is therefore important for pilots to see and understand what the aircraft is doing and trying to do, especially in automatic modes.
Comparing the manually assisted and the automatic recovery modes, the pilots were of the opinion that an automatic recovery reduces the workload so that they could perform a proper screening of the primary flight display. The results further show that the manually assisted recoveries, with recovery guidance cues on the cockpit primary flight display, reduced workload for severe upsets compared to today’s situation. The level of situation awareness was improved for automatic upset recoveries where the pilot could monitor what the system was trying to accomplish, compared to automatic recovery modes without any guidance. An improvement in situation awareness was also noticeable with the manually assisted upset recovery functionalities as compared to the current non-assisted recovery procedures. This study shows that automatic upset detection and recovery functionalities are likely to positively impact operational safety by means of reduced workload, improved situation awareness and crew stress reduction. It is thus believed that future developments for upset recovery guidance and loss-of-control prevention should focus on automatic recovery solutions.

Keywords: aircraft accidents, automatic flight control, loss-of-control, upset recovery

Procedia PDF Downloads 206
2627 Keyframe Extraction Using Face Quality Assessment and Convolution Neural Network

Authors: Rahma Abed, Sahbi Bahroun, Ezzeddine Zagrouba

Abstract:

Due to the huge amount of data in videos, extracting the relevant frames has become a necessary and essential step prior to performing face recognition. In this context, we propose a method for extracting keyframes from videos based on face quality and deep learning for a face recognition task. This method has two steps. We start by generating face quality scores for each face image using three face feature extractors: Gabor, LBP, and HOG. The second step consists of training a deep convolutional neural network in a supervised manner in order to select the frames that have the best face quality. The results show the effectiveness of the proposed method compared to state-of-the-art methods.

Keywords: keyframe extraction, face quality assessment, face in video recognition, convolution neural network

Procedia PDF Downloads 226
2626 New Formula for Revenue Recognition Likely to Change the Prescription for Pharma Industry

Authors: Shruti Hajirnis

Abstract:

In May 2014, FASB issued Accounting Standards Update (ASU) 2014-09, Revenue from Contracts with Customers (Topic 606), and the International Accounting Standards Board (IASB) issued International Financial Reporting Standards (IFRS) 15, Revenue from Contracts with Customers, which will supersede virtually all revenue recognition requirements in IFRS and US GAAP. FASB and the IASB have essentially achieved convergence with these standards, with only minor differences such as the collectability threshold, interim disclosure requirements, early application and effective date, impairment loss reversal, and nonpublic entity requirements. This paper discusses the impact of the five-step model prescribed in the new revenue standard on entities operating in the pharma industry. It also outlines the considerations for these entities when implementing the new standard.

Keywords: revenue recognition, pharma industry, standard, requirements

Procedia PDF Downloads 438
2625 Development and Application of the Proctoring System with Face Recognition for User Registration on the Educational Information Portal

Authors: Meruyert Serik, Nassipzhan Duisegaliyeva, Danara Tleumagambetova, Madina Ermaganbetova

Abstract:

This research paper explores the process of creating a proctoring system by evaluating the implementation of practical face recognition algorithms. The research work was reviewed by students of the educational programs "6B01511-Computer Science", "7M01511-Computer Science", "7M01525-STEM Education," and "8D01511-Computer Science" of Eurasian National University named after L.N. Gumilyov. As an outcome, a proctoring system will be created that enables tests to be conducted and academic integrity to be checked within the system. Because the system operates correctly, tests can be carried out. The resulting proctoring system will be the basis for automating the informational, educational portal developed using machine learning.

Keywords: artificial intelligence, education portal, face recognition, machine learning, proctoring

Procedia PDF Downloads 113
2624 Automatic Integrated Inverter Type Smart Device for Safe Kitchen

Authors: K. M. Jananni, R. Nandini

Abstract:

The proposed wireless, inverter-type design of an LPG leakage monitoring system aims to provide a smart and safe kitchen. The system detects LPG leaks using nano-sensors and alerts the concerned individual through a GSM system. The system uses two sensors, one attached to the chimney and the other to the regulator of the LPG cylinder. When a leak is detected, the sensor at the regulator actuates the system to cut off the gas supply immediately using a solenoid control valve. The sensor at the chimney checks for the permissible level of LPG in the air, and when the level exceeds the threshold, the system sends an automatic SMS to the saved numbers. Further, the sensor actuates the mini suction system fixed at the chimney within 20 seconds of a leak to suck out the gas until the level falls well below the threshold. As a safety measure, an automatic window opening and alarm feature is also incorporated into the system. The key feature of this design is that the system is provided with a special inverter designed to make the device function effectively even during power failures. This paper discusses the use of sensors in the kitchen area and presents the proposed architecture for real-time field monitoring with a PIC microcontroller.

Keywords: nano sensors, global system for mobile communication, GSM, micro controller, inverter

Procedia PDF Downloads 471
2623 Unsupervised Learning with Self-Organizing Maps for Named Entity Recognition in the CONLL2003 Dataset

Authors: Assel Jaxylykova, Alexnder Pak

Abstract:

This study utilized a Self-Organizing Map (SOM) for unsupervised learning on the CONLL-2003 dataset for Named Entity Recognition (NER). The process involved encoding words into 300-dimensional vectors using FastText. These vectors were input into a SOM grid, where training adjusted node weights to minimize distances. The SOM provided a topological representation for identifying and clustering named entities, demonstrating its efficacy without labeled examples. Results showed an F1-measure of 0.86, highlighting SOM's viability. Although some methods achieve higher F1 measures, SOM eliminates the need for labeled data, offering a scalable and efficient alternative. The SOM's ability to uncover hidden patterns provides insights that could enhance existing supervised methods. Further investigation into potential limitations and optimization strategies is suggested to maximize benefits.
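
As a rough illustration of the training step described in this abstract (not the authors' implementation), a single SOM update can be sketched in plain Python. The dictionary-based grid, the toy 2-dimensional vectors standing in for the 300-dimensional FastText embeddings, and the `lr` and `sigma` parameters are all assumptions made for this sketch.

```python
import math


def best_matching_unit(weights, x):
    """Return the grid coordinates of the node whose weights are closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_step(weights, x, lr, sigma):
    """Pull every node towards x, scaled by a Gaussian neighbourhood.

    weights maps (row, col) grid coordinates to a weight vector (list of floats).
    lr and sigma are hypothetical hyperparameters, not values from the paper.
    """
    bi, bj = best_matching_unit(weights, x)
    for (i, j), w in weights.items():
        grid_dist_sq = (i - bi) ** 2 + (j - bj) ** 2
        h = math.exp(-grid_dist_sq / (2 * sigma ** 2))  # 1.0 at the BMU itself
        weights[(i, j)] = [wk + lr * h * (xk - wk) for wk, xk in zip(w, x)]
    return bi, bj
```

Repeating `train_step` over all word vectors while shrinking `lr` and `sigma` is the usual way such a map is trained; how clusters are then mapped to entity types is not specified in the abstract and is left out here.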

Keywords: named entity recognition, natural language processing, self-organizing map, CONLL-2003, semantics

Procedia PDF Downloads 37
2622 2.5D Face Recognition Using Gabor Discrete Cosine Transform

Authors: Ali Cheraghian, Farshid Hajati, Soheila Gheisari, Yongsheng Gao

Abstract:

In this paper, we present a novel 2.5D face recognition method based on Gabor Discrete Cosine Transform (GDCT). In the proposed method, the Gabor filter is applied to extract feature vectors from the texture and the depth information. Then, Discrete Cosine Transform (DCT) is used for dimensionality and redundancy reduction to improve computational efficiency. The system combines texture and depth information at the decision level, which yields higher performance than methods that use texture and depth information separately. The proposed algorithm is evaluated on the publicly available Bosphorus database, which includes models with pose variation. The experimental results show that the proposed method has a higher performance compared to the benchmark.

Keywords: Gabor filter, discrete cosine transform, 2.5D face recognition, pose

Procedia PDF Downloads 324
2621 Audio-Visual Recognition Based on Effective Model and Distillation

Authors: Heng Yang, Tao Luo, Yakun Zhang, Kai Wang, Wei Qin, Liang Xie, Ye Yan, Erwei Yin

Abstract:

In recent years, audio-visual recognition has shown great potential in strong-noise environments. Existing audio-visual recognition methods have explored ResNet and feature fusion. However, on the one hand, ResNet occupies a large amount of memory, restricting its application in engineering. On the other hand, feature merging also introduces interference in a high-noise environment. To solve these problems, we propose an effective framework with bidirectional distillation. First, in consideration of its good performance in feature extraction, we chose the lightweight model EfficientNet as our extractor of spatial features. Second, self-distillation was applied to learn more information from raw data. Finally, we proposed bidirectional distillation in decision-level fusion. Our experimental results are based on a multi-model dataset from 24 volunteers. The lipreading accuracy of our framework increased by 2.3% compared with existing systems, and our framework improved audio-visual fusion in a high-noise environment compared with an audio-only recognition system.

Keywords: lipreading, audio-visual, EfficientNet, distillation

Procedia PDF Downloads 130
2620 Medical Neural Classifier Based on Improved Genetic Algorithm

Authors: Fadzil Ahmad, Noor Ashidi Mat Isa

Abstract:

This study introduces an improved genetic algorithm procedure that focuses the search around near-optimal solutions corresponding to a group of elite chromosomes. This is achieved through a novel crossover technique known as Segmented Multi Chromosome Crossover. It preserves the highly important information contained in a gene segment of an elite chromosome and allows an offspring to carry information from gene segments of multiple chromosomes. In this way, the algorithm has a better chance of exploring the solution space effectively. The improved GA is applied to the automatic and simultaneous parameter optimization and feature selection of an artificial neural network for pattern recognition in medical problems, namely cancer and diabetes. The experimental results show that the average classification accuracy on the cancer and diabetes datasets improved by 0.1% and 0.3%, respectively, using the new algorithm.

Keywords: genetic algorithm, artificial neural network, pattern classification, classification accuracy

Procedia PDF Downloads 470
2619 A Stylistic Analysis of the Short Story ‘The Escape’ by Qaisra Shahraz

Authors: Huma Javed

Abstract:

Stylistics is a broad term concerned with both literature and linguistics, which increases its significance. This research aims to analyze Qaisra Shahraz's short story ‘The Escape’ from the viewpoint of stylistic analysis. The study focuses on three aspects of the short story: grammatical category, lexical category, and figures of speech. The research design for this article is both explorative and descriptive. The analysis of the data shows that the writer has used more nouns in the story than other lexical items, which suggests that the story has a descriptive rather than a narrative style.

Keywords: The Escape, stylistics, grammatical category, lexical category, figure of speech

Procedia PDF Downloads 228
2618 Imprecise Vowel Articulation in Down Syndrome: An Acoustic Study

Authors: Anitha Naittee Abraham, N. Sreedevi

Abstract:

Individuals with Down syndrome (DS) have relatively better expressive language compared to other individuals with intellectual disabilities. Reduced speech intelligibility is one of the major concerns for this group of individuals due to their anatomical and physiological differences. The study investigated the vowel articulation of Malayalam-speaking children with DS in the age range of 5-10 years. The vowel production of 10 children with DS was compared with that of typically developing children in the same age range. Vowels were extracted from 3 words with the corner vowels /a/, /i/ and /u/ in the word-initial position, using Praat (version 5.3.23) software. Acoustic analysis was based on vowel space area (VSA), formant centralization ratio (FCR) and F2i/F2u. The findings revealed increased formant values for the control group except for F2a and F2u. Also, the experimental group had higher FCR and lower VSA and F2i/F2u values, suggestive of imprecise vowel articulation due to restricted tongue movements. The results of the independent t-test revealed a significant difference in F1a, F2i, F2u, VSA, FCR and F2i/F2u values between the experimental and control groups. These findings support the fact that children with DS have imprecise vowel articulation that interferes with overall speech intelligibility. Hence, it is essential to target oromotor skills to enhance speech intelligibility, which in turn benefits the social and vocational domains of these individuals.
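
The abstract does not state how VSA and FCR were computed. As a hedged sketch, the function below uses definitions that are common in the acoustic-phonetics literature (the triangular vowel space area over /a/, /i/, /u/ and the formant centralization ratio); the function name and argument order are hypothetical, and formant values are assumed to be in Hz.

```python
def vowel_metrics(f1a, f2a, f1i, f2i, f1u, f2u):
    """Compute three vowel-articulation measures from corner-vowel formants (Hz).

    Assumed definitions (not confirmed by the abstract):
      VSA = 0.5 * |F1i*(F2a - F2u) + F1a*(F2u - F2i) + F1u*(F2i - F2a)|
      FCR = (F2u + F2a + F1i + F1u) / (F2i + F1a)
    """
    vsa = 0.5 * abs(f1i * (f2a - f2u) + f1a * (f2u - f2i) + f1u * (f2i - f2a))
    fcr = (f2u + f2a + f1i + f1u) / (f2i + f1a)
    return {"VSA": vsa, "FCR": fcr, "F2i/F2u": f2i / f2u}
```

Under these definitions, centralized vowels shrink the triangle (lower VSA) and push the ratio up (higher FCR), which matches the direction of the group differences reported above.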

Keywords: Down syndrome, FCR, vowel articulation, vowel space

Procedia PDF Downloads 180
2617 Sports Fans and Non-Interested Public Recognition of the Problems of Sports in Egypt through Caricature

Authors: Alaaeldin Hamdy Ahmed Mohammed

Abstract:

Introduction: This study examines how sports fans and the non-interested public perceive and recognize, through caricatures, the problems that negatively affect Egyptian sports, particularly football. Eight caricature paintings were designed to express eight problems affecting Egyptian sports and their development. These paintings were distributed to two groups: fans and the non-interested public. Methods: The study was limited to eight caricatures representing the eight issues, which are: the impact of stopping sports activity on athletes, the effect of clubs’ disagreement, fanaticism between members of the ultras of different clubs, the negative impact of the mingling of politics with sports, the negative effect of the clubs’ role on the professionalism of promising players, the conflict between the national organizations responsible for sports, fans breaking into the playgrounds, and the impact of the lack of planning on the national team. Results: The results showed that both sports fans and those who are not interested in sports recognized the problems that the caricatures refer to and criticize through exaggeration, although the rate was higher for the fans. These caricatures also contributed to their recognition of the danger these problems pose to Egyptian sports, particularly football, which is the most popular sport among Egyptian sports fans. Discussion: This finding echoes the conclusion that caricatures are distinctive in the adults’ facial stimuli that are either systematically exaggerated recognition of them.

Keywords: caricature, fans, football, sports

Procedia PDF Downloads 314
2616 An Ensemble-based Method for Vehicle Color Recognition

Authors: Saeedeh Barzegar Khalilsaraei, Manoocheher Kelarestaghi, Farshad Eshghi

Abstract:

The vehicle color, as a prominent and stable feature, helps to identify a vehicle more accurately. As a result, vehicle color recognition is of great importance in intelligent transportation systems. Unlike conventional methods which use only a single Convolutional Neural Network (CNN) for feature extraction or classification, in this paper, four CNNs, with different architectures well-performing in different classes, are trained to extract various features from the input image. To take advantage of the distinct capability of each network, the multiple outputs are combined using a stack generalization algorithm as an ensemble technique. As a result, the final model performs better than each CNN individually in vehicle color identification. The evaluation results in terms of overall average accuracy and accuracy variance show the proposed method’s outperformance compared to the state-of-the-art rivals.

Keywords: Vehicle Color Recognition, Ensemble Algorithm, Stack Generalization, Convolutional Neural Network

Procedia PDF Downloads 78
2615 Development of a New Wearable Device for Automatic Guidance Service

Authors: Dawei Cai

Abstract:

In this paper, we present a new wearable device that provides an automatic guidance service for visitors. By combining the position information from NFC and the orientation information from a 6-axis acceleration and terrestrial magnetism sensor, the head's direction can be calculated. We developed an algorithm to calculate the device orientation based on the data from the acceleration and terrestrial magnetism sensor. If a visitor wants an explanation of the exhibit in front of them, all they have to do is lift up their mobile device. The identification program automatically identifies the status based on the information from NFC and MEMS and starts playing the explanation content. This service may be convenient for elderly people, people with disabilities, or children.

Keywords: wearable device, ubiquitous computing, guide system, MEMS sensor, NFC

Procedia PDF Downloads 420
2614 GIS-Based Automatic Flight Planning of Camera-Equipped UAVs for Fire Emergency Response

Authors: Mohammed Sulaiman, Hexu Liu, Mohamed Binalhaj, William W. Liou, Osama Abudayyeh

Abstract:

Emerging technologies such as camera-equipped unmanned aerial vehicles (UAVs) are increasingly being applied in building fire rescue to provide real-time visualization and 3D reconstruction of the entire fireground. However, flight planning of camera-equipped UAVs is usually a manual process, which is not sufficient to fulfill the needs of emergency management. This research proposes a Geographic Information System (GIS)-based approach to automatic flight planning of camera-equipped UAVs for building fire emergency response. In this research, the Haversine formula and lawn-mowing patterns are employed to automate flight planning based on geometrical and spatial information from GIS. The resulting flight mission satisfies the requirements for 3D reconstruction of the fireground, taking into account flight execution safety and visibility of camera frames. The proposed approach is implemented within a GIS environment through an application programming interface. A case study is used to demonstrate the effectiveness of the proposed approach. The result shows that a flight mission can be generated in a timely manner for application to fire emergency response.
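
To make the two named ingredients concrete, the following is a minimal sketch, not the authors' GIS implementation: the Haversine formula for great-circle distance, and a back-and-forth sweep over a latitude/longitude bounding box. The function names, the bounding-box input, the `spacing_m` parameter, and the mean Earth radius constant are assumptions introduced for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, an assumed constant


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def lawnmower_waypoints(south, west, north, east, spacing_m):
    """Return (lat, lon) waypoints sweeping the box in alternating east/west rows.

    Rows are spread evenly from south to north, at most spacing_m apart.
    """
    height_m = haversine_m(south, west, north, west)
    n_rows = max(1, math.ceil(height_m / spacing_m)) + 1  # always at least 2 rows
    waypoints = []
    for i in range(n_rows):
        lat = south + (north - south) * i / (n_rows - 1)
        row = [(lat, west), (lat, east)]
        if i % 2 == 1:
            row.reverse()  # fly every other row in the opposite direction
        waypoints.extend(row)
    return waypoints
```

A real mission planner would derive `spacing_m` from camera footprint and image overlap, and would add altitude and obstacle constraints; none of those details are given in the abstract.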

Keywords: GIS, camera-equipped UAVs, automatic flight planning, fire emergency response

Procedia PDF Downloads 119
2613 Physiology of Temporal Lobe and Limbic System

Authors: Khaled A. Abdel-Sater

Abstract:

There are four areas of the temporal lobe. The primary auditory area (areas 41 and 42) is for the perception of auditory impulses. The auditory association area comprises areas 22, 21, and 20: areas 21 and 20 are for understanding and interpretation of auditory sensation, recognition of language, and long-term memories. Area 22, also called Wernicke’s area, is a sensory speech centre. It is for interpretation of auditory and visual information, formation of thoughts in the mind, and choice of words to be used. Ideas and thoughts originate in it. The limbic system is a set of cortical and subcortical structures forming a ring around the brainstem. Cortical structures are the orbitofrontal area, subcallosal gyrus, cingulate gyrus, parahippocampal gyrus, and uncus. Subcortical structures are the hypothalamus, hippocampus, amygdala, septum, paraolfactory area, anterior nucleus of the thalamus, and portions of the basal ganglia. There are several physiological functions of the limbic system, including regulation of behavior, motivation, and emotion.

Keywords: limbic system, motivation, emotions, temporal lobe

Procedia PDF Downloads 196
2612 A Smart Monitoring System for Preventing Gas Risks in Indoor

Authors: Gyoutae Park, Geunjun Lyu, Yeonjae Lee, Jaheon Gu, Sanguk Ahn, Hiesik Kim

Abstract:

In this paper, we propose a system for preventing gas risks through the use of wireless communication modules and intelligent gas safety appliances. Our system configuration consists of an automatic extinguishing system, detectors, a wall-pad, and a microcomputer-controlled (micom) gas meter to monitor gas flow and pressure as well as the occurrence of earthquakes. The automatic fire extinguishing system both checks for combustible gas leaks and monitors the environmental temperature, while the detector array measures smoke and CO gas concentrations. Depending on detected conditions, the micom gas meter cuts off an inner valve and generates a warning, the automatic fire-extinguishing system cuts off an external valve and sprays extinguishing materials, or the sensors generate signals and take further action when smoke or CO is detected. Information on intelligent measures taken by the gas safety appliances and sensors is transmitted to the wall-pad, which in turn relays it as real-time data to a server that can be monitored via an external network (BcN) connection to a web or mobile application for the management of gas safety. To validate this smart-home gas management system, we field-tested its suitability for use in Korean apartments under several scenarios.

Keywords: gas sensor, leak, gas safety, gas meter, gas risk, wireless communication

Procedia PDF Downloads 411
2611 An Approach to Autonomous Drones Using Deep Reinforcement Learning and Object Detection

Authors: K. R. Roopesh Bharatwaj, Avinash Maharana, Favour Tobi Aborisade, Roger Young

Abstract:

Presently, there are few cases of complete automation of drones and their allied intelligence capabilities. In essence, the potential of the drone has not yet been fully utilized. This paper presents feasible methods to build an intelligent drone with smart capabilities such as self-driving and obstacle avoidance. It does this through advanced reinforcement learning techniques and performs object detection using recent algorithms capable of processing lightweight models with fast training in real-time instances. For the scope of this paper, after researching and comparing the various algorithms, we implemented the Deep Q-Network (DQN) algorithm in the AirSim simulator. In future work, we plan to implement more advanced self-driving and object detection algorithms. We also plan to implement voice-based speech recognition for the entire drone operation, which would provide an option of speech communication between users and the drone in unavoidable circumstances, thus making drones an interactive, intelligent Robotic Voice Enabled Service Assistant. The proposed drone has a wide scope of usability and is applicable in scenarios such as disaster management, air transport of essentials, agriculture, manufacturing, monitoring people's movements in public areas, and defense. Also discussed is drone communication based on satellite broadband Internet technology for faster computation and a seamless, uninterrupted network during disasters and remote-location operations. This paper explains the feasible algorithms required to achieve this goal and is intended as a reference for future researchers going down this path.

Keywords: convolution neural network, natural language processing, obstacle avoidance, satellite broadband technology, self-driving

Procedia PDF Downloads 244
2610 Fuzzy Inference System for Determining Collision Risk of Ship in Madura Strait Using Automatic Identification System

Authors: Emmy Pratiwi, Ketut B. Artana, A. A. B. Dinariyana

Abstract:

Madura Strait is considered one of the busiest shipping channels in Indonesia. High vessel traffic density in Madura Strait poses a serious threat to navigational safety in this area, namely ship collision. This study is an attempt to enhance the safety of marine traffic. A fuzzy inference system (FIS) is proposed to calculate the collision risk of ships. Collision risk is evaluated based on ship domain, Distance to Closest Point of Approach (DCPA), and Time to Closest Point of Approach (TCPA). Data were collected using the Automatic Identification System (AIS). This study considers several ship domain models to characterize marine traffic in the waterways. Each encounter in the ship domain is analyzed to obtain the level of collision risk. The resulting risk levels can be used as guidance to avoid accidents, providing a brief description of traffic safety in Madura Strait and improving navigational safety in the area.
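
The fuzzy rules themselves are not given in the abstract, but two of the inputs, DCPA and TCPA, follow from standard relative-motion geometry. The sketch below assumes both ships hold constant velocity and that positions have already been projected onto a flat local plane (metres, metres per second); the function name and tuple layout are hypothetical, and converting AIS latitude, longitude, speed and course into these inputs is left out.

```python
import math


def cpa(own_pos, own_vel, tgt_pos, tgt_vel):
    """Return (DCPA, TCPA) for two ships moving at constant velocity.

    Positions are (x, y) in metres, velocities are (vx, vy) in metres per second.
    A negative TCPA means the closest approach already lies in the past.
    """
    rx, ry = tgt_pos[0] - own_pos[0], tgt_pos[1] - own_pos[1]  # relative position
    vx, vy = tgt_vel[0] - own_vel[0], tgt_vel[1] - own_vel[1]  # relative velocity
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        # No relative motion: the separation never changes.
        return math.hypot(rx, ry), 0.0
    tcpa = -(rx * vx + ry * vy) / speed_sq
    dcpa = math.hypot(rx + vx * tcpa, ry + vy * tcpa)
    return dcpa, tcpa
```

In a risk model of the kind described, small DCPA combined with small positive TCPA would typically map to a high risk level; the actual membership functions and thresholds are not stated in the abstract.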

Keywords: automatic identification system, collision risk, DCPA, fuzzy inference system, TCPA

Procedia PDF Downloads 547
2609 Human Computer Interaction Using Computer Vision and Speech Processing

Authors: Shreyansh Jain Jeetmal, Shobith P. Chadaga, Shreyas H. Srinivas

Abstract:

The Internet of Things (IoT) is seen as the next major step in the ongoing revolution of the Information Age. It is predicted that in the near future billions of embedded devices will be communicating with each other to perform a plethora of tasks with or without human intervention. One of the major ongoing hotbeds of research activity in IoT is Human Computer Interaction (HCI). HCI is used to facilitate communication between an intelligent system and a user. An intelligent system typically comprises various sensors, actuators and embedded controllers which communicate with each other to monitor data collected from the environment. Communication from the user to the system is typically done using voice. One of the major ongoing applications of HCI is in home automation as a personal assistant. The prime objective of our project is to implement a use case of HCI for home automation. Our system is designed to detect and recognize the users and personalize the appliances in the house according to their individual preferences. Our HCI system is also capable of speaking with the user in response to certain spoken commands, such as searching the web for information and controlling appliances. Our system can also monitor the environment in the house, such as air quality and gas leakages, for added safety.

Keywords: human computer interaction, internet of things, computer vision, sensor networks, speech to text, text to speech, android

Procedia PDF Downloads 358
2608 Human-Machine Cooperation in Facial Comparison Based on Likelihood Scores

Authors: Lanchi Xie, Zhihui Li, Zhigang Li, Guiqiang Wang, Lei Xu, Yuwen Yan

Abstract:

Image-based facial features can be classified into category recognition features and individual recognition features. Current automated face recognition systems extract a specific feature vector of different dimensions from a facial image according to their pre-trained neural network. However, to improve the efficiency of parameter calculation, an algorithm generally reduces the image details by pooling. This operation overlooks details that are of great concern to forensic experts. In our experiment, we adopted a variety of face recognition algorithms based on deep learning and compared a large number of naturally collected face images with the known data of the same person's frontal ID photos. Downscaling and manual handling were performed on the testing images. The results supported that the facial recognition algorithms based on deep learning detected structural and morphological information and rarely focused on specific markers such as stains and moles. Overall performance, distribution of genuine scores and impostor scores, and likelihood ratios were tested to evaluate the accuracy of biometric systems and forensic experts. Experiments showed that the biometric systems were skilled in distinguishing category features, and forensic experts were better at discovering the individual features of human faces. In the proposed approach, a fusion was performed at the score level. At the specified false accept rate, the framework achieved a lower false reject rate. This paper contributes to improving the interpretability of the objective method of facial comparison and provides a novel method for human-machine collaboration in this field.

Keywords: likelihood ratio, automated facial recognition, facial comparison, biometrics

Procedia PDF Downloads 125
2607 Investigating Activity Recognition Using 9-Axis Sensors and Filters in Wearable Devices

Authors: Jun Gil Ahn, Jong Kang Park, Jong Tae Kim

Abstract:

In this paper, we analyze the major components of activity recognition (AR) in wearable devices with 9-axis sensors and sensor fusion filters. 9-axis sensors commonly include a 3-axis accelerometer, a 3-axis gyroscope and a 3-axis magnetometer. We chose the Kalman filter and the Direction Cosine Matrix (DCM) filter as sensor fusion filters. We also construct sensor fusion data from each activity's sensor data and evaluate the classification accuracy of AR using Naïve Bayes and SVM. According to the classification results, we observed that the DCM filter and a specific combination of the sensing axes are more effective for AR in wearable devices when classifying walking, running, ascending and descending.

Keywords: accelerometer, activity recognition, direction cosine matrix filter, gyroscope, Kalman filter, magnetometer

Procedia PDF Downloads 329
2606 Facial Emotion Recognition with Convolutional Neural Network Based Architecture

Authors: Koray U. Erbas

Abstract:

Neural networks are appealing for many applications since they are able to learn complex non-linear relationships between input and output data. As the number of neurons and layers in a neural network increases, it is possible to represent more complex relationships with automatically extracted features. Nowadays, Deep Neural Networks (DNNs) are widely used in Computer Vision problems such as classification, object detection, segmentation and image editing. In this work, the Facial Emotion Recognition task is performed by a proposed Convolutional Neural Network (CNN)-based DNN architecture using the FER2013 dataset. Moreover, the effects of different hyperparameters (activation function, kernel size, initializer, batch size and network size) are investigated, and ablation study results for the Pooling Layer, Dropout and Batch Normalization are presented.

Keywords: convolutional neural network, deep learning, deep learning based FER, facial emotion recognition

Procedia PDF Downloads 267
2605 Random Subspace Neural Classifier for Meteor Recognition in the Night Sky

Authors: Carlos Vera, Tetyana Baydyk, Ernst Kussul, Graciela Velasco, Miguel Aparicio

Abstract:

This article describes the Random Subspace Neural Classifier (RSC) for the recognition of meteors in the night sky. We used images of meteors entering the atmosphere at night between 8:00 p.m. and 5:00 a.m. The objective of this project is to classify meteor and star images (with stars as the image background). The monitoring of the sky and the classification of meteors are made for future applications by scientists. The image database was collected from different websites. We worked with RGB-type images with dimensions of 220x220 pixels stored in the bitmap (BMP) format. Subsequent window scanning and processing were carried out for each image. The scan window from which the characteristics were extracted had a size of 20x20 pixels with a scanning step size of 10 pixels. Brightness, contrast and contour orientation histograms were used as inputs for the RSC. The RSC worked with two classes and classified images as 1) with meteors or 2) without meteors. Different tests were carried out by varying the number of training cycles and the number of images for training and recognition. The percentage error for the neural classifier was calculated. The results show a good RSC classifier response with 89% correct recognition. The results of these experiments are presented and discussed.
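
The window scanning described above (a 20x20 window moved in steps of 10 pixels over a 220x220 image) can be sketched in Python as follows. This is only an illustration under stated assumptions: the image is treated as a single grayscale channel rather than RGB, contrast is taken as the maximum minus the minimum pixel value, and the contour orientation histogram is omitted. The function names are hypothetical and do not come from the paper.

```python
import numpy as np

def scan_windows(image, window=20, step=10):
    """Yield (row, col, patch) for every window position in a 2-D image."""
    height, width = image.shape[:2]
    for row in range(0, height - window + 1, step):
        for col in range(0, width - window + 1, step):
            yield row, col, image[row:row + window, col:col + window]

def window_features(patch):
    """Assumed per-window features: mean brightness and contrast (max - min)."""
    patch = patch.astype(float)
    return patch.mean(), patch.max() - patch.min()

image = np.zeros((220, 220), dtype=np.uint8)  # placeholder image, not real data
features = [window_features(patch) for _, _, patch in scan_windows(image)]
print(len(features))  # 21 window positions per axis -> 441 windows
```

With these sizes, the window starts at offsets 0, 10, ..., 200 along each axis, so each image yields 21 x 21 = 441 overlapping windows from which classifier inputs would be computed.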

Keywords: contour orientation histogram, meteors, night sky, RSC neural classifier, stars

Procedia PDF Downloads 133
2604 Automatic Processing of Trauma-Related Visual Stimuli in Female Patients Suffering From Post-Traumatic Stress Disorder after Interpersonal Traumatization

Authors: Theresa Slump, Paula Neumeister, Katharina Feldker, Carina Y. Heitmann, Thomas Straube

Abstract:

A characteristic feature of post-traumatic stress disorder (PTSD) is the automatic processing of disorder-specific stimuli that expresses itself in intrusive symptoms such as intense physical and psychological reactions to trauma-associated stimuli. That automatic processing plays an essential role in the development and maintenance of symptoms. The aim of our study was, therefore, to investigate the behavioral and neural correlates of automatic processing of trauma-related stimuli in PTSD. Although interpersonal traumatization is a form of traumatization that often occurs, it has not yet been sufficiently studied. That is why, in our study, we focused on patients suffering from interpersonal traumatization. While previous imaging studies on PTSD mainly used faces, words, or generally negative visual stimuli, our study presented complex trauma-related and neutral visual scenes. We examined 19 female subjects suffering from PTSD and examined 19 healthy women as a control group. All subjects did a geometric comparison task while lying in a functional-magnetic-resonance-imaging (fMRI) scanner. Trauma-related scenes and neutral visual scenes that were not relevant to the task were presented while the subjects were doing the task. Regarding the behavioral level, there were not any significant differences between the task performance of the two groups. Regarding the neural level, the PTSD patients showed significant hyperactivation of the hippocampus for task-irrelevant trauma-related stimuli versus neutral stimuli when compared with healthy control subjects. Connectivity analyses revealed altered connectivity between the hippocampus and other anxiety-related areas in PTSD patients, too. Overall, those findings suggest that fear-related areas are involved in PTSD patients' processing of trauma-related stimuli even if the stimuli that were used in the study were task-irrelevant.

Keywords: post-traumatic stress disorder, automatic processing, hippocampus, functional magnetic resonance imaging

Procedia PDF Downloads 194
2603 Emotion-Convolutional Neural Network for Perceiving Stress from Audio Signals: A Brain Chemistry Approach

Authors: Anup Anand Deshmukh, Catherine Soladie, Renaud Seguier

Abstract:

Emotion plays a key role in many applications, such as gathering patients’ emotional behavior in healthcare. Unlike typical ASR (Automated Speech Recognition) problems, which focus on 'what was said', it is equally important to understand 'how it was said.' Certain emotions are given more importance due to their effectiveness in understanding human feelings. In this paper, we propose an approach that models human stress from audio signals. The research challenge in speech emotion detection is finding the appropriate set of acoustic features corresponding to an emotion. Another difficulty lies in defining the very meaning of emotion and being able to categorize it in a precise manner. Supervised Machine Learning models, including state-of-the-art Deep Learning classification methods, rely on the availability of clean and labelled data. One of the problems in affective computation is the limited amount of annotated data. Existing labelled emotion datasets are highly dependent on the subjective perception of the annotator. We address the first issue of feature selection by using traditional MFCC (Mel-Frequency Cepstral Coefficients) features in a Convolutional Neural Network. Our proposed Emo-CNN (Emotion-CNN) architecture treats speech representations in a manner similar to how CNNs treat images in a vision problem. Our experiments show that Emo-CNN consistently and significantly outperforms the popular existing methods over multiple datasets. It achieves 90.2% categorical accuracy on the Emo-DB dataset. We claim that Emo-CNN is robust to speaker variations and environmental distortions. The proposed approach achieves 85.5% speaker-dependent categorical accuracy on the SAVEE (Surrey Audio-Visual Expressed Emotion) dataset, beating the existing CNN-based approach by 10.2%. To tackle the second problem of subjectivity in stress labels, we use Lovheim’s cube, which is a 3-dimensional projection of emotions.
Monoamine neurotransmitters are chemical messengers in the brain that transmit signals on perceiving emotions. The cube aims at explaining the relationship between these neurotransmitters and the positions of emotions in 3D space. The learnt emotion representations from the Emo-CNN are mapped to the cube using three-component PCA (Principal Component Analysis), and the result is then used to model human stress. This proposed approach not only circumvents the need for labelled stress data but also complies with the psychological theory of emotions given by Lovheim’s cube. We believe that this work is the first step towards creating a connection between Artificial Intelligence and the chemistry of human emotions.
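
A three-component PCA projection of learned representations can be sketched in Python with a singular value decomposition, as below. The random 16-dimensional embeddings are only a stand-in for Emo-CNN features, and the function name is hypothetical. How the resulting three axes are aligned with the neurotransmitter axes of Lovheim’s cube is not described in the abstract and is not attempted here.

```python
import numpy as np

def pca_project(features, n_components=3):
    """Project the rows of `features` onto their top principal components."""
    features = np.asarray(features, dtype=float)
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return centered @ vt[:n_components].T, explained[:n_components]

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(50, 16))  # synthetic stand-in, not real features
points_3d, explained = pca_project(embeddings)
print(points_3d.shape)  # (50, 3)
```

Each row of `points_3d` is a point in a 3D space, and `explained` gives the fraction of variance retained by each of the three components.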

Keywords: deep learning, brain chemistry, emotion perception, Lovheim's cube

Procedia PDF Downloads 149
2602 Developing Communicative Skills in Foreign Languages by Video Tasks

Authors: Ekaterina G. Lipatova

Abstract:

The developmental potential of a video task in teaching foreign languages lies in the opportunity to improve four aspects of the speech production process: listening, reading, speaking and writing. A video presents a sequence of actions through logically connected pictures and a verbalized speech flow, which simplifies and stimulates the process of perception. In this way, students' listening skills are developed effectively, as are intellectual abilities such as synthesizing, analyzing and generalizing information. In terms of teaching capacity, a video task, in our opinion, is more stimulating than a traditional listening exercise, since it involves the student in the plot and emotional background of the communicative situation and can prompt them to react to the gist in cognitive and communicative ways. To be an effective teaching method, the video task should be structured according to the psycholinguistic characteristics of the speech production process; in other words, it should include three phases: before-watching, while-watching and after-watching. The tasks provided for each phase might involve reflecting on the video content in the form of gap-filling, multiple-choice and true-or-false tasks (reading skills), as well as exercises on expressing an opinion and completing a project (writing and speaking skills). In the before-watching phase, we ask the students to attune their perception mechanism to the topic and the problem of the chosen video with questions such as “what do you know about such a problem?”, “is it new for you?”, “have you ever faced the situation of…?”. Then we proceed with the lexical and grammatical analysis of the language units that form the body of the speech sample, to ease perception and develop the student’s lexicon.
The goal of the while-watching phase is to build the student’s awareness of the problem presented in the video and to challenge their attitude towards what they have seen, by having them identify mistakes in statements about the video content or write a summary that justifies their understanding. Finally, we move on to developing their speech skills within the communicative situation they observed and learnt, by encouraging them to look for similar ideas in their own backgrounds and present them orally or in written form, or to express their own opinion on the problem. It must be emphasized that a video task should feature a topical, valid and interesting event related to the student's future profession, since this helps to activate the cognitive, emotional, verbal and ethical capacities of students. Also, logically structured video tasks are easily integrated into an e-learning system and give students the opportunity to work with the foreign language on their own.

Keywords: communicative situation, perception mechanism, speech production process, speech skills

Procedia PDF Downloads 243
2601 Human Action Recognition Using Variational Bayesian HMM with Dirichlet Process Mixture of Gaussian Wishart Emission Model

Authors: Wanhyun Cho, Soonja Kang, Sangkyoon Kim, Soonyoung Park

Abstract:

In this paper, we present a human action recognition method using the variational Bayesian HMM with the Dirichlet process mixture (DPM) of the Gaussian-Wishart emission model (GWEM). First, we define the Bayesian HMM based on the Dirichlet process, which allows an infinite number of Gaussian-Wishart components to support continuous emission observations. Second, we consider an efficient variational Bayesian inference method that can be applied to derive the posterior distribution of hidden variables and model parameters for the proposed model based on training data. We then derive the predictive distribution that may be used to classify a new action. Third, the paper proposes a process for extracting appropriate spatial-temporal feature vectors that can be used to recognize a wide range of human behaviors from input video images. Finally, we conducted experiments to evaluate the performance of the proposed method. The experimental results show that the presented method is more efficient at human action recognition than existing methods.

Keywords: human action recognition, Bayesian HMM, Dirichlet process mixture model, Gaussian-Wishart emission model, Variational Bayesian inference, prior distribution and approximate posterior distribution, KTH dataset

Procedia PDF Downloads 348
2600 Analyzing Speech Acts in Reddit Posts of Formerly Incarcerated Youths

Authors: Yusra Ibrahim

Abstract:

This study explores the online discourse of justice-involved youth on Reddit, focusing on how anonymity and asynchronicity influence their ability to share and reflect on their incarceration experiences within the "Ask Me Anything" (AMA) community. The study utilizes a quantitative analysis of speech acts to examine the varied communication patterns exhibited by youths and commenters across two AMA threads. The results indicate that, although Reddit is not specifically designed for formerly incarcerated youths, its features provide a supportive environment for them to share their incarceration experiences with non-incarcerated individuals. The level of empathy and support from the audience varies based on the audience’s perspectives on incarceration and related traumatic experiences. Additionally, the study identifies a reciprocal relationship where youths benefit from community support while offering insights into the juvenile justice system and helping the audience understand the experience of incarceration. The study also reveals cultural shocks in physical and digital environments that youth experience after release and when using social media platforms and the internet. The study has implications for juvenile justice personnel, policymakers, and researchers in the juvenile justice system.

Keywords: juvenile justice, online discourse, reddit AMA, anonymity, speech acts taxonomy, reintegration, online community support

Procedia PDF Downloads 36
2599 Leadership Effectiveness Compared among Three Cultures Using Voice Pitches

Authors: Asena Biber, Ates Gul Ergun, Seda Bulut

Abstract:

A large number of studies in the literature investigate the relationship between culture and leadership effectiveness. Although giving effective speeches is a vital characteristic for a leader to be perceived as effective, to our knowledge, no research has studied the determinants of perceived effective leader speech. The aim of this study is to find the effects of both culture and voice pitch on perceptions of a leader's speech effectiveness. Our hypothesis is that people from high power distance countries will perceive a leader's speech as effective when the leader's voice pitch is high, compared with people from relatively low power distance countries. The participants of the study were 36 undergraduate students (12 Pakistanis, 12 Nigerians, and 12 Turks) studying in Turkey. In national power distance scores, Nigerians ranked first, Turks second and Pakistanis third. There are two independent variables in this study: nationality, with three groups representing three levels of power distance, and the leader's voice pitch, which was manipulated at high and low levels. The researchers prepared an audio recording to manipulate the high and low voice pitch conditions. A professional whose native language is English read the predetermined speech in the high and low voice pitch conditions. Voice pitch was measured in hertz (Hz) and decibels (dB). Each nationality group (Pakistani, Nigerian, and Turkish) was divided into groups of six students who listened to either the low or the high pitch condition in the cubicles of the laboratory. Participants were expected to listen to the audio and fill in a questionnaire measuring leadership effectiveness on a response scale ranging from 1 to 5. To determine the effects of nationality and voice pitch on the perceived effectiveness of the leader's voice pitch, a 3 (Pakistani, Nigerian, and Turk) x 2 (low voice pitch and high voice pitch) two-way between-subjects analysis of variance was carried out.
The results indicated that there was no significant main effect of voice pitch and no significant interaction effect on the perceived effectiveness of the leader’s voice pitch. However, there was a significant main effect of nationality on the perceived effectiveness of the leader's voice pitch. Based on the results of Tukey’s HSD post-hoc test, only the difference in the perceived effectiveness of the leader's speech between Pakistanis and Nigerians was statistically significant. The results show that the hypothesis of this study was not supported. As a limitation of the study, it is important to mention that the sample size should be larger. Also, the questionnaire and the speech should be in the participants’ native language in further studies.
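
For readers unfamiliar with the 3 x 2 between-subjects design described above, the Python sketch below shows how the sums of squares and degrees of freedom of a balanced two-way analysis of variance are partitioned. The ratings are randomly generated placeholders and are not the study's data, and the function names are hypothetical. With three nationality groups, two pitch conditions and six participants per cell, the degrees of freedom are 2, 1, 2 and 30 for nationality, pitch, interaction and error.

```python
import numpy as np

def two_way_anova_ss(data):
    """Sums of squares and degrees of freedom for a balanced two-way design.

    `data` has shape (a, b, n): a levels of factor A, b levels of factor B,
    and n observations in every cell.
    """
    data = np.asarray(data, dtype=float)
    a, b, n = data.shape
    grand = data.mean()
    mean_a = data.mean(axis=(1, 2))  # marginal means of factor A
    mean_b = data.mean(axis=(0, 2))  # marginal means of factor B
    cell = data.mean(axis=2)         # cell means
    ss = {
        "A": b * n * np.sum((mean_a - grand) ** 2),
        "B": a * n * np.sum((mean_b - grand) ** 2),
        "AxB": n * np.sum((cell - mean_a[:, None] - mean_b[None, :] + grand) ** 2),
        "error": np.sum((data - cell[:, :, None]) ** 2),
    }
    df = {"A": a - 1, "B": b - 1, "AxB": (a - 1) * (b - 1), "error": a * b * (n - 1)}
    return ss, df

def f_ratios(ss, df):
    """F statistic of each effect: its mean square over the error mean square."""
    ms_error = ss["error"] / df["error"]
    return {k: (ss[k] / df[k]) / ms_error for k in ("A", "B", "AxB")}

rng = np.random.default_rng(0)
ratings = rng.integers(1, 6, size=(3, 2, 6))  # synthetic 1-5 ratings, not study data
ss, df = two_way_anova_ss(ratings)
print(df)
```

Each F ratio from `f_ratios` would then be compared against the F distribution with the corresponding effect and error degrees of freedom to obtain a p-value.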

Keywords: culture, leadership effectiveness, power distance, voice pitch

Procedia PDF Downloads 180