Search results for: spoken word recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2494

Search results for: spoken word recognition

2104 IMPERTIO: An Efficient Communication Interface for Cerebral Palsy Patients

Authors: M. Zaïgouche, A. Kouvahe, F. Stefanelli

Abstract:

IMPERTIO is a high technology based project aiming at offering efficient assistance help in communication for persons affected by Cerebral Palsy. The systems currently available are hardly used by these patients who are not satisfied by ergonomics and response time. The project rests upon the concept that, opposite to usual master-slave communication giving power to the entity with larger range of possibilities, providing conversely the mastery to the entity with smaller range of possibilities will allow a better understanding ground for both parties. Entirely customizable, the application developed from this idea gives full freedom to the user. Through pictograms (one button linked to a word or a sentence) and adapted keyboard, noticeable improvements are brought to the response time and ease to use ergonomics.

Keywords: cerebral palsy, master-slave relation, communication interface, virtual keyboard, word construction algorithm

Procedia PDF Downloads 377
2103 Inclusive Cultural Heritage Tourism Project

Authors: L. Cruz-Lopes, M. Sell, P. Escudeiro, B. Esteves

Abstract:

It might be difficult for deaf people to communicate since spoken and written languages are different from sign language. When it comes to getting information, going to places of cultural heritage, or using services and infrastructure, there is a clear lack of inclusiveness. By creating assistive technology that enables deaf individuals to get around communication hurdles and encourage inclusive tourism, the ICHT- Inclusive Cultural Heritage Tourism initiative hopes to increase knowledge of sign language. The purpose of the Inclusive Cultural Heritage Tourism (ICHT) project is to develop online and on-site sign language tools and material for usage at popular tourist destinations in the northern region of Portugal, including Torre dos Clérigos, the Lello bookstore, Maia Zoo, Porto wine cellars, and São Pedro do Sul (Viseu) thermae. The ICHT system consists of an application using holography, a mobile game, an online platform for collaboration with deaf and hearing users, and a collection of International Sign training courses. The project also offers a prospect for a more inclusive society by introducing a method of teaching sign languages to tourism industry professionals. As a result, the teaching and learning of sign language along with the assistive technology tools created by the project sets up an inclusive environment for the deaf community, producing results in the area of automatic sign language translation and aiding in the global recognition of the Portuguese tourism industry.

Keywords: inclusive tourism, games, international sign training, deaf community

Procedia PDF Downloads 89
2102 Humanitarian Emergency of the Refugee Condition for Central American Immigrants in Irregular Situation

Authors: María de los Ángeles Cerda González, Itzel Arriaga Hurtado, Pascacio José Martínez Pichardo

Abstract:

In México, the recognition of refugee condition is a fundamental right which, as host State, has the obligation of respect, protect, and fulfill to the foreigners – where we can find the figure of immigrants in irregular situation-, that cannot return to their country of origin for humanitarian reasons. The recognition of the refugee condition as a fundamental right in the Mexican law system proceeds under these situations: 1. The immigrant applies for the refugee condition, even without the necessary proving elements to accredit the humanitarian character of his departure from his country of origin. 2. The immigrant does not apply for the recognition of refugee because he does not know he has the right to, even if he has the profile to apply for. 3. The immigrant who applies fulfills the requirements of the administrative procedure and has access to the refugee recognition. Of the three situations above, only the last one is contemplated for the national indexes of the status refugee; and the first two prove the inefficiency of the governmental system viewed from its lack of sensibility consequence of the no education in human rights matter and which results in the legal vulnerability of the immigrants in irregular situation because they do not have access to the procuration and administration of justice. In the aim of determining the causes and consequences of the no recognition of the refugee status, this investigation was structured from a systemic analysis which objective is to show the advances in Central American humanitarian emergency investigation, the Mexican States actions to protect, respect and fulfil the fundamental right of refugee of immigrants in irregular situation and the social and legal vulnerabilities suffered by Central Americans in Mexico. Therefore, to achieve the deduction of the legal nature of the humanitarian emergency from the Human Rights as a branch of the International Public Law, a conceptual framework is structured using the inductive deductive method. The problem statement is made from a legal framework to approach a theoretical scheme under the theory of social systems, from the analysis of the lack of communication of the governmental and normative subsystems of the Mexican legal system relative to the process undertaken by the Central American immigrants to achieve the recognition of the refugee status as a human right. Accordingly, is determined that fulfilling the obligations of the State referent to grant the right of the recognition of the refugee condition, would mean a guideline for a new stage in Mexican Law, because it would enlarge the constitutional benefits to everyone whose right to the recognition of refugee has been denied an as consequence, a great advance in human rights matter would be achieved.

Keywords: central American immigrants in irregular situation, humanitarian emergency, human rights, refugee

Procedia PDF Downloads 267
2101 VIAN-DH: Computational Multimodal Conversation Analysis Software and Infrastructure

Authors: Teodora Vukovic, Christoph Hottiger, Noah Bubenhofer

Abstract:

The development of VIAN-DH aims at bridging two linguistic approaches: conversation analysis/interactional linguistics (IL), so far a dominantly qualitative field, and computational/corpus linguistics and its quantitative and automated methods. Contemporary IL investigates the systematic organization of conversations and interactions composed of speech, gaze, gestures, and body positioning, among others. These highly integrated multimodal behaviour is analysed based on video data aimed at uncovering so called “multimodal gestalts”, patterns of linguistic and embodied conduct that reoccur in specific sequential positions employed for specific purposes. Multimodal analyses (and other disciplines using videos) are so far dependent on time and resource intensive processes of manual transcription of each component from video materials. Automating these tasks requires advanced programming skills, which is often not in the scope of IL. Moreover, the use of different tools makes the integration and analysis of different formats challenging. Consequently, IL research often deals with relatively small samples of annotated data which are suitable for qualitative analysis but not enough for making generalized empirical claims derived quantitatively. VIAN-DH aims to create a workspace where many annotation layers required for the multimodal analysis of videos can be created, processed, and correlated in one platform. VIAN-DH will provide a graphical interface that operates state-of-the-art tools for automating parts of the data processing. The integration of tools that already exist in computational linguistics and computer vision, facilitates data processing for researchers lacking programming skills, speeds up the overall research process, and enables the processing of large amounts of data. The main features to be introduced are automatic speech recognition for the transcription of language, automatic image recognition for extraction of gestures and other visual cues, as well as grammatical annotation for adding morphological and syntactic information to the verbal content. In the ongoing instance of VIAN-DH, we focus on gesture extraction (pointing gestures, in particular), making use of existing models created for sign language and adapting them for this specific purpose. In order to view and search the data, VIAN-DH will provide a unified format and enable the import of the main existing formats of annotated video data and the export to other formats used in the field, while integrating different data source formats in a way that they can be combined in research. VIAN-DH will adapt querying methods from corpus linguistics to enable parallel search of many annotation levels, combining token-level and chronological search for various types of data. VIAN-DH strives to bring crucial and potentially revolutionary innovation to the field of IL, (that can also extend to other fields using video materials). It will allow the processing of large amounts of data automatically and, the implementation of quantitative analyses, combining it with the qualitative approach. It will facilitate the investigation of correlations between linguistic patterns (lexical or grammatical) with conversational aspects (turn-taking or gestures). Users will be able to automatically transcribe and annotate visual, spoken and grammatical information from videos, and to correlate those different levels and perform queries and analyses.

Keywords: multimodal analysis, corpus linguistics, computational linguistics, image recognition, speech recognition

Procedia PDF Downloads 81
2100 Hand Symbol Recognition Using Canny Edge Algorithm and Convolutional Neural Network

Authors: Harshit Mittal, Neeraj Garg

Abstract:

Hand symbol recognition is a pivotal component in the domain of computer vision, with far-reaching applications spanning sign language interpretation, human-computer interaction, and accessibility. This research paper discusses the approach with the integration of the Canny Edge algorithm and convolutional neural network. The significance of this study lies in its potential to enhance communication and accessibility for individuals with hearing impairments or those engaged in gesture-based interactions with technology. In the experiment mentioned, the data is manually collected by the authors from the webcam using Python codes, to increase the dataset augmentation, is applied to original images, which makes the model more compatible and advanced. Further, the dataset of about 6000 coloured images distributed equally in 5 classes (i.e., 1, 2, 3, 4, 5) are pre-processed first to gray images and then by the Canny Edge algorithm with threshold 1 and 2 as 150 each. After successful data building, this data is trained on the Convolutional Neural Network model, giving accuracy: 0.97834, precision: 0.97841, recall: 0.9783, and F1 score: 0.97832. For user purposes, a block of codes is built in Python to enable a window for hand symbol recognition. This research, at its core, seeks to advance the field of computer vision by providing an advanced perspective on hand sign recognition. By leveraging the capabilities of the Canny Edge algorithm and convolutional neural network, this study contributes to the ongoing efforts to create more accurate, efficient, and accessible solutions for individuals with diverse communication needs.

Keywords: hand symbol recognition, computer vision, Canny edge algorithm, convolutional neural network

Procedia PDF Downloads 37
2099 Multimodal Database of Emotional Speech, Video and Gestures

Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari

Abstract:

People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several expression forms can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpora contains recordings registered in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data is labeled with six basic emotions categories, according to Ekman’s emotion categories. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to academic community and might be useful in the study on audio-visual emotion recognition.

Keywords: body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech

Procedia PDF Downloads 328
2098 An Approach for Vocal Register Recognition Based on Spectral Analysis of Singing

Authors: Aleksandra Zysk, Pawel Badura

Abstract:

Recognizing and controlling vocal registers during singing is a difficult task for beginner vocalist. It requires among others identifying which part of natural resonators is being used when a sound propagates through the body. Thus, an application has been designed allowing for sound recording, automatic vocal register recognition (VRR), and a graphical user interface providing real-time visualization of the signal and recognition results. Six spectral features are determined for each time frame and passed to the support vector machine classifier yielding a binary decision on the head or chest register assignment of the segment. The classification training and testing data have been recorded by ten professional female singers (soprano, aged 19-29) performing sounds for both chest and head register. The classification accuracy exceeded 93% in each of various validation schemes. Apart from a hard two-class clustering, the support vector classifier returns also information on the distance between particular feature vector and the discrimination hyperplane in a feature space. Such an information reflects the level of certainty of the vocal register classification in a fuzzy way. Thus, the designed recognition and training application is able to assess and visualize the continuous trend in singing in a user-friendly graphical mode providing an easy way to control the vocal emission.

Keywords: classification, singing, spectral analysis, vocal emission, vocal register

Procedia PDF Downloads 282
2097 ViraPart: A Text Refinement Framework for Automatic Speech Recognition and Natural Language Processing Tasks in Persian

Authors: Narges Farokhshad, Milad Molazadeh, Saman Jamalabbasi, Hamed Babaei Giglou, Saeed Bibak

Abstract:

The Persian language is an inflectional subject-object-verb language. This fact makes Persian a more uncertain language. However, using techniques such as Zero-Width Non-Joiner (ZWNJ) recognition, punctuation restoration, and Persian Ezafe construction will lead us to a more understandable and precise language. In most of the works in Persian, these techniques are addressed individually. Despite that, we believe that for text refinement in Persian, all of these tasks are necessary. In this work, we proposed a ViraPart framework that uses embedded ParsBERT in its core for text clarifications. First, used the BERT variant for Persian followed by a classifier layer for classification procedures. Next, we combined models outputs to output cleartext. In the end, the proposed model for ZWNJ recognition, punctuation restoration, and Persian Ezafe construction performs the averaged F1 macro scores of 96.90%, 92.13%, and 98.50%, respectively. Experimental results show that our proposed approach is very effective in text refinement for the Persian language.

Keywords: Persian Ezafe, punctuation, ZWNJ, NLP, ParsBERT, transformers

Procedia PDF Downloads 181
2096 Cross Attention Fusion for Dual-Stream Speech Emotion Recognition

Authors: Shaode Yu, Jiajian Meng, Bing Zhu, Hang Yu, Qiurui Sun

Abstract:

Speech emotion recognition (SER) is for recognizing human subjective emotions through audio data in-depth analysis. From speech audios, how to comprehensively extract emotional information and how to effectively fuse extracted features remain challenging. This paper presents a dual-stream SER framework that embraces both full training and transfer learning of different networks for thorough feature encoding. Besides, a plug-and-play cross-attention fusion (CAF) module is implemented for the valid integration of the dual-stream encoder output. The effectiveness of the proposed CAF module is compared to the other three fusion modules (feature summation, feature concatenation, and feature-wise linear modulation) on two databases (RAVDESS and IEMO-CAP) using different dual-stream encoders (full training network, DPCNN or TextRCNN; transfer learning network, HuBERT or Wav2Vec2). Experimental results suggest that the CAF module can effectively reconcile conflicts between features from different encoders and outperform the other three feature fusion modules on the SER task. In the future, the plug-and-play CAF module can be extended for multi-branch feature fusion, and the dual-stream SER framework can be widened for multi-stream data representation to improve the recognition performance and generalization capacity.

Keywords: speech emotion recognition, cross-attention fusion, dual-stream, pre-trained

Procedia PDF Downloads 44
2095 Algorithm for Path Recognition in-between Tree Rows for Agricultural Wheeled-Mobile Robots

Authors: Anderson Rocha, Pedro Miguel de Figueiredo Dinis Oliveira Gaspar

Abstract:

Machine vision has been widely used in recent years in agriculture, as a tool to promote the automation of processes and increase the levels of productivity. The aim of this work is the development of a path recognition algorithm based on image processing to guide a terrestrial robot in-between tree rows. The proposed algorithm was developed using the software MATLAB, and it uses several image processing operations, such as threshold detection, morphological erosion, histogram equalization and the Hough transform, to find edge lines along tree rows on an image and to create a path to be followed by a mobile robot. To develop the algorithm, a set of images of different types of orchards was used, which made possible the construction of a method capable of identifying paths between trees of different heights and aspects. The algorithm was evaluated using several images with different characteristics of quality and the results showed that the proposed method can successfully detect a path in different types of environments.

Keywords: agricultural mobile robot, image processing, path recognition, hough transform

Procedia PDF Downloads 123
2094 Deep Learning Application for Object Image Recognition and Robot Automatic Grasping

Authors: Shiuh-Jer Huang, Chen-Zon Yan, C. K. Huang, Chun-Chien Ting

Abstract:

Since the vision system application in industrial environment for autonomous purposes is required intensely, the image recognition technique becomes an important research topic. Here, deep learning algorithm is employed in image system to recognize the industrial object and integrate with a 7A6 Series Manipulator for object automatic gripping task. PC and Graphic Processing Unit (GPU) are chosen to construct the 3D Vision Recognition System. Depth Camera (Intel RealSense SR300) is employed to extract the image for object recognition and coordinate derivation. The YOLOv2 scheme is adopted in Convolution neural network (CNN) structure for object classification and center point prediction. Additionally, image processing strategy is used to find the object contour for calculating the object orientation angle. Then, the specified object location and orientation information are sent to robotic controller. Finally, a six-axis manipulator can grasp the specific object in a random environment based on the user command and the extracted image information. The experimental results show that YOLOv2 has been successfully employed to detect the object location and category with confidence near 0.9 and 3D position error less than 0.4 mm. It is useful for future intelligent robotic application in industrial 4.0 environment.

Keywords: deep learning, image processing, convolution neural network, YOLOv2, 7A6 series manipulator

Procedia PDF Downloads 216
2093 Text Localization in Fixed-Layout Documents Using Convolutional Networks in a Coarse-to-Fine Manner

Authors: Beier Zhu, Rui Zhang, Qi Song

Abstract:

Text contained within fixed-layout documents can be of great semantic value and so requires a high localization accuracy, such as ID cards, invoices, cheques, and passports. Recently, algorithms based on deep convolutional networks achieve high performance on text detection tasks. However, for text localization in fixed-layout documents, such algorithms detect word bounding boxes individually, which ignores the layout information. This paper presents a novel architecture built on convolutional neural networks (CNNs). A global text localization network and a regional bounding-box regression network are introduced to tackle the problem in a coarse-to-fine manner. The text localization network simultaneously locates word bounding points, which takes the layout information into account. The bounding-box regression network inputs the features pooled from arbitrarily sized RoIs and refine the localizations. These two networks share their convolutional features and are trained jointly. A typical type of fixed-layout documents: ID cards, is selected to evaluate the effectiveness of the proposed system. These networks are trained on data cropped from nature scene images, and synthetic data produced by a synthetic text generation engine. Experiments show that our approach locates high accuracy word bounding boxes and achieves state-of-the-art performance.

Keywords: bounding box regression, convolutional networks, fixed-layout documents, text localization

Procedia PDF Downloads 169
2092 A Grey-Box Text Attack Framework Using Explainable AI

Authors: Esther Chiramal, Kelvin Soh Boon Kai

Abstract:

Explainable AI is a strong strategy implemented to understand complex black-box model predictions in a human-interpretable language. It provides the evidence required to execute the use of trustworthy and reliable AI systems. On the other hand, however, it also opens the door to locating possible vulnerabilities in an AI model. Traditional adversarial text attack uses word substitution, data augmentation techniques, and gradient-based attacks on powerful pre-trained Bidirectional Encoder Representations from Transformers (BERT) variants to generate adversarial sentences. These attacks are generally white-box in nature and not practical as they can be easily detected by humans e.g., Changing the word from “Poor” to “Rich”. We proposed a simple yet effective Grey-box cum Black-box approach that does not require the knowledge of the model while using a set of surrogate Transformer/BERT models to perform the attack using Explainable AI techniques. As Transformers are the current state-of-the-art models for almost all Natural Language Processing (NLP) tasks, an attack generated from BERT1 is transferable to BERT2. This transferability is made possible due to the attention mechanism in the transformer that allows the model to capture long-range dependencies in a sequence. Using the power of BERT generalisation via attention, we attempt to exploit how transformers learn by attacking a few surrogate transformer variants which are all based on a different architecture. We demonstrate that this approach is highly effective to generate semantically good sentences by changing as little as one word that is not detectable by humans while still fooling other BERT models.

Keywords: BERT, explainable AI, Grey-box text attack, transformer

Procedia PDF Downloads 116
2091 'Value-Based Re-Framing' in Identity-Based Conflicts: A Skill for Mediators in Multi-Cultural Societies

Authors: Hami-Ziniman Revital, Ashwall Rachelly

Abstract:

The conflict resolution realm has developed tremendously during the last half-decade. Three main approaches should be mentioned: an Alternative Dispute Resolution (ADR) suggesting processes such as Arbitration or Interests-based Negotiation was developed as an answer to obligations and rights-based conflicts. The Pragmatic mediation approach focuses on the gap between interests and needs of disputants. The Transformative mediation approach focusses on relations and suits identity-based conflicts. In the current study, we examine the conflictual relations between religious and non-religious Jews in Israel and the impact of three transformative mechanisms: Inter-group recognition, In-group empowerment and Value-based reframing on the relations between the participants. The research was conducted during four facilitated joint mediation classes. A unique finding was found. Using both transformative mechanisms and the Contact Hypothesis criteria, we identify transformation in participants’ relations and a considerable change from anger, alienation, and suspiciousness to an increased understanding, affection and interpersonal concern towards the out-group members. Intergroup Recognition, In-group empowerment, and Values-based reframing were the skills discovered as the main enablers of the change in the relations and the research participants’ fostered mutual recognition of the out-group values and identity-based issues. We conclude this transformation was possible due to a constant intergroup contact, based on the Contact Hypothesis criteria. In addition, as Interests-based mediation uses “Reframing” as a skill to acknowledge both mutual and opposite needs of the disputants, we suggest the use of “Value-based Reframing” in intergroup identity-based conflicts, as a skill contributes to the empowerment and the recognition of both mutual and different out-group values. We offer to implement those insights and skills to assist conflict resolution facilitators in various intergroup identity-based conflicts resolution efforts and to establish further research and knowledge.

Keywords: empowerment, identity-based conflict, intergroup recognition, intergroup relations, mediation skills, multi-cultural society, reframing, value-based recognition

Procedia PDF Downloads 317
2090 Facial Recognition Technology in Institutions of Higher Learning: Exploring the Use in Kenya

Authors: Samuel Mwangi, Josephine K. Mule

Abstract:

Access control as a security technique regulates who or what can access resources. It is a fundamental concept in security that minimizes risks to the institutions that use access control. Regulating access to institutions of higher learning is key to ensure only authorized personnel and students are allowed into the institutions. The use of biometrics has been criticized due to the setup and maintenance costs, hygiene concerns, and trepidations regarding data privacy, among other apprehensions. Facial recognition is arguably a fast and accurate way of validating identity in order to guard protected areas. It guarantees that only authorized individuals gain access to secure locations while requiring far less personal information whilst providing an additional layer of security beyond keys, fobs, or identity cards. This exploratory study sought to investigate the use of facial recognition in controlling access in institutions of higher learning in Kenya. The sample population was drawn from both private and public higher learning institutions. The data is based on responses from staff and students. Questionnaires were used for data collection and follow up interviews conducted to understand responses from the questionnaires. 80% of the sampled population indicated that there were many security breaches by unauthorized people, with some resulting in terror attacks. These security breaches were attributed to stolen identity cases, where staff or student identity cards were stolen and used by criminals to access the institutions. These unauthorized accesses have resulted in losses to the institutions, including reputational damages. The findings indicate that security breaches are a major problem in institutions of higher learning in Kenya. Consequently, access control would be beneficial if employed to curb security breaches. We suggest the use of facial recognition technology, given its uniqueness in identifying users and its non-repudiation capabilities.

Keywords: facial recognition, access control, technology, learning

Procedia PDF Downloads 103
2089 The Power of Words: A Corpus Analysis of Campaign Speeches of President Donald J. Trump

Authors: Aiza Dalman

Abstract:

Words are powerful when these are used wisely and strategically. In this study, twelve (12) campaign speeches of President Donald J. Trump were analyzed as to frequently used words and ethos, pathos and logos being employed. The speeches were read thoroughly, analyzed and interpreted. With the use of Word Counter Tool and Text Analyzer software accessible online, it was found out that the word ‘will’ has the highest frequency of 121, followed by Hillary (58), American (38), going (35), plan and Clinton (32), illegal (30), government (28), corruption (26) and criminal (24). When the speeches were analyzed as to ethos, pathos and logos, on the other hand, it revealed that these were all employed in his speeches. The statements under these pointed out against Hillary or in his favor. The unique strategy of President Donald J. Trump as to frequently used words and ethos, pathos and logos in persuading people perhaps lead the way to his victory.

Keywords: campaign speeches, corpus analysis, ethos, logos and pathos, power of words

Procedia PDF Downloads 253
2088 Face Recognition Using Eigen Faces Algorithm

Authors: Shweta Pinjarkar, Shrutika Yawale, Mayuri Patil, Reshma Adagale

Abstract:

Face recognition is the technique which can be applied to the wide variety of problems like image and film processing, human computer interaction, criminal identification etc. This has motivated researchers to develop computational models to identify the faces, which are easy and simple to implement. In this, demonstrates the face recognition system in android device using eigenface. The system can be used as the base for the development of the recognition of human identity. Test images and training images are taken directly with the camera in android device.The test results showed that the system produces high accuracy. The goal is to implement model for particular face and distinguish it with large number of stored faces. face recognition system detects the faces in picture taken by web camera or digital camera and these images then checked with training images dataset based on descriptive features. Further this algorithm can be extended to recognize the facial expressions of a person.recognition could be carried out under widely varying conditions like frontal view,scaled frontal view subjects with spectacles. The algorithm models the real time varying lightning conditions. The implemented system is able to perform real-time face detection, face recognition and can give feedback giving a window with the subject's info from database and sending an e-mail notification to interested institutions using android application. Face recognition is the technique which can be applied to the wide variety of problems like image and film processing, human computer interaction, criminal identification etc. This has motivated researchers to develop computational models to identify the faces, which are easy and simple to implement. In this , demonstrates the face recognition system in android device using eigenface. The system can be used as the base for the development of the recognition of human identity. Test images and training images are taken directly with the camera in android device.The test results showed that the system produces high accuracy. The goal is to implement model for particular face and distinguish it with large number of stored faces. face recognition system detects the faces in picture taken by web camera or digital camera and these images then checked with training images dataset based on descriptive features. Further this algorithm can be extended to recognize the facial expressions of a person.recognition could be carried out under widely varying conditions like frontal view,scaled frontal view subjects with spectacles. The algorithm models the real time varying lightning conditions. The implemented system is able to perform real-time face detection, face recognition and can give feedback giving a window with the subject's info from database and sending an e-mail notification to interested institutions using android application.

Keywords: face detection, face recognition, eigen faces, algorithm

Procedia PDF Downloads 338
2087 Improving Low English Oral Skills of 5 Second-Year English Major Students at Debark University

Authors: Belyihun Muchie

Abstract:

This study investigates the low English oral communication skills of 5 second-year English major students at Debark University. It aims to identify the key factors contributing to their weaknesses and propose effective interventions to improve their spoken English proficiency. Mixed-methods research will be employed, utilizing observations, questionnaires, and semi-structured interviews to gather data from the participants. To clearly identify these factors, structured and informal observations will be employed; the former will be used to identify their fluency, pronunciation, vocabulary use, and grammar accuracy, and the later will be suited to observe the natural interactions and communication patterns of learners in the classroom setting. The questionnaires will assess their self-perceptions of their skills, perceived barriers to fluency, and preferred learning styles. Interviews will also delve deeper into their experiences and explore specific obstacles faced in oral communication. Data analysis will involve both quantitative and qualitative responses. The structured observation and questionnaire will be analyzed quantitatively, whereas the informal observation and interview transcripts will be analyzed thematically. Findings will be used to identify the major causes of low oral communication skills, such as limited vocabulary, grammatical errors, pronunciation difficulties, or lack of confidence. They are also helpful to develop targeted solutions addressing these causes, such as intensive pronunciation practice, conversation simulations, personalized feedback, or anxiety-reduction techniques. Finally, the findings will guide designing an intervention plan for implementation during the action research phase. The study's outcomes are expected to provide valuable insights into the challenges faced by English major students in developing oral communication skills, contribute to the development of evidence-based interventions for improving spoken English proficiency in similar contexts, and offer practical recommendations for English language instructors and curriculum developers to enhance student learning outcomes. By addressing the specific needs of these students and implementing tailored interventions, this research aims to bridge the gap between theoretical knowledge and practical speaking ability, equipping them with the confidence and skills to flourish in English communication settings.

Keywords: oral communication skills, mixed-methods, evidence-based interventions, spoken English proficiency

Procedia PDF Downloads 26
2086 Memory Retrieval and Implicit Prosody during Reading: Anaphora Resolution by L1 and L2 Speakers of English

Authors: Duong Thuy Nguyen, Giulia Bencini

Abstract:

The present study examined structural and prosodic factors on the computation of antecedent-reflexive relationships and sentence comprehension in native English (L1) and Vietnamese-English bilinguals (L2). Participants read sentences presented on the computer screen in one of three presentation formats aimed at manipulating prosodic parsing: word-by-word (RSVP), phrase-segment (self-paced), or whole-sentence (self-paced), then completed a grammaticality rating and a comprehension task (following Pratt & Fernandez, 2016). The design crossed three factors: syntactic structure (simple; complex), grammaticality (target-match; target-mismatch) and presentation format. An example item is provided in (1): (1) The actress that (Mary/John) interviewed at the awards ceremony (about two years ago/organized outside the theater) described (herself/himself) as an extreme workaholic). Results showed that overall, both L1 and L2 speakers made use of a good-enough processing strategy at the expense of more detailed syntactic analyses. L1 and L2 speakers’ comprehension and grammaticality judgements were negatively affected by the most prosodically disrupting condition (word-by-word). However, the two groups demonstrated differences in their performance in the other two reading conditions. For L1 speakers, the whole-sentence and the phrase-segment formats were both facilitative in the grammaticality rating and comprehension tasks; for L2, compared with the whole-sentence condition, the phrase-segment paradigm did not significantly improve accuracy or comprehension. These findings are consistent with the findings of Pratt & Fernandez (2016), who found a similar pattern of results in the processing of subject-verb agreement relations using the same experimental paradigm and prosodic manipulation with English L1 and L2 English-Spanish speakers. The results provide further support for a Good-Enough cue model of sentence processing that integrates cue-based retrieval and implicit prosodic parsing (Pratt & Fernandez, 2016) and highlights similarities and differences between L1 and L2 sentence processing and comprehension.

Keywords: anaphora resolution, bilingualism, implicit prosody, sentence processing

Procedia PDF Downloads 125
2085 Unsupervised Assistive and Adaptative Intelligent Agent in Smart Enviroment

Authors: Sebastião Pais, João Casal, Ricardo Ponciano, Sérgio Lorenço

Abstract:

The adaptation paradigm is a basic defining feature for pervasive computing systems. Adaptation systems must work efficiently in a smart environment while providing suitable information relevant to the user system interaction. The key objective is to deduce the information needed information changes. Therefore relying on fixed operational models would be inappropriate. This paper presents a study on developing an Intelligent Personal Assistant to assist the user in interacting with their Smart Environment. We propose an Unsupervised and Language-Independent Adaptation through Intelligent Speech Interface and a set of methods of Acquiring Knowledge, namely Semantic Similarity and Unsupervised Learning.

Keywords: intelligent personal assistants, intelligent speech interface, unsupervised learning, language-independent, knowledge acquisition, association measures, symmetric word similarities, attributional word similarities

Procedia PDF Downloads 531
2084 Unsupervised Assistive and Adaptive Intelligent Agent in Smart Environment

Authors: Sebastião Pais, João Casal, Ricardo Ponciano, Sérgio Lourenço

Abstract:

The adaptation paradigm is a basic defining feature for pervasive computing systems. Adaptation systems must work efficiently in smart environment while providing suitable information relevant to the user system interaction. The key objective is to deduce the information needed information changes. Therefore, relying on fixed operational models would be inappropriate. This paper presents a study on developing a Intelligent Personal Assistant to assist the user in interacting with their Smart Environment. We propose a Unsupervised and Language-Independent Adaptation through Intelligent Speech Interface and a set of methods of Acquiring Knowledge, namely Semantic Similarity and Unsupervised Learning.

Keywords: intelligent personal assistants, intelligent speech interface, unsupervised learning, language-independent, knowledge acquisition, association measures, symmetric word similarities, attributional word similarities

Procedia PDF Downloads 609
2083 Burnout Recognition for Call Center Agents by Using Skin Color Detection with Hand Poses

Authors: El Sayed A. Sharara, A. Tsuji, K. Terada

Abstract:

Call centers have been expanding and they have influence on activation in various markets increasingly. A call center’s work is known as one of the most demanding and stressful jobs. In this paper, we propose the fatigue detection system in order to detect burnout of call center agents in the case of a neck pain and upper back pain. Our proposed system is based on the computer vision technique combined skin color detection with the Viola-Jones object detector. To recognize the gesture of hand poses caused by stress sign, the YCbCr color space is used to detect the skin color region including face and hand poses around the area related to neck ache and upper back pain. A cascade of clarifiers by Viola-Jones is used for face recognition to extract from the skin color region. The detection of hand poses is given by the evaluation of neck pain and upper back pain by using skin color detection and face recognition method. The system performance is evaluated using two groups of dataset created in the laboratory to simulate call center environment. Our call center agent burnout detection system has been implemented by using a web camera and has been processed by MATLAB. From the experimental results, our system achieved 96.3% for upper back pain detection and 94.2% for neck pain detection.

Keywords: call center agents, fatigue, skin color detection, face recognition

Procedia PDF Downloads 271
2082 Development of an EEG-Based Real-Time Emotion Recognition System on Edge AI

Authors: James Rigor Camacho, Wansu Lim

Abstract:

Over the last few years, the development of new wearable and processing technologies has accelerated in order to harness physiological data such as electroencephalograms (EEGs) for EEG-based applications. EEG has been demonstrated to be a source of emotion recognition signals with the highest classification accuracy among physiological signals. However, when emotion recognition systems are used for real-time classification, the training unit is frequently left to run offline or in the cloud rather than working locally on the edge. That strategy has hampered research, and the full potential of using an edge AI device has yet to be realized. Edge AI devices are computers with high performance that can process complex algorithms. It is capable of collecting, processing, and storing data on its own. It can also analyze and apply complicated algorithms like localization, detection, and recognition on a real-time application, making it a powerful embedded device. The NVIDIA Jetson series, specifically the Jetson Nano device, was used in the implementation. The cEEGrid, which is integrated to the open-source brain computer-interface platform (OpenBCI), is used to collect EEG signals. An EEG-based real-time emotion recognition system on Edge AI is proposed in this paper. To perform graphical spectrogram categorization of EEG signals and to predict emotional states based on input data properties, machine learning-based classifiers were used. Until the emotional state was identified, the EEG signals were analyzed using the K-Nearest Neighbor (KNN) technique, which is a supervised learning system. In EEG signal processing, after each EEG signal has been received in real-time and translated from time to frequency domain, the Fast Fourier Transform (FFT) technique is utilized to observe the frequency bands in each EEG signal. To appropriately show the variance of each EEG frequency band, power density, standard deviation, and mean are calculated and employed. The next stage is to identify the features that have been chosen to predict emotion in EEG data using the K-Nearest Neighbors (KNN) technique. Arousal and valence datasets are used to train the parameters defined by the KNN technique.Because classification and recognition of specific classes, as well as emotion prediction, are conducted both online and locally on the edge, the KNN technique increased the performance of the emotion recognition system on the NVIDIA Jetson Nano. Finally, this implementation aims to bridge the research gap on cost-effective and efficient real-time emotion recognition using a resource constrained hardware device, like the NVIDIA Jetson Nano. On the cutting edge of AI, EEG-based emotion identification can be employed in applications that can rapidly expand the research and implementation industry's use.

Keywords: edge AI device, EEG, emotion recognition system, supervised learning algorithm, sensors

Procedia PDF Downloads 82
2081 The Meaning of Happiness and Unhappiness among Female Teenagers in Urban Finland: A Social Representations Approach

Authors: Jennifer De Paola

Abstract:

Objectives: The literature is saturated with figures and hard data on happiness and its rates, causes and effects at a large scale, whereas very little is known about the way specific groups of people within societies understand and talk about happiness in their everyday life. The present study contributes to fill this gap in the happiness research by analyzing social representations of happiness among young women through the theoretical frame provided by Moscovici’s Social Representation Theory. Methods: Participants were (N= 351) female students (16-18 year olds) from Finnish, Swedish and English speaking high schools in the Helsinki region, Finland. Main source of data collection were word associations using the stimulus word ‘happiness’ and word associations using as stimulus the term that in the participants’ opinion represents the opposite of happiness. The allowed number of associations was five per stimulus word (10 associations per participant). In total, the 351 participants produced 6973 associations with the two stimulus words given: 3500 (50,19%) associations with ‘happiness’ and 3473 (49,81%) associations with ‘opposite of happiness’. The associations produced were analyzed qualitatively to identify associations with similar meaning and then coded combining similar associations in larger categories. Results: In total, 33 categories were identified respectively for the stimulus word ‘happiness’ and for the stimulus word ‘opposite of happiness’. In general terms, the 33 categories identified for ‘happiness’ included associations regarding relationships with key people considered important, such as ‘family’, abstract concepts such as meaningful life, success and moral values as well as more mundane and hedonic elements like food, pleasure and fun. Similarly, the 33 categories emerged for ‘opposite of happiness’ included relationship problems and arguments, negative feelings such as sadness, depression, stress as well as more concrete issues such as financial problems. Participants were also asked to rate their own level of happiness on a scale from 1 to 10. Results indicated the mean of the self-rated level of happiness was 7,93 (the range varied from 1 to 10; SD = 1, 50). Participants’ responses were further divided into three different groups according to the self-rated level of happiness: group 1 (level 10-9), group 2 (level 8-6), and group 3 (level 5 and lower) in order to investigate the way the categories mentioned above were distributed among the different groups. Preliminary results show that the category ‘family’ is associated with higher level of happiness, whereas its presence gradually decreases among the participants with a lower level of happiness. Moreover, the category ‘depression’ seems to be mainly present among participants in group 3, whereas the category ‘sadness’ is mainly present among participants with higher level of happiness. Conclusion: In conclusion, this study indicates the prevalent ways of thinking about happiness and its opposite among young female students, suggesting that representations varied to some extent depending on the happiness level of the participants. This study contributes to bringing new knowledge as it considers happiness as a holistic state, thus going beyond the literature that so far has too often viewed happiness as a mere unidimensional spectrum.

Keywords: female, happiness, social representations, unhappiness

Procedia PDF Downloads 202
2080 COVID-19’s Effect on Pre-Existing Hearing Loss

Authors: Jonathan A. Mikhail, Arsenio Paez

Abstract:

It is not uncommon for a viral infection to cause hearing loss. Many viral infections are associated with sudden-onset, often unilateral, idiopathic sensorineural hearing loss. We conducted an exploratory study with thirty patients with pre-existing hearing loss between 50 and 64 to evaluate if COVID-19 was associated with exacerbated hearing loss. We hypothesized that hearing loss would be exacerbated by COVID-19 infection in patients with pre-existing hearing loss. A statistically significant paired T-test between pure tone averages (PTAs) at the patient’s original diagnosis and a current, updated audiometric assessment indicated a regression in hearing (p-value < .001) sensitivity following the contraction of COVID-19. Speech reception thresholds (SRTs) and word recognition scores (WRSs) were also considered, as well as the participants' gender. SRTs between each ear exhibited a statistically significant change (p-value of .002 and p-value < .001). WRSs did not show statistically significant differences (p-value of .290 and p-value of .098). A non-statistically significant Two-Way ANOVA was performed to evaluate gender’s potential role in exacerbated hearing loss and proved to be statistically insignificant (p-value of .214). This study discusses practical implications for clinical and educational pursuits in understanding COVID-19's effect on the auditory system and the need to evaluate the deadly virus further.

Keywords: audiology, COVID-19, sensorineural hearing loss, otology, auditory research

Procedia PDF Downloads 56
2079 Freedom of Information and Freedom of Expression

Authors: Amin Pashaye Amiri

Abstract:

Freedom of information, according to which the public has a right to have access to government-held information, is largely considered as a tool for improving transparency and accountability in governments, and as a requirement of self-governance and good governance. So far, more than ninety countries have recognized citizens’ right to have access to public information. This recognition often took place through the adoption of an act referred to as “freedom of information act”, “access to public records act”, and so on. A freedom of information act typically imposes a positive obligation on a government to initially and regularly release certain public information, and also obliges it to provide individuals with information they request. Such an act usually allows governmental bodies to withhold information only when it falls within a limited number of exemptions enumerated in the act such as exemptions for protecting privacy of individuals and protecting national security. Some steps have been taken at the national and international level towards the recognition of freedom of information as a human right. Freedom of information was recognized in a few countries as a part of freedom of expression, and therefore, as a human right. Freedom of information was also recognized by some international bodies as a human right. The Inter-American Court of Human Rights ruled in 2006 that Article 13 of the American Convention on Human Rights, which concerns the human right to freedom of expression, protects the right of all people to request access to government information. The European Court of Human Rights has recently taken a considerable step towards recognizing freedom of information as a human right. However, in spite of the measures that have been taken, public access to government information is not yet widely accepted as an international human right. The paper will consider the degree to which freedom of information has been recognized as a human right, and study the possibility of widespread recognition of such a human right in the future. It will also examine the possible benefits of such recognition for the development of the human right to free expression.

Keywords: freedom of information, freedom of expression, human rights, government information

Procedia PDF Downloads 521
2078 Power Quality Modeling Using Recognition Learning Methods for Waveform Disturbances

Authors: Sang-Keun Moon, Hong-Rok Lim, Jin-O Kim

Abstract:

This paper presents a Power Quality (PQ) modeling and filtering processes for the distribution system disturbances using recognition learning methods. Typical PQ waveforms with mathematical applications and gathered field data are applied to the proposed models. The objective of this paper is analyzing PQ data with respect to monitoring, discriminating, and evaluating the waveform of power disturbances to ensure the system preventative system failure protections and complex system problem estimations. Examined signal filtering techniques are used for the field waveform noises and feature extractions. Using extraction and learning classification techniques, the efficiency was verified for the recognition of the PQ disturbances with focusing on interactive modeling methods in this paper. The waveform of selected 8 disturbances is modeled with randomized parameters of IEEE 1159 PQ ranges. The range, parameters, and weights are updated regarding field waveform obtained. Along with voltages, currents have same process to obtain the waveform features as the voltage apart from some of ratings and filters. Changing loads are causing the distortion in the voltage waveform due to the drawing of the different patterns of current variation. In the conclusion, PQ disturbances in the voltage and current waveforms indicate different types of patterns of variations and disturbance, and a modified technique based on the symmetrical components in time domain was proposed in this paper for the PQ disturbances detection and then classification. Our method is based on the fact that obtained waveforms from suggested trigger conditions contain potential information for abnormality detections. The extracted features are sequentially applied to estimation and recognition learning modules for further studies.

Keywords: power quality recognition, PQ modeling, waveform feature extraction, disturbance trigger condition, PQ signal filtering

Procedia PDF Downloads 165
2077 Multimodal Deep Learning for Human Activity Recognition

Authors: Ons Slimene, Aroua Taamallah, Maha Khemaja

Abstract:

In recent years, human activity recognition (HAR) has been a key area of research due to its diverse applications. It has garnered increasing attention in the field of computer vision. HAR plays an important role in people’s daily lives as it has the ability to learn advanced knowledge about human activities from data. In HAR, activities are usually represented by exploiting different types of sensors, such as embedded sensors or visual sensors. However, these sensors have limitations, such as local obstacles, image-related obstacles, sensor unreliability, and consumer concerns. Recently, several deep learning-based approaches have been proposed for HAR and these approaches are classified into two categories based on the type of data used: vision-based approaches and sensor-based approaches. This research paper highlights the importance of multimodal data fusion from skeleton data obtained from videos and data generated by embedded sensors using deep neural networks for achieving HAR. We propose a deep multimodal fusion network based on a twostream architecture. These two streams use the Convolutional Neural Network combined with the Bidirectional LSTM (CNN BILSTM) to process skeleton data and data generated by embedded sensors and the fusion at the feature level is considered. The proposed model was evaluated on a public OPPORTUNITY++ dataset and produced a accuracy of 96.77%.

Keywords: human activity recognition, action recognition, sensors, vision, human-centric sensing, deep learning, context-awareness

Procedia PDF Downloads 71
2076 Exploring Polar Syntactic Effects of Verbal Extensions in Basà Language

Authors: Imoh Philip

Abstract:

This work investigates four verbal extensions; two in each set resulting in two opposite effects of the valency of verbs in Basà language. Basà language is an indigenous language spoken in Kogi, Nasarawa, Benue, Niger states and all the Federal Capital Territory (FCT) councils. Crozier & Blench (1992) and Blench & Williamson (1988) classify Basà as belonging to Proto–Kru, under the sub-phylum Western –Kru. It studies the effects of such morphosyntactic operations in Basà language with special focus on ‘reflexives’ ‘reciprocals’ versus ‘causativization’ and ‘applicativization’ both sets are characterized by polar syntactic processes of either decreasing or increasing the verb’s valency by one argument vis-à-vis the basic number of arguments, but by the similar morphological processes. In addition to my native intuitions as a native speaker of Basà language, data elicited for this work include discourse observation, staged and elicited spoken data from fluent native speakers. The paper argues that affixes attached to the verb root, result in either deriving an intransitive verb from a transitive one or a transitive verb from a bi/ditransitive verb and equally increase the verb’s valence deriving either a bitransitive verb from a transitive verb or a transitive verb from a intransitive one. Where the operation increases the verb’s valency, it triggers a transformation of arguments in the derived structure. In this case, the applied arguments displace the inherent ones. This investigation can stimulate further study on other transformations that are either syntactic or morphosyntactic in Basà and can also be replicated in other African and non-African languages.

Keywords: verbal extension, valency, reflexive, reciprocal, causativization, applicativization, Basà

Procedia PDF Downloads 182
2075 Behavioral and EEG Reactions in Children during Recognition of Emotionally Colored Sentences That Describe the Choice Situation

Authors: Tuiana A. Aiusheeva, Sergey S. Tamozhnikov, Alexander E. Saprygin, Arina A. Antonenko, Valentina V. Stepanova, Natalia N. Tolstykh, Alexander N. Savostyanov

Abstract:

Situation of choice is an important condition for the formation of essential character qualities of a child, such as being initiative, responsible, hard-working. We have studied the behavioral and EEG reactions in Russian schoolchildren during recognition of syntactic errors in emotionally colored sentences that describe the choice situation. Twenty healthy children (mean age 9,0±0,3 years, 12 boys, 8 girls) were examined. Forty sentences were selected for the experiment; the half of them contained a syntactic error. The experiment additionally had the hidden condition: 50% of the sentences described the children's own choice and were emotionally colored (positive or negative). The other 50% of the sentences described the forced-choice situation, also with positive or negative coloring. EEG were recorded during execution of error-recognition task. Reaction time and quality of syntactic error detection were chosen as behavioral measures. Event-related spectral perturbation (ERSP) was applied to characterize the oscillatory brain activity of children. There were two time-frequency intervals in EEG reactions: (1) 500-800 ms in the 3-7 Hz frequency range (theta synchronization) and (2) 500-1000 ms in the 8-12 Hz range (alpha desynchronization). We found out that behavioral and brain reactions in child brain during recognition of positive and negative sentences describing forced-choice situation did not have significant differences. Theta synchronization and alpha desynchronization were stronger during recognition of sentences with children's own choice, especially with negative coloring. Also, the quality and execution time of the task were higher for this types of sentences. The results of our study will be useful for improvement of teaching methods and diagnostics of children affective disorders.

Keywords: choice situation, electroencephalogram (EEG), emotionally colored sentences, schoolchildren

Procedia PDF Downloads 248