Search results for: face in video recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5093

Search results for: face in video recognition

4973 The Effect of Pixelation on Face Detection: Evidence from Eye Movements

Authors: Kaewmart Pongakkasira

Abstract:

This study investigated how different levels of pixelation affect face detection in natural scenes. Eye movements and reaction times, while observers searched for faces in natural scenes rendered in different ranges of pixels, were recorded. Detection performance for coarse visual detail at lower pixel size (3 x 3) was better than with very blurred detail carried by higher pixel size (9 x 9). The result is consistent with the notion that face detection relies on gross detail information of face-shape template, containing crude shape structure and features. In contrast, detection was impaired when face shape and features are obscured. However, it was considered that the degradation of scenic information might also contribute to the effect. In the next experiment, a more direct measurement of the effect of pixelation on face detection, only the embedded face photographs, but not the scene background, will be filtered.

Keywords: eye movements, face detection, face-shape information, pixelation

Procedia PDF Downloads 317
4972 Efficient Video Compression Technique Using Convolutional Neural Networks and Generative Adversarial Network

Authors: P. Karthick, K. Mahesh

Abstract:

Video has become an increasingly significant component of our digital everyday contact. With the advancement of greater contents and shows of the resolution, its significant volume poses serious obstacles to the objective of receiving, distributing, compressing, and revealing video content of high quality. In this paper, we propose the primary beginning to complete a deep video compression model that jointly upgrades all video compression components. The video compression method involves splitting the video into frames, comparing the images using convolutional neural networks (CNN) to remove duplicates, repeating the single image instead of the duplicate images by recognizing and detecting minute changes using generative adversarial network (GAN) and recorded with long short-term memory (LSTM). Instead of the complete image, the small changes generated using GAN are substituted, which helps in frame level compression. Pixel wise comparison is performed using K-nearest neighbours (KNN) over the frame, clustered with K-means, and singular value decomposition (SVD) is applied for each and every frame in the video for all three color channels [Red, Green, Blue] to decrease the dimension of the utility matrix [R, G, B] by extracting its latent factors. Video frames are packed with parameters with the aid of a codec and converted to video format, and the results are compared with the original video. Repeated experiments on several videos with different sizes, duration, frames per second (FPS), and quality results demonstrate a significant resampling rate. On average, the result produced had approximately a 10% deviation in quality and more than 50% in size when compared with the original video.

Keywords: video compression, K-means clustering, convolutional neural network, generative adversarial network, singular value decomposition, pixel visualization, stochastic gradient descent, frame per second extraction, RGB channel extraction, self-detection and deciding system

Procedia PDF Downloads 187
4971 Possibilities, Challenges and the State of the Art of Automatic Speech Recognition in Air Traffic Control

Authors: Van Nhan Nguyen, Harald Holone

Abstract:

Over the past few years, a lot of research has been conducted to bring Automatic Speech Recognition (ASR) into various areas of Air Traffic Control (ATC), such as air traffic control simulation and training, monitoring live operators for with the aim of safety improvements, air traffic controller workload measurement and conducting analysis on large quantities controller-pilot speech. Due to the high accuracy requirements of the ATC context and its unique challenges, automatic speech recognition has not been widely adopted in this field. With the aim of providing a good starting point for researchers who are interested bringing automatic speech recognition into ATC, this paper gives an overview of possibilities and challenges of applying automatic speech recognition in air traffic control. To provide this overview, we present an updated literature review of speech recognition technologies in general, as well as specific approaches relevant to the ATC context. Based on this literature review, criteria for selecting speech recognition approaches for the ATC domain are presented, and remaining challenges and possible solutions are discussed.

Keywords: automatic speech recognition, asr, air traffic control, atc

Procedia PDF Downloads 399
4970 A Contribution to Human Activities Recognition Using Expert System Techniques

Authors: Malika Yaici, Soraya Aloui, Sara Semchaoui

Abstract:

This paper deals with human activity recognition from sensor data. It is an active research area, and the main objective is to obtain a high recognition rate. In this work, a recognition system based on expert systems is proposed; the recognition is performed using the objects, object states, and gestures and taking into account the context (the location of the objects and of the person performing the activity, the duration of the elementary actions and the activity). The system recognizes complex activities after decomposing them into simple, easy-to-recognize activities. The proposed method can be applied to any type of activity. The simulation results show the robustness of our system and its speed of decision.

Keywords: human activity recognition, ubiquitous computing, context-awareness, expert system

Procedia PDF Downloads 118
4969 Switching to the Latin Alphabet in Kazakhstan: A Brief Overview of Character Recognition Methods

Authors: Ainagul Yermekova, Liudmila Goncharenko, Ali Baghirzade, Sergey Sybachin

Abstract:

In this article, we address the problem of Kazakhstan's transition to the Latin alphabet. The transition process started in 2017 and is scheduled to be completed in 2025. In connection with these events, the problem of recognizing the characters of the new alphabet is raised. Well-known character recognition programs such as ABBYY FineReader, FormReader, MyScript Stylus did not recognize specific Kazakh letters that were used in Cyrillic. The author tries to give an assessment of the well-known method of character recognition that could be in demand as part of the country's transition to the Latin alphabet. Three methods of character recognition: template, structured, and feature-based, are considered through the algorithms of operation. At the end of the article, a general conclusion is made about the possibility of applying a certain method to a particular recognition process: for example, in the process of population census, recognition of typographic text in Latin, or recognition of photos of car numbers, store signs, etc.

Keywords: text detection, template method, recognition algorithm, structured method, feature method

Procedia PDF Downloads 186
4968 Recognizing an Individual, Their Topic of Conversation and Cultural Background from 3D Body Movement

Authors: Gheida J. Shahrour, Martin J. Russell

Abstract:

The 3D body movement signals captured during human-human conversation include clues not only to the content of people’s communication but also to their culture and personality. This paper is concerned with automatic extraction of this information from body movement signals. For the purpose of this research, we collected a novel corpus from 27 subjects, arranged them into groups according to their culture. We arranged each group into pairs and each pair communicated with each other about different topics. A state-of-art recognition system is applied to the problems of person, culture, and topic recognition. We borrowed modeling, classification, and normalization techniques from speech recognition. We used Gaussian Mixture Modeling (GMM) as the main technique for building our three systems, obtaining 77.78%, 55.47%, and 39.06% from the person, culture, and topic recognition systems respectively. In addition, we combined the above GMM systems with Support Vector Machines (SVM) to obtain 85.42%, 62.50%, and 40.63% accuracy for person, culture, and topic recognition respectively. Although direct comparison among these three recognition systems is difficult, it seems that our person recognition system performs best for both GMM and GMM-SVM, suggesting that inter-subject differences (i.e. subject’s personality traits) are a major source of variation. When removing these traits from culture and topic recognition systems using the Nuisance Attribute Projection (NAP) and the Intersession Variability Compensation (ISVC) techniques, we obtained 73.44% and 46.09% accuracy from culture and topic recognition systems respectively.

Keywords: person recognition, topic recognition, culture recognition, 3D body movement signals, variability compensation

Procedia PDF Downloads 541
4967 Video Foreground Detection Based on Adaptive Mixture Gaussian Model for Video Surveillance Systems

Authors: M. A. Alavianmehr, A. Tashk, A. Sodagaran

Abstract:

Modeling background and moving objects are significant techniques for video surveillance and other video processing applications. This paper presents a foreground detection algorithm that is robust against illumination changes and noise based on adaptive mixture Gaussian model (GMM), and provides a novel and practical choice for intelligent video surveillance systems using static cameras. In the previous methods, the image of still objects (background image) is not significant. On the contrary, this method is based on forming a meticulous background image and exploiting it for separating moving objects from their background. The background image is specified either manually, by taking an image without vehicles, or is detected in real-time by forming a mathematical or exponential average of successive images. The proposed scheme can offer low image degradation. The simulation results demonstrate high degree of performance for the proposed method.

Keywords: image processing, background models, video surveillance, foreground detection, Gaussian mixture model

Procedia PDF Downloads 516
4966 Performance of High Efficiency Video Codec over Wireless Channels

Authors: Mohd Ayyub Khan, Nadeem Akhtar

Abstract:

Due to recent advances in wireless communication technologies and hand-held devices, there is a huge demand for video-based applications such as video surveillance, video conferencing, remote surgery, Digital Video Broadcast (DVB), IPTV, online learning courses, YouTube, WhatsApp, Instagram, Facebook, Interactive Video Games. However, the raw videos posses very high bandwidth which makes the compression a must before its transmission over the wireless channels. The High Efficiency Video Codec (HEVC) (also called H.265) is latest state-of-the-art video coding standard developed by the Joint effort of ITU-T and ISO/IEC teams. HEVC is targeted for high resolution videos such as 4K or 8K resolutions that can fulfil the recent demands for video services. The compression ratio achieved by the HEVC is twice as compared to its predecessor H.264/AVC for same quality level. The compression efficiency is generally increased by removing more correlation between the frames/pixels using complex techniques such as extensive intra and inter prediction techniques. As more correlation is removed, the chances of interdependency among coded bits increases. Thus, bit errors may have large effect on the reconstructed video. Sometimes even single bit error can lead to catastrophic failure of the reconstructed video. In this paper, we study the performance of HEVC bitstream over additive white Gaussian noise (AWGN) channel. Moreover, HEVC over Quadrature Amplitude Modulation (QAM) combined with forward error correction (FEC) schemes are also explored over the noisy channel. The video will be encoded using HEVC, and the coded bitstream is channel coded to provide some redundancies. The channel coded bitstream is then modulated using QAM and transmitted over AWGN channel. At the receiver, the symbols are demodulated and channel decoded to obtain the video bitstream. The bitstream is then used to reconstruct the video using HEVC decoder. It is observed that as the signal to noise ratio of channel is decreased the quality of the reconstructed video decreases drastically. Using proper FEC codes, the quality of the video can be restored up to certain extent. Thus, the performance analysis of HEVC presented in this paper may assist in designing the optimized code rate of FEC such that the quality of the reconstructed video is maximized over wireless channels.

Keywords: AWGN, forward error correction, HEVC, video coding, QAM

Procedia PDF Downloads 149
4965 The Development of Educational Video Games Aimed at Enhancing Academic Motivation and Learning Among African American Males

Authors: Kenneth Philip Jones

Abstract:

This dissertation investigates the potential of developing educational-based video games to motivate and engage African American males. The study employed a qualitative methodological approach by investigating African American males who are avid video game players and are currently enrolled at a college or university. The participants were individually and collectively video and audio recorded during the interviews and observations. Situated Learning theory analyzed how motivation and engagement can transfer from a video game to an educational context. The research aims to address the disparities in our educational systems when it comes to providing a culture, climate, and atmosphere that will enable the academic development of African American males. The primary objective of the findings is based on the participants’ responses and the data collected to provide recommendations to educators and scholars on how to address the issues that have demoralized African American males in education and provide a platform that will allow for equality in educational development and advancement.

Keywords: video games, motivation, behavioral, learning transfer

Procedia PDF Downloads 121
4964 Optimized Deep Learning-Based Facial Emotion Recognition System

Authors: Erick C. Valverde, Wansu Lim

Abstract:

Facial emotion recognition (FER) system has been recently developed for more advanced computer vision applications. The ability to identify human emotions would enable smart healthcare facility to diagnose mental health illnesses (e.g., depression and stress) as well as better human social interactions with smart technologies. The FER system involves two steps: 1) face detection task and 2) facial emotion recognition task. It classifies the human expression in various categories such as angry, disgust, fear, happy, sad, surprise, and neutral. This system requires intensive research to address issues with human diversity, various unique human expressions, and variety of human facial features due to age differences. These issues generally affect the ability of the FER system to detect human emotions with high accuracy. Early stage of FER systems used simple supervised classification task algorithms like K-nearest neighbors (KNN) and artificial neural networks (ANN). These conventional FER systems have issues with low accuracy due to its inefficiency to extract significant features of several human emotions. To increase the accuracy of FER systems, deep learning (DL)-based methods, like convolutional neural networks (CNN), are proposed. These methods can find more complex features in the human face by means of the deeper connections within its architectures. However, the inference speed and computational costs of a DL-based FER system is often disregarded in exchange for higher accuracy results. To cope with this drawback, an optimized DL-based FER system is proposed in this study.An extreme version of Inception V3, known as Xception model, is leveraged by applying different network optimization methods. Specifically, network pruning and quantization are used to enable lower computational costs and reduce memory usage, respectively. To support low resource requirements, a 68-landmark face detector from Dlib is used in the early step of the FER system.Furthermore, a DL compiler is utilized to incorporate advanced optimization techniques to the Xception model to improve the inference speed of the FER system. In comparison to VGG-Net and ResNet50, the proposed optimized DL-based FER system experimentally demonstrates the objectives of the network optimization methods used. As a result, the proposed approach can be used to create an efficient and real-time FER system.

Keywords: deep learning, face detection, facial emotion recognition, network optimization methods

Procedia PDF Downloads 118
4963 The Use of Video in Increasing Speaking Ability of the First Year Students of SMAN 12 Pekanbaru in the Academic Year 2011/2012

Authors: Elvira Wahyuni

Abstract:

This study is a classroom action research. The general objective of this study was to find out students’ speaking ability through teaching English by using video and to find out the effectiveness of using video in teaching English to improve students’ speaking ability. The subjects of this study were 34 of the first-year students of SMAN 12 Pekanbaru who were learning English as a foreign language (EFL). Students were given pre-test before the treatment and post-test after the treatment. Quantitative data was collected by using speaking test requiring the students to respond to the recorded questions. Qualitative data was collected through observation sheets and field notes. The research finding reveals that there is a significant improvement of the students’ speaking ability through the use of video in speaking class. The qualitative data gave a description and additional information about the learning process done by the students. The research findings indicate that the use of video in teaching and learning is good in increasing learning outcome.

Keywords: English teaching, fun learning, speaking ability, video

Procedia PDF Downloads 256
4962 Stereotypical Motor Movement Recognition Using Microsoft Kinect with Artificial Neural Network

Authors: M. Jazouli, S. Elhoufi, A. Majda, A. Zarghili, R. Aalouane

Abstract:

Autism spectrum disorder is a complex developmental disability. It is defined by a certain set of behaviors. Persons with Autism Spectrum Disorders (ASD) frequently engage in stereotyped and repetitive motor movements. The objective of this article is to propose a method to automatically detect this unusual behavior. Our study provides a clinical tool which facilitates for doctors the diagnosis of ASD. We focus on automatic identification of five repetitive gestures among autistic children in real time: body rocking, hand flapping, fingers flapping, hand on the face and hands behind back. In this paper, we present a gesture recognition system for children with autism, which consists of three modules: model-based movement tracking, feature extraction, and gesture recognition using artificial neural network (ANN). The first one uses the Microsoft Kinect sensor, the second one chooses points of interest from the 3D skeleton to characterize the gestures, and the last one proposes a neural connectionist model to perform the supervised classification of data. The experimental results show that our system can achieve above 93.3% recognition rate.

Keywords: ASD, artificial neural network, kinect, stereotypical motor movements

Procedia PDF Downloads 306
4961 Human Activities Recognition Based on Expert System

Authors: Malika Yaici, Soraya Aloui, Sara Semchaoui

Abstract:

Recognition of human activities from sensor data is an active research area, and the main objective is to obtain a high recognition rate. In this work, we propose a recognition system based on expert systems. The proposed system makes the recognition based on the objects, object states, and gestures, taking into account the context (the location of the objects and of the person performing the activity, the duration of the elementary actions, and the activity). This work focuses on complex activities which are decomposed into simple easy to recognize activities. The proposed method can be applied to any type of activity. The simulation results show the robustness of our system and its speed of decision.

Keywords: human activity recognition, ubiquitous computing, context-awareness, expert system

Procedia PDF Downloads 139
4960 The Effectiveness of Using Video Modeling Procedures on the ipad to Teach Play Skills Children with ASD

Authors: Esra Orum Cattik

Abstract:

This study evaluated the effects of using video modeling procedures on the iPad to teach play skills to children with autism spectrum disorders. A male student with autism spectrum disorders participated in this study. A multiple baseline-across-skills single-subject design was used to evaluate the effects of using video modeling procedures on the iPad. During baseline, no prompts were presented to participants. In the intervention phase, the teacher gave video model on iPad to the first skill and asked play with toys for him. When the first play skill completed the second play skill began intervention. This procedure continued till all three play skill completed intervention. Finally, the participant learned all three play skills to use video modeling presented on the iPad. Based upon findings of this study, suggestions have been made to future researches.

Keywords: autism spectrum disorders, play, play skills, video modeling, single subject design

Procedia PDF Downloads 406
4959 Comparative Study of Skeletonization and Radial Distance Methods for Automated Finger Enumeration

Authors: Mohammad Hossain Mohammadi, Saif Al Ameri, Sana Ziaei, Jinane Mounsef

Abstract:

Automated enumeration of the number of hand fingers is widely used in several motion gaming and distance control applications, and is discussed in several published papers as a starting block for hand recognition systems. The automated finger enumeration technique should not only be accurate, but also must have a fast response for a moving-picture input. The high performance of video in motion games or distance control will inhibit the program’s overall speed, for image processing software such as Matlab need to produce results at high computation speeds. Since an automated finger enumeration with minimum error and processing time is desired, a comparative study between two finger enumeration techniques is presented and analyzed in this paper. In the pre-processing stage, various image processing functions were applied on a real-time video input to obtain the final cleaned auto-cropped image of the hand to be used for the two techniques. The first technique uses the known morphological tool of skeletonization to count the number of skeleton’s endpoints for fingers. The second technique uses a radial distance method to enumerate the number of fingers in order to obtain a one dimensional hand representation. For both discussed methods, the different steps of the algorithms are explained. Then, a comparative study analyzes the accuracy and speed of both techniques. Through experimental testing in different background conditions, it was observed that the radial distance method was more accurate and responsive to a real-time video input compared to the skeletonization method. All test results were generated in Matlab and were based on displaying a human hand for three different orientations on top of a plain color background. Finally, the limitations surrounding the enumeration techniques are presented.

Keywords: comparative study, hand recognition, fingertip detection, skeletonization, radial distance, Matlab

Procedia PDF Downloads 382
4958 The Effect of Computer-Mediated vs. Face-to-Face Instruction on L2 Pragmatics: A Meta-Analysis

Authors: Marziyeh Yousefi, Hossein Nassaji

Abstract:

This paper reports the results of a meta-analysis of studies on the effects of instruction mode on learning second language pragmatics during the last decade (from 2006 to 2016). After establishing related inclusion/ exclusion criteria, 39 published studies were retrieved and included in the present meta-analysis. Studies were later coded for face-to-face and computer-assisted mode of instruction. Statistical procedures were applied to obtain effect sizes. It was found that Computer-Assisted-Language-Learning studies generated larger effects than Face-to-Face instruction.

Keywords: meta-analysis, effect size, L2 pragmatics, comprehensive meta-analysis, face-to-face, computer-assisted language learning

Procedia PDF Downloads 221
4957 Automatic Checkpoint System Using Face and Card Information

Authors: Kriddikorn Kaewwongsri, Nikom Suvonvorn

Abstract:

In the deep south of Thailand, checkpoints for people verification are necessary for the security management of risk zones, such as official buildings in the conflict area. In this paper, we propose an automatic checkpoint system that verifies persons using information from ID cards and facial features. The methods for a person’s information abstraction and verification are introduced based on useful information such as ID number and name, extracted from official cards, and facial images from videos. The proposed system shows promising results and has a real impact on the local society.

Keywords: face comparison, card recognition, OCR, checkpoint system, authentication

Procedia PDF Downloads 321
4956 An MrPPG Method for Face Anti-Spoofing

Authors: Lan Zhang, Cailing Zhang

Abstract:

In recent years, many face anti-spoofing algorithms have high detection accuracy when detecting 2D face anti-spoofing or 3D mask face anti-spoofing alone in the field of face anti-spoofing, but their detection performance is greatly reduced in multidimensional and cross-datasets tests. The rPPG method used for face anti-spoofing uses the unique vital information of real face to judge real faces and face anti-spoofing, so rPPG method has strong stability compared with other methods, but its detection rate of 2D face anti-spoofing needs to be improved. Therefore, in this paper, we improve an rPPG(Remote Photoplethysmography) method(MrPPG) for face anti-spoofing which through color space fusion, using the correlation of pulse signals between real face regions and background regions, and introducing the cyclic neural network (LSTM) method to improve accuracy in 2D face anti-spoofing. Meanwhile, the MrPPG also has high accuracy and good stability in face anti-spoofing of multi-dimensional and cross-data datasets. The improved method was validated on Replay-Attack, CASIA-FASD, Siw and HKBU_MARs_V2 datasets, the experimental results show that the performance and stability of the improved algorithm proposed in this paper is superior to many advanced algorithms.

Keywords: face anti-spoofing, face presentation attack detection, remote photoplethysmography, MrPPG

Procedia PDF Downloads 178
4955 Challenges in Video Based Object Detection in Maritime Scenario Using Computer Vision

Authors: Dilip K. Prasad, C. Krishna Prasath, Deepu Rajan, Lily Rachmawati, Eshan Rajabally, Chai Quek

Abstract:

This paper discusses the technical challenges in maritime image processing and machine vision problems for video streams generated by cameras. Even well documented problems of horizon detection and registration of frames in a video are very challenging in maritime scenarios. More advanced problems of background subtraction and object detection in video streams are very challenging. Challenges arising from the dynamic nature of the background, unavailability of static cues, presence of small objects at distant backgrounds, illumination effects, all contribute to the challenges as discussed here.

Keywords: autonomous maritime vehicle, object detection, situation awareness, tracking

Procedia PDF Downloads 457
4954 Smartphone Video Source Identification Based on Sensor Pattern Noise

Authors: Raquel Ramos López, Anissa El-Khattabi, Ana Lucila Sandoval Orozco, Luis Javier García Villalba

Abstract:

An increasing number of mobile devices with integrated cameras has meant that most digital video comes from these devices. These digital videos can be made anytime, anywhere and for different purposes. They can also be shared on the Internet in a short period of time and may sometimes contain recordings of illegal acts. The need to reliably trace the origin becomes evident when these videos are used for forensic purposes. This work proposes an algorithm to identify the brand and model of mobile device which generated the video. Its procedure is as follows: after obtaining the relevant video information, a classification algorithm based on sensor noise and Wavelet Transform performs the aforementioned identification process. We also present experimental results that support the validity of the techniques used and show promising results.

Keywords: digital video, forensics analysis, key frame, mobile device, PRNU, sensor noise, source identification

Procedia PDF Downloads 428
4953 Content Based Face Sketch Images Retrieval in WHT, DCT, and DWT Transform Domain

Authors: W. S. Besbas, M. A. Artemi, R. M. Salman

Abstract:

Content based face sketch retrieval can be used to find images of criminals from their sketches for 'Crime Prevention'. This paper investigates the problem of CBIR of face sketch images in transform domain. Face sketch images that are similar to the query image are retrieved from the face sketch database. Features of the face sketch image are extracted in the spectrum domain of a selected transforms. These transforms are Discrete Cosine Transform (DCT), Discrete Wavelet Transform (DWT), and Walsh Hadamard Transform (WHT). For the performance analyses of features selection methods three face images databases are used. These are 'Sheffield face database', 'Olivetti Research Laboratory (ORL) face database', and 'Indian face database'. The City block distance measure is used to evaluate the performance of the retrieval process. The investigation concludes that, the retrieval rate is database dependent. But in general, the DCT is the best. On the other hand, the WHT is the best with respect to the speed of retrieving images.

Keywords: Content Based Image Retrieval (CBIR), face sketch image retrieval, features selection for CBIR, image retrieval in transform domain

Procedia PDF Downloads 493
4952 Performance Evaluation of Routing Protocols for Video Conference over MPLS VPN Network

Authors: Abdullah Al Mamun, Tarek R. Sheltami

Abstract:

Video conferencing is a highly demanding facility now a days in order to its real time characteristics, but faster communication is the prior requirement of this technology. Multi Protocol Label Switching (MPLS) IP Virtual Private Network (VPN) address this problem and it is able to make a communication faster than others techniques. However, this paper studies the performance comparison of video traffic between two routing protocols namely the Enhanced Interior Gateway Protocol(EIGRP) and Open Shortest Path First (OSPF). The combination of traditional routing and MPLS improve the forwarding mechanism, scalability and overall network performance. We will use GNS3 and OPNET Modeler 14.5 to simulate many different scenarios and metrics such as delay, jitter and mean opinion score (MOS) value are measured. The simulation result will show that OSPF and BGP-MPLS VPN offers best performance for video conferencing application.

Keywords: OSPF, BGP, EIGRP, MPLS, Video conference, Provider router, edge router, layer3 VPN

Procedia PDF Downloads 331
4951 Face Sketch Recognition in Forensic Application Using Scale Invariant Feature Transform and Multiscale Local Binary Patterns Fusion

Authors: Gargi Phadke, Mugdha Joshi, Shamal Salunkhe

Abstract:

Facial sketches are used as a crucial clue by criminal investigators for identification of suspects when the description of eyewitness or victims are only available as evidence. A forensic artist develops a sketch as per the verbal description is given by an eyewitness that shows the facial look of the culprit. In this paper, the fusion of Scale Invariant Feature Transform (SIFT) and multiscale local binary patterns (MLBP) are proposed as a feature to recognize a forensic face sketch images from a gallery of mugshot photos. This work focuses on comparative analysis of proposed scheme with existing algorithms in different challenges like illumination change and rotation condition. Experimental results show that proposed scheme can lead to better performance for the defined problem.

Keywords: SIFT feature, MLBP, PCA, face sketch

Procedia PDF Downloads 336
4950 Review of Speech Recognition Research on Low-Resource Languages

Authors: XuKe Cao

Abstract:

This paper reviews the current state of research on low-resource languages in the field of speech recognition, focusing on the challenges faced by low-resource language speech recognition, including the scarcity of data resources, the lack of linguistic resources, and the diversity of dialects and accents. The article reviews recent progress in low-resource language speech recognition, including techniques such as data augmentation, end to-end models, transfer learning, and multi-task learning. Based on the challenges currently faced, the paper also provides an outlook on future research directions. Through these studies, it is expected that the performance of speech recognition for low resource languages can be improved, promoting the widespread application and adoption of related technologies.

Keywords: low-resource languages, speech recognition, data augmentation techniques, NLP

Procedia PDF Downloads 12
4949 Modern Machine Learning Conniptions for Automatic Speech Recognition

Authors: S. Jagadeesh Kumar

Abstract:

This expose presents a luculent of recent machine learning practices as employed in the modern and as pertinent to prospective automatic speech recognition schemes. The aspiration is to promote additional traverse ablution among the machine learning and automatic speech recognition factions that have transpired in the precedent. The manuscript is structured according to the chief machine learning archetypes that are furthermore trendy by now or have latency for building momentous hand-outs to automatic speech recognition expertise. The standards offered and convoluted in this article embraces adaptive and multi-task learning, active learning, Bayesian learning, discriminative learning, generative learning, supervised and unsupervised learning. These learning archetypes are aggravated and conferred in the perspective of automatic speech recognition tools and functions. This manuscript bequeaths and surveys topical advances of deep learning and learning with sparse depictions; further limelight is on their incessant significance in the evolution of automatic speech recognition.

Keywords: automatic speech recognition, deep learning methods, machine learning archetypes, Bayesian learning, supervised and unsupervised learning

Procedia PDF Downloads 447
4948 Acute Bronchiolitis: Impact of an Educational Video on Mothers’ Knowledge, Attitudes, and Practices

Authors: Atitallah Sofien, Missaoui Nada, Ben Rabeh Rania, Yahyaoui Salem, Mazigh Sonia, Bouyahia Olfa, Boukthir Samir

Abstract:

Introduction: Acute bronchiolitis (AB) is a real public health problem on a global and national scale. Its treatment is most often outpatient. The use of audio-visual supports, such as educational videos, is an innovation in therapeutic education in outpatient treatment. The aim of our study was to evaluate the impact of an educational video on the knowledge, attitudes, and practices of mothers of infants with AB. Methodology: This was a descriptive, analytical, and cross-sectional study with prospective data collection, including mothers of infants with AB. We assessed mothers' knowledge, attitudes, and practices regarding AB, and we created an educational video. We used a questionnaire written in Tunisian Arabic concerning sociodemographic data, mothers' knowledge and attitudes regarding AB, and their opinions on the video, as well as an observation grid to evaluate their practices on the nasopharyngeal unblocking technique. We compared the different parameters before and after watching the video. Results: We noted a statistically significant improvement in mothers' knowledge scores on AB (7.46 in the pre-test versus 14.08 in the post-test; p≤0.05), practices (12.42 in the pre-test versus 18 in the post-test; p≤0.05) and attitudes (5.86 in pre-test versus 9.02 in post-test; p≤0.05). Conclusion: The use of an educational video has a positive impact on the knowledge, practices, and attitudes of mothers towards AB.

Keywords: acute bronchiolitis, therapeutic education, mothers, educational video

Procedia PDF Downloads 68
4947 The Effect of Video Using in Teaching Speaking on Students of Non-Native English Speakers at STIE Perbanas Surabaya

Authors: Kartika Marta Budiana

Abstract:

Low competence in speaking for the students of Non English native speakers have been crucial so far for the teachers in language teaching in Indonesia. This study attempts to explore the effect of video using in teaching speaking onstudents of non-native English speakers at STIE Perbanas Surabaya. This includes investigate the students` attitudes toward the video used in classroom. This is a quantitative research that is an experimental one based on analyses derived the concepts of from teaching speaking and the use of video. There are two classes observed, the experimental and the control one. The experimental consist of 28 students and the control class consists of 25 students. Before the treatment given, both of the group is given the pre-test to check their ability level. Then, after the treatment is given, the post-test is given to the both groups. Then, the students were given treatment how to conduct a meeting that they learnt from a video of business English. The post test was held after they undergone a treatment. The instruments to get the data are the oral test and questionnaire. The data of this study is students` score and from the tests` score it can be seen there is a positive significant difference in the experimental group. The t-test to test hypothesize also shows that it is accepted which said that there is an improvement on the students` speaking competence achievement. In conclusion, the video effects on the significant difference on the students speaking competence achievement.

Keywords: video, teaching, speaking, Indonesia

Procedia PDF Downloads 435
4946 Advances in Artificial intelligence Using Speech Recognition

Authors: Khaled M. Alhawiti

Abstract:

This research study aims to present a retrospective study about speech recognition systems and artificial intelligence. Speech recognition has become one of the widely used technologies, as it offers great opportunity to interact and communicate with automated machines. Precisely, it can be affirmed that speech recognition facilitates its users and helps them to perform their daily routine tasks, in a more convenient and effective manner. This research intends to present the illustration of recent technological advancements, which are associated with artificial intelligence. Recent researches have revealed the fact that speech recognition is found to be the utmost issue, which affects the decoding of speech. In order to overcome these issues, different statistical models were developed by the researchers. Some of the most prominent statistical models include acoustic model (AM), language model (LM), lexicon model, and hidden Markov models (HMM). The research will help in understanding all of these statistical models of speech recognition. Researchers have also formulated different decoding methods, which are being utilized for realistic decoding tasks and constrained artificial languages. These decoding methods include pattern recognition, acoustic phonetic, and artificial intelligence. It has been recognized that artificial intelligence is the most efficient and reliable methods, which are being used in speech recognition.

Keywords: speech recognition, acoustic phonetic, artificial intelligence, hidden markov models (HMM), statistical models of speech recognition, human machine performance

Procedia PDF Downloads 477
4945 The Chemistry in the Video Game No Man’s Sky

Authors: Diogo Santos, Nelson Zagalo, Carla Morais

Abstract:

No Man’s Sky (NMS) is a sci-fi video game about survival and exploration where players fly spaceships, search for elements, and use them to survive. NMS isn’t a serious game, and not all the science in the game is presented with scientific evidence. To find how players felt about the scientific content in the game and how they perceive the chemistry in it, a survey was sent to NMS’s players, from which were collected answers from 124 respondents from 23 countries. Chemophobia is still a phenomenon when chemistry or chemicals are a subject of discussion, but 68,9% of our respondents showed a positive attitude towards the presence of chemistry in NMS, with 57% stating that playing the video game motivated them to know more about science. 8% of the players stated that NMS often prompted conversations about the science in the video game between them and teachers, parents, or friends. These results give us ideas on how an entertainment game can potentially help scientists, educators, and science communicators reach a growing, evolving, vibrant, diverse, and demanding audience.

Keywords: digital games, science communication, chemistry, informal learning, No Man’s Sky

Procedia PDF Downloads 110
4944 Theoretical Reflections on Metaphor and Cohesion and the Coherence of Face-To-Face Interactions

Authors: Afef Badri

Abstract:

The role of metaphor in creating the coherence and the cohesion of discourse in online interactive talk has almost received no attention. This paper intends to provide some theoretical reflections on metaphorical coherence as a jointly constructed process that evolves in online, face-to-face interactions. It suggests that the presence of a global conceptual structure in a conversation makes it conceptually cohesive. Yet, coherence remains a process largely determined by other variables (shared goals, communicative intentions, and framework of understanding). Metaphorical coherence created by these variables can be useful in detecting bias in media reporting.

Keywords: coherence, cohesion, face-to-face interactions, metaphor

Procedia PDF Downloads 247