Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 4863

Search results for: face in video recognition

4773 Video Sharing System Based On Wi-fi Camera

Authors: Qidi Lin, Jinbin Huang, Weile Liang

Abstract:

This paper introduces a video sharing platform based on WiFi, which consists of camera, mobile phone and PC server. This platform can receive wireless signal from the camera and show the live video on the mobile phone captured by camera. In addition that, it is able to send commands to camera and control the camera’s holder to rotate. The platform can be applied to interactive teaching and dangerous area’s monitoring and so on. Testing results show that the platform can share the live video of mobile phone. Furthermore, if the system’s PC sever and the camera and many mobile phones are connected together, it can transfer photos concurrently.

Keywords: Wifi Camera, socket mobile, platform video monitoring, remote control

Procedia PDF Downloads 301

4772 Evaluation of Video Quality Metrics and Performance Comparison on Contents Taken from Most Commonly Used Devices

Authors: Pratik Dhabal Deo, Manoj P.

Abstract:

With the increasing number of social media users, the amount of video content available has also significantly increased. Currently, the number of smartphone users is at its peak, and many are increasingly using their smartphones as their main photography and recording devices. There have been a lot of developments in the field of Video Quality Assessment (VQA) and metrics like VMAF, SSIM etc. are said to be some of the best performing metrics, but the evaluation of these metrics is dominantly done on professionally taken video contents using professional tools, lighting conditions etc. No study particularly pinpointing the performance of the metrics on the contents taken by users on very commonly available devices has been done. Datasets that contain a huge number of videos from different high-end devices make it difficult to analyze the performance of the metrics on the content from most used devices even if they contain contents taken in poor lighting conditions using lower-end devices. These devices face a lot of distortions due to various factors since the spectrum of contents recorded on these devices is huge. In this paper, we have presented an analysis of the objective VQA metrics on contents taken only from most used devices and their performance on them, focusing on full-reference metrics. To carry out this research, we created a custom dataset containing a total of 90 videos that have been taken from three most commonly used devices, and android smartphone, an IOS smartphone and a DSLR. On the videos taken on each of these devices, the six most common types of distortions that users face have been applied on addition to already existing H.264 compression based on four reference videos. These six applied distortions have three levels of degradation each. A total of the five most popular VQA metrics have been evaluated on this dataset and the highest values and the lowest values of each of the metrics on the distortions have been recorded. Finally, it is found that blur is the artifact on which most of the metrics didn’t perform well. Thus, in order to understand the results better the amount of blur in the data set has been calculated and an additional evaluation of the metrics was done using HEVC codec, which is the next version of H.264 compression, on the camera that proved to be the sharpest among the devices. The results have shown that as the resolution increases, the performance of the metrics tends to become more accurate and the best performing metric among them is VQM with very few inconsistencies and inaccurate results when the compression applied is H.264, but when the compression is applied is HEVC, SSIM and VMAF have performed significantly better.

Keywords: distortion, metrics, performance, resolution, video quality assessment

Procedia PDF Downloads 178

4771 A Unified Deep Framework for Joint 3d Pose Estimation and Action Recognition from a Single Color Camera

Authors: Huy Hieu Pham, Houssam Salmane, Louahdi Khoudour, Alain Crouzil, Pablo Zegers, Sergio Velastin

Abstract:

We present a deep learning-based multitask framework for joint 3D human pose estimation and action recognition from color video sequences. Our approach proceeds along two stages. In the first, we run a real-time 2D pose detector to determine the precise pixel location of important key points of the body. A two-stream neural network is then designed and trained to map detected 2D keypoints into 3D poses. In the second, we deploy the Efficient Neural Architecture Search (ENAS) algorithm to find an optimal network architecture that is used for modeling the Spatio-temporal evolution of the estimated 3D poses via an image-based intermediate representation and performing action recognition. Experiments on Human3.6M, Microsoft Research Redmond (MSR) Action3D, and Stony Brook University (SBU) Kinect Interaction datasets verify the effectiveness of the proposed method on the targeted tasks. Moreover, we show that our method requires a low computational budget for training and inference.

Keywords: human action recognition, pose estimation, D-CNN, deep learning

Procedia PDF Downloads 115

4770 Proposed Solutions Based on Affective Computing

Authors: Diego Adrian Cardenas Jorge, Gerardo Mirando Guisado, Alfredo Barrientos Padilla

Abstract:

A system based on Affective Computing can detect and interpret human information like voice, facial expressions and body movement to detect emotions and execute a corresponding response. This data is important due to the fact that a person can communicate more effectively with emotions than can be possible with words. This information can be processed through technological components like Facial Recognition, Gait Recognition or Gesture Recognition. As of now, solutions proposed using this technology only consider one component at a given moment. This research investigation proposes two solutions based on Affective Computing taking into account more than one component for emotion detection. The proposals reflect the levels of dependency between hardware devices and software, as well as the interaction process between the system and the user which implies the development of scenarios where both proposals will be put to the test in a live environment. Both solutions are to be developed in code by software engineers to prove the feasibility. To validate the impact on society and business interest, interviews with stakeholders are conducted with an investment mind set where each solution is labeled on a scale of 1 through 5, being one a minimum possible investment and 5 the maximum.

Keywords: affective computing, emotions, emotion detection, face recognition, gait recognition

Procedia PDF Downloads 336

4769 Internet Optimization by Negotiating Traffic Times

Authors: Carlos Gonzalez

Abstract:

This paper describes a system to optimize the use of the internet by clients requiring downloading of videos at peak hours. The system consists of a web server belonging to a provider of video contents, a provider of internet communications and a software application running on a client’s computer. The client using the application software will communicate to the video provider a list of the client’s future video demands. The video provider calculates which videos are going to be more in demand for download in the immediate future, and proceeds to request the internet provider the most optimal hours to do the downloading. The times of the downloading will be sent to the application software, which will use the information of pre-established hours negotiated between the video provider and the internet provider to download those videos. The videos will be saved in a special protected section of the user’s hard disk, which will only be accessed by the application software in the client’s computer. When the client is ready to see a video, the application will search the list of current existent videos in the area of the hard disk; if it does exist, it will use this video directly without the need for internet access. We found that the best way to optimize the download traffic of videos is by negotiation between the internet communication provider and the video content provider.

Keywords: internet optimization, video download, future demands, secure storage

Procedia PDF Downloads 113

4768 Pupil Size: A Measure of Identification Memory in Target Present Lineups

Authors: Camilla Elphick, Graham Hole, Samuel Hutton, Graham Pike

Abstract:

Pupil size has been found to change irrespective of luminosity, suggesting that it can be used to make inferences about cognitive processes, such as cognitive load. To see whether identifying a target requires a different cognitive load to rejecting distractors, the effect of viewing a target (compared with viewing distractors) on pupil size was investigated using a sequential video lineup procedure with two lineup sessions. Forty one participants were chosen randomly via the university. Pupil sizes were recorded when viewing pre target distractors and post target distractors and compared to pupil size when viewing the target. Overall, pupil size was significantly larger when viewing the target compared with viewing distractors. In the first session, pupil size changes were significantly different between participants who identified the target (Hits) and those who did not. Specifically, the pupil size of Hits reduced significantly after viewing the target (by 26%), suggesting that cognitive load reduced following identification. The pupil sizes of Misses (who made no identification) and False Alarms (who misidentified a distractor) did not reduce, suggesting that the cognitive load remained high in participants who failed to make the correct identification. In the second session, pupil sizes were smaller overall, suggesting that cognitive load was smaller in this session, and there was no significant difference between Hits, Misses and False Alarms. Furthermore, while the frequency of Hits increased, so did False Alarms. These two findings suggest that the benefits of including a second session remain uncertain, as the second session neither provided greater accuracy nor a reliable way to measure it. It is concluded that pupil size is a measure of face recognition strength in the first session of a target present lineup procedure. However, it is still not known whether cognitive load is an adequate explanation for this, or whether cognitive engagement might describe the effect more appropriately. If cognitive load and cognitive engagement can be teased apart with further investigation, this would have positive implications for understanding eyewitness identification. Nevertheless, this research has the potential to provide a tool for improving the reliability of lineup procedures.

Keywords: cognitive load, eyewitness identification, face recognition, pupillometry

Procedia PDF Downloads 371

4767 Recognition of Objects in a Maritime Environment Using a Combination of Pre- and Post-Processing of the Polynomial Fit Method

Authors: R. R. Hordijk, O. J. G. Somsen

Abstract:

Traditionally, radar systems are the eyes and ears of a ship. However, these systems have their drawbacks and nowadays they are extended with systems that work with video and photos. Processing of data from these videos and photos is however very labour-intensive and efforts are being made to automate this process. A major problem when trying to recognize objects in water is that the 'background' is not homogeneous so that traditional image recognition technics do not work well. Main question is, can a method be developed which automate this recognition process. There are a large number of parameters involved to facilitate the identification of objects on such images. One is varying the resolution. In this research, the resolution of some images has been reduced to the extreme value of 1% of the original to reduce clutter before the polynomial fit (pre-processing). It turned out that the searched object was clearly recognizable as its grey value was well above the average. Another approach is to take two images of the same scene shortly after each other and compare the result. Because the water (waves) fluctuates much faster than an object floating in the water one can expect that the object is the only stable item in the two images. Both these methods (pre-processing and comparing two images of the same scene) delivered useful results. Though it is too early to conclude that with these methods all image problems can be solved they are certainly worthwhile for further research.

Keywords: image processing, image recognition, polynomial fit, water

Procedia PDF Downloads 507

4766 Teleconsultations and The Need of Onsite Additional Medical Services

Authors: Cristina Hotoleanu

Abstract:

Introduction: The recent Covid-19 pandemic accelerated the development of e-health, including telemedicine, smartphone applications, and medical wearable devices. Providing remote teleconsultations supposes challenges which may require further face-to-face medical interactions. The aim of this study was to assess the correlation between the types of teleconsultations and the need of onsite medical services (investigations and medical visits) for the diagnosis and treatment. Methods: a retrospective study including all the teleconsultations using the platform offered by a telehealth provider in Romania (Telios Care SA) between May 1, 2021- April 30, 2022, was performed. Binary data were analysed using the chi-square test with a significance level of p < 0.05. Results: out of 7163 consultations, 3961 were phone calls, 1981 were online messages, and 1221 were video calls. Onsite medical services were indicated in 3327 (46.44%) cases; the onsite investigations or the onsite visits were recommended for 2908 patients as follows: 2326 in case of phone calls, 582 in case of online messages, none in case of video calls. Both onsite investigations and visits were indicated for 419 patients. The need for onsite additional medical services was significantly higher in the case of phone calls than in the other 2 types of teleconsultations (Chi square= 1207.06, p= 0.00001). The indication for onsite services was done mainly after teleconsultations covering medical specialties (87.34%), significantly higher than the other specialties (Chi square=914.59, p=0.00001). Teleconsultations in surgical specialties and other fields (pharmacy, dentistry, psychology, wellbeing- nutrition, fitness) resulted in 12.13%, respective less than 1%, indication for onsite investigations or visits, explained by using of video calls in most of the cases. Conclusion: a further onsite medical service was necessary in less than a half of the teleconsultations. This indication was done mainly after phone calls and teleconsultations in medical specialties. Video calls were used mostly in psychology, nutrition, and fitness teleconsultations and did not require a further onsite medical service. Other studies are necessary to assess better the types of teleconsultations and the specialties bringing the biggest benefit for the patients.

Keywords: onsite medical services, phone calls, teleconsultations, telemedicine

Procedia PDF Downloads 75

4765 Grounded Theory of Consumer Loyalty: A Perspective through Video Game Addiction

Authors: Bassam Shaikh, R. S. A. Jumain

Abstract:

Game addiction has become an extremely important topic in psychology researchers, particularly in understanding and explaining why individuals become addicted (to video games). In previous studies, effect of online game addiction on social responsibilities, health problems, government action, and the behaviors of individuals to purchase and the causes of making individuals addicted on the video games has been discussed. Extending these concepts in marketing, it could be argued than the phenomenon could enlighten and extending our understanding on consumer loyalty. This study took the Grounded Theory approach, and found that motivation, satisfaction, fulfillments, exploration and achievements to be part of the important elements that builds consumer loyalty.

Keywords: grounded theory, consumer loyalty, video games, video game addiction

Procedia PDF Downloads 509

4764 Non-Invasive Data Extraction from Machine Display Units Using Video Analytics

Authors: Ravneet Kaur, Joydeep Acharya, Sudhanshu Gaur

Abstract:

Artificial Intelligence (AI) has the potential to transform manufacturing by improving shop floor processes such as production, maintenance and quality. However, industrial datasets are notoriously difficult to extract in a real-time, streaming fashion thus, negating potential AI benefits. The main example is some specialized industrial controllers that are operated by custom software which complicates the process of connecting them to an Information Technology (IT) based data acquisition network. Security concerns may also limit direct physical access to these controllers for data acquisition. To connect the Operational Technology (OT) data stored in these controllers to an AI application in a secure, reliable and available way, we propose a novel Industrial IoT (IIoT) solution in this paper. In this solution, we demonstrate how video cameras can be installed in a factory shop floor to continuously obtain images of the controller HMIs. We propose image pre-processing to segment the HMI into regions of streaming data and regions of fixed meta-data. We then evaluate the performance of multiple Optical Character Recognition (OCR) technologies such as Tesseract and Google vision to recognize the streaming data and test it for typical factory HMIs and realistic lighting conditions. Finally, we use the meta-data to match the OCR output with the temporal, domain-dependent context of the data to improve the accuracy of the output. Our IIoT solution enables reliable and efficient data extraction which will improve the performance of subsequent AI applications.

Keywords: human machine interface, industrial internet of things, internet of things, optical character recognition, video analytics

Procedia PDF Downloads 83

4763 E Learning/Teaching and the Impact on Student Performance at the Postgraduate Level

Authors: Charles Lemckert

Abstract:

E-Learning and E-Teaching can mean many things to different people. For some, the implication is that all material must be delivered in an E way, while for others it only forms part of the learning/teaching process, and (unfortunately) for some it is considered too much work. However, just look around and you will see all generations learning using E devices. In this study we used different forms of teaching, including E, to look at how students responded to set activities and how they performed academically. The particular context was set around a postgraduate university course where students were either present at a face-to-face intensive workshop (on water treatment plant design) or where they were not. For the latter, students needed to make sole use of E media. It is relevant to note that even though some were at the face-to-face class, they were still exposed to E material as the lecturer did use PC projections. Additionally, some also accessed the associate E material (pdf slides and video recordings) to assist their required activities. Analysis of the student performance, in their set assignment, showed that the actual form of delivery did not affect the student performance. This is because, in the end, all the students had access to the recorded/presented E material. The study also showed (somewhat expectedly) that when the material they required for the assignment was clear, the student performance did drop. Therefore, it is possible to enhance future delivery of courses through careful reflection and appropriate support. In the end, we must remember innovation is not just restricted to E.

Keywords: postgraduate, engineering, assignment, perforamance

Procedia PDF Downloads 303

4762 Tackling the Digital Divide: Enhancing Video Consultation Access for Digital Illiterate Patients in the Hospital

Authors: Wieke Ellen Bouwes

Abstract:

This study aims to unravel which factors enhance accessibility of video consultations (VCs) for patients with low digital literacy. Thirteen in-depth interviews with patients, hospital employees, eHealth experts, and digital support organizations were held. Patients with low digital literacy received in-home support during real-time video consultations and are observed during the set-up of these consultations. Key findings highlight the importance of patient acceptance, emphasizing video consultations benefits and avoiding standardized courses. The lack of a uniform video consultation system across healthcare providers poses a barrier. Familiarity with support organizations – to support patients in usage of digital tools - among healthcare practitioners enhances accessibility. Moreover, considerations regarding the Dutch General Data Protection Regulation (GDPR) law influence support patients receive. Also, provider readiness to use video consultations influences patient access. Further, alignment between learning styles and support methods seems to determine abilities to learn how to use video consultations. Future research could delve into tailored learning styles and technological solutions for remote access to further explore effectiveness of learning methods.

Keywords: video consultations, digital literacy skills, effectiveness of support, intra- and inter-organizational relationships, patient acceptance of video consultations

Procedia PDF Downloads 41

4761 The Effect of Video Games on English as a Foreign Language Students' Language Learning Motivation

Authors: Shamim Ali

Abstract:

Researchers and teachers have begun developing digital games and model environments for educational purpose; therefore this study examines the effect of a videos game on secondary school students’ language learning motivation. Secondly, it tries to find out the opportunities to develop a decision making process and simultaneously it analyzes the solutions for further implementation in educational setting. Participants were 30 male students randomly assigned to one of the following three treatments: 10 students were assigned to read the game’s story; 10 students were players, who played video game; and, and the last 10 students acted as watchers and observers, their duty was to watch their classmates play the digital video game. A language learning motivation scale was developed and it was given to the participants as a pre- and post-test. Results indicated a significant language learning motivation and the participants were quite motivated in the end. It is, thus, concluded that the use of video games can help enhance high school students’ language learning motivation. It was suggested that video games should be used as a complementary activity not as a replacement for textbook since excessive use of video games can divert the original purpose of learning.

Keywords: EFL, English as a Foreign Language, motivation, video games, EFL learners

Procedia PDF Downloads 152

4760 Content Based Video Retrieval System Using Principal Object Analysis

Authors: Van Thinh Bui, Anh Tuan Tran, Quoc Viet Ngo, The Bao Pham

Abstract:

Video retrieval is a searching problem on videos or clips based on content in which they are relatively close to an input image or video. The application of this retrieval consists of selecting video in a folder or recognizing a human in security camera. However, some recent approaches have been in challenging problem due to the diversity of video types, frame transitions and camera positions. Besides, that an appropriate measures is selected for the problem is a question. In order to overcome all obstacles, we propose a content-based video retrieval system in some main steps resulting in a good performance. From a main video, we process extracting keyframes and principal objects using Segmentation of Aggregating Superpixels (SAS) algorithm. After that, Speeded Up Robust Features (SURF) are selected from those principal objects. Then, the model “Bag-of-words” in accompanied by SVM classification are applied to obtain the retrieval result. Our system is performed on over 300 videos in diversity from music, history, movie, sports, and natural scene to TV program show. The performance is evaluated in promising comparison to the other approaches.

Keywords: video retrieval, principal objects, keyframe, segmentation of aggregating superpixels, speeded up robust features, bag-of-words, SVM

Procedia PDF Downloads 275

4759 Remote Video Supervision via DVB-H Channels

Authors: Hanen Ghabi, Youssef Oudhini, Hassen Mnif

Abstract:

By reference to recent publications dealing with the same problem, and as a follow-up to this research work already published, we propose in this article a new original idea of tele supervision exploiting the opportunities offered by the DVB-H system. The objective is to exploit the RF channels of the DVB-H network in order to insert digital remote monitoring images dedicated to a remote solar power plant. Indeed, the DVB-H (Digital Video Broadcast-Handheld) broadcasting system was designed and deployed for digital broadcasting on the same platform as the parent system, DVB-T. We claim to be able to exploit this approach in order to satisfy the operator of remote photovoltaic sites (and others) in order to remotely control the components of isolated installations by means of video surveillance.

Keywords: video surveillance, digital video broadcast-handheld, photovoltaic sites, AVC

Procedia PDF Downloads 149

4758 Video-Based Psychoeducation for Caregivers of Persons with Schizophrenia

Authors: Jilu David

Abstract:

Background: Schizophrenia is one of the most misunderstood mental illnesses across the globe. Lack of understanding about mental illnesses often delay treatment, severely affects the functionality of the person, and causes distress to the family. The study, Video-based Psychoeducation for Caregivers of Persons with Schizophrenia, consisted of developing a psychoeducational video about Schizophrenia, its symptoms, causes, treatment, and the importance of family support. Methodology: A quasi-experimental pre-post design was used to understand the feasibility of the study. Qualitative analysis strengthened the feasibility outcomes. Knowledge About Schizophrenia Interview was used to assess the level of knowledge of 10 participants, before and after the screening of the video. Results: Themes of usefulness, length, content, educational component, format of the intervention, and language emerged in the qualitative analysis. There was a statistically significant difference in the knowledge level of participants before and after the video screening. Conclusion: The statistical and qualitative analysis revealed that the video-based psychoeducation program was feasible and that it facilitated a general improvement in knowledge of the participants.

Keywords: Schizophrenia, mental illness, psychoeducation, video-based psychoeducation, family support

Procedia PDF Downloads 103

4757 Multimodal Database of Emotional Speech, Video and Gestures

Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari

Abstract:

People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several expression forms can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpora contains recordings registered in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data is labeled with six basic emotions categories, according to Ekman’s emotion categories. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to academic community and might be useful in the study on audio-visual emotion recognition.

Keywords: body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech

Procedia PDF Downloads 322

4756 The Video Database for Teaching and Learning in Football Refereeing

Authors: M. Armenteros, A. Domínguez, M. Fernández, A. J. Benítez

Abstract:

The following paper describes the video database tool used by the Fédération Internationale de Football Association (FIFA) as part of the research project developed in collaboration with the Carlos III University of Madrid. The database project began in 2012, with the aim of creating an educational tool for the training of instructors, referees and assistant referees, and it has been used in all FUTURO III courses since 2013. The platform now contains 3,135 video clips of different match situations from FIFA competitions. It has 1,835 users (FIFA instructors, referees and assistant referees). In this work, the main features of the database are described, such as the use of a search tool and the creation of multimedia presentations and video quizzes. The database has been developed in MySQL, ActionScript, Ruby on Rails and HTML. This tool has been rated by users as "very good" in all courses, which prompt us to introduce it as an ideal tool for any other sport that requires the use of video analysis.

Keywords: assistants referees, cloud computing, e-learning, instructors, FIFA, referees, soccer, video database

Procedia PDF Downloads 410

4755 The Effects of Watching Text-Relevant Video Segments with/without Subtitles on Vocabulary Development of Arabic as a Foreign Language Learners

Authors: Amirreza Karami, Hawraa Nafea Hameed Alzouwain, Freddie A. Bowles

Abstract:

This study investigates the effects of watching text-relevant video segments with/without subtitles on vocabulary development of Arabic as a Foreign Language (AFL) learners. The participants of the study were assigned to two groups: one control group and one experimental group. The control group received no video-based instruction while the experimental group watched a text-relevant video segment in three stages: pre, while, and post-instruction. The preliminary results of the pre-test and post-test show that watching text-relevant video segments through following a pre-while-post procedure can help the vocabulary development of AFL learners more than non-video-based instruction.

Keywords: text-relevant video segments, vocabulary development, Arabic as a Foreign Language, AFL, pre-while-post instruction

Procedia PDF Downloads 136

4754 Automated Video Surveillance System for Detection of Suspicious Activities during Academic Offline Examination

Authors: G. Sandhya Devi, G. Suvarna Kumar, S. Chandini

Abstract:

This research work aims to develop a system that will analyze and identify students who indulge in malpractices/suspicious activities during the course of an academic offline examination. Automated Video Surveillance provides an optimal solution which helps in monitoring the students and identifying the malpractice event immediately. This work is organized into three modules. The first module deals with performing an impersonation check using a PCA-based face recognition method which is done by cross checking his profile with the database. The presence or absence of the student is even determined in this module by implementing an image registration technique wherein a grid is formed by considering all the images registered using the frontal camera at the determined positions. Second, detecting such facial malpractices in which a student gets involved in conversation with another, trying to obtain unauthorized information etc., based on the threshold range evaluated by considering his/her mouth state whether open or closed. The third module deals with identification of unauthorized material or gadgets used in the examination hall by training the positive samples of the object through various stages. Here, a top view camera feed is analyzed to detect the suspicious activities. The system automatically alerts the administration when any suspicious activities are identified, thereby reducing the error rate caused due to manual monitoring. This work is an improvement over our previous work published in identifying suspicious activities done by examinees in an offline examination.

Keywords: impersonation, image registration, incrimination, object detection, threshold evaluation

Procedia PDF Downloads 201

4753 Human Action Recognition Using Variational Bayesian HMM with Dirichlet Process Mixture of Gaussian Wishart Emission Model

Authors: Wanhyun Cho, Soonja Kang, Sangkyoon Kim, Soonyoung Park

Abstract:

In this paper, we present the human action recognition method using the variational Bayesian HMM with the Dirichlet process mixture (DPM) of the Gaussian-Wishart emission model (GWEM). First, we define the Bayesian HMM based on the Dirichlet process, which allows an infinite number of Gaussian-Wishart components to support continuous emission observations. Second, we have considered an efficient variational Bayesian inference method that can be applied to drive the posterior distribution of hidden variables and model parameters for the proposed model based on training data. And then we have derived the predictive distribution that may be used to classify new action. Third, the paper proposes a process of extracting appropriate spatial-temporal feature vectors that can be used to recognize a wide range of human behaviors from input video image. Finally, we have conducted experiments that can evaluate the performance of the proposed method. The experimental results show that the method presented is more efficient with human action recognition than existing methods.

Keywords: human action recognition, Bayesian HMM, Dirichlet process mixture model, Gaussian-Wishart emission model, Variational Bayesian inference, prior distribution and approximate posterior distribution, KTH dataset

Procedia PDF Downloads 320

4752 The Study of Mirror Self-Recognition in Wildlife

Authors: Azwan Hamdan, Mohd Qayyum Ab Latip, Hasliza Abu Hassim, Tengku Rinalfi Putra Tengku Azizan, Hafandi Ahmad

Abstract:

Animal cognition provides some evidence for self-recognition, which is described as the ability to recognize oneself as an individual separate from the environment and other individuals. The mirror self-recognition (MSR) or mark test is a behavioral technique to determine whether an animal have the ability of self-recognition or self-awareness in front of the mirror. It also describes the capability for an animal to be aware of and make judgments about its new environment. Thus, the objectives of this study are to measure and to compare the ability of wild and captive wildlife in mirror self-recognition. Wild animals from the Royal Belum Rainforest Malaysia were identified based on the animal trails and salt lick grounds. Acrylic mirrors with wood frame (200 x 250cm) were located near to animal trails. Camera traps (Bushnell, UK) with motion-detection infrared sensor are placed near the animal trails or hiding spot. For captive wildlife, animals such as Malayan sun bear (Helarctos malayanus) and chimpanzee (Pan troglodytes) were selected from Zoo Negara Malaysia. The captive animals were also marked using odorless and non-toxic white paint on its forehead. An acrylic mirror with wood frame (200 x 250cm) and a video camera were placed near the cage. The behavioral data were analyzed using ethogram and classified through four stages of MSR; social responses, physical inspection, repetitive mirror-testing behavior and realization of seeing themselves. Results showed that wild animals such as barking deer (Muntiacus muntjak) and long-tailed macaque (Macaca fascicularis) increased their physical inspection (e.g inspecting the reflected image) and repetitive mirror-testing behavior (e.g rhythmic head and leg movement). This would suggest that the ability to use a mirror is most likely related to learning process and cognitive evolution in wild animals. However, the sun bear’s behaviors were inconsistent and did not clearly undergo four stages of MSR. This result suggests that when keeping Malayan sun bear in captivity, it may promote communication and familiarity between conspecific. Interestingly, chimp has positive social response (e.g manipulating lips) and physical inspection (e.g using hand to inspect part of the face) when they facing a mirror. However, both animals did not show any sign towards the mark due to lost of interest in the mark and realization that the mark is inconsequential. Overall, the results suggest that the capacity for MSR is the beginning of a developmental process of self-awareness and mental state attribution. In addition, our findings show that self-recognition may be based on different complex neurological and level of encephalization in animals. Thus, research on self-recognition in animals will have profound implications in understanding the cognitive ability of an animal as an effort to help animals, such as enhanced management, design of captive individuals’ enclosures and exhibits, and in programs to re-establish populations of endangered or threatened species.

Keywords: mirror self-recognition (MSR), self-recognition, self-awareness, wildlife

Procedia PDF Downloads 241

4751 Texture-Based Image Forensics from Video Frame

Authors: Li Zhou, Yanmei Fang

Abstract:

With current technology, images and videos can be obtained more easily than ever. It is so easy to manipulate these digital multimedia information when obtained, and that the content or source of the image and video could be easily tampered. In this paper, we propose to identify the image and video frame by the texture-based approach, e.g. Markov Transition Probability (MTP), which is in space domain, DCT domain and DWT domain, respectively. In the experiment, image and video frame database is constructed, and is used to train and test the classifier Support Vector Machine (SVM). Experiment results show that the texture-based approach has good performance. In order to verify the experiment result, and testify the universality and robustness of algorithm, we build a random testing dataset, the random testing result is in keeping with above experiment.

Keywords: multimedia forensics, video frame, LBP, MTP, SVM

Procedia PDF Downloads 400

4750 The Effect of Pixelation on Face Detection: Evidence from Eye Movements

Authors: Kaewmart Pongakkasira

Abstract:

This study investigated how different levels of pixelation affect face detection in natural scenes. Eye movements and reaction times, while observers searched for faces in natural scenes rendered in different ranges of pixels, were recorded. Detection performance for coarse visual detail at lower pixel size (3 x 3) was better than with very blurred detail carried by higher pixel size (9 x 9). The result is consistent with the notion that face detection relies on gross detail information of face-shape template, containing crude shape structure and features. In contrast, detection was impaired when face shape and features are obscured. However, it was considered that the degradation of scenic information might also contribute to the effect. In the next experiment, a more direct measurement of the effect of pixelation on face detection, only the embedded face photographs, but not the scene background, will be filtered.

Keywords: eye movements, face detection, face-shape information, pixelation

Procedia PDF Downloads 288

4749 Adaptive Multiple Transforms Hardware Architecture for Versatile Video Coding

Authors: T. Damak, S. Houidi, M. A. Ben Ayed, N. Masmoudi

Abstract:

The Versatile Video Coding standard (VVC) is actually under development by the Joint Video Exploration Team (or JVET). An Adaptive Multiple Transforms (AMT) approach was announced. It is based on different transform modules that provided an efficient coding. However, the AMT solution raises several issues especially regarding the complexity of the selected set of transforms. This can be an important issue, particularly for a future industrial adoption. This paper proposed an efficient hardware implementation of the most used transform in AMT approach: the DCT II. The developed circuit is adapted to different block sizes and can reach a minimum frequency of 192 MHz allowing an optimized execution time.

Keywords: adaptive multiple transforms, AMT, DCT II, hardware, transform, versatile video coding, VVC

Procedia PDF Downloads 119

4748 Effect of Educational Information with Video Compact Disc on Anxiety Level in Patients Undergoing Bronchoscopy in Ramathibodi Hospital

Authors: Chariya Laohavich, Viboon Bunsrangsuk

Abstract:

Objective: Bronchoscopy is a common outpatient procedure. The authors compared the patient anxiety level before and after received video-assisted procedural information. Method: One hundred and twenty patients who never received bronchoscopy and scheduled for elective bronchoscopy at outpatient Bronchosope unit at Ramathibodi Hospital, Mahidol University were randomized into control and intervention group. Video-assisted procedural information was given in intervention group. Pre and post procedural anxiety score were recorded and compared between two groups. Paired T-test was used for statistical analysis. Result: There was statistically significant decrease (p < 0.001) for anxiety score in patients who received video assisted procedural information compare with control group. Conclusion: Video-assisted procedural information should be given to patient who will have bronchoscopy to reduce anxiety.

Keywords: anxiety, bronchoscopy, video compact disc (VCD)

Procedia PDF Downloads 322

4747 Going Viral: Constructively Aligning the Use of Digital Video to Effectively Support Faculty Development

Authors: Samuel Olugbenga King

Abstract:

This review article, which is a synthesis of the relevant research literature, focuses on the capabilities of digital video to support, facilitate and enhance faculty development. Based on the literature review, faculty development (i.e., academic or educational development) requires the continued adoption of cohesive, theoretical frameworks to guide research and practice; incorporation of relevant tools from analogous fields, such as teacher professional development; systematic program evaluations; and detailed descriptions of practice to further practice and creative development. A cohesive, five-heuristic framework is subsequently outlined to inform the design and evaluation of the use of digital video, so as to address the barriers to advancing faculty development, as identified through the literature review. Alternative impact evaluation approaches are also described, while the limitations of using digital video for faculty development are highlighted. This paper is therefore conceived as one way to meaningfully leverage the educational affordances of digital video to address some lingering gaps in faculty development.

Keywords: digital video, faculty/educational development, evaluation, scholarship of teaching and learning (SoTL)

Procedia PDF Downloads 322

4746 Possibilities, Challenges and the State of the Art of Automatic Speech Recognition in Air Traffic Control

Authors: Van Nhan Nguyen, Harald Holone

Abstract:

Over the past few years, a lot of research has been conducted to bring Automatic Speech Recognition (ASR) into various areas of Air Trafﬁc Control (ATC), such as air trafﬁc control simulation and training, monitoring live operators for with the aim of safety improvements, air trafﬁc controller workload measurement and conducting analysis on large quantities controller-pilot speech. Due to the high accuracy requirements of the ATC context and its unique challenges, automatic speech recognition has not been widely adopted in this ﬁeld. With the aim of providing a good starting point for researchers who are interested bringing automatic speech recognition into ATC, this paper gives an overview of possibilities and challenges of applying automatic speech recognition in air trafﬁc control. To provide this overview, we present an updated literature review of speech recognition technologies in general, as well as speciﬁc approaches relevant to the ATC context. Based on this literature review, criteria for selecting speech recognition approaches for the ATC domain are presented, and remaining challenges and possible solutions are discussed.

Keywords: automatic speech recognition, asr, air traffic control, atc

Procedia PDF Downloads 364

4745 Efficient Video Compression Technique Using Convolutional Neural Networks and Generative Adversarial Network

Authors: P. Karthick, K. Mahesh

Abstract:

Video has become an increasingly significant component of our digital everyday contact. With the advancement of greater contents and shows of the resolution, its significant volume poses serious obstacles to the objective of receiving, distributing, compressing, and revealing video content of high quality. In this paper, we propose the primary beginning to complete a deep video compression model that jointly upgrades all video compression components. The video compression method involves splitting the video into frames, comparing the images using convolutional neural networks (CNN) to remove duplicates, repeating the single image instead of the duplicate images by recognizing and detecting minute changes using generative adversarial network (GAN) and recorded with long short-term memory (LSTM). Instead of the complete image, the small changes generated using GAN are substituted, which helps in frame level compression. Pixel wise comparison is performed using K-nearest neighbours (KNN) over the frame, clustered with K-means, and singular value decomposition (SVD) is applied for each and every frame in the video for all three color channels [Red, Green, Blue] to decrease the dimension of the utility matrix [R, G, B] by extracting its latent factors. Video frames are packed with parameters with the aid of a codec and converted to video format, and the results are compared with the original video. Repeated experiments on several videos with different sizes, duration, frames per second (FPS), and quality results demonstrate a significant resampling rate. On average, the result produced had approximately a 10% deviation in quality and more than 50% in size when compared with the original video.

Keywords: video compression, K-means clustering, convolutional neural network, generative adversarial network, singular value decomposition, pixel visualization, stochastic gradient descent, frame per second extraction, RGB channel extraction, self-detection and deciding system

Procedia PDF Downloads 161

4744 A Contribution to Human Activities Recognition Using Expert System Techniques

Authors: Malika Yaici, Soraya Aloui, Sara Semchaoui

Abstract:

This paper deals with human activity recognition from sensor data. It is an active research area, and the main objective is to obtain a high recognition rate. In this work, a recognition system based on expert systems is proposed; the recognition is performed using the objects, object states, and gestures and taking into account the context (the location of the objects and of the person performing the activity, the duration of the elementary actions and the activity). The system recognizes complex activities after decomposing them into simple, easy-to-recognize activities. The proposed method can be applied to any type of activity. The simulation results show the robustness of our system and its speed of decision.

Keywords: human activity recognition, ubiquitous computing, context-awareness, expert system

Procedia PDF Downloads 68