Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 41162

Search results for: big video data analysis

41042 High Level Synthesis of Canny Edge Detection Algorithm on Zynq Platform

Authors: Hanaa M. Abdelgawad, Mona Safar, Ayman M. Wahba

Abstract:

Real-time image and video processing is a demand in many computer vision applications, e.g. video surveillance, traffic management and medical imaging. The processing of those video applications requires high computational power. Therefore, the optimal solution is the collaboration of CPU and hardware accelerators. In this paper, a Canny edge detection hardware accelerator is proposed. Canny edge detection is one of the common blocks in the pre-processing phase of image and video processing pipeline. Our presented approach targets offloading the Canny edge detection algorithm from processing system (PS) to programmable logic (PL) taking the advantage of High Level Synthesis (HLS) tool flow to accelerate the implementation on Zynq platform. The resulting implementation enables up to a 100x performance improvement through hardware acceleration. The CPU utilization drops down and the frame rate jumps to 60 fps of 1080p full HD input video stream.

Keywords: high level synthesis, canny edge detection, hardware accelerators, computer vision

Procedia PDF Downloads 455

41041 Constructivist Design Approaches to Video Production for Distance Education in Business and Economics

Authors: C. von Essen

Abstract:

This study outlines and evaluates a constructivist design approach to the creation of educational video on postgraduate business degree programmes. Many online courses are tapping into the educational affordances of video, as this form of online learning has the potential to create rich, multimodal experiences. And yet, in many learning contexts video is still being used to transmit instruction to passive learners, rather than promote learner engagement and knowledge creation. Constructivism posits the notion that learning is shaped as students make connections between their experiences and ideas. This paper pivots on the following research question: how can we design educational video in ways which promote constructivist learning and stimulate analytic viewing? By exploring and categorizing over two thousand educational videos created since 2014 for over thirty postgraduate courses in business, economics, mathematics and statistics, this paper presents and critically reflects on a taxonomy of video styles and features. It links the pedagogical intent of video – be it concept explanation, skill demonstration, feedback, real-world application of ideas, community creation, or the cultivation of course narrative – to specific presentational characteristics such as visual effects including diagrammatic and real-life graphics and aminations, commentary and sound options, chronological sequencing, interactive elements, and presenter set-up. The findings of this study inform a framework which captures the pedagogical, technological and production considerations instructional designers and educational media specialists should be conscious of when planning and preparing the video. More broadly, the paper demonstrates how learning theory and technology can coalesce to produce informed and pedagogical grounded instructional design choices. This paper reveals how crafting video in a more conscious and critical manner can produce powerful, new educational design.

Keywords: educational video, constructivism, instructional design, business education

Procedia PDF Downloads 208

41040 Motion Estimator Architecture with Optimized Number of Processing Elements for High Efficiency Video Coding

Authors: Seongsoo Lee

Abstract:

Motion estimation occupies the heaviest computation in HEVC (high efficiency video coding). Many fast algorithms such as TZS (test zone search) have been proposed to reduce the computation. Still the huge computation of the motion estimation is a critical issue in the implementation of HEVC video codec. In this paper, motion estimator architecture with optimized number of PEs (processing element) is presented by exploiting early termination. It also reduces hardware size by exploiting parallel processing. The presented motion estimator architecture has 8 PEs, and it can efficiently perform TZS with very high utilization of PEs.

Keywords: motion estimation, test zone search, high efficiency video coding, processing element, optimization

Procedia PDF Downloads 336

41039 Forensic Comparison of Facial Images for Human Identification

Authors: D. P. Gangwar

Abstract:

Identification of human through facial images has got great importance in forensic science. The video recordings, CCTV footage, passports, driver licenses and other related documents are invariably sent to the laboratory for comparison of the questioned photographs as well as video recordings with suspected photographs/recordings to prove the identity of a person. More than 300 questioned and 300 control photographs received in actual crime cases, received from various investigation agencies, have been compared by me so far using various familiar analysis and comparison techniques such as Holistic comparison, Morphological analysis, Photo-anthropometry and superimposition. On the basis of findings obtained during the examination huge photo exhibits, a realistic and comprehensive technique has been proposed which could be very useful for forensic.

Keywords: CCTV Images, facial features, photo-anthropometry, superimposition

Procedia PDF Downloads 507

41038 Object Trajectory Extraction by Using Mean of Motion Vectors Form Compressed Video Bitstream

Authors: Ching-Ting Hsu, Wei-Hua Ho, Yi-Chun Chang

Abstract:

Video object tracking is one of the popular research topics in computer graphics area. The trajectory can be applied in security, traffic control, even the sports training. The trajectory for sports training can be utilized to analyze the athlete’s performance without traditional sensors. There are many relevant works which utilize mean shift algorithm with background subtraction. This kind of the schemes should select a kernel function which may affect the accuracy and performance. In this paper, we consider the motion information in the pre-coded bitstream. The proposed algorithm extracts the trajectory by composing the motion vectors from the pre-coded bitstream. We gather the motion vectors from the overlap area of the object and calculate mean of the overlapped motion vectors. We implement and simulate our proposed algorithm in H.264 video codec. The performance is better than relevant works and keeps the accuracy of the object trajectory. The experimental results show that the proposed trajectory extraction can extract trajectory form the pre-coded bitstream in high accuracy and achieve higher performance other relevant works.

Keywords: H.264, video bitstream, video object tracking, sports training

Procedia PDF Downloads 407

41037 VIAN-DH: Computational Multimodal Conversation Analysis Software and Infrastructure

Authors: Teodora Vukovic, Christoph Hottiger, Noah Bubenhofer

Abstract:

The development of VIAN-DH aims at bridging two linguistic approaches: conversation analysis/interactional linguistics (IL), so far a dominantly qualitative field, and computational/corpus linguistics and its quantitative and automated methods. Contemporary IL investigates the systematic organization of conversations and interactions composed of speech, gaze, gestures, and body positioning, among others. These highly integrated multimodal behaviour is analysed based on video data aimed at uncovering so called “multimodal gestalts”, patterns of linguistic and embodied conduct that reoccur in specific sequential positions employed for specific purposes. Multimodal analyses (and other disciplines using videos) are so far dependent on time and resource intensive processes of manual transcription of each component from video materials. Automating these tasks requires advanced programming skills, which is often not in the scope of IL. Moreover, the use of different tools makes the integration and analysis of different formats challenging. Consequently, IL research often deals with relatively small samples of annotated data which are suitable for qualitative analysis but not enough for making generalized empirical claims derived quantitatively. VIAN-DH aims to create a workspace where many annotation layers required for the multimodal analysis of videos can be created, processed, and correlated in one platform. VIAN-DH will provide a graphical interface that operates state-of-the-art tools for automating parts of the data processing. The integration of tools that already exist in computational linguistics and computer vision, facilitates data processing for researchers lacking programming skills, speeds up the overall research process, and enables the processing of large amounts of data. The main features to be introduced are automatic speech recognition for the transcription of language, automatic image recognition for extraction of gestures and other visual cues, as well as grammatical annotation for adding morphological and syntactic information to the verbal content. In the ongoing instance of VIAN-DH, we focus on gesture extraction (pointing gestures, in particular), making use of existing models created for sign language and adapting them for this specific purpose. In order to view and search the data, VIAN-DH will provide a unified format and enable the import of the main existing formats of annotated video data and the export to other formats used in the field, while integrating different data source formats in a way that they can be combined in research. VIAN-DH will adapt querying methods from corpus linguistics to enable parallel search of many annotation levels, combining token-level and chronological search for various types of data. VIAN-DH strives to bring crucial and potentially revolutionary innovation to the field of IL, (that can also extend to other fields using video materials). It will allow the processing of large amounts of data automatically and, the implementation of quantitative analyses, combining it with the qualitative approach. It will facilitate the investigation of correlations between linguistic patterns (lexical or grammatical) with conversational aspects (turn-taking or gestures). Users will be able to automatically transcribe and annotate visual, spoken and grammatical information from videos, and to correlate those different levels and perform queries and analyses.

Keywords: multimodal analysis, corpus linguistics, computational linguistics, image recognition, speech recognition

Procedia PDF Downloads 80

41036 Normalized Compression Distance Based Scene Alteration Analysis of a Video

Authors: Lakshay Kharbanda, Aabhas Chauhan

Abstract:

In this paper, an application of Normalized Compression Distance (NCD) to detect notable scene alterations occurring in videos is presented. Several research groups have been developing methods to perform image classification using NCD, a computable approximation to Normalized Information Distance (NID) by studying the degree of similarity in images. The timeframes where significant aberrations between the frames of a video have occurred have been identified by obtaining a threshold NCD value, using two compressors: LZMA and BZIP2 and defining scene alterations using Pixel Difference Percentage metrics.

Keywords: image compression, Kolmogorov complexity, normalized compression distance, root mean square error

Procedia PDF Downloads 307

41035 Kinematic Analysis of Human Gait for Typical Postures of Walking, Running and Cart Pulling

Authors: Nupur Karmaker, Hasin Aupama Azhari, Abdul Al Mortuza, Abhijit Chanda, Golam Abu Zakaria

Abstract:

Purpose: The purpose of gait analysis is to determine the biomechanics of the joint, phases of gait cycle, graphical and analytical analysis of degree of rotation, analysis of the electrical activity of muscles and force exerted on the hip joint at different locomotion during walking, running and cart pulling. Methods and Materials: Visual gait analysis and electromyography method has been used to detect the degree of rotation of joints and electrical activity of muscles. In cinematography method an object is observed from different sides and takes its video. Cart pulling length has been divided into frames with respect to time by using video splitter software. Phases of gait cycle, degree of rotation of joints, EMG profile and force analysis during walking and running has been taken from different papers. Gait cycle and degree of rotation of joints during cart pulling has been prepared by using video camera, stop watch, video splitter software and Microsoft Excel. Results and Discussion: During the cart pulling the force exerted on hip is the resultant of various forces. The force on hip is the vector sum of the force Fg= mg, due the body of weight of the person and Fa= ma, due to the velocity. Maximum stance phase shows during cart pulling and minimum shows during running. During cart pulling shows maximum degree of rotation of hip joint, knee: running, and ankle: cart pulling. During walking, it has been observed minimum degree of rotation of hip, ankle: during running. During cart pulling, dynamic force depends on the walking velocity, body weight and load weight. Conclusions: 80% people suffer gait related disease with increasing their age. Proper care should take during cart pulling. It will be better to establish the gait laboratory to determine the gait related diseases. If the way of cart pulling is changed i.e the design of cart pulling machine, load bearing system is changed then it would possible to reduce the risk of limb loss, flat foot syndrome and varicose vein in lower limb.

Keywords: kinematic, gait, gait lab, phase, force analysis

Procedia PDF Downloads 555

41034 The Application of Video Segmentation Methods for the Purpose of Action Detection in Videos

Authors: Nassima Noufail, Sara Bouhali

Abstract:

In this work, we develop a semi-supervised solution for the purpose of action detection in videos and propose an efficient algorithm for video segmentation. The approach is divided into video segmentation, feature extraction, and classification. In the first part, a video is segmented into clips, and we used the K-means algorithm for this segmentation; our goal is to find groups based on similarity in the video. The application of k-means clustering into all the frames is time-consuming; therefore, we started by the identification of transition frames where the scene in the video changes significantly, and then we applied K-means clustering into these transition frames. We used two image filters, the gaussian filter and the Laplacian of Gaussian. Each filter extracts a set of features from the frames. The Gaussian filter blurs the image and omits the higher frequencies, and the Laplacian of gaussian detects regions of rapid intensity changes; we then used this vector of filter responses as an input to our k-means algorithm. The output is a set of cluster centers. Each video frame pixel is then mapped to the nearest cluster center and painted with a corresponding color to form a visual map. The resulting visual map had similar pixels grouped. We then computed a cluster score indicating how clusters are near each other and plotted a signal representing frame number vs. clustering score. Our hypothesis was that the evolution of the signal would not change if semantically related events were happening in the scene. We marked the breakpoints at which the root mean square level of the signal changes significantly, and each breakpoint is an indication of the beginning of a new video segment. In the second part, for each segment from part 1, we randomly selected a 16-frame clip, then we extracted spatiotemporal features using convolutional 3D network C3D for every 16 frames using a pre-trained model. The C3D final output is a 512-feature vector dimension; hence we used principal component analysis (PCA) for dimensionality reduction. The final part is the classification. The C3D feature vectors are used as input to a multi-class linear support vector machine (SVM) for the training model, and we used a multi-classifier to detect the action. We evaluated our experiment on the UCF101 dataset, which consists of 101 human action categories, and we achieved an accuracy that outperforms the state of art by 1.2%.

Keywords: video segmentation, action detection, classification, Kmeans, C3D

Procedia PDF Downloads 51

41033 Evaluation of Video Quality Metrics and Performance Comparison on Contents Taken from Most Commonly Used Devices

Authors: Pratik Dhabal Deo, Manoj P.

Abstract:

With the increasing number of social media users, the amount of video content available has also significantly increased. Currently, the number of smartphone users is at its peak, and many are increasingly using their smartphones as their main photography and recording devices. There have been a lot of developments in the field of Video Quality Assessment (VQA) and metrics like VMAF, SSIM etc. are said to be some of the best performing metrics, but the evaluation of these metrics is dominantly done on professionally taken video contents using professional tools, lighting conditions etc. No study particularly pinpointing the performance of the metrics on the contents taken by users on very commonly available devices has been done. Datasets that contain a huge number of videos from different high-end devices make it difficult to analyze the performance of the metrics on the content from most used devices even if they contain contents taken in poor lighting conditions using lower-end devices. These devices face a lot of distortions due to various factors since the spectrum of contents recorded on these devices is huge. In this paper, we have presented an analysis of the objective VQA metrics on contents taken only from most used devices and their performance on them, focusing on full-reference metrics. To carry out this research, we created a custom dataset containing a total of 90 videos that have been taken from three most commonly used devices, and android smartphone, an IOS smartphone and a DSLR. On the videos taken on each of these devices, the six most common types of distortions that users face have been applied on addition to already existing H.264 compression based on four reference videos. These six applied distortions have three levels of degradation each. A total of the five most popular VQA metrics have been evaluated on this dataset and the highest values and the lowest values of each of the metrics on the distortions have been recorded. Finally, it is found that blur is the artifact on which most of the metrics didn’t perform well. Thus, in order to understand the results better the amount of blur in the data set has been calculated and an additional evaluation of the metrics was done using HEVC codec, which is the next version of H.264 compression, on the camera that proved to be the sharpest among the devices. The results have shown that as the resolution increases, the performance of the metrics tends to become more accurate and the best performing metric among them is VQM with very few inconsistencies and inaccurate results when the compression applied is H.264, but when the compression is applied is HEVC, SSIM and VMAF have performed significantly better.

Keywords: distortion, metrics, performance, resolution, video quality assessment

Procedia PDF Downloads 182

41032 Developing Communicative Skills in Foreign Languages by Video Tasks

Authors: Ekaterina G. Lipatova

Abstract:

The developing potential of a video task in teaching foreign languages involves the opportunities to improve four aspects of speech production process: listening, reading, speaking and writing. A video represents the sequence of actions, realized in the pictures logically connected and verbalized speech flow that simplifies and stimulates the process of perception. In this connection listening skills of students are developed effectively as well as their intellectual properties such as synthesizing, analyzing and generalizing the information. In terms of teaching capacity, a video task, in our opinion, is more stimulating than a traditional listening, since it involves the student into the plot of the communicative situation, emotional background and potentially makes them react to the gist in the cognitive and communicative ways. To be an effective method of teaching the video task should be structured in the way of psycho-linguistic characteristics of speech production process, in other words, should include three phases: before-watching, while-watching and after-watching. The system of tasks provided to each phase might involve the situations on reflecting to the video content in the forms of filling-the-gap tasks, multiple choice, True-or-False tasks (reading skills), exercises on expressing the opinion, project fulfilling (writing and speaking skills). In the before-watching phase we offer the students to adjust their perception mechanism to the topic and the problem of the chosen video by such task as “what do you know about such a problem?”, “is it new for you?”, “have you ever faced the situation of…?”. Then we proceed with the lexical and grammatical analysis of language units that form the body of a speech sample to lessen the perception and develop the student’s lexicon. The goal of while-watching phase is to build the student’s awareness about the problem presented in the video and challenge their inner attitude towards what they have seen by identifying the mistakes in the statements about the video content or making the summary, justifying their understanding. Finally, we move on to development of their speech skills within the communicative situation they observed and learnt by stimulating them to search the similar ideas in their backgrounds and represent them orally or in the written form or express their own opinion on the problem. It is compulsory to highlight, that a video task should contain the urgent, valid and interesting event related to the future profession of the student, since it will help to activate cognitive, emotional, verbal and ethic capacity of students. Also, logically structured video tasks are easily integrated into the system of e-learning and can provide the opportunity for the students to work with the foreign language on their own.

Keywords: communicative situation, perception mechanism, speech production process, speech skills

Procedia PDF Downloads 224

41031 A Framework for Rating Synchronous Video E-Learning Applications

Authors: Alex Vakaloudis, Juan Manuel Escano-Gonzalez

Abstract:

Setting up a system to broadcast live lectures on the web is a procedure which on the surface does not require any serious technical skills mainly due to the facilities provided by popular learning management systems and their plugins. Nevertheless, producing a system of outstanding quality is a multidisciplinary and by no means a straightforward task. This complicatedness may be responsible for the delivery of an overall poor experience to the learners, and it calls for a formal rating framework that takes into account the diverse aspects of an architecture for synchronous video e-learning systems. We discuss the specifications of such a framework which at its final stage employs fuzzy logic technique to transform from qualitative to quantitative results.

Keywords: synchronous video, fuzzy logic, rating framework, e-learning

Procedia PDF Downloads 533

41030 A Study on the Relationship Between Adult Videogaming and Wellbeing, Health, and Labor Supply

Authors: William Marquis, Fang Dong

Abstract:

There has been a growing concern in recent years over the economic and social effects of adult video gaming. It has been estimated that the number of people who played video games during the COVID-19 pandemic is close to three billion, and there is evidence that this form of entertainment is here to stay. Many people are concerned that this growing use of time could crowd out time that could be spent on alternative forms of entertainment with family, friends, sports, and other social activities that build community. For example, recent studies of children suggest that playing videogames crowds out time that could be spent on homework, watching TV, or in other social activities. Similar studies of adults have shown that video gaming is negatively associated with earnings, time spent at work, and socializing with others. The primary objective of this paper is to examine how time adults spend on video gaming could displace time they could spend working and on activities that enhance their health and well-being. We use data from the American Time Use Survey (ATUS), maintained by the Bureau of Labor Statistics, to analyze the effects of time-use decisions on three measures of well-being. We pool the ATUS Well-being Module for multiple years, 2010, 2012, 2013, and 2021, along with the ATUS Activity and Who files for these years. This pooled data set provides three broad measures of well-being, e.g., health, life satisfaction, and emotional well-being. Seven variants of each are used as a dependent variable in different multivariate regressions. We add to the existing literature in the following ways. First, we investigate whether the time adults spend in video gaming crowds out time spent working or in social activities that promote health and life satisfaction. Second, we investigate the relationship between adult gaming and their emotional well-being, also known as negative or positive affect, a factor that is related to depression, health, and labor market productivity. The results of this study suggest that the time adult gamers spend on video gaming has no effect on their supply of labor, a negligible effect on their time spent socializing and studying, and mixed effects on their emotional well-being, such as increasing feelings of pain and reducing feelings of happiness and stress.

Keywords: online gaming, health, social capital, emotional wellbeing

Procedia PDF Downloads 20

41029 Managing Type 1 Diabetes in College: A Thematic Analysis of Online Narratives Posted on YouTube

Authors: Ekaterina Malova

Abstract:

Type 1 diabetes (T1D) is a chronic illness requiring immense lifestyle changes to reduce the chance of life-threatening complications. Moving to a college may be the first time for a young adult with T1D to take responsibility for all the aspects of their diabetes care. In addition, people with T1D constantly face stigmatization and discrimination as a result of their health condition, which puts additional pressure on young adults with T1D. Hence, omissions in diabetes self-care often occur during the time of transition to college when both the social and physical environment of young adults changes drastically and contribute to the fact that emerging young adults remain one of the age groups with the highest hemoglobin levels and poorest diabetes control. However, despite potential severe health risks caused by a lack of proper diabetes self-care, little is known about the experiences of emerging adults embarking on a higher education journey as this population. Thus, young adults with type 1 diabetes are a 'forgotten group,' meaning that their experiences are rarely addressed by researchers. Given that self-disclosure and information-seeking can be challenging for individuals with stigmatized illnesses, online platforms like YouTube have become a popular medium of self-disclosure and information-seeking for people living with T1D. Thus, this study aims to provide an analysis of experiences that college students with T1D choose to share with the general public online and explore the nature of information being communicated by college students with T1D to the online community in personal narratives posted on YouTube. A systematic approach was used to retrieve a video sample by searching YouTube with keywords 'type 1 diabetes' and 'college,' with results ordered by relevance. A total of 18 videos were saved. Video lengths ranged from 2 to 28 minutes. The data were coded using NVivo. Video transcripts were coded and analyzed utilizing the thematic analysis method. Three key themes emerged from thematic analysis: 1) Advice, 2) Personal experience, and 3) Things I wish everyone knew about T1D. In addition, Theme 1 was divided into subtopics to differentiate between the most common types of advice: 1) Overcoming stigma and b) Seeking social support. The identified themes indicate that two groups of the population can potentially benefit from watching students’ video testimonies: 1) lay public and 2) other students with T1D. Given that students in the videos reported a lack of T1D education in the lay public, such video narratives can serve important educational purposes and reduce health stigma, while perceived similarity and identification with students in the videos may facilitate the transition of health information to other individuals with T1D and positively affect their diabetes routine. Thus, online video narratives can potentially serve both educational and persuasive purposes, empowering students with T1D to stay in control of T1D while succeeding academically.

Keywords: type 1 diabetes, college students, health communication, transition period

Procedia PDF Downloads 132

41028 Cross-Tier Collaboration between Preservice and Inservice Language Teachers in Designing Online Video-Based Pragmatic Assessment

Authors: Mei-Hui Liu

Abstract:

This paper reports the progression of language teachers’ learning to assess students’ speech act performance via online videos in a cross-tier professional growth community. This yearlong research project collected multiple data sources from several stakeholders, including 12 preservice and 4 inservice English as a foreign language (EFL) teachers, 4 English professionals, and 82 high school students. Data sources included surveys, (focus group) interviews, online reflection journals, online video-based assessment items/scores, and artifacts related to teacher professional learning. The major findings depicted the effectiveness of this proposed learning module on language teacher development in pragmatic assessment as well as its impact on student learning experience. All these teachers appreciated this professional learning experience which enhanced their knowledge in assessing students’ pragmalinguistic and sociopragmatic performance in an English speech act (i.e., making refusals). They learned how to design online video-based assessment items by attending to specific linguistic structures, semantic formula, and sociocultural issues. They further became aware of how to sharpen pragmatic instructional skills in the near future after putting theories into online assessment and related classroom practices. Additionally, data analysis revealed students’ achievement in and satisfaction with the designed online assessment. Yet, during the professional learning process most participating teachers encountered challenges in reaching a consensus on selecting appropriate video clips from available sources to present the sociocultural values in English-speaking refusal contexts. Also included was to construct test items which could testify the influence of interlanguage transfer on students’ pragmatic performance in various conversational scenarios. With pedagogical implications and research suggestions, this study adds to the increasing amount of research into integrating preservice and inservice EFL teacher education in pragmatic assessment and relevant instruction. Acknowledgment: This research project is sponsored by the Ministry of Science and Technology in the Republic of China under the grant number of MOST 106-2410-H-029-038.

Keywords: cross-tier professional development, inservice EFL teachers, pragmatic assessment, preservice EFL teachers, student learning experience

Procedia PDF Downloads 232

41027 A Normalized Non-Stationary Wavelet Based Analysis Approach for a Computer Assisted Classification of Laryngoscopic High-Speed Video Recordings

Authors: Mona K. Fehling, Jakob Unger, Dietmar J. Hecker, Bernhard Schick, Joerg Lohscheller

Abstract:

Voice disorders origin from disturbances of the vibration patterns of the two vocal folds located within the human larynx. Consequently, the visual examination of vocal fold vibrations is an integral part within the clinical diagnostic process. For an objective analysis of the vocal fold vibration patterns, the two-dimensional vocal fold dynamics are captured during sustained phonation using an endoscopic high-speed camera. In this work, we present an approach allowing a fully automatic analysis of the high-speed video data including a computerized classification of healthy and pathological voices. The approach bases on a wavelet-based analysis of so-called phonovibrograms (PVG), which are extracted from the high-speed videos and comprise the entire two-dimensional vibration pattern of each vocal fold individually. Using a principal component analysis (PCA) strategy a low-dimensional feature set is computed from each phonovibrogram. From the PCA-space clinically relevant measures can be derived that quantify objectively vibration abnormalities. In the first part of the work it will be shown that, using a machine learning approach, the derived measures are suitable to distinguish automatically between healthy and pathological voices. Within the approach the formation of the PCA-space and consequently the extracted quantitative measures depend on the clinical data, which were used to compute the principle components. Therefore, in the second part of the work we proposed a strategy to achieve a normalization of the PCA-space by registering the PCA-space to a coordinate system using a set of synthetically generated vibration patterns. The results show that owing to the normalization step potential ambiguousness of the parameter space can be eliminated. The normalization further allows a direct comparison of research results, which bases on PCA-spaces obtained from different clinical subjects.

Keywords: Wavelet-based analysis, Multiscale product, normalization, computer assisted classification, high-speed laryngoscopy, vocal fold analysis, phonovibrogram

Procedia PDF Downloads 238

41026 Virtual and Augmented Reality Based Heritage Gamification: Basilica of Smyrna in Turkey

Authors: Tugba Saricaoglu

Abstract:

This study argues about the potential representation and interpretation of Basilica of Smyrna through gamification. Representation can be defined as a key which plays a role as a converter in order to provide interpretation of something according to the person who perceives. Representation of cultural heritage is a hypothetical and factual approach in terms of its sustainable conservation. Today, both site interpreters and public of cultural heritage have varying perspectives due to their different demographic, social, and even cultural backgrounds. Additionally, gamification application offers diversion of methods suchlike video games to improve user perspective of non-game platforms, contexts, and issues. Hence, cultural heritage and video game decided to be analyzed. Moreover, there are basically different ways of representation of cultural heritage such as digital, physical, and virtual methods in terms of conservation. Virtual reality (VR) and augmented reality (AR) technologies are two of the contemporary digital methods of heritage conservation. In this study, 3D documented ruins of the Basilica will be presented in the virtual and augmented reality based technology as a theoretical gamification sample. Also, this paper will focus on two sub-topics: First, evaluation of the video-game platforms applied to cultural heritage sites, and second, potentials of cultural heritage to be represented in video game platforms. The former will cover the analysis of some case(s) with regard to the concepts and representational aspects of cultural heritage. The latter will include the investigation of cultural heritage sites which carry such a potential and their sustainable conversation. Consequently, after mutual collection of information from cultural heritage and video game platforms, a perspective will be provided in terms of interpretation of representation of cultural heritage by sampling that on Basilica of Smyrna by using VR and AR based technologies.

Keywords: Basilica of Smyrna, cultural heritage, digital heritage, gamification

Procedia PDF Downloads 436

41025 Thick Data Techniques for Identifying Abnormality in Video Frames for Wireless Capsule Endoscopy

Authors: Jinan Fiaidhi, Sabah Mohammed, Petros Zezos

Abstract:

Capsule endoscopy (CE) is an established noninvasive diagnostic modality in investigating small bowel disease. CE has a pivotal role in assessing patients with suspected bleeding or identifying evidence of active Crohn's disease in the small bowel. However, CE produces lengthy videos with at least eighty thousand frames, with a frequency rate of 2 frames per second. Gastroenterologists cannot dedicate 8 to 15 hours to reading the CE video frames to arrive at a diagnosis. This is why the issue of analyzing CE videos based on modern artificial intelligence techniques becomes a necessity. However, machine learning, including deep learning, has failed to report robust results because of the lack of large samples to train its neural nets. In this paper, we are describing a thick data approach that learns from a few anchor images. We are using sound datasets like KVASIR and CrohnIPI to filter candidate frames that include interesting anomalies in any CE video. We are identifying candidate frames based on feature extraction to provide representative measures of the anomaly, like the size of the anomaly and the color contrast compared to the image background, and later feed these features to a decision tree that can classify the candidate frames as having a condition like the Crohn's Disease. Our thick data approach reported accuracy of detecting Crohn's Disease based on the availability of ulcer areas at the candidate frames for KVASIR was 89.9% and for the CrohnIPI was 83.3%. We are continuing our research to fine-tune our approach by adding more thick data methods for enhancing diagnosis accuracy.

Keywords: thick data analytics, capsule endoscopy, Crohn’s disease, siamese neural network, decision tree

Procedia PDF Downloads 123

41024 Hidden Hot Spots: Identifying and Understanding the Spatial Distribution of Crime

Authors: Lauren C. Porter, Andrew Curtis, Eric Jefferis, Susanne Mitchell

Abstract:

A wealth of research has been generated examining the variation in crime across neighborhoods. However, there is also a striking degree of crime concentration within neighborhoods. A number of studies show that a small percentage of street segments, intersections, or addresses account for a large portion of crime. Not surprisingly, a focus on these crime hot spots can be an effective strategy for reducing community level crime and related ills, such as health problems. However, research is also limited in an important respect. Studies tend to use official data to identify hot spots, such as 911 calls or calls for service. While the use of call data may be more representative of the actual level and distribution of crime than some other official measures (e.g. arrest data), call data still suffer from the 'dark figure of crime.' That is, there is most certainly a degree of error between crimes that occur versus crimes that are reported to the police. In this study, we present an alternative method of identifying crime hot spots, that does not rely on official data. In doing so, we highlight the potential utility of neighborhood-insiders to identify and understand crime dynamics within geographic spaces. Specifically, we use spatial video and geo-narratives to record the crime insights of 36 police, ex-offenders, and residents of a high crime neighborhood in northeast Ohio. Spatial mentions of crime are mapped to identify participant-identified hot spots, and these are juxtaposed with calls for service (CFS) data. While there are bound to be differences between these two sources of data, we find that one location, in particular, a corner store, emerges as a hot spot for all three groups of participants. Yet it does not emerge when we examine CFS data. A closer examination of the space around this corner store and a qualitative analysis of narrative data reveal important clues as to why this store may indeed be a hot spot, but not generate disproportionate calls to the police. In short, our results suggest that researchers who rely solely on official data to study crime hot spots may risk missing some of the most dangerous places.

Keywords: crime, narrative, video, neighborhood

Procedia PDF Downloads 215

41023 FLIME - Fast Low Light Image Enhancement for Real-Time Video

Authors: Vinay P., Srinivas K. S.

Abstract:

Low Light Image Enhancement is of utmost impor- tance in computer vision based tasks. Applications include vision systems for autonomous driving, night vision devices for defence systems, low light object detection tasks. Many of the existing deep learning methods are resource intensive during the inference step and take considerable time for processing. The algorithm should take considerably less than 41 milliseconds in order to process a real-time video feed with 24 frames per second and should be even less for a video with 30 or 60 frames per second. The paper presents a fast and efficient solution which has two main advantages, it has the potential to be used for a real-time video feed, and it can be used in low compute environments because of the lightweight nature. The proposed solution is a pipeline of three steps, the first one is the use of a simple function to map input RGB values to output RGB values, the second is to balance the colors and the final step is to adjust the contrast of the image. Hence a custom dataset is carefully prepared using images taken in low and bright lighting conditions. The preparation of the dataset, the proposed model, the processing time are discussed in detail and the quality of the enhanced images using different methods is shown.

Keywords: low light image enhancement, real-time video, computer vision, machine learning

Procedia PDF Downloads 171

41022 Exploring Accessible Filmmaking and Video for Deafblind Audiences through Multisensory Participatory Design

Authors: Aikaterini Tavoulari, Mike Richardson

Abstract:

Objective: This abstract presents a multisensory participatory design project, inspired by a deafblind PhD student's ambition to climb Mount Everest. The project aims to explore accessible routes for filmmaking and video content creation, catering to the needs of individuals with hearing and sight loss. By engaging participants from the Southwest area of England, recruited through multiple networks, the project seeks to gather qualitative data and insights to inform the development of inclusive media practices. Design: It will be a community-based participatory research design. The workshop will feature various stations that stimulate different senses, such as scent, touch, sight, hearing as well as movement. Participants will have the opportunity to engage with these multisensory experiences, providing valuable feedback on their effectiveness and potential for enhancing accessibility in filmmaking and video content. Methods: Brief semi-structured interviews will be conducted to collect qualitative data, allowing participants to share their perspectives, challenges, and suggestions for improvement. The participatory design approach emphasizes the importance of involving the target audience in the creative process. By actively engaging individuals with hearing and sight loss, the project aims to ensure that their needs and preferences are central to the development of accessible filmmaking techniques and video content. This collaborative effort seeks to bridge the gap between content creators and diverse audiences, fostering a more inclusive media landscape. Results: The findings from this study will contribute to the growing body of research on accessible filmmaking and video content creation. Via inductive thematic analysis of the qualitative data collected through interviews and observations, the researchers aim to identify key themes, challenges, and opportunities for creating engaging and inclusive media experiences for deafblind audiences. The insights will inform the development of best practices and guidelines for accessible filmmaking, empowering content creators to produce more inclusive and immersive video content. Conclusion: The abstract targets the hybrid International Conference for Disability and Diversity in Canada (January 2025), as this platform provides an excellent opportunity to share the outcomes of the project with a global audience of researchers, practitioners, and advocates working towards inclusivity and accessibility in various disability domains. By presenting this research at the conference in person, the authors aim to contribute to the ongoing discourse on disability and diversity, highlighting the importance of multisensory experiences and participatory design in creating accessible media content for the deafblind community and the community with sensory impairments more broadly.

Keywords: vision impairment, hearing impairment, deafblindness, accessibility, filmmaking

Procedia PDF Downloads 20

41021 Exertainment: Designing Active Video Games to Get Youth Moving

Authors: Geoff Skinner, Ilung Pranata

Abstract:

The advancement of ICT innovations provides us with a comfortable and convenient modern lifestyle. However, this modern easy lifestyle is proving to have some serious health consequences. Such technological advancements that have dramatically increased ones time in front of screens have been a contributing factor to increasing rates of obesity. In particular the youth obesity issue has gained more and more attention from researchers and health institutions around the world. Although technology innovations may lead to a sedate modern life, they also have a potential to solve the obesity issue in children. This paper provides a review of the issues in child obesity and the potential of active video games to mitigate these issues. Additionally, the paper also discusses the key requirements to develop an active video game that hopes to help combat child obesity through motivating youth to exergame. A framework is introduced to meet the requirements, from which a prototype was implemented. Discussion of the simulation and testing that were performed to verify the attainment of objectives is also detailed.

Keywords: e-video games, exergaming, health informatics, human computer interaction

Procedia PDF Downloads 418

41020 Free to Select vTuber Avatar eLearning Video for University Ray Tracing Course

Authors: Rex Hsieh, Kosei Yamamura, Satoshi Cho, Hisashi Sato

Abstract:

This project took place in the fall semester of 2019 from September 2019 to February 2020. It improves upon the design of a previous vTuber based eLearning video system by correcting criticisms from students and enhancing the positive aspects of the previous system. The transformed audio which has proven to be ineffective in previous experiments was not used in this experiment. The result is videos featuring 3 avatars covering different Ray Tracing subject matters being released weekly. Students are free to pick which videos they want to watch and can also re-watch any videos they want. The students' subjective impressions of each video is recorded and analysed to help further improve the system.

Keywords: vTuber, eLearning, Ray Tracing, Avatar

Procedia PDF Downloads 166

41019 The Relationship between Resilient Qualities and Health Management in Video Testimonials of Adolescents and Young Adults with Cancer

Authors: A. Sainvil, J. Mallela, L. M. Pereira

Abstract:

Adolescents and young adults (AYA) diagnosed with cancer are tasked with managing their health through treatment, a time when reliance on and independence from parents may change in unexpected ways. Resilience allows patients to cope and manage their own health through treatment, promoting motivation and a healthier lifestyle. The film acts as a source of reflection through the cancer journey, which may have an impact on how patients cope. The current research investigated relationships between resilient linguistic qualities of the video narratives and attitudes toward personal health management. N=24 patients diagnosed between ages 11-18 were recruited. First, participants provided demographic information, then made a video testimonial about their cancer experience. After filming, participants then completed a questionnaire on the perceived benefits for themselves and others for making the video. Videos were transcribed and analyzed for thematic content via codebook and for linguistic qualities, indicating resilience with the use of the Linguistic Inquiry and Word Count Analysis Program (LIWC). Linear regressions were then calculated to explore relationships between resilient qualities, thematic content, and participants’ perceptions of their medical team and willingness to care for themselves. Participants who spoke with greater narrator connectedness were more likely to change their view of their medical team (β=.628 p=.034). When a participant believed that providers were likely to view their video, they were marginally more likely to want to take better care of themselves (β=.367, p=.078). Participants who spoke in depth about their health reported higher intention to take better care of themselves (β=.785, p=.033). AYAs with cancer who showcased certain resilient qualities within their narrative were more likely to consider taking better care of themselves. Additionally, the more patients reflected on their health, the more they wanted to take better care of themselves. These relationships were stronger when a patient believed that a provider would watch their video. Study findings highlight the utility of film in uncovering aspects of resilience and coping that may lead to healthier behaviors in AYAs with cancer.

Keywords: adolescents, cancer, resilience, health management

Procedia PDF Downloads 61

41018 Evaluation of University Students of a Video Game to Sensitize Young People about Mental Health Problems

Authors: Adolfo Cangas, Noelia Navarro

Abstract:

The current study shows the assessment made by university students of a video game entitled Stigma-Stop where the characters present different mental disorders. The objective is that players have more real information about mental disorders and empathize with them and thus reduce stigma. The sample consisted of 169 university students studying degrees related to education, social care and welfare (i.e., Social Education, Psychology, Early Childhood Education, Special Education, and Social Work). The participants valued the video game positively, especially in relation to utility, being somewhat lower the score awarded to the degree of entertainment. They detect the disorders and point out that in many occasions they felt the same (particularly in the case of depression, being lower in agoraphobia and bipolar disorder, and even lower in the case of schizophrenia), most students recommend the use of the video game. They emphasize that Stigma-Stop offers intervention strategies, information regarding the symptomatology and sensitizes against stigma.

Keywords: schizophrenia, social stigma, students, mental health

Procedia PDF Downloads 258

41017 Investigation of Delivery of Triple Play Data in GE-PON Fiber to the Home Network

Authors: Ashima Anurag Sharma

Abstract:

Optical fiber based networks can deliver performance that can support the increasing demands for high speed connections. One of the new technologies that have emerged in recent years is Passive Optical Networks. This research paper is targeted to show the simultaneous delivery of triple play service (data, voice, and video). The comparison between various data rates is presented. It is demonstrated that as we increase the data rate, number of users to be decreases due to increase in bit error rate.

Keywords: BER, PON, TDMPON, GPON, CWDM, OLT, ONT

Procedia PDF Downloads 502

41016 Multiplayer RC-car Driving System in a Collaborative Augmented Reality Environment

Authors: Kikuo Asai, Yuji Sugimoto

Abstract:

We developed a prototype system for multiplayer RC-car driving in a collaborative Augmented Reality (AR) environment. The tele-existence environment is constructed by superimposing digital data onto images captured by a camera on an RC-car, enabling players to experience an augmented coexistence of the digital content and the real world. Marker-based tracking was used for estimating position and orientation of the camera. The plural RC-cars can be operated in a field where square markers are arranged. The video images captured by the camera are transmitted to a PC for visual tracking. The RC-cars are also tracked by using an infrared camera attached to the ceiling, so that the instability is reduced in the visual tracking. Multimedia data such as texts and graphics are visualized to be overlaid onto the video images in the geometrically correct manner. The prototype system allows a tele-existence sensation to be augmented in a collaborative AR environment.

Keywords: multiplayer, RC-car, collaborative environment, augmented reality

Procedia PDF Downloads 262

41015 Terraria AI: YOLO Interface for Decision-Making Algorithms

Authors: Emmanuel Barrantes Chaves, Ernesto Rivera Alvarado

Abstract:

This paper presents a method to enable agents for the Terraria game to evaluate algorithms commonly used in general video game artificial intelligence competitions. The usage of the ‘You Only Look Once’ model in the first layer of the process obtains information from the screen, translating this information into a video game description language known as “Video Game Description Language”; the agents take that as input to make decisions. For this, the state-of-the-art algorithms were tested and compared; Monte Carlo Tree Search and Rolling Horizon Evolutionary; in this case, Rolling Horizon Evolutionary shows a better performance. This approach’s main advantage is that a VGDL beforehand is unnecessary. It will be built on the fly and opens the road for using more games as a framework for AI.

Keywords: AI, MCTS, RHEA, Terraria, VGDL, YOLOv5

Procedia PDF Downloads 67

41014 A Data Envelopment Analysis Model in a Multi-Objective Optimization with Fuzzy Environment

Authors: Michael Gidey Gebru

Abstract:

Most of Data Envelopment Analysis models operate in a static environment with input and output parameters that are chosen by deterministic data. However, due to ambiguity brought on shifting market conditions, input and output data are not always precisely gathered in real-world scenarios. Fuzzy numbers can be used to address this kind of ambiguity in input and output data. Therefore, this work aims to expand crisp Data Envelopment Analysis into Data Envelopment Analysis with fuzzy environment. In this study, the input and output data are regarded as fuzzy triangular numbers. Then, the Data Envelopment Analysis model with fuzzy environment is solved using a multi-objective method to gauge the Decision Making Units' efficiency. Finally, the developed Data Envelopment Analysis model is illustrated with an application on real data 50 educational institutions.

Keywords: efficiency, Data Envelopment Analysis, fuzzy, higher education, input, output

Procedia PDF Downloads 22

41013 A Unified Webcam Proctoring Solution on Edge

Authors: Saw Thiha, Jay Rajasekera

Abstract:

A boom in video conferencing generated millions of hours of video data daily to be analyzed. However, such enormous data pose certain scalability issues to be analyzed efficiently, let alone do it in real-time, as online conferences can involve hundreds of people and can last for hours. This paper proposes an efficient online proctoring solution that can analyze the online conferences real-time on edge devices such as Android, iOS, and desktops. Since the computation can be done upfront on the devices where online conferences take place, it can scale well without requiring intensive resources such as GPU servers and complex cloud infrastructure. According to the linear models, face orientation does indeed impact the perceived eye openness. Also, the proposed z score facial landmark standardization was proven to be functional in detecting face orientation and contributed to classifying eye blinks with single eyelid distance computation while achieving a better f1 score and accuracy than the Eye Aspect Ratio (EAR) threshold method. Last but not least, the authors implemented the solution natively in the MediaPipe framework and open-sourced it along with the reproducible experimental results on GitHub. The solution provides face orientation, eye blink, facial activity, and translation detections out of the box and is highly customizable and extensible.

Keywords: android, desktop, edge computing, blink, face orientation, facial activity and translation, MediaPipe, open source, real-time, video conference, web, iOS, Z score facial landmark standardization

Procedia PDF Downloads 74

‹
1
2
3
4
5
6
7
8
9
10
...
1372
1373
›