Search results for: video tracking
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1835

1655 Efficient Motion Estimation by Fast Three Step Search Algorithm

Authors: S. M. Kulkarni, D. S. Bormane, S. L. Nalbalwar

Abstract:

Rapid developments in technology have had a dramatic impact on the medical health care field. Medical databases obtained with the latest machines, such as CT and MRI scanners, require a large amount of storage memory and a large bandwidth for data transmission in telemedicine applications. Thus, there is a need for video compression. Because a database of medical images contains a number of frames (slices), motion estimation is needed when coding these images. Motion estimation finds the movement of objects in an image sequence and yields motion vectors that represent the estimated motion of objects in the frame. Motion compensation is then performed to reduce the temporal redundancy between successive frames of the video sequence. In this paper, the three step search (TSS) block matching algorithm is implemented on different types of video sequences. It is shown that the three step search algorithm produces better quality performance and requires less computational time than the exhaustive full search algorithm.
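As a rough illustration of the block matching idea described in this abstract, the sketch below implements a generic three step search in plain Python. It is not the authors' implementation: the sum of absolute differences (SAD) cost, the 4x4 block size, the initial step of 4, and the function names `sad` and `three_step_search` are all assumptions chosen for the example.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in the
    current frame and the block displaced by (dx, dy) in the reference frame."""
    total = 0
    for r in range(n):
        for c in range(n):
            total += abs(cur[by + r][bx + c] - ref[by + dy + r][bx + dx + c])
    return total


def three_step_search(cur, ref, bx, by, n=4, step=4):
    """Estimate the motion vector (dx, dy) of one block with three step search.

    Frames are lists of rows of integer pixel values. With step=4 the search
    visits step sizes 4, 2 and 1, i.e. three refinement steps.
    Returns ((dx, dy), cost).
    """
    h, w = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    while step >= 1:
        cx, cy = best  # centre of this step's 3 x 3 search pattern
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                mx, my = cx + dx, cy + dy
                # Skip candidates whose block would leave the reference frame.
                if not (0 <= bx + mx <= w - n and 0 <= by + my <= h - n):
                    continue
                cost = sad(cur, ref, bx, by, mx, my, n)
                if cost < best_cost:
                    best_cost, best = cost, (mx, my)
        step //= 2  # halve the step size and refine around the best match
    return best, best_cost
```

Each step evaluates at most nine candidates, so three steps cost at most 27 block comparisons, against 225 for an exhaustive search over the same range of plus or minus 7 pixels. The trade-off is that the greedy refinement can settle on a local minimum of the cost.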

Keywords: block matching, exhaustive search motion estimation, three step search, video compression

Procedia PDF Downloads 491
1654 Effect of the Levitation Screen Sizes on Magnetic Parameters of Tracking System

Authors: Y. R. Adullayev, O. O. Karimzada

Abstract:

Analytical expressions for inductances, current, ampere-turns, excitation winding, maximum width, coordinates of the levitation screen (LS) are derived for the calculation of electromagnetic devices based on tracking systems with levitation elements (TS with LS). Taking into account the expression of the complex magnetic resistance of the screen, the dependence of the screen width on the heating temperature of the physical and technical characteristics of the screen material and the relationship of the geometric dimensions of the magnetic circuit is established. Analytic expressions for a number of functional dependencies characterizing complex parameter relationships in explicit form are obtained and analyzed.

Keywords: tracking systems, levitation screens, electromagnetic levitation, excitation windings, magnetic cores, defining converter, receiving converter, electromagnetic force, electrical and magnetic resistance

Procedia PDF Downloads 232
1653 Keypoints Extraction for Markerless Tracking in Augmented Reality Applications: A Case Study in Dar As-Saraya Museum

Authors: Jafar W. Al-Badarneh, Abdalkareem R. Al-Hawary, Abdulmalik M. Morghem, Mostafa Z. Ali, Rami S. Al-Gharaibeh

Abstract:

Archeological heritage is at the heart of each country’s national glory. Moreover, it could develop into a source of national income. Heritage management requires socially responsible marketing that achieves high visitor satisfaction while maintaining high site conservation. We have developed an Augmented Reality (AR) experience for heritage and cultural preservation at Dar As-Saraya museum in Jordan. Our application of this notion relied on a markerless tracking approach. This approach uses a keypoint extraction technique in which features of the environment are identified and defined in the system as keypoints. A set of these keypoints forms a tracker for an augmented object to be displayed and overlaid on a real scene at Dar As-Saraya museum. We tested and compared several techniques for markerless tracking and then applied the best technique to complete a mosaic artifact with AR content. The successful results from our application open the door for applications in open archeological sites, where markerless tracking is most needed.

Keywords: augmented reality, cultural heritage, keypoints extraction, virtual recreation

Procedia PDF Downloads 337
1652 A Multi Sensor Monochrome Video Fusion Using Image Quality Assessment

Authors: M. Prema Kumar, P. Rajesh Kumar

Abstract:

The increasing interest in image fusion (combining images of two or more modalities, such as infrared and visible light radiation) has led to a need for accurate and reliable image assessment methods. This paper presents a novel approach to merging the information content of several videos taken of the same scene in order to produce a combined video that contains the finest information from the different source videos. This process, known as video fusion, helps provide an image of superior quality to the source images (the term quality here connotes a measurement specific to the particular application). In this technique, different sensors (whose redundant information can be reduced) are used for the various cameras needed to capture the required images. In this paper, an image fusion technique based on multi-resolution singular value decomposition (MSVD) is used. Image fusion by MSVD is very similar to fusion by wavelets: the idea behind MSVD is to replace the FIR filters in the wavelet transform with singular value decomposition (SVD). It is computationally very simple and is well suited for real-time applications such as remote sensing and astronomy.

Keywords: multi sensor image fusion, MSVD, image processing, monochrome video

Procedia PDF Downloads 573
1651 Design of a Cooperative Neural Network, Particle Swarm Optimization (PSO) and Fuzzy Based Tracking Control for a Tilt Rotor Unmanned Aerial Vehicle

Authors: Mostafa Mjahed

Abstract:

Tilt Rotor UAVs (Unmanned Aerial Vehicles) are naturally unstable and difficult to maneuver. The purpose of this paper is to design controllers for the stabilization and trajectory tracking of this type of UAV. To this end, artificial intelligence methods have been exploited. First, the dynamics of this UAV was modeled using the Lagrange-Euler method. The conventional method based on Proportional, Integral and Derivative (PID) control was applied by decoupling the different flight modes. To improve stability and trajectory tracking of the Tilt Rotor, the fuzzy approach and the technique of multilayer neural networks (NN) has been used. Thus, Fuzzy Proportional Integral and Derivative (FPID) and Neural Network-based Proportional Integral and Derivative controllers (NNPID) have been developed. The meta-heuristic approach based on Particle Swarm Optimization (PSO) method allowed adjusting the setting parameters of NNPID controller, giving us an improved NNPID-PSO controller. Simulation results under the Matlab environment show the efficiency of the approaches adopted. Besides, the Tilt Rotor UAV has become stable and follows different types of trajectories with acceptable precision. The Fuzzy, NN and NN-PSO-based approaches demonstrated their robustness because the presence of the disturbances did not alter the stability or the trajectory tracking of the Tilt Rotor UAV.

Keywords: neural network, fuzzy logic, PSO, PID, trajectory tracking, tilt-rotor UAV

Procedia PDF Downloads 122
1650 Efficient DCT Architectures

Authors: Mr. P. Suryaprasad, R. Lalitha

Abstract:

This paper presents area- and delay-efficient architectures for the implementation of one-dimensional and two-dimensional discrete cosine transforms (DCT). Different lengths (4, 8, 16, and 32) are supported. DCT blocks are used in different video coding standards for image compression. The 2D-DCT calculation exploits the separability property of the 2D-DCT, so that the whole architecture is divided into two 1D-DCT calculations connected by a transpose buffer. Based on the existing 1D-DCT architecture, two different types of 2D-DCT architecture, folded and parallel, are implemented. Both structures use the same transpose buffer. The proposed transpose buffer occupies less area and operates at higher speed than the existing transpose buffer. Hence the area, power, and delay of both 2D-DCT architectures are reduced.
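The separability property mentioned in this abstract can be sketched in software as two passes of a 1D transform with a transpose in between. The code below is only an illustration of that data flow, not the hardware architecture from the paper; it assumes the orthonormal DCT-II and uses the hypothetical names `dct_1d`, `transpose` and `dct_2d`.

```python
import math


def dct_1d(v):
    """Orthonormal DCT-II of a list of N samples (direct O(N^2) evaluation)."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def transpose(m):
    """Swap rows and columns; this plays the role of the transpose buffer."""
    return [list(col) for col in zip(*m)]


def dct_2d(block):
    """2D DCT of a square block computed as two 1D passes."""
    row_pass = [dct_1d(row) for row in block]                # 1D DCT of each row
    col_pass = [dct_1d(col) for col in transpose(row_pass)]  # 1D DCT of each column
    return transpose(col_pass)                               # restore row-major order
```

For a constant N x N block with value c, all the energy lands in the DC coefficient, which equals N times c under this normalization; for example a 4x4 block of 8s gives a DC value of 32.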

Keywords: transposition buffer, video compression, discrete cosine transform, high efficiency video coding, two dimensional picture

Procedia PDF Downloads 523
1649 The Digital Video and Online Media Development for Integrated Marketing Communication and Tourism Promote in Taling Chan District, Bangkok

Authors: Somsak Klaysung

Abstract:

This study aims to develop a video to promote cultural tourism in Taling Chan District. For the qualitative research, the sample comprised 40 people from 5 groups of tourism entrepreneurs in Taling Chan District; content analysis of key informants was conducted using focus groups and structured in-depth interviews with all stakeholders. Quota sampling was used for this kind of research. The findings indicated that the resulting marketing and tourism video has a set length of 11.35 9 minutes, and that there is plenty of social capital in Taling Chan District, including local wisdom, knowledge, and ways of thinking related to nature, history, historic documents, occupation, administration, and the attributes of local people. Additional research found a new travel path along the water route of Khlong Bang Ramat, called Route 9 Temples, which travelers can follow by boat and which is available in the market in four areas of Taling Chan as well.

Keywords: digital video, integrated marketing communication, online media development, Taling Chan district

Procedia PDF Downloads 363
1648 Validation of Contemporary Physical Activity Tracking Technologies through Exercise in a Controlled Environment

Authors: Reem I. Altamimi, Geoff D. Skinner

Abstract:

Extended periods engaged in sedentary behavior increase the risk of becoming overweight and/or obese, which is linked to other health problems. Adding technology to the term ‘active living’ permits its inclusion in promoting and facilitating habitual physical activity. Technology can either act as a barrier to, or facilitate this lifestyle, depending on the chosen technology. Physical Activity Monitoring Technologies (PAMTs) are a popular example of such technologies. Different contemporary PAMTs have been evaluated based on customer reviews; however, there is a lack of published experimental research into the efficacy of PAMTs. This research aims to investigate the reliability of four PAMTs: two wristbands (Fitbit Flex and Jawbone UP), a waist-clip (Fitbit One), and a mobile application (iPhone Health Application) for recording a specific distance walked on a treadmill (1.5 km) at constant speed. Physical activity tracking technologies vary in their recordings, even while performing the same activity. This research demonstrates that the Jawbone UP band recorded the most accurate distance compared to Fitbit One, Fitbit Flex, and iPhone Health Application.

Keywords: Fitbit, jawbone up, mobile tracking applications, physical activity tracking technologies

Procedia PDF Downloads 322
1647 Hyperchaos-Based Video Encryption for Device-To-Device Communications

Authors: Samir Benzegane, Said Sadoudi, Mustapha Djeddou

Abstract:

In this paper, we present a software development of video streaming encryption for Device-to-Device (D2D) communications by using Hyperchaos-based Random Number Generator (HRNG) implemented in C#. The software implements and uses the proposed HRNG to generate key stream for encrypting and decrypting real-time video data. The used HRNG consists of Hyperchaos Lorenz system which produces four signal outputs taken as encryption keys. The generated keys are characterized by high quality randomness which is confirmed by passing standard NIST statistical tests. Security analysis of the proposed encryption scheme confirms its robustness against different attacks.
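The following Python sketch illustrates the general idea of a chaos-driven stream cipher: a four-variable system is integrated numerically, bytes are drawn from its four outputs, and the bytes are XORed with the data. The paper's software is written in C# and its exact equations, parameters and byte extraction are not given in this abstract, so everything here is an assumption: the commonly cited hyperchaotic Lorenz form, the parameter values, the forward Euler integration, and the names `hyperchaos_keystream` and `xor_cipher`. It has not been through any statistical or security testing and is for illustration only.

```python
def hyperchaos_keystream(n_bytes, state=(1.0, 2.0, 3.0, 4.0), dt=0.001, warmup=1000):
    """Generate n_bytes of key stream from an assumed hyperchaotic Lorenz system.

    Assumed equations:
        x' = a(y - x) + w,  y' = c x - y - x z,  z' = x y - b z,  w' = -y z + r w
    The initial state acts as the secret key.
    """
    a, b, c, r = 10.0, 8.0 / 3.0, 28.0, -1.0  # assumed parameter values
    x, y, z, w = state
    out = bytearray()
    step = 0
    while len(out) < n_bytes:
        dx = a * (y - x) + w
        dy = c * x - y - x * z
        dz = x * y - b * z
        dw = -y * z + r * w
        # Forward Euler step (simple, not the most accurate integrator).
        x, y, z, w = x + dt * dx, y + dt * dy, z + dt * dz, w + dt * dw
        step += 1
        if step > warmup:  # discard the initial transient
            for v in (x, y, z, w):  # the four signal outputs
                if len(out) < n_bytes:
                    # Keep low-order digits of the scaled magnitude as one byte.
                    out.append(int(abs(v) * 1e6) % 256)
    return bytes(out)


def xor_cipher(data, state=(1.0, 2.0, 3.0, 4.0)):
    """Encrypt or decrypt bytes by XOR with the key stream (same call both ways)."""
    ks = hyperchaos_keystream(len(data), state)
    return bytes(d ^ k for d, k in zip(data, ks))
```

Because XOR is its own inverse, calling `xor_cipher` on the ciphertext with the same initial state recovers the original bytes. Both ends must integrate with identical floating-point behavior, which is one practical concern for real device-to-device use.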

Keywords: hyperchaos Lorenz system, hyperchaos-based random number generator, D2D communications, C#

Procedia PDF Downloads 374
1646 Evaluating the Effects of an Educational Video on Running Shoe Selection and Subjective Perceptions

Authors: Andrew Fife, Jean-Francois Esculier, Codi Ramsey, Kim Hebert-Losier

Abstract:

Objectives: We aimed to identify how an evidence-based educational video influences how runners select shoes, and perceive shoe comfort, satisfaction, and performance over three months in comparison with a control video. Design: Two groups participated in a double-blind randomised controlled trial. Method: Fifty-six runners were randomly assigned to view one of two video presentations prior to purchasing new shoes for road running in speciality stores. Runners completed a survey with regards to their own shoes and one in reference to the new shoes purchased at three timepoints: before first use, one-month post-purchase, and three-months post-purchase. Perceived shoe comfort, satisfaction, and performance were assessed using 100 mm visual analogue scales. Factors that influenced their shoe purchase were ranked in order of importance. Results: Comfort and satisfaction were not significantly different between groups and timepoints. The perceived performance of new shoes (75.6 mm) was significantly greater than own shoes (mean: 67.6 mm) before first use, but ratings returned to own-shoe levels one month later in both groups. The group receiving the evidence-based presentation reported their purchased shoes as being influenced more by the video (55.4 mm) than the control group (21.8 mm), although both chose the same brand and model as previously worn over half of the time. Runners in both groups prioritised fit, comfort, and choosing similar shoes to the ones they previously used. Conclusions: In contrast to expectations, the evidence-based educational video did not appear to influence running shoe selection, or overall perceived shoe comfort, satisfaction, or performance.

Keywords: comfort, consumer behaviour, consciousness, education, running, shoes

Procedia PDF Downloads 33
1645 Assisted Video Colorization Using Texture Descriptors

Authors: Andre Peres Ramos, Franklin Cesar Flores

Abstract:

Colorization is the process of adding colors to a monochromatic image or video. Usually, the process involves segmenting the image into regions of interest and then applying colors to each one; for videos, this process is repeated for each frame, which makes it a tedious and time-consuming job. We propose a new assisted method for video colorization: the user only has to colorize one frame, and the colors are then propagated to the following frames. The user can intervene at any time to correct errors in color assignment. The method consists of extracting intensity and texture descriptors from the frames and then performing feature matching to determine the best color for each segment. To reduce computation time and improve spatial coherence, we narrow the search area and weight each feature to emphasize the texture descriptors. To give a more natural result, we use an optimization algorithm to perform the color propagation. Experimental results on several image sequences, compared with other existing methods, demonstrate that the proposed method performs a better colorization with less time and user interference.

Keywords: colorization, feature matching, texture descriptors, video segmentation

Procedia PDF Downloads 162
1644 Research on Evaluation Method of Urban Road Section Traffic Safety Status Based on Video Information

Authors: Qiang Zhang, Xiaojian Hu

Abstract:

Aiming at the problem of the existing real-time evaluation methods for traffic safety status, a video information-based urban road section traffic safety status evaluation method was established, and the rapid detection method of traffic flow parameters based on video information is analyzed. The concept of the speed dispersion of the road section that affects the traffic safety state of the urban road section is proposed, and the method of evaluating the traffic safety state of the urban road section based on the speed dispersion of the road section is established. Experiments show that the proposed method can reasonably evaluate the safety status of urban roads in real-time, and the evaluation results can provide a corresponding basis for the traffic management department to formulate an effective urban road section traffic safety improvement plan.
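The abstract does not give the formula for speed dispersion, so the sketch below only shows one common way such a measure could be computed from per-vehicle speeds extracted from video: the standard deviation of speeds on a road section and its ratio to the mean speed (coefficient of variation). The function name `speed_dispersion` and both statistics are assumptions for illustration, not the authors' definition.

```python
from statistics import mean, pstdev


def speed_dispersion(speeds_kmh):
    """Return (standard deviation, coefficient of variation) of vehicle speeds.

    speeds_kmh: speeds of the vehicles observed on one road section during one
    time window, in km/h. A larger spread means vehicles travel at more
    dissimilar speeds.
    """
    if not speeds_kmh:
        raise ValueError("at least one speed measurement is required")
    avg = mean(speeds_kmh)
    sd = pstdev(speeds_kmh)  # population standard deviation of the window
    cv = sd / avg if avg > 0 else 0.0
    return sd, cv
```

For example, speeds of 40 and 60 km/h give a standard deviation of 10 km/h around a mean of 50 km/h, while three vehicles all at 50 km/h give zero dispersion.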

Keywords: intelligent transportation system, road traffic safety, video information, vehicle speed dispersion

Procedia PDF Downloads 164
1643 Virtual Reality Based 3D Video Games and Speech-Lip Synchronization Superseding Algebraic Code Excited Linear Prediction

Authors: P. S. Jagadeesh Kumar, S. Meenakshi Sundaram, Wenli Hu, Yang Yung

Abstract:

In 3D video games, the dominance of production is unceasingly growing with a protruding level of affordability in terms of budget. Afterward, the automation of speech-lip synchronization technique is customarily onerous and has advanced a critical research subject in virtual reality based 3D video games. This paper presents one of these automatic tools, precisely riveted on the synchronization of the speech and the lip movement of the game characters. A robust and precise speech recognition segment that systematized with Algebraic Code Excited Linear Prediction method is developed which unconventionally delivers lip sync results. The Algebraic Code Excited Linear Prediction algorithm is constructed on that used in code-excited linear prediction, but Algebraic Code Excited Linear Prediction codebooks have an explicit algebraic structure levied upon them. This affords a quicker substitute to the software enactments of lip sync algorithms and thus advances the superiority of service factors abridged production cost.

Keywords: algebraic code excited linear prediction, speech-lip synchronization, video games, virtual reality

Procedia PDF Downloads 474
1642 Design of a Sliding Controller for Optical Disk Drives

Authors: Yu-Sheng Lu, Chung-Hsin Cheng, Shuen-Shing Jan

Abstract:

This paper presents the design and implementation of a sliding-mode controller for the tracking servo of optical disk drives. The tracking servo is mainly subject to two disturbance sources: radial run-out and shock. The run-out disturbance is mostly repeatable, and a model of this disturbance is incorporated into the controller design to compensate for it effectively. Meanwhile, as a shock disturbance is usually non-repeatable and unpredictable, the sliding-mode controller is employed for its robustness to abrupt perturbations. As a result, a sliding-mode controller design based on the internal model principle is tailored to the tracking servo of optical disk drives in order to deal with these two major disturbances. Experimental comparative studies are conducted to investigate the effectiveness of the specially designed controller.

Keywords: mechatronics, optical disk drive, sliding-mode control, servo systems

Procedia PDF Downloads 380
1641 Irradion: Portable Small Animal Imaging and Irradiation Unit

Authors: Josef Uher, Jana Boháčová, Richard Kadeřábek

Abstract:

In this paper, we present a multi-robot imaging and irradiation research platform referred to as Irradion, with full capabilities of portable arbitrary path computed tomography (CT). Irradion is an imaging and irradiation unit entirely based on robotic arms for research on cancer treatment with ion beams on small animals (mice or rats). The platform comprises two subsystems that combine several imaging modalities, such as 2D X-ray imaging, CT, and particle tracking, with precise positioning of a small animal for imaging and irradiation. Computed Tomography: The CT subsystem of the Irradion platform is equipped with two 6-joint robotic arms that position a photon counting detector and an X-ray tube independently and freely around the scanned specimen and allow image acquisition utilizing computed tomography. Irradiation measures nearly all conventional 2D and 3D trajectories of X-ray imaging with precisely calibrated and repeatable geometrical accuracy leading to a spatial resolution of up to 50 µm. In addition, the photon counting detectors allow X-ray photon energy discrimination, which can suppress scattered radiation, thus improving image contrast. It can also measure absorption spectra and recognize different materials (tissue) types. X-ray video recording and real-time imaging options can be applied for studies of dynamic processes, including in vivo specimens. Moreover, Irradion opens the door to exploring new 2D and 3D X-ray imaging approaches. We demonstrate in this publication various novel scan trajectories and their benefits. Proton Imaging and Particle Tracking: The Irradion platform allows combining several imaging modules with any required number of robots. The proton tracking module comprises another two robots, each holding particle tracking detectors with position, energy, and time-sensitive sensors Timepix3. Timepix3 detectors can track particles entering and exiting the specimen and allow accurate guiding of photon/ion beams for irradiation. 
In addition, quantifying the energy losses before and after the specimen brings essential information for precise irradiation planning and verification. Work on the small animal research platform Irradion involved advanced software and hardware development that will offer researchers a novel way to investigate new approaches in (i) radiotherapy, (ii) spectral CT, (iii) arbitrary path CT, (iv) particle tracking. The robotic platform for imaging and radiation research developed for the project is an entirely new product on the market. Preclinical research systems with precision robotic irradiation with photon/ion beams combined with multimodality high-resolution imaging do not exist currently. The researched technology can potentially cause a significant leap forward compared to the current, first-generation primary devices.

Keywords: arbitrary path CT, robotic CT, modular, multi-robot, small animal imaging

Procedia PDF Downloads 91
1640 The Influence of Moisture Conditioning on Hamburg Wheel Tracking Test Results

Authors: Hussain Al-Baghli

Abstract:

The Hamburg Wheel Tracking Test (HWTT) was conducted to evaluate the resistance to moisture damage of two asphalt mixtures: an optimized rubberized asphalt mixture and an HMA mix with anti-stripping additives. The mixtures were subjected to varying numbers of moisture conditioning cycles and then tested for rutting depth. The results showed that the optimized rubberized asphalt mixture met the requirements for medium to heavy traffic in accordance with Kuwait's Ministry of Public Works specification. The number of moisture conditioning cycles did not significantly impact rutting development for the rubberized asphalt. The HMA asphalt samples showed a significant reduction in strength and did not satisfy the HWTT criteria after the moisture conditioning cycles.

Keywords: rubberized asphalt, Hamburg wheel tracking, antistripping, moisture conditioning

Procedia PDF Downloads 79
1639 Speech Perception by Video Hosting Services Actors: Urban Planning Conflicts

Authors: M. Pilgun

Abstract:

The report presents the results of a study of the specifics of speech perception by actors of video hosting services on the material of urban planning conflicts. To analyze the content, the multimodal approach using neural network technologies is employed. Analysis of word associations and associative networks of relevant stimulus revealed the evaluative reactions of the actors. Analysis of the data identified key topics that generated negative and positive perceptions from the participants. The calculation of social stress and social well-being indices based on user-generated content made it possible to build a rating of road transport construction objects according to the degree of negative and positive perception by actors.

Keywords: social media, speech perception, video hosting, networks

Procedia PDF Downloads 149
1638 Research on the Aesthetic Characteristics of Calligraphy Art Under The Cross-Cultural Background Based on Eye Tracking

Authors: Liu Yang

Abstract:

Calligraphy has a unique aesthetic value in Chinese traditional culture. Calligraphy reflects the physical beauty and the dynamic beauty of things through the structure of writing and the order of strokes to standardize the style of writing. In recent years, Chinese researchers have carried out research on the appreciation of calligraphy works from the perspective of psychology, such as how Chinese people appreciate the beauty of stippled lines, the beauty of virtual and real, and the beauty of the composition. However, there is currently no domestic research on how foreigners appreciate Chinese calligraphy. People's appreciation of calligraphy is mainly in the form of visual perception, and psychologists have been working on the use of eye trackers to record eye tracking data to explore the relationship between eye tracking and psychological activities. The purpose of this experimental study is to use eye tracking recorders to analyze the eye gaze trajectories of college students with different cultural backgrounds when they appreciate the same calligraphy work, in order to reveal the differences in cognitive processing between cultural backgrounds. It was found that Chinese students perceived calligraphy as words when viewing calligraphy works, so they first noticed fonts with easily recognizable glyphs, and their overall viewing time was short. Foreign students perceived calligraphy works as graphics, so they first noticed novel and abstract fonts, and their overall viewing time was longer. The understanding of calligraphy content has a certain influence on the appreciation of calligraphy works by foreign students. It is shown that when foreign students understand the content of a calligraphy work, their eye tracking path is more consistent with the calligraphy writing path, which helps them develop associations with the work and better understand its connotation. 
This result helps us understand the impact of cultural background differences on calligraphy appreciation and helps us to take more effective strategies to help foreign audiences understand Chinese calligraphy art.

Keywords: Chinese calligraphy, eye-tracking, cross-cultural, cultural communication

Procedia PDF Downloads 107
1637 Engaging Mature Learners through Video Case Studies

Authors: Jacqueline Mary Jepson

Abstract:

This article provides a case study centred on the development of 13 video episodes which have been created to enhance student engagement with a post graduate online course in Project Management. The student group was unique as their online course needed to provide for asynchronistic learning and an adult learning pedagogy. In addition, students had come from a wide range professional backgrounds, with some having no Project Management experience, while others had 20 years or more. Students had to gain an understanding of an advanced body of knowledge and the course needed to achieve the academic requirements to qualify individuals to apply their learning in a range of contexts for professional practice and scholarship. To achieve this, a 13 episode case study was developed along with supportive learning materials based on the relocation of a zoo. This unique project provided a learning environment where the project could evolve over each video episode demonstrating the application of Project Management methodology which was then tied into the learning outcomes for the course and the assessment tasks. Discussion forums provided a way for students to converse and demonstrate their own understanding of content and how Project Management methodology can be applied.

Keywords: project management, adult learning, video case study, asynchronistic education

Procedia PDF Downloads 338
1636 Obstacle Detection and Path Tracking Application for Disables

Authors: Aliya Ashraf, Mehreen Sirshar, Fatima Akhtar, Farwa Kazmi, Jawaria Wazir

Abstract:

Vision, the basis for performing navigational tasks, is absent or greatly reduced in visually impaired people, and as a result they face many hurdles. To increase the navigational capabilities of visually impaired people, a desktop application, ODAPTA, is presented in this paper. The application uses a camera to capture video of the surroundings, applies various image processing algorithms to obtain information about the path and obstacles, tracks them, and delivers that information to the user through voice commands. Experimental results show that the application works effectively for straight paths in daylight.

Keywords: visually impaired, ODAPTA, Region of Interest (ROI), driver fatigue, face detection, expression recognition, CCD camera, artificial intelligence

Procedia PDF Downloads 552
1635 Classifications of Images for the Recognition of People’s Behaviors by SIFT and SVM

Authors: Henni Sid Ahmed, Belbachir Mohamed Faouzi, Jean Caelen

Abstract:

Behavior recognition has been studied for driver assistance systems and automated navigation, and it is an important field of study in intelligent buildings. In this paper, a method for recognizing behavior from real images was studied. Images were divided into several categories according to the actual weather, distance, angle of view, etc. SIFT (Scale Invariant Feature Transform) was first used to detect and describe keypoints, because SIFT features are invariant to image scale and rotation and robust to changes in viewpoint and illumination. Our goal is to develop a robust and reliable system composed of two fixed cameras in every room of an intelligent building, connected to a computer that acquires the video sequences. A program takes these video sequences as input: SIFT is used to represent the different images of the video sequences, and SVM (support vector machine) Light is used as the programming tool to classify the images, and thereby people’s behaviors in the intelligent building, in order to provide maximum comfort with optimized energy consumption.

Keywords: video analysis, people behavior, intelligent building, classification

Procedia PDF Downloads 378
1634 VideoAssist: A Labelling Assistant to Increase Efficiency in Annotating Video-Based Fire Dataset Using a Foundation Model

Authors: Keyur Joshi, Philip Dietrich, Tjark Windisch, Markus König

Abstract:

In the field of surveillance-based fire detection, the volume of incoming data is increasing rapidly. However, the labeling of a large industrial dataset is costly due to the high annotation costs associated with current state-of-the-art methods, which often require bounding boxes or segmentation masks for model training. This paper introduces VideoAssist, a video annotation solution that utilizes a video-based foundation model to annotate entire videos with minimal effort, requiring the labeling of bounding boxes for only a few keyframes. To the best of our knowledge, VideoAssist is the first method to significantly reduce the effort required for labeling fire detection videos. The approach offers bounding box and segmentation annotations for the video dataset with minimal manual effort. Results demonstrate that the performance of labels annotated by VideoAssist is comparable to those annotated by humans, indicating the potential applicability of this approach in fire detection scenarios.

Keywords: fire detection, label annotation, foundation models, object detection, segmentation

Procedia PDF Downloads 17
1633 Vehicle Timing Motion Detection Based on Multi-Dimensional Dynamic Detection Network

Authors: Jia Li, Xing Wei, Yuchen Hong, Yang Lu

Abstract:

Detecting vehicle behavior has always been the focus of intelligent transportation, but with the explosive growth of the number of vehicles and the complexity of the road environment, the vehicle behavior videos captured by traditional surveillance have been unable to satisfy the study of vehicle behavior. The traditional method of manually labeling vehicle behavior is too time-consuming and labor-intensive, but the existing object detection and tracking algorithms have poor practicability and low behavioral location detection rate. This paper proposes a vehicle behavior detection algorithm based on the dual-stream convolution network and the multi-dimensional video dynamic detection network. In the videos, the straight-line behavior of the vehicle will default to the background behavior. The Changing lanes, turning and turning around are set as target behaviors. The purpose of this model is to automatically mark the target behavior of the vehicle from the untrimmed videos. First, the target behavior proposals in the long video are extracted through the dual-stream convolution network. The model uses a dual-stream convolutional network to generate a one-dimensional action score waveform, and then extract segments with scores above a given threshold M into preliminary vehicle behavior proposals. Second, the preliminary proposals are pruned and identified using the multi-dimensional video dynamic detection network. Referring to the hierarchical reinforcement learning, the multi-dimensional network includes a Timer module and a Spacer module, where the Timer module mines time information in the video stream and the Spacer module extracts spatial information in the video frame. The Timer and Spacer module are implemented by Long Short-Term Memory (LSTM) and start from an all-zero hidden state. The Timer module uses the Transformer mechanism to extract timing information from the video stream and extract features by linear mapping and other methods. 
Finally, the model fuses the temporal and spatial information and obtains the location and category of the behavior through the softmax layer. This paper uses recall and precision to measure the performance of the model. Extensive experiments show that, on the dataset used in this paper, the proposed model has clear advantages over existing state-of-the-art behavior detection algorithms. When the Time Intersection over Union (TIoU) threshold is 0.5, the Average-Precision (MP) reaches 36.3% (the MP of baselines is 21.5%). In summary, this paper proposes a vehicle behavior detection model based on a multi-dimensional dynamic detection network. This paper introduces spatial information and temporal information to extract vehicle behaviors in long videos. Experiments show that the proposed algorithm is advanced and accurate in vehicle timing behavior detection. In the future, the focus will be on simultaneously detecting the timing behavior of multiple vehicles in complex traffic scenes (such as a busy street) while ensuring accuracy.
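
The proposal step and the TIoU criterion described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `extract_proposals` and `temporal_iou`, the half-open frame-index segments, and the toy score values are all assumptions made for illustration. The sketch only shows how segments above a threshold M might be cut from a one-dimensional action score waveform and how a temporal IoU between a proposal and a ground-truth segment could be computed.

```python
from typing import List, Tuple

# A segment is a half-open range of frame indices [start, end).
# This representation is an assumption of the sketch.
Segment = Tuple[int, int]


def extract_proposals(scores: List[float], threshold: float) -> List[Segment]:
    """Return maximal runs of consecutive frames whose score exceeds threshold."""
    proposals: List[Segment] = []
    start = None
    for i, score in enumerate(scores):
        if score > threshold and start is None:
            start = i  # a new run begins
        elif score <= threshold and start is not None:
            proposals.append((start, i))  # the run ended at frame i - 1
            start = None
    if start is not None:
        proposals.append((start, len(scores)))  # run reaches the last frame
    return proposals


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two temporal segments."""
    intersection = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


# Toy action score waveform (made-up values, one score per frame).
scores = [0.1, 0.2, 0.8, 0.9, 0.7, 0.3, 0.1, 0.6, 0.9, 0.2]
proposals = extract_proposals(scores, threshold=0.5)
print(proposals)                      # [(2, 5), (7, 9)]
print(temporal_iou((2, 5), (3, 6)))   # 0.5
```

Under a TIoU threshold of 0.5, the proposal (2, 5) in this toy example would count as matching a ground-truth segment (3, 6).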

Keywords: vehicle behavior detection, convolutional neural network, long short-term memory, deep learning

Procedia PDF Downloads 132
1632 Person Re-Identification using Siamese Convolutional Neural Network

Authors: Sello Mokwena, Monyepao Thabang

Abstract:

In this study, we propose a comprehensive approach to address the challenges in person re-identification models. By combining a centroid tracking algorithm with a Siamese convolutional neural network model, our method excels in detecting, tracking, and capturing robust person features across non-overlapping camera views. The algorithm efficiently identifies individuals in the camera network, while the neural network extracts fine-grained global features for precise cross-image comparisons. The approach's effectiveness is further accentuated by leveraging the camera network topology for guidance. Our empirical analysis on benchmark datasets highlights its competitive performance, particularly evident when background subtraction techniques are selectively applied, underscoring its potential in advancing person re-identification techniques.
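
As a rough illustration of the two ingredients named in this abstract, the sketch below pairs a very simple centroid tracker with a cosine similarity function that could compare two embedding vectors. Everything here is hypothetical: the class `CentroidTracker`, the `max_distance` parameter, the greedy nearest-centroid matching, and the immediate dropping of unmatched tracks are simplifications chosen for the sketch, and the embeddings would in practice come from a trained Siamese network that is not shown.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, float]


class CentroidTracker:
    """Greedy nearest-centroid tracker (a simplified, hypothetical variant)."""

    def __init__(self, max_distance: float = 50.0) -> None:
        self.max_distance = max_distance
        self.next_id = 0
        self.tracks: Dict[int, Point] = {}

    def update(self, centroids: List[Point]) -> Dict[int, Point]:
        unmatched = dict(self.tracks)  # tracks not yet claimed in this frame
        updated: Dict[int, Point] = {}
        for centroid in centroids:
            best_id: Optional[int] = None
            best_dist = self.max_distance
            for track_id, previous in unmatched.items():
                dist = math.dist(centroid, previous)
                if dist <= best_dist:
                    best_id, best_dist = track_id, dist
            if best_id is None:
                # No existing track is close enough: start a new identity.
                best_id = self.next_id
                self.next_id += 1
            else:
                del unmatched[best_id]
            updated[best_id] = centroid
        # Simplification: tracks with no match in this frame are dropped.
        self.tracks = updated
        return updated


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


tracker = CentroidTracker(max_distance=30.0)
frame1 = tracker.update([(10.0, 10.0), (100.0, 100.0)])
frame2 = tracker.update([(102.0, 98.0), (12.0, 11.0)])
print(frame2)  # identities 0 and 1 are kept despite the changed detection order
```

In a re-identification setting, a similarity such as `cosine_similarity` would be applied to the feature vectors of person crops from different cameras, with a threshold deciding whether two crops show the same person.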

Keywords: camera network, convolutional neural network topology, person tracking, person re-identification, siamese

Procedia PDF Downloads 73
1631 Authoring of Augmented Reality Manuals for Not Physically Available Products

Authors: Vito M. Manghisi, Michele Gattullo, Alessandro Evangelista, Enricoandrea Laviola

Abstract:

In this work, we compared two solutions for displaying a demo version of an Augmented Reality (AR) manual when the real product is not available, opting to replace it with its computer-aided design (CAD) model. Many studies in the literature have shown AR to be effective in maintenance and assembly operations. However, most of them present solutions for existing products, usually converting old, printed manuals into AR manuals. In this case, authoring consists of defining how to convey existing instructions through AR. It is not a simple choice, and demo versions are created to test the quality of the design. However, this becomes impossible when the product is not physically available, as is the case for new products. One solution could be to create an entirely virtual environment containing the product and the instructions. However, user interaction would then be completely different from that in the real application, so it would be hard to test the usability of the AR manual. This work aims to propose and compare two different solutions for displaying a demo version of an AR manual to support authoring when a product is not physically available. As a case study, we used an innovative semi-hermetic compressor that has not yet been produced. The applications were developed for a handheld device using Unity 3D. The main issue was how to show the compressor and attach instructions to it. In one approach, we used Vuforia natural feature tracking to attach a CAD model of the compressor to a 2D image, a 1:1 scale drawing of the top view of the CAD model. In this way, during the AR manual demonstration, the 3D model of the compressor is displayed on the user's device in place of the real compressor, and all the virtual instructions are attached to it. In the other approach, we first created a support application that shows the CAD model of the compressor on a marker.
Then, we recorded a video of this application while moving around the marker, obtaining a video that shows the CAD model from every point of view. For the AR manual, we used the Vuforia model target (360° option) to track the CAD model of the compressor as if it were the real compressor. Then, during the demonstration, the video is shown on a fixed large screen, and instructions are displayed attached to it in the AR manual. The main drawback of the first solution is that everyone working on the authoring of the AR manual must keep the printed image, but it allows the product to be shown at real scale, and interaction during the demonstration is very simple. The second solution needs a screen rather than a printed marker during the demonstration. However, the compressor model is resized, and interaction is awkward since the user has to play the video on the screen to rotate the compressor. The two solutions were evaluated together with the company, and the first was preferred because of its more natural interaction.

Keywords: augmented reality, human computer interaction, operating instructions, maintenance, assembly

Procedia PDF Downloads 129
1630 Design of a Computer Vision Based Exercise Video Game for Senior Citizens

Authors: June Tay, Ivy Chia

Abstract:

Numerous changes, both mental and physical, take place as people age. We need to understand the different aspects required for healthy living, including meeting nutritional needs, regular physical activity to maintain agility, sufficient rest and sleep for physical and mental well-being, social engagement to avoid the risk of social isolation and depression, and access to healthcare to detect and manage chronic conditions. Promoting physical activity for an ageing population is necessary, as many may have led sedentary lifestyles for some time. In our study, we evaluate the considerations involved in designing a computer vision video game for the elderly. We need to design low-impact activities, such as stretching and gentle movements, because some elderly individuals may have joint pain or mobility issues. The exercise game should consist of simple movements that are easy to follow and remember. It should be fun and enjoyable so that players are motivated to exercise. Social engagement can keep the elderly motivated and competitive, making them more willing to engage in game exercises. Elderly citizens can compare their game scores and try to improve them. We propose a computer vision-based video game for the elderly that captures and tracks the movement of the player's hand as it pushes a ball on the screen into a circle. It can be easily set up using a laptop PC with a webcam. Our video game adhered to the design framework we employed, which encompassed ease of use, a simple graphical interface, easy-to-play game exercise, and fun gameplay.
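
The core game mechanic mentioned in this abstract, a tracked hand pushing a ball into a circle, can be sketched in a few lines of geometry. This is a hypothetical illustration only: the functions `push_ball` and `in_goal`, the circular hand model, and all coordinates and radii are assumptions, and the hand positions would in a real game come from a webcam-based hand tracker that is not shown here.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def push_ball(ball: Point, hand: Point, ball_radius: float, hand_radius: float) -> Point:
    """Move the ball away from the hand when the two circles overlap."""
    dx, dy = ball[0] - hand[0], ball[1] - hand[1]
    dist = math.hypot(dx, dy)
    reach = ball_radius + hand_radius
    if dist >= reach or dist == 0.0:
        return ball  # no contact (or undefined push direction): ball stays put
    # Place the ball along the hand-to-ball direction so the circles just touch.
    ux, uy = dx / dist, dy / dist
    return (hand[0] + ux * reach, hand[1] + uy * reach)


def in_goal(ball: Point, goal_center: Point, goal_radius: float) -> bool:
    """The exercise is complete when the ball centre lies inside the goal circle."""
    return math.dist(ball, goal_center) <= goal_radius


# Simulated hand positions sweeping from left to right (stand-in for tracker output).
ball: Point = (100.0, 100.0)
for hand_x in range(60, 151, 10):
    ball = push_ball(ball, (float(hand_x), 100.0), ball_radius=20.0, hand_radius=20.0)

print(ball, in_goal(ball, goal_center=(200.0, 100.0), goal_radius=15.0))
```

Keeping the rule this simple matches the stated design goals of easy-to-follow movements, and a score could be derived from the time taken to reach the goal.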

Keywords: computer vision, video games, gerontology technology, caregiving

Procedia PDF Downloads 83
1629 Tracking Filtering Algorithm Based on ConvLSTM

Authors: Ailing Yang, Penghan Song, Aihua Cai

Abstract:

The nonlinear maneuvering target tracking problem is mainly a state estimation problem in which the target motion model is uncertain. Traditional solutions include Kalman filtering, based on the Bayesian filtering framework, and extended Kalman filtering. However, these methods need prior knowledge, such as the kinematic model and the state distribution of the system, and they perform poorly in state estimation of complex dynamic systems for which no such prior is available. Therefore, in view of the problems with traditional algorithms, a convolutional LSTM target state estimation (SAConvLSTM-SE) algorithm based on self-attention memory (SAM) is proposed to learn the historical motion state of the target and the error distribution information measured at the current time. The measured track point data of airborne radar are processed into data sets. After supervised training, the data-driven deep neural network based on SAConvLSTM can directly obtain the target state at the next moment. Through experiments on two different maneuvering targets, we find that the network is more robust and tracks more accurately than existing tracking methods.
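
To make concrete what "prior knowledge such as the kinematic model" means for the traditional baseline, the following is a minimal sketch of a one-dimensional constant-velocity Kalman filter. It is not the method proposed in the abstract and not taken from the paper: the class name `ConstantVelocityKalman`, the noise values, the diffuse initial covariance, and the simplified diagonal process noise are all assumptions. The point is that the motion model (constant velocity) and the noise statistics must be supplied in advance.

```python
from typing import List


class ConstantVelocityKalman:
    """1D Kalman filter with state [position, velocity] and position measurements."""

    def __init__(self, dt: float = 1.0, process_var: float = 1e-3, meas_var: float = 1.0) -> None:
        self.dt = dt
        self.q = process_var  # assumed process noise, added to the covariance diagonal
        self.r = meas_var     # assumed measurement noise variance
        self.pos = 0.0
        self.vel = 0.0
        # Large initial covariance: the filter starts with almost no confidence.
        self.P = [[1000.0, 0.0], [0.0, 1000.0]]

    def step(self, z: float) -> float:
        dt = self.dt
        (p00, p01), (p10, p11) = self.P

        # Predict with the constant-velocity model: x' = F x, P' = F P F^T + Q.
        pos = self.pos + dt * self.vel
        vel = self.vel
        a00 = p00 + dt * (p01 + p10) + dt * dt * p11 + self.q
        a01 = p01 + dt * p11
        a10 = p10 + dt * p11
        a11 = p11 + self.q

        # Update with the position measurement z (H = [1, 0]).
        innovation = z - pos
        s = a00 + self.r
        k0, k1 = a00 / s, a10 / s
        self.pos = pos + k0 * innovation
        self.vel = vel + k1 * innovation
        self.P = [
            [(1.0 - k0) * a00, (1.0 - k0) * a01],
            [a10 - k1 * a00, a11 - k1 * a01],
        ]
        return self.pos


# Noise-free toy track: a target moving at 2 units per time step.
kf = ConstantVelocityKalman()
estimates: List[float] = [kf.step(2.0 * t) for t in range(1, 21)]
print(round(estimates[-1], 2), round(kf.vel, 2))
```

When the target maneuvers and the constant-velocity assumption no longer holds, a filter of this kind lags or diverges, which is the limitation the data-driven approach in the abstract aims to address.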

Keywords: maneuvering target, state estimation, Kalman filter, LSTM, self-attention

Procedia PDF Downloads 180
1628 Real Time Multi Person Action Recognition Using Pose Estimates

Authors: Aishrith Rao

Abstract:

Human activity recognition is an important aspect of video analytics, and many approaches have been proposed to enable action recognition. In this approach, the model identifies the actions of multiple people in the frame and classifies them accordingly. A few approaches use RNNs and 3D CNNs, which are computationally expensive and cannot be trained with the small datasets that are currently available. Multi-person action recognition is performed in order to understand the positions and actions of the people present in the video frame. The size of the video frame can be adjusted as a hyper-parameter depending on the hardware resources available. OpenPose is used to compute pose estimates, using a CNN to produce heat maps, one of which provides the skeleton features, which are essentially joint features. The features are then extracted, and a classification algorithm can be applied to classify the action.
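
The final step described in this abstract, turning joint features into an action label, might look like the sketch below. The abstract does not name the classifier, so the 1-nearest-neighbour rule here is an assumption, as are the function names `pose_features` and `classify`, the normalisation scheme, and the three-joint toy skeletons. The keypoints would in practice come from a pose estimator such as OpenPose, which is not called here.

```python
import math
from typing import Dict, List, Sequence, Tuple

Keypoint = Tuple[float, float]


def pose_features(keypoints: Sequence[Keypoint]) -> List[float]:
    """Centre the joints on their mean and scale them, then flatten to a vector."""
    cx = sum(x for x, _ in keypoints) / len(keypoints)
    cy = sum(y for _, y in keypoints) / len(keypoints)
    centred = [(x - cx, y - cy) for x, y in keypoints]
    # Scale by the farthest joint so the features ignore person size in the frame.
    scale = max(math.hypot(x, y) for x, y in centred) or 1.0
    return [value / scale for point in centred for value in point]


def classify(features: Sequence[float], examples: Dict[str, List[List[float]]]) -> str:
    """Return the label of the closest labelled example (1-nearest neighbour)."""
    best_label, best_dist = "", math.inf
    for label, vectors in examples.items():
        for vector in vectors:
            dist = math.dist(features, vector)
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label


# Toy three-joint "skeletons" (made-up data): vertical versus horizontal poses.
examples = {
    "standing": [pose_features([(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)])],
    "lying": [pose_features([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)])],
}
query = pose_features([(5.0, 5.0), (5.0, 7.0), (5.0, 9.0)])
print(classify(query, examples))  # standing
```

For multiple people in one frame, the same feature extraction and classification would be applied to each detected skeleton independently.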

Keywords: human activity recognition, computer vision, pose estimates, convolutional neural networks

Procedia PDF Downloads 143
1627 Remembering Route in an Unfamiliar Homogenous Environment

Authors: Ahmed Sameer, Braj Bhushan

Abstract:

The objective of our study was to compare two techniques (no landmark vs. imaginary landmark) for remembering a route while traversing an unfamiliar homogenous environment. We used two videos, each having nine identical turns with no landmarks. In the first video, participants were required to remember the sequence of turns. In the second video, participants were required to imagine a landmark at each turn and associate the turn with it. In both tasks, participants were asked to recall the sequence of turns as it appeared in the video. Results showed that performance in the first condition, i.e., without the use of landmarks, was better than in the imaginary landmark condition. The difference, however, became significant when the participants were tested again about 30 minutes later, though performance was still better in the no-landmark condition. The finding is surprising given past research on memory and is explained in terms of cognitive factors such as mental workload.

Keywords: wayfinding, landmarks, unfamiliar environment, cognitive psychology

Procedia PDF Downloads 476
1626 Gender Difference in Social Interaction Skills of Autism Using Token Economy and Video Modelling Strategies

Authors: Olusola Akintunde Adediran

Abstract:

This study examined the differential effect of gender on the social interaction skills of pupils with autism using token economy and video modelling as intervention strategies. A pretest-posttest, control group, quasi-experimental research design was adopted in the study. Seventeen participants (11 males and 6 females) were selected purposively from 5 centres in Ibadan and randomized into three groups (token economy, video modelling and control groups). Two instruments were used in the study: the Autism Spectrum Rating Scale (ASRS) for 299.00 Autistic Disorder (r = 0.82) and the Children’s Self-report Social Skill Scale (CS4) (r = 0.93). Descriptive statistics were used to analyse the participants' social interaction data by intervention and gender, while the inferential statistics of analysis of covariance (ANCOVA) and the Scheffé post-hoc test were used to analyse three null hypotheses tested at the 0.05 level of significance. The results indicated that there was a significant main effect of treatment on the social interaction of participants, but no significant main effect of gender on the social interaction of participants (F(2,14) = .741; p > .05, eta = .050). Lastly, there was no significant interaction effect of treatment and gender of the participants (F(2,10) = 2.177; p > .05, eta² = .202). The study has contributed to the frontiers of knowledge by establishing that social interaction in pupils with autism is attainable when token economy and video modelling are used as treatment interventions; hence, they should be adopted by teachers, curriculum planners and other stakeholders.

Keywords: social interaction, token economy, video modelling, autism, gender

Procedia PDF Downloads 139