Search results for: robot vision
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1461

1011 Visual Servoing for Quadrotor UAV Target Tracking: Effects of Target Information Sharing

Authors: Jason R. King, Hugh H. T. Liu

Abstract:

This research presents simulation and experimental work on the visual servoing of a quadrotor Unmanned Aerial Vehicle (UAV) to stabilize over a moving target. Most previous work in the field assumes static or slow-moving, unpredictable targets. In this experiment, the target is assumed to be a friendly ground robot moving freely on a horizontal plane, which shares information with the UAV. This information includes the velocity and acceleration of the ground target to aid the quadrotor in its tracking task. The quadrotor is assumed to have a downward-facing camera fixed to its frame. Only onboard sensing is used by the quadrotor in the experiment; a VICON motion capture system is used only to measure ground truth and evaluate the performance of the controller. The experimental platform consists of an ArDrone 2.0 and a Create Roomba, communicating using the Robot Operating System (ROS). The addition of the target’s information is demonstrated to help the quadrotor in its tracking task using simulations of the dynamic model of a quadrotor in Matlab Simulink. A nested PID control loop is utilized for inner-loop control of the quadrotor, similar to previous works at the Flight Systems and Controls Laboratory (FSC) at the University of Toronto Institute for Aerospace Studies (UTIAS). Experiments are performed with ground truth provided by an indoor motion capture system, and the results are analyzed. It is demonstrated that a velocity controller which incorporates the additional information performs better than controllers which do not have access to the target’s information.
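The benefit of sharing the target's velocity can be illustrated with a deliberately simplified one-dimensional sketch. It is not the authors' controller: the quadrotor is reduced to a point that follows its velocity command exactly, the nested PID loop is reduced to a single proportional term, and the gain, target speed, and function name `track` are all assumptions chosen for illustration.

```python
def track(use_feedforward, kp=1.0, target_speed=0.5, dt=0.02, steps=1000):
    """Return the final position error (m) when following a constant-velocity target.

    Assumptions (not from the abstract): 1-D motion, an ideal velocity-commanded
    vehicle, and a proportional-only feedback term.
    """
    uav_pos, target_pos = 0.0, 1.0
    for _ in range(steps):
        error = target_pos - uav_pos
        v_cmd = kp * error                # feedback on the observed position error
        if use_feedforward:
            v_cmd += target_speed         # velocity shared by the ground robot
        uav_pos += v_cmd * dt             # ideal vehicle: velocity equals the command
        target_pos += target_speed * dt
    return abs(target_pos - uav_pos)


error_feedback_only = track(use_feedforward=False)
error_with_sharing = track(use_feedforward=True)
```

In this toy model, feedback alone settles at a constant lag of `target_speed / kp` behind the target, while adding the shared velocity drives the lag towards zero.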

Keywords: quadrotor, target tracking, unmanned aerial vehicle, UAV, UAS, visual servoing

Procedia PDF Downloads 316
1010 Automated Computer-Vision Analysis Pipeline of Calcium Imaging Neuronal Network Activity Data

Authors: David Oluigbo, Erik Hemberg, Nathan Shwatal, Wenqi Ding, Yin Yuan, Susanna Mierau

Abstract:

Introduction: Calcium imaging is an established technique in neuroscience research for detecting activity in neural networks. Bursts of action potentials in neurons lead to transient increases in intracellular calcium visualized with fluorescent indicators. Manual identification of cell bodies and their contours by experts typically takes 10-20 minutes per calcium imaging recording. Our aim, therefore, was to design an automated pipeline to facilitate and optimize calcium imaging data analysis. Our pipeline aims to accelerate cell body and contour identification and the production of graphical representations reflecting changes in neuronal calcium-based fluorescence. Methods: We created a Python-based pipeline that uses OpenCV (a computer vision Python package) to accurately (1) detect neuron contours, (2) extract the mean fluorescence within the contour, and (3) identify transient changes in the fluorescence due to neuronal activity. The pipeline consisted of 3 Python scripts that could all be easily accessed through a Python Jupyter notebook. In total, we tested this pipeline on ten separate calcium imaging datasets from murine dissociated cortical cultures. We next compared our automated pipeline outputs with the outputs of manually labeled data for neuronal cell location and the corresponding fluorescence time series generated by an expert neuroscientist. Results: Our results show that our automated pipeline efficiently pinpoints neuronal cell body location and neuronal contours and provides a graphical representation of neural network metrics accurately reflecting changes in neuronal calcium-based fluorescence. The pipeline detected the shape, area, and location of most neuronal cell body contours by using grayscale image conversion and binary thresholding to allow computer vision to better distinguish between cells and non-cells. Its results were also comparable to manually analyzed results but with significantly reduced result acquisition times of 2-5 minutes per recording versus 10-20 minutes per recording. Based on these findings, our next step is to precisely measure the specificity and sensitivity of the automated pipeline’s cell body and contour detection to extract more robust neural network metrics and dynamics. Conclusion: Our Python-based pipeline performed automated computer vision-based analysis of calcium image recordings from neuronal cell bodies in neuronal cell cultures. Our new goal is to improve cell body and contour detection to produce more robust, accurate neural network metrics and dynamic graphs.
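The three steps named in the Methods (thresholding to find a cell, averaging fluorescence inside it, flagging transients) can be sketched on a toy recording. This is an illustrative stand-in, not the authors' scripts: it uses plain Python lists instead of OpenCV, a single region instead of per-cell contours, and the threshold values, the median baseline, and all function names are assumptions.

```python
from statistics import median


def binary_mask(frame, threshold):
    """Step 1 (stand-in for thresholding plus contour detection): mark bright pixels."""
    return [[pixel > threshold for pixel in row] for row in frame]


def mean_fluorescence(frame, mask):
    """Step 2: average the intensity of the pixels inside the mask."""
    values = [p for row, mrow in zip(frame, mask) for p, m in zip(row, mrow) if m]
    return sum(values) / len(values)


def transient_frames(trace, rel_threshold=0.5):
    """Step 3: flag frames whose relative change from the median baseline is large."""
    baseline = median(trace)
    return [i for i, f in enumerate(trace) if (f - baseline) / baseline > rel_threshold]


# Toy 3x3 grayscale recording with one "cell" in the centre that brightens in frame 2.
frames = [
    [[10, 10, 10], [10, 100, 10], [10, 10, 10]],
    [[10, 10, 10], [10, 100, 10], [10, 10, 10]],
    [[10, 10, 10], [10, 200, 10], [10, 10, 10]],
    [[10, 10, 10], [10, 100, 10], [10, 10, 10]],
]
mask = binary_mask(frames[0], threshold=50)
trace = [mean_fluorescence(frame, mask) for frame in frames]
active = transient_frames(trace)
```

With OpenCV, the first step would more typically rely on `cv2.threshold` and `cv2.findContours`, yielding one mask and one trace per detected cell.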

Keywords: calcium imaging, computer vision, neural activity, neural networks

Procedia PDF Downloads 64
1009 An Evaluation of Rational Approach to Management by Objectives in Construction Contracting Organisation

Authors: Zakir H. Shaik, Punam L. Vartak

Abstract:

Management By Objectives (MBO) is a management technique in which the objectives of an organisation are conveyed to the employees to establish individual goals. These objectives and goals are then monitored and assessed jointly by management and the employee from time to time. This tool can be used for planning and monitoring as well as for performance appraisal. The success of an organisation is largely dependent on its vision. Thus, it is of paramount importance to realise that vision through a mission which is well crafted within the organisation to address the objectives. The success of the mission depends upon how realistic and action-oriented the organisation's philosophical approach is, and on how the individual goals are set to track and meet the objectives. Thus, focused and passionate efforts of the team assigned to the mission are an absolute obligation for achieving the vision of any organisation. Any construction site is generally a controlled disorder with huge investments, resources and logistics involved. Construction progress is time-consuming, with many isolated as well as interconnected activities. The traditional MBO approach can be unsuccessful if planning and control are unrealistic and inflexible. Moreover, the construction industry is far behind in understanding these concepts. It is important to address employee engagement in defining the targets and creating awareness of them. Besides, the current economic environment and competitive world demand refined management tools to achieve profit, growth and survival of the business. Therefore, rational MBO becomes a vital part of the success of an organisation. This paper details the philosophical assumptions used to develop the grounded theory for achieving objectives through the RATIONAL MBO approach in Construction Contracting Organisations. The goals and objectives of Construction Contracting Organisations can be achieved efficiently by adopting this RATIONAL MBO approach, as it is based on realistic, logical and balanced assumptions.

Keywords: growth, leadership, management by objectives, Management By Objectives (MBO), profit, rational

Procedia PDF Downloads 132
1008 Effects of Climate Change on Floods of Pakistan, and Gap Analysis of Existing Policies with Vision 2025

Authors: Saima Akbar, Tahseen Ullah Khan

Abstract:

The analysis of the impact of climate change on flood frequency is an important issue for water resource management and flood risk mitigation. This research was conducted to address the effects of climate change on flood incidents in Pakistan and to find gaps in existing policies for reducing the environmental aspects of floods and the effects of global warming. The main objective of this research was to critically analyse the National Climate Change Policy (NCCP), the National Disaster Management Authority (NDMA), the Federal Flood Commission (FFC) and Vision 2025 as an effective policy document which not only targets a climate-resilient Pakistan but also provides room for efficient and flexible policy implementation. The methodology integrates projected changes in monsoon patterns (over the last 20 years, and the overall change in rainfall pattern from 1901 to 2015, from the Pakistan Meteorological Department), glacier melting, decreasing dam capacity and gaps in existing policies by using the SWOT (Strengths, Weaknesses, Opportunities, Threats) model in order to explore the relative impacts of global warming on the system performance. Results indicate the impacts of climate change are significant, but probably not large enough to justify a major effort for adapting the physical infrastructure to expected climatic conditions in Vision 2025, which is our shared destination to progress and ultimate aspiration to see Pakistan among the ten largest economies of the world by 2047, the centennial year of our independence. The conclusion of this research was to adopt sustainable measures to reduce flood impacts and to make policies as neighboring countries are adopting for their sustainability.

Keywords: climatic factors, monsoon, Pakistan, sustainability

Procedia PDF Downloads 125
1007 Laser Registration and Supervisory Control of neuroArm Robotic Surgical System

Authors: Hamidreza Hoshyarmanesh, Hosein Madieh, Sanju Lama, Yaser Maddahi, Garnette R. Sutherland, Kourosh Zareinia

Abstract:

This paper illustrates the concept of an algorithm to register specified markers on the neuroArm surgical manipulators, an image-guided MR-compatible tele-operated robot for microsurgery and stereotaxy. Two range-finding algorithms, namely time-of-flight and phase-shift, are evaluated for registration and supervisory control. The time-of-flight approach is implemented in a semi-field experiment to determine the precise position of a tiny retro-reflective moving object. The moving object simulates a surgical tool tip. The tool is a target that would be connected to the neuroArm end-effector during surgery inside the magnet bore of the MR imaging system. In order to apply the time-of-flight approach, a 905-nm pulsed laser diode and an avalanche photodiode are utilized as the transmitter and receiver, respectively. For the experiment, a high-frequency time-to-digital converter was designed using a field-programmable gate array. In the phase-shift approach, a continuous green laser beam with a wavelength of 530 nm was used as the transmitter. Results showed a positioning error of 0.1 mm when the scanner-to-target distance was set in the range of 2.5 to 3 meters. The effectiveness of this non-contact approach indicated that the method could be employed as an alternative to the conventional mechanical registration arm. Furthermore, the approach is not limited by physical contact and extension of joint angles.
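The two range-finding principles reduce to short textbook formulas, sketched below. The abstract gives no formulas, timing figures, or modulation frequency, so the 10 MHz modulation value and the function names here are assumptions for illustration only.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def tof_distance(round_trip_s):
    """Time-of-flight: the pulse travels to the target and back, so halve the path."""
    return C * round_trip_s / 2


def phase_shift_distance(phase_rad, mod_freq_hz):
    """Phase-shift: distance from the phase lag of an amplitude-modulated beam.

    Unambiguous only while the lag stays below one full modulation cycle.
    """
    return C * phase_rad / (4 * math.pi * mod_freq_hz)


def timing_resolution_for(range_error_m):
    """Round-trip timing resolution needed to resolve a given range error."""
    return 2 * range_error_m / C


d_tof = tof_distance(20e-9)                       # 20 ns round trip, about 3 m
d_phase = phase_shift_distance(math.pi, 10e6)     # assumed 10 MHz modulation
dt_needed = timing_resolution_for(0.1e-3)         # for a 0.1 mm range error
```

For a single time-of-flight measurement without averaging, resolving 0.1 mm corresponds to timing on the order of 0.67 ps, which suggests why a high-frequency time-to-digital converter is needed.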

Keywords: 3D laser scanner, intraoperative MR imaging, neuroArm, real time registration, robot-assisted surgery, supervisory control

Procedia PDF Downloads 259
1006 Status of India towards Achieving the Millennium Development Goals

Authors: Rupali Satsangi

Abstract:

Fourteen years ago, leaders from every country agreed on a vision for the future: a world with less poverty, hunger and disease, greater survival prospects for mothers and their infants, better educated children, equal opportunities for women, and a healthier environment; a world in which developed and developing countries work in partnership for the betterment of all. This vision took the shape of eight Millennium Development Goals (MDGs), which provide countries around the world with a framework for development and time-bound targets by which progress can be measured. India has identified 35 of the indicators as relevant to its context. India’s MDG framework has been contextualized through a concordance with the existing official indicators of corresponding dimensions in the national statistical system. The present study, based on secondary data, analyzed the status of India towards achieving the MDGs. After reviewing the data, the study finds that India may miss the MDG targets in women's health, sanitation and global partnership. These goals were less addressed by India in its policies and initiatives.

Keywords: millennium development goals, national statistical system, global partnership, healthier environment

Procedia PDF Downloads 379
1005 “Presently”: A Personal Trainer App to Self-Train and Improve Presentation Skills

Authors: Shyam Mehraaj, Samanthi E. R. Siriwardana, Shehara A. K. G. H., Wanigasinghe N. T., Wandana R. A. K., Wedage C. V.

Abstract:

A presentation is a critical tool for conveying not just spoken information but also a wide spectrum of human emotions. The single most effective way to make a presentation successful is to practice it beforehand. Preparing for a presentation has been shown to be essential for improving emotional control, intonation and prosody, pronunciation, and vocabulary, as well as the quality of the presentation slides. As a result, practicing has become one of the most critical parts of giving a good presentation. In this research, the main focus is to analyze the audio, video, and slides of the presentation uploaded by the presenters. The proposed solution is based on Natural Language Processing and Computer Vision techniques and lets the presenter rehearse a presentation beforehand using a mobile-responsive web application. The proposed system will assist in practicing the presentation beforehand by identifying the presenters’ emotions, body language, tonality, prosody, pronunciation and vocabulary, and the quality of the presentation slides. Overall, the system will give a rating and feedback to the presenter about the performance so that presenters can improve their presentation skills.

Keywords: presentation, self-evaluation, natural language processing, computer vision

Procedia PDF Downloads 99
1004 High Level Synthesis of Canny Edge Detection Algorithm on Zynq Platform

Authors: Hanaa M. Abdelgawad, Mona Safar, Ayman M. Wahba

Abstract:

Real-time image and video processing is in demand in many computer vision applications, e.g. video surveillance, traffic management and medical imaging. The processing in those video applications requires high computational power. Therefore, the optimal solution is the collaboration of a CPU and hardware accelerators. In this paper, a Canny edge detection hardware accelerator is proposed. Canny edge detection is one of the common blocks in the pre-processing phase of the image and video processing pipeline. Our approach offloads the Canny edge detection algorithm from the processing system (PS) to the programmable logic (PL), taking advantage of the High Level Synthesis (HLS) tool flow to accelerate the implementation on the Zynq platform. The resulting implementation enables up to a 100x performance improvement through hardware acceleration. CPU utilization drops and the frame rate rises to 60 fps for a 1080p full HD input video stream.

Keywords: high level synthesis, canny edge detection, hardware accelerators, computer vision

Procedia PDF Downloads 459
1003 Safety Effect of Smart Right-Turn Design at Intersections

Authors: Upal Barua

Abstract:

The risk of severe crashes at high-speed right-turns at intersections is a major safety concern these days. The application of smart right-turns at intersections is increasing to address this issue. The ‘Smart Right-turn’ design consists of a narrow angle of channelization at approximately 70°. This design increases the cone of vision of right-turning drivers towards crossing pedestrians as well as traffic on the cross-road. As part of the Safety Improvement Program in the Austin Transportation Department, several smart right-turns were constructed at high-crash intersections where high-speed right-turns were found to be a contributing factor. This paper features the state-of-the-art techniques applied in the planning, engineering, design and construction of these smart right-turns, the key factors driving their success, and the lessons learned in the process. This paper also presents the significant crash reductions achieved from the application of this smart right-turn design, estimated using the Empirical Bayes method. The results showed that smart right-turns can reduce overall right-turn crashes by 43% and severe right-turn crashes by 70%.

Keywords: smart right-turn, intersection, cone of vision, empirical Bayes method

Procedia PDF Downloads 244
1002 Post-modernist Tragi-Comedy: A Study of Tom Stoppard’s “Rosencrantz and Guildenstern Are Dead”

Authors: Azza Taha Zaki

Abstract:

The death of tragedy is probably the most distinctive literary controversy of the twentieth century. There is common critical consent that tragedy in the classical sense of the word is no longer possible. Thinkers, philosophers, and critics such as Nietzsche, Durrenmatt, and George Steiner have all agreed that the decline of the genre in the modern age is due to the total lack of a unified world image and the absence of a shared vision in a fragmented and ideologically diversified world. The production of Rosencrantz and Guildenstern are Dead in 1967 marked the rise of the genre of tragi-comedy as a more appropriate reflection of the spirit of the age. At the hands of such great dramatists as Tom Stoppard (1937- ), the revived genre was not used as an extra comic element to give some comic relief to an otherwise tragic text, but it was given a postmodernist touch to serve the interpretation of the dilemma of man in the postmodernist world. This paper will study features of postmodernist tragi-comedy in Rosencrantz and Guildenstern are Dead as one of the most important plays in modern British theatre and investigate Stoppard’s vision of man and life as influenced by postmodernist thought and philosophy.

Keywords: British, drama, postmodernist, Stoppard, tragi-comedy

Procedia PDF Downloads 164
1001 6 DOF Cable-Driven Haptic Robot for Rendering High Axial Force with Low Off-Axis Impedance

Authors: Naghmeh Zamani, Ashkan Pourkand, David Grow

Abstract:

This paper presents the design and mechanical model of a hybrid impedance/admittance haptic device optimized for applications such as bone drilling, spinal awl probe use, and other surgical techniques where high force is required in the tool-axial direction and low impedance is needed in all other directions. The performance levels required cannot be satisfied by existing, off-the-shelf haptic devices. This design may allow critical improvements in simulator fidelity for surgery training. The device consists primarily of two low-mass (carbon fiber) plates with a rod passing through them. Collectively, the device provides 6 DOF. The rod slides through a bushing in the top plate and is connected to the bottom plate with a universal joint, constrained to move in only 2 DOF, allowing axial torque display to the user’s hand. The two parallel plates are actuated and located by means of four cables pulled by motors. The forward kinematic equations are derived to ensure that the plates' orientation remains constant. The corresponding equations are solved using the Newton-Raphson method. The static force/torque equations are also presented. Finally, we present the predicted distribution of location error, cable velocity, cable tension, force and torque for the device. These results and preliminary hardware fabrication indicate that this design may provide a revolutionary approach for haptic display of many surgical procedures by means of an architecture that allows arbitrary workspace scaling. The height and width can be scaled arbitrarily.
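The Newton-Raphson step used for forward kinematics can be sketched on a much smaller problem than the device's. The example below is a toy, not the paper's equations: a point held by two cables in a plane, with anchor positions, cable lengths, and the function name chosen as assumptions. Given the cable lengths, it iterates to the point position.

```python
def solve_point_from_cables(anchors, lengths, guess, tol=1e-12, max_iter=50):
    """Newton-Raphson for a toy planar two-cable forward-kinematics problem.

    Residuals: f_i(x, y) = (x - ax_i)^2 + (y - ay_i)^2 - L_i^2 for i = 1, 2.
    Each iteration solves J * delta = -f with the analytic 2x2 Jacobian.
    """
    x, y = guess
    (ax1, ay1), (ax2, ay2) = anchors
    l1, l2 = lengths
    for _ in range(max_iter):
        f1 = (x - ax1) ** 2 + (y - ay1) ** 2 - l1 ** 2
        f2 = (x - ax2) ** 2 + (y - ay2) ** 2 - l2 ** 2
        # Jacobian rows are the gradients of f1 and f2 with respect to (x, y).
        a, b = 2 * (x - ax1), 2 * (y - ay1)
        c, d = 2 * (x - ax2), 2 * (y - ay2)
        det = a * d - b * c
        if abs(det) < 1e-15:
            raise ValueError("singular Jacobian; try a different initial guess")
        # Cramer's rule for the 2x2 linear system J * delta = -f.
        dx = (-f1 * d + b * f2) / det
        dy = (-a * f2 + f1 * c) / det
        x, y = x + dx, y + dy
        if abs(dx) + abs(dy) < tol:
            break
    return x, y


# Anchors 4 m apart; the lengths correspond to a point at (1, 2).
x, y = solve_point_from_cables(
    anchors=((0.0, 0.0), (4.0, 0.0)),
    lengths=(5 ** 0.5, 13 ** 0.5),
    guess=(2.0, 1.0),
)
```

The mirror point at (1, -2) satisfies the same equations, so the initial guess decides which solution is found; a real device would constrain this through its geometry.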

Keywords: cable direct driven robot, haptics, parallel plates, bone drilling

Procedia PDF Downloads 239
1000 Image Classification with Localization Using Convolutional Neural Networks

Authors: Bhuyain Mobarok Hossain

Abstract:

Image classification and localization is currently an important research topic in the field of computer vision. The evolution and advancement of deep learning and convolutional neural networks (CNN) have greatly improved the capabilities of object detection and image-based classification. Target detection is an important research problem in computer vision, especially in video surveillance systems. To solve this problem, we apply a convolutional neural network at multiple scales and multiple locations in the image using a sliding window. Most translation networks move away from the bounding box around the area of interest. In contrast to this architecture, we consider the problem to be a classification problem where each pixel of the image is a separate section. Image classification is the task of predicting a category for a given set of data points. It is a part of the classification problem in which a label is assigned to the image as a whole. For example, an image can be classified as a day or night shot; likewise, images of cars and motorbikes are automatically placed in their respective collections. Deep learning models for image classification generally include convolutional layers; such a model is referred to as a convolutional neural network (CNN).

Keywords: image classification, object detection, localization, particle filter

Procedia PDF Downloads 279
999 Using Computer Vision and Machine Learning to Improve Facility Design for Healthcare Facility Worker Safety

Authors: Hengameh Hosseini

Abstract:

Design of large healthcare facilities (such as hospitals, multi-service line clinics, and nursing facilities) that can accommodate patients with wide-ranging disabilities is a challenging endeavor and one that is poorly understood among healthcare facility managers, administrators, and executives. An even less-understood extension of this problem is the implications of weakly or insufficiently accommodative design of facilities for healthcare workers in physically intensive jobs who may also suffer from a range of disabilities and who are therefore at increased risk of workplace accident and injury. Combine this reality with the vast range of facility types, ages, and designs, and the problem of universal accommodation becomes even more daunting and complex. In this study, we focus on the implications of facility design for healthcare workers with low vision who also have physically active jobs. The points of difficulty are myriad: health service infrastructure, the equipment used in health facilities, and transport to and from appointments and other services can all pose a barrier to health care if they are inaccessible, less accessible, or even simply less comfortable for people with various disabilities. We conduct a series of surveys and interviews with employees and administrators of 7 facilities of a range of sizes and ownership models in the Northeastern United States and combine that corpus with in-facility observations and data collection to identify five major points of failure, common to all the facilities, that we concluded could pose safety threats to employees with vision impairments, ranging from very minor to severe. We determine that lack of design empathy is a major commonality among facility management and ownership. We subsequently propose three methods for remedying this lack of empathy-informed design and the dangers it poses to employees: the use of an existing open-source Augmented Reality application to simulate the low-vision experience for designers and managers; the use of a machine learning model we develop to automatically infer facility shortcomings from large datasets of recorded patient and employee reviews and feedback; and the use of a computer vision model fine-tuned on images of each facility to infer and predict facility features, locations, and workflows that could pose meaningful dangers to visually impaired employees of each facility. After conducting a series of real-world comparative experiments with each of these approaches, we conclude that each of them is a viable solution under particular sets of conditions, and finally characterize the range of facility types, workforce composition profiles, and work conditions under which each of these methods would be most apt and successful.

Keywords: artificial intelligence, healthcare workers, facility design, disability, visually impaired, workplace safety

Procedia PDF Downloads 77
998 Refining Scheme Using Amphibious Epistemologies

Authors: David Blaine, George Raschbaum

Abstract:

The evaluation of DHCP has synthesized SCSI disks, and current trends suggest that the exploration of e-business that would allow for further study into robots will soon emerge. Given the current status of embedded algorithms, hackers worldwide obviously desire the exploration of replication, which embodies the confusing principles of programming languages. In our research we concentrate our efforts on arguing that erasure coding can be made "fuzzy", encrypted, and game-theoretic.

Keywords: SCSI disks, robot, algorithm, hacking, programming language

Procedia PDF Downloads 397
997 Perceptions of Senior Academics in Teacher Education Colleges Regarding the Integration of Digital Games during the Pandemic

Authors: Merav Hayakac, Orit Avidov-Ungarab

Abstract:

The current study adopted an interpretive-constructivist approach to examine how senior academics from a large sample of Israeli teacher education colleges serving general or religious populations perceived the integration of digital games into their teacher instruction and what their policy and vision were in this regard in the context of the COVID-19 pandemic. Half the participants expressed a desire to integrate digital games into their teaching and learning but acknowledged that this practice was uncommon. Only a small minority believed they had achieved successful integration, with doubt and skepticism expressed by some religious colleges. Most colleges had policies encouraging technology integration supported by ongoing funding. Although a considerable gap between policy and implementation remained, the COVID-19 pandemic was viewed as having accelerated the integration of digital games into pre-service teacher instruction. The findings suggest that discussions around technology-related vision and policy and their translation into practice should relate to the specific cultural needs and academic preparedness of the population(s) served by the college.

Keywords: COVID-19, digital games, pedagogy, teacher education colleges

Procedia PDF Downloads 76
996 The Meaningful Pixel and Texture: Exploring Digital Vision and Art Practice Based on Chinese Cosmotechnics

Authors: Xingdu Wang, Charlie Gere, Emma Rose, Yuxuan Zhao

Abstract:

The study introduces a fresh perspective on the digital realm through an examination of the Chinese concept of Xiang, elucidating how it can build an understanding of pixels and textures on screens as digital trigrams. This concept attempts to offer an outlook on the intersection of digital technology and the natural world, thereby contributing to discussions about the harmonious relationship between humans and technology. The study looks to the ancient Chinese theory of Xiang as a key to establishing theories and practices that respond to the problem of contemporary Chinese technics. Xiang is a Chinese method of understanding the essentials of things through appearances, which differs from the method of science in the West. Xiang, the foundation of Chinese visual art, is rooted in ancient Chinese philosophy and connected to the eight trigrams. The discussion of Xiang connects art, philosophy, and technology. This paper connects the meaning of Xiang with the 'truth appearing' philosophically through the analysis of the concepts of phenomenon and noumenon and the unique Chinese way of observing. Thereafter, the historical interconnection between ancient painting and writing in China highlights the relationship between technical craftsmanship and artistic expression. In the digital realm, the paper blurs, in theory, the traditional boundaries between images and text on digital screens. Lastly, this study identifies an ensemble concept relating to pixels and textures in computer vision, drawing inspiration from AI image recognition in Chinese paintings. In art practice, by presenting a fluid visual experience in the form of pixels, which mimics the flow of lines in traditional calligraphy and painting, it is hoped that the viewer will be brought back to the process of the truth appearing as defined by 'Xiang'.

Keywords: Chinese cosmotechnics, computer vision, contemporary Neo-Confucianism, texture and pixel, Xiang

Procedia PDF Downloads 36
995 Image Captioning with Vision-Language Models

Authors: Promise Ekpo Osaine, Daniel Melesse

Abstract:

Image captioning is an active area of research in the multi-modal artificial intelligence (AI) community as it connects vision and language understanding, especially in settings where a model is required to understand the content shown in an image and generate semantically and grammatically correct descriptions. In this project, we followed a standard approach to a deep learning-based image captioning model, using the injection architecture for the encoder-decoder setup, where the encoder extracts image features and the decoder generates a sequence of words that represents the image content. We investigated the following image encoders: ResNet101, InceptionResNetV2, EfficientNetB7, EfficientNetV2M, and CLIP. As the caption generation structure, we explored long short-term memory (LSTM). The CLIP-LSTM model demonstrated superior performance compared to the encoder-decoder models, achieving a BLEU-1 score of 0.904 and a BLEU-4 score of 0.640. Additionally, among the CNN-LSTM models, EfficientNetV2M-LSTM exhibited the highest performance with a BLEU-1 score of 0.896 and a BLEU-4 score of 0.586 while using a single-layer LSTM.

Keywords: multi-modal AI systems, image captioning, encoder, decoder, BLEU score
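For readers unfamiliar with the reported metric, BLEU-1 is a clipped unigram precision multiplied by a brevity penalty. The sketch below computes it for a single caption against a single reference; it is a simplified illustration with whitespace tokenization and made-up sentences, not the evaluation code or data used in the project, and reported scores are normally aggregated over a whole test set.

```python
import math
from collections import Counter


def bleu1(candidate, reference):
    """Sentence-level BLEU-1 against one reference (simplified sketch)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    # Clip each candidate word's count by how often it appears in the reference.
    clipped = sum(min(n, ref_counts[word]) for word, n in cand_counts.items())
    precision = clipped / len(cand)
    # The brevity penalty discourages captions shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * precision


short_caption = bleu1("a dog runs on grass", "a dog runs on the grass")
repeated_word = bleu1("the the the", "the cat")
```

BLEU-4 extends the same idea by taking the geometric mean of the clipped precisions for 1-grams through 4-grams, which is why it is typically lower than BLEU-1.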

Procedia PDF Downloads 39
994 The Conception of the Students about the Presence of Mental Illness at School

Authors: Aline Giardin, Maria Rosa Chitolina, Maria Catarina Zanini

Abstract:

In this paper, we analyze the conceptions of high school students about mental health issues, and discuss the creation of basic mental health programs in schools. We base our findings on a quantitative survey we carried out with 156 high school students of the CTISM (Colégio Técnico Industrial de Santa Maria) school, located in the city of Santa Maria, Brazil. We have found that: (a) 28 students relate the subject ‘mental health’ with psychiatric hospitals and lunatic asylums; (b) 28 students have relatives affected by mental diseases; (c) 76 students believe that mental patients, if treated, can live a healthy life; (d) depression, schizophrenia and bipolar disorder are the most cited diseases; (e) 84 students have contact with mental patients, but know nothing about the disease; (f) 123 students have never been instructed about mental diseases while in school; and (g) 135 students think that a mental health program would be important in the school. We argue that these numbers reflect a vision of mental health that can be related to the reductionist education still present in schools and to the lack of integration between health professionals, science teachers, and students. Furthermore, this vision can also be related to a stigmatization process, which interferes with the interactions and with the representations regarding mental disorders and mental patients in society.

Keywords: mental health, schools, mental illness, conception

Procedia PDF Downloads 448
993 Hands-off Parking: Deep Learning Gesture-based System for Individuals with Mobility Needs

Authors: Javier Romera, Alberto Justo, Ignacio Fidalgo, Joshue Perez, Javier Araluce

Abstract:

Nowadays, individuals with mobility needs face a significant challenge when docking vehicles. In many cases, after parking, they encounter insufficient space to exit, leading to two undesired outcomes: either avoiding parking in that spot or settling for improperly placed vehicles. To address this issue, this paper presents a parking control system employing gestural teleoperation. The system comprises three main phases: capturing body markers, interpreting gestures, and transmitting orders to the vehicle. The initial phase is centered around the MediaPipe framework, a versatile tool optimized for real-time gesture recognition. MediaPipe excels at detecting and tracking body markers, with a special emphasis on hand gestures. Hand detection is done by generating 21 reference points for each hand. Subsequently, after data capture, the project employs the MultiPerceptron Layer (MPL) for in-depth gesture classification. This tandem of MediaPipe's extraction prowess and MPL's analytical capability ensures that human gestures are translated into actionable commands with high precision. Furthermore, the system has been trained and validated within a built-in dataset. To prove the domain adaptation, a framework based on the Robot Operating System (ROS), as a communication backbone, alongside CARLA Simulator, is used. Following successful simulations, the system is transitioned to a real-world platform, marking a significant milestone in the project. This real vehicle implementation verifies the practicality and efficiency of the system beyond theoretical constructs.
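
As a rough illustration of the step between landmark capture and gesture classification, the sketch below turns 21 hand reference points into a fixed-length feature vector that a classifier could consume. Everything here is an assumption for illustration: the function name, the (x, y, z) tuple layout, the choice of landmark 0 as the origin, and the normalisation scheme are not taken from the paper, and no MediaPipe calls are made.

```python
import math

NUM_LANDMARKS = 21  # reference points per hand, as stated in the abstract


def landmarks_to_features(landmarks):
    """Flatten 21 (x, y, z) landmarks into 63 features.

    Hypothetical preprocessing: coordinates are expressed relative to
    landmark 0 and divided by the largest distance from it, so the result
    does not depend on where the hand is in the frame or how large it is.
    """
    if len(landmarks) != NUM_LANDMARKS:
        raise ValueError("expected 21 landmarks per hand")
    ox, oy, oz = landmarks[0]
    centred = [(x - ox, y - oy, z - oz) for x, y, z in landmarks]
    scale = max(math.sqrt(x * x + y * y + z * z) for x, y, z in centred) or 1.0
    return [value / scale for point in centred for value in point]


# Synthetic hand: 21 points spread along the x axis.
hand = [(0.5 + 0.01 * i, 0.5, 0.0) for i in range(NUM_LANDMARKS)]
features = landmarks_to_features(hand)

# The same hand, shifted and twice as large, yields the same features.
bigger = [(0.1 + 0.02 * i, 0.3, 0.0) for i in range(NUM_LANDMARKS)]
features_bigger = landmarks_to_features(bigger)
```

A vector of this kind would then be passed to the gesture classifier; the network itself is omitted here because the abstract gives no details of its layers or training.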

Keywords: gesture detection, mediapipe, multiperceptron layer, robot operating system

Procedia PDF Downloads 71
992 A Biologically Inspired Approach to Automatic Classification of Textile Fabric Prints Based On Both Texture and Colour Information

Authors: Babar Khan, Wang Zhijie

Abstract:

Machine vision has been playing a significant role in industrial automation, imitating a wide variety of human functions and providing improved safety, reduced labour cost, the elimination of human error and/or subjective judgments, and the creation of timely statistical product data. Despite intensive research, there have not been any attempts to classify fabric prints based on printed texture and colour; most research so far covers only black and white or grey scale images. We propose a biologically inspired processing architecture to classify fabrics with respect to the fabric print texture and colour. We created a texture descriptor based on the HMAX model for machine vision and incorporated a colour descriptor based on opponent colour channels, simulating the single opponent and double opponent neuronal function of the brain. We found that our algorithm not only outperformed the original HMAX algorithm on classification of fabric print texture and colour, but also achieved a recognition accuracy of 85-100% on fabrics of different colours and textures.
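
To make the idea of opponent colour channels concrete, here is a small sketch of one widely used opponent transform of an RGB pixel (red versus green, yellow versus blue, and overall intensity). This particular formulation is an assumption chosen for illustration; the abstract does not specify the exact channel definitions or the single and double opponent filtering the authors applied.

```python
import math


def opponent_channels(r, g, b):
    """Map an RGB pixel (components in [0, 1]) to three opponent channels.

    o1: red-green opponency, o2: yellow-blue opponency, o3: intensity.
    A commonly used linear transform, not necessarily the authors' one.
    """
    o1 = (r - g) / math.sqrt(2)
    o2 = (r + g - 2 * b) / math.sqrt(6)
    o3 = (r + g + b) / math.sqrt(3)
    return o1, o2, o3


grey = opponent_channels(0.5, 0.5, 0.5)  # no chromatic response
red = opponent_channels(1.0, 0.0, 0.0)   # positive red-green response
```

A grey pixel produces zero response in both chromatic channels, which is why such channels add information that a grey scale texture descriptor cannot capture.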

Keywords: automatic classification, texture descriptor, colour descriptor, opponent colour channel

Procedia PDF Downloads 461
991 Quantitative Wide-Field Swept-Source Optical Coherence Tomography Angiography and Visual Outcomes in Retinal Artery Occlusion

Authors: Yifan Lu, Ying Cui, Ying Zhu, Edward S. Lu, Rebecca Zeng, Rohan Bajaj, Raviv Katz, Rongrong Le, Jay C. Wang, John B. Miller

Abstract:

Purpose: Retinal artery occlusion (RAO) is an ophthalmic emergency that can lead to poor visual outcome and is associated with an increased risk of cerebral stroke and cardiovascular events. Fluorescein angiography (FA) is the traditional diagnostic tool for RAO; however, wide-field swept-source optical coherence tomography angiography (WF SS-OCTA), as a nascent imaging technology, is able to provide quick and non-invasive angiographic information with a wide field of view. In this study, we looked for associations between OCT-A vascular metrics and visual acuity in patients with prior diagnosis of RAO. Methods: Patients with diagnoses of central retinal artery occlusion (CRAO) or branched retinal artery occlusion (BRAO) were included. A 6mm x 6mm Angio and a 15mm x 15mm AngioPlex Montage OCT-A image were obtained for both eyes in each patient using the Zeiss Plex Elite 9000 WF SS-OCTA device. Each 6mm x 6mm image was divided into nine Early Treatment Diabetic Retinopathy Study (ETDRS) subfields. The average measurement of the central foveal subfield, inner ring, and outer ring was calculated for each parameter. Non-perfusion area (NPA) was manually measured using 15mm x 15mm Montage images. A linear regression model was utilized to identify a correlation between the imaging metrics and visual acuity. A P-value less than 0.05 was considered to be statistically significant. Results: Twenty-five subjects were included in the study. For RAO eyes, there was a statistically significant negative correlation between vision and retinal thickness as well as superficial capillary plexus vessel density (SCP VD). A negative correlation was found between vision and deep capillary plexus vessel density (DCP VD) without statistical significance. There was a positive correlation between vision and choroidal thickness as well as choroidal volume without statistical significance. 
No statistically significant correlation was found between vision and the above metrics in contralateral eyes. For NPA measurements, no significant correlation was found between vision and NPA. Conclusions: To the best of our knowledge, this is the first study to investigate the utility of WF SS-OCTA in RAO and to demonstrate correlations between various retinal vascular imaging metrics and visual outcomes. Further investigations should explore the associations between these imaging findings and cardiovascular risk, as RAO patients are at elevated risk for symptomatic stroke. The results of this study provide a basis to understand the structural changes involved in visual outcomes in RAO. Furthermore, they may help guide management of RAO and prevention of cerebral stroke and cardiovascular accidents in patients with RAO.

Keywords: OCTA, swept-source OCT, retinal artery occlusion, Zeiss Plex Elite

Procedia PDF Downloads 113
990 Simulation-Based Validation of Safe Human-Robot-Collaboration

Authors: Titanilla Komenda

Abstract:

Human-machine-collaboration defines a direct interaction between humans and machines to fulfil specific tasks. Those so-called collaborative machines are used without fencing and interact with humans in predefined workspaces. Even though human-machine-collaboration enables a flexible adaption to variable degrees of freedom, industrial applications are rarely found. The reasons for this lie not in technical progress but rather in limitations of the planning processes that ensure safety for operators. Until now, humans and machines were mainly considered separately in the planning process, focusing on ergonomics and system performance respectively. Within human-machine-collaboration, those aspects must not be seen in isolation from each other but rather need to be analysed in interaction. Furthermore, a simulation model is needed that can validate the system performance and ensure the safety of the operator at any given time. Following on from this, a holistic simulation model is presented, enabling a simulative representation of collaborative tasks, including both humans and machines. The presented model includes not only a geometry and a motion model of interacting humans and machines but also a numerical behaviour model of humans as well as a Boole’s probabilistic sensor model. With this, error scenarios can be simulated by validating system behaviour in unplanned situations. As these models can be defined on the basis of Failure Mode and Effects Analysis as well as probabilities of errors, the implementation in a collaborative model is discussed and evaluated regarding limitations and simulation times. The functionality of the model is shown on industrial applications by comparing simulation results with video data. The analysis shows the impact of considering human factors in the planning process in contrast to only meeting system performance. 
In this sense, an optimisation function is presented that meets the trade-off between human and machine factors and aids in a successful and safe realisation of collaborative scenarios.

Keywords: human-machine-system, human-robot-collaboration, safety, simulation

Procedia PDF Downloads 338
989 Domain Adaptation Save Lives - Drowning Detection in Swimming Pool Scene Based on YOLOV8 Improved by Gaussian Poisson Generative Adversarial Network Augmentation

Authors: Simiao Ren, En Wei

Abstract:

Drowning is a significant safety issue worldwide, and a robust computer vision-based alert system can easily prevent such tragedies in swimming pools. However, due to domain shift caused by the visual gap (potentially due to lighting, indoor scene change, pool floor color, etc.) between the training swimming pool and the test swimming pool, the robustness of such algorithms has been questionable. The annotation cost of labeling each new swimming pool is too high for mass adoption of such a technique. To address this issue, we propose a domain-aware data augmentation pipeline based on Gaussian Poisson Generative Adversarial Network (GP-GAN). Combined with YOLOv8, we demonstrate that such a domain adaptation technique can significantly improve the model performance (from 0.24 mAP to 0.82 mAP) on new test scenes. As the augmentation method only requires background imagery from the new domain (no annotation needed), we believe this is a promising, practical route for preventing swimming pool drowning.

Keywords: computer vision, deep learning, YOLOv8, detection, swimming pool, drowning, domain adaptation, generative adversarial network, GAN, GP-GAN

Procedia PDF Downloads 66
988 Analysis of Facial Expressions with Amazon Rekognition

Authors: Kashika P. H.

Abstract:

The development of computer vision systems has been greatly aided by the efficient and precise detection of images and videos. Although the ability to recognize and comprehend images is a strength of the human brain, employing technology to tackle this issue is exceedingly challenging. In the past few years, the use of Deep Learning algorithms to treat object detection has dramatically expanded. One of the key issues in the realm of image recognition is the recognition and detection of certain notable people from randomly acquired photographs. Face recognition uses a way to identify, assess, and compare faces for a variety of purposes, including user identification, user counting, and classification. With the aid of an accessible deep learning-based API, this article intends to recognize various faces of people and their facial descriptors more accurately. The purpose of this study is to locate suitable individuals and deliver accurate information about them by using the Amazon Rekognition system to identify a specific human from a vast image dataset. We have chosen the Amazon Rekognition system, which allows for more accurate face analysis, face comparison, and face search, to tackle this difficulty.

Keywords: Amazon Rekognition, API, deep learning, computer vision, face detection, text detection

Procedia PDF Downloads 82
987 Multi-Spectral Deep Learning Models for Forest Fire Detection

Authors: Smitha Haridasan, Zelalem Demissie, Atri Dutta, Ajita Rattani

Abstract:

Aided by the wind, all it takes is one ember and a few minutes to create a wildfire. Wildfires are growing in frequency and size due to climate change. Wildfires and their consequences are one of the major environmental concerns. Every year, millions of hectares of forests are destroyed around the world, causing mass destruction and human casualties. Thus, early detection of wildfire becomes a critical component in mitigating this threat. Many computer vision-based techniques have been proposed for the early detection of forest fire using video surveillance. Several computer vision-based methods have been proposed to predict and detect forest fires at various spectrums, namely, RGB, HSV, and YCbCr. The aim of this paper is to propose a multi-spectral deep learning model that combines information from different spectrums at intermediate layers for accurate fire detection. A heterogeneous dataset assembled from publicly available datasets is used for model training and evaluation in this study. The experimental results show that multi-spectral deep learning models could obtain an improvement of about 4.68% over those based on a single spectrum for fire detection.
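
The three representations named in the abstract (RGB, HSV, and YCbCr) are different encodings of the same pixel. The sketch below derives the HSV and YCbCr views from an 8-bit RGB pixel using the Python standard library and the full-range BT.601 (JPEG-style) YCbCr equations. The function name and the choice of the BT.601 variant are assumptions; the abstract does not state which conversion the authors used or how the views are fused inside the network.

```python
import colorsys


def pixel_views(r, g, b):
    """Return RGB, HSV and YCbCr views of one 8-bit RGB pixel.

    HSV components are in [0, 1]; YCbCr follows the full-range BT.601
    convention with chroma centred on 128 (an assumption for illustration).
    """
    hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return {"rgb": (r, g, b), "hsv": hsv, "ycbcr": (y, cb, cr)}


white = pixel_views(255, 255, 255)
flame_red = pixel_views(255, 0, 0)
```

In a multi-spectral model, each view would feed its own branch before the intermediate features are combined; that fusion step is not shown here.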

Keywords: deep learning, forest fire detection, multi-spectral learning, natural hazard detection

Procedia PDF Downloads 207
986 Experimental Investigation of the Performance and Emission Characteristics of a Diesel Engine Fuelled by Bio-Additives under Variable Loads

Authors: Faisal Mahroogi, Mahmoud Bady, Ahmed Alsisi

Abstract:

The Saudi Vision 2030 program is a government initiative aimed at increasing economic, social, and cultural diversification. Dedicated to clean energy, the Kingdom has been working on solutions such as the circular carbon economy (CCE) and diversifying its energy mix to address energy and climate challenges. With a goal of a Net Zero future by 2060, Saudi Arabia's Vision 2030 emphasizes sustainability. Vision 2030 approaches today's energy and climate challenges responsibly and creatively as a model for a sustainable future. As per the Ambitions of the National Environment Strategy of the Saudi Ministry of Environment, Agriculture, and Water (MEWA), raising environmental compliance across all sectors and reducing pollution and adverse environmental impacts are critical focus areas. Therefore, the present paper introduces an experimental investigation of a diesel engine's performance and exhaust emissions operating with waste cooking oil (WCO) as a diesel additive. The engine used is a one-cylinder, naturally aspirated, constant-speed, direct-injection diesel engine. The main variables of the study were the load and the fuel type. The engine performance and emission characteristics were investigated when fueled with three blends. The first blend (D70B10W10DD10) is composed of 70% diesel, 10% butanol, 10% WCO, and 10% diethyl ether. The second blend (D60B10W20DD10) is composed of 60% diesel, 10% butanol, 20% WCO, and 10% diethyl ether. The third blend (D50B10W30DD10) comprises 50% diesel, 10% butanol, 30% WCO, and 10% diethyl ether. The study results show that the engine emissions of carbon monoxide (CO) and nitrogen oxides (NOX) vary considerably with the fuel composition and applied load. Concerning engine performance, the cylinder pressure is sensitive to the load and fuel type variation.

Keywords: ICE, waste cooking oil, bio additives, butanol, combustion and emission characteristics

Procedia PDF Downloads 21
985 Resisting Adversarial Assaults: A Model-Agnostic Autoencoder Solution

Authors: Massimo Miccoli, Luca Marangoni, Alberto Aniello Scaringi, Alessandro Marceddu, Alessandro Amicone

Abstract:

The susceptibility of deep neural networks (DNNs) to adversarial manipulations is a recognized challenge within the computer vision domain. Adversarial examples, crafted by adding subtle yet malicious alterations to benign images, exploit this vulnerability. Various defense strategies have been proposed to safeguard DNNs against such attacks, stemming from diverse research hypotheses. Building upon prior work, our approach involves the utilization of autoencoder models. Autoencoders, a type of neural network, are trained to learn representations of training data and reconstruct inputs from these representations, typically minimizing reconstruction errors like mean squared error (MSE). Our autoencoder was trained on a dataset of benign examples, learning features specific to them. Consequently, when presented with significantly perturbed adversarial examples, the autoencoder exhibited high reconstruction errors. The architecture of the autoencoder was tailored to the dimensions of the images under evaluation. We considered various image sizes, constructing models differently for 256x256 and 512x512 images. Moreover, the choice of the computer vision model is crucial, as most adversarial attacks are designed with specific AI structures in mind. To mitigate this, we proposed a method to replace image-specific dimensions with a structure independent of both dimensions and neural network models, thereby enhancing robustness. Our multi-modal autoencoder reconstructs the spectral representation of images across the red-green-blue (RGB) color channels. To validate our approach, we conducted experiments using diverse datasets and subjected them to adversarial attacks using models such as ResNet50 and ViT_L_16 from the torchvision library. The autoencoder extracted features used in a classification model, resulting in an MSE (RGB) of 0.014, a classification accuracy of 97.33%, and a precision of 99%.
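
The observation that adversarial inputs reconstruct poorly suggests a simple decision rule, sketched below: compute the MSE between an input and its reconstruction and compare it against a threshold fitted on benign examples. This is a deliberate simplification under stated assumptions. The autoencoder itself is not implemented, the function names and the mean-plus-k-standard-deviations threshold are hypothetical, and the abstract indicates the authors fed autoencoder features to a classification model rather than thresholding directly.

```python
import statistics


def mse(original, reconstructed):
    """Mean squared error between two equally sized flat pixel lists."""
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same length")
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def fit_threshold(benign_errors, k=3.0):
    """Hypothetical rule: flag anything more than k standard deviations
    above the mean reconstruction error seen on benign images."""
    return statistics.mean(benign_errors) + k * statistics.pstdev(benign_errors)


def is_adversarial(original, reconstructed, threshold):
    """True when the reconstruction error exceeds the benign threshold."""
    return mse(original, reconstructed) > threshold


# Reconstruction errors measured on benign validation images (made-up values).
threshold = fit_threshold([0.01, 0.02, 0.03])

benign_flag = is_adversarial([0.0, 0.0], [0.1, 0.1], threshold)     # error 0.01
perturbed_flag = is_adversarial([0.0, 0.0], [0.5, 0.5], threshold)  # error 0.25
```

The value of k trades false alarms against missed attacks and would need to be tuned on held-out data.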

Keywords: adversarial attacks, malicious images detector, binary classifier, multimodal transformer autoencoder

Procedia PDF Downloads 52
984 F-VarNet: Fast Variational Network for MRI Reconstruction

Authors: Omer Cahana, Maya Herman, Ofer Levi

Abstract:

Magnetic resonance imaging (MRI) is a lengthy medical scan because of its long acquisition time. This length is mainly due to the traditional sampling theorem, which defines a lower bound for sampling. However, it is still possible to accelerate the scan by using a different approach, such as compressed sensing (CS) or parallel imaging (PI). These two complementary methods can be combined to achieve a faster scan with high-fidelity imaging. To achieve that, two properties must hold: i) the signal must be sparse under a known transform domain, and ii) the sampling method must be incoherent. In addition, a nonlinear reconstruction algorithm needs to be applied to recover the signal. While the deep learning (DL) field has advanced rapidly and demonstrated tremendous successes in various computer vision tasks, the field of MRI reconstruction is still in an early stage. In this paper, we present an extension of the state-of-the-art model in MRI reconstruction, VarNet. We extend VarNet by using dilated convolutions at different scales, which enlarges the receptive field to capture more contextual information. Moreover, we simplified the sensitivity map estimation (SME), because it holds many unnecessary layers for this task. These improvements have shown significant decreases in computation costs as well as higher accuracy.
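
The claim that dilation enlarges the receptive field can be checked with simple arithmetic. For a stack of stride-1 convolutions, each layer with kernel size k and dilation d adds (k - 1) * d to the receptive field. The sketch below applies that formula; the layer configurations are made-up examples, not the configuration used in F-VarNet.

```python
def receptive_field(layers):
    """Receptive field (in pixels, per axis) of stacked stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs. Each layer widens
    the field by (kernel_size - 1) * dilation.
    """
    field = 1
    for kernel_size, dilation in layers:
        field += (kernel_size - 1) * dilation
    return field


# Three 3x3 layers without dilation versus with dilations 1, 2, 4.
plain = receptive_field([(3, 1), (3, 1), (3, 1)])    # 7
dilated = receptive_field([(3, 1), (3, 2), (3, 4)])  # 15
```

Both stacks have the same number of weights, so the wider context comes at no extra parameter cost.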

Keywords: MRI, deep learning, variational network, computer vision, compressed sensing

Procedia PDF Downloads 127
983 3D Vision Transformer for Cervical Spine Fracture Detection and Classification

Authors: Obulesh Avuku, Satwik Sunnam, Sri Charan Mohan Janthuka, Keerthi Yalamaddi

Abstract:

In the United States alone, there are over 1.5 million spine fractures per year, resulting in about 17,730 spinal cord injuries. The cervical spine is where fractures in the spine most frequently occur. The prevalence of spinal fractures in the elderly has increased, and in this population, fractures may be harder to see on imaging because of coexisting degenerative illness and osteoporosis. Nowadays, computed tomography (CT) is used almost exclusively instead of radiography (x-rays) for the imaging diagnosis of adult spine fractures. To stop neurologic degeneration and paralysis following trauma, it is vital to detect any vertebral fractures as early as possible. Many approaches have been proposed for the classification of the cervical spine [2d models]. In this paper, we attempt to go beyond these by using vision transformers (ViT), a state-of-the-art model in image classification, making the minimal possible changes to the ViT architecture to make it 3D-enabled; the model is evaluated using a weighted multi-label logarithmic loss. We have taken this problem statement from a previously held Kaggle competition, i.e., RSNA 2022 Cervical Spine Fracture Detection.
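
A weighted multi-label logarithmic loss, the evaluation metric mentioned above, averages the binary cross-entropy over all labels with a per-label weight. The sketch below is a generic version of that idea. The function name, the normalisation by the sum of weights, and the example weights are assumptions for illustration and are not the official competition weights or scoring code.

```python
import math


def weighted_multilabel_log_loss(y_true, y_pred, weights, eps=1e-15):
    """Weighted mean binary cross-entropy over samples and labels.

    y_true: list of samples, each a list of 0/1 labels.
    y_pred: list of samples, each a list of predicted probabilities.
    weights: one weight per label position (hypothetical values here).
    """
    total = 0.0
    weight_sum = 0.0
    for labels, probs in zip(y_true, y_pred):
        for y, p, w in zip(labels, probs, weights):
            p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
            total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
            weight_sum += w
    return total / weight_sum


# Two labels per study; the first label counts twice as much as the second.
uninformed = weighted_multilabel_log_loss([[1, 0]], [[0.5, 0.5]], [2.0, 1.0])
confident = weighted_multilabel_log_loss([[1, 0]], [[0.9, 0.1]], [2.0, 1.0])
```

A constant prediction of 0.5 scores ln 2 regardless of the weights, which gives a convenient baseline when reading leaderboard values.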

Keywords: cervical spine, spinal fractures, osteoporosis, computed tomography, 2d-models, ViT, multi-label logarithmic loss, Kaggle, public score, private score

Procedia PDF Downloads 82
982 Robotic Mini Gastric Bypass Surgery

Authors: Arun Prasad, Abhishek Tiwari, Rekha Jaiswal, Vivek Chaudhary

Abstract:

Background: Robotic Roux en Y gastric bypass has been performed for some time but is technically difficult, requiring operating in both the sub diaphragmatic and infracolic compartments of the abdomen. This can mean a dual docking of the robot or a hybrid partial laparoscopic and partial robotic surgery. The Mini /One anastomosis /omega loop gastric bypass (MGB) has the advantage of having all dissection and anastomosis in the supracolic compartment and is therefore technically suitable for robotic surgery. Methods: We have done 208 robotic mini gastric bypass surgeries. The robot is docked above the head of the patient in the midline. The camera port is placed supra umbilically. Two ports are placed on the left side of the patient and one port on the right side of the patient. An assistant port is placed between the camera port and the right sided robotic port for use of the stapler. The distal stomach is stapled from the lesser curve, followed by a vertical sleeve upwards leading to a long sleeve pouch. The jejunum is taken at 200 cm from the duodenojejunal junction and brought up to do a side to side gastrojejunostomy. Results: All patients had a successful robotic procedure. Mean time taken was 85 minutes. There were no major intraoperative or post operative complications. No patient needed conversion or re-explorative surgery. Mean excess weight loss over a period of 2 years was about 75%. There was no mortality. Patient satisfaction score was high and was attributed to the good weight loss and minimal dietary modifications that were needed after the procedure. Long term side effects were anemia and bile reflux in a small number of patients. Conclusions: MGB / OAGB is gaining worldwide interest as a short, simple procedure that has been shown to be a very effective and safe bariatric surgery. The purpose of this study was to report on the safety and efficacy of robotic surgery for this procedure. This is the first report of totally robotic mini gastric bypass.

Keywords: MGB, mini gastric bypass, OAGB, robotic bariatric surgery

Procedia PDF Downloads 270