Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 1098

Search results for: vision picking

948 Aromatic Medicinal Plant Classification Using Deep Learning

Authors: Tsega Asresa Mengistu, Getahun Tigistu

Abstract:

Computer vision is a subfield of artificial intelligence that allows computers and systems to derive meaning from digital images. It is applied in many fields, including self-driving cars, video surveillance, agriculture, quality control, health care, construction, the military, and everyday life. Aromatic and medicinal plants are botanical raw materials used in cosmetics, medicines, health foods, and other natural health products for therapeutic, aromatic, and culinary purposes. Herbal industries depend on these plants. The plants and their products serve as a valuable source of income for farmers and entrepreneurs, and their export provides not only industrial raw materials but also valuable foreign exchange. Technologies for classifying and identifying aromatic and medicinal plants are lacking in Ethiopia. Manual identification of plants is a tedious, time-consuming, and labor-intensive process. For farmers, industry personnel, academics, and pharmacists, it is still difficult to identify the parts and uses of plants before ingredient extraction. To address this problem, the study uses a deep learning approach based on convolutional neural networks to identify aromatic and medicinal plants efficiently. The objective of the proposed study is to identify the parts and uses of aromatic and medicinal plants using computer vision technology; to that end, the research develops a model for their automatic classification. Morphological characteristics are still the most important tools for identifying plants. Leaves are the most widely used plant parts, alongside roots, flowers, fruits, latex, and bark. The study was conducted on aromatic and medicinal plants available at the Ethiopian Institute of Agricultural Research center, using an experimental research design based on convolutional neural networks and transfer learning. The models use a sigmoid activation in the last layer and rectified linear units (ReLU) in the hidden layers. The classification accuracy obtained was 66.4 with the convolutional neural network, 67.3 with mobile networks (MobileNet), and 64 with the Visual Geometry Group (VGG) network.
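The activation choices named in this abstract (rectified linear units in the hidden layers, a sigmoid in the last layer) can be illustrated with a minimal sketch. This is not the authors' network: the layer sizes, weights, and the `dense` helper are hypothetical and chosen only to show how the two activations behave.

```python
import math


def relu(x: float) -> float:
    """Rectified linear unit: keeps positive inputs and maps the rest to 0."""
    return x if x > 0.0 else 0.0


def sigmoid(x: float) -> float:
    """Logistic sigmoid, mapping any real input into the open interval (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # avoids overflow for large negative inputs
    return e / (1.0 + e)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each output unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical toy features and weights, for illustration only.
features = [0.5, -1.2, 3.0]
hidden = dense(features, [[0.2, -0.4, 0.1], [-0.3, 0.8, -0.5]], [0.0, 0.1], relu)
scores = dense(hidden, [[1.0, -1.0], [-0.5, 0.5]], [0.0, 0.0], sigmoid)
print(hidden, scores)
```

With a sigmoid output, each score is an independent value between 0 and 1 rather than a probability distribution over classes.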

Keywords: aromatic and medicinal plants, computer vision, deep convolutional neural network

Procedia PDF Downloads 383

947 Hand Symbol Recognition Using Canny Edge Algorithm and Convolutional Neural Network

Authors: Harshit Mittal, Neeraj Garg

Abstract:

Hand symbol recognition is a pivotal component of computer vision, with far-reaching applications spanning sign language interpretation, human-computer interaction, and accessibility. This paper presents an approach that integrates the Canny edge algorithm with a convolutional neural network. The significance of the study lies in its potential to enhance communication and accessibility for individuals with hearing impairments or those engaged in gesture-based interactions with technology. For the experiment, the authors collected the data manually from a webcam using Python code and applied augmentation to the original images to enlarge the dataset and improve the model. The resulting dataset of about 6000 coloured images, distributed equally across 5 classes (i.e., 1, 2, 3, 4, 5), is first converted to grayscale and then processed by the Canny edge algorithm with thresholds 1 and 2 both set to 150. The convolutional neural network trained on this data achieves an accuracy of 0.97834, a precision of 0.97841, a recall of 0.9783, and an F1 score of 0.97832. For end users, a Python program opens a window for hand symbol recognition. At its core, this research seeks to advance computer vision by providing a new perspective on hand sign recognition. By leveraging the capabilities of the Canny edge algorithm and a convolutional neural network, the study contributes to ongoing efforts to create more accurate, efficient, and accessible solutions for individuals with diverse communication needs.
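The four metrics reported above can be computed from true and predicted labels as in the following minimal sketch. The abstract does not say how precision, recall, and F1 were averaged over the 5 classes; macro-averaging is assumed here, and the label lists are hypothetical.

```python
def classification_metrics(y_true, y_pred, classes):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(classes)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical labels for three of the hand-symbol classes.
metrics = classification_metrics([1, 1, 2, 2, 3], [1, 2, 2, 2, 3], classes=[1, 2, 3])
print(metrics)
```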

Keywords: hand symbol recognition, computer vision, Canny edge algorithm, convolutional neural network

Procedia PDF Downloads 33

946 Pre-Analysis of Printed Circuit Boards Based on Multispectral Imaging for Vision Based Recognition of Electronics Waste

Authors: Florian Kleber, Martin Kampel

Abstract:

The increasing demand for gallium, indium, and rare-earth elements for the production of electronics, e.g. solid-state lighting, photovoltaics, integrated circuits, and liquid crystal displays, will exceed the worldwide supply according to current forecasts. Recycling systems to reclaim these materials are not yet in place, which challenges the sustainability of these technologies. This paper proposes a multispectral imaging system as the basis for a vision-based recognition system for valuable components of electronics waste. Multispectral images are intended to enhance the contrast of images of printed circuit boards (single components as well as labels) for further analysis, such as optical character recognition and recognition of entire printed circuit boards. The results show that higher contrast is achieved in the near infrared than in ultraviolet and visible light.

Keywords: electronics waste, multispectral imaging, printed circuit boards, rare-earth elements

Procedia PDF Downloads 390

945 MAGNI Dynamics: A Vision-Based Kinematic and Dynamic Upper-Limb Model for Intelligent Robotic Rehabilitation

Authors: Alexandros Lioulemes, Michail Theofanidis, Varun Kanal, Konstantinos Tsiakas, Maher Abujelala, Chris Collander, William B. Townsend, Angie Boisselle, Fillia Makedon

Abstract:

This paper presents a home-based robot-rehabilitation instrument, called “MAGNI Dynamics”, that utilizes a vision-based kinematic/dynamic module and an adaptive haptic feedback controller. The system is expected to provide personalized rehabilitation by adjusting its resistive and supportive behavior according to a fuzzy intelligence controller that acts as an inference system, correlating the user’s performance with different stiffness factors. The vision module uses the Kinect’s skeletal tracking to monitor the user’s effort in an unobtrusive and safe way by estimating the torque that affects the user’s arm. The system’s torque estimations are validated by capturing electromyographic data from primitive hand motions (Shoulder Abduction and Shoulder Forward Flexion). Moreover, we present and analyze how the Barrett WAM generates a force field with a haptic controller to support or challenge the users. Experiments show that shifting the proportional value, which corresponds to different stiffness factors of the haptic path, can potentially help users improve their motor skills. Finally, we discuss potential areas for future research, addressing how a rehabilitation robotic framework may include multisensing data to improve the user’s recovery process.

Keywords: human-robot interaction, kinect, kinematics, dynamics, haptic control, rehabilitation robotics, artificial intelligence

Procedia PDF Downloads 297

944 Automated Computer-Vision Analysis Pipeline of Calcium Imaging Neuronal Network Activity Data

Authors: David Oluigbo, Erik Hemberg, Nathan Shwatal, Wenqi Ding, Yin Yuan, Susanna Mierau

Abstract:

Introduction: Calcium imaging is an established technique in neuroscience research for detecting activity in neural networks. Bursts of action potentials in neurons lead to transient increases in intracellular calcium, which are visualized with fluorescent indicators. Manual identification of cell bodies and their contours by experts typically takes 10-20 minutes per calcium imaging recording. Our aim, therefore, was to design an automated pipeline to facilitate and optimize calcium imaging data analysis. The pipeline aims to accelerate cell body and contour identification and the production of graphical representations reflecting changes in neuronal calcium-based fluorescence. Methods: We created a Python-based pipeline that uses OpenCV (a computer vision Python package) to (1) detect neuron contours, (2) extract the mean fluorescence within each contour, and (3) identify transient changes in the fluorescence due to neuronal activity. The pipeline consisted of 3 Python scripts, all of which could be easily accessed through a Jupyter notebook. In total, we tested the pipeline on ten separate calcium imaging datasets from murine dissociated cortical cultures. We then compared the automated pipeline’s outputs with manually labeled data for neuronal cell location and the corresponding fluorescence time series generated by an expert neuroscientist. Results: The automated pipeline efficiently pinpoints neuronal cell body locations and contours and provides a graphical representation of neural network metrics that accurately reflects changes in neuronal calcium-based fluorescence. The pipeline detected the shape, area, and location of most neuronal cell body contours by using grayscale image conversion and binary thresholding to help computer vision better distinguish between cells and non-cells. Its results were comparable to manually analyzed results, but acquisition time was substantially shorter: 2-5 minutes per recording versus 10-20 minutes per recording. Based on these findings, our next step is to precisely measure the specificity and sensitivity of the automated pipeline’s cell body and contour detection in order to extract more robust neural network metrics and dynamics. Conclusion: Our Python-based pipeline performed automated computer vision-based analysis of calcium imaging recordings from neuronal cell bodies in neuronal cell cultures. Our new goal is to improve cell body and contour detection to produce more robust, accurate neural network metrics and dynamic graphs.
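Steps (2) and (3) of the pipeline described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' scripts: the contour is represented as a hypothetical list of pixel coordinates, the baseline is taken as the median of the trace, and the 0.5 threshold on the relative change in fluorescence is an arbitrary choice.

```python
from statistics import median


def mean_fluorescence(frame, mask):
    """Average intensity of the pixels (row, col) that lie inside one contour."""
    return sum(frame[r][c] for r, c in mask) / len(mask)


def delta_f_over_f(trace):
    """Relative change in fluorescence, using the median as baseline F0 (assumption)."""
    f0 = median(trace)
    return [(f - f0) / f0 for f in trace]


def transient_onsets(dff, threshold=0.5):
    """Indices where the signal rises above the threshold after being at or below it."""
    onsets, above = [], False
    for i, value in enumerate(dff):
        if value > threshold and not above:
            onsets.append(i)
        above = value > threshold
    return onsets


# Hypothetical per-frame mean fluorescence of one cell body.
trace = [100, 102, 98, 180, 170, 101, 99, 200, 100]
onsets = transient_onsets(delta_f_over_f(trace))
print(onsets)
```

In a full pipeline the mask would come from the contour detection step, and `mean_fluorescence` would be evaluated on every frame to build the trace.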

Keywords: calcium imaging, computer vision, neural activity, neural networks

Procedia PDF Downloads 56

943 An Evaluation of Rational Approach to Management by Objectives in Construction Contracting Organisation

Authors: Zakir H. Shaik, Punam L. Vartak

Abstract:

Management by Objectives (MBO) is a management technique in which the objectives of an organisation are conveyed to the employees to establish individual goals. These objectives and goals are then monitored and assessed jointly by management and the employee from time to time. The tool can be used for planning and monitoring as well as for performance appraisal. The success of an organisation is largely dependent on its vision. Thus, it is of paramount importance to realise the vision through a mission that is well crafted within the organisation to address the objectives. The success of the mission depends on how realistic and action-oriented the organisation’s philosophical approach is, and on how individual goals are set to track and meet the objectives. Thus, focused and passionate efforts of the team assigned to the mission are an absolute obligation for achieving the vision of any organisation. Any construction site is generally a controlled disorder involving huge investments, resources, and logistics. Construction progress is time-consuming, with many isolated as well as interconnected activities. The traditional MBO approach can be unsuccessful if planning and control are unrealistic and inflexible. Moreover, the construction industry is far behind in understanding these concepts. It is important to engage employees in defining the targets and to create awareness of how to achieve them. Besides, the current economic environment and a competitive world demand refined management tools to achieve profit, growth, and survival of the business. Therefore, rational MBO becomes vital to the success of an organisation. This paper details the philosophical assumptions used to develop a grounded theory for achieving objectives through the RATIONAL MBO approach in Construction Contracting Organisations. The goals and objectives of Construction Contracting Organisations can be achieved efficiently by adopting this RATIONAL MBO approach, as it is based on realistic, logical, and balanced assumptions.

Keywords: growth, leadership, management by objectives (MBO), profit, rational

Procedia PDF Downloads 128

942 Effects of Climate Change on Floods of Pakistan, and Gap Analysis of Existing Policies with Vision 2025

Authors: Saima Akbar, Tahseen Ullah Khan

Abstract:

The analysis of the impact of climate change on flood frequency is an important issue for water resource management and flood risk mitigation. This research was conducted to address the effects of climate change on flood incidents in Pakistan and to identify gaps in existing policies for reducing the environmental drivers of floods and the effects of global warming. The main objective was to critically analyse the National Climate Change Policy (NCCP), the National Disaster Management Authority (NDMA), the Federal Flood Commission (FFC), and Vision 2025 as an effective policy document that not only targets a climate-resilient Pakistan but also provides room for efficient and flexible policy implementation. The methodology integrates projected changes in monsoon patterns (over the last 20 years, and the overall change in rainfall pattern from 1901 to 2015, from the Pakistan Meteorological Department), glacier melting, decreasing dam capacity, and shortcomings in existing policies, using the SWOT (Strengths, Weaknesses, Opportunities, Threats) model to explore the relative impacts of global warming on system performance. Results indicate that the impacts of climate change are significant, but probably not large enough to justify a major effort to adapt the physical infrastructure to the climatic conditions expected in Vision 2025, which is our shared destination for progress and our ultimate aspiration to see Pakistan among the ten largest economies of the world by 2047, the centennial year of our independence. The research concludes that sustainable measures should be adopted to reduce flood impacts and that policies should be made as neighboring countries are doing for their own sustainability.

Keywords: climatic factors, monsoon, Pakistan, sustainability

Procedia PDF Downloads 124

941 Status of India towards Achieving the Millennium Development Goals

Authors: Rupali Satsangi

Abstract:

14 years ago, leaders from every country agreed on a vision for the future: a world with less poverty, hunger, and disease, greater survival prospects for mothers and their infants, better educated children, equal opportunities for women, and a healthier environment; a world in which developed and developing countries work in partnership for the betterment of all. This vision took the shape of eight Millennium Development Goals, which provide countries around the world with a framework for development and time-bound targets by which progress can be measured. India, however, has identified 35 of the indicators as relevant to its context. India’s MDG framework has been contextualized through a concordance with the existing official indicators of corresponding dimensions in the national statistical system. The present study, based on secondary data, analyzes India’s status in achieving the MDGs. After reviewing the data, the study finds that India may miss the MDG targets in women’s health, sanitation, and global partnership. These goals have received less attention in India’s policies and initiatives.

Keywords: millennium development goals, national statistical system, global partnership, healthier environment

Procedia PDF Downloads 368

940 Martial Arts and Combative Program of the Philippine Military Academy Cadet Corps Armed Forces of the Philippines: An Assessment

Authors: Jayson Vicente

Abstract:

The young men and women of the Philippine Military Academy Cadet Corps Armed Forces of the Philippines (PMA CCAFP) are bred to be front liners and the last line of defense in times of war and peace; as such, they must be equipped with the most practical and effective combat-ready martial arts and combative skills to fulfill their duty, as well as to protect themselves so that they can continue serving the people and their country. This study assesses the current Martial Arts and Combative Program of the PMA CCAFP using a descriptive methodology based on interviews and distributed questionnaires. The current program, across all of the subjects involved, is more sports-inclined than combat-oriented. Picking the best from each subject used in the program, this study seeks to recommend improvements or create a better Martial Arts and Combative Program that will satisfy the objective of producing graduates who are martial arts combatants. A good Martial Arts and Combative Program is essential to prepare PMA cadets for what lies ahead: an unforgiving environment in which no rules constrain the threat.

Keywords: combative, martial arts, military, program

Procedia PDF Downloads 123

939 “Presently”: A Personal Trainer App to Self-Train and Improve Presentation Skills

Authors: Shyam Mehraaj, Samanthi E. R. Siriwardana, Shehara A. K. G. H., Wanigasinghe N. T., Wandana R. A. K., Wedage C. V.

Abstract:

A presentation is a critical tool for conveying not just spoken information but also a wide spectrum of human emotions. The single most effective way to make a presentation successful is to practice it beforehand. Preparing for a presentation has been shown to be essential for improving emotional control, intonation and prosody, pronunciation, and vocabulary, as well as the quality of the presentation slides. As a result, practice has become one of the most critical parts of giving a good presentation. The main focus of this research is to analyze the audio, video, and slides of presentations uploaded by presenters. The proposed solution uses Natural Language Processing and Computer Vision techniques to let presenters rehearse a presentation beforehand through a mobile-responsive web application. The system assists rehearsal by assessing the presenter’s emotions, body language, tonality, prosody, pronunciation, vocabulary, and the quality of the presentation slides. Overall, the system gives the presenter a rating and feedback on the performance so that presenters can improve their presentation skills.

Keywords: presentation, self-evaluation, natural language processing, computer vision

Procedia PDF Downloads 75

938 High Level Synthesis of Canny Edge Detection Algorithm on Zynq Platform

Authors: Hanaa M. Abdelgawad, Mona Safar, Ayman M. Wahba

Abstract:

Real-time image and video processing is a demand in many computer vision applications, e.g. video surveillance, traffic management and medical imaging. The processing of those video applications requires high computational power. Therefore, the optimal solution is the collaboration of CPU and hardware accelerators. In this paper, a Canny edge detection hardware accelerator is proposed. Canny edge detection is one of the common blocks in the pre-processing phase of image and video processing pipeline. Our presented approach targets offloading the Canny edge detection algorithm from processing system (PS) to programmable logic (PL) taking the advantage of High Level Synthesis (HLS) tool flow to accelerate the implementation on Zynq platform. The resulting implementation enables up to a 100x performance improvement through hardware acceleration. The CPU utilization drops down and the frame rate jumps to 60 fps of 1080p full HD input video stream.
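The reported target of 60 fps on a 1080p full HD stream translates into a pixel throughput that can be checked with simple arithmetic. The sketch below assumes a 1920x1080 frame and, as a simplification not stated in the abstract, an accelerator that consumes one pixel per clock cycle with blanking intervals ignored.

```python
WIDTH, HEIGHT, FPS = 1920, 1080, 60  # 1080p full HD at the reported frame rate

pixels_per_frame = WIDTH * HEIGHT
pixels_per_second = pixels_per_frame * FPS
frame_budget_ms = 1000 / FPS  # time available to process one frame

# Assumption: one pixel per clock cycle, so the minimum clock equals the pixel rate.
min_clock_mhz = pixels_per_second / 1e6

print(f"{pixels_per_frame:,} pixels per frame")
print(f"{pixels_per_second:,} pixels per second")
print(f"{frame_budget_ms:.2f} ms per frame, at least {min_clock_mhz:.1f} MHz")
```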

Keywords: high level synthesis, canny edge detection, hardware accelerators, computer vision

Procedia PDF Downloads 451

937 Safety Effect of Smart Right-Turn Design at Intersections

Authors: Upal Barua

Abstract:

The risk of severe crashes at high-speed right-turns at intersections is a major safety concern. Smart right-turns are increasingly being applied at intersections to address this issue. The ‘Smart Right-turn’ design consists of a narrow angle of channelization of approximately 70°. This design increases the right-turning drivers’ cone of vision towards crossing pedestrians as well as traffic on the cross-road. As part of the Safety Improvement Program in Austin Transportation Department, several smart right-turns were constructed at high-crash intersections where high-speed right-turns were found to be a contributing factor. This paper features the state-of-the-art techniques applied in the planning, engineering, design, and construction of these smart right-turns, the key factors driving their success, and lessons learned in the process. It also presents the significant crash reductions achieved by the smart right-turn design, estimated using the Empirical Bayes method. The results showed that smart right-turns can reduce overall right-turn crashes by 43% and severe right-turn crashes by 70%.
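The Empirical Bayes before-after estimate mentioned above can be sketched as follows. This is the commonly used simplified form, in which a predicted crash frequency from a safety performance function is blended with the observed count using a weight based on an overdispersion parameter; the abstract does not give the exact formulation used, and every number below is hypothetical and unrelated to the reported 43% and 70% reductions.

```python
def eb_weight(predicted_before: float, overdispersion: float) -> float:
    """Weight given to the model prediction: w = 1 / (1 + k * predicted)."""
    return 1.0 / (1.0 + overdispersion * predicted_before)


def eb_expected_before(predicted_before, observed_before, overdispersion):
    """Blend the predicted and observed crashes for the before period."""
    w = eb_weight(predicted_before, overdispersion)
    return w * predicted_before + (1.0 - w) * observed_before


def crash_reduction_percent(predicted_before, observed_before,
                            predicted_after, observed_after, overdispersion):
    """Simplified estimate: compare observed 'after' crashes with the EB expectation."""
    expected_before = eb_expected_before(predicted_before, observed_before,
                                         overdispersion)
    # Scale to the after period by the ratio of model predictions.
    expected_after = expected_before * (predicted_after / predicted_before)
    return 100.0 * (1.0 - observed_after / expected_after)


# Hypothetical site: 10 predicted and 18 observed crashes before, 8 observed after.
reduction = crash_reduction_percent(10.0, 18.0, 10.0, 8.0, overdispersion=0.2)
print(f"Estimated crash reduction: {reduction:.1f}%")
```

The blending step is what corrects for regression to the mean: a site selected because of an unusually high crash count is expected to improve somewhat even without treatment.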

Keywords: smart right-turn, intersection, cone of vision, empirical Bayes method

Procedia PDF Downloads 234

936 Post-modernist Tragi-Comedy: A Study of Tom Stoppard’s “Rosencrantz and Guildenstern Are Dead”

Authors: Azza Taha Zaki

Abstract:

The death of tragedy is probably the most distinctive literary controversy of the twentieth century. There is common critical consent that tragedy in the classical sense of the word is no longer possible. Thinkers, philosophers, and critics such as Nietzsche, Durrenmatt, and George Steiner have all agreed that the decline of the genre in the modern age is due to the total lack of a unified world image and the absence of a shared vision in a fragmented and ideologically diversified world. The production of Rosencrantz and Guildenstern are Dead in 1967 marked the rise of the genre of tragi-comedy as a more appropriate reflection of the spirit of the age. At the hands of such great dramatists as Tom Stoppard (1937- ), the revived genre was not used as an extra comic element to give some comic relief to an otherwise tragic text, but it was given a postmodernist touch to serve the interpretation of the dilemma of man in the postmodernist world. This paper will study features of postmodernist tragi-comedy in Rosencrantz and Guildenstern are Dead as one of the most important plays in modern British theatre and investigate Stoppard’s vision of man and life as influenced by postmodernist thought and philosophy.

Keywords: British, drama, postmodernist, Stoppard, tragi-comedy

Procedia PDF Downloads 160

935 Image Classification with Localization Using Convolutional Neural Networks

Authors: Bhuyain Mobarok Hossain

Abstract:

Image classification and localization is currently an important research topic in the field of computer vision. The evolution and advancement of deep learning and convolutional neural networks (CNN) have greatly improved the capabilities of object detection and image-based classification. Target detection is an important research problem in computer vision, especially in video surveillance systems. To address it, we apply a convolutional neural network at multiple scales and multiple locations in the image using a sliding window. Most translation networks move away from the bounding box around the area of interest. In contrast to this architecture, we treat the problem as a classification problem in which each pixel of the image is a separate section. Image classification is the task of predicting a category for a given set of data points. It is a classification problem in which a label is assigned to the image as a whole. For example, an image can be classified as a day or night shot, or images of cars and motorbikes can be automatically placed in their respective collections. Deep learning models for image classification generally include convolutional layers; such a model is referred to as a convolutional neural network (CNN).
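The idea of applying a classifier at multiple scales and locations with a sliding window can be sketched as below. The window sizes, stride, and scoring function are hypothetical; in practice the score would come from the trained CNN, which is not reproduced here.

```python
def sliding_windows(width, height, window_sizes, stride):
    """Yield (left, top, w, h) boxes covering the image at each window size."""
    for size in window_sizes:
        for top in range(0, height - size + 1, stride):
            for left in range(0, width - size + 1, stride):
                yield (left, top, size, size)


def best_window(windows, score_fn):
    """Return the window with the highest classifier score."""
    return max(windows, key=score_fn)


def toy_score(box):
    """Stand-in for a CNN confidence: prefers boxes close to (4, 4, 4, 4)."""
    left, top, w, _ = box
    return -(abs(left - 4) + abs(top - 4) + abs(w - 4))


# Hypothetical 8x8 image scanned with 4x4 and 8x8 windows at stride 4.
windows = list(sliding_windows(8, 8, window_sizes=[4, 8], stride=4))
print(len(windows), best_window(windows, toy_score))
```

Localization then falls out of classification: the box that the classifier scores highest is taken as the object's position.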

Keywords: image classification, object detection, localization, particle filter

Procedia PDF Downloads 266

934 Can Urbanisation Be the Cause for Increasing Urban Poverty: An Exploratory Analysis for India

Authors: Sarmistha Singh

Abstract:

An analysis of trends in urbanization and urban poverty in recent decades shows that poverty is distinctly declining in rural areas and increasing in urban areas. It can be argued that urbanization is fuelled by migration to the city, which draws in people with less skill and education, who therefore face obstacles to entering the city’s mainstream economy. Their share of the workforce is high, yet they remain neglected. At the same time, low wages and the absence of social security and social dialogue make them insecure, and their livelihoods are vulnerable. The paper therefore explores the relation between urbanization and urban poverty in the city; in other words, how the urbanization process affects the urban space by increasing the number of poor people in the city. The central focus is the mobility of people with less education and skill, motivated by the search for jobs and a better livelihood. Many studies have found that higher urbanization is accompanied by higher urban poverty; in other words, poverty is an impact of urbanization. The strategy of addressing urban inequality through ‘dispersal of concentration’, proposed by the World Bank and others, needs to be examined.

Keywords: urbanization, mobility, urban poverty, informal settlements, informal worker

Procedia PDF Downloads 391

933 Using Computer Vision and Machine Learning to Improve Facility Design for Healthcare Facility Worker Safety

Authors: Hengameh Hosseini

Abstract:

Design of large healthcare facilities, such as hospitals, multi-service line clinics, and nursing facilities, that can accommodate patients with wide-ranging disabilities is a challenging endeavor and one that is poorly understood among healthcare facility managers, administrators, and executives. An even less understood extension of this problem is the implication of weakly or insufficiently accommodative facility design for healthcare workers in physically intensive jobs who may also suffer from a range of disabilities and who are therefore at increased risk of workplace accident and injury. Combine this reality with the vast range of facility types, ages, and designs, and the problem of universal accommodation becomes even more daunting and complex. In this study, we focus on the implications of facility design for healthcare workers with low vision who also have physically active jobs. The points of difficulty are myriad: health service infrastructure, the equipment used in health facilities, and transport to and from appointments and other services can all pose a barrier to health care if they are inaccessible, less accessible, or even simply less comfortable for people with various disabilities. We conduct a series of surveys and interviews with employees and administrators of 7 facilities of a range of sizes and ownership models in the Northeastern United States and combine that corpus with in-facility observations and data collection to identify five major points of failure, common to all the facilities, that we concluded could pose safety threats to employees with vision impairments, ranging from very minor to severe. We determine that lack of design empathy is a major commonality among facility management and ownership. We subsequently propose three methods for remedying this lack of empathy-informed design and the dangers it poses to employees: the use of an existing open-source Augmented Reality application to simulate the low-vision experience for designers and managers; the use of a machine learning model we develop to automatically infer facility shortcomings from large datasets of recorded patient and employee reviews and feedback; and the use of a computer vision model fine-tuned on images of each facility to infer and predict facility features, locations, and workflows that could pose meaningful dangers to visually impaired employees of each facility. After conducting a series of real-world comparative experiments with each of these approaches, we conclude that each is a viable solution under particular sets of conditions, and finally characterize the range of facility types, workforce composition profiles, and work conditions under which each method would be most apt and successful.

Keywords: artificial intelligence, healthcare workers, facility design, disability, visually impaired, workplace safety

Procedia PDF Downloads 70

932 Perceptions of Senior Academics in Teacher Education Colleges Regarding the Integration of Digital Games during the Pandemic

Authors: Merav Hayakac, Orit Avidov-Ungarab

Abstract:

The current study adopted an interpretive-constructivist approach to examine how senior academics from a large sample of Israeli teacher education colleges serving general or religious populations perceived the integration of digital games into their teacher instruction and what their policy and vision were in this regard in the context of the COVID-19 pandemic. Half the participants expressed a desire to integrate digital games into their teaching and learning but acknowledged that this practice was uncommon. Only a small minority believed they had achieved successful integration, with doubt and skepticism expressed by some religious colleges. Most colleges had policies encouraging technology integration supported by ongoing funding. Although a considerable gap between policy and implementation remained, the COVID-19 pandemic was viewed as having accelerated the integration of digital games into pre-service teacher instruction. The findings suggest that discussions around technology-related vision and policy and their translation into practice should relate to the specific cultural needs and academic preparedness of the population(s) served by the college.

Keywords: COVID-19, digital games, pedagogy, teacher education colleges

Procedia PDF Downloads 70

931 The Meaningful Pixel and Texture: Exploring Digital Vision and Art Practice Based on Chinese Cosmotechnics

Authors: Xingdu Wang, Charlie Gere, Emma Rose, Yuxuan Zhao

Abstract:

The study introduces a fresh perspective on the digital realm through an examination of the Chinese concept of Xiang, elucidating how it can build an understanding of pixels and textures on screens as digital trigrams. This concept offers an outlook on the intersection of digital technology and the natural world, thereby contributing to discussions about the harmonious relationship between humans and technology. The study looks to the ancient Chinese theory of Xiang as a key to establishing theories and practices that respond to the problem of contemporary Chinese technics. Xiang is a Chinese method of understanding the essentials of things through appearances, which differs from the method of science in the West. Xiang, the foundation of Chinese visual art, is rooted in ancient Chinese philosophy and connected to the eight trigrams. The discussion of Xiang connects art, philosophy, and technology. This paper connects the meaning of Xiang with the ‘truth appearing’ philosophically through an analysis of the concepts of phenomenon and noumenon and the unique Chinese way of observing. The historical interconnection between ancient painting and writing in China then highlights the relationship between technical craftsmanship and artistic expression. In the digital context, the paper blurs, in theory, the traditional boundaries between images and text on digital screens. Lastly, the study identifies an ensemble concept relating to pixels and textures in computer vision, drawing inspiration from AI image recognition in Chinese paintings. In art practice, by presenting a fluid visual experience in the form of pixels that mimics the flow of lines in traditional calligraphy and painting, it is hoped that the viewer will be brought back to the process of the truth appearing as defined by ‘Xiang’.

Keywords: Chinese cosmotechnics, computer vision, contemporary Neo-Confucianism, texture and pixel, Xiang

Procedia PDF Downloads 32

930 Basic Modal Displacements (BMD) for Optimizing the Buildings Subjected to Earthquakes

Authors: Seyed Sadegh Naseralavi, Mohsen Khatibinia

Abstract:

In structural optimization through meta-heuristic algorithms, the structure is analyzed many times. For this reason, performing the analyses in a time-saving way is valuable. This is even more important in time-history analyses, which take much time. To this end, peak picking methods, also known as spectrum analyses, are generally utilized. However, such methods do not have the required accuracy, whether done by the square root of sum of squares (SRSS) or complete quadratic combination (CQC) rules. The paper presents an efficient technique for evaluating the dynamic responses during the optimization process with high speed and accuracy. In the method, an initial design is first obtained by using a static equivalent of the earthquake. Then, the displacements in the modal coordinates are obtained. The displacements are herein called basic modal displacements (BMD). For each new design of the structure, the responses can be derived by appropriately scaling each of the BMD along the time and amplitude and superposing them together using the corresponding modal matrices. To illustrate the efficiency of the method, an optimization problem is studied. The results show that the proposed approach is a suitable replacement for the conventional time history and spectrum analyses in such problems.
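
For readers unfamiliar with the two combination rules named in the abstract, the following is a minimal sketch of how SRSS and CQC combine peak modal responses. It is not the authors' method. The function names and the `correlation` matrix argument are assumptions made for illustration; in practice the cross-modal correlation coefficients would come from the modal frequencies and damping ratios.

```python
import math


def srss(peak_modal_responses):
    """Square root of sum of squares: sqrt(sum_i r_i^2)."""
    return math.sqrt(sum(r * r for r in peak_modal_responses))


def cqc(peak_modal_responses, correlation):
    """Complete quadratic combination: sqrt(sum_i sum_j rho_ij * r_i * r_j).

    `correlation` is a square matrix (list of lists) of cross-modal
    correlation coefficients rho_ij supplied by the caller (hypothetical
    input; this sketch does not compute them).
    """
    n = len(peak_modal_responses)
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += (correlation[i][j]
                      * peak_modal_responses[i]
                      * peak_modal_responses[j])
    return math.sqrt(total)


# Illustrative peak responses of two modes (made-up numbers).
peaks = [3.0, 4.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
combined_srss = srss(peaks)
combined_cqc = cqc(peaks, identity)
```

With an identity correlation matrix (fully uncorrelated modes), CQC reduces to SRSS, which is why the two values above coincide.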

Keywords: basic modal displacements, earthquake, optimization, spectrum

Procedia PDF Downloads 335

929 Image Captioning with Vision-Language Models

Authors: Promise Ekpo Osaine, Daniel Melesse

Abstract:

Image captioning is an active area of research in the multi-modal artificial intelligence (AI) community as it connects vision and language understanding, especially in settings where it is required that a model understands the content shown in an image and generates semantically and grammatically correct descriptions. In this project, we followed a standard approach to a deep learning-based image captioning model, injecting architecture for the encoder-decoder setup, where the encoder extracts image features, and the decoder generates a sequence of words that represents the image content. As such, we investigated image encoders, which are ResNet101, InceptionResNetV2, EfficientNetB7, EfficientNetV2M, and CLIP. As a caption generation structure, we explored long short-term memory (LSTM). The CLIP-LSTM model demonstrated superior performance compared to the encoder-decoder models, achieving a BLEU-1 score of 0.904 and a BLEU-4 score of 0.640. Additionally, among the CNN-LSTM models, EfficientNetV2M-LSTM exhibited the highest performance with a BLEU-1 score of 0.896 and a BLEU-4 score of 0.586 while using a single-layer LSTM.
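
The BLEU-1 and BLEU-4 scores quoted above are built from clipped n-gram precisions and a brevity penalty. The sketch below shows that computation for a single candidate and a single reference sentence, without smoothing. It is a simplified illustration, not the evaluation code used in the project, and the helper names are assumptions.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of the candidate against one reference."""
    cand = ngram_counts(candidate, n)
    ref = ngram_counts(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def bleu(candidate, reference, max_n):
    """Sentence-level BLEU-max_n with uniform weights and no smoothing."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(log_mean)


# Made-up example sentences.
reference = "a dog runs on the green grass".split()
candidate = "a dog runs on the grass".split()
bleu1 = bleu(candidate, reference, 1)
bleu4 = bleu(candidate, reference, 4)
```

Reported scores are normally computed over a whole test set and often against several references per image, so sentence-level values like these are only indicative.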

Keywords: multi-modal AI systems, image captioning, encoder, decoder, BLEU score

Procedia PDF Downloads 28

928 The Conception of the Students about the Presence of Mental Illness at School

Authors: Aline Giardin, Maria Rosa Chitolina, Maria Catarina Zanini

Abstract:

In this paper, we analyze the conceptions of high school students about mental health issues, and discuss the creation of basic mental health programs in schools. We base our findings on a quantitative survey that we carried out with 156 high school students of the CTISM (Colégio Técnico Industrial de Santa Maria) school, located in Santa Maria city, Brazil. We have found that: (a) 28 students relate the subject ‘mental health’ with psychiatric hospitals and lunatic asylums; (b) 28 students have relatives affected by mental diseases; (c) 76 students believe that mental patients, if treated, can live a healthy life; (d) depression, schizophrenia and bipolar disorder are the most cited diseases; (e) 84 students have contact with mental patients, but know nothing about the disease; (f) 123 students have never been instructed about mental diseases while in the school; and (g) 135 students think that a mental health program would be important in the school. We argue that these numbers reflect a vision of mental health that can be related to the reductionist education still present in schools and to the lack of integration between health professionals, science teachers, and students. Furthermore, this vision can also be related to a stigmatization process, which interferes with the interactions and with the representations regarding mental disorders and mental patients in society.

Keywords: mental health, schools, mental illness, conception

Procedia PDF Downloads 440

927 A Biologically Inspired Approach to Automatic Classification of Textile Fabric Prints Based On Both Texture and Colour Information

Authors: Babar Khan, Wang Zhijie

Abstract:

Machine vision has been playing a significant role in industrial automation, imitating a wide variety of human functions and providing improved safety, reduced labour cost, the elimination of human error and/or subjective judgments, and the creation of timely statistical product data. Despite intensive research, there have not been any attempts to classify fabric prints based on printed texture and colour; most of the research so far encompasses only black and white or grey scale images. We proposed a biologically inspired processing architecture to classify fabrics with respect to the fabric print texture and colour. We created a texture descriptor based on the HMAX model for machine vision, and incorporated a colour descriptor based on opponent colour channels simulating the single opponent and double opponent neuronal function of the brain. We found that our algorithm not only outperformed the original HMAX algorithm on classification of fabric print texture and colour, but also achieved a recognition accuracy of 85-100% on different colour and different texture fabric.

Keywords: automatic classification, texture descriptor, colour descriptor, opponent colour channel

Procedia PDF Downloads 458

926 Quantitative Wide-Field Swept-Source Optical Coherence Tomography Angiography and Visual Outcomes in Retinal Artery Occlusion

Authors: Yifan Lu, Ying Cui, Ying Zhu, Edward S. Lu, Rebecca Zeng, Rohan Bajaj, Raviv Katz, Rongrong Le, Jay C. Wang, John B. Miller

Abstract:

Purpose: Retinal artery occlusion (RAO) is an ophthalmic emergency that can lead to poor visual outcome and is associated with an increased risk of cerebral stroke and cardiovascular events. Fluorescein angiography (FA) is the traditional diagnostic tool for RAO; however, wide-field swept-source optical coherence tomography angiography (WF SS-OCTA), as a nascent imaging technology, is able to provide quick and non-invasive angiographic information with a wide field of view. In this study, we looked for associations between OCT-A vascular metrics and visual acuity in patients with prior diagnosis of RAO. Methods: Patients with diagnoses of central retinal artery occlusion (CRAO) or branch retinal artery occlusion (BRAO) were included. A 6mm x 6mm Angio and a 15mm x 15mm AngioPlex Montage OCT-A image were obtained for both eyes in each patient using the Zeiss Plex Elite 9000 WF SS-OCTA device. Each 6mm x 6mm image was divided into nine Early Treatment Diabetic Retinopathy Study (ETDRS) subfields. The average measurement of the central foveal subfield, inner ring, and outer ring was calculated for each parameter. Non-perfusion area (NPA) was manually measured using 15mm x 15mm Montage images. A linear regression model was utilized to identify a correlation between the imaging metrics and visual acuity. A P-value less than 0.05 was considered to be statistically significant. Results: Twenty-five subjects were included in the study. For RAO eyes, there was a statistically significant negative correlation between vision and retinal thickness as well as superficial capillary plexus vessel density (SCP VD). A negative correlation was found between vision and deep capillary plexus vessel density (DCP VD) without statistical significance. There was a positive correlation between vision and choroidal thickness as well as choroidal volume without statistical significance. No statistically significant correlation was found between vision and the above metrics in contralateral eyes. For NPA measurements, no significant correlation was found between vision and NPA. Conclusions: This is the first study, to the best of our knowledge, to investigate the utility of WF SS-OCTA in RAO and to demonstrate correlations between various retinal vascular imaging metrics and visual outcomes. Further investigations should explore the associations between these imaging findings and cardiovascular risk, as RAO patients are at elevated risk for symptomatic stroke. The results of this study provide a basis to understand the structural changes involved in visual outcomes in RAO. Furthermore, they may help guide management of RAO and prevention of cerebral stroke and cardiovascular accidents in patients with RAO.

Keywords: OCTA, swept-source OCT, retinal artery occlusion, Zeiss Plex Elite

Procedia PDF Downloads 110

925 The Role of International Organizations in the Implementation of Return Migration Policy in Cameroon

Authors: Charles Simplice Mbatsogo Mebo

Abstract:

With growth picking up again, Africa seems increasingly attractive for its own nationals who return home through new opportunities available for them. The purpose of our research paper is to understand the role of the international partners in Cameroon, with regard to their support for the return and reintegration of migrants. We, therefore, questioned the relevance, effectiveness, and efficacy of international instruments in reintegrating returnees to Cameroon. After our analysis, which was conducted on the basis of a documentary exploration, interviews, and field surveys, it appears that the contribution of the international partners in Cameroon is proven in relation to their participation in the financing and placement of returned experts. However, their contribution remains insufficient due to their low level of deployment and the insignificant impact of their investments on the reintegration of Cameroonian Diasporas. The research also reveals some exogenous and endogenous constraints that hinder international institutions' actions in terms of accompanying migrants returning to Cameroon. Finally, for a better management of the returnees' issue, it is necessary to set up a mechanism to raise awareness and a coordination system of all international actors involved. It is also relevant to reform the migration policy, build institutional capacities, and improve the juridical-administrative and economic environment so as to favor co-development in Cameroon.

Keywords: international partners, returnees, diaspora, migration policy, co-development

Procedia PDF Downloads 121

924 Domain Adaptation Save Lives - Drowning Detection in Swimming Pool Scene Based on YOLOV8 Improved by Gaussian Poisson Generative Adversarial Network Augmentation

Authors: Simiao Ren, En Wei

Abstract:

Drowning is a significant safety issue worldwide, and a robust computer vision-based alert system can easily prevent such tragedies in swimming pools. However, due to domain shift caused by the visual gap (potentially due to lighting, indoor scene change, pool floor color, etc.) between the training swimming pool and the test swimming pool, the robustness of such algorithms has been questionable. The annotation cost for labeling each new swimming pool is too expensive for mass adoption of such a technique. To address this issue, we propose a domain-aware data augmentation pipeline based on Gaussian Poisson Generative Adversarial Network (GP-GAN). Combined with YOLOv8, we demonstrate that such a domain adaptation technique can significantly improve the model performance (from 0.24 mAP to 0.82 mAP) on new test scenes. As the augmentation method only requires background imagery from the new domain (no annotation needed), we believe this is a promising, practical route for preventing swimming pool drowning.

Keywords: computer vision, deep learning, YOLOv8, detection, swimming pool, drowning, domain adaptation, generative adversarial network, GAN, GP-GAN

Procedia PDF Downloads 58

923 Analysis of Facial Expressions with Amazon Rekognition

Authors: Kashika P. H.

Abstract:

The development of computer vision systems has been greatly aided by the efficient and precise detection of images and videos. Although the ability to recognize and comprehend images is a strength of the human brain, employing technology to tackle this issue is exceedingly challenging. In the past few years, the use of deep learning algorithms for object detection has dramatically expanded. One of the key issues in the realm of image recognition is the recognition and detection of certain notable people from randomly acquired photographs. Face recognition identifies, assesses, and compares faces for a variety of purposes, including user identification, user counting, and classification. With the aid of an accessible deep learning-based API, this article intends to recognize various faces of people and their facial descriptors more accurately. The purpose of this study is to locate suitable individuals and deliver accurate information about them by using the Amazon Rekognition system to identify a specific human from a vast image dataset. We have chosen the Amazon Rekognition system, which allows for more accurate face analysis, face comparison, and face search, to tackle this difficulty.

Keywords: Amazon Rekognition, API, deep learning, computer vision, face detection, text detection

Procedia PDF Downloads 77

922 Multi-Spectral Deep Learning Models for Forest Fire Detection

Authors: Smitha Haridasan, Zelalem Demissie, Atri Dutta, Ajita Rattani

Abstract:

Aided by the wind, all it takes is one ember and a few minutes to create a wildfire. Wildfires are growing in frequency and size due to climate change. Wildfires and their consequences are one of the major environmental concerns. Every year, millions of hectares of forests are destroyed around the world, causing mass destruction and human casualties. Thus, early detection of wildfire becomes a critical component in mitigating this threat. Many computer vision-based techniques have been proposed for the early detection of forest fire using video surveillance. Several computer vision-based methods have been proposed to predict and detect forest fires at various spectrums, namely, RGB, HSV, and YCbCr. The aim of this paper is to propose a multi-spectral deep learning model that combines information from different spectrums at intermediate layers for accurate fire detection. A heterogeneous dataset assembled from publicly available datasets is used for model training and evaluation in this study. The experimental results show that multi-spectral deep learning models could obtain an improvement of about 4.68% over those based on a single spectrum for fire detection.
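
As a small illustration of the three representations named in the abstract, the sketch below converts a single 8-bit RGB pixel to HSV (through the standard-library `colorsys` module) and to YCbCr. The YCbCr coefficients are the full-range BT.601 values used by JPEG/JFIF, which is an assumption, since the abstract does not say which variant the authors used. The function names are hypothetical, and the sketch covers only the input representations, not the fusion at intermediate layers.

```python
import colorsys


def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range BT.601 YCbCr (JPEG/JFIF)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def pixel_views(r, g, b):
    """Return the same pixel in RGB, HSV, and YCbCr representations.

    HSV components are in [0, 1], as returned by colorsys.rgb_to_hsv.
    """
    hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return {"rgb": (r, g, b), "hsv": hsv, "ycbcr": rgb_to_ycbcr(r, g, b)}


# A saturated orange, a typical flame-like colour (made-up example value).
views = pixel_views(255, 128, 0)
```

A multi-spectral model of the kind described would feed each representation to its own branch before merging features.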

Keywords: deep learning, forest fire detection, multi-spectral learning, natural hazard detection

Procedia PDF Downloads 203

921 Experimental Investigation of the Performance and Emission Characteristics of a Diesel Engine Fuelled by Bio-Additives under Variable Loads

Authors: Faisal Mahroogi, Mahmoud Bady, Ahmed Alsisi

Abstract:

The Saudi Vision 2030 program is a government initiative aimed at increasing economic, social, and cultural diversification. Dedicated to clean energy, the Kingdom has been working on solutions such as the circular carbon economy (CCE) and diversifying its energy mix to address energy and climate challenges. With a goal of a Net Zero future by 2060, Saudi Arabia's Vision 2030 emphasizes sustainability. Vision 2030 approaches today's energy and climate challenges responsibly and creatively as a model for a sustainable future. As per the ambitions of the National Environment Strategy of the Saudi Ministry of Environment, Agriculture, and Water (MEWA), raising environmental compliance across all sectors and reducing pollution and adverse environmental impacts are critical focus areas. Therefore, the present paper introduces an experimental investigation of a diesel engine's performance and exhaust emissions operating with waste cooking oil (WCO) as a diesel additive. The engine type used is a one-cylinder naturally aspirated constant-speed direct-injection diesel engine. The main variables of the study were the load and the fuel type. The engine performance and emission characteristics were investigated when fueled with three blends. The first blend (D70B10W10DD10) is composed of 70% diesel, 10% butanol, 10% WCO, and 10% diethyl ether. The second blend (D60B10W20DD10) is composed of 60% diesel, 10% butanol, 20% WCO, and 10% diethyl ether. The third blend (D50B10W30DD10) comprises 50% diesel, 10% butanol, 30% WCO, and 10% diethyl ether. The study results show that the engine emissions of carbon monoxide (CO) and nitrogen oxides (NOx) vary considerably with the fuel composition and applied load. Concerning engine performance, the cylinder pressure is sensitive to the load and fuel type variation.

Keywords: ICE, waste cooking oil, bio additives, butanol, combustion and emission characteristics

Procedia PDF Downloads 10

920 Resisting Adversarial Assaults: A Model-Agnostic Autoencoder Solution

Authors: Massimo Miccoli, Luca Marangoni, Alberto Aniello Scaringi, Alessandro Marceddu, Alessandro Amicone

Abstract:

The susceptibility of deep neural networks (DNNs) to adversarial manipulations is a recognized challenge within the computer vision domain. Adversarial examples, crafted by adding subtle yet malicious alterations to benign images, exploit this vulnerability. Various defense strategies have been proposed to safeguard DNNs against such attacks, stemming from diverse research hypotheses. Building upon prior work, our approach involves the utilization of autoencoder models. Autoencoders, a type of neural network, are trained to learn representations of training data and reconstruct inputs from these representations, typically minimizing reconstruction errors like mean squared error (MSE). Our autoencoder was trained on a dataset of benign examples, learning features specific to them. Consequently, when presented with significantly perturbed adversarial examples, the autoencoder exhibited high reconstruction errors. The architecture of the autoencoder was tailored to the dimensions of the images under evaluation. We considered various image sizes, constructing models differently for 256x256 and 512x512 images. Moreover, the choice of the computer vision model is crucial, as most adversarial attacks are designed with specific AI structures in mind. To mitigate this, we proposed a method to replace image-specific dimensions with a structure independent of both dimensions and neural network models, thereby enhancing robustness. Our multi-modal autoencoder reconstructs the spectral representation of images across the red-green-blue (RGB) color channels. To validate our approach, we conducted experiments using diverse datasets and subjected them to adversarial attacks using models such as ResNet50 and ViT_L_16 from the torchvision library. The autoencoder extracted features used in a classification model, resulting in an MSE (RGB) of 0.014, a classification accuracy of 97.33%, and a precision of 99%.
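
The observation that adversarial inputs produce high reconstruction errors can be illustrated with a simple threshold rule. This is a deliberately simplified stand-in, not the authors' pipeline, which feeds autoencoder features to a classification model. The function names, the flattened pixel lists, and the threshold value are all assumptions made for illustration.

```python
def mean_squared_error(original, reconstruction):
    """MSE between two equally sized, flattened pixel sequences."""
    if len(original) != len(reconstruction) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = ((a - b) ** 2 for a, b in zip(original, reconstruction))
    return sum(squared) / len(original)


def looks_adversarial(original, reconstruction, threshold):
    """Flag an input whose reconstruction error exceeds a chosen threshold.

    `threshold` is a hypothetical value that would be tuned on benign
    validation data; it is not a number taken from the abstract.
    """
    return mean_squared_error(original, reconstruction) > threshold


# Made-up pixel values in [0, 1]; `reconstruction` stands in for the
# output of an autoencoder trained on benign images.
benign = [0.10, 0.20, 0.30, 0.40]
reconstruction = [0.11, 0.19, 0.31, 0.39]
perturbed = [0.60, 0.20, 0.80, 0.40]

benign_flag = looks_adversarial(benign, reconstruction, threshold=0.05)
perturbed_flag = looks_adversarial(perturbed, reconstruction, threshold=0.05)
```

A fixed threshold is the simplest possible decision rule; a learned classifier over the reconstruction features, as the abstract describes, can separate the two classes more flexibly.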

Keywords: adversarial attacks, malicious images detector, binary classifier, multimodal transformer autoencoder

Procedia PDF Downloads 38

919 F-VarNet: Fast Variational Network for MRI Reconstruction

Authors: Omer Cahana, Maya Herman, Ofer Levi

Abstract:

Magnetic resonance imaging (MRI) is a lengthy medical scan because of its long acquisition time. This length is mainly due to the traditional sampling theorem, which defines a lower bound for sampling. However, it is still possible to accelerate the scan by using a different approach, such as compressed sensing (CS) or parallel imaging (PI). These two complementary methods can be combined to achieve a faster scan with high-fidelity imaging. In order to achieve that, two properties have to exist: i) the signal must be sparse under a known transform domain, ii) the sampling method must be incoherent. In addition, a nonlinear reconstruction algorithm needs to be applied to recover the signal. While the deep learning (DL) field has advanced rapidly and demonstrated tremendous successes in various computer vision tasks, the field of MRI reconstruction is still at an early stage. In this paper, we present an extension of the state-of-the-art model in MRI reconstruction, VarNet. We extend VarNet by using dilated convolutions at different scales, which enlarges the receptive field to capture more contextual information. Moreover, we simplified the sensitivity map estimation (SME), since it holds many unnecessary layers for this task. Those improvements have shown significant decreases in computation costs as well as higher accuracy.
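
To make the receptive-field claim concrete, the sketch below computes the receptive field of a stack of stride-1 convolutions, where each layer adds (kernel_size - 1) * dilation to the field. The layer configurations are illustrative assumptions and are not taken from F-VarNet.

```python
def receptive_field(layers):
    """Receptive field (in pixels, along one axis) of stacked stride-1 convs.

    `layers` is a list of (kernel_size, dilation) pairs. Each layer grows
    the receptive field by (kernel_size - 1) * dilation.
    """
    field = 1
    for kernel_size, dilation in layers:
        field += (kernel_size - 1) * dilation
    return field


# Three 3x3 layers without dilation versus with dilations 1, 2, 4
# (hypothetical configurations with the same number of weights).
plain = receptive_field([(3, 1), (3, 1), (3, 1)])
dilated = receptive_field([(3, 1), (3, 2), (3, 4)])
```

Both stacks have the same number of weights, yet the dilated one sees 15 pixels along each axis instead of 7, which is the sense in which dilation captures more context at no extra parameter cost.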

Keywords: MRI, deep learning, variational network, computer vision, compressed sensing

Procedia PDF Downloads 114