Search results for: computer vision
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3033

Search results for: computer vision

2973 A Review of Deep Learning Methods in Computer-Aided Detection and Diagnosis Systems based on Whole Mammogram and Ultrasound Scan Classification

Authors: Ian Omung'a

Abstract:

Breast cancer remains to be one of the deadliest cancers for women worldwide, with the risk of developing tumors being as high as 50 percent in Sub-Saharan African countries like Kenya. With as many as 42 percent of these cases set to be diagnosed late when cancer has metastasized and or the prognosis has become terminal, Full Field Digital [FFD] Mammography remains an effective screening technique that leads to early detection where in most cases, successful interventions can be made to control or eliminate the tumors altogether. FFD Mammograms have been proven to multiply more effective when used together with Computer-Aided Detection and Diagnosis [CADe] systems, relying on algorithmic implementations of Deep Learning techniques in Computer Vision to carry out deep pattern recognition that is comparable to the level of a human radiologist and decipher whether specific areas of interest in the mammogram scan image portray abnormalities if any and whether these abnormalities are indicative of a benign or malignant tumor. Within this paper, we review emergent Deep Learning techniques that will prove relevant to the development of State-of-The-Art FFD Mammogram CADe systems. These techniques will span self-supervised learning for context-encoded occlusion, self-supervised learning for pre-processing and labeling automation, as well as the creation of a standardized large-scale mammography dataset as a benchmark for CADe systems' evaluation. Finally, comparisons are drawn between existing practices that pre-date these techniques and how the development of CADe systems that incorporate them will be different.

Keywords: breast cancer diagnosis, computer aided detection and diagnosis, deep learning, whole mammogram classfication, ultrasound classification, computer vision

Procedia PDF Downloads 59
2972 Deep Learning based Image Classifiers for Detection of CSSVD in Cacao Plants

Authors: Atuhurra Jesse, N'guessan Yves-Roland Douha, Pabitra Lenka

Abstract:

The detection of diseases within plants has attracted a lot of attention from computer vision enthusiasts. Despite the progress made to detect diseases in many plants, there remains a research gap to train image classifiers to detect the cacao swollen shoot virus disease or CSSVD for short, pertinent to cacao plants. This gap has mainly been due to the unavailability of high quality labeled training data. Moreover, institutions have been hesitant to share their data related to CSSVD. To fill these gaps, image classifiers to detect CSSVD-infected cacao plants are presented in this study. The classifiers are based on VGG16, ResNet50 and Vision Transformer (ViT). The image classifiers are evaluated on a recently released and publicly accessible KaraAgroAI Cocoa dataset. The best performing image classifier, based on ResNet50, achieves 95.39\% precision, 93.75\% recall, 94.34\% F1-score and 94\% accuracy on only 20 epochs. There is a +9.75\% improvement in recall when compared to previous works. These results indicate that the image classifiers learn to identify cacao plants infected with CSSVD.

Keywords: CSSVD, image classification, ResNet50, vision transformer, KaraAgroAI cocoa dataset

Procedia PDF Downloads 49
2971 Proposal for a Web System for the Control of Fungal Diseases in Grapes in Fruits Markets

Authors: Carlos Tarmeño Noriega, Igor Aguilar Alonso

Abstract:

Fungal diseases are common in vineyards; they cause a decrease in the quality of the products that can be sold, generating distrust of the customer towards the seller when buying fruit. Currently, technology allows the classification of fruits according to their characteristics thanks to artificial intelligence. This study proposes the implementation of a control system that allows the identification of the main fungal diseases present in the Italia grape, making use of a convolutional neural network (CNN), OpenCV, and TensorFlow. The methodology used was based on a collection of 20 articles referring to the proposed research on quality control, classification, and recognition of fruits through artificial vision techniques.

Keywords: computer vision, convolutional neural networks, quality control, fruit market, OpenCV, TensorFlow

Procedia PDF Downloads 35
2970 A New and Simple Method of Plotting Binocular Single Vision Field (BSVF) using the Cervical Range of Motion - CROM - Device

Authors: Mihir Kothari, Heena Khan, Vivek Rathod

Abstract:

Assessment of binocular single vision field (BSVF) is traditionally done using a Goldmann perimeter. The measurement of BSVF is important for the management of incomitant strabismus, viz. orbital fractures, thyroid orbitopathy, oculomotor cranial nerve palsies, Duane syndrome etc. In this paper, we describe a new technique for measuring BSVF using a CROM device. Goldmann perimeter is very bulky and expensive (Euro 5000.00 or more) instrument which is 'almost' obsolete from the contemporary ophthalmology practice. Whereas, CROM can be easily made in the DIY (do it yourself) manner for the fraction of the price of the perimeter (only Euro 15.00). Moreover, CROM is useful for the accurate measurement of ocular torticollis vis. nystagmus, paralytic or incomitant squint etc, and it is highly portable.

Keywords: binocular single vision, perimetry, cervical rgen of motion, visual field, binocular single vision field

Procedia PDF Downloads 29
2969 Facilitating Curriculum Access for Pupils with Vision Impairments: An Analysis of the Role of Specialist Teachers in England and Turkey

Authors: Kubra Akbayrak

Abstract:

In parallel with increasing inclusive practice for pupils with vision impairments, the role of specialist teachers who have specialized in the area of vision impairment has dramatically changed in recent years. This study, therefore, aims to provide a holistic perspective towards the distinctive role of specialist teachers of pupils with vision impairments in different educational settings (including mainstream settings, special school settings, etc.) in Turkey and England. Within the scope of the study, semi-structured interviews have been conducted with 17 specialist teachers in Turkey and 14 specialist teachers in England in order to reveal the perception of specialist teachers regarding their roles in different educational settings as well as their perception towards their pre-service training. As this study is a part of an ongoing PhD research, the qualitative data through semi-structured interviews will be analyzed through using Bronfenbrenner’s ecological systems theory as a theoretical framework in order to provide a holistic view regarding the role of specialist teachers particularly in facilitating curriculum access for pupils with vision impairments in England and Turkey. However, the initial findings broadly illustrate that specialist teachers who work in special school settings have different understanding regarding their roles compared to specialist teachers who work in mainstream settings in relation to promoting independence for pupils with vision impairments. The initial findings also imply that specialist teachers in England and Turkey have different perception about their roles in relation to providing specialist advice and guidance for families of pupils. With the completion of the analysis of the study, it is hoped that the findings will provide an insight into the role of specialist teachers in order to provide implication for programmes which prepare specialist teachers of pupils with vision impairments.

Keywords: curriculum access, pupils with vision impairments, specialist teachers, special education

Procedia PDF Downloads 185
2968 Plant Identification Using Convolution Neural Network and Vision Transformer-Based Models

Authors: Virender Singh, Mathew Rees, Simon Hampton, Sivaram Annadurai

Abstract:

Plant identification is a challenging task that aims to identify the family, genus, and species according to plant morphological features. Automated deep learning-based computer vision algorithms are widely used for identifying plants and can help users narrow down the possibilities. However, numerous morphological similarities between and within species render correct classification difficult. In this paper, we tested custom convolution neural network (CNN) and vision transformer (ViT) based models using the PyTorch framework to classify plants. We used a large dataset of 88,000 provided by the Royal Horticultural Society (RHS) and a smaller dataset of 16,000 images from the PlantClef 2015 dataset for classifying plants at genus and species levels, respectively. Our results show that for classifying plants at the genus level, ViT models perform better compared to CNN-based models ResNet50 and ResNet-RS-420 and other state-of-the-art CNN-based models suggested in previous studies on a similar dataset. ViT model achieved top accuracy of 83.3% for classifying plants at the genus level. For classifying plants at the species level, ViT models perform better compared to CNN-based models ResNet50 and ResNet-RS-420, with a top accuracy of 92.5%. We show that the correct set of augmentation techniques plays an important role in classification success. In conclusion, these results could help end users, professionals and the general public alike in identifying plants quicker and with improved accuracy.

Keywords: plant identification, CNN, image processing, vision transformer, classification

Procedia PDF Downloads 50
2967 Functional Vision of Older People with Cognitive Impairment Living in Galician Nursing Homes

Authors: C. Vázquez, L. M. Gigirey, C. P. del Oro, S. Seoane

Abstract:

Poor vision is common among older people, and several studies show connections between visual impairment and cognitive function. 15 older adult live in Galician Government nursing homes, and cognitive decline is one of the main reasons of admission. Objectives: (1) To evaluate functional far and near vision of older people with cognitive impairment. (2) To determine connections between visual and cognitive state of “our” residents. Methodology: A total of 364 older adults (aged 65 years or more) underwent a visual and cognitive screening. We tested presenting visual acuity (binocular visual acuity with habitual correction if warn) for distance and near vision (E-Snellen, usual working distance for near vision). Binocular presenting visual acuity less than 0.3 was used as cut point for diagnosis of visual impairment. Exclusion criteria included immobilized residents unable to reach the USC Dual Sensory Loss Unit for visual screening. To screen cognition we employed the mini-mental examination test (Spanish version). Analysis of categorical variables was performed using chi-square tests. We utilized Pearson and Spearman correlation tests and the variance analysis to determine differences between groups of interest (SPSS 19.0 version). Results: the percentage of residents with cognitive decline reaches 32.2% Prevalence of visual impairment for distance and near vision increases among those subjects with cognitive impairment respect those with normal cognition. Shift correlation exists between distance visual acuity and mini-mental test (age and sex controlled), and moderate association was found in case of near vision (p<0.01). Conclusion: First results shows that people with cognitive impairment have poor functional distance and near vision than those with normal cognition. Next step will be to analyse the individual contribution of distance and near vision loss on cognition.

Keywords: visual impairment, cognition, aging, nursing homes

Procedia PDF Downloads 393
2966 The Corporate Vision Effect on Rajabhat University Brand Building in Thailand

Authors: Pisit Potjanajaruwit

Abstract:

This study aims to (1) investigate the corporate vision factor influencing Rajabhat University brand building in Thailand and (2) explore influences of brand building upon Rajabhat University stakeholders’ loyalty, and the research method will use mixed methods to conduct qualitative research with the quantitative research. The qualitative will approach by Indebt-interview the executive of Rathanagosin Rajabhat University group for 6 key informants and the quantitative data was collected by questionnaires distributed to stakeholder including instructors, staff, students and parents of the Rathanagosin Rajabhat University group for 400 sampling were selected by multi-stage sampling method. Data was analyzed by Structural Equation Modeling: SEM and also provide the focus group interview for confirming the model. Findings corporate vision had a direct and positive influence on Rajabhat University brand building were showed direct and positive influence on stakeholder’s loyalty and stakeholder’s loyalty was indirectly influenced by corporate vision through Rajabhat University brand building.

Keywords: brand building, corporate vision, Rajabhat University, stakeholder‘s loyalty

Procedia PDF Downloads 180
2965 Domain Adaptation Save Lives - Drowning Detection in Swimming Pool Scene Based on YOLOV8 Improved by Gaussian Poisson Generative Adversarial Network Augmentation

Authors: Simiao Ren, En Wei

Abstract:

Drowning is a significant safety issue worldwide, and a robust computer vision-based alert system can easily prevent such tragedies in swimming pools. However, due to domain shift caused by the visual gap (potentially due to lighting, indoor scene change, pool floor color etc.) between the training swimming pool and the test swimming pool, the robustness of such algorithms has been questionable. The annotation cost for labeling each new swimming pool is too expensive for mass adoption of such a technique. To address this issue, we propose a domain-aware data augmentation pipeline based on Gaussian Poisson Generative Adversarial Network (GP-GAN). Combined with YOLOv8, we demonstrate that such a domain adaptation technique can significantly improve the model performance (from 0.24 mAP to 0.82 mAP) on new test scenes. As the augmentation method only require background imagery from the new domain (no annotation needed), we believe this is a promising, practical route for preventing swimming pool drowning.

Keywords: computer vision, deep learning, YOLOv8, detection, swimming pool, drowning, domain adaptation, generative adversarial network, GAN, GP-GAN

Procedia PDF Downloads 47
2964 Analysis of Facial Expressions with Amazon Rekognition

Authors: Kashika P. H.

Abstract:

The development of computer vision systems has been greatly aided by the efficient and precise detection of images and videos. Although the ability to recognize and comprehend images is a strength of the human brain, employing technology to tackle this issue is exceedingly challenging. In the past few years, the use of Deep Learning algorithms to treat object detection has dramatically expanded. One of the key issues in the realm of image recognition is the recognition and detection of certain notable people from randomly acquired photographs. Face recognition uses a way to identify, assess, and compare faces for a variety of purposes, including user identification, user counting, and classification. With the aid of an accessible deep learning-based API, this article intends to recognize various faces of people and their facial descriptors more accurately. The purpose of this study is to locate suitable individuals and deliver accurate information about them by using the Amazon Rekognition system to identify a specific human from a vast image dataset. We have chosen the Amazon Rekognition system, which allows for more accurate face analysis, face comparison, and face search, to tackle this difficulty.

Keywords: Amazon rekognition, API, deep learning, computer vision, face detection, text detection

Procedia PDF Downloads 71
2963 Multi-Spectral Deep Learning Models for Forest Fire Detection

Authors: Smitha Haridasan, Zelalem Demissie, Atri Dutta, Ajita Rattani

Abstract:

Aided by the wind, all it takes is one ember and a few minutes to create a wildfire. Wildfires are growing in frequency and size due to climate change. Wildfires and its consequences are one of the major environmental concerns. Every year, millions of hectares of forests are destroyed over the world, causing mass destruction and human casualties. Thus early detection of wildfire becomes a critical component to mitigate this threat. Many computer vision-based techniques have been proposed for the early detection of forest fire using video surveillance. Several computer vision-based methods have been proposed to predict and detect forest fires at various spectrums, namely, RGB, HSV, and YCbCr. The aim of this paper is to propose a multi-spectral deep learning model that combines information from different spectrums at intermediate layers for accurate fire detection. A heterogeneous dataset assembled from publicly available datasets is used for model training and evaluation in this study. The experimental results show that multi-spectral deep learning models could obtain an improvement of about 4.68 % over those based on a single spectrum for fire detection.

Keywords: deep learning, forest fire detection, multi-spectral learning, natural hazard detection

Procedia PDF Downloads 190
2962 F-VarNet: Fast Variational Network for MRI Reconstruction

Authors: Omer Cahana, Maya Herman, Ofer Levi

Abstract:

Magnetic resonance imaging (MRI) is a long medical scan that stems from a long acquisition time. This length is mainly due to the traditional sampling theorem, which defines a lower boundary for sampling. However, it is still possible to accelerate the scan by using a different approach, such as compress sensing (CS) or parallel imaging (PI). These two complementary methods can be combined to achieve a faster scan with high-fidelity imaging. In order to achieve that, two properties have to exist: i) the signal must be sparse under a known transform domain, ii) the sampling method must be incoherent. In addition, a nonlinear reconstruction algorithm needs to be applied to recover the signal. While the rapid advance in the deep learning (DL) field, which has demonstrated tremendous successes in various computer vision task’s, the field of MRI reconstruction is still in an early stage. In this paper, we present an extension of the state-of-the-art model in MRI reconstruction -VarNet. We utilize VarNet by using dilated convolution in different scales, which extends the receptive field to capture more contextual information. Moreover, we simplified the sensitivity map estimation (SME), for it holds many unnecessary layers for this task. Those improvements have shown significant decreases in computation costs as well as higher accuracy.

Keywords: MRI, deep learning, variational network, computer vision, compress sensing

Procedia PDF Downloads 106
2961 Influence of Peripheral Vision Restrictions on the Walking Trajectory When Texting While Walking

Authors: Macky Kato, Takeshi Sato, Mizuki Nakajima

Abstract:

One major problem related to the use of smartphones is texting while simultaneously engaging in other things, resulting in serious road accidents. Apart from texting while driving being one of the most dangerous behaviors, texting while walking is also dangerous because it narrows the pedestrians’ field of vision. However, many of pedestrian text while walking very habitually. Smartphone users often overlook the potential harm associated with this behavior even while crossing roads. The successful texting while walking make them think that they are safe. The purpose of this study is to reveal of the influence of peripheral vision to the stability of walking trajectory with texting while walking. In total, 9 healthy male university students participated in the experiment. Their mean age was 21.4 years, and standard deviation was 0.7 years. They attempted to walk 10 m in three conditions. First one is the control (CTR) condition, with no phone and no restriction. The second one is the texting while walking (TWG) with no restrictions. The third one is restriction condition (PRS), with phone restricted by experimental peripheral goggles. The horizontal distances (HDS) and directions are measured as the scale of horizontal stability. The longitudinal distances (LDS) between the footprints were measured as the scale of the walking rhythm. The results showed that the HDS of the footprints from the straight line increased as the participants walked in the TWG and PRS conditions. In the PRS condition, this tendency was particularly remarkable. In addition, the LDS between the footprints decreased in the order of the CTR, TWG, and PRS conditions. The ANOVA results showed significant differences in the three conditions with respect to HDS. The differences among these conditions showed that the narrowing of the Pedestrian's vision because of smartphone use influences the walking trajectory and rhythm. It can be said that the pedestrians seem to use their peripheral vision marginally on texting while walking. Therefore, we concluded that the texting while walking narrows the peripheral vision so danger to increase the risk of the accidents.

Keywords: peripheral vision, stability, texting while walking, walking trajectory

Procedia PDF Downloads 218
2960 Essentiality of Core Strategic Vision in Continuous Cost Reduction Management

Authors: Lai Ving Kam

Abstract:

Many markets are maturing, consumer buying powers are weakening and customer preferences change rapidly. To survive, many adopt fast paced continuous cost reduction and competitive pricing to remain relevance. Marketers desire to push for more sales to increase revenues have intensified competitions at time cannibalize the product and market. The amazing technologies changes have created both hope and despair to the industries. The pressure to constantly reduce cost, on the one hand, create and market new products in cheaper prices and shorter life cycles, on the other has become a continuous endeavour. The twin trends appear irreconcilable. Can core strategic vision provides and adapts new directions in continuous cost reduction? This study investigates core strategic vision able to meet this need, for firms to survive and stay profitable. Under current uncertainty market, are firms falling back on their core strategic visions to take them out of the unfavourable positions?

Keywords: core strategy vision, continuous cost reduction, fashionable products industry, competitive pricing

Procedia PDF Downloads 287
2959 Development of Agricultural Robotic Platform for Inter-Row Plant: An Autonomous Navigation Based on Machine Vision

Authors: Alaa El-Din Rezk

Abstract:

In Egypt, management of crops still away from what is being used today by utilizing the advances of mechanical design capabilities, sensing and electronics technology. These technologies have been introduced in many places and recorm, for Straight Path, Curved Path, Sine Wave ded high accuracy in different field operations. So, an autonomous robotic platform based on machine vision has been developed and constructed to be implemented in Egyptian conditions as self-propelled mobile vehicle for carrying tools for inter/intra-row crop management based on different control modules. The experiments were carried out at plant protection research institute (PPRI) during 2014-2015 to optimize the accuracy of agricultural robotic platform control using machine vision in term of the autonomous navigation and performance of the robot’s guidance system. Results showed that the robotic platform' guidance system with machine vision was able to adequately distinguish the path and resisted image noise and did better than human operators for getting less lateral offset error. The average error of autonomous was 2.75, 19.33, 21.22, 34.18, and 16.69 mm. while the human operator was 32.70, 4.85, 7.85, 38.35 and 14.75 mm Path, Offset Discontinuity and Angle Discontinuity respectively.

Keywords: autonomous robotic, Hough transform, image processing, machine vision

Procedia PDF Downloads 269
2958 An Efficient Approach for Recyclable Waste Detection and Classification Using Deep Learning

Authors: Md Aminul Haque, Md. Aminul Islam, Prabal Kumar Chowdhury

Abstract:

One of the world’s most pressing issues right now is the lack of a competent waste management system, particularly in emerging and underdeveloped countries. Recycling solid waste, which comprises numerous dangerous non-biodegradable sub-stances like glass, metals, plastics, etc, is the most essential step in reducing waste-related issues in the environment. Typically, collected waste includes all types of waste that must be thoroughly sorted to be recycled efficiently. Most countries use manual waste sorting techniques, which are efficient. Nevertheless, the waste sorting process by human beings is not safe as there is always a risk of exposing themselves to toxic wastes, which could be serious for their health. Our thesis presents a Deep Learning technique based on computer vision for automatically identifying waste. To construct the model, we used Convolutional Neural Networks, real-time object detection systems, such as YOLOv5 and YOLOv7, as well as several transfers learning-based architectures, including VGG16, MobileNet, Inception-Resnet-v2. The model is trained on numerous images for each type of waste to ensure no overfitting and greater accuracy. The highest accuracy we achieved for our waste detection model YOLOv5x, is 93.7%.

Keywords: deep learning, object detection, YOLOv7, image processing, computer vision

Procedia PDF Downloads 0
2957 Hand Motion Tracking as a Human Computer Interation for People with Cerebral Palsy

Authors: Ana Teixeira, Joao Orvalho

Abstract:

This paper describes experiments using Scratch games, to check the feasibility of employing cerebral palsy users gestures as an alternative of interaction with a computer carried out by students of Master Human Computer Interaction (HCI) of IPC Coimbra. The main focus of this work is to study the usability of a Web Camera as a motion tracking device to achieve a virtual human-computer interaction used by individuals with CP. An approach for Human-computer Interaction (HCI) is present, where individuals with cerebral palsy react and interact with a scratch game through the use of a webcam as an external interaction device. Motion tracking interaction is an emerging technology that is becoming more useful, effective and affordable. However, it raises new questions from the HCI viewpoint, for example, which environments are most suitable for interaction by users with disabilities. In our case, we put emphasis on the accessibility and usability aspects of such interaction devices to meet the special needs of people with disabilities, and specifically people with CP. Despite the fact that our work has just started, preliminary results show that, in general, computer vision interaction systems are very useful; in some cases, these systems are the only way by which some people can interact with a computer. The purpose of the experiments was to verify two hypothesis: 1) people with cerebral palsy can interact with a computer using their natural gestures, 2) scratch games can be a research tool in experiments with disabled young people. A game in Scratch with three levels is created to be played through the use of a webcam. This device permits the detection of certain key points of the user’s body, which allows to assume the head, arms and specially the hands as the most important aspects of recognition. Tests with 5 individuals of different age and gender were made throughout 3 days through periods of 30 minutes with each participant. For a more extensive and reliable statistical analysis, the number of both participants and repetitions in further investigations should be increased. However, already at this stage of research, it is possible to draw some conclusions. First, and the most important, is that simple scratch games on the computer can be a research tool that allows investigating the interaction with computer performed by young persons with CP using intentional gestures. Measurements performed with the assistance of games are attractive for young disabled users. The second important conclusion is that they are able to play scratch games using their gestures. Therefore, the proposed interaction method is promising for them as a human-computer interface. In the future, we plan to include the development of multimodal interfaces that combine various computer vision devices with other input devices improvements in the existing systems to accommodate more the special needs of individuals, in addition, to perform experiments on a larger number of participants.

Keywords: motion tracking, cerebral palsy, rehabilitation, HCI

Procedia PDF Downloads 203
2956 Nighttime Dehaze - Enhancement

Authors: Harshan Baskar, Anirudh S. Chakravarthy, Prateek Garg, Divyam Goel, Abhijith S. Raj, Kshitij Kumar, Lakshya, Ravichandra Parvatham, V. Sushant, Bijay Kumar Rout

Abstract:

In this paper, we introduce a new computer vision task called nighttime dehaze-enhancement. This task aims to jointly perform dehazing and lightness enhancement. Our task fundamentally differs from nighttime dehazing – our goal is to jointly dehaze and enhance scenes, while nighttime dehazing aims to dehaze scenes under a nighttime setting. In order to facilitate further research on this task, we release a new benchmark dataset called Reside-β Night dataset, consisting of 4122 nighttime hazed images from 2061 scenes and 2061 ground truth images. Moreover, we also propose a new network called NDENet (Nighttime Dehaze-Enhancement Network), which jointly performs dehazing and low-light enhancement in an end-to-end manner. We evaluate our method on the proposed benchmark and achieve SSIM of 0.8962 and PSNR of 26.25. We also compare our network with other baseline networks on our benchmark to demonstrate the effectiveness of our approach. We believe that nighttime dehaze-enhancement is an essential task, particularly for autonomous navigation applications, and we hope that our work will open up new frontiers in research. Our dataset and code will be made publicly available upon acceptance of our paper.

Keywords: dehazing, image enhancement, nighttime, computer vision

Procedia PDF Downloads 104
2955 Multi-Labeled Aromatic Medicinal Plant Image Classification Using Deep Learning

Authors: Tsega Asresa, Getahun Tigistu, Melaku Bayih

Abstract:

Computer vision is a subfield of artificial intelligence that allows computers and systems to extract meaning from digital images and video. It is used in a wide range of fields of study, including self-driving cars, video surveillance, medical diagnosis, manufacturing, law, agriculture, quality control, health care, facial recognition, and military applications. Aromatic medicinal plants are botanical raw materials used in cosmetics, medicines, health foods, essential oils, decoration, cleaning, and other natural health products for therapeutic and Aromatic culinary purposes. These plants and their products not only serve as a valuable source of income for farmers and entrepreneurs but also going to export for valuable foreign currency exchange. In Ethiopia, there is a lack of technologies for the classification and identification of Aromatic medicinal plant parts and disease type cured by aromatic medicinal plants. Farmers, industry personnel, academicians, and pharmacists find it difficult to identify plant parts and disease types cured by plants before ingredient extraction in the laboratory. Manual plant identification is a time-consuming, labor-intensive, and lengthy process. To alleviate these challenges, few studies have been conducted in the area to address these issues. One way to overcome these problems is to develop a deep learning model for efficient identification of Aromatic medicinal plant parts with their corresponding disease type. The objective of the proposed study is to identify the aromatic medicinal plant parts and their disease type classification using computer vision technology. Therefore, this research initiated a model for the classification of aromatic medicinal plant parts and their disease type by exploring computer vision technology. Morphological characteristics are still the most important tools for the identification of plants. Leaves are the most widely used parts of plants besides roots, flowers, fruits, and latex. For this study, the researcher used RGB leaf images with a size of 128x128 x3. In this study, the researchers trained five cutting-edge models: convolutional neural network, Inception V3, Residual Neural Network, Mobile Network, and Visual Geometry Group. Those models were chosen after a comprehensive review of the best-performing models. The 80/20 percentage split is used to evaluate the model, and classification metrics are used to compare models. The pre-trained Inception V3 model outperforms well, with training and validation accuracy of 99.8% and 98.7%, respectively.

Keywords: aromatic medicinal plant, computer vision, convolutional neural network, deep learning, plant classification, residual neural network

Procedia PDF Downloads 111
2954 Online Pose Estimation and Tracking Approach with Siamese Region Proposal Network

Authors: Cheng Fang, Lingwei Quan, Cunyue Lu

Abstract:

Human pose estimation and tracking are to accurately identify and locate the positions of human joints in the video. It is a computer vision task which is of great significance for human motion recognition, behavior understanding and scene analysis. There has been remarkable progress on human pose estimation in recent years. However, more researches are needed for human pose tracking especially for online tracking. In this paper, a framework, called PoseSRPN, is proposed for online single-person pose estimation and tracking. We use Siamese network attaching a pose estimation branch to incorporate Single-person Pose Tracking (SPT) and Visual Object Tracking (VOT) into one framework. The pose estimation branch has a simple network structure that replaces the complex upsampling and convolution network structure with deconvolution. By augmenting the loss of fully convolutional Siamese network with the pose estimation task, pose estimation and tracking can be trained in one stage. Once trained, PoseSRPN only relies on a single bounding box initialization and producing human joints location. The experimental results show that while maintaining the good accuracy of pose estimation on COCO and PoseTrack datasets, the proposed method achieves a speed of 59 frame/s, which is superior to other pose tracking frameworks.

Keywords: computer vision, pose estimation, pose tracking, Siamese network

Procedia PDF Downloads 114
2953 Framework for Socio-Technical Issues in Requirements Engineering for Developing Resilient Machine Vision Systems Using Levels of Automation through the Lifecycle

Authors: Ryan Messina, Mehedi Hasan

Abstract:

This research is to examine the impacts of using data to generate performance requirements for automation in visual inspections using machine vision. These situations are intended for design and how projects can smooth the transfer of tacit knowledge to using an algorithm. We have proposed a framework when specifying machine vision systems. This framework utilizes varying levels of automation as contingency planning to reduce data processing complexity. Using data assists in extracting tacit knowledge from those who can perform the manual tasks to assist design the system; this means that real data from the system is always referenced and minimizes errors between participating parties. We propose using three indicators to know if the project has a high risk of failing to meet requirements related to accuracy and reliability. All systems tested achieved a better integration into operations after applying the framework.

Keywords: automation, contingency planning, continuous engineering, control theory, machine vision, system requirements, system thinking

Procedia PDF Downloads 157
2952 Telecontrolled Service Robots for Increasing the Quality of Life of Elderly and Disabled

Authors: Nayden Chivarov, Denis Chikurtev, Kaloyan Yovchev, Nedko Shivarov

Abstract:

This paper represents methods for improving the efficiency and precision of service mobile robot. This robot is used for increasing the quality of life of elderly and disabled people. The key concept of the proposed Intelligent Service Mobile Robot is its easier adaptability to achieve services for a wide range of Elderly or Disabled Person’s needs, by performing different tasks for supporting Elderly or Disabled Persons care. We developed robot autonomous navigation and computer vision systems in order to recognize different objects and bring them to the people. Web based user interface is developed to provide easy access and tele-control of the robot by any device through the internet. In this study algorithms for object recognition and localization are proposed for providing successful object recognition and accuracy in the positioning. Different methods for sending movement commands to the mobile robot system are proposed and evaluated. After executing some experiments to show the results of the research, we can summarize that these systems and algorithms provide good control of the service mobile robot and it will be more useful to help the elderly and disabled persons.

Keywords: service robot, mobile robot, autonomous navigation, computer vision, web user interface, ROS

Procedia PDF Downloads 299
2951 Saudi and U.S. Newspaper Coverage of Saudi Vision 2030 Concerning Women in Online Newspapers

Authors: Ziyad Alghamdi

Abstract:

This research investigates how issues concerning Saudi women have been represented in selected U.S. and Saudi publications. Saudi Vision 2030 is the Kingdom of Saudi Arabia's development strategy, which was revealed on April 25, 2016. This study used 115 news items across selected newspapers as its sampling. The New York Times and the Washington Post were chosen to represent U.S. newspapers and picked two Saudi newspapers, Al Jazirah, and Al Watan. This research examines how these issues were covered before and during the implementation of Saudi Vision 2030. The news pieces were analyzed using both quantitative and qualitative methodologies. The qualitative study employed an inductive technique to uncover frames. Furthermore, this work looked at how American and Saudi publications had framed Saudi women depicted in images by reviewing the photographs used in news reports about Saudi women's issues. The primary conclusion implies that the human-interest frame was more prevalent in American media, whereas the economic frame was more prevalent in Saudi publications. A variety of diverse topics were considered.

Keywords: Saudi newspapers, Saudi Vision 2030, framing theory, Saudi women

Procedia PDF Downloads 47
2950 Vision-Based Collision Avoidance for Unmanned Aerial Vehicles by Recurrent Neural Networks

Authors: Yao-Hong Tsai

Abstract:

Due to the sensor technology, video surveillance has become the main way for security control in every big city in the world. Surveillance is usually used by governments for intelligence gathering, the prevention of crime, the protection of a process, person, group or object, or the investigation of crime. Many surveillance systems based on computer vision technology have been developed in recent years. Moving target tracking is the most common task for Unmanned Aerial Vehicle (UAV) to find and track objects of interest in mobile aerial surveillance for civilian applications. The paper is focused on vision-based collision avoidance for UAVs by recurrent neural networks. First, images from cameras on UAV were fused based on deep convolutional neural network. Then, a recurrent neural network was constructed to obtain high-level image features for object tracking and extracting low-level image features for noise reducing. The system distributed the calculation of the whole system to local and cloud platform to efficiently perform object detection, tracking and collision avoidance based on multiple UAVs. The experiments on several challenging datasets showed that the proposed algorithm outperforms the state-of-the-art methods.

Keywords: unmanned aerial vehicle, object tracking, deep learning, collision avoidance

Procedia PDF Downloads 114
2949 The Meaningful Pixel and Texture: Exploring Digital Vision and Art Practice Based on Chinese Cosmotechnics

Authors: Xingdu Wang, Charlie Gere, Emma Rose, Yuxuan Zhao

Abstract:

The study introduces a fresh perspective on the digital realm through an examination of the Chinese concept of Xiang, elucidating how it can build an understanding of pixels and textures on screens as digital trigrams. This concept attempts to offer an outlook on the intersection of digital technology and the natural world, thereby contributing to discussions about the harmonious relationship between humans and technology. The study looks for the ancient Chinese theory of Xiang as a key to establishing the theories and practices to respond to the problem of Contemporary Chinese technics. Xiang is a Chinese method of understanding the essentials of things through appearances, which differs from the method of science in the Westen. Xiang, the basement of Chinese visual art, is rooted in ancient Chinese philosophy and connected to the eight trigrams. The discussion of Xiang connects art, philosophy, and technology. This paper connects the meaning of Xiang with the 'truth appearing' philosophically through the analysis of the concepts of phenomenon and noumenon and the unique Chinese way of observing. Hereafter, the historical interconnection between ancient painting and writing in China emphasizes their relationship between technical craftsmanship and artistic expression. In digital, the paper blurs the traditional boundaries between images and text on digital screens in theory. Lastly, this study identified an ensemble concept relating to pixels and textures in computer vision, drawing inspiration from AI image recognition in Chinese paintings. In art practice, by presenting a fluid visual experience in the form of pixels, which mimics the flow of lines in traditional calligraphy and painting, it is hoped that the viewer will be brought back to the process of the truth appearing as defined by the 'Xiang’.

Keywords: Chinese cosmotechnics, computer vision, contemporary Neo-Confucianism, texture and pixel, Xiang

Procedia PDF Downloads 25
2948 RV-YOLOX: Object Detection on Inland Waterways Based on Optimized YOLOX Through Fusion of Vision and 3+1D Millimeter Wave Radar

Authors: Zixian Zhang, Shanliang Yao, Zile Huang, Zhaodong Wu, Xiaohui Zhu, Yong Yue, Jieming Ma

Abstract:

Unmanned Surface Vehicles (USVs) are valuable due to their ability to perform dangerous and time-consuming tasks on the water. Object detection tasks are significant in these applications. However, inherent challenges, such as the complex distribution of obstacles, reflections from shore structures, water surface fog, etc., hinder the performance of object detection of USVs. To address these problems, this paper provides a fusion method for USVs to effectively detect objects in the inland surface environment, utilizing vision sensors and 3+1D Millimeter-wave radar. MMW radar is complementary to vision sensors, providing robust environmental information. The radar 3D point cloud is transferred to 2D radar pseudo image to unify radar and vision information format by utilizing the point transformer. We propose a multi-source object detection network (RV-YOLOX )based on radar-vision fusion for inland waterways environment. The performance is evaluated on our self-recording waterways dataset. Compared with the YOLOX network, our fusion network significantly improves detection accuracy, especially for objects with bad light conditions.

Keywords: inland waterways, YOLO, sensor fusion, self-attention

Procedia PDF Downloads 50
2947 Traffic Analysis and Prediction Using Closed-Circuit Television Systems

Authors: Aragorn Joaquin Pineda Dela Cruz

Abstract:

Road traffic congestion is continually deteriorating in Hong Kong. The largest contributing factor is the increase in vehicle fleet size, resulting in higher competition over the utilisation of road space. This study proposes a project that can process closed-circuit television images and videos to provide real-time traffic detection and prediction capabilities. Specifically, a deep-learning model involving computer vision techniques for video and image-based vehicle counting, then a separate model to detect and predict traffic congestion levels based on said data. State-of-the-art object detection models such as You Only Look Once and Faster Region-based Convolutional Neural Networks are tested and compared on closed-circuit television data from various major roads in Hong Kong. It is then used for training in long short-term memory networks to be able to predict traffic conditions in the near future, in an effort to provide more precise and quicker overviews of current and future traffic conditions relative to current solutions such as navigation apps.

Keywords: intelligent transportation system, vehicle detection, traffic analysis, deep learning, machine learning, computer vision, traffic prediction

Procedia PDF Downloads 62
2946 Vision Aided INS for Soft Landing

Authors: R. Sri Karthi Krishna, A. Saravana Kumar, Kesava Brahmaji, V. S. Vinoj

Abstract:

The lunar surface may contain rough and non-uniform terrain with dips and peaks. Soft-landing is a method of landing the lander on the lunar surface without any damage to the vehicle. This project focuses on finding a safe landing site for the vehicle by developing a method for the lateral velocity determination of the lunar lander. This is done by processing the real time images obtained by means of an on-board vision sensor. The hazard avoidance phase of the soft-landing starts when the vehicle is about 200 m above the lunar surface. Here, the lander has a very low velocity of about 10 cm/s:vertical and 5 m/s:horizontal. On the detection of a hazard the lander is navigated by controlling the vertical and lateral velocity. In order to find an appropriate landing site and to accordingly navigate, the lander image processing is performed continuously. The images are taken continuously until the landing site is determined, and the lander safely lands on the lunar surface. By integrating this vision-based navigation with the INS a better accuracy for the soft-landing of the lunar lander can be obtained.

Keywords: vision aided INS, image processing, lateral velocity estimation, materials engineering

Procedia PDF Downloads 425
2945 Shared Vision System Support for Maintenance Tasks of Wind Turbines

Authors: Buket Celik Ünal, Onur Ünal

Abstract:

Communication is the most challenging part of maintenance operations. Communication between expert and fieldworker is crucial for effective maintenance and this also affects the safety of the fieldworkers. To support a machine user in a remote collaborative physical task, both, a mobile and a stationary device are needed. Such a system is called a shared vision system and the system supports two people to solve a problem from different places. This system reduces the errors and provides a reliable support for qualified and less qualified users. Through this research, it was aimed to validate the effectiveness of using a shared vision system to facilitate communication between on-site workers and those issuing instructions regarding maintenance or inspection works over long distances. The system is designed with head-worn display which is called a shared vision system. As a part of this study, a substitute system is used and implemented by using a shared vision system for maintenance operation. The benefits of the use of a shared vision system are analyzed and results are adapted to the wind turbines to improve the occupational safety and health for maintenance technicians. The motivation for the research effort in this study can be summarized in the following research questions: -How can expert support technician over long distances during maintenance operation? -What are the advantages of using a shared vision system? Experience from the experiment shows that using a shared vision system is an advantage for both electrical and mechanical system failures. Results support that the shared vision system can be used for wind turbine maintenance and repair tasks. Because wind turbine generator/gearbox and the substitute system have similar failures. Electrical failures, such as voltage irregularities, wiring failures and mechanical failures, such as alignment, vibration, over-speed conditions are the common and similar failures for both. Furthermore, it was analyzed the effectiveness of the shared vision system by using a smart glasses in connection with the maintenance task performed by a substitute system under four different circumstances, namely by using a shared vision system, an audio communication, a smartphone and by yourself condition. A suitable method for determining dependencies between factors measured in Chi Square Test, and Chi Square Test for Independence measured for determining a relationship between two qualitative variables and finally Mann Whitney U Test is used to compare any two data sets. While based on this experiment, no relation was found between the results and the gender. Participants` responses confirmed that the shared vision system is efficient and helpful for maintenance operations. From the results of the research, there was a statistically significant difference in the average time taken by subjects on works using a shared vision system under the other conditions. Additionally, this study confirmed that a shared vision system provides reduction in time to diagnose and resolve maintenance issues, reduction in diagnosis errors, reduced travel costs for experts, and increased reliability in service.

Keywords: communication support, maintenance and inspection tasks, occupational health and safety, shared vision system

Procedia PDF Downloads 226
2944 Automatic Detection of Sugarcane Diseases: A Computer Vision-Based Approach

Authors: Himanshu Sharma, Karthik Kumar, Harish Kumar

Abstract:

The major problem in crop cultivation is the occurrence of multiple crop diseases. During the growth stage, timely identification of crop diseases is paramount to ensure the high yield of crops, lower production costs, and minimize pesticide usage. In most cases, crop diseases produce observable characteristics and symptoms. The Surveyors usually diagnose crop diseases when they walk through the fields. However, surveyor inspections tend to be biased and error-prone due to the nature of the monotonous task and the subjectivity of individuals. In addition, visual inspection of each leaf or plant is costly, time-consuming, and labour-intensive. Furthermore, the plant pathologists and experts who can often identify the disease within the plant according to their symptoms in early stages are not readily available in remote regions. Therefore, this study specifically addressed early detection of leaf scald, red rot, and eyespot types of diseases within sugarcane plants. The study proposes a computer vision-based approach using a convolutional neural network (CNN) for automatic identification of crop diseases. To facilitate this, firstly, images of sugarcane diseases were taken from google without modifying the scene, background, or controlling the illumination to build the training dataset. Then, the testing dataset was developed based on the real-time collected images from the sugarcane field from India. Then, the image dataset is pre-processed for feature extraction and selection. Finally, the CNN-based Visual Geometry Group (VGG) model was deployed on the training and testing dataset to classify the images into diseased and healthy sugarcane plants and measure the model's performance using various parameters, i.e., accuracy, sensitivity, specificity, and F1-score. The promising result of the proposed model lays the groundwork for the automatic early detection of sugarcane disease. The proposed research directly sustains an increase in crop yield.

Keywords: automatic classification, computer vision, convolutional neural network, image processing, sugarcane disease, visual geometry group

Procedia PDF Downloads 78