Search results for: Yolo
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24

24 Deep Learning Based Road Crack Detection on an Embedded Platform

Authors: Nurhak Altın, Ayhan Kucukmanisa, Oguzhan Urhan

Abstract:

It is important for traffic safety that highways are in good condition. Road defects (cracks, erosion of lane markings, etc.) can cause accidents by affecting driving. Image processing based methods for detecting road cracks are available in the literature. In this paper, a deep learning based road crack detection approach is proposed. YOLO (You Only Look Once) is adopted as the core component of the approach. The YOLO network structure, which was developed for object detection, is trained with road crack images as a new class not previously used in YOLO. The performance of the proposed method is compared across two training methods: training from randomly generated weights and training from pre-trained weights (transfer learning). A similar training approach is applied to the simplified version of the YOLO network model (Tiny YOLO), and the resulting performance is examined. The developed system is able to process 8 fps on the NVIDIA Jetson TX1 development kit.

Keywords: deep learning, embedded platform, real-time processing, road crack detection

Procedia PDF Downloads 303
23 A Comparison of YOLO Family for Apple Detection and Counting in Orchards

Authors: Yuanqing Li, Changyi Lei, Zhaopeng Xue, Zhuo Zheng, Yanbo Long

Abstract:

In agricultural production and breeding, implementing automatic picking robots in orchard farming to reduce human labour and error is challenging. Their core function is automatic identification based on machine vision. This paper focuses on apple detection and counting in orchards and implements several deep learning methods. Extensive datasets are used, and a semi-automatic annotation method is proposed. The deep learning models considered belong to the state-of-the-art YOLO family. Since the models differ in their backbones, a detailed multi-dimensional comparison is made in terms of counting accuracy, mAP, and model memory, laying the foundation for realising automatic precision agriculture.

Keywords: agricultural object detection, deep learning, machine vision, YOLO family

Procedia PDF Downloads 152
22 ANAC-id - Facial Recognition to Detect Fraud

Authors: Giovanna Borges Bottino, Luis Felipe Freitas do Nascimento Alves Teixeira

Abstract:

This article presents ANAC-id, a case study of the National Civil Aviation Agency (ANAC) in Brazil. ANAC-id is an artificial intelligence algorithm developed for image analysis that recognizes standard images of unobstructed, upright faces without sunglasses, making it possible to identify potential inconsistencies. It combines the YOLO architecture with three Python libraries (face recognition, face comparison, and deep face), providing robust analysis with a high level of accuracy.

Keywords: artificial intelligence, deepface, face compare, face recognition, YOLO, computer vision

Procedia PDF Downloads 113
21 Automating 2D CAD to 3D Model Generation Process: Wall pop-ups

Authors: Mohit Gupta, Chialing Wei, Thomas Czerniawski

Abstract:

In this paper, we have built a neural network that can detect walls on 2D sheets and subsequently create a 3D model in Revit using Dynamo. Typically, engineers and designers make concentrated efforts to convert 2D CAD drawings to 3D models, which costs a considerable amount of time and human effort. This paper contributes to automating the task of 3D wall modeling in two ways: (1) detecting walls in 2D CAD drawings and generating 3D pop-ups in Revit, and (2) saving designers the modeling time spent drafting elements like walls from 2D CAD into a 3D representation. The object detection algorithm YOLO is used for wall detection and localization. The neural network is trained on 3500 labeled images of size 256x256x3. Dynamo is then interfaced with the output of the neural network to pop up 3D walls in Revit. The research uses modern technological tools like deep learning and artificial intelligence to automate the process of generating 3D walls without needing humans to model them manually. It thus contributes to saving time, human effort, and money.

Keywords: neural networks, Yolo, 2D to 3D transformation, CAD object detection

Procedia PDF Downloads 102
20 On Enabling Miner Self-Rescue with In-Mine Robots using Real-Time Object Detection with Thermal Images

Authors: Cyrus Addy, Venkata Sriram Siddhardh Nadendla, Kwame Awuah-Offei

Abstract:

Surface robots in modern underground mine rescue operations suffer from several limitations in enabling a prompt self-rescue. Therefore, the possibility of designing and deploying in-mine robots to expedite miner self-rescue can have a transformative impact on miner safety. These in-mine robots for miner self-rescue can be envisioned to carry out diverse tasks such as object detection, autonomous navigation, and payload delivery. Specifically, this paper investigates the challenges in the design of object detection algorithms for in-mine robots using thermal images, especially to detect people in real-time. A total of 125 thermal images were collected in the Missouri S&T Experimental Mine with the help of student volunteers using the FLIR TG 297 infrared camera; these were pre-processed into training and validation datasets with 100 and 25 images, respectively. Three state-of-the-art, pre-trained real-time object detection models, namely YOLOv5, YOLO-FIRI, and YOLOv8, were considered and re-trained using transfer learning techniques on the training dataset. On the validation dataset, the re-trained YOLOv8 outperforms the re-trained versions of both YOLOv5 and YOLO-FIRI.

Keywords: miner self-rescue, object detection, underground mine, YOLO

Procedia PDF Downloads 32
19 YOLO-IR: Infrared Small Object Detection in High Noise Images

Authors: Yufeng Li, Yinan Ma, Jing Wu, Chengnian Long

Abstract:

Infrared object detection aims at separating small and dim targets from cluttered backgrounds, and its capabilities extend beyond the limits of visible light, making it invaluable in a wide range of applications, such as improving safety, security, efficiency, and functionality. However, existing methods are usually sensitive to the noise of the input infrared image, leading to a decrease in target detection accuracy and an increase in the false alarm rate in high-noise environments. To address this issue, an infrared small target detection algorithm called YOLO-IR is proposed in this paper to improve the robustness to high infrared noise. To address the problem that high noise significantly reduces the clarity and reliability of target features in infrared images, we design a soft-threshold coordinate attention mechanism to improve the model’s ability to extract target features and its robustness to noise. Since the noise may overwhelm the local details of the target, resulting in the loss of small target features during depth down-sampling, we propose a deep and shallow feature fusion neck to improve the detection accuracy. In addition, because the generalized Intersection over Union (IoU)-based loss functions may be sensitive to noise and lead to unstable training in high-noise environments, we introduce a Wasserstein-distance based loss function to improve the training of the model. The experimental results show that YOLO-IR achieves a 5.0% improvement in recall and a 6.6% improvement in the F1 score over the existing state-of-the-art model.
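As a rough illustration of why a Wasserstein-distance based measure can behave more smoothly than IoU on small targets, the sketch below compares the two on tiny boxes. The abstract does not give the exact loss, so the formulation here is an assumption: each box is modelled as a 2D Gaussian, for which the 2-Wasserstein distance has a closed form, and the normalization constant `c` is a hypothetical value chosen only for the example.

```python
import math


def iou(a, b):
    """Intersection over Union of two boxes given as (cx, cy, w, h)."""
    ax1, ay1 = a[0] - a[2] / 2, a[1] - a[3] / 2
    ax2, ay2 = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1 = b[0] - b[2] / 2, b[1] - b[3] / 2
    bx2, by2 = b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def wasserstein_similarity(a, b, c=12.0):
    """Similarity in (0, 1] from the 2-Wasserstein distance between boxes.

    Assumption: box (cx, cy, w, h) is modelled as a Gaussian with mean
    (cx, cy) and covariance diag(w^2 / 4, h^2 / 4). The constant c is a
    hypothetical normalizer, not a value taken from the paper.
    """
    d2 = ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
          + ((a[2] - b[2]) / 2) ** 2 + ((a[3] - b[3]) / 2) ** 2)
    return math.exp(-math.sqrt(d2) / c)


# A 4x4 target and two predictions that miss it by different amounts.
target = (10.0, 10.0, 4.0, 4.0)
near_miss = (14.0, 10.0, 4.0, 4.0)
far_miss = (16.0, 10.0, 4.0, 4.0)

# IoU is zero for both misses, so it cannot tell them apart.
print(iou(target, near_miss), iou(target, far_miss))
# The Wasserstein-based similarity still ranks the nearer miss higher.
print(wasserstein_similarity(target, near_miss),
      wasserstein_similarity(target, far_miss))
```

Because the similarity never collapses to zero for non-overlapping boxes, a loss built on it (for example, one minus the similarity) still provides a training signal when a small prediction misses its target entirely.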

Keywords: infrared small target detection, high noise, robustness, soft-threshold coordinate attention, feature fusion

Procedia PDF Downloads 8
18 Weed Classification Using a Two-Dimensional Deep Convolutional Neural Network

Authors: Muhammad Ali Sarwar, Muhammad Farooq, Nayab Hassan, Hammad Hassan

Abstract:

Pakistan is highly recognized for its agriculture and is well known for producing substantial amounts of wheat, cotton, and sugarcane. However, some factors contribute to a decline in crop quality and a reduction in overall output. One of the main factors is the presence of weeds and their late detection. Detection is currently manual and demands a detailed inspection by the farmers themselves. With timely weed detection, however, farmers can reduce costs and increase overall production. The focus of this research is to identify and classify the four main types of weeds (Small-Flowered Cranesbill, Chick Weed, Prickly Acacia, and Black-Grass) that are prevalent in our region’s major crops. In this work, we implemented three different deep learning techniques (YOLO-v5, Inception-v3, and Deep CNN) on the same dataset and concluded that the deep convolutional neural network performed best, with an accuracy of 97.45% for this classification task. Relative to the state of the art, our proposed approach yields 2% better results. We devised the architecture to be efficient enough for real-time use.

Keywords: deep convolution networks, Yolo, machine learning, agriculture

Procedia PDF Downloads 53
17 Gait Biometric for Person Re-Identification

Authors: Lavanya Srinivasan

Abstract:

Biometric identification relies on unique features of a person, such as fingerprints, iris, ear, and voice, which require the subject's permission and physical contact. Gait biometrics identifies the unique gait of a person by extracting moving features. Its main advantage is that it can identify a person's gait at a distance, without any physical contact. In this work, gait biometrics is used for person re-identification. A person walking naturally is compared with the same person walking with a bag, coat, and case, recorded using longwave infrared, shortwave infrared, medium-wave infrared, and visible cameras. The videos are recorded in rural and urban environments. Pre-processing includes human detection using YOLO, background subtraction, silhouette extraction, and synthesis of the Gait Entropy Image by averaging the silhouettes. The moving features are extracted from the Gait Entropy Energy Image. The dimensionality of the extracted features is reduced by principal component analysis, and the features are recognised using different classifiers. The comparative results show that linear discriminant analysis outperforms the other classifiers, with 95.8% for visible imagery in the rural dataset and 94.8% for longwave infrared in the urban dataset.
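The silhouette-averaging step can be sketched as follows. This is a minimal illustration, assuming the commonly used definition in which the per-pixel average of aligned binary silhouettes is treated as a probability and the entropy image is the Shannon entropy of that probability; the abstract does not spell out its exact formulation, and the tiny `frames` array is made-up data.

```python
import math


def average_silhouettes(silhouettes):
    """Average aligned binary silhouettes (lists of rows of 0/1) per pixel."""
    n = len(silhouettes)
    rows, cols = len(silhouettes[0]), len(silhouettes[0][0])
    return [[sum(s[r][c] for s in silhouettes) / n for c in range(cols)]
            for r in range(rows)]


def binary_entropy(p):
    """Shannon entropy (in bits) of a pixel that is foreground with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def gait_entropy_image(silhouettes):
    """Per-pixel entropy: high where the silhouette changes over the gait cycle."""
    averaged = average_silhouettes(silhouettes)
    return [[binary_entropy(p) for p in row] for row in averaged]


# Two hypothetical 2x3 silhouette frames; only the bottom-right pixel moves.
frames = [
    [[0, 1, 0], [0, 1, 1]],
    [[0, 1, 0], [0, 1, 0]],
]
print(average_silhouettes(frames))
print(gait_entropy_image(frames))
```

Static body regions and background average to 0 or 1 and receive zero entropy, while pixels that switch between frames (typically limbs) receive high entropy, which is why such an image emphasises the moving features.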

Keywords: biometric, gait, silhouettes, YOLO

Procedia PDF Downloads 135
16 A U-Net Based Architecture for Fast and Accurate Diagram Extraction

Authors: Revoti Prasad Bora, Saurabh Yadav, Nikita Katyal

Abstract:

In the context of educational data mining, extracting information from images containing both text and diagrams is of high importance. Document analysis therefore requires extracting the diagrams from such images so that the text and diagrams can be processed separately. To the best of the authors’ knowledge, none of the many approaches for extracting tables, figures, etc., meets the need for real-time processing with high accuracy that multiple applications require. In the education domain, diagrams can have varied characteristics, viz. line-based, i.e., geometric diagrams, chemical bonds, mathematical formulas, etc. There are two broad categories of approaches that try to solve similar problems, viz. traditional computer vision based approaches and deep learning approaches. The traditional computer vision based approaches mainly leverage connected components and distance transform based processing and hence perform well only in very limited scenarios. The existing deep learning approaches leverage either YOLO or Faster R-CNN architectures and suffer from a performance-accuracy tradeoff. This paper proposes a U-Net based architecture that formulates diagram extraction as a segmentation problem. The proposed method provides similar accuracy with a much faster extraction time than the state-of-the-art approaches mentioned above. Further, the segmentation mask in this approach allows the extraction of diagrams of irregular shapes.

Keywords: computer vision, deep-learning, educational data mining, faster-RCNN, figure extraction, image segmentation, real-time document analysis, text extraction, U-Net, YOLO

Procedia PDF Downloads 85
15 Audio-Visual Co-Data Processing Pipeline

Authors: Rita Chattopadhyay, Vivek Anand Thoutam

Abstract:

Speech is the most widely accepted means of communication, allowing us to quickly exchange feelings and thoughts. Quite often, people can communicate orally but cannot interact or work with computers or devices. It is easier and quicker to give speech commands than to type commands to computers. In the same way, it is easier to listen to audio played from a device than to extract output from computers or devices. Especially with robotics being an emerging market with applications in warehouses, the hospitality industry, consumer electronics, assistive technology, etc., speech-based human-machine interaction is emerging as a lucrative feature for robot manufacturers. Considering this factor, the objective of this paper is to design the “Audio-Visual Co-Data Processing Pipeline.” This pipeline integrates automatic speech recognition, a natural language model for text understanding, object detection, and text-to-speech modules. There are many deep learning models for each type of module mentioned above, but OpenVINO Model Zoo models are used because the OpenVINO toolkit covers both computer vision and non-computer vision workloads across Intel hardware, maximizes performance, and accelerates application development. A speech command is given as input; it carries information about the target objects to be detected and the start and end times used to extract the required interval from the video. Speech is converted to text using the QuartzNet automatic speech recognition model. A summary is extracted from the text using the natural language model Generative Pre-Trained Transformer-3 (GPT-3). Based on the summary, essential frames are extracted from the video, and the You Only Look Once (YOLO) object detection model detects objects on these extracted frames. Frame numbers that contain target objects (the objects specified in the speech command) are saved as text. Finally, this text (the frame numbers) is converted to speech using a text-to-speech model and played from the device. This project is developed for the 80 YOLO labels, and the user can extract frames based on only one or two target labels. The pipeline can easily be extended to more than two target labels by making appropriate changes in the object detection module. The project supports four different speech command formats, obtained by including sample examples in the prompt used by the GPT-3 model. Based on user preference, a new speech command format can be added by including some examples of that format in the prompt. This pipeline can be used in many projects, such as human-machine interfaces, human-robot interaction, and surveillance through speech commands. All object detection projects can be upgraded using this pipeline so that speech commands can be given and the output is played from the device.

Keywords: OpenVINO, automatic speech recognition, natural language processing, object detection, text to speech

Procedia PDF Downloads 38
14 Emotion Recognition in Video and Images in the Wild

Authors: Faizan Tariq, Moayid Ali Zaidi

Abstract:

Facial emotion recognition algorithms are expanding rapidly nowadays. People are using different algorithms in different combinations to obtain the best results. Six basic emotions are studied in this area. We tried to recognize facial expressions using object detection algorithms instead of traditional algorithms. Two object detection algorithms were chosen: Faster R-CNN and YOLO. For pre-processing, we used image rotation and batch normalization. The dataset chosen for the experiments is Static Facial Expressions in the Wild (SFEW). Our approach worked well, but there is still a lot of room for improvement, which will be a future direction.

Keywords: face recognition, emotion recognition, deep learning, CNN

Procedia PDF Downloads 146
13 Terraria AI: YOLO Interface for Decision-Making Algorithms

Authors: Emmanuel Barrantes Chaves, Ernesto Rivera Alvarado

Abstract:

This paper presents a method that enables agents for the Terraria game to evaluate algorithms commonly used in general video game artificial intelligence competitions. The ‘You Only Look Once’ model is used in the first layer of the process to obtain information from the screen, which is translated into the Video Game Description Language (VGDL); the agents take that description as input to make decisions. Two state-of-the-art algorithms, Monte Carlo Tree Search and the Rolling Horizon Evolutionary Algorithm, were tested and compared; in this case, the Rolling Horizon Evolutionary Algorithm shows better performance. The main advantage of this approach is that a VGDL description does not need to be written beforehand: it is built on the fly, which opens the road for using more games as a framework for AI.

Keywords: AI, MCTS, RHEA, Terraria, VGDL, YOLOv5

Procedia PDF Downloads 46
12 Detecting Characters as Objects Towards Character Recognition on Licence Plates

Authors: Alden Boby, Dane Brown, James Connan

Abstract:

Character recognition is a well-researched topic across disciplines. Regardless, creating a solution that can cater to multiple situations is still challenging. Vehicle licence plates lack an international standard, meaning that different countries and regions have their own licence plate formats. As a result, the variety of typefaces and designs across regions makes it difficult to create a solution that can cater to a wide range of licence plates. The main issue concerning detection is the character recognition stage. This paper aims to create an object detection-based character recognition model trained on a custom dataset that consists of licence plate typefaces from various regions. Given that characters have features that are consistently maintained across an array of fonts, YOLO can be trained to recognise characters based on these features, which may provide better performance than OCR methods such as Tesseract OCR.

Keywords: computer vision, character recognition, licence plate recognition, object detection

Procedia PDF Downloads 76
11 Automated Pothole Detection Using Convolution Neural Networks and 3D Reconstruction Using Stereovision

Authors: Eshta Ranyal, Kamal Jain, Vikrant Ranyal

Abstract:

Potholes are a severe threat to road safety and a major contributing factor towards road distress. In the Indian context, they are a major road hazard. Timely detection of potholes and subsequent repair can prevent the roads from deteriorating. To assist roadway authorities in the timely detection and repair of potholes, we propose a pothole detection methodology using convolutional neural networks. The YOLOv3 model is used as it is fast and accurate in comparison to other state-of-the-art models. You Only Look Once v3 (YOLOv3) is a state-of-the-art, real-time object detection system that features multi-scale detection. A mean average precision (mAP) of 73% was obtained on a training dataset of 200 images. The dataset was then increased to 500 images, resulting in an increase in mAP. We further calculated the depth of the potholes using stereoscopic vision by reconstructing the potholes in 3D. This enables calculation of pothole volume and extent, which can then be used to classify pothole severity as low, moderate, or high.

Keywords: CNN, pothole detection, pothole severity, YOLO, stereovision

Procedia PDF Downloads 98
10 An Investigation on Smartphone-Based Machine Vision System for Inspection

Authors: They Shao Peng

Abstract:

A machine vision system for inspection is an automated technology that is normally used to analyze items on the production line for quality control purposes; it is also known as an automated visual inspection (AVI) system. By applying automated visual inspection, the presence of items, defects, contaminants, flaws, and other irregularities in manufactured products can be detected quickly and accurately. However, AVI systems are still inflexible and expensive because each is tailored to a specific task and consumes a lot of set-up time and space. With the rapid development of mobile devices, smartphones can serve as an alternative device for the visual system to solve the existing problems of AVI. The smartphone-based AVI system is still at a nascent stage, which motivated this investigation. This study aims to provide a low-cost AVI system with high efficiency and flexibility. In this project, two object detection models, the You Only Look Once (YOLO) model and the Single Shot MultiBox Detector (SSD) model, are trained, evaluated, and integrated with smartphone and webcam devices. The performance of the smartphone-based AVI is compared with that of the webcam-based AVI in terms of precision and inference time. Additionally, a mobile application is developed that allows users to perform real-time object detection and object detection on stored images.

Keywords: automated visual inspection, deep learning, machine vision, mobile application

Procedia PDF Downloads 80
9 RV-YOLOX: Object Detection on Inland Waterways Based on Optimized YOLOX Through Fusion of Vision and 3+1D Millimeter Wave Radar

Authors: Zixian Zhang, Shanliang Yao, Zile Huang, Zhaodong Wu, Xiaohui Zhu, Yong Yue, Jieming Ma

Abstract:

Unmanned Surface Vehicles (USVs) are valuable due to their ability to perform dangerous and time-consuming tasks on the water. Object detection tasks are significant in these applications. However, inherent challenges, such as the complex distribution of obstacles, reflections from shore structures, water surface fog, etc., hinder the object detection performance of USVs. To address these problems, this paper provides a fusion method for USVs to effectively detect objects in the inland surface environment, utilizing vision sensors and 3+1D millimeter-wave (MMW) radar. MMW radar is complementary to vision sensors, providing robust environmental information. The radar 3D point cloud is converted to a 2D radar pseudo-image using the point transformer to unify the radar and vision information formats. We propose a multi-source object detection network (RV-YOLOX) based on radar-vision fusion for the inland waterways environment. The performance is evaluated on our self-recorded waterways dataset. Compared with the YOLOX network, our fusion network significantly improves detection accuracy, especially for objects in poor lighting conditions.

Keywords: inland waterways, YOLO, sensor fusion, self-attention

Procedia PDF Downloads 50
8 Using Deep Learning Real-Time Object Detection Convolution Neural Networks for Fast Fruit Recognition in the Tree

Authors: K. Bresilla, L. Manfrini, B. Morandi, A. Boini, G. Perulli, L. C. Grappadelli

Abstract:

Image/video processing for fruit in the tree using hard-coded feature extraction algorithms has shown high accuracy in recent years. While accurate, these approaches are computationally intensive, even with high-end hardware, and too slow for real-time systems. This paper details the use of deep convolutional neural networks (CNNs), specifically an algorithm (YOLO - You Only Look Once) with 24+2 convolution layers. Using deep-learning techniques eliminated the need to hard-code specific features for specific fruit shapes, colors, and/or other attributes. This CNN is trained on more than 5000 images of apple and pear fruits on a 960-core GPU (Graphics Processing Unit). The testing set showed an accuracy of 90%. After this, the trained model was transferred to an embedded device (Raspberry Pi gen.3) with a camera for more portability. Based on the correlation between the number of visible or detected fruits in one frame and the real number of fruits on one tree, a model was created to accommodate this error rate. The processing and detection speed of the whole platform was higher than 40 frames per second. This speed is fast enough for any grasping/harvesting robotic arm or other real-time applications.

Keywords: artificial intelligence, computer vision, deep learning, fruit recognition, harvesting robot, precision agriculture

Procedia PDF Downloads 375
7 Crop Classification using Unmanned Aerial Vehicle Images

Authors: Iqra Yaseen

Abstract:

One of the well-known areas of computer science and engineering, image processing in the context of computer vision has been essential to automation. In remote sensing, medical science, and many other fields, it has made it easier to uncover previously undiscovered facts. Grading of diverse items is now possible because of neural network algorithms, categorization, and digital image processing. Its use in the classification of agricultural products, particularly in the grading of seeds or grains and their cultivars, is widely recognized. A grading and sorting system saves time and ensures consistency and uniformity. Global population growth has led to an increase in demand for food staples, biofuel, and other agricultural products. To meet this demand, available resources must be used and managed more effectively. Image processing is growing rapidly in the field of agriculture. Many applications have been developed using this approach for crop identification and classification, land and disease detection, and measuring other crop parameters. Vegetation localization is the basis for performing these tasks, as it helps to identify the area where the crop is present. The productivity of the agriculture industry can be increased via image processing based on Unmanned Aerial Vehicle (UAV) photography and satellite imagery. In this paper, we apply machine learning techniques such as convolutional neural networks, deep learning, image processing, classification, and You Only Look Once (YOLO) to a UAV imaging dataset to divide the crop into distinct groups and choose the best way to use it.

Keywords: image processing, UAV, YOLO, CNN, deep learning, classification

Procedia PDF Downloads 55
6 Video Object Segmentation for Automatic Image Annotation of Ethernet Connectors with Environment Mapping and 3D Projection

Authors: Marrone Silverio Melo Dantas, Pedro Henrique Dreyer, Gabriel Fonseca Reis de Souza, Daniel Bezerra, Ricardo Souza, Silvia Lins, Judith Kelner, Djamel Fawzi Hadj Sadok

Abstract:

The creation of a dataset is time-consuming and often discourages researchers from pursuing their goals. To overcome this problem, we present and discuss two solutions adopted for the automation of this process. Both optimize valuable user time and resources and support video object segmentation, one with object tracking and the other with 3D projection. In our scenario, we acquire images from a moving robotic arm and, for each approach, generate a distinct annotated dataset. We evaluated the precision of the annotations by comparing them with a manually annotated dataset, as well as their efficiency in the context of detection and classification problems. For detection support, we used YOLO and obtained, for the projection dataset, F1-Score, accuracy, and mAP values of 0.846, 0.924, and 0.875, respectively. For the tracking dataset, we achieved an F1-Score of 0.861 and an accuracy of 0.932, whereas mAP reached 0.894. To evaluate the quality of the annotated images used for classification problems, we employed deep learning architectures. We adopted accuracy and F1-Score as metrics for VGG, DenseNet, MobileNet, Inception, and ResNet. The VGG architecture outperformed the others for both the projection and tracking datasets. For the projection dataset, it reached an accuracy and F1-Score of 0.997 and 0.993, respectively. Similarly, for the tracking dataset, it achieved an accuracy of 0.991 and an F1-Score of 0.981.

Keywords: RJ45, automatic annotation, object tracking, 3D projection

Procedia PDF Downloads 126
5 Real-Time Pedestrian Detection Method Based on Improved YOLOv3

Authors: Jingting Luo, Yong Wang, Ying Wang

Abstract:

Pedestrian detection in image or video data is a very important and challenging task in security surveillance. The difficulty of this task is to accurately locate and detect pedestrians of different scales in complex scenes. To solve these problems, a deep neural network (RT-YOLOv3) is proposed to realize real-time pedestrian detection at different scales in security monitoring. RT-YOLOv3 improves on the traditional YOLOv3 algorithm. First, a deep residual network is added to extract vehicle features. Then six convolutional neural networks with different scales are designed and fused with the corresponding scale feature maps in the residual network to form the final feature pyramid used for pedestrian detection. This method can better characterize pedestrians. To further improve the accuracy and generalization ability of the model, a hybrid pedestrian data set training method is used: pedestrian data are extracted from the VOC data set and combined with the INRIA pedestrian data set for training. Experiments show that the proposed RT-YOLOv3 method achieves an mAP (mean average precision) of 93.57% at 46.52 f/s (frames per second). In terms of accuracy, RT-YOLOv3 performs better than Fast R-CNN, Faster R-CNN, YOLO, SSD, YOLOv2, and YOLOv3. This method reduces the missed detection rate and false detection rate, improves the positioning accuracy, and meets the requirements of real-time detection of pedestrian objects.

Keywords: pedestrian detection, feature detection, convolutional neural network, real-time detection, YOLOv3

Procedia PDF Downloads 103
4 Using Machine Learning to Build a Real-Time COVID-19 Mask Safety Monitor

Authors: Yash Jain

Abstract:

The US Centers for Disease Control and Prevention has recommended wearing masks to slow the spread of the COVID-19 virus. This research uses a video feed from a camera to classify in real time whether a person is wearing a mask correctly, wearing a mask incorrectly, or not wearing a mask at all. A mask detection network was trained using two distinct datasets from the open-source website Kaggle. The first dataset used to train the model was titled 'Face Mask Detection', and the second was titled 'Face Mask Dataset (YOLO Format)', which provided the data in YOLO format so that the TinyYoloV3 model could be trained. Based on the data from Kaggle, two machine learning models were implemented and trained: a TinyYoloV3 real-time model and a two-stage neural network classifier. The first stage of the two-stage classifier identified distinct faces within the image, and the second stage classified the state of the mask on each face: worn correctly, worn incorrectly, or no mask at all. The TinyYoloV3 model was used for the live feed as well as for comparison against the two-stage classifier and was trained using the darknet neural network framework. The two-stage classifier attained a mean average precision (mAP) of 80%, while the model trained using TinyYoloV3 real-time detection had a mean average precision (mAP) of 59%. Overall, both models were able to correctly classify the scenarios of no mask, mask, and incorrectly worn mask.

Keywords: datasets, classifier, mask-detection, real-time, TinyYoloV3, two-stage neural network classifier

Procedia PDF Downloads 118
3 A Deep Learning Approach to Detect Complete Safety Equipment for Construction Workers Based on YOLOv7

Authors: Shariful Islam, Sharun Akter Khushbu, S. M. Shaqib, Shahriar Sultan Ramit

Abstract:

In the construction sector, ensuring worker safety is of the utmost significance. In this study, a deep learning-based technique is presented for identifying safety gear worn by construction workers, such as helmets, goggles, jackets, gloves, and footwear. The suggested method precisely locates these safety items by using the YOLO v7 (You Only Look Once) object detection algorithm. The dataset used in this work is a custom dataset of labeled images split into training, testing, and validation sets. Each image has bounding box labels that indicate where the safety equipment is located within the image. The model is trained to identify and categorize the safety equipment based on the labeled dataset through an iterative training approach. Our trained model performed well, with good precision, recall, and F1-score for safety equipment recognition. The model's evaluation also produced encouraging results, with a mAP@0.5 score of 87.7%. The model performs effectively, making it possible to quickly identify safety equipment violations on building sites. A thorough evaluation of the outcomes reveals the model's advantages and points out potential areas for development. By offering an automatic and trustworthy method for safety equipment detection, this research contributes to the fields of computer vision and workplace safety. The proposed deep learning-based approach will increase safety compliance and reduce the risk of accidents in the construction industry.
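
The precision, recall, and F1-score named above are usually computed from counts of matched and unmatched detections, where a detection counts as a match when its intersection over union (IoU) with a ground-truth box reaches a threshold. The sketch below illustrates these standard definitions only; the helper names and the example counts are hypothetical and do not reproduce the paper's evaluation.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Usage: a predicted helmet box is accepted if its IoU with the label is >= 0.5
# (an assumed, commonly used threshold).
is_match = iou((10, 10, 50, 50), (12, 12, 52, 52)) >= 0.5
print(is_match, precision_recall_f1(tp=8, fp=2, fn=2))
```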

Keywords: deep learning, safety equipment detection, YOLOv7, computer vision, workplace safety

Procedia PDF Downloads 24
2 Assessment of Seeding and Weeding Field Robot Performance

Authors: Victor Bloch, Eerikki Kaila, Reetta Palva

Abstract:

Field robots are an important tool for enhancing efficiency and decreasing the climatic impact of food production. A number of commercial field robots exist; however, since this technology is still new, their advantages and limitations, as well as methods for their optimal use, are still unclear. In this study, the performance of a commercial field robot for seeding and weeding was assessed. A 2-ha research sugar beet field with 0.5 m row width was used for testing, which included robotic sowing of sugar beet and weeding five times during the first two months of growth. About three and five percent of the field were used as untreated and chemically weeded control areas, respectively. The robot's plant detection was based on the exact plant location without image processing. The robot was equipped with six seeding and weeding tools, including passive between-rows harrow hoes and active hoes cutting inside rows between the plants, and it moved with a maximal speed of 0.9 km/h. The robot's performance was assessed by image processing. The field images were collected by an action camera with a resolution of 27 megapixels installed on the robot at a height of 2 m and by a drone with a 16-megapixel camera flying at a height of 4 m. To detect plants and weeds, the YOLO model was trained with transfer learning from two available datasets. A preliminary analysis of the entire field showed that in the areas treated by the robot, the average weed density varied across the field from 6.8 to 9.1 weeds/m² (compared with 0.8 in the chemically treated area and 24.3 in the untreated area), the average weed density inside rows was 2.0-2.9 weeds/m (compared with 0 in the chemically treated area), and the emergence rate was 90-95%. Information about the robot's performance is highly important for the application of robotics to field tasks.
With the help of the developed method, the performance can be assessed several times during growth, according to the robotic weeding frequency. Farmers using the method can learn the field condition and the efficiency of the robotic treatment across the whole field. Farmers and researchers could develop optimal strategies for using the robot, such as seeding and weeding timing, robot settings, and plant and field parameters and geometry. Robot producers can obtain quantitative information from an actual working environment and improve the robots accordingly.
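
The weed densities reported above are counts per unit of ground area. As a rough sketch of that arithmetic, assuming each image's ground footprint in square meters is known, the density for one image and the pooled density over several images could be computed as follows; the function names and the example numbers are hypothetical and are not the study's data.

```python
def weed_density(weed_count, ground_area_m2):
    """Weeds per square meter for one image with a known ground footprint."""
    if ground_area_m2 <= 0:
        raise ValueError("ground area must be positive")
    return weed_count / ground_area_m2


def pooled_weed_density(counts_and_areas):
    """Pooled density over (weed_count, ground_area_m2) pairs.

    Total weeds divided by total area, so larger images weigh more.
    """
    total_count = sum(count for count, _ in counts_and_areas)
    total_area = sum(area for _, area in counts_and_areas)
    return weed_density(total_count, total_area)


# Usage with made-up detection counts and footprints.
print(weed_density(34, 5.0))
print(pooled_weed_density([(34, 5.0), (40, 5.0)]))
```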

Keywords: agricultural robot, field robot, plant detection, robot performance

Procedia PDF Downloads 25
1 AI-Based Information System for Hygiene and Safety Management of Shared Kitchens

Authors: Jongtae Rhee, Sangkwon Han, Seungbin Ji, Junhyeong Park, Byeonghun Kim, Taekyung Kim, Byeonghyeon Jeon, Jiwoo Yang

Abstract:

The shared kitchen is a concept that transfers the value of the sharing economy to the kitchen. It is a type of kitchen equipped with cooking facilities that allows multiple companies or chefs to share time and space and use it jointly. These shared kitchens provide economic benefits and convenience, such as reduced investment costs and rent, but also increase safety management risks, such as cross-contamination of food ingredients. Therefore, to manage the safety of food ingredients and finished products in a shared kitchen where several entities jointly use the kitchen and handle various types of food ingredients, it is critical to manage the following: the freshness of food ingredients, user hygiene and safety, and cross-contamination of cooking equipment and facilities. In this study, we propose a machine learning-based system for hygiene safety and cross-contamination management, which are highly difficult to manage. User clothing management and user access management, which are most relevant to the hygiene and safety of shared kitchens, are solved through machine learning-based methodology, and cutting board usage management, which is most relevant to cross-contamination management, is implemented as an integrated safety management system based on artificial intelligence. First, to prevent cross-contamination of food ingredients, we use images collected through a real-time camera to determine whether the food ingredients match a given cutting board based on a real-time object detection model, YOLO v7. To manage the hygiene of user clothing, we use a camera-based facial recognition model to recognize the user and a real-time object detection model to determine whether a sanitary hat and mask are worn. In addition, to manage access for users qualified to enter the shared kitchen, we utilize a machine learning-based signature recognition module.
Access permission is determined by a pre-trained signature verification model that compares the pairwise distance between the contract signature and the signature given at the time of entrance to the shared kitchen. These machine learning-based safety management tasks are integrated into a single information system, and each result is managed in an integrated database. Through this, users are warned of safety hazards through the tablet PC installed in the shared kitchen, and managers can trace the causes of sanitary and safety accidents. As a result of system integration analysis, real-time safety management services can be continuously provided by artificial intelligence, and machine learning-based methodologies are used for integrated safety management of shared kitchens that allow dynamic contracts among various users. By solving this problem, we were able to secure the feasibility and safety of the shared kitchen business.
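
The distance-based access decision described above can be illustrated with a small sketch. It assumes that a signature verification model has already mapped each signature image to a fixed-length feature vector; the Euclidean metric, the function names, and the threshold value are illustrative assumptions, not details given in the abstract.

```python
import math


def pairwise_distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def grant_access(contract_features, entry_features, threshold=0.5):
    """Grant access when the two signatures are close enough.

    The threshold is a hypothetical value; in practice it would be tuned
    on validation pairs of genuine and forged signatures.
    """
    return pairwise_distance(contract_features, entry_features) <= threshold


# Usage with made-up feature vectors.
print(grant_access([0.10, 0.20, 0.30], [0.10, 0.25, 0.30]))  # small distance
print(grant_access([0.0, 0.0], [3.0, 4.0]))                  # distance of 5.0
```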

Keywords: artificial intelligence, food safety, information system, safety management, shared kitchen

Procedia PDF Downloads 25