Search results for: convolutional%20architectures
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 368

Search results for: convolutional%20architectures

218 Hyper Parameter Optimization of Deep Convolutional Neural Networks for Pavement Distress Classification

Authors: Oumaima Khlifati, Khadija Baba

Abstract:

Pavement distress is the main factor responsible for the deterioration of road structure durability, damage vehicles, and driver comfort. Transportation agencies spend a high proportion of their funds on pavement monitoring and maintenance. The auscultation of pavement distress was based on the manual survey, which was extremely time consuming, labor intensive, and required domain expertise. Therefore, the automatic distress detection is needed to reduce the cost of manual inspection and avoid more serious damage by implementing the appropriate remediation actions at the right time. Inspired by recent deep learning applications, this paper proposes an algorithm for automatic road distress detection and classification using on the Deep Convolutional Neural Network (DCNN). In this study, the types of pavement distress are classified as transverse or longitudinal cracking, alligator, pothole, and intact pavement. The dataset used in this work is composed of public asphalt pavement images. In order to learn the structure of the different type of distress, the DCNN models are trained and tested as a multi-label classification task. In addition, to get the highest accuracy for our model, we adjust the structural optimization hyper parameters such as the number of convolutions and max pooling, filers, size of filters, loss functions, activation functions, and optimizer and fine-tuning hyper parameters that conclude batch size and learning rate. The optimization of the model is executed by checking all feasible combinations and selecting the best performing one. The model, after being optimized, performance metrics is calculated, which describe the training and validation accuracies, precision, recall, and F1 score.

Keywords: distress pavement, hyperparameters, automatic classification, deep learning

Procedia PDF Downloads 57
217 Domain-Specific Deep Neural Network Model for Classification of Abnormalities on Chest Radiographs

Authors: Nkechinyere Joy Olawuyi, Babajide Samuel Afolabi, Bola Ibitoye

Abstract:

This study collected a preprocessed dataset of chest radiographs and formulated a deep neural network model for detecting abnormalities. It also evaluated the performance of the formulated model and implemented a prototype of the formulated model. This was with the view to developing a deep neural network model to automatically classify abnormalities in chest radiographs. In order to achieve the overall purpose of this research, a large set of chest x-ray images were sourced for and collected from the CheXpert dataset, which is an online repository of annotated chest radiographs compiled by the Machine Learning Research Group, Stanford University. The chest radiographs were preprocessed into a format that can be fed into a deep neural network. The preprocessing techniques used were standardization and normalization. The classification problem was formulated as a multi-label binary classification model, which used convolutional neural network architecture to make a decision on whether an abnormality was present or not in the chest radiographs. The classification model was evaluated using specificity, sensitivity, and Area Under Curve (AUC) score as the parameter. A prototype of the classification model was implemented using Keras Open source deep learning framework in Python Programming Language. The AUC ROC curve of the model was able to classify Atelestasis, Support devices, Pleural effusion, Pneumonia, A normal CXR (no finding), Pneumothorax, and Consolidation. However, Lung opacity and Cardiomegaly had a probability of less than 0.5 and thus were classified as absent. Precision, recall, and F1 score values were 0.78; this implies that the number of False Positive and False Negative is the same, revealing some measure of label imbalance in the dataset. The study concluded that the developed model is sufficient to classify abnormalities present in chest radiographs into present or absent.

Keywords: transfer learning, convolutional neural network, radiograph, classification, multi-label

Procedia PDF Downloads 84
216 A Multi-Output Network with U-Net Enhanced Class Activation Map and Robust Classification Performance for Medical Imaging Analysis

Authors: Jaiden Xuan Schraut, Leon Liu, Yiqiao Yin

Abstract:

Computer vision in medical diagnosis has achieved a high level of success in diagnosing diseases with high accuracy. However, conventional classifiers that produce an image to-label result provides insufficient information for medical professionals to judge and raise concerns over the trust and reliability of a model with results that cannot be explained. In order to gain local insight into cancerous regions, separate tasks such as imaging segmentation need to be implemented to aid the doctors in treating patients, which doubles the training time and costs which renders the diagnosis system inefficient and difficult to be accepted by the public. To tackle this issue and drive AI-first medical solutions further, this paper proposes a multi-output network that follows a U-Net architecture for image segmentation output and features an additional convolutional neural networks (CNN) module for auxiliary classification output. Class activation maps are a method of providing insight into a convolutional neural network’s feature maps that leads to its classification but in the case of lung diseases, the region of interest is enhanced by U-net-assisted Class Activation Map (CAM) visualization. Therefore, our proposed model combines image segmentation models and classifiers to crop out only the lung region of a chest X-ray’s class activation map to provide a visualization that improves the explainability and is able to generate classification results simultaneously which builds trust for AI-led diagnosis systems. The proposed U-Net model achieves 97.61% accuracy and a dice coefficient of 0.97 on testing data from the COVID-QU-Ex Dataset which includes both diseased and healthy lungs.

Keywords: multi-output network model, U-net, class activation map, image classification, medical imaging analysis

Procedia PDF Downloads 161
215 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis

Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan

Abstract:

Tourism is a booming industry with huge future potential for global wealth and employment. There are countless data generated over social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of Big Data Technology into the tourism industry will allow companies to conclude where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centers or hotels, etc., and the tourist can get a clear idea of places before visiting. The technical perspective of natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We have constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The text form of sentences was first classified through Vader and Roberta model to get the polarity of the reviews. In this paper, we have conducted study methods for feature extraction, such as Count Vectorization and TFIDF Vectorization, and implemented Convolutional Neural Network (CNN) classifier algorithm for the sentiment analysis to decide the tourist’s attitude towards the destinations is positive, negative, or simply neutral based on the review text that they posted online. The results demonstrated that from the CNN algorithm, after pre-processing and cleaning the dataset, we received an accuracy of 96.12% for the positive and negative sentiment analysis.

Keywords: counter vectorization, convolutional neural network, crawler, data technology, long short-term memory, web scraping, sentiment analysis

Procedia PDF Downloads 53
214 Quality Assessment of New Zealand Mānuka Honeys Using Hyperspectral Imaging Combined with Deep 1D-Convolutional Neural Networks

Authors: Hien Thi Dieu Truong, Mahmoud Al-Sarayreh, Pullanagari Reddy, Marlon M. Reis, Richard Archer

Abstract:

New Zealand mānuka honey is a honeybee product derived mainly from Leptospermum scoparium nectar. The potent antibacterial activity of mānuka honey derives principally from methylglyoxal (MGO), in addition to the hydrogen peroxide and other lesser activities present in all honey. MGO is formed from dihydroxyacetone (DHA) unique to L. scoparium nectar. Mānuka honey also has an idiosyncratic phenolic profile that is useful as a chemical maker. Authentic mānuka honey is highly valuable, but almost all honey is formed from natural mixtures of nectars harvested by a hive over a time period. Once diluted by other nectars, mānuka honey irrevocably loses value. We aimed to apply hyperspectral imaging to honey frames before bulk extraction to minimise the dilution of genuine mānuka by other honey and ensure authenticity at the source. This technology is non-destructive and suitable for an industrial setting. Chemometrics using linear Partial Least Squares (PLS) and Support Vector Machine (SVM) showed limited efficacy in interpreting chemical footprints due to large non-linear relationships between predictor and predictand in a large sample set, likely due to honey quality variability across geographic regions. Therefore, an advanced modelling approach, one-dimensional convolutional neural networks (1D-CNN), was investigated for analysing hyperspectral data for extraction of biochemical information from honey. The 1D-CNN model showed superior prediction of honey quality (R² = 0.73, RMSE = 2.346, RPD= 2.56) to PLS (R² = 0.66, RMSE = 2.607, RPD= 1.91) and SVM (R² = 0.67, RMSE = 2.559, RPD=1.98). Classification of mono-floral manuka honey from multi-floral and non-manuka honey exceeded 90% accuracy for all models tried. Overall, this study reveals the potential of HSI and deep learning modelling for automating the evaluation of honey quality in frames.

Keywords: mānuka honey, quality, purity, potency, deep learning, 1D-CNN, chemometrics

Procedia PDF Downloads 102
213 A Convolutional Neural Network Based Vehicle Theft Detection, Location, and Reporting System

Authors: Michael Moeti, Khuliso Sigama, Thapelo Samuel Matlala

Abstract:

One of the principal challenges that the world is confronted with is insecurity. The crime rate is increasing exponentially, and protecting our physical assets especially in the motorist industry, is becoming impossible when applying our own strength. The need to develop technological solutions that detect and report theft without any human interference is inevitable. This is critical, especially for vehicle owners, to ensure theft detection and speedy identification towards recovery efforts in cases where a vehicle is missing or attempted theft is taking place. The vehicle theft detection system uses Convolutional Neural Network (CNN) to recognize the driver's face captured using an installed mobile phone device. The location identification function uses a Global Positioning System (GPS) to determine the real-time location of the vehicle. Upon identification of the location, Global System for Mobile Communications (GSM) technology is used to report or notify the vehicle owner about the whereabouts of the vehicle. The installed mobile app was implemented by making use of python as it is undoubtedly the best choice in machine learning. It allows easy access to machine learning algorithms through its widely developed library ecosystem. The graphical user interface was developed by making use of JAVA as it is better suited for mobile development. Google's online database (Firebase) was used as a means of storage for the application. The system integration test was performed using a simple percentage analysis. Sixty (60) vehicle owners participated in this study as a sample, and questionnaires were used in order to establish the acceptability of the system developed. The result indicates the efficiency of the proposed system, and consequently, the paper proposes the use of the system can effectively monitor the vehicle at any given place, even if it is driven outside its normal jurisdiction. More so, the system can be used as a database to detect, locate and report missing vehicles to different security agencies.

Keywords: CNN, location identification, tracking, GPS, GSM

Procedia PDF Downloads 124
212 Using Convolutional Neural Networks to Distinguish Different Sign Language Alphanumerics

Authors: Stephen L. Green, Alexander N. Gorban, Ivan Y. Tyukin

Abstract:

Within the past decade, using Convolutional Neural Networks (CNN)’s to create Deep Learning systems capable of translating Sign Language into text has been a breakthrough in breaking the communication barrier for deaf-mute people. Conventional research on this subject has been concerned with training the network to recognize the fingerspelling gestures of a given language and produce their corresponding alphanumerics. One of the problems with the current developing technology is that images are scarce, with little variations in the gestures being presented to the recognition program, often skewed towards single skin tones and hand sizes that makes a percentage of the population’s fingerspelling harder to detect. Along with this, current gesture detection programs are only trained on one finger spelling language despite there being one hundred and forty-two known variants so far. All of this presents a limitation for traditional exploitation for the state of current technologies such as CNN’s, due to their large number of required parameters. This work aims to present a technology that aims to resolve this issue by combining a pretrained legacy AI system for a generic object recognition task with a corrector method to uptrain the legacy network. This is a computationally efficient procedure that does not require large volumes of data even when covering a broad range of sign languages such as American Sign Language, British Sign Language and Chinese Sign Language (Pinyin). Implementing recent results on method concentration, namely the stochastic separation theorem, an AI system is supposed as an operate mapping an input present in the set of images u ∈ U to an output that exists in a set of predicted class labels q ∈ Q of the alphanumeric that q represents and the language it comes from. These inputs and outputs, along with the interval variables z ∈ Z represent the system’s current state which implies a mapping that assigns an element x ∈ ℝⁿ to the triple (u, z, q). As all xi are i.i.d vectors drawn from a product mean distribution, over a period of time the AI generates a large set of measurements xi called S that are grouped into two categories: the correct predictions M and the incorrect predictions Y. Once the network has made its predictions, a corrector can then be applied through centering S and Y by subtracting their means. The data is then regularized by applying the Kaiser rule to the resulting eigenmatrix and then whitened before being split into pairwise, positively correlated clusters. Each of these clusters produces a unique hyperplane and if any element x falls outside the region bounded by these lines then it is reported as an error. As a result of this methodology, a self-correcting recognition process is created that can identify fingerspelling from a variety of sign language and successfully identify the corresponding alphanumeric and what language the gesture originates from which no other neural network has been able to replicate.

Keywords: convolutional neural networks, deep learning, shallow correctors, sign language

Procedia PDF Downloads 75
211 Using Machine Learning to Classify Different Body Parts and Determine Healthiness

Authors: Zachary Pan

Abstract:

Our general mission is to solve the problem of classifying images into different body part types and deciding if each of them is healthy or not. However, for now, we will determine healthiness for only one-sixth of the body parts, specifically the chest. We will detect pneumonia in X-ray scans of those chest images. With this type of AI, doctors can use it as a second opinion when they are taking CT or X-ray scans of their patients. Another ad-vantage of using this machine learning classifier is that it has no human weaknesses like fatigue. The overall ap-proach to this problem is to split the problem into two parts: first, classify the image, then determine if it is healthy. In order to classify the image into a specific body part class, the body parts dataset must be split into test and training sets. We can then use many models, like neural networks or logistic regression models, and fit them using the training set. Now, using the test set, we can obtain a realistic accuracy the models will have on images in the real world since these testing images have never been seen by the models before. In order to increase this testing accuracy, we can also apply many complex algorithms to the models, like multiplicative weight update. For the second part of the problem, to determine if the body part is healthy, we can have another dataset consisting of healthy and non-healthy images of the specific body part and once again split that into the test and training sets. We then use another neural network to train on those training set images and use the testing set to figure out its accuracy. We will do this process only for the chest images. A major conclusion reached is that convolutional neural networks are the most reliable and accurate at image classification. In classifying the images, the logistic regression model, the neural network, neural networks with multiplicative weight update, neural networks with the black box algorithm, and the convolutional neural network achieved 96.83 percent accuracy, 97.33 percent accuracy, 97.83 percent accuracy, 96.67 percent accuracy, and 98.83 percent accuracy, respectively. On the other hand, the overall accuracy of the model that de-termines if the images are healthy or not is around 78.37 percent accuracy.

Keywords: body part, healthcare, machine learning, neural networks

Procedia PDF Downloads 71
210 Encephalon-An Implementation of a Handwritten Mathematical Expression Solver

Authors: Shreeyam, Ranjan Kumar Sah, Shivangi

Abstract:

Recognizing and solving handwritten mathematical expressions can be a challenging task, particularly when certain characters are segmented and classified. This project proposes a solution that uses Convolutional Neural Network (CNN) and image processing techniques to accurately solve various types of equations, including arithmetic, quadratic, and trigonometric equations, as well as logical operations like logical AND, OR, NOT, NAND, XOR, and NOR. The proposed solution also provides a graphical solution, allowing users to visualize equations and their solutions. In addition to equation solving, the platform, called CNNCalc, offers a comprehensive learning experience for students. It provides educational content, a quiz platform, and a coding platform for practicing programming skills in different languages like C, Python, and Java. This all-in-one solution makes the learning process engaging and enjoyable for students. The proposed methodology includes horizontal compact projection analysis and survey for segmentation and binarization, as well as connected component analysis and integrated connected component analysis for character classification. The compact projection algorithm compresses the horizontal projections to remove noise and obtain a clearer image, contributing to the accuracy of character segmentation. Experimental results demonstrate the effectiveness of the proposed solution in solving a wide range of mathematical equations. CNNCalc provides a powerful and user-friendly platform for solving equations, learning, and practicing programming skills. With its comprehensive features and accurate results, CNNCalc is poised to revolutionize the way students learn and solve mathematical equations. The platform utilizes a custom-designed Convolutional Neural Network (CNN) with image processing techniques to accurately recognize and classify symbols within handwritten equations. The compact projection algorithm effectively removes noise from horizontal projections, leading to clearer images and improved character segmentation. Experimental results demonstrate the accuracy and effectiveness of the proposed solution in solving a wide range of equations, including arithmetic, quadratic, trigonometric, and logical operations. CNNCalc features a user-friendly interface with a graphical representation of equations being solved, making it an interactive and engaging learning experience for users. The platform also includes tutorials, testing capabilities, and programming features in languages such as C, Python, and Java. Users can track their progress and work towards improving their skills. CNNCalc is poised to revolutionize the way students learn and solve mathematical equations with its comprehensive features and accurate results.

Keywords: AL, ML, hand written equation solver, maths, computer, CNNCalc, convolutional neural networks

Procedia PDF Downloads 82
209 Training a Neural Network to Segment, Detect and Recognize Numbers

Authors: Abhisek Dash

Abstract:

This study had three neural networks, one for number segmentation, one for number detection and one for number recognition all of which are coupled to one another. All networks were trained on the MNIST dataset and were convolutional. It was assumed that the images had lighter background and darker foreground. The segmentation network took 28x28 images as input and had sixteen outputs. Segmentation training starts when a dark pixel is encountered. Taking a window(7x7) over that pixel as focus, the eight neighborhood of the focus was checked for further dark pixels. The segmentation network was then trained to move in those directions which had dark pixels. To this end the segmentation network had 16 outputs. They were arranged as “go east”, ”don’t go east ”, “go south east”, “don’t go south east”, “go south”, “don’t go south” and so on w.r.t focus window. The focus window was resized into a 28x28 image and the network was trained to consider those neighborhoods which had dark pixels. The neighborhoods which had dark pixels were pushed into a queue in a particular order. The neighborhoods were then popped one at a time stitched to the existing partial image of the number one at a time and trained on which neighborhoods to consider when the new partial image was presented. The above process was repeated until the image was fully covered by the 7x7 neighborhoods and there were no more uncovered black pixels. During testing the network scans and looks for the first dark pixel. From here on the network predicts which neighborhoods to consider and segments the image. After this step the group of neighborhoods are passed into the detection network. The detection network took 28x28 images as input and had two outputs denoting whether a number was detected or not. Since the ground truth of the bounds of a number was known during training the detection network outputted in favor of number not found until the bounds were not met and vice versa. The recognition network was a standard CNN that also took 28x28 images and had 10 outputs for recognition of numbers from 0 to 9. This network was activated only when the detection network votes in favor of number detected. The above methodology could segment connected and overlapping numbers. Additionally the recognition unit was only invoked when a number was detected which minimized false positives. It also eliminated the need for rules of thumb as segmentation is learned. The strategy can also be extended to other characters as well.

Keywords: convolutional neural networks, OCR, text detection, text segmentation

Procedia PDF Downloads 129
208 A Study on the Application of Machine Learning and Deep Learning Techniques for Skin Cancer Detection

Authors: Hritwik Ghosh, Irfan Sadiq Rahat, Sachi Nandan Mohanty, J. V. R. Ravindra

Abstract:

In the rapidly evolving landscape of medical diagnostics, the early detection and accurate classification of skin cancer remain paramount for effective treatment outcomes. This research delves into the transformative potential of Artificial Intelligence (AI), specifically Deep Learning (DL), as a tool for discerning and categorizing various skin conditions. Utilizing a diverse dataset of 3,000 images representing nine distinct skin conditions, we confront the inherent challenge of class imbalance. This imbalance, where conditions like melanomas are over-represented, is addressed by incorporating class weights during the model training phase, ensuring an equitable representation of all conditions in the learning process. Our pioneering approach introduces a hybrid model, amalgamating the strengths of two renowned Convolutional Neural Networks (CNNs), VGG16 and ResNet50. These networks, pre-trained on the ImageNet dataset, are adept at extracting intricate features from images. By synergizing these models, our research aims to capture a holistic set of features, thereby bolstering classification performance. Preliminary findings underscore the hybrid model's superiority over individual models, showcasing its prowess in feature extraction and classification. Moreover, the research emphasizes the significance of rigorous data pre-processing, including image resizing, color normalization, and segmentation, in ensuring data quality and model reliability. In essence, this study illuminates the promising role of AI and DL in revolutionizing skin cancer diagnostics, offering insights into its potential applications in broader medical domains.

Keywords: artificial intelligence, machine learning, deep learning, skin cancer, dermatology, convolutional neural networks, image classification, computer vision, healthcare technology, cancer detection, medical imaging

Procedia PDF Downloads 43
207 Convolutional Neural Network Based on Random Kernels for Analyzing Visual Imagery

Authors: Ja-Keoung Koo, Kensuke Nakamura, Hyohun Kim, Dongwha Shin, Yeonseok Kim, Ji-Su Ahn, Byung-Woo Hong

Abstract:

The machine learning techniques based on a convolutional neural network (CNN) have been actively developed and successfully applied to a variety of image analysis tasks including reconstruction, noise reduction, resolution enhancement, segmentation, motion estimation, object recognition. The classical visual information processing that ranges from low level tasks to high level ones has been widely developed in the deep learning framework. It is generally considered as a challenging problem to derive visual interpretation from high dimensional imagery data. A CNN is a class of feed-forward artificial neural network that usually consists of deep layers the connections of which are established by a series of non-linear operations. The CNN architecture is known to be shift invariant due to its shared weights and translation invariance characteristics. However, it is often computationally intractable to optimize the network in particular with a large number of convolution layers due to a large number of unknowns to be optimized with respect to the training set that is generally required to be large enough to effectively generalize the model under consideration. It is also necessary to limit the size of convolution kernels due to the computational expense despite of the recent development of effective parallel processing machinery, which leads to the use of the constantly small size of the convolution kernels throughout the deep CNN architecture. However, it is often desired to consider different scales in the analysis of visual features at different layers in the network. Thus, we propose a CNN model where different sizes of the convolution kernels are applied at each layer based on the random projection. We apply random filters with varying sizes and associate the filter responses with scalar weights that correspond to the standard deviation of the random filters. We are allowed to use large number of random filters with the cost of one scalar unknown for each filter. The computational cost in the back-propagation procedure does not increase with the larger size of the filters even though the additional computational cost is required in the computation of convolution in the feed-forward procedure. The use of random kernels with varying sizes allows to effectively analyze image features at multiple scales leading to a better generalization. The robustness and effectiveness of the proposed CNN based on random kernels are demonstrated by numerical experiments where the quantitative comparison of the well-known CNN architectures and our models that simply replace the convolution kernels with the random filters is performed. The experimental results indicate that our model achieves better performance with less number of unknown weights. The proposed algorithm has a high potential in the application of a variety of visual tasks based on the CNN framework. Acknowledgement—This work was supported by the MISP (Ministry of Science and ICT), Korea, under the National Program for Excellence in SW (20170001000011001) supervised by IITP, and NRF-2014R1A2A1A11051941, NRF2017R1A2B4006023.

Keywords: deep learning, convolutional neural network, random kernel, random projection, dimensionality reduction, object recognition

Procedia PDF Downloads 259
206 Smart Sensor Data to Predict Machine Performance with IoT-Based Machine Learning and Artificial Intelligence

Authors: C. J. Rossouw, T. I. van Niekerk

Abstract:

The global manufacturing industry is utilizing the internet and cloud-based services to further explore the anatomy and optimize manufacturing processes in support of the movement into the Fourth Industrial Revolution (4IR). The 4IR from a third world and African perspective is hindered by the fact that many manufacturing systems that were developed in the third industrial revolution are not inherently equipped to utilize the internet and services of the 4IR, hindering the progression of third world manufacturing industries into the 4IR. This research focuses on the development of a non-invasive and cost-effective cyber-physical IoT system that will exploit a machine’s vibration to expose semantic characteristics in the manufacturing process and utilize these results through a real-time cloud-based machine condition monitoring system with the intention to optimize the system. A microcontroller-based IoT sensor was designed to acquire a machine’s mechanical vibration data, process it in real-time, and transmit it to a cloud-based platform via Wi-Fi and the internet. Time-frequency Fourier analysis was applied to the vibration data to form an image representation of the machine’s behaviour. This data was used to train a Convolutional Neural Network (CNN) to learn semantic characteristics in the machine’s behaviour and relate them to a state of operation. The same data was also used to train a Convolutional Autoencoder (CAE) to detect anomalies in the data. Real-time edge-based artificial intelligence was achieved by deploying the CNN and CAE on the sensor to analyse the vibration. A cloud platform was deployed to visualize the vibration data and the results of the CNN and CAE in real-time. The cyber-physical IoT system was deployed on a semi-automated metal granulation machine with a set of trained machine learning models. Using a single sensor, the system was able to accurately visualize three states of the machine’s operation in real-time. The system was also able to detect a variance in the material being granulated. The research demonstrates how non-IoT manufacturing systems can be equipped with edge-based artificial intelligence to establish a remote machine condition monitoring system.

Keywords: IoT, cyber-physical systems, artificial intelligence, manufacturing, vibration analytics, continuous machine condition monitoring

Procedia PDF Downloads 63
205 Automatic Detection of Sugarcane Diseases: A Computer Vision-Based Approach

Authors: Himanshu Sharma, Karthik Kumar, Harish Kumar

Abstract:

The major problem in crop cultivation is the occurrence of multiple crop diseases. During the growth stage, timely identification of crop diseases is paramount to ensure the high yield of crops, lower production costs, and minimize pesticide usage. In most cases, crop diseases produce observable characteristics and symptoms. The Surveyors usually diagnose crop diseases when they walk through the fields. However, surveyor inspections tend to be biased and error-prone due to the nature of the monotonous task and the subjectivity of individuals. In addition, visual inspection of each leaf or plant is costly, time-consuming, and labour-intensive. Furthermore, the plant pathologists and experts who can often identify the disease within the plant according to their symptoms in early stages are not readily available in remote regions. Therefore, this study specifically addressed early detection of leaf scald, red rot, and eyespot types of diseases within sugarcane plants. The study proposes a computer vision-based approach using a convolutional neural network (CNN) for automatic identification of crop diseases. To facilitate this, firstly, images of sugarcane diseases were taken from google without modifying the scene, background, or controlling the illumination to build the training dataset. Then, the testing dataset was developed based on the real-time collected images from the sugarcane field from India. Then, the image dataset is pre-processed for feature extraction and selection. Finally, the CNN-based Visual Geometry Group (VGG) model was deployed on the training and testing dataset to classify the images into diseased and healthy sugarcane plants and measure the model's performance using various parameters, i.e., accuracy, sensitivity, specificity, and F1-score. The promising result of the proposed model lays the groundwork for the automatic early detection of sugarcane disease. The proposed research directly sustains an increase in crop yield.

Keywords: automatic classification, computer vision, convolutional neural network, image processing, sugarcane disease, visual geometry group

Procedia PDF Downloads 91
204 Monitoring Large-Coverage Forest Canopy Height by Integrating LiDAR and Sentinel-2 Images

Authors: Xiaobo Liu, Rakesh Mishra, Yun Zhang

Abstract:

Continuous monitoring of forest canopy height with large coverage is essential for obtaining forest carbon stocks and emissions, quantifying biomass estimation, analyzing vegetation coverage, and determining biodiversity. LiDAR can be used to collect accurate woody vegetation structure such as canopy height. However, LiDAR’s coverage is usually limited because of its high cost and limited maneuverability, which constrains its use for dynamic and large area forest canopy monitoring. On the other hand, optical satellite images, like Sentinel-2, have the ability to cover large forest areas with a high repeat rate, but they do not have height information. Hence, exploring the solution of integrating LiDAR data and Sentinel-2 images to enlarge the coverage of forest canopy height prediction and increase the prediction repeat rate has been an active research topic in the environmental remote sensing community. In this study, we explore the potential of training a Random Forest Regression (RFR) model and a Convolutional Neural Network (CNN) model, respectively, to develop two predictive models for predicting and validating the forest canopy height of the Acadia Forest in New Brunswick, Canada, with a 10m ground sampling distance (GSD), for the year 2018 and 2021. Two 10m airborne LiDAR-derived canopy height models, one for 2018 and one for 2021, are used as ground truth to train and validate the RFR and CNN predictive models. To evaluate the prediction performance of the trained RFR and CNN models, two new predicted canopy height maps (CHMs), one for 2018 and one for 2021, are generated using the trained RFR and CNN models and 10m Sentinel-2 images of 2018 and 2021, respectively. The two 10m predicted CHMs from Sentinel-2 images are then compared with the two 10m airborne LiDAR-derived canopy height models for accuracy assessment. The validation results show that the mean absolute error (MAE) for year 2018 of the RFR model is 2.93m, CNN model is 1.71m; while the MAE for year 2021 of the RFR model is 3.35m, and the CNN model is 3.78m. These demonstrate the feasibility of using the RFR and CNN models developed in this research for predicting large-coverage forest canopy height at 10m spatial resolution and a high revisit rate.

Keywords: remote sensing, forest canopy height, LiDAR, Sentinel-2, artificial intelligence, random forest regression, convolutional neural network

Procedia PDF Downloads 54
203 Mobile Smart Application Proposal for Predicting Calories in Food

Authors: Marcos Valdez Alexander Junior, Igor Aguilar-Alonso

Abstract:

Malnutrition is the root of different diseases that universally affect everyone, diseases such as obesity and malnutrition. The objective of this research is to predict the calories of the food to be eaten, developing a smart mobile application to show the user if a meal is balanced. Due to the large percentage of obesity and malnutrition in Peru, the present work is carried out. The development of the intelligent application is proposed with a three-layer architecture, and for the prediction of the nutritional value of the food, the use of pre-trained models based on convolutional neural networks is proposed.

Keywords: volume estimation, calorie estimation, artificial vision, food nutrition

Procedia PDF Downloads 65
202 Federated Learning in Healthcare

Authors: Ananya Gangavarapu

Abstract:

Convolutional Neural Networks (CNN) based models are providing diagnostic capabilities on par with the medical specialists in many specialty areas. However, collecting the medical data for training purposes is very challenging because of the increased regulations around data collections and privacy concerns around personal health data. The gathering of the data becomes even more difficult if the capture devices are edge-based mobile devices (like smartphones) with feeble wireless connectivity in rural/remote areas. In this paper, I would like to highlight Federated Learning approach to mitigate data privacy and security issues.

Keywords: deep learning in healthcare, data privacy, federated learning, training in distributed environment

Procedia PDF Downloads 113
201 Underwater Image Enhancement and Reconstruction Using CNN and the MultiUNet Model

Authors: Snehal G. Teli, R. J. Shelke

Abstract:

CNN and MultiUNet models are the framework for the proposed method for enhancing and reconstructing underwater images. Multiscale merging of features and regeneration are both performed by the MultiUNet. CNN collects relevant features. Extensive tests on benchmark datasets show that the proposed strategy performs better than the latest methods. As a result of this work, underwater images can be represented and interpreted in a number of underwater applications with greater clarity. This strategy will advance underwater exploration and marine research by enhancing real-time underwater image processing systems, underwater robotic vision, and underwater surveillance.

Keywords: convolutional neural network, image enhancement, machine learning, multiunet, underwater images

Procedia PDF Downloads 42
200 Accurate Mass Segmentation Using U-Net Deep Learning Architecture for Improved Cancer Detection

Authors: Ali Hamza

Abstract:

Accurate segmentation of breast ultrasound images is of paramount importance in enhancing the diagnostic capabilities of breast cancer detection. This study presents an approach utilizing the U-Net architecture for segmenting breast ultrasound images aimed at improving the accuracy and reliability of mass identification within the breast tissue. The proposed method encompasses a multi-stage process. Initially, preprocessing techniques are employed to refine image quality and diminish noise interference. Subsequently, the U-Net architecture, a deep learning convolutional neural network (CNN), is employed for pixel-wise segmentation of regions of interest corresponding to potential breast masses. The U-Net's distinctive architecture, characterized by a contracting and expansive pathway, enables accurate boundary delineation and detailed feature extraction. To evaluate the effectiveness of the proposed approach, an extensive dataset of breast ultrasound images is employed, encompassing diverse cases. Quantitative performance metrics such as the Dice coefficient, Jaccard index, sensitivity, specificity, and Hausdorff distance are employed to comprehensively assess the segmentation accuracy. Comparative analyses against traditional segmentation methods showcase the superiority of the U-Net architecture in capturing intricate details and accurately segmenting breast masses. The outcomes of this study emphasize the potential of the U-Net-based segmentation approach in bolstering breast ultrasound image analysis. The method's ability to reliably pinpoint mass boundaries holds promise for aiding radiologists in precise diagnosis and treatment planning. However, further validation and integration within clinical workflows are necessary to ascertain their practical clinical utility and facilitate seamless adoption by healthcare professionals. In conclusion, leveraging the U-Net architecture for breast ultrasound image segmentation showcases a robust framework that can significantly enhance diagnostic accuracy and advance the field of breast cancer detection. This approach represents a pivotal step towards empowering medical professionals with a more potent tool for early and accurate breast cancer diagnosis.

Keywords: mage segmentation, U-Net, deep learning, breast cancer detection, diagnostic accuracy, mass identification, convolutional neural network

Procedia PDF Downloads 47
199 DenseNet and Autoencoder Architecture for COVID-19 Chest X-Ray Image Classification and Improved U-Net Lung X-Ray Segmentation

Authors: Jonathan Gong

Abstract:

Purpose AI-driven solutions are at the forefront of many pathology and medical imaging methods. Using algorithms designed to better the experience of medical professionals within their respective fields, the efficiency and accuracy of diagnosis can improve. In particular, X-rays are a fast and relatively inexpensive test that can diagnose diseases. In recent years, X-rays have not been widely used to detect and diagnose COVID-19. The under use of Xrays is mainly due to the low diagnostic accuracy and confounding with pneumonia, another respiratory disease. However, research in this field has expressed a possibility that artificial neural networks can successfully diagnose COVID-19 with high accuracy. Models and Data The dataset used is the COVID-19 Radiography Database. This dataset includes images and masks of chest X-rays under the labels of COVID-19, normal, and pneumonia. The classification model developed uses an autoencoder and a pre-trained convolutional neural network (DenseNet201) to provide transfer learning to the model. The model then uses a deep neural network to finalize the feature extraction and predict the diagnosis for the input image. This model was trained on 4035 images and validated on 807 separate images from the ones used for training. The images used to train the classification model include an important feature: the pictures are cropped beforehand to eliminate distractions when training the model. The image segmentation model uses an improved U-Net architecture. This model is used to extract the lung mask from the chest X-ray image. The model is trained on 8577 images and validated on a validation split of 20%. These models are calculated using the external dataset for validation. The models’ accuracy, precision, recall, f1-score, IOU, and loss are calculated. Results The classification model achieved an accuracy of 97.65% and a loss of 0.1234 when differentiating COVID19-infected, pneumonia-infected, and normal lung X-rays. The segmentation model achieved an accuracy of 97.31% and an IOU of 0.928. Conclusion The models proposed can detect COVID-19, pneumonia, and normal lungs with high accuracy and derive the lung mask from a chest X-ray with similarly high accuracy. The hope is for these models to elevate the experience of medical professionals and provide insight into the future of the methods used.

Keywords: artificial intelligence, convolutional neural networks, deep learning, image processing, machine learning

Procedia PDF Downloads 96
198 On-Road Text Detection Platform for Driver Assistance Systems

Authors: Guezouli Larbi, Belkacem Soundes

Abstract:

The automation of the text detection process can help the human in his driving task. Its application can be very useful to help drivers to have more information about their environment by facilitating the reading of road signs such as directional signs, events, stores, etc. In this paper, a system consisting of two stages has been proposed. In the first one, we used pseudo-Zernike moments to pinpoint areas of the image that may contain text. The architecture of this part is based on three main steps, region of interest (ROI) detection, text localization, and non-text region filtering. Then, in the second step, we present a convolutional neural network architecture (On-Road Text Detection Network - ORTDN) which is considered a classification phase. The results show that the proposed framework achieved ≈ 35 fps and an mAP of ≈ 90%, thus a low computational time with competitive accuracy.

Keywords: text detection, CNN, PZM, deep learning

Procedia PDF Downloads 56
197 The Outcome of Using Machine Learning in Medical Imaging

Authors: Adel Edwar Waheeb Louka

Abstract:

Purpose AI-driven solutions are at the forefront of many pathology and medical imaging methods. Using algorithms designed to better the experience of medical professionals within their respective fields, the efficiency and accuracy of diagnosis can improve. In particular, X-rays are a fast and relatively inexpensive test that can diagnose diseases. In recent years, X-rays have not been widely used to detect and diagnose COVID-19. The under use of Xrays is mainly due to the low diagnostic accuracy and confounding with pneumonia, another respiratory disease. However, research in this field has expressed a possibility that artificial neural networks can successfully diagnose COVID-19 with high accuracy. Models and Data The dataset used is the COVID-19 Radiography Database. This dataset includes images and masks of chest X-rays under the labels of COVID-19, normal, and pneumonia. The classification model developed uses an autoencoder and a pre-trained convolutional neural network (DenseNet201) to provide transfer learning to the model. The model then uses a deep neural network to finalize the feature extraction and predict the diagnosis for the input image. This model was trained on 4035 images and validated on 807 separate images from the ones used for training. The images used to train the classification model include an important feature: the pictures are cropped beforehand to eliminate distractions when training the model. The image segmentation model uses an improved U-Net architecture. This model is used to extract the lung mask from the chest X-ray image. The model is trained on 8577 images and validated on a validation split of 20%. These models are calculated using the external dataset for validation. The models’ accuracy, precision, recall, f1-score, IOU, and loss are calculated. Results The classification model achieved an accuracy of 97.65% and a loss of 0.1234 when differentiating COVID19-infected, pneumonia-infected, and normal lung X-rays. The segmentation model achieved an accuracy of 97.31% and an IOU of 0.928. Conclusion The models proposed can detect COVID-19, pneumonia, and normal lungs with high accuracy and derive the lung mask from a chest X-ray with similarly high accuracy. The hope is for these models to elevate the experience of medical professionals and provide insight into the future of the methods used.

Keywords: artificial intelligence, convolutional neural networks, deeplearning, image processing, machine learningSarapin, intraarticular, chronic knee pain, osteoarthritisFNS, trauma, hip, neck femur fracture, minimally invasive surgery

Procedia PDF Downloads 23
196 Unsupervised Learning of Spatiotemporally Coherent Metrics

Authors: Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann LeCun

Abstract:

Current state-of-the-art classification and detection algorithms rely on supervised training. In this work we study unsupervised feature learning in the context of temporally coherent video data. We focus on feature learning from unlabeled video data, using the assumption that adjacent video frames contain semantically similar information. This assumption is exploited to train a convolutional pooling auto-encoder regularized by slowness and sparsity. We establish a connection between slow feature learning to metric learning and show that the trained encoder can be used to define a more temporally and semantically coherent metric.

Keywords: machine learning, pattern clustering, pooling, classification

Procedia PDF Downloads 420
195 Automatic Identification of Aquatic Insects Based on Deep Learning and Computer Vision

Authors: Predrag Simović, Katarina Stojanović, Milena Radenković, Dimitrija Savić Zdravković, Aleksandar Milosavljević, Bratislav Predić, Milenka Božanić, Ana Petrović, Djuradj Milošević

Abstract:

Mayflies (Ephemeroptera), stoneflies (Plecoptera), and caddisflies (Trichoptera) (collectively referred to as EPT) are key participants in most freshwater habitats and often exhibit high diversity. Moreover, their presence and relative abundance are used in freshwater ecological and biomonitoring studies. Current methods for freshwater ecosystem biomonitoring follow a traditional approach of taxa monitoring based on morphological characters, which is time-consuming, and often generates data sets with low taxonomic resolution and unverifiable identification precision. To assist in solving identification problems and contribute to the knowledge of the distribution of many species, there was a need to develop alternative approaches in macroinvertebrate sample identification. Here, we establish an automatic machine-based identification approach for EPT taxa (Insect) using deep Convolutional Neural Networks (CNNs) and computer vision to increase the efficiency and taxonomic resolution in biomonitoring. The 5 550 specimens were collected from freshwater ecosystems of Serbia, and the deep model was built upon 90 EPT taxa. The protocol for obtaining images included the following stages: taxonomic identification by human experts and DNA barcoding validation, mounting the larvae, and photographing the dorsal side using a stereomicroscope and camera (16 650 individuals). The most informative image regions (the dorsal segments of individuals) for the decision-making process in the deep learning model were visualized using Gradient Weighted Class Activation Mapping (Grad-CAM). After training the artificial neural network, a CNN model was then built that was able to classify the 90 EPT taxa into their respective taxonomic categories automatically with 98.7%. Our approach offers a straightforward and efficient solution for routine monitoring programs, focusing on key biotic descriptors, such as EPT taxa. In addition, this application provides a streamlined solution that not only saves time, reduces equipment and expert requirements but also significantly enhances reliability and information content. The identification of the EPT larvae is difficult because of the variation of morphological features even within a single genus or the close resemblance of several species, and therefore, future research should focus on increasing the number of entities (species) in the model.

Keywords: convolutional neural networks, DNA barcoding, EPT taxa, biomonitoring

Procedia PDF Downloads 47
194 Facial Emotion Recognition Using Deep Learning

Authors: Ashutosh Mishra, Nikhil Goyal

Abstract:

A 3D facial emotion recognition model based on deep learning is proposed in this paper. Two convolution layers and a pooling layer are employed in the deep learning architecture. After the convolution process, the pooling is finished. The probabilities for various classes of human faces are calculated using the sigmoid activation function. To verify the efficiency of deep learning-based systems, a set of faces. The Kaggle dataset is used to verify the accuracy of a deep learning-based face recognition model. The model's accuracy is about 65 percent, which is lower than that of other facial expression recognition techniques. Despite significant gains in representation precision due to the nonlinearity of profound image representations.

Keywords: facial recognition, computational intelligence, convolutional neural network, depth map

Procedia PDF Downloads 196
193 AS-Geo: Arbitrary-Sized Image Geolocalization with Learnable Geometric Enhancement Resizer

Authors: Huayuan Lu, Chunfang Yang, Ma Zhu, Baojun Qi, Yaqiong Qiao, Jiangqian Xu

Abstract:

Image geolocalization has great application prospects in fields such as autonomous driving and virtual/augmented reality. In practical application scenarios, the size of the image to be located is not fixed; it is impractical to train different networks for all possible sizes. When its size does not match the size of the input of the descriptor extraction model, existing image geolocalization methods usually directly scale or crop the image in some common ways. This will result in the loss of some information important to the geolocalization task, thus affecting the performance of the image geolocalization method. For example, excessive down-sampling can lead to blurred building contour, and inappropriate cropping can lead to the loss of key semantic elements, resulting in incorrect geolocation results. To address this problem, this paper designs a learnable image resizer and proposes an arbitrary-sized image geolocation method. (1) The designed learnable image resizer employs the self-attention mechanism to enhance the geometric features of the resized image. Firstly, it applies bilinear interpolation to the input image and its feature maps to obtain the initial resized image and the resized feature maps. Then, SKNet (selective kernel net) is used to approximate the best receptive field, thus keeping the geometric shapes as the original image. And SENet (squeeze and extraction net) is used to automatically select the feature maps with strong contour information, enhancing the geometric features. Finally, the enhanced geometric features are fused with the initial resized image, to obtain the final resized images. (2) The proposed image geolocalization method embeds the above image resizer as a fronting layer of the descriptor extraction network. It not only enables the network to be compatible with arbitrary-sized input images but also enhances the geometric features that are crucial to the image geolocalization task. Moreover, the triplet attention mechanism is added after the first convolutional layer of the backbone network to optimize the utilization of geometric elements extracted by the first convolutional layer. Finally, the local features extracted by the backbone network are aggregated to form image descriptors for image geolocalization. The proposed method was evaluated on several mainstream datasets, such as Pittsburgh30K, Tokyo24/7, and Places365. The results show that the proposed method has excellent size compatibility and compares favorably to recently mainstream geolocalization methods.

Keywords: image geolocalization, self-attention mechanism, image resizer, geometric feature

Procedia PDF Downloads 178
192 Prototyping a Portable, Affordable Sign Language Glove

Authors: Vidhi Jain

Abstract:

Communication between speakers and non-speakers of American Sign Language (ASL) can be problematic, inconvenient, and expensive. This project attempts to bridge the communication gap by designing a portable glove that captures the user’s ASL gestures and outputs the translated text on a smartphone. The glove is equipped with flex sensors, contact sensors, and a gyroscope to measure the flexion of the fingers, the contact between fingers, and the rotation of the hand. The glove’s Arduino UNO microcontroller analyzes the sensor readings to identify the gesture from a library of learned gestures. The Bluetooth module transmits the gesture to a smartphone. Using this device, one day speakers of ASL may be able to communicate with others in an affordable and convenient way.

Keywords: sign language, morse code, convolutional neural network, American sign language, gesture recognition

Procedia PDF Downloads 22
191 Radar-Based Classification of Pedestrian and Dog Using High-Resolution Raw Range-Doppler Signatures

Authors: C. Mayr, J. Periya, A. Kariminezhad

Abstract:

In this paper, we developed a learning framework for the classification of vulnerable road users (VRU) by their range-Doppler signatures. The frequency-modulated continuous-wave (FMCW) radar raw data is first pre-processed to obtain robust object range-Doppler maps per coherent time interval. The complex-valued range-Doppler maps captured from our outdoor measurements are further fed into a convolutional neural network (CNN) to learn the classification. This CNN has gone through a hyperparameter optimization process for improved learning. By learning VRU range-Doppler signatures, the three classes 'pedestrian', 'dog', and 'noise' are classified with an average accuracy of almost 95%. Interestingly, this classification accuracy holds for a combined longitudinal and lateral object trajectories.

Keywords: machine learning, radar, signal processing, autonomous driving

Procedia PDF Downloads 210
190 GRCNN: Graph Recognition Convolutional Neural Network for Synthesizing Programs from Flow Charts

Authors: Lin Cheng, Zijiang Yang

Abstract:

Program synthesis is the task to automatically generate programs based on user specification. In this paper, we present a framework that synthesizes programs from flow charts that serve as accurate and intuitive specification. In order doing so, we propose a deep neural network called GRCNN that recognizes graph structure from its image. GRCNN is trained end-to-end, which can predict edge and node information of the flow chart simultaneously. Experiments show that the accuracy rate to synthesize a program is 66.4%, and the accuracy rates to recognize edge and node are 94.1% and 67.9%, respectively. On average, it takes about 60 milliseconds to synthesize a program.

Keywords: program synthesis, flow chart, specification, graph recognition, CNN

Procedia PDF Downloads 96
189 Review on Rainfall Prediction Using Machine Learning Technique

Authors: Prachi Desai, Ankita Gandhi, Mitali Acharya

Abstract:

Rainfall forecast is mainly used for predictions of rainfall in a specified area and determining their future rainfall conditions. Rainfall is always a global issue as it affects all major aspects of one's life. Agricultural, fisheries, forestry, tourism industry and other industries are widely affected by these conditions. The studies have resulted in insufficient availability of water resources and an increase in water demand in the near future. We already have a new forecast system that uses the deep Convolutional Neural Network (CNN) to forecast monthly rainfall and climate changes. We have also compared CNN against Artificial Neural Networks (ANN). Machine Learning techniques that are used in rainfall predictions include ARIMA Model, ANN, LR, SVM etc. The dataset on which we are experimenting is gathered online over the year 1901 to 20118. Test results have suggested more realistic improvements than conventional rainfall forecasts.

Keywords: ANN, CNN, supervised learning, machine learning, deep learning

Procedia PDF Downloads 153