Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 8

Search results for: optimizers

8 Investigating the Influence of Activation Functions on Image Classification Accuracy via Deep Convolutional Neural Network

Abstract:

Convolutional Neural Networks (CNNs) have emerged as powerful tools for image classification, and the choice of optimizers profoundly affects their performance. The study of optimizers and their adaptations remains a topic of significant importance in machine learning research. While numerous studies have explored and advocated for various optimizers, the efficacy of these optimization techniques is still subject to scrutiny. This work aims to address the challenges surrounding the effectiveness of optimizers by conducting a comprehensive analysis and evaluation. The primary focus of this investigation lies in examining the performance of different optimizers when employed in conjunction with the popular activation function, Rectified Linear Unit (ReLU). By incorporating ReLU, known for its favorable properties in prior research, the aim is to bolster the effectiveness of the optimizers under scrutiny. Specifically, we evaluate the adjustment of these optimizers with both the original Softmax activation function and the modified ReLU activation function, carefully assessing their impact on overall performance. To achieve this, a series of experiments are conducted using a well-established benchmark dataset for image classification tasks, namely the Canadian Institute for Advanced Research dataset (CIFAR-10). The selected optimizers for investigation encompass a range of prominent algorithms, including Adam, Root Mean Squared Propagation (RMSprop), Adaptive Learning Rate Method (Adadelta), Adaptive Gradient Algorithm (Adagrad), and Stochastic Gradient Descent (SGD). The performance analysis encompasses a comprehensive evaluation of the classification accuracy, convergence speed, and robustness of the CNN models trained with each optimizer. Through rigorous experimentation and meticulous assessment, we discern the strengths and weaknesses of the different optimization techniques, providing valuable insights into their suitability for image classification tasks. By conducting this in-depth study, we contribute to the existing body of knowledge surrounding optimizers in CNNs, shedding light on their performance characteristics for image classification. The findings gleaned from this research serve to guide researchers and practitioners in making informed decisions when selecting optimizers and activation functions, thus advancing the state-of-the-art in the field of image classification with convolutional neural networks.

Keywords: deep neural network, optimizers, RMsprop, ReLU, stochastic gradient descent

Procedia PDF Downloads 59

7 Loss Function Optimization for CNN-Based Fingerprint Anti-Spoofing

Authors: Yehjune Heo

Abstract:

As biometric systems become widely deployed, the security of identification systems can be easily attacked by various spoof materials. This paper contributes to finding a reliable and practical anti-spoofing method using Convolutional Neural Networks (CNNs) based on the types of loss functions and optimizers. The types of CNNs used in this paper include AlexNet, VGGNet, and ResNet. By using various loss functions including Cross-Entropy, Center Loss, Cosine Proximity, and Hinge Loss, and various loss optimizers which include Adam, SGD, RMSProp, Adadelta, Adagrad, and Nadam, we obtained significant performance changes. We realize that choosing the correct loss function for each model is crucial since different loss functions lead to different errors on the same evaluation. By using a subset of the Livdet 2017 database, we validate our approach to compare the generalization power. It is important to note that we use a subset of LiveDet and the database is the same across all training and testing for each model. This way, we can compare the performance, in terms of generalization, for the unseen data across all different models. The best CNN (AlexNet) with the appropriate loss function and optimizers result in more than 3% of performance gain over the other CNN models with the default loss function and optimizer. In addition to the highest generalization performance, this paper also contains the models with high accuracy associated with parameters and mean average error rates to find the model that consumes the least memory and computation time for training and testing. Although AlexNet has less complexity over other CNN models, it is proven to be very efficient. For practical anti-spoofing systems, the deployed version should use a small amount of memory and should run very fast with high anti-spoofing performance. For our deployed version on smartphones, additional processing steps, such as quantization and pruning algorithms, have been applied in our final model.

Keywords: anti-spoofing, CNN, fingerprint recognition, loss function, optimizer

Procedia PDF Downloads 103

6 A User-Directed Approach to Optimization via Metaprogramming

Authors: Eashan Hatti

Abstract:

In software development, programmers often must make a choice between high-level programming and high-performance programs. High-level programming encourages the use of complex, pervasive abstractions. However, the use of these abstractions degrades performance-high performance demands that programs be low-level. In a compiler, the optimizer attempts to let the user have both. The optimizer takes high-level, abstract code as an input and produces low-level, performant code as an output. However, there is a problem with having the optimizer be a built-in part of the compiler. Domain-specific abstractions implemented as libraries are common in high-level languages. As a language’s library ecosystem grows, so does the number of abstractions that programmers will use. If these abstractions are to be performant, the optimizer must be extended with new optimizations to target them, or these abstractions must rely on existing general-purpose optimizations. The latter is often not as effective as needed. The former presents too significant of an effort for the compiler developers, as they are the only ones who can extend the language with new optimizations. Thus, the language becomes more high-level, yet the optimizer – and, in turn, program performance – falls behind. Programmers are again confronted with a choice between high-level programming and high-performance programs. To investigate a potential solution to this problem, we developed Peridot, a prototype programming language. Peridot’s main contribution is that it enables library developers to easily extend the language with new optimizations themselves. This allows the optimization workload to be taken off the compiler developers’ hands and given to a much larger set of people who can specialize in each problem domain. Because of this, optimizations can be much more effective while also being much more numerous. To enable this, Peridot supports metaprogramming designed for implementing program transformations. The language is split into two fragments or “levels”, one for metaprogramming, the other for high-level general-purpose programming. The metaprogramming level supports logic programming. Peridot’s key idea is that optimizations are simply implemented as metaprograms. The meta level supports several specific features which make it particularly suited to implementing optimizers. For instance, metaprograms can automatically deduce equalities between the programs they are optimizing via unification, deal with variable binding declaratively via higher-order abstract syntax, and avoid the phase-ordering problem via non-determinism. We have found that this design centered around logic programming makes optimizers concise and easy to write compared to their equivalents in functional or imperative languages. Overall, implementing Peridot has shown that its design is a viable solution to the problem of writing code which is both high-level and performant.

Keywords: optimization, metaprogramming, logic programming, abstraction

Procedia PDF Downloads 55

5 Drone Classification Using Classification Methods Using Conventional Model With Embedded Audio-Visual Features

Authors: Hrishi Rakshit, Pooneh Bagheri Zadeh

Abstract:

This paper investigates the performance of drone classification methods using conventional DCNN with different hyperparameters, when additional drone audio data is embedded in the dataset for training and further classification. In this paper, first a custom dataset is created using different images of drones from University of South California (USC) datasets and Leeds Beckett university datasets with embedded drone audio signal. The three well-known DCNN architectures namely, Resnet50, Darknet53 and Shufflenet are employed over the created dataset tuning their hyperparameters such as, learning rates, maximum epochs, Mini Batch size with different optimizers. Precision-Recall curves and F1 Scores-Threshold curves are used to evaluate the performance of the named classification algorithms. Experimental results show that Resnet50 has the highest efficiency compared to other DCNN methods.

Keywords: drone classifications, deep convolutional neural network, hyperparameters, drone audio signal

Procedia PDF Downloads 52

4 Method and Apparatus for Optimized Job Scheduling in the High-Performance Computing Cloud Environment

Authors: Subodh Kumar, Amit Varde

Abstract:

Typical on-premises high-performance computing (HPC) environments consist of a fixed number and a fixed set of computing hardware. During the design of the HPC environment, the hardware components, including but not limited to CPU, Memory, GPU, and networking, are carefully chosen from select vendors for optimal performance. High capital cost for building the environment is a prime factor influencing the design environment. A class of software called “Job Schedulers” are critical to maximizing these resources and running multiple workloads to extract the maximum value for the high capital cost. In principle, schedulers work by preventing workloads and users from monopolizing the finite hardware resources by queuing jobs in a workload. A cloud-based HPC environment does not have the limitations of fixed (type of and quantity of) hardware resources. In theory, users and workloads could spin up any number and type of hardware resource. This paper discusses the limitations of using traditional scheduling algorithms for cloud-based HPC workloads. It proposes a new set of features, called “HPC optimizers,” for maximizing the benefits of the elasticity and scalability of the cloud with the goal of cost-performance optimization of the workload.

Keywords: high performance computing, HPC, cloud computing, optimization, schedulers

Procedia PDF Downloads 54

3 Using Computer Vision to Detect and Localize Fractures in Wrist X-ray Images

Authors: John Paul Q. Tomas, Mark Wilson L. de los Reyes, Kirsten Joyce P. Vasquez

Abstract:

The most frequent type of fracture is a wrist fracture, which often makes it difficult for medical professionals to find and locate. In this study, fractures in wrist x-ray pictures were located and identified using deep learning and computer vision. The researchers used image filtering, masking, morphological operations, and data augmentation for the image preprocessing and trained the RetinaNet and Faster R-CNN models with ResNet50 backbones and Adam optimizers separately for each image filtering technique and projection. The RetinaNet model with Anisotropic Diffusion Smoothing filter trained with 50 epochs has obtained the greatest accuracy of 99.14%, precision of 100%, sensitivity/recall of 98.41%, specificity of 100%, and an IoU score of 56.44% for the Posteroanterior projection utilizing augmented data. For the Lateral projection using augmented data, the RetinaNet model with an Anisotropic Diffusion filter trained with 50 epochs has produced the highest accuracy of 98.40%, precision of 98.36%, sensitivity/recall of 98.36%, specificity of 98.43%, and an IoU score of 58.69%. When comparing the test results of the different individual projections, models, and image filtering techniques, the Anisotropic Diffusion filter trained with 50 epochs has produced the best classification and regression scores for both projections.

Keywords: Artificial Intelligence, Computer Vision, Wrist Fracture, Deep Learning

Procedia PDF Downloads 47

2 Neuroevolution Based on Adaptive Ensembles of Biologically Inspired Optimization Algorithms Applied for Modeling a Chemical Engineering Process

Authors: Sabina-Adriana Floria, Marius Gavrilescu, Florin Leon, Silvia Curteanu, Costel Anton

Abstract:

Neuroevolution is a subfield of artificial intelligence used to solve various problems in different application areas. Specifically, neuroevolution is a technique that applies biologically inspired methods to generate neural network architectures and optimize their parameters automatically. In this paper, we use different biologically inspired optimization algorithms in an ensemble strategy with the aim of training multilayer perceptron neural networks, resulting in regression models used to simulate the industrial chemical process of obtaining bricks from silicone-based materials. Installations in the raw ceramics industry, i.e., bricks, are characterized by significant energy consumption and large quantities of emissions. In addition, the initial conditions that were taken into account during the design and commissioning of the installation can change over time, which leads to the need to add new mixes to adjust the operating conditions for the desired purpose, e.g., material properties and energy saving. The present approach follows the study by simulation of a process of obtaining bricks from silicone-based materials, i.e., the modeling and optimization of the process. Optimization aims to determine the working conditions that minimize the emissions represented by nitrogen monoxide. We first use a search procedure to find the best values for the parameters of various biologically inspired optimization algorithms. Then, we propose an adaptive ensemble strategy that uses only a subset of the best algorithms identified in the search stage. The adaptive ensemble strategy combines the results of selected algorithms and automatically assigns more processing capacity to the more efficient algorithms. Their efficiency may also vary at different stages of the optimization process. In a given ensemble iteration, the most efficient algorithms aim to maintain good convergence, while the less efficient algorithms can improve population diversity. The proposed adaptive ensemble strategy outperforms the individual optimizers and the non-adaptive ensemble strategy in convergence speed, and the obtained results provide lower error values.

Keywords: optimization, biologically inspired algorithm, neuroevolution, ensembles, bricks, emission minimization

Procedia PDF Downloads 71

1 Use of Artificial Neural Networks to Estimate Evapotranspiration for Efficient Irrigation Management

Authors: Adriana Postal, Silvio C. Sampaio, Marcio A. Villas Boas, Josué P. Castro, Ralpho R. Reis

Abstract:

This study deals with the estimation of reference evapotranspiration (ET₀) in an agricultural context, focusing on efficient irrigation management to meet the growing interest in the sustainable management of water resources. Given the importance of water in agriculture and its scarcity in many regions, efficient use of this resource is essential to ensure food security and environmental sustainability. The methodology used involved the application of artificial intelligence techniques, specifically Multilayer Perceptron (MLP) Artificial Neural Networks (ANNs), to predict ET₀ in the state of Paraná, Brazil. The models were trained and validated with meteorological data from the Brazilian National Institute of Meteorology (INMET), together with data obtained from a producer's weather station in the western region of Paraná. Two optimizers (SGD and Adam) and different meteorological variables, such as temperature, humidity, solar radiation, and wind speed, were explored as inputs to the models. Nineteen configurations with different input variables were tested; amidst them, configuration 9, with 8 input variables, was identified as the most efficient of all. Configuration 10, with 4 input variables, was considered the most effective, considering the smallest number of variables. The main conclusions of this study show that MLP ANNs are capable of accurately estimating ET₀, providing a valuable tool for irrigation management in agriculture. Both configurations (9 and 10) showed promising performance in predicting ET₀. The validation of the models with cultivator data underlined the practical relevance of these tools and confirmed their generalization ability for different field conditions. The results of the statistical metrics, including Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Coefficient of Determination (R2), showed excellent agreement between the model predictions and the observed data, with MAE as low as 0.01 mm/day and 0.03 mm/day, respectively. In addition, the models achieved an R2 between 0.99 and 1, indicating a satisfactory fit to the real data. This agreement was also confirmed by the Kolmogorov-Smirnov test, which evaluates the agreement of the predictions with the statistical behavior of the real data and yields values between 0.02 and 0.04 for the producer data. In addition, the results of this study suggest that the developed technique can be applied to other locations by using specific data from these sites to further improve ET₀ predictions and thus contribute to sustainable irrigation management in different agricultural regions. To summarize, this study has helped to advance research in the field of irrigation management in agriculture. It provides an accessible and effective approach to ET₀ estimation that has the potential to significantly improve water use efficiency and promote agricultural sustainability in different contexts.

Keywords: agricultural technology, neural networks in agriculture, water efficiency, water use optimization

Procedia PDF Downloads 11