Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5159

Search results for: reinforcement learning

5159 Metareasoning Image Optimization Q-Learning

Authors: Mahasa Zahirnia

Abstract:

The purpose of this paper is to explore new and effective ways of optimizing satellite images using artificial intelligence, and the process of implementing reinforcement learning to enhance the quality of data captured within the image. In our implementation of Bellman's reinforcement learning equations, associated state diagrams, and multi-stage image processing, we were able to enhance image quality and to detect and define objects. Reinforcement learning is a differentiator in the area of artificial intelligence, and Q-learning relies on trial and error to achieve its goals. The reward system embedded in Q-learning allows the agent to self-evaluate its performance and decide on the best possible course of action based on the current and future environment. Results show that within a simulated environment built on commercially available images, the rate of detection was 40-90%. Reinforcement learning through the Q-learning algorithm is not just a desired but a required design criterion for image optimization and enhancement. The proposed methods are a cost-effective way of resolving uncertainty in the data, because reinforcement learning finds ideal policies to manage the process using a smaller sample of images.
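
As a concrete illustration of the mechanism this abstract describes, the sketch below shows a minimal tabular Q-learning loop built around the Bellman update; the toy chain environment, reward, and all constants are illustrative assumptions, not the authors' image-processing formulation.

```python
import numpy as np

# Minimal tabular Q-learning sketch of the Bellman update the abstract
# refers to. The toy chain environment, reward, and constants below are
# illustrative assumptions, not the authors' image-processing setup.
n_states, n_actions = 16, 4
alpha, gamma, epsilon = 0.1, 0.9, 0.1
Q = np.zeros((n_states, n_actions))

def step(state, action):
    # Hypothetical dynamics: the action advances along a chain of states;
    # reward is earned only on reaching the final state.
    next_state = min(state + action, n_states - 1)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done

for episode in range(500):
    state = 0
    for t in range(200):  # cap episode length
        # Epsilon-greedy trial and error, as described in the abstract.
        if np.random.rand() < epsilon:
            action = np.random.randint(n_actions)
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, done = step(state, action)
        # Bellman update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
        Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
        state = next_state
        if done:
            break
```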

Keywords: Q-learning, image optimization, reinforcement learning, Markov decision process

Procedia PDF Downloads 57
5158 The AI Arena: A Framework for Distributed Multi-Agent Reinforcement Learning

Authors: Edward W. Staley, Corban G. Rivera, Ashley J. Llorens

Abstract:

Advances in reinforcement learning (RL) have resulted in recent breakthroughs in the application of artificial intelligence (AI) across many different domains. An emerging landscape of development environments is making powerful RL techniques more accessible for a growing community of researchers. However, most existing frameworks do not directly address the problem of learning in complex operating environments, such as dense urban settings or defense-related scenarios, that incorporate distributed, heterogeneous teams of agents. To help enable AI research for this important class of applications, we introduce the AI Arena: a scalable framework with flexible abstractions for distributed multi-agent reinforcement learning. The AI Arena extends the OpenAI Gym interface to allow greater flexibility in learning control policies across multiple agents with heterogeneous learning strategies and localized views of the environment. To illustrate the utility of our framework, we present experimental results that demonstrate performance gains due to a distributed multi-agent learning approach over commonly-used RL techniques in several different learning environments.
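
The AI Arena API itself is not reproduced here; the following is a minimal sketch of the kind of multi-agent, Gym-style interface the abstract describes, with per-agent localized observations, rewards, and done flags keyed by agent id. All names and dynamics are illustrative assumptions.

```python
import random
from typing import Dict, Tuple

# Hedged sketch of a multi-agent, Gym-style interface of the kind the
# abstract describes: per-agent localized observations, rewards, and done
# flags keyed by agent id. This is an illustration, not the AI Arena API.
class MultiAgentGridEnv:
    def __init__(self, agent_ids, size: int = 5):
        self.agent_ids = list(agent_ids)
        self.size = size
        self.positions: Dict[str, int] = {}

    def reset(self) -> Dict[str, int]:
        self.positions = {aid: 0 for aid in self.agent_ids}
        return dict(self.positions)  # each agent sees only its own position

    def step(self, actions: Dict[str, int]) -> Tuple[dict, dict, dict]:
        obs, rewards, dones = {}, {}, {}
        for aid, move in actions.items():
            self.positions[aid] = max(0, min(self.size - 1, self.positions[aid] + move))
            obs[aid] = self.positions[aid]                 # localized view
            rewards[aid] = 1.0 if obs[aid] == self.size - 1 else 0.0
            dones[aid] = obs[aid] == self.size - 1
        return obs, rewards, dones

env = MultiAgentGridEnv(["scout", "defender"])
observations = env.reset()
obs, rewards, dones = env.step({aid: random.choice([-1, 1]) for aid in observations})
```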

Keywords: reinforcement learning, multi-agent, deep learning, artificial intelligence

Procedia PDF Downloads 41
5157 Reinforcement Learning for Classification of Low-Resolution Satellite Images

Authors: Khadija Bouzaachane, El Mahdi El Guarmah

Abstract:

The classification of low-resolution satellite images has been a worthwhile and fertile field that attracts plenty of researchers due to its importance in monitoring geographical areas. It can be used for several purposes such as disaster management, military surveillance, and agricultural monitoring. The main objective of this work is to classify low-resolution satellite images efficiently and accurately by using novel techniques of deep learning and reinforcement learning. The images include roads, residential areas, industrial areas, rivers, sea lakes, and vegetation. To achieve that goal, we carried out experiments on Sentinel-2 images, considering both high accuracy and efficient classification. Our proposed model achieved 91% accuracy on the testing dataset along with good land-cover classification. Focusing on per-class precision, we obtained 93% for river, 92% for residential, 97% for residential, 96% for forest, 87% for annual crop, 84% for herbaceous vegetation, 85% for pasture, 78% for highway, and 100% for sea lake.

Keywords: classification, deep learning, reinforcement learning, satellite imagery

Procedia PDF Downloads 58
5156 Machine Learning Approach for Mutation Testing

Authors: Michael Stewart

Abstract:

Mutation testing is a type of software testing proposed in the 1970s in which program statements are deliberately changed to introduce simple errors, so that test cases can be validated by determining whether they detect the errors. Test cases are executed against the mutant code to determine whether at least one fails and thereby detects the error, helping to ensure the program is correct. One major issue with this type of testing is that it becomes computationally intensive to generate and test all possible mutations for complex programs. This paper used reinforcement learning and parallel processing within the context of mutation testing for the selection of mutation operators and test cases, which reduced the computational cost of testing and improved test suite effectiveness. Experiments were conducted using sample programs to determine how well the reinforcement learning-based algorithm performed with one live mutation, multiple live mutations, and no live mutations. The experiments, measured by mutation score, were used to update the algorithm and improve the accuracy of its predictions. The performance was then evaluated on multi-processor computers. With reinforcement learning, the number of mutation operators utilized was reduced by 50-100%.
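
To make the operator-selection idea concrete, here is a minimal sketch in which an epsilon-greedy bandit learns which mutation operators tend to produce mutants that the test suite kills; the operator names and hidden kill probabilities are hypothetical, and the paper's actual algorithm and parallelization are not reproduced.

```python
import random

# Hedged sketch of RL-driven mutation-operator selection in the spirit of
# the abstract: an epsilon-greedy bandit learns which operators tend to
# produce mutants the test suite kills. Operator names and the hidden
# kill probabilities are hypothetical.
operators = ["negate_condition", "off_by_one", "swap_operands", "delete_statement"]
kill_prob = {"negate_condition": 0.7, "off_by_one": 0.5,
             "swap_operands": 0.3, "delete_statement": 0.2}  # unknown to the agent
value = {op: 0.0 for op in operators}
counts = {op: 0 for op in operators}
epsilon = 0.1

for trial in range(2000):
    if random.random() < epsilon:
        op = random.choice(operators)                  # explore
    else:
        op = max(operators, key=lambda o: value[o])    # exploit
    # Reward 1 if the generated mutant was killed by a test case.
    reward = 1.0 if random.random() < kill_prob[op] else 0.0
    counts[op] += 1
    value[op] += (reward - value[op]) / counts[op]     # incremental mean

print(sorted(value.items(), key=lambda kv: -kv[1]))    # best operators first
```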

Keywords: automated-testing, machine learning, mutation testing, parallel processing, reinforcement learning, software engineering, software testing

Procedia PDF Downloads 63
5155 Efficient Subgoal Discovery for Hierarchical Reinforcement Learning Using Local Computations

Authors: Adrian Millea

Abstract:

In hierarchical reinforcement learning, one of the main issues is the discovery of subgoal states or options (policies reaching subgoal states) by partitioning the environment in a meaningful way. This partitioning usually requires an expensive global clustering operation or an eigendecomposition of the graph Laplacian of the state space. We propose a local solution to this issue, much more efficient than algorithms using global information, which successfully discovers subgoal states by computing a simple function for each state, which we call heterogeneity, as a function of its neighbors. Moreover, we construct a value function using the difference in heterogeneity from one step to the next as the reward, such that we are able to explore the state space much more efficiently than, say, epsilon-greedy. The same principle can then be applied to higher levels of the hierarchy, where the states are now the subgoals discovered at the level below.
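
A minimal sketch of the local computation the abstract describes: each state is scored by how different its neighborhood is, and high-scoring boundary states become subgoal candidates, with no global clustering or eigendecomposition. The grid graph, the per-state descriptor, and this particular heterogeneity function are illustrative assumptions.

```python
import networkx as nx

# Hedged sketch of the local computation the abstract describes: score each
# state by how different its neighborhood is and flag high-scoring boundary
# states as subgoal candidates, with no global clustering. The grid graph,
# the per-state region label, and this particular heterogeneity function
# are illustrative assumptions.
G = nx.grid_2d_graph(6, 6)
region = {node: (0 if node[0] < 3 else 1) for node in G.nodes}  # stand-in descriptor

def heterogeneity(node):
    # Fraction of neighbors whose descriptor differs from this state's.
    neighbors = list(G.neighbors(node))
    return sum(region[n] != region[node] for n in neighbors) / len(neighbors)

# Boundary states between the two regions score highest: subgoal candidates.
candidates = sorted(G.nodes, key=heterogeneity, reverse=True)[:6]
print(candidates)
```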

Keywords: exploration, hierarchical reinforcement learning, locality, options, value functions

Procedia PDF Downloads 39
5154 Analysis of Q-Learning on Artificial Neural Networks for Robot Control Using Live Video Feed

Authors: Nihal Murali, Kunal Gupta, Surekha Bhanot

Abstract:

Training of artificial neural networks (ANNs) using reinforcement learning (RL) techniques is widely discussed in the robot learning literature. The high model complexity of ANNs, along with the model-free nature of RL algorithms, provides a desirable combination for many robotics applications. There is a huge need for algorithms that generalize from raw sensory inputs, such as vision, without any hand-engineered features or domain heuristics. In this paper, the standard control problem of a line-following robot was used as a test-bed, and an ANN controller for the robot was trained on images from a live video feed using Q-learning. A virtual agent was first trained in a simulation environment and then deployed onto the robot's hardware. The robot successfully learns to traverse a wide range of curves and displays excellent generalization ability. Qualitative analysis of the evolution of the policies, performance, and weights of the network provides insights into the nature and convergence of the learning algorithm.

Keywords: artificial neural networks, q-learning, reinforcement learning, robot learning

Procedia PDF Downloads 268
5153 Gaits Stability Analysis for a Pneumatic Quadruped Robot Using Reinforcement Learning

Authors: Soofiyan Atar, Adil Shaikh, Sahil Rajpurkar, Pragnesh Bhalala, Aniket Desai, Irfan Siddavatam

Abstract:

Deep reinforcement learning (deep RL) algorithms automate the design of complex controllers by mapping sensory inputs to low-level actions, eliminating complex robot dynamics modeling with minimal engineering. However, deep RL carries high risk when implemented directly in real-world scenarios and is highly sensitive to hyperparameters. Tuning hyperparameters on a pneumatic quadruped robot becomes very expensive through trial-and-error learning. This paper presents automated learning control for a pneumatic quadruped robot using sample-efficient deep Q-learning, enabling minimal tuning and very few trials to learn the neural network. Long training hours may degrade the pneumatic cylinders due to jerky actions originating from stochastic weights. We applied this method to the pneumatic quadruped robot, which resulted in a hopping gait. In our process, we eliminated the use of a simulator and acquired a stable gait. The approach evolves such that the resulting gait becomes more robust to stochastic changes in the environment. We further show that our algorithm performed very well compared to a programmed gait based on robot dynamics.

Keywords: model-based reinforcement learning, gait stability, supervised learning, pneumatic quadruped

Procedia PDF Downloads 87
5152 A Reinforcement Learning Approach for Evaluation of Real-Time Disaster Relief Demand and Network Condition

Authors: Ali Nadi, Ali Edrissi

Abstract:

Relief demand and transportation link availability are the essential information needed for every natural disaster operation, yet this information is not at hand once a disaster strikes. In related works, relief demand and network condition have been evaluated based on prediction methods. Nevertheless, predictions tend to be over- or under-estimated due to uncertainties and may lead to a failed operation. Therefore, in this paper a stochastic programming model is proposed to evaluate real-time relief demand and network condition at the onset of a natural disaster. To address the time sensitivity of the emergency response, the proposed model uses reinforcement learning to minimize the total relief assessment time. The proposed model is tested on a real-size network problem. The simulation results indicate that the proposed model performs well in collecting real-time information.

Keywords: disaster management, real-time demand, reinforcement learning, relief demand

Procedia PDF Downloads 189
5151 Intelligent Adaptive Learning in a Changing Environment

Authors: G. Valentis, Q. Berthelot

Abstract:

Nowadays, the trend to develop ever more intelligent and autonomous systems often takes its inspiration from living beings on Earth. Some simple, isolated systems are able, once brought together, to form a strong and reliable system. When trying to adapt this idea to man-made systems, it is not possible to include in their program everything the system may encounter during its life cycle. It is thus necessary to make the system able to take decisions based on other criteria, such as its past experience, i.e., to make the system learn on its own. However, at some point the acquired knowledge also depends on the environment. So the question is: if the system's environment is modified, how can the system respond to it quickly and appropriately enough? Here, starting from reinforcement learning to rate its decisions, and using adaptive learning algorithms for gain and loss of reward, the system is made able to respond to a changing environment and to adapt its knowledge as time passes. The approach is applied to a robot finding an exit in a labyrinth.
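
A minimal sketch of the tracking behavior such a system relies on: a constant step size lets a value estimate follow a reward signal that changes mid-run, unlike a sample average that freezes on old experience. The drift point and all constants are illustrative assumptions, not the authors' algorithm.

```python
import random

# Hedged sketch of value tracking in a changing environment: a constant step
# size lets the estimate follow a reward signal that drifts mid-run, unlike
# a sample average that freezes on old experience. The drift point and all
# constants are illustrative assumptions, not the authors' algorithm.
alpha = 0.2          # constant step size -> tracks a non-stationary reward
estimate = 0.0
true_mean = 1.0
for t in range(1000):
    if t == 500:
        true_mean = -1.0                       # the environment changes
    reward = true_mean + random.gauss(0, 0.1)
    estimate += alpha * (reward - estimate)    # exponential recency weighting
print(round(estimate, 2))                      # close to -1.0 after the change
```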

Keywords: reinforcement learning, neural network, autonomous systems, adaptive learning, changing environment

Procedia PDF Downloads 308
5150 Adhesion Performance According to Lateral Reinforcement Method of Textile

Authors: Jungbhin You, Taekyun Kim, Jongho Park, Sungnam Hong, Sun-Kyu Park

Abstract:

Reinforced concrete has been widely used in the construction field because of its excellent durability. However, corrosion of the reinforcing steel following damage to the concrete surface may reduce durability and safety. Recently, research on textile reinforcement has been ongoing to compensate for this weakness of reinforced concrete. Previous research performed experiments only on the longitudinal length. Therefore, in order to investigate the adhesion performance according to the lattice shape and the embedded length, pull-out tests were performed on the roving with the number of lateral reinforcements, the lateral reinforcement length, and the lateral reinforcement spacing as parameters. As a result, the number of lateral reinforcements and the lateral reinforcement length did not significantly affect the load variation associated with the adhesion performance; only the lateral reinforcement spacing had a significant effect on the load.

Keywords: adhesion performance, lateral reinforcement, pull-out test, textile

Procedia PDF Downloads 214
5149 Deep Reinforcement Learning Approach for Optimal Control of Industrial Smart Grids

Authors: Niklas Panten, Eberhard Abele

Abstract:

This paper presents a novel approach for real-time and near-optimal control of industrial smart grids by deep reinforcement learning (DRL). To achieve highly energy-efficient factory systems, the energetic linkage of machines, technical building equipment, and the building itself is desirable. However, the increased complexity of the interacting sub-systems, multiple time-variant target values, and stochastic influences from the production environment, weather, and energy markets make it difficult to efficiently control energy production, storage, and consumption in hybrid industrial smart grids. The studied deep reinforcement learning approach makes it possible to explore the solution space for proper control policies that minimize a cost function. The deep neural network of the DRL agent is based on a multilayer perceptron (MLP), Long Short-Term Memory (LSTM), and convolutional layers. The agent is trained within multiple Modelica-based factory simulation environments using the Advantage Actor-Critic (A2C) algorithm. The DRL controller is evaluated by means of the simulation and then compared to a conventional, rule-based approach. The results indicate that the DRL approach is able to improve the control performance and significantly reduce the energy and operating costs of industrial smart grids.
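
As an illustration of the hybrid network the abstract mentions, here is a minimal MLP + LSTM actor-critic module in the spirit of an A2C agent; layer sizes and the input layout are assumptions, not the authors' architecture, and the convolutional branch is omitted for brevity.

```python
import torch

# Hedged sketch of the hybrid network the abstract mentions: an MLP feeding
# an LSTM, with actor and critic heads as in A2C. Layer sizes and the input
# layout are assumptions, and the convolutional branch is omitted for brevity.
class GridPolicy(torch.nn.Module):
    def __init__(self, obs_dim: int = 24, hidden: int = 64, n_actions: int = 8):
        super().__init__()
        self.mlp = torch.nn.Sequential(torch.nn.Linear(obs_dim, hidden), torch.nn.ReLU())
        self.lstm = torch.nn.LSTM(hidden, hidden, batch_first=True)
        self.policy_head = torch.nn.Linear(hidden, n_actions)  # actor
        self.value_head = torch.nn.Linear(hidden, 1)           # critic

    def forward(self, obs_seq):
        h = self.mlp(obs_seq)        # (batch, time, hidden)
        h, _ = self.lstm(h)          # temporal context over the sequence
        last = h[:, -1]              # summary of the most recent state
        return self.policy_head(last), self.value_head(last)

logits, value = GridPolicy()(torch.randn(2, 12, 24))  # 2 sequences of 12 steps
```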

Keywords: industrial smart grids, energy efficiency, deep reinforcement learning, optimal control

Procedia PDF Downloads 96
5148 Modern Scotland Yard: Improving Surveillance Policies Using Adversarial Agent-Based Modelling and Reinforcement Learning

Authors: Olaf Visker, Arnout De Vries, Lambert Schomaker

Abstract:

Predictive policing refers to the usage of analytical techniques to identify potential criminal activity and has been widely implemented by various police departments. Being a relatively new area of research, there are, to the authors' knowledge, no tried-and-true methods, and existing approaches still exhibit a variety of potential problems. One of those problems is closely related to the lack of understanding of how acting on these predictions influences crime itself. The goal of law enforcement is ultimately crime reduction; as such, a policy needs to be established that best facilitates this goal. This research aims to find such a policy by using adversarial agent-based modeling in combination with modern reinforcement learning techniques. We present baseline models for both law enforcement and criminal agents and compare their performance to that of their respective reinforcement learning models. The experiments show that our smart law enforcement model is capable of reducing crime by making more deliberate choices regarding the locations of potential criminal activity. Furthermore, the smart criminal model exhibits behavior consistent with popular crime theories and outperforms the baseline model in terms of crimes committed and time to capture. It does, however, still suffer from the difficulties of capturing long-term rewards and learning how to handle multiple opposing goals.

Keywords: adversarial, agent based modelling, predictive policing, reinforcement learning

Procedia PDF Downloads 38
5147 A Novel Exploration/Exploitation Policy Accelerating Learning in Both Stationary and Non-Stationary Environment Navigation Tasks

Authors: Wiem Zemzem, Moncef Tagina

Abstract:

In this work, we address the problem of an autonomous mobile robot navigating in a large, unknown, and dynamic environment using reinforcement learning. This problem is principally related to the exploration/exploitation dilemma, especially the need to find a solution that lets the robot detect environmental change and also learn in order to adapt to the new form of the environment without discarding knowledge already acquired. Firstly, a new action selection strategy, called ε-greedy-MPA (the ε-greedy policy favoring the most promising actions), is proposed. Unlike existing exploration/exploitation policies (EEPs) such as ε-greedy and Boltzmann, the new EEP does not rely only on the information of the current state but also uses that of the eventual next states. Secondly, as the environment is large, an exploration bonus favoring least recently visited states is added to the proposed EEP in order to accelerate learning. Finally, various simulations with a ball-catching problem were conducted to evaluate the ε-greedy-MPA policy. The results of the simulated experiments show that combining this policy with the Q-learning method is more effective and efficient than the ε-greedy policy in stationary environments and than the utility-based reinforcement learning approach in non-stationary environments.
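
A minimal sketch of the core idea behind ε-greedy-MPA as the abstract presents it: with probability 1-ε, pick the action leading to the best-valued eventual next state rather than scoring the current state alone. The transition function, value table, and corridor task are illustrative assumptions.

```python
import random

# Hedged sketch of the core idea behind epsilon-greedy-MPA: with probability
# 1 - epsilon, choose the action leading to the best-valued eventual next
# state rather than scoring the current state alone. The transition function,
# value table, and corridor task are illustrative assumptions.
def select_action(state, actions, V, next_state_of, epsilon=0.1):
    if random.random() < epsilon:
        return random.choice(actions)              # explore
    # "Most promising action": the one whose next state has the highest value.
    return max(actions, key=lambda a: V.get(next_state_of(state, a), 0.0))

# Toy usage on a 1-D corridor where value increases toward the goal at 9.
V = {s: float(s) for s in range(10)}
next_state_of = lambda s, a: min(9, max(0, s + a))
action = select_action(3, [-1, +1], V, next_state_of)  # usually +1
```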

Keywords: autonomous mobile robot, exploration/exploitation policy, large dynamic environment, reinforcement learning

Procedia PDF Downloads 324
5146 Multiple Medical Landmark Detection on X-Ray Scan Using Reinforcement Learning

Authors: Vijaya Yuvaram Singh V M, Kameshwar Rao J V

Abstract:

The challenge in developing neural network-based methods for the medical domain is the availability of data. Anatomical landmark detection is the process of finding points of interest on a patient's x-ray scan. Most of the time this task is done manually by trained professionals, as it requires precision and domain knowledge. Traditionally, object detection-based methods are used for landmark detection. Here, we utilize reinforcement learning and a query-based method to train a single agent capable of detecting multiple landmarks. A deep Q-network agent is trained to detect single and multiple landmarks present on the hip and shoulder in x-ray scans of a patient. Training a single agent to find multiple landmarks makes the approach superior to having individual agents per landmark. For this initial study, five images of different patients were used as the environment, and the agent's performance was tested on two unseen images.

Keywords: reinforcement learning, medical landmark detection, multi target detection, deep neural network

Procedia PDF Downloads 37
5145 Lane-Change Path Planning of Autonomous Driving Using Model-Based Optimization, Deep Reinforcement Learning and 5G Vehicle-to-Vehicle Communications

Authors: William Li

Abstract:

Lane-change path planning is a crucial and yet complex task in autonomous driving. The traditional path planning approach, based on a system of carefully crafted rules to cover various driving scenarios, becomes unwieldy as more and more rules are added to deal with exceptions and corner cases. This paper proposes to divide the entire path planning into two stages. In the first stage, the ego vehicle travels longitudinally in the source lane to reach a safe state. In the second stage, the ego vehicle makes the lateral lane-change maneuver to the target lane. The paper derives the safe-state conditions based on the lateral lane-change maneuver calculation to ensure the second stage is collision-free. To determine the acceleration sequence that minimizes the time to reach a safe state in the first stage, the paper proposes three schemes: kinetic model-based optimization, deep reinforcement learning, and 5G vehicle-to-vehicle (V2V) communications. The paper investigates these schemes via simulation. The model-based optimization is sensitive to the model assumptions; the deep reinforcement learning is more flexible in handling scenarios beyond the assumptions of the optimization model; and 5G V2V eliminates uncertainty in predicting the future behavior of surrounding vehicles by sharing driving intents and enabling cooperative driving.
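
A minimal sketch of a first-stage safe-state check under the simplifying assumption of constant speeds: the lateral maneuver may begin only if the projected gap to the lead vehicle in the target lane stays above a margin for the whole maneuver. All parameter values are illustrative, not the paper's derived conditions.

```python
# Hedged sketch of a first-stage safe-state check under the simplifying
# assumption of constant speeds: the lateral maneuver may begin only if the
# projected gap to the lead vehicle in the target lane stays above a margin
# for the whole maneuver duration. All values are illustrative, not the
# paper's derived conditions.
def is_safe_state(gap0, v_ego, v_lead, t_maneuver=3.0, margin=5.0, dt=0.1):
    t = 0.0
    while t <= t_maneuver:
        gap = gap0 + (v_lead - v_ego) * t   # gap shrinks if ego is faster
        if gap < margin:
            return False                    # keep adjusting speed in stage one
        t += dt
    return True                             # safe to start the lane change

print(is_safe_state(gap0=30.0, v_ego=20.0, v_lead=18.0))  # True for this gap
```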

Keywords: lane change, path planning, autonomous driving, deep reinforcement learning, 5G, V2V communications, connected vehicles

Procedia PDF Downloads 28
5144 Reinforcement Learning for Robust Missile Autopilot Design: TRPO Enhanced by Schedule Experience Replay

Authors: Bernardo Cortez, Florian Peter, Thomas Lausenhammer, Paulo Oliveira

Abstract:

Designing missile autopilot controllers has been a complex task, given the extensive flight envelope and the nonlinear flight dynamics. A solution that excels both in nominal performance and in robustness to uncertainties is still to be found. While control theory often resorts to parameter-scheduling procedures, reinforcement learning has presented interesting results in ever more complex tasks, from video games to robotic tasks with continuous action domains. However, it still lacks clear insights on how to find adequate reward functions and exploration strategies. To the best of our knowledge, this work is a pioneer in proposing reinforcement learning as a framework for flight control. It aims at training a model-free agent that can control the longitudinal nonlinear flight dynamics of a missile, achieving the target performance and robustness to uncertainties. To that end, under TRPO's methodology, the collected experience is augmented according to HER, stored in a replay buffer, and sampled according to its significance. Not only does this work enhance the concept of prioritized experience replay into BPER, but it also reformulates HER, activating both only when the training progress converges to suboptimal policies, in what is proposed as the SER methodology. The results show that it is possible both to achieve the target performance and to improve the agent's robustness to uncertainties (with little damage to nominal performance) by further training it in non-nominal environments, thereby validating the proposed approach and encouraging future research in this field.
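
For readers unfamiliar with significance-based sampling, the sketch below shows plain proportional prioritized replay, the concept the abstract's BPER builds on; the paper's BPER and SER refinements themselves are not reproduced, and all names here are illustrative.

```python
import random

# Hedged sketch of significance-based replay sampling, the concept the
# abstract's BPER builds on (plain proportional prioritization here; the
# paper's BPER and SER refinements are not reproduced).
class PrioritizedReplay:
    def __init__(self):
        self.items, self.priorities = [], []

    def add(self, transition, priority):
        self.items.append(transition)
        self.priorities.append(max(priority, 1e-6))  # keep weights positive

    def sample(self, k):
        total = sum(self.priorities)
        weights = [p / total for p in self.priorities]
        return random.choices(self.items, weights=weights, k=k)

buffer = PrioritizedReplay()
for i in range(100):
    # Transition tuples (s, a, r, s') with a made-up significance score.
    buffer.add(("s%d" % i, "a", 0.0, "s%d" % (i + 1)), priority=random.random())
batch = buffer.sample(8)  # high-significance transitions are sampled more often
```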

Keywords: reinforcement learning, flight control, HER, missile autopilot, TRPO

Procedia PDF Downloads 86
5143 The Effect of Geogrid Reinforcement Pre-Stressing on the Performance of Sand Bed Supporting a Strip Foundation

Authors: Ahmed M. Eltohamy

Abstract:

In this paper, an experimental and numerical study was conducted to investigate the effect of pre-stressing geogrid soil reinforcement on the pressure-settlement relation of a sand bed supporting a strip foundation. The studied parameters include foundation depth and pre-stress ratio for the cases of one and two pre-stressed reinforcement layers. The study showed that pre-stressing the soil reinforcement resulted in a marked enhancement in the stiffness of the reinforced bed soil compared to reinforced soil without pre-stress. The benefit of pre-stressing the reinforcement increased as the overburden pressure and pre-straining ratio increased. Pre-stressing the topmost layers of double reinforcement resulted in further enhancement of the stress-strain relation of the bed soil.

Keywords: geogrid reinforcement, prestress, strip footing, bearing capacity

Procedia PDF Downloads 196
5142 Examination of the Reinforcement Forces Generated in Pseudo-Static and Dynamic Status in Retaining Walls

Authors: K. Passbakhsh

Abstract:

Determination of reinforcement forces is one of the most important topics in designing retaining walls; determining these forces accurately avoids overly conservative design. Reinforcement forces can be calculated by numerically modeling reinforced soil retaining walls under dynamic loading. In this study, we compare the forces obtained by the pseudo-static method according to the FHWA code with the forces obtained from numerical modeling by the finite element method, selecting seismic horizontal coefficients for different wall heights. PLAXIS software was used for the numerical analysis. The effect of reinforcement stiffness and soil type on the reinforcement forces is then examined.

Keywords: reinforced soil, PLAXIS, reinforcement forces, retaining walls

Procedia PDF Downloads 259
5141 Learning to Translate by Learning to Communicate to an Entailment Classifier

Authors: Szymon Rutkowski, Tomasz Korbak

Abstract:

We present a reinforcement-learning-based method for training neural machine translation models without parallel corpora. The standard encoder-decoder approach to machine translation suffers from two problems we aim to address. First, it needs parallel corpora, which are scarce, especially for low-resource languages. Second, it lacks the psychological plausibility of a learning procedure: learning a foreign language is about learning to communicate useful information, not merely learning to transduce from one language's 'encoding' to another. We instead pose the problem of learning to translate as learning a policy in a communication game between two agents: the translator and the classifier. The classifier is trained beforehand on a natural language inference task (determining the entailment relation between a premise and a hypothesis) in the target language. The translator produces a sequence of actions that correspond to generating translations of both the hypothesis and the premise, which are then passed to the classifier. The translator is rewarded for the classifier's performance on determining entailment between the sentences translated into the classifier's native language. The translator's performance thus reflects its ability to communicate useful information to the classifier. In effect, we train a machine translation model without the need for parallel corpora altogether. While similar reinforcement learning formulations for zero-shot translation have been proposed before, there are a number of improvements we introduce. While prior research aimed at grounding the translation task in the physical world by evaluating agents on an image captioning task, we found that using a linguistic task is more sample-efficient. Natural language inference (also known as recognizing textual entailment) captures semantic properties of sentence pairs that are poorly correlated with semantic similarity, thus enforcing a basic understanding of the role played by compositionality. It has been shown that models trained to recognize textual entailment produce high-quality general-purpose sentence embeddings transferable to other tasks. We use the Stanford Natural Language Inference (SNLI) dataset as well as its analogous datasets for French (XNLI) and Polish (CDSCorpus). Textual entailment corpora can be obtained relatively easily for any language, which makes our approach more extensible to low-resource languages than traditional approaches based on parallel corpora. We evaluated a number of reinforcement learning algorithms (including policy gradients and actor-critic) to solve the problem of the translator's policy optimization and found that our attempts yield promising improvements over previous approaches to reinforcement-learning-based zero-shot machine translation.
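
A minimal sketch of the training signal in such a communication game, with tiny stand-in networks far smaller than any real translation model: the translator samples an output, a frozen entailment classifier judges it, and the classifier's correctness becomes a REINFORCE reward.

```python
import torch

# Hedged sketch of the communication-game training signal, with tiny stand-in
# networks (not the authors' models): the translator samples an output, a
# frozen entailment classifier judges it, and the classifier's correctness
# becomes the REINFORCE reward.
vocab, hidden, n_labels = 100, 32, 3
translator = torch.nn.Linear(hidden, vocab)      # stand-in translation policy
classifier = torch.nn.Linear(hidden, n_labels)   # frozen, pre-trained on NLI
for p in classifier.parameters():
    p.requires_grad_(False)
opt = torch.optim.Adam(translator.parameters(), lr=1e-3)

def encode(token_ids):
    # Stand-in sentence encoder mapping sampled tokens to a fixed-size vector.
    return torch.randn(token_ids.shape[0], hidden)

src = torch.randn(8, hidden)                     # encoded premise/hypothesis pairs
gold = torch.randint(0, n_labels, (8,))          # entailment labels

logits = translator(src)                         # distribution over output tokens
dist = torch.distributions.Categorical(logits=logits)
tokens = dist.sample()                           # sampled "translation"
reward = (classifier(encode(tokens)).argmax(-1) == gold).float()
# REINFORCE: reinforce sampled outputs in proportion to their advantage.
loss = -(dist.log_prob(tokens) * (reward - reward.mean())).mean()
opt.zero_grad(); loss.backward(); opt.step()
```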

Keywords: agent-based language learning, low-resource translation, natural language inference, neural machine translation, reinforcement learning

Procedia PDF Downloads 38
5140 Trajectory Design and Power Allocation for Energy-Efficient UAV Communication Based on Deep Reinforcement Learning

Authors: Yuling Cui, Danhao Deng, Chaowei Wang, Weidong Wang

Abstract:

In recent years, unmanned aerial vehicles (UAVs) have been widely used in wireless communication, attracting more and more attention from researchers. UAVs can not only serve as relays for auxiliary communication but also as aerial base stations for ground users (GUs). However, limited energy means that they cannot work all the time and can only cover a limited service range. In this paper, we investigate 2D UAV trajectory design and power allocation in order to maximize the UAV's service time and downlink throughput. Based on deep reinforcement learning, we propose a deep deterministic policy gradient algorithm for trajectory design and power allocation (TDPA-DDPG) to solve the energy-efficiency and communication service quality problem. The simulation results show that TDPA-DDPG extends the service time of the UAV as much as possible, improves the communication service quality, and maximizes the downlink throughput, a significant improvement over existing methods.

Keywords: UAV trajectory design, power allocation, energy efficient, downlink throughput, deep reinforcement learning, DDPG

Procedia PDF Downloads 28
5139 Obstacle Avoidance Using Image-Based Visual Servoing Based on Deep Reinforcement Learning

Authors: Tong He, Long Chen, Irag Mantegh, Wen-Fang Xie

Abstract:

This paper proposes an image-based obstacle avoidance and tracking-target identification strategy for an Unmanned Aerial Vehicle (UAV) in GPS-degraded or GPS-denied environments. The traditional force algorithm for obstacle avoidance can produce local minima, in which the UAV cannot move away from the obstacle effectively. In order to eliminate this, an artificial potential approach based on harmonic potentials is proposed to guide the UAV around the obstacle using the vision system, and an image-based visual servoing (IBVS) scheme has been adopted to implement the proposed obstacle avoidance approach. In IBVS, pixel accuracy is a key factor in realizing obstacle avoidance. In this paper, a deep reinforcement learning framework is applied to reduce pixel errors through constant interaction between the environment and the agent. In addition, the combination of OpenTLD and TensorFlow based on a neural network is used to identify the type of tracking target. Numerical simulations in MATLAB and ROS Gazebo show satisfactory results in target identification and obstacle avoidance.

Keywords: image-based visual servoing, obstacle avoidance, tracking target identification, deep reinforcement learning, artificial potential approach, neural network

Procedia PDF Downloads 34
5138 Off-Policy Q-learning Technique for Intrusion Response in Network Security

Authors: Zheni S. Stefanova, Kandethody M. Ramachandran

Abstract:

With the increasing dependency on our computer devices, we face the necessity of adequate, efficient, and effective mechanisms for protecting our networks. There are two main problems that Intrusion Detection Systems (IDS) attempt to solve: 1) detecting the attack, by analyzing the incoming traffic and inspecting the network (intrusion detection), and 2) producing a prompt response when the attack occurs (intrusion prevention). It is critical to create an intrusion detection model that detects a breach in the system on time, and it is equally challenging to make it provide an automatic response, with an acceptable delay, at every single stage of the monitoring process. We cannot afford to adopt security measures with high computational demands, nor can we accept a mechanism that reacts with a delay. In this paper, we propose an intrusion response mechanism that is based on artificial intelligence, and more precisely, reinforcement learning techniques (RLT). The RLT helps us create a decision agent that controls the process of interacting with the undetermined environment. The goal is to find an optimal policy, which will represent the intrusion response; therefore, we solve the reinforcement learning problem using a Q-learning approach. Our agent produces an optimal immediate response in the process of evaluating the network traffic. This Q-learning approach establishes the balance between exploration and exploitation and provides a unique, self-learning, and strategic artificial intelligence response mechanism for IDS.

Keywords: cyber security, intrusion prevention, optimal policy, Q-learning

Procedia PDF Downloads 132
5137 Comparative Analysis of Reinforcement Learning Algorithms for Autonomous Driving

Authors: Migena Mana, Ahmed Khalid Syed, Abdul Malik, Nikhil Cherian

Abstract:

In recent years, advancements in deep learning have enabled researchers to tackle the problem of self-driving cars. Car companies use huge datasets to train their deep learning models to make autonomous cars a reality. However, this approach has certain drawbacks: the state space of possible actions for a car is so huge that there cannot be a dataset for every possible road scenario. To overcome this problem, the concept of reinforcement learning (RL) is investigated in this research. Since the problem of autonomous driving can be modeled in a simulation, it lends itself naturally to the domain of reinforcement learning. The advantage of this approach is that we can model different and complex road scenarios in a simulation without having to deploy in the real world. The autonomous agent can learn to drive by finding the optimal policy, and the learned model can then be easily deployed in a real-world setting. In this project, we focus on three RL algorithms: Q-learning, Deep Deterministic Policy Gradient (DDPG), and Proximal Policy Optimization (PPO). To model the environment, we have used TORCS (The Open Racing Car Simulator), which provides a strong foundation to test our models. The inputs to the algorithms are the sensor data provided by the simulator, such as velocity and distance from the side pavement. The outcome of this research project is a comparative analysis of these algorithms. Based on the comparison, the PPO algorithm gives the best results: the reward is greater, and the acceleration, steering angle, and braking are more stable than with the other algorithms, which means that the agent learns to drive in a better and more efficient way in this case. Additionally, we have compiled a dataset from the training of the agent with the DDPG and PPO algorithms. It contains all the steps of the agent during one full training run in the form (all input values, acceleration, steering angle, brake, loss, reward). This study can serve as a base for further complex road scenarios. Furthermore, it can be extended to the field of computer vision, using images to find the best policy.
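
As a pointer to what sets PPO apart in this comparison, here is the standard clipped surrogate objective in a minimal form; the batch tensors are placeholders for data collected from a simulator such as TORCS.

```python
import torch

# Hedged sketch of PPO's clipped surrogate objective, the piece that sets it
# apart from DDPG and plain Q-learning in this comparison. The batch tensors
# are placeholders for data collected from a simulator such as TORCS.
def ppo_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    ratio = torch.exp(new_log_probs - old_log_probs)      # policy change per sample
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    # Pessimistic bound: taking the minimum penalizes overly large policy steps.
    return -torch.min(unclipped, clipped).mean()

new_lp = torch.randn(64, requires_grad=True)   # log-probs under the current policy
old_lp = new_lp.detach() + 0.05 * torch.randn(64)
advantages = torch.randn(64)
loss = ppo_loss(new_lp, old_lp, advantages)
loss.backward()
```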

Keywords: autonomous driving, DDPG (deep deterministic policy gradient), PPO (proximal policy optimization), reinforcement learning

Procedia PDF Downloads 48
5136 Nonlinear Finite Element Modeling of Unbonded Steel Reinforced Concrete Beams

Authors: Fares Jnaid, Riyad Aboutaha

Abstract:

In this paper, a nonlinear Finite Element Analysis (FEA) was carried out using ANSYS software to build a model capable of predicting the behavior of Reinforced Concrete (RC) beams with unbonded reinforcement. The FEA model was compared to existing experimental data from other researchers, consisting of 16 beams that varied from structurally sound beams to beams with unbonded reinforcement of different unbonded lengths and reinforcement ratios. The model was able to predict the ultimate flexural strength, load-deflection curve, and crack pattern of concrete beams with unbonded reinforcement. It was concluded that when the unbonded length is less than 45% of the span, there will be no decrease in ultimate flexural strength due to the loss of bond between the steel reinforcement and the surrounding concrete, regardless of the reinforcement ratio. Moreover, when the reinforcement ratio is relatively low, there will be no decrease in ultimate flexural strength regardless of the unbonded length.

Keywords: FEA, ANSYS, unbond, strain

Procedia PDF Downloads 155
5135 The Student Care: The Influence of Family Attention on Junior High School Students' Physics Learning Achievements

Authors: Siti Rossidatul Munawaroh, Siti Khusnul Khowatim

Abstract:

This study sets out to determine how family attention influences the learning of junior high school students. Students' learning motivation can be increased in various ways, one of which is through social guidance in their relations with the family. The family not only provides the material means and the learning time but also supervises that time and guides the children in overcoming learning difficulties. The character of the physics subject in the science curriculum of junior high schools demands that students think symbolically and understand material in a meaningful manner. Therefore, reinforcement of the motivation to learn physics is clearly necessary, not only from the school but also from the family environment and society. The role of the family includes maintenance, parenting, coaching, and educating, both physically and spiritually, and is expected to give students encouragement in studying physics and so increase their learning achievements.

Keywords: physics subject, influence of family attention, learning motivation, student care

Procedia PDF Downloads 310
5134 Deep Reinforcement Learning Approach for Trading Automation in the Stock Market

Authors: Taylan Kabbani, Ekrem Duman

Abstract:

The design of adaptive systems that take advantage of financial markets while reducing the risk can bring otherwise stagnant wealth into the global market. However, most efforts made to generate successful deals in trading financial assets rely on Supervised Learning (SL), which suffers from various limitations. Deep Reinforcement Learning (DRL) addresses these drawbacks of SL approaches by combining the financial asset price "prediction" step and the portfolio "allocation" step in one unified process, producing fully autonomous systems capable of interacting with their environment to make optimal decisions through trial and error. In this paper, a continuous action space approach is adopted to give the trading agent the ability to gradually adjust the portfolio's positions at each time step (dynamically re-allocating investments), resulting in better agent-environment interaction and faster convergence of the learning process. In addition, the approach supports managing a portfolio with several assets instead of a single one. This work presents a novel DRL model to generate profitable trades in the stock market, effectively overcoming the limitations of supervised learning approaches. We formulate the trading problem, i.e., the agent-environment interaction, as a Partially Observed Markov Decision Process (POMDP), considering the constraints imposed by the stock market, such as liquidity and transaction costs. More specifically, we design an environment that simulates the real-world trading process by augmenting the state representation with ten different technical indicators and sentiment analysis of news articles for each stock. We then solve the formulated POMDP problem using the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, which can learn policies in high-dimensional and continuous action spaces like those typically found in the stock market environment. From the point of view of stock market forecasting and intelligent decision-making mechanisms, this paper demonstrates the superiority of deep reinforcement learning in financial markets over other types of machine learning, such as supervised learning, and establishes its credibility and advantages for strategic decision-making.
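
A minimal sketch of the augmented state representation the abstract describes, assuming an illustrative layout: the per-stock observation concatenates the latest price, the current position, ten technical indicators, and a news-sentiment score before being fed to the TD3 actor. The exact indicator set and sentiment source are assumptions, not the authors' specification.

```python
import numpy as np

# Hedged sketch of the augmented per-stock state the abstract describes,
# assuming an illustrative layout: latest price, current position, ten
# technical indicators, and a news-sentiment score. The exact indicator set
# and sentiment source are assumptions, not the authors' specification.
def build_state(prices, position, indicators, sentiment):
    assert len(indicators) == 10       # e.g. RSI, MACD, moving averages, ...
    return np.concatenate(([prices[-1], position], indicators, [sentiment]))

state = build_state(prices=np.array([101.2, 102.5, 103.1]),
                    position=0.25,               # fraction of capital held
                    indicators=np.random.rand(10),
                    sentiment=0.6)               # aggregated news score
print(state.shape)  # (13,) -- fed to the TD3 actor as a continuous observation
```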

Keywords: the stock market, deep reinforcement learning, MDP, twin delayed deep deterministic policy gradient, sentiment analysis, technical indicators, autonomous agent

Procedia PDF Downloads 23
5133 Development of AA2024 Matrix Composites Reinforced with Micro Yttrium through Cold Compaction with Superior Mechanical Properties

Authors: C. H. S. Vidyasagar, D. B. Karunakar

Abstract:

In the present work, five composite samples with AA2024 as the matrix and varying amounts of yttrium (0.1-0.5 wt.%) as reinforcement were developed through cold compaction. The microstructures of the developed composite samples revealed that the yttrium reinforcement caused grain refinement up to 0.3 wt.%, beyond which the refinement is no longer effective. The microstructure revealed Al2Cu precipitation, which strengthened the composite up to 0.3 wt.% yttrium reinforcement. Upon further increase in yttrium reinforcement, the intermetallics and the precipitates coarsen and their corresponding strengthening effect decreases. The mechanical characterization revealed that the composite sample reinforced with 0.3 wt.% yttrium showed the highest mechanical properties: a hardness of 82 HV, an Ultimate Tensile Strength (UTS) of 276 MPa, a Yield Strength (YS) of 229 MPa, and an elongation (EL) of 18.9%. However, the relative density of the developed composites decreased with increasing yttrium reinforcement.

Keywords: mechanical properties, AA 2024 matrix, yttrium reinforcement, cold compaction, precipitation

Procedia PDF Downloads 42
5132 Experimental and Analytical Study to Investigate the Effect of Tension Reinforcement on Behavior of Reinforced Concrete Short Beams

Authors: Hakan Ozturk, Aydin Demir, Kemal Edip, Marta Stojmanovska, Julijana Bojadjieva

Abstract:

There are many factors that affect the behavior of reinforced concrete beams: concrete compressive strength and reinforcement yield strength, the amount of tension, compression, and confinement bars, and the strain hardening of the reinforcement. In this study, the support condition of the short beams is selected as statically indeterminate to the first degree. Experimental and numerical analyses are carried out for reinforced concrete (RC) short beams. The cross-section dimensions are selected as 250 mm width and 500 mm height, and the length of the RC short beams is 2250 mm; these values are constant for all beams. After accurately verifying the finite element model, a numerical parametric study is performed with varied diameters of tension reinforcement, and the effect of the change in diameter on the behavior of RC short beams is investigated. As a result of the study, ductility ratios and failure modes are determined, and load-displacement graphs are obtained in order to understand the behavior of the short beams. It is deduced that the diameter of the tension reinforcement plays a very important role in the behavior of RC short beams in terms of ductility and brittleness.

Keywords: short beam, reinforced concrete, finite element analysis, longitudinal reinforcement

Procedia PDF Downloads 124
5131 Deep Reinforcement Learning-Based Computation Offloading for 5G Vehicle-Aware Multi-Access Edge Computing Network

Authors: Ziying Wu, Danfeng Yan

Abstract:

Multi-Access Edge Computing (MEC) is one of the key technologies of the future 5G network. By deploying edge computing centers at the edge of the wireless access network, computation tasks can be offloaded to edge servers rather than to a remote cloud server to meet the requirements of 5G low-latency and high-reliability application scenarios. Meanwhile, with the development of Internet of Vehicles (IoV) technology, various delay-sensitive and compute-intensive in-vehicle applications continue to appear. Compared with traditional internet business, these computation tasks have higher processing priority and lower delay requirements. In this paper, we design a 5G-based Vehicle-Aware Multi-Access Edge Computing Network (VAMECN) and pose a joint optimization problem of minimizing the total system cost. In view of this problem, a deep reinforcement learning-based joint computation offloading and task migration optimization (JCOTM) algorithm is proposed, considering the influence of multiple factors such as concurrent computation tasks, the distribution of system computing resources, and network communication bandwidth. The mixed-integer nonlinear programming problem is described as a Markov Decision Process. Experiments show that our proposed algorithm can effectively reduce task processing delay and equipment energy consumption, optimize the computation offloading and resource allocation scheme, and improve system resource utilization, compared with other computation offloading policies.
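
A minimal sketch of the trade-off behind the offloading decision, with illustrative cost terms standing in for the paper's delay/energy objective: offloading pays a transmission delay but may compute faster than the local CPU. The JCOTM algorithm learns this decision over full episodes rather than greedily.

```python
import numpy as np

# Hedged sketch of the trade-off behind the offloading decision, with
# illustrative cost terms standing in for the paper's delay/energy objective:
# offloading pays a transmission delay but may compute faster than the local
# CPU. The JCOTM algorithm learns this decision over full episodes instead
# of greedily.
def system_cost(task_size, bandwidth, server_load, offload_to_edge):
    if offload_to_edge:
        transmit_delay = task_size / bandwidth
        compute_delay = task_size / (1.0 - server_load + 1e-6)  # busier = slower
        return transmit_delay + compute_delay
    return task_size / 0.3               # slower local CPU, no transmission

costs = [system_cost(5.0, bandwidth=10.0, server_load=0.4, offload_to_edge=flag)
         for flag in (False, True)]
action = int(np.argmin(costs))           # 0 = run locally, 1 = offload to edge
```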

Keywords: multi-access edge computing, computation offloading, 5th generation, vehicle-aware, deep reinforcement learning, deep q-network

Procedia PDF Downloads 31
5130 Risk Assessment of Reinforcement System on Fractured Rock Mass, Gate Shaft Project, Jatigede Dam, Sumedang, West Java, Indonesia

Authors: A. Ardianto, M. A. Putera Agung, S. Pramusandi

Abstract:

The power waterway is one of the dam structures and functions as a vertical intake tunnel or well for the hydroelectric power plant in the Jatigede area, Sumedang, West Java. The gate shaft is one part of the power waterway system. This paper concerns the considerations involved in determining a critical state parameter in the back-analysis of gate shaft or excavation wall stability during excavation. The analysis was carried out both without and with a reinforcement system. The results show that shaft reinforcement could reduce the total displacement and increase the safety factor significantly. Based on the back-calculation results, it is recommended to install reinforcement materials and a drainage system to reduce pore water pressure.

Keywords: power waterway, reinforcement, displacement, safety

Procedia PDF Downloads 332