Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 7654

Search results for: multi-agent reinforcement learning

7654 Deep Reinforcement Learning Model for Autonomous Driving

Abstract:

The development of intelligent transportation systems (ITS) and artificial intelligence (AI) is paving the way for the widespread adoption of autonomous vehicles (AVs). This again opens opportunities for smart roads, smart traffic safety, and mobility comfort. A highly intelligent decision-making system is essential for autonomous driving around dense, dynamic objects. It must be able to handle complex road geometry and topology, as well as complex multiagent interactions, and closely follow higher-level commands such as routing information. Autonomous vehicles have become a very active research topic in recent years due to their significant potential to reduce traffic accidents and personal injuries. New artificial intelligence-based technologies handle important functions in scene understanding, motion planning, decision making, vehicle control, social behavior, and communication for AVs. This paper focuses only on deep reinforcement learning (DRL)-based methods, because reinforcement learning (RL) has become a powerful learning framework capable of learning complex policies in high-dimensional environments; it does not cover traditional (flat) planar techniques, which have been the subject of extensive research in the past. The DRL algorithms used so far have found solutions to the four main problems of autonomous driving; in our paper, we highlight the challenges and point to possible future research directions.

Keywords: deep reinforcement learning, autonomous driving, deep deterministic policy gradient, deep Q-learning

Procedia PDF Downloads 77

7653 Metareasoning Image Optimization Q-Learning

Authors: Mahasa Zahirnia

Abstract:

The purpose of this paper is to explore new and effective ways of optimizing satellite images using artificial intelligence, and the process of implementing reinforcement learning to enhance the quality of data captured within the image. In our implementation of Bellman's Reinforcement Learning equations, associated state diagrams, and multi-stage image processing, we were able to enhance image quality and to detect and define objects. Reinforcement learning is the differentiator in the area of artificial intelligence, and Q-Learning relies on trial and error to achieve its goals. The reward system that is embedded in Q-Learning allows the agent to self-evaluate its performance and decide on the best possible course of action based on the current and future environment. Results show that within a simulated environment, built on commercially available images, the rate of detection was 40-90%. Reinforcement learning through the Q-Learning algorithm is not just a desired but a required design criterion for image optimization and enhancement. The proposed methods are a cost-effective way of resolving uncertainty in the data, because reinforcement learning finds ideal policies to manage the process using a smaller sample of images.

Keywords: Q-learning, image optimization, reinforcement learning, Markov decision process

Procedia PDF Downloads 209
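
The abstract of entry 7653 refers to Bellman's equations and to the trial-and-error reward loop of Q-Learning. A minimal sketch of the standard tabular update is given below for orientation. It is not the author's implementation: the state and action names, the learning rate `alpha`, and the discount factor `gamma` are illustrative assumptions.

```python
import random
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical toy problem: two states, two actions, all values start at 0.
q_table = defaultdict(float)
toy_actions = ["left", "right"]
new_value = q_learning_update(q_table, "s0", "right", reward=1.0,
                              next_state="s1", actions=toy_actions)
greedy_action = epsilon_greedy(q_table, "s0", toy_actions,
                               epsilon=0.0, rng=random.Random(0))
```

With every value initialised to zero, a single reward of 1.0 moves the entry for ("s0", "right") a fraction `alpha` of the way towards the target, which is the self-evaluation step the abstract describes.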

7652 Q-Learning of Bee-Like Robots Through Obstacle Avoidance

Authors: Jawairia Rasheed

Abstract:

Modern robots are often used for search and rescue purposes. One of the key areas of interest in such cases is learning complex environments, and one of the key methodologies for robots in such cases is reinforcement learning. In reinforcement learning, robots learn a path to the goal while avoiding obstacles. Q-learning, one of the major advances in reinforcement learning, is used to make the robots learn the path. Robots learn by interacting with the environment to reach the goal. In this paper, a simulation model of bee-like robots is implemented in NetLogo. At the start the learning rate was low, and it increased over time. The bees successfully learned to reach the goal while avoiding obstacles through the Q-learning technique.

Keywords: learning of bee-like robots for reaching the goal, reinforcement learning for randomly placed obstacles, obstacle avoidance through Q-learning, Q-learning for obstacle avoidance

Procedia PDF Downloads 91

7651 Decoding the Structure of Multi-Agent System Communication: A Comparative Analysis of Protocols and Paradigms

Authors: Gulshad Azatova, Aleksandr Kapitonov, Natig Aminov

Abstract:

Multiagent systems have gained significant attention in various fields, such as robotics, autonomous vehicles, and distributed computing, where multiple agents cooperate and communicate to achieve complex tasks. Efficient communication among agents is a crucial aspect of these systems, as it directly impacts their overall performance and scalability. This scholarly work provides an exploration of essential communication elements and conducts a comparative assessment of diverse protocols utilized in multiagent systems. The emphasis lies in scrutinizing the strengths, weaknesses, and applicability of these protocols across various scenarios. The research also sheds light on emerging trends within communication protocols for multiagent systems, including the incorporation of machine learning methods and the adoption of blockchain-based solutions to ensure secure communication. These trends provide valuable insights into the evolving landscape of multiagent systems and their communication protocols.

Keywords: communication, multi-agent systems, protocols, consensus

Procedia PDF Downloads 62

7650 The AI Arena: A Framework for Distributed Multi-Agent Reinforcement Learning

Authors: Edward W. Staley, Corban G. Rivera, Ashley J. Llorens

Abstract:

Advances in reinforcement learning (RL) have resulted in recent breakthroughs in the application of artificial intelligence (AI) across many different domains. An emerging landscape of development environments is making powerful RL techniques more accessible for a growing community of researchers. However, most existing frameworks do not directly address the problem of learning in complex operating environments, such as dense urban settings or defense-related scenarios, that incorporate distributed, heterogeneous teams of agents. To help enable AI research for this important class of applications, we introduce the AI Arena: a scalable framework with flexible abstractions for distributed multi-agent reinforcement learning. The AI Arena extends the OpenAI Gym interface to allow greater flexibility in learning control policies across multiple agents with heterogeneous learning strategies and localized views of the environment. To illustrate the utility of our framework, we present experimental results that demonstrate performance gains due to a distributed multi-agent learning approach over commonly-used RL techniques in several different learning environments.

Keywords: reinforcement learning, multi-agent, deep learning, artificial intelligence

Procedia PDF Downloads 152

7649 Curriculum-Based Multi-Agent Reinforcement Learning for Robotic Navigation

Authors: Hyeongbok Kim, Lingling Zhao, Xiaohong Su

Abstract:

Deep reinforcement learning has been applied to address various problems in robotics, such as autonomous driving and unmanned aerial vehicles. However, because of the sparse reward penalty for a collision with obstacles during the navigation mission, the agent fails to learn the optimal policy or requires a long time for convergence. Therefore, in this paper, we present a curriculum-based boost learning method that uses obstacles and enemy agents to effectively train compound skills during multi-agent reinforcement learning. First, to enable the agents to solve challenging tasks, we gradually increased learning difficulties by adjusting reward shaping instead of constructing different learning environments. Then, in a benchmark environment with static obstacles and moving enemy agents, the experimental results showed that the proposed curriculum learning strategy enhanced cooperative navigation and compound collision avoidance skills in uncertain environments while improving learning efficiency.

Keywords: curriculum learning, hard exploration, multi-agent reinforcement learning, robotic navigation, sparse reward

Procedia PDF Downloads 87

7648 Deep Reinforcement Learning with Leonard-Ornstein Processes Based Recommender System

Authors: Khalil Bachiri, Ali Yahyaouy, Nicoleta Rogovschi

Abstract:

Improved user experience is a goal of contemporary recommender systems. Recommender systems are starting to incorporate reinforcement learning since it easily satisfies this goal of increasing a user’s reward every session. In this paper, we examine the most effective reinforcement learning agent tactics on the MovieLens (1M) dataset, balancing precision and variety of recommendations. The absence of variability in final predictions makes simplistic techniques, although able to optimize ranking quality criteria, worthless for consumers of the recommendation system. Utilizing the stochasticity of Leonard-Ornstein processes, our suggested strategy encourages the agent to investigate its surroundings. Research demonstrates that raising the NDCG (Normalized Discounted Cumulative Gain) and HR (Hit Rate) criteria without lowering the Ornstein-Uhlenbeck process drift coefficient enhances the diversity of suggestions.

Keywords: recommender systems, reinforcement learning, deep learning, DDPG, Leonard-Ornstein process

Procedia PDF Downloads 135
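
Entry 7648 relies on the stochasticity of an Ornstein-Uhlenbeck process to drive exploration, and its keywords mention DDPG, where such noise is commonly added to actions. The sketch below shows one common discretisation (Euler-Maruyama) of that process. It is an illustration under assumed parameters, not the authors' code: `theta` (the drift, or mean-reversion, coefficient), `mu`, `sigma`, and `dt` are placeholder values.

```python
import math
import random


class OrnsteinUhlenbeckNoise:
    """Temporally correlated noise: dx = theta * (mu - x) * dt + sigma * dW."""

    def __init__(self, theta=0.15, mu=0.0, sigma=0.2, dt=1.0, seed=None):
        self.theta = theta      # drift coefficient (pull towards mu)
        self.mu = mu            # long-run mean
        self.sigma = sigma      # scale of the random perturbation
        self.dt = dt            # time step of the discretisation
        self.rng = random.Random(seed)
        self.x = mu             # current noise value

    def reset(self):
        """Return the process to its mean, e.g. at the start of an episode."""
        self.x = self.mu

    def sample(self):
        """Advance the process by one step and return the new value."""
        drift = self.theta * (self.mu - self.x) * self.dt
        diffusion = self.sigma * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0)
        self.x += drift + diffusion
        return self.x


# With sigma = 0 the process is deterministic and decays towards mu.
decay = OrnsteinUhlenbeckNoise(theta=0.5, mu=0.0, sigma=0.0)
decay.x = 1.0
first_step = decay.sample()

# Two generators with the same (hypothetical) seed produce the same sequence.
noise_a = OrnsteinUhlenbeckNoise(seed=42)
noise_b = OrnsteinUhlenbeckNoise(seed=42)
sequence_a = [noise_a.sample() for _ in range(5)]
sequence_b = [noise_b.sample() for _ in range(5)]
```

A larger `theta` pulls the noise back to its mean faster and so reduces sustained exploration, which is the trade-off the abstract links to recommendation diversity.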

7647 Leveraging Deep Q Networks in Portfolio Optimization

Authors: Peng Liu

Abstract:

Deep Q networks (DQNs) represent a significant advancement in reinforcement learning, utilizing neural networks to approximate the optimal Q-value for guiding sequential decision processes. This paper presents a comprehensive introduction to reinforcement learning principles, delves into the mechanics of DQNs, and explores their application in portfolio optimization. By evaluating the performance of DQNs against traditional benchmark portfolios, we demonstrate their potential to enhance investment strategies. Our results underscore the advantages of DQNs in dynamically adjusting asset allocations, offering a robust portfolio management framework.

Keywords: deep reinforcement learning, deep Q networks, portfolio optimization, multi-period optimization

Procedia PDF Downloads 21

7646 Predicting Shot Making in Basketball Learnt From Adversarial Multiagent Trajectories

Authors: Mark Harmon, Abdolghani Ebrahimi, Patrick Lucey, Diego Klabjan

Abstract:

In this paper, we predict the likelihood of a player making a shot in basketball from multiagent trajectories. Previous approaches to similar problems center on hand-crafting features to capture domain-specific knowledge. Although this approach is intuitive, recent work in deep learning has shown that it is prone to missing important predictive features. To circumvent this issue, we present a convolutional neural network (CNN) approach where we initially represent the multiagent behavior as an image. To encode the adversarial nature of basketball, we use a multichannel image which we then feed into a CNN. Additionally, to capture the temporal aspect of the trajectories, we use “fading.” We find that this approach is superior to a traditional FFN model. By using gradient ascent, we were able to discover what the CNN filters look for during training. Last, we find that a combined FFN+CNN is the best performing network with an error rate of 39%.

Keywords: basketball, computer vision, image processing, convolutional neural network

Procedia PDF Downloads 151

7645 Reinforcement Learning for Self Driving Racing Car Games

Authors: Adam Beaunoyer, Cory Beaunoyer, Mohammed Elmorsy, Hanan Saleh

Abstract:

This research aims to create a reinforcement learning agent capable of racing in challenging simulated environments with a low collision count. We present a reinforcement learning agent that can navigate challenging tracks using both a Deep Q-Network (DQN) and a Soft Actor-Critic (SAC) method. A challenging track includes curves, jumps, and varying road widths throughout. The environment used in this research is built from open-source code on GitHub and is based on the 1995 racing game WipeOut. The proposed reinforcement learning agent can navigate challenging tracks rapidly while maintaining low racing completion time and collision count. The results show that the SAC model outperforms the DQN model by a large margin. We also propose an alternative multiple-car model that can navigate the track without colliding with other vehicles on the track. The SAC model is the basis for the multiple-car model, which completes the laps quicker than the single-car model but has a higher collision rate with the track wall.

Keywords: reinforcement learning, soft actor-critic, deep q-network, self-driving cars, artificial intelligence, gaming

Procedia PDF Downloads 40

7644 A Fully Interpretable Deep Reinforcement Learning-Based Motion Control for Legged Robots

Authors: Haodong Huang, Zida Zhao, Shilong Sun, Chiyao Li, Wenfu Xu

Abstract:

The control methods for legged robots based on deep reinforcement learning have seen widespread application; however, the inherent black-box nature of neural networks presents challenges in understanding the decision-making motives of the robots. To address this issue, we propose a fully interpretable deep reinforcement learning training method to elucidate the underlying principles of legged robot motion. We incorporate the dynamics of legged robots into the policy, where observations serve as inputs and actions as outputs of the dynamics model. By embedding the dynamics equations within the multi-layer perceptron (MLP) computation process and making the parameters trainable, we enhance interpretability. Additionally, Bayesian optimization is introduced to train these parameters. We validate the proposed fully interpretable motion control algorithm on a legged robot, opening new research avenues for motion control and learning algorithms for legged robots within the deep learning framework.

Keywords: deep reinforcement learning, interpretation, motion control, legged robots

Procedia PDF Downloads 11

7643 A Deep Reinforcement Learning-Based Secure Framework against Adversarial Attacks in Power System

Authors: Arshia Aflaki, Hadis Karimipour, Anik Islam

Abstract:

Generative Adversarial Attacks (GAAs) threaten critical sectors, ranging from fingerprint recognition to industrial control systems. Existing Deep Learning (DL) algorithms are not robust enough against this kind of cyber-attack. As one of the most critical industries in the world, the power grid is not an exception. In this study, a Deep Reinforcement Learning-based (DRL) framework that assists the DL model in improving its robustness against generative adversarial attacks is proposed. We test our method on real-world smart grid stability data, used as an IIoT dataset, and it improves the classification accuracy of a deep learning model from around 57 percent to 96 percent.

Keywords: generative adversarial attack, deep reinforcement learning, deep learning, IIoT, generative adversarial networks, power system

Procedia PDF Downloads 26

7642 Deep Reinforcement Learning Model Using Parameterised Quantum Circuits

Authors: Lokes Parvatha Kumaran S., Sakthi Jay Mahenthar C., Sathyaprakash P., Jayakumar V., Shobanadevi A.

Abstract:

With the evolution of technology, the need to solve complex computational problems like machine learning and deep learning has shot up. But even the most powerful classical supercomputers find it difficult to execute these tasks. With the recent development of quantum computing, researchers and tech-giants strive for new quantum circuits for machine learning tasks, as present works on Quantum Machine Learning (QML) ensure less memory consumption and reduced model parameters. But it is strenuous to simulate classical deep learning models on existing quantum computing platforms due to the inflexibility of deep quantum circuits. As a consequence, it is essential to design viable quantum algorithms for QML for noisy intermediate-scale quantum (NISQ) devices. The proposed work aims to explore Variational Quantum Circuits (VQC) for Deep Reinforcement Learning by remodeling the experience replay and target network into a representation of VQC. In addition, to reduce the number of model parameters, quantum information encoding schemes are used to achieve better results than the classical neural networks. VQCs are employed to approximate the deep Q-value function for decision-making and policy-selection reinforcement learning with experience replay and the target network.

Keywords: quantum computing, quantum machine learning, variational quantum circuit, deep reinforcement learning, quantum information encoding scheme

Procedia PDF Downloads 125

7641 Distributed System Computing Resource Scheduling Algorithm Based on Deep Reinforcement Learning

Authors: Yitao Lei, Xingxiang Zhai, Burra Venkata Durga Kumar

Abstract:

As the quantity and complexity of computing in large-scale software systems increase, distributed system computing becomes increasingly important. A distributed system achieves high-performance computing through collaboration between different computing resources. Without efficient resource scheduling, the abuse of distributed computing may cause resource waste and high costs. However, resource scheduling is usually an NP-hard problem, so we cannot find a general solution. Some optimization algorithms exist, such as genetic algorithms and ant colony optimization, but the large scale of distributed systems makes these traditional optimization algorithms challenging to work with. Heuristic and machine learning algorithms are usually applied in this situation to ease the computing load. We therefore review traditional resource scheduling optimization algorithms and introduce a deep reinforcement learning method that utilizes the perceptual ability of neural networks and the decision-making ability of reinforcement learning. Using the machine learning method, we try to find important factors that influence the performance of distributed system computing and help the distributed system schedule computing resources efficiently. This paper surveys the application of deep reinforcement learning to distributed system computing resource scheduling, proposes a deep reinforcement learning method that uses a recurrent neural network to optimize the resource scheduling, and outlines the challenges and improvement directions for DRL-based resource scheduling algorithms.

Keywords: resource scheduling, deep reinforcement learning, distributed system, artificial intelligence

Procedia PDF Downloads 104

7640 Targeted Photoactivatable Multiagent Nanoconjugates for Imaging and Photodynamic Therapy

Authors: Shazia Bano

Abstract:

Nanoconjugates that integrate photo-based therapeutics and diagnostics within a single platform promise great advances in revolutionizing cancer treatments. However, achieving high therapeutic efficacy remains a great challenge: it requires designing functionally efficacious nanocarriers that tightly retain the drug, promoting selective drug localization and release, and validating the efficacy of these nanoconjugates. Here we have designed smart, liposome-based, targeted photoactivatable multiagent nanoconjugates, doped with a photoactivatable chromophore, benzoporphyrin derivative (BPD), and labelled with an active targeting ligand, cetuximab, to target the EGFR receptor (overexpressed in various cancer cells) and deliver a combination of therapeutic agents. This study establishes a tunable nanoplatform for the delivery of the photoactivatable multiagent nanoconjugates for tumor-specific accumulation and targeted destruction of cancer cells in complex cancer models to enhance the therapeutic index of the administered drugs.

Keywords: targeting, photodynamic therapy, photoactivatable, nanoconjugates

Procedia PDF Downloads 135

7639 Reinforcement Learning for Classification of Low-Resolution Satellite Images

Authors: Khadija Bouzaachane, El Mahdi El Guarmah

Abstract:

The classification of low-resolution satellite images has been a worthwhile and fertile field that attracts plenty of researchers due to its importance in monitoring geographical areas. It could be used for several purposes such as disaster management, military surveillance, and agricultural monitoring. The main objective of this work is to classify low-resolution satellite images efficiently and accurately by using novel techniques of deep learning and reinforcement learning. The images include roads, residential areas, industrial areas, rivers, sea lakes, and vegetation. To achieve that goal, we carried out experiments on Sentinel-2 images, targeting both accurate and efficient classification. Our proposed model achieved a 91% accuracy on the testing dataset as well as a good classification for land cover. Focusing on precision, we obtained 93% for the river, 92% for residential, 97% for residential, 96% for the forest, 87% for annual crop, 84% for herbaceous vegetation, 85% for pasture, 78% for highway, and 100% for Sea Lake.

Keywords: classification, deep learning, reinforcement learning, satellite imagery

Procedia PDF Downloads 203

7638 A Comparative Study of Mechanisms across Different Online Social Learning Types

Authors: Xinyu Wang

Abstract:

In the context of the rapid development of Internet technology and the increasing prevalence of online social media, this study investigates the impact of digital communication on social learning. Through three behavioral experiments, we explore both affective and cognitive social learning in online environments. Experiment 1 manipulates the content of experimental materials and two forms of feedback, emotional valence, sociability, and repetition, to verify whether individuals can achieve online emotional social learning through reinforcement using two social learning strategies. Results reveal that both social learning strategies can assist individuals in affective social learning through reinforcement, with feedback-based learning strategies outperforming frequency-dependent strategies. Experiment 2 similarly manipulates the content of experimental materials and two forms of feedback to verify whether individuals can achieve online knowledge social learning through reinforcement using two social learning strategies. Results show that, similar to online affective social learning, individuals adopt both social learning strategies to achieve cognitive social learning through reinforcement, with feedback-based learning strategies outperforming frequency-dependent strategies. Experiment 3 simultaneously observes online affective and cognitive social learning by manipulating the content of experimental materials and feedback at different levels of social pressure. Results indicate that online affective social learning exhibits different learning effects under different levels of social pressure, whereas online cognitive social learning remains unaffected by social pressure, demonstrating more stable learning effects. Additionally, to explore the sustained effects of online social learning and differences in duration among different types of online social learning, all three experiments incorporate two test time points.
Results reveal significant differences in pre-post-test scores for online social learning in Experiments 2 and 3, whereas differences are less apparent in Experiment 1. To accurately measure the sustained effects of online social learning, the researchers conducted a mini-meta-analysis of all effect sizes of online social learning duration. Results indicate that although the overall effect size is small, the effect of online social learning weakens over time.

Keywords: online social learning, affective social learning, cognitive social learning, social learning strategies, social reinforcement, social pressure, duration

Procedia PDF Downloads 38

7637 Machine Learning Approach for Mutation Testing

Authors: Michael Stewart

Abstract:

Mutation testing is a type of software testing proposed in the 1970s where program statements are deliberately changed to introduce simple errors so that test cases can be validated to determine if they can detect the errors. Test cases are executed against the mutant code to determine if one fails, detects the error and ensures the program is correct. One major issue with this type of testing is that it becomes computationally intensive to generate and test all possible mutations for complex programs. This paper used reinforcement learning and parallel processing within the context of mutation testing for the selection of mutation operators and test cases that reduced the computational cost of testing and improved test suite effectiveness. Experiments were conducted using sample programs to determine how well the reinforcement learning-based algorithm performed with one live mutation, multiple live mutations and no live mutations. The experiments, measured by mutation score, were used to update the algorithm and improve prediction accuracy. The performance was then evaluated on multiple-processor computers. With reinforcement learning, the mutation operators utilized were reduced by 50–100%.

Keywords: automated-testing, machine learning, mutation testing, parallel processing, reinforcement learning, software engineering, software testing

Procedia PDF Downloads 194

7636 Sampling Effects on Secondary Voltage Control of Microgrids Based on Network of Multiagent

Authors: M. J. Park, S. H. Lee, C. H. Lee, O. M. Kwon

Abstract:

This paper studies a secondary voltage control framework for microgrids based on consensus over a multiagent communication network. The proposed control is designed for a communication network with one-way links, which is modeled by a directed graph. Sampling is considered as the communication constraint among the distributed generators in the microgrids. To analyze the sampling effects on the secondary voltage control of the microgrids, a sufficient condition is established in terms of linear matrix inequalities (LMIs) by using Lyapunov theory and some mathematical techniques. Finally, some simulation results are given to illustrate the necessity of considering the sampling effects on the secondary voltage control of the microgrids.

Keywords: microgrids, secondary control, multiagent, sampling, LMI

Procedia PDF Downloads 328

7635 Personalized Email Marketing Strategy: A Reinforcement Learning Approach

Authors: Lei Zhang, Tingting Xu, Jun He, Zhenyu Yan

Abstract:

Email marketing is one of the most important segments of online marketing. It has been proved to be the most effective way to acquire and retain customers. The email content is vital to customers. Different customers may have different familiarity with a product, so a successful marketing strategy must personalize email content based on individual customers’ product affinity. In this study, we build our personalized email marketing strategy with three types of emails: nurture, promotion, and conversion. Each type of email has a different influence on customers. We investigate this difference by analyzing customers’ open rates, click rates and opt-out rates. Feature importance from response models is also analyzed. The goal of the marketing strategy is to improve the click rate on conversion-type emails. To build the personalized strategy, we formulate the problem as a reinforcement learning problem and adopt a Q-learning algorithm with variations. The simulation results show that our model-based strategy outperforms the current marketer’s strategy.

Keywords: email marketing, email content, reinforcement learning, machine learning, Q-learning

Procedia PDF Downloads 189

7634 Efficient Subgoal Discovery for Hierarchical Reinforcement Learning Using Local Computations

Authors: Adrian Millea

Abstract:

In hierarchical reinforcement learning, one of the main issues encountered is the discovery of subgoal states or options (which are policies reaching subgoal states) by partitioning the environment in a meaningful way. This partitioning usually requires an expensive global clustering operation or eigendecomposition of the Laplacian of the state graph. We propose a local solution to this issue, much more efficient than algorithms using global information, which successfully discovers subgoal states by computing a simple function, which we call heterogeneity, for each state as a function of its neighbors. Moreover, we construct a value function using the difference in heterogeneity from one step to the next as the reward, such that we are able to explore the state space much more efficiently than, say, epsilon-greedy. The same principle can then be applied to higher levels of the hierarchy, where the states are now the subgoals discovered at the level below.

Keywords: exploration, hierarchical reinforcement learning, locality, options, value functions

Procedia PDF Downloads 166

7633 Using Q-Learning to Auto-Tune PID Controller Gains for Online Quadcopter Altitude Stabilization

Authors: Y. Alrubyli

Abstract:

Unmanned Aerial Vehicles (UAVs), and more specifically, quadcopters need to be stable during their flights. Altitude stability is usually achieved by using a PID controller that is built into the flight controller software. Furthermore, the PID controller has gains that need to be tuned to reach optimal altitude stabilization during the quadcopter’s flight. For that, control system engineers need to tune those gains by using extensive modeling of the environment, which might change from one environment and condition to another. As quadcopters penetrate more sectors, from the military to the consumer sectors, they have been put into complex and challenging environments more than ever before. Hence, intelligent self-stabilizing quadcopters are needed to maneuver through those complex environments and situations. Here we show that by using online reinforcement learning with minimal background knowledge, the altitude stability of the quadcopter can be achieved using a model-free approach. We found that by using background knowledge instead of letting the online reinforcement learning algorithm wander for a while to tune the PID gains, altitude stabilization can be achieved faster. In addition, using this approach will accelerate development by avoiding extensive simulations before applying the PID gains to the real-world quadcopter. Our results demonstrate the possibility of using the trial and error approach of reinforcement learning combined with background knowledge to achieve faster quadcopter altitude stabilization in different environments and conditions.

Keywords: reinforcement learning, Q-learning, online learning, PID tuning, unmanned aerial vehicle, quadcopter

Procedia PDF Downloads 166
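
Entry 7633 tunes the three gains of a PID altitude controller with Q-learning. For readers unfamiliar with what those gains do, a minimal discrete-time PID sketch follows. It illustrates only the controller whose gains would be tuned and not the author's tuning agent; the class name, the gain values, and the altitude readings are hypothetical.

```python
class PIDController:
    """Textbook discrete PID: u = kp * e + ki * integral(e) + kd * de/dt."""

    def __init__(self, kp, ki, kd):
        self.kp = kp                # proportional gain
        self.ki = ki                # integral gain
        self.kd = kd                # derivative gain
        self.integral = 0.0         # accumulated error over time
        self.prev_error = None      # error from the previous step, if any

    def step(self, setpoint, measurement, dt):
        """Return the control output for one sampling interval of length dt."""
        error = setpoint - measurement
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0        # no history yet on the first call
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


# Hypothetical gains and altitude readings (target 10.0, sampled every 0.1 s).
pid = PIDController(kp=2.0, ki=0.5, kd=0.1)
first_output = pid.step(setpoint=10.0, measurement=8.0, dt=0.1)
second_output = pid.step(setpoint=10.0, measurement=9.0, dt=0.1)
```

In a tuning setup of the kind the abstract describes, a learning agent would adjust `kp`, `ki`, and `kd` between steps and be rewarded for a smaller altitude error.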

7632 Effectiveness of Reinforcement Learning (RL) for Autonomous Energy Management Solutions

Authors: Tesfaye Mengistu

Abstract:

This thesis aims to investigate the effectiveness of Reinforcement Learning (RL) for Autonomous Energy Management solutions. The study explores the potential of Model Free RL approaches, such as Monte Carlo RL and Q-learning, to improve energy management by autonomously adjusting energy management strategies to maximize efficiency. The research investigates the implementation of RL algorithms for optimizing energy consumption in a single-agent environment. The focus is on developing a framework for the implementation of RL algorithms, highlighting the importance of RL for enabling autonomous systems to adapt quickly to changing conditions and make decisions based on previous experiences. Moreover, the paper proposes RL as a novel energy management solution to address nations' CO2 emission goals. Reinforcement learning algorithms are well-suited to solving problems with sequential decision-making patterns and can provide accurate and immediate outputs to ease the planning and decision-making process. This research provides insights into the challenges and opportunities of using RL for energy management solutions and recommends further studies to explore its full potential. In conclusion, this study provides valuable insights into how RL can be used to improve the efficiency of energy management systems and supports the use of RL as a promising approach for developing autonomous energy management solutions in residential buildings.

Keywords: artificial intelligence, reinforcement learning, Monte Carlo, energy management, CO2 emission

Procedia PDF Downloads 77

7631 Research on Knowledge Graph Inference Technology Based on Proximal Policy Optimization

Authors: Yihao Kuang, Bowen Ding

Abstract:

With the increasing scale and complexity of knowledge graphs, modern knowledge graphs contain more and more types of entity, relationship, and attribute information. Therefore, in recent years, it has been a trend for knowledge graph inference to use reinforcement learning to deal with large-scale, incomplete, and noisy knowledge graphs and improve the inference effect and interpretability. The Proximal Policy Optimization (PPO) algorithm allows for more extensive updates of policy parameters while constraining the update extent to maintain training stability. This characteristic enables PPO to converge to improved strategies more rapidly, often demonstrating enhanced performance early in the training process. Furthermore, PPO has the advantage of offline learning, effectively utilizing historical experience data for training and enhancing sample utilization. This means that even with limited resources, PPO can efficiently train for reinforcement learning tasks. Based on these characteristics, this paper aims to obtain a better and more efficient inference effect by introducing PPO into knowledge inference technology.
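
The idea of constraining the update extent can be made concrete with PPO's commonly used clipped surrogate objective, sketched below for a single sample. This is an assumption about which PPO variant is meant, since the abstract does not say; the function name and the clip range `epsilon=0.2` are illustrative choices.

```python
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """Per-sample PPO clipped surrogate objective (to be maximized).

    ratio is pi_new(a|s) / pi_old(a|s); advantage is the estimated advantage
    of the action. epsilon sets how far the ratio may move from 1 before
    further movement stops improving the objective.
    """
    # Restrict the probability ratio to the interval [1 - epsilon, 1 + epsilon].
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Take the pessimistic (smaller) of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)


# Hypothetical usage: the new policy makes a good action 1.5x more likely,
# but the objective only credits the change up to the clip limit.
objective = clipped_surrogate(ratio=1.5, advantage=1.0)
```

Taking the minimum of the two terms means a large policy change is never rewarded beyond the clip range, which is how the update size is kept in check without an explicit constraint.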

Keywords: reinforcement learning, PPO, knowledge inference, supervised learning

Procedia PDF Downloads 59

7630 Analysis of Q-Learning on Artificial Neural Networks for Robot Control Using Live Video Feed

Authors: Nihal Murali, Kunal Gupta, Surekha Bhanot

Abstract:

Training of artificial neural networks (ANNs) using reinforcement learning (RL) techniques is being widely discussed in the robot learning literature. The high model complexity of ANNs along with the model-free nature of RL algorithms provides a desirable combination for many robotics applications. There is a huge need for algorithms that generalize using raw sensory inputs, such as vision, without any hand-engineered features or domain heuristics. In this paper, the standard control problem of a line-following robot was used as a test-bed, and an ANN controller for the robot was trained on images from a live video feed using Q-learning. A virtual agent was first trained in a simulation environment and then deployed onto a robot’s hardware. The robot successfully learns to traverse a wide range of curves and displays excellent generalization ability. Qualitative analysis of the evolution of policies, performance, and weights of the network provides insights into the nature and convergence of the learning algorithm.

Keywords: artificial neural networks, Q-learning, reinforcement learning, robot learning

Procedia PDF Downloads 366

7629 Gaits Stability Analysis for a Pneumatic Quadruped Robot Using Reinforcement Learning

Authors: Soofiyan Atar, Adil Shaikh, Sahil Rajpurkar, Pragnesh Bhalala, Aniket Desai, Irfan Siddavatam

Abstract:

Deep reinforcement learning (deep RL) algorithms leverage the power of complex controllers by automating the mapping from sensory inputs to low-level actions. Deep RL removes the need to model complex robot dynamics and requires minimal engineering. However, applying deep RL directly in real-world scenarios carries high risk, and the algorithms are highly sensitive to hyperparameters. Tuning hyperparameters on a pneumatic quadruped robot through trial-and-error learning becomes very expensive. This paper presents an automated learning control for a pneumatic quadruped robot using sample-efficient deep Q-learning, enabling minimal tuning and very few trials to train the neural network. Long training hours may degrade the pneumatic cylinders due to jerky actions originating from stochastic weights. We applied this method to the pneumatic quadruped robot, which resulted in a hopping gait. In our process, we eliminated the use of a simulator and acquired a stable gait. This approach evolves so that the resulting gait becomes more robust to stochastic changes in the environment. We further show that our algorithm performed very well compared to a programmed gait based on robot dynamics.

Keywords: model-based reinforcement learning, gait stability, supervised learning, pneumatic quadruped

Procedia PDF Downloads 307

7628 Adhesion Performance According to Lateral Reinforcement Method of Textile

Authors: Jungbhin You, Taekyun Kim, Jongho Park, Sungnam Hong, Sun-Kyu Park

Abstract:

Reinforced concrete is widely used in the construction field because of its excellent durability. However, damage to the concrete surface can lead to corrosion of the reinforcing steel, which reduces durability and safety. Recently, research on textile has been ongoing to compensate for this weakness of reinforced concrete. In previous research, only experiments on the longitudinal length were performed. Therefore, in order to investigate the adhesion performance according to the lattice shape and the embedded length, a pull-out test was performed on the roving, with the number, length, and spacing of the lateral reinforcement as parameters. As a result, the number and length of the lateral reinforcement did not significantly affect the load variation depending on the adhesion performance; only the reinforcement spacing affected the load analysis results.

Keywords: adhesion performance, lateral reinforcement, pull-out test, textile

Procedia PDF Downloads 354

7627 Robot Movement Using the Trust Region Policy Optimization

Authors: Romisaa Ali

Abstract:

The policy gradient approach is one of the deep reinforcement learning families that combines deep neural networks (DNNs) with reinforcement learning (RL) to discover the optimum of a control problem through experience gained from the interaction between the robot and its surroundings. Earlier policy gradient algorithms were unable to handle the two types of error, over- and under-estimation, introduced by the deep neural network model. In contrast, this article discusses the state-of-the-art (SOTA) policy gradient technique, trust region policy optimization (TRPO), applying it in various environments and comparing it with another policy gradient method, proximal policy optimization (PPO), to explain their robust optimization. This SOTA technique is used to gather experience data during various training phases after observing the impact of hyper-parameters on neural network performance.

Keywords: deep neural networks, deep reinforcement learning, proximal policy optimization, state-of-the-art, trust region policy optimization

Procedia PDF Downloads 164

7626 High-Frequency Cryptocurrency Portfolio Management Using Multi-Agent System Based on Federated Reinforcement Learning

Authors: Sirapop Nuannimnoi, Hojjat Baghban, Ching-Yao Huang

Abstract:

Over the past decade, with the fast development of blockchain technology since the birth of Bitcoin, there has been a massive increase in the usage of cryptocurrencies. Cryptocurrencies are not seen as an investment opportunity due to the market’s erratic behavior and high price volatility. With the recent success of deep reinforcement learning (DRL), portfolio management can be modeled and automated. In this paper, we propose a novel DRL-based multi-agent system to automatically make proper trading decisions on multiple cryptocurrencies and gain profits in the highly volatile cryptocurrency market. We also extend this multi-agent system with horizontal federated transfer learning to better adapt to the inclusion of new cryptocurrencies in our portfolio; therefore, we can, through the concept of diversification, maximize our profits and minimize the trading risks. Experimental results from multiple simulation scenarios reveal that the proposed algorithmic trading system can offer three promising key advantages over other systems: maximized profits, minimized risks, and adaptability.

Keywords: cryptocurrency portfolio management, algorithmic trading, federated learning, multi-agent reinforcement learning

Procedia PDF Downloads 113

7625 Research on Knowledge Graph Inference Technology Based on Proximal Policy Optimization

Authors: Yihao Kuang, Bowen Ding

Abstract:

With the increasing scale and complexity of knowledge graphs, modern knowledge graphs contain more and more types of entity, relationship, and attribute information. Therefore, in recent years, it has been a trend for knowledge graph inference to use reinforcement learning to deal with large-scale, incomplete, and noisy knowledge graphs and improve the inference effect and interpretability. The Proximal Policy Optimization (PPO) algorithm allows for more extensive updates of policy parameters while constraining the update extent to maintain training stability. This characteristic enables PPO to converge to improved strategies more rapidly, often demonstrating enhanced performance early in the training process. Furthermore, PPO has the advantage of offline learning, effectively utilizing historical experience data for training and enhancing sample utilization. This means that even with limited resources, PPO can efficiently train for reinforcement learning tasks. Based on these characteristics, this paper aims to obtain a better and more efficient inference effect by introducing PPO into knowledge inference technology.

Keywords: reinforcement learning, PPO, knowledge inference

Procedia PDF Downloads 231