Search results for: TD3
Robot Exploration and Navigation in Unseen Environments Using Deep Reinforcement Learning
Authors: Romisaa Ali
Abstract:
This paper presents a comparison between the Twin Delayed Deep Deterministic Policy Gradient (TD3) and Soft Actor-Critic (SAC) reinforcement learning algorithms in the context of training robust navigation policies for Jackal robots. By leveraging an open-source framework and custom motion control environments, the study evaluates the performance, robustness, and transferability of the trained policies across a range of scenarios. The primary focus of the experiments is to assess the training process, the adaptability of the algorithms, and the robot’s ability to navigate in previously unseen environments. Moreover, the paper examines the influence of varying environment complexities on the learning process and the generalization capabilities of the resulting policies. The results of this study aim to inform and guide the development of more efficient and practical reinforcement learning-based navigation policies for Jackal robots in real-world scenarios.
Keywords: Jackal robot environments, reinforcement learning, TD3, SAC, robust navigation, transferability, custom environments.
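To make the TD3-versus-SAC comparison concrete, the following is a minimal sketch of training both agents under identical settings with Stable-Baselines3 and evaluating each on a freshly created environment. It is an illustration only: the Gymnasium task "Pendulum-v1" stands in for the paper's custom Jackal motion-control environments, and the default hyperparameters are assumptions rather than the paper's configuration.

# Minimal TD3-vs-SAC comparison sketch (assumed stand-in environment).
import gymnasium as gym
from stable_baselines3 import TD3, SAC
from stable_baselines3.common.evaluation import evaluate_policy

def train_and_evaluate(algo_cls, env_id="Pendulum-v1", steps=20_000):
    """Train one agent, then report mean episodic return on held-out episodes."""
    env = gym.make(env_id)
    model = algo_cls("MlpPolicy", env, verbose=0, seed=0)
    model.learn(total_timesteps=steps)
    # Evaluate on a newly built environment instance to mimic transfer
    # of the learned policy to an unseen instance of the task.
    eval_env = gym.make(env_id)
    mean_reward, std_reward = evaluate_policy(model, eval_env, n_eval_episodes=10)
    return mean_reward, std_reward

for algo in (TD3, SAC):
    mean_r, std_r = train_and_evaluate(algo)
    print(f"{algo.__name__}: {mean_r:.1f} +/- {std_r:.1f}")

In a study like the one above, the same loop would be repeated across environments of increasing complexity to probe robustness and generalization.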
Deep Reinforcement Learning Approach for Trading Automation in the Stock Market
Authors: Taylan Kabbani, Ekrem Duman
Abstract:
Deep Reinforcement Learning (DRL) algorithms can scale to previously intractable problems. The automation of profit generation in the stock market is possible using DRL by combining the financial asset price "prediction" step and the portfolio "allocation" step in one unified process, producing fully autonomous systems capable of interacting with their environment to make optimal decisions through trial and error. This work presents a DRL model to generate profitable trades in the stock market, effectively overcoming the limitations of supervised learning approaches. We formulate the trading problem as a Partially Observed Markov Decision Process (POMDP) model, considering the constraints imposed by the stock market, such as liquidity and transaction costs. We then solve the formulated POMDP problem using the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm and achieve a Sharpe ratio of 2.68 on the test dataset. From the standpoint of stock market forecasting and intelligent decision-making, this paper demonstrates the superiority of DRL over other types of machine learning in financial markets and establishes its credibility and advantages for strategic decision-making.
Keywords: Autonomous agent, deep reinforcement learning, MDP, sentiment analysis, stock market, technical indicators, twin delayed deep deterministic policy gradient.
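As a rough illustration of the POMDP formulation described above, the sketch below defines a toy Gymnasium trading environment in which the agent observes only a short window of recent returns (a partial view of the market state), chooses a portfolio weight in [-1, 1], and receives the position's return net of a proportional transaction cost as reward, then trains a TD3 agent from Stable-Baselines3 on it. The synthetic random-walk returns, window length, and cost parameter are illustrative assumptions, not the paper's data, features, or model.

# Toy POMDP-style trading environment plus TD3 training (illustrative only).
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import TD3

class ToyTradingEnv(gym.Env):
    def __init__(self, n_steps=500, window=10, cost=1e-3):
        super().__init__()
        self.n_steps, self.window, self.cost = n_steps, window, cost
        # Action: target portfolio weight in [-1, 1] (short to long).
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)
        # Observation: last `window` returns -- a partial view of the market.
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(window,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        # Synthetic random-walk returns standing in for real market data.
        self.returns = self.np_random.normal(0.0, 0.01, size=self.n_steps).astype(np.float32)
        self.t = self.window
        self.position = 0.0
        return self.returns[self.t - self.window:self.t], {}

    def step(self, action):
        new_position = float(np.clip(action[0], -1.0, 1.0))
        # Reward: next-step return of the held position, minus rebalancing cost.
        reward = new_position * self.returns[self.t] - self.cost * abs(new_position - self.position)
        self.position = new_position
        self.t += 1
        terminated = self.t >= self.n_steps
        obs = self.returns[self.t - self.window:self.t]
        return obs, reward, terminated, False, {}

env = ToyTradingEnv()
model = TD3("MlpPolicy", env, verbose=0, seed=0)
model.learn(total_timesteps=5_000)

A full study would replace the synthetic returns with historical prices, technical indicators, and sentiment features, and would report risk-adjusted metrics such as the Sharpe ratio on a held-out test period.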