Search results for: MDP
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3

Search results for: MDP

3 Optimal Maintenance and Improvement Policies in Water Distribution System: Markov Decision Process Approach

Authors: Jong Woo Kim, Go Bong Choi, Sang Hwan Son, Dae Shik Kim, Jung Chul Suh, Jong Min Lee

Abstract:

The Markov decision process (MDP) based methodology is implemented in order to establish the optimal schedule which minimizes the cost. Formulation of MDP problem is presented using the information about the current state of pipe, improvement cost, failure cost and pipe deterioration model. The objective function and detailed algorithm of dynamic programming (DP) are modified due to the difficulty of implementing the conventional DP approaches. The optimal schedule derived from suggested model is compared to several policies via Monte Carlo simulation. Validity of the solution and improvement in computational time are proved.

Keywords: Markov decision processes, Dynamic Programming, Monte Carlo simulation, Periodic replacement, Weibull distribution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2863
2 Trajectory-Based Modified Policy Iteration

Authors: R. Sharma, M. Gopal

Abstract:

This paper presents a new problem solving approach that is able to generate optimal policy solution for finite-state stochastic sequential decision-making problems with high data efficiency. The proposed algorithm iteratively builds and improves an approximate Markov Decision Process (MDP) model along with cost-to-go value approximates by generating finite length trajectories through the state-space. The approach creates a synergy between an approximate evolving model and approximate cost-to-go values to produce a sequence of improving policies finally converging to the optimal policy through an intelligent and structured search of the policy space. The approach modifies the policy update step of the policy iteration so as to result in a speedy and stable convergence to the optimal policy. We apply the algorithm to a non-holonomic mobile robot control problem and compare its performance with other Reinforcement Learning (RL) approaches, e.g., a) Q-learning, b) Watkins Q(λ), c) SARSA(λ).

Keywords: Markov Decision Process (MDP), Mobile robot, Policy iteration, Simulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1477
1 Deep Reinforcement Learning Approach for Trading Automation in the Stock Market

Authors: Taylan Kabbani, Ekrem Duman

Abstract:

Deep Reinforcement Learning (DRL) algorithms can scale to previously intractable problems. The automation of profit generation in the stock market is possible using DRL, by combining  the financial assets price ”prediction” step and the ”allocation” step of the portfolio in one unified process to produce fully autonomous systems capable of interacting with its environment to make optimal decisions through trial and error. This work represents a DRL model to generate profitable trades in the stock market, effectively overcoming the limitations of supervised learning approaches. We formulate the trading problem as a Partially observed Markov Decision Process (POMDP) model, considering the constraints imposed by the stock market, such as liquidity and transaction costs. We then solved the formulated POMDP problem using the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm and achieved a 2.68 Sharpe ratio on the test dataset. From the point of view of stock market forecasting and the intelligent decision-making mechanism, this paper demonstrates the superiority of DRL in financial markets over other types of machine learning and proves its credibility and advantages of strategic decision-making.

Keywords: Autonomous agent, deep reinforcement learning, MDP, sentiment analysis, stock market, technical indicators, twin delayed deep deterministic policy gradient.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 579