Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 15

Reinforcement Learning Related Publications

15 On Dialogue Systems Based on Deep Learning

Authors: Yifan Fan, Xudong Luo, Pingping Lin

Abstract:

Nowadays, dialogue systems increasingly become the way for humans to access many computer systems. So, humans can interact with computers in natural language. A dialogue system consists of three parts: understanding what humans say in natural language, managing dialogue, and generating responses in natural language. In this paper, we survey deep learning based methods for dialogue management, response generation and dialogue evaluation. Specifically, these methods are based on neural network, long short-term memory network, deep reinforcement learning, pre-training and generative adversarial network. We compare these methods and point out the further research directions.

Keywords: Evaluation, Reinforcement Learning, Deep learning, Dialogue Management, response generation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 197
14 Distributed Coverage Control by Robot Networks in Unknown Environments Using a Modified EM Algorithm

Authors: Mohammadhosein Hasanbeig, Lacra Pavel

Abstract:

In this paper, we study a distributed control algorithm for the problem of unknown area coverage by a network of robots. The coverage objective is to locate a set of targets in the area and to minimize the robots’ energy consumption. The robots have no prior knowledge about the location and also about the number of the targets in the area. One efficient approach that can be used to relax the robots’ lack of knowledge is to incorporate an auxiliary learning algorithm into the control scheme. A learning algorithm actually allows the robots to explore and study the unknown environment and to eventually overcome their lack of knowledge. The control algorithm itself is modeled based on game theory where the network of the robots use their collective information to play a non-cooperative potential game. The algorithm is tested via simulations to verify its performance and adaptability.

Keywords: Game theory, Reinforcement Learning, Distributed Control, Multi-Agent Learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 624
13 A Reinforcement Learning Approach for Evaluation of Real-Time Disaster Relief Demand and Network Condition

Authors: Ali Nadi, Ali Edrissi

Abstract:

Relief demand and transportation links availability is the essential information that is needed for every natural disaster operation. This information is not in hand once a disaster strikes. Relief demand and network condition has been evaluated based on prediction method in related works. Nevertheless, prediction seems to be over or under estimated due to uncertainties and may lead to a failure operation. Therefore, in this paper a stochastic programming model is proposed to evaluate real-time relief demand and network condition at the onset of a natural disaster. To address the time sensitivity of the emergency response, the proposed model uses reinforcement learning for optimization of the total relief assessment time. The proposed model is tested on a real size network problem. The simulation results indicate that the proposed model performs well in the case of collecting real-time information.

Keywords: Disaster Management, Reinforcement Learning, real-time demand, relief demand

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1493
12 Biologically Inspired Controller for the Autonomous Navigation of a Mobile Robot in an Evasion Task

Authors: Dejanira Araiza-Illan, Tony J. Dodd

Abstract:

A novel biologically inspired controller for the autonomous navigation of a mobile robot in an evasion task is proposed. The controller takes advantage of the environment by calculating a measure of danger and subsequently choosing the parameters of a reinforcement learning based decision process. Two different reinforcement learning algorithms were used: Qlearning and Sarsa (λ). Simulations show that selecting dynamic parameters reduce the time while executing the decision making process, so the robot can obtain a policy to succeed in an escaping task in a realistic time.

Keywords: Mobile Robots, Reinforcement Learning, autonomous navigation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1183
11 Agent-based Simulation for Blood Glucose Control in Diabetic Patients

Authors: Sh. Yasini, M. B. Naghibi-Sistani, A. Karimpour

Abstract:

This paper employs a new approach to regulate the blood glucose level of type I diabetic patient under an intensive insulin treatment. The closed-loop control scheme incorporates expert knowledge about treatment by using reinforcement learning theory to maintain the normoglycemic average of 80 mg/dl and the normal condition for free plasma insulin concentration in severe initial state. The insulin delivery rate is obtained off-line by using Qlearning algorithm, without requiring an explicit model of the environment dynamics. The implementation of the insulin delivery rate, therefore, requires simple function evaluation and minimal online computations. Controller performance is assessed in terms of its ability to reject the effect of meal disturbance and to overcome the variability in the glucose-insulin dynamics from patient to patient. Computer simulations are used to evaluate the effectiveness of the proposed technique and to show its superiority in controlling hyperglycemia over other existing algorithms

Keywords: Reinforcement Learning, type I diabetes, Insulin Delivery rate, Q-learning algorithm

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1725
10 A Cognitive Robot Collaborative Reinforcement Learning Algorithm

Authors: Amit Gil, Helman Stern, Yael Edan

Abstract:

A cognitive collaborative reinforcement learning algorithm (CCRL) that incorporates an advisor into the learning process is developed to improve supervised learning. An autonomous learner is enabled with a self awareness cognitive skill to decide when to solicit instructions from the advisor. The learner can also assess the value of advice, and accept or reject it. The method is evaluated for robotic motion planning using simulation. Tests are conducted for advisors with skill levels from expert to novice. The CCRL algorithm and a combined method integrating its logic with Clouse-s Introspection Approach, outperformed a base-line fully autonomous learner, and demonstrated robust performance when dealing with various advisor skill levels, learning to accept advice received from an expert, while rejecting that of less skilled collaborators. Although the CCRL algorithm is based on RL, it fits other machine learning methods, since advisor-s actions are only added to the outer layer.

Keywords: Reinforcement Learning, Robot Learning, Human-Robot Collaboration, motion planning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1445
9 Behavioral Analysis of Team Members in Virtual Organization based on Trust Dimension and Learning

Authors: Indiramma M., K. R. Anandakumar

Abstract:

Trust management and Reputation models are becoming integral part of Internet based applications such as CSCW, E-commerce and Grid Computing. Also the trust dimension is a significant social structure and key to social relations within a collaborative community. Collaborative Decision Making (CDM) is a difficult task in the context of distributed environment (information across different geographical locations) and multidisciplinary decisions are involved such as Virtual Organization (VO). To aid team decision making in VO, Decision Support System and social network analysis approaches are integrated. In such situations social learning helps an organization in terms of relationship, team formation, partner selection etc. In this paper we focus on trust learning. Trust learning is an important activity in terms of information exchange, negotiation, collaboration and trust assessment for cooperation among virtual team members. In this paper we have proposed a reinforcement learning which enhances the trust decision making capability of interacting agents during collaboration in problem solving activity. Trust computational model with learning that we present is adapted for best alternate selection of new project in the organization. We verify our model in a multi-agent simulation where the agents in the community learn to identify trustworthy members, inconsistent behavior and conflicting behavior of agents.

Keywords: Trust, Reinforcement Learning, Bayesian network, Collaborative Decision making, Multi Agent System (MAS)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1590
8 A Computer Model of Language Acquisition – Syllable Learning – Based on Hebbian Cell Assemblies and Reinforcement Learning

Authors: Sepideh Fazeli, Fariba Bahrami

Abstract:

Investigating language acquisition is one of the most challenging problems in the area of studying language. Syllable learning as a level of language acquisition has a considerable significance since it plays an important role in language acquisition. Because of impossibility of studying language acquisition directly with children, especially in its developmental phases, computer models will be useful in examining language acquisition. In this paper a computer model of early language learning for syllable learning is proposed. It is guided by a conceptual model of syllable learning which is named Directions Into Velocities of Articulators model (DIVA). The computer model uses simple associational and reinforcement learning rules within neural network architecture which are inspired by neuroscience. Our simulation results verify the ability of the proposed computer model in producing phonemes during babbling and early speech. Also, it provides a framework for examining the neural basis of language learning and communication disorders.

Keywords: Reinforcement Learning, Brain Modeling, Computer Models, language acquisition

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1247
7 Learning Classifier Systems Approach for Automated Discovery of Censored Production Rules

Authors: Suraiya Jabin, Kamal K. Bharadwaj

Abstract:

In the recent past Learning Classifier Systems have been successfully used for data mining. Learning Classifier System (LCS) is basically a machine learning technique which combines evolutionary computing, reinforcement learning, supervised or unsupervised learning and heuristics to produce adaptive systems. A LCS learns by interacting with an environment from which it receives feedback in the form of numerical reward. Learning is achieved by trying to maximize the amount of reward received. All LCSs models more or less, comprise four main components; a finite population of condition–action rules, called classifiers; the performance component, which governs the interaction with the environment; the credit assignment component, which distributes the reward received from the environment to the classifiers accountable for the rewards obtained; the discovery component, which is responsible for discovering better rules and improving existing ones through a genetic algorithm. The concatenate of the production rules in the LCS form the genotype, and therefore the GA should operate on a population of classifier systems. This approach is known as the 'Pittsburgh' Classifier Systems. Other LCS that perform their GA at the rule level within a population are known as 'Mitchigan' Classifier Systems. The most predominant representation of the discovered knowledge is the standard production rules (PRs) in the form of IF P THEN D. The PRs, however, are unable to handle exceptions and do not exhibit variable precision. The Censored Production Rules (CPRs), an extension of PRs, were proposed by Michalski and Winston that exhibit variable precision and supports an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form: IF P THEN D UNLESS C, where Censor C is an exception to the rule. Such rules are employed in situations, in which conditional statement IF P THEN D holds frequently and the assertion C holds rarely. By using a rule of this type we are free to ignore the exception conditions, when the resources needed to establish its presence are tight or there is simply no information available as to whether it holds or not. Thus, the IF P THEN D part of CPR expresses important information, while the UNLESS C part acts only as a switch and changes the polarity of D to ~D. In this paper Pittsburgh style LCSs approach is used for automated discovery of CPRs. An appropriate encoding scheme is suggested to represent a chromosome consisting of fixed size set of CPRs. Suitable genetic operators are designed for the set of CPRs and individual CPRs and also appropriate fitness function is proposed that incorporates basic constraints on CPR. Experimental results are presented to demonstrate the performance of the proposed learning classifier system.

Keywords: Data Mining, Machine Learning, Reinforcement Learning, Censored Production Rule, GeneticAlgorithm, Learning Classifier System, PittsburgApproach

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1229
6 A Modular On-line Profit Sharing Approach in Multiagent Domains

Authors: Pucheng Zhou, Bingrong Hong

Abstract:

How to coordinate the behaviors of the agents through learning is a challenging problem within multi-agent domains. Because of its complexity, recent work has focused on how coordinated strategies can be learned. Here we are interested in using reinforcement learning techniques to learn the coordinated actions of a group of agents, without requiring explicit communication among them. However, traditional reinforcement learning methods are based on the assumption that the environment can be modeled as Markov Decision Process, which usually cannot be satisfied when multiple agents coexist in the same environment. Moreover, to effectively coordinate each agent-s behavior so as to achieve the goal, it-s necessary to augment the state of each agent with the information about other existing agents. Whereas, as the number of agents in a multiagent environment increases, the state space of each agent grows exponentially, which will cause the combinational explosion problem. Profit sharing is one of the reinforcement learning methods that allow agents to learn effective behaviors from their experiences even within non-Markovian environments. In this paper, to remedy the drawback of the original profit sharing approach that needs much memory to store each state-action pair during the learning process, we firstly address a kind of on-line rational profit sharing algorithm. Then, we integrate the advantages of modular learning architecture with on-line rational profit sharing algorithm, and propose a new modular reinforcement learning model. The effectiveness of the technique is demonstrated using the pursuit problem.

Keywords: Reinforcement Learning, Multi-Agent Learning, rationalprofit sharing, modular architecture

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1170
5 Relational Representation in XCSF

Authors: Mohammad Ali Tabarzad, Caro Lucas, Ali Hamzeh

Abstract:

Generalization is one of the most challenging issues of Learning Classifier Systems. This feature depends on the representation method which the system used. Considering the proposed representation schemes for Learning Classifier System, it can be concluded that many of them are designed to describe the shape of the region which the environmental states belong and the other relations of the environmental state with that region was ignored. In this paper, we propose a new representation scheme which is designed to show various relationships between the environmental state and the region that is specified with a particular classifier.

Keywords: Reinforcement Learning, Classifier Systems, Relational Representation, XCSF

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1013
4 A Probabilistic Reinforcement-Based Approach to Conceptualization

Authors: Hadi Firouzi, Majid Nili Ahmadabadi, Babak N. Araabi

Abstract:

Conceptualization strengthens intelligent systems in generalization skill, effective knowledge representation, real-time inference, and managing uncertain and indefinite situations in addition to facilitating knowledge communication for learning agents situated in real world. Concept learning introduces a way of abstraction by which the continuous state is formed as entities called concepts which are connected to the action space and thus, they illustrate somehow the complex action space. Of computational concept learning approaches, action-based conceptualization is favored because of its simplicity and mirror neuron foundations in neuroscience. In this paper, a new biologically inspired concept learning approach based on the probabilistic framework is proposed. This approach exploits and extends the mirror neuron-s role in conceptualization for a reinforcement learning agent in nondeterministic environments. In the proposed method, instead of building a huge numerical knowledge, the concepts are learnt gradually from rewards through interaction with the environment. Moreover the probabilistic formation of the concepts is employed to deal with uncertain and dynamic nature of real problems in addition to the ability of generalization. These characteristics as a whole distinguish the proposed learning algorithm from both a pure classification algorithm and typical reinforcement learning. Simulation results show advantages of the proposed framework in terms of convergence speed as well as generalization and asymptotic behavior because of utilizing both success and failures attempts through received rewards. Experimental results, on the other hand, show the applicability and effectiveness of the proposed method in continuous and noisy environments for a real robotic task such as maze as well as the benefits of implementing an incremental learning scenario in artificial agents.

Keywords: Reinforcement Learning, Concept Learning, probabilistic decision making

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1250
3 Learning Classifier Systems Approach for Automated Discovery of Crisp and Fuzzy Hierarchical Production Rules

Authors: Suraiya Jabin, Kamal K. Bharadwaj

Abstract:

This research presents a system for post processing of data that takes mined flat rules as input and discovers crisp as well as fuzzy hierarchical structures using Learning Classifier System approach. Learning Classifier System (LCS) is basically a machine learning technique that combines evolutionary computing, reinforcement learning, supervised or unsupervised learning and heuristics to produce adaptive systems. A LCS learns by interacting with an environment from which it receives feedback in the form of numerical reward. Learning is achieved by trying to maximize the amount of reward received. Crisp description for a concept usually cannot represent human knowledge completely and practically. In the proposed Learning Classifier System initial population is constructed as a random collection of HPR–trees (related production rules) and crisp / fuzzy hierarchies are evolved. A fuzzy subsumption relation is suggested for the proposed system and based on Subsumption Matrix (SM), a suitable fitness function is proposed. Suitable genetic operators are proposed for the chosen chromosome representation method. For implementing reinforcement a suitable reward and punishment scheme is also proposed. Experimental results are presented to demonstrate the performance of the proposed system.

Keywords: Data Mining, Reinforcement Learning, Learning Classifier System, Hierarchical Production Rule, Fuzzy Subsumption Relation, Subsumption matrix

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1184
2 Markov Game Controller Design Algorithms

Authors: Rajneesh Sharma, M. Gopal

Abstract:

Markov games are a generalization of Markov decision process to a multi-agent setting. Two-player zero-sum Markov game framework offers an effective platform for designing robust controllers. This paper presents two novel controller design algorithms that use ideas from game-theory literature to produce reliable controllers that are able to maintain performance in presence of noise and parameter variations. A more widely used approach for controller design is the H∞ optimal control, which suffers from high computational demand and at times, may be infeasible. Our approach generates an optimal control policy for the agent (controller) via a simple Linear Program enabling the controller to learn about the unknown environment. The controller is facing an unknown environment, and in our formulation this environment corresponds to the behavior rules of the noise modeled as the opponent. Proposed controller architectures attempt to improve controller reliability by a gradual mixing of algorithmic approaches drawn from the game theory literature and the Minimax-Q Markov game solution approach, in a reinforcement-learning framework. We test the proposed algorithms on a simulated Inverted Pendulum Swing-up task and compare its performance against standard Q learning.

Keywords: Reinforcement Learning, Inverted pendulum, controller, Markov decision process, Matrix Games, Markov Games, Smooth Fictitious play

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1237
1 A Learning Agent for Knowledge Extraction from an Active Semantic Network

Authors: Simon Thiel, Stavros Dalakakis, Dieter Roller

Abstract:

This paper outlines the development of a learning retrieval agent. Task of this agent is to extract knowledge of the Active Semantic Network in respect to user-requests. Based on a reinforcement learning approach, the agent learns to interpret the user-s intention. Especially, the learning algorithm focuses on the retrieval of complex long distant relations. Increasing its learnt knowledge with every request-result-evaluation sequence, the agent enhances his capability in finding the intended information.

Keywords: Reinforcement Learning, learning retrieval agent, search in semantic networks

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1209