Reinforcement learning

From Wikiversity

What is Reinforcement Learning?

"Reinforcement learning (RL) is learning from interaction with an environment, from the consequences of action, rather than from explicit teaching." -- Rich Sutton

Evaluative Feedback (Sutton and Barto Chapter 2)

Softmax Action Selection

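Softmax action selection is one way to balance exploration and exploitation. At time step t, the softmax policy chooses action a with probability

\frac{\exp(\frac{Q_t(a)}{\tau})}{\sum_{b=1}^n \exp(\frac{Q_t(b)}{\tau})}

where Q_t(a) is the current estimate of the value of action a, n is the number of available actions, and \tau > 0 is a parameter called the temperature. High temperatures make all actions nearly equally likely; as \tau approaches 0, the policy approaches greedy selection of the action with the highest estimate.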
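
The softmax rule above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function names softmax_probabilities and select_action, and the rng parameter, are assumptions introduced here and do not come from the source. Subtracting the largest estimate before exponentiating is a numerical-stability choice; it leaves the resulting probabilities unchanged.

```python
import math
import random


def softmax_probabilities(q_values, tau):
    """Return softmax (Gibbs/Boltzmann) probabilities for a list of action-value estimates.

    q_values: estimates Q_t(a), one per action (hypothetical input format).
    tau: temperature, must be positive.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    # Shifting by the maximum does not change the ratios but avoids overflow in exp.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / tau) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(q_values, tau, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probabilities(q_values, tau)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    estimates = [1.0, 2.0, 3.0]  # made-up values, for illustration only
    for tau in (0.1, 1.0, 10.0):
        print(tau, softmax_probabilities(estimates, tau))
    print("sampled action:", select_action(estimates, tau=1.0))
```

Running the script with different values of tau shows the trade-off: a small temperature concentrates almost all probability on the highest-valued action (exploitation), while a large temperature spreads probability more evenly across actions (exploration).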

Reinforcement Learning Problems (Sutton and Barto Chapter 3)

Dynamic Programming (Sutton and Barto Chapter 4)

Monte Carlo Methods (Sutton and Barto Chapter 5)

Temporal-Difference Learning (Sutton and Barto Chapter 6)

Eligibility Traces (Sutton and Barto Chapter 7)

Related Terms

  • Machine Learning
  • Adaptive Dynamic Programming
  • Markov Decision Process

References
