# Reinforcement learning

## Contents

- 1 What is Reinforcement Learning
- 2 Evaluative Feedback (Chapter 2)
- 3 Reinforcement Learning Problems (Sutton and Barto Chapter 3)
- 4 Dynamic Programming (Sutton and Barto Chapter 4)
- 5 Monte Carlo Methods (Sutton and Barto Chapter 5)
- 6 Temporal-Difference Learning (Sutton and Barto Chapter 6)
- 7 Eligibility Traces (Sutton and Barto Chapter 7)
- 8 Related Terms
- 9 References

## What is Reinforcement Learning

"Reinforcement learning (RL) is learning from interaction with an environment, from the consequences of action, rather than from explicit teaching." -- Rich Sutton

## Evaluative Feedback (Chapter 2)

### Softmax Action Selection

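Softmax action selection is one way to balance exploration and exploitation: actions with higher estimated value are chosen more often, but every action keeps a nonzero probability of being tried. The softmax policy chooses action *a* at time step *t* with probability:

$$P_t(a) = \frac{e^{Q_t(a)/\tau}}{\sum_{b=1}^{n} e^{Q_t(b)/\tau}}$$

where $Q_t(a)$ is the estimated value of action *a* at time step *t*, *n* is the number of available actions, and $\tau > 0$ is the temperature parameter.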
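The selection rule can be illustrated with a short sketch. It assumes the Gibbs (Boltzmann) form described in Sutton and Barto Chapter 2, in which the probability of an action is proportional to exp(Q_t(a)/τ) for a temperature τ > 0. The function names `softmax_probabilities` and `select_action`, the default temperature, and the example value estimates are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax_probabilities(q_values, temperature=1.0):
    """Return Gibbs/Boltzmann action probabilities for the given value estimates."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtracting the maximum before exponentiating leaves the resulting
    # probabilities unchanged but avoids overflow when Q / tau is large.
    max_q = max(q_values)
    weights = [math.exp((q - max_q) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(q_values, temperature=1.0, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    q = [1.0, 2.0, 0.5]  # hypothetical action-value estimates Q_t(a)
    for tau in (0.1, 1.0, 10.0):
        probs = softmax_probabilities(q, tau)
        print(tau, [round(p, 3) for p in probs])
```

With a high temperature the probabilities approach a uniform distribution (more exploration). As the temperature approaches zero, the policy approaches greedy selection of the action with the highest estimated value (more exploitation).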

## Reinforcement Learning Problems (Sutton and Barto Chapter 3)

## Dynamic Programming (Sutton and Barto Chapter 4)

## Monte Carlo Methods (Sutton and Barto Chapter 5)

## Temporal-Difference Learning (Sutton and Barto Chapter 6)

## Eligibility Traces (Sutton and Barto Chapter 7)

## Related Terms

- Machine Learning
- Adaptive Dynamic Programming
- Markov Decision Process

## References

- Sutton, R. S. and Barto, A. G., *Reinforcement Learning: An Introduction*, MIT Press, 1998 (online version at http://www.cs.ualberta.ca/%7Esutton/book/ebook/the-book.html)