SARSA

From Wikiversity

Overview


SARSA (state-action-reward-state-action) is an on-policy temporal-difference algorithm for reinforcement learning. Here it is applied to an artificial neural network (ANN).
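For reference, the standard SARSA update for an action value Q(s, a) can be written as below. The learning rate η, the discount factor γ, the reward r and the next state-action pair (s', a') are notation assumed here for illustration.

```latex
Q(s, a) \leftarrow Q(s, a) + \eta \left[ r + \gamma \, Q(s', a') - Q(s, a) \right]
```

The target uses the action a' that is actually taken in the next state s', rather than the best available action.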

Exercise

We propose constructing such an artificial neural network as a way to understand the concepts of this field.

Context

The ANN represents a simple model of a rat's hippocampal place cells and action-taking neurons when the rat is placed in a Morris water maze. What is happening in the rat's brain as it learns to swim consistently straight to the platform after a couple of trials? At this point, science cannot answer this question in full detail. One can only speak of spatial learning, place learning, cognitive map formation and memory in general. This project tries to fit a simple mathematical model to these processes.

The SARSA algorithm

We now explain how the ANN is constructed. It consists of only two layers (a bit like a perceptron): an 'input layer' of 100 neurons and an 'output layer' of only 4 neurons. The input neurons (i) correspond to the fictional rat's place cells and are arranged in a 10 by 10 grid covering the whole pool area. These neurons modulate their firing rate (Φ) according to the rat's position in the pool (s). Each of the four output neurons (a) has its own firing rate (Q), which is modulated solely by the activity of the input neurons to which it is connected. Every output neuron is connected to every input neuron, each connection having its own weight (w).
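The two-layer structure described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the pool is taken as the unit square, the place cells are given Gaussian tuning curves of width SIGMA, each output rate is a plain weighted sum of the input rates, and the weights start at zero. None of these choices is fixed by the text, and the names place_cell_rates and output_rates are hypothetical.

```python
import math

N_SIDE = 10       # place cells form a 10 by 10 grid (from the text)
N_ACTIONS = 4     # four output neurons (from the text)
SIGMA = 0.1       # ASSUMPTION: width of each Gaussian place field

# ASSUMPTION: the pool is the unit square; centres lie on a regular grid over it.
CENTRES = [(i / (N_SIDE - 1), j / (N_SIDE - 1))
           for i in range(N_SIDE) for j in range(N_SIDE)]


def place_cell_rates(s):
    """Firing rates Phi_i(s) of the 100 input neurons at position s = (x, y).

    ASSUMPTION: each cell fires with a Gaussian profile around its centre.
    """
    x, y = s
    return [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * SIGMA ** 2))
            for cx, cy in CENTRES]


def output_rates(w, s):
    """Firing rates Q_a(s) of the output neurons.

    ASSUMPTION: Q_a(s) is the weighted sum over i of w[a][i] * Phi_i(s).
    """
    phi = place_cell_rates(s)
    return [sum(w_ai * phi_i for w_ai, phi_i in zip(w_a, phi)) for w_a in w]


# One weight per (output neuron, input neuron) pair; zeros are a placeholder.
w = [[0.0] * len(CENTRES) for _ in range(N_ACTIONS)]
```

Calling output_rates(w, (0.5, 0.5)) returns a list of four values, one per output neuron, for a rat at the centre of the assumed pool.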

Initialization

Moving

Rewards

Weights update