The RL Interaction Loop

AGENT (Decision Maker)

ENVIRONMENT (External World)

Agent → Environment: Action (aₜ)

Environment → Agent: State (sₜ₊₁)

Environment → Agent: Reward (rₜ)

At each timestep t:

1. Agent observes current state sₜ
2. Agent selects action aₜ based on policy π
3. Environment transitions to new state sₜ₊₁
4. Agent receives reward rₜ

🔄 Repeat until a terminal state is reached (episodic tasks) or indefinitely (continuing tasks)
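
The four steps above can be sketched as a short Python loop. This is only an illustration under stated assumptions: `CorridorEnv`, `random_policy`, and `run_episode` are hypothetical names invented for this example, the environment is a toy five-cell corridor, and the `max_steps` cap stands in for the "or forever" case so the loop always returns.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: walk from cell 0 to the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        # Start every episode in the leftmost cell.
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clamped to the corridor.
        self.state = max(0, min(self.length - 1, self.state + action))
        done = self.state == self.length - 1   # terminal state: rightmost cell
        reward = 1.0 if done else 0.0          # assumed reward: 1 on reaching the goal
        return self.state, reward, done


def random_policy(state, rng):
    """Policy pi (assumed): ignores the state and picks a direction uniformly."""
    return rng.choice([-1, 1])


def run_episode(env, policy, rng, max_steps=1000):
    """Run one episode and return (total reward, number of timesteps taken)."""
    state = env.reset()
    total_reward = 0.0
    for t in range(max_steps):
        action = policy(state, rng)                   # steps 1-2: observe s_t, select a_t
        next_state, reward, done = env.step(action)   # step 3: transition to s_{t+1}
        total_reward += reward                        # step 4: receive r_t
        state = next_state
        if done:                                      # repeat until terminal state
            return total_reward, t + 1
    return total_reward, max_steps


if __name__ == "__main__":
    rng = random.Random(0)
    total, steps = run_episode(CorridorEnv(), random_policy, rng)
    print(f"return={total}, steps={steps}")
```

Passing the policy in as a plain function keeps the agent and the environment separate, which mirrors the two boxes in the diagram: swapping `random_policy` for a learned policy would not require any change to the loop itself.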