The RL Interaction Loop
AGENT (Decision Maker)
ENVIRONMENT (External World)
Action (aₜ)
State (sₜ₊₁)
Reward (rₜ)
At each timestep t:
1. Agent observes current state sₜ
2. Agent selects action aₜ based on policy π
3. Environment transitions to new state sₜ₊₁
4. Agent receives reward rₜ
🔄 Repeat until a terminal state is reached (episodic tasks) or indefinitely (continuing tasks)
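
The four steps above can be sketched as a short Python loop. This is only an illustration under stated assumptions: `CorridorEnv`, `random_policy`, and `run_episode` are hypothetical names made up for this example, and the toy corridor world with its 0/1 reward is not part of the diagram. It keeps the diagram's indexing, where rₜ is the reward received after taking aₜ.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: walk from cell 0 to the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        # Start every episode at the left end of the corridor.
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clamped to the corridor.
        self.state = max(0, min(self.length - 1, self.state + action))
        done = self.state == self.length - 1   # terminal state: the last cell
        reward = 1.0 if done else 0.0          # assumed reward: 1 only at the goal
        return self.state, reward, done


def random_policy(state):
    # Hypothetical policy pi: ignores the state, moves right with probability 0.8.
    return 1 if random.random() < 0.8 else -1


def run_episode(env, policy, max_steps=100):
    """Run the agent-environment loop until a terminal state or max_steps."""
    state = env.reset()
    total_reward = 0.0
    trajectory = []
    for _ in range(max_steps):
        action = policy(state)                        # steps 1-2: observe s_t, select a_t
        next_state, reward, done = env.step(action)   # steps 3-4: get s_{t+1} and r_t
        trajectory.append((state, action, reward, next_state))
        total_reward += reward
        state = next_state
        if done:                                      # stop at the terminal state
            break
    return total_reward, trajectory


if __name__ == "__main__":
    # A deterministic "always move right" policy reaches the goal in 4 steps.
    total, traj = run_episode(CorridorEnv(length=5), lambda s: 1)
    print(total, len(traj))  # prints: 1.0 4
```

The `max_steps` cap stands in for the "or indefinitely" case: a continuing task has no terminal state, so a practical loop needs some other stopping rule. Swapping `lambda s: 1` for `random_policy` gives a stochastic agent without changing the loop itself.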