Bellman Equations

Deep reinforcement learning series

Posted by Yuan on September 10, 2021

We all learn through reward systems. Our bodies have also adapted through ‘reward’ systems.

I went through Reinforcement Learning: Theory and Algorithms, by Alekh Agarwal, Nan Jiang, Sham Kakade, and Wen Sun. The book was too abstract for me, and I was drowning in an ocean of equations. At the very beginning, the Bellman Equation is introduced.

The Bellman Equation is fundamental to Reinforcement Learning algorithms. It tells you how to decompose the value function into the immediate reward plus the discounted value of future returns. Because of this recursive structure, we can apply dynamic programming to compute the value function. The Bellman Optimality Equation also follows from the Bellman Equation, by taking the best action at each step instead of following a fixed policy.
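To make the decomposition concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the two-state, two-action MDP, its transition table `P`, its reward table `R`, and the discount `gamma` are made-up numbers, and the function names are hypothetical. It repeatedly applies the Bellman Equation for a fixed policy, V(s) = sum_a pi(a|s) [R(s,a) + gamma * sum_s' P(s'|s,a) V(s')], and then the Bellman Optimality Equation, where the average over actions becomes a max.

```python
import numpy as np

# Hypothetical toy MDP: all numbers below are made up for illustration.
# P[a, s, t] = probability of moving from state s to state t under action a.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.5, 0.5], [0.0, 1.0]],  # action 1
])
# R[s, a] = expected immediate reward for taking action a in state s.
R = np.array([
    [1.0, 0.0],
    [0.0, 2.0],
])
gamma = 0.9  # discount factor


def q_values(V):
    """Immediate reward plus discounted expected value of the next state."""
    return R + gamma * np.einsum("ast,t->sa", P, V)


def policy_evaluation(policy, tol=1e-10):
    """Iterate the Bellman Equation for a fixed policy[s, a] = pi(a|s)."""
    V = np.zeros(R.shape[0])
    while True:
        V_new = (policy * q_values(V)).sum(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new


def value_iteration(tol=1e-10):
    """Iterate the Bellman Optimality Equation: max over actions."""
    V = np.zeros(R.shape[0])
    while True:
        Q = q_values(V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new


# A policy that picks each action with probability 0.5 in every state.
uniform = np.full((2, 2), 0.5)
V_uniform = policy_evaluation(uniform)
V_star, pi_star = value_iteration()

print("V under the uniform policy:", V_uniform)
print("Optimal V:", V_star)
print("Greedy action per state:", pi_star)
```

Both loops are plain dynamic programming: each sweep plugs the current estimate of V into the right-hand side of the equation and reads off a new estimate. With gamma below 1 the update is a contraction, so the estimates settle to a fixed point, and the optimal values should never fall below those of the uniform policy.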

Needless to say, I needed to read other references for a better understanding. After going through several online videos and webpages (listed below), I came to appreciate the beauty and logic of this equation a little bit :-(

Other References

  1. Wiki:


  3. RL theory by Csaba Szepesvári at the University of Alberta