Dynamic Programming Squared - Value iteration

Hi everyone,

I would like to ask a question about Dynamic Programming Squared (I want to be sure, that I understand this concept properly).

My understanding is, that Dynamic Programming Squared problems include two or more agents, whose behavior is described by Bellman equations, and some variables in each agent’s Bellman equation are results of optimization behavior of another agent, which makes both problems interdependent. I am thinking about a model with non-convex choice (two agents, whose choices are interdependent) set, so I can’t use usual approach, that combines first-order conditions with market-clearing conditions and law of motion equations, also I don’t want to use LQ approximation.

I plan to solve model directly using dynamic programming/value function iteration, firstly build “nested Bellman equation” -> pick one agent, and substitute the “strategic” variables in his optimization problem with argmax of another agent’s Bellman equation, discretize whole problem, and for each choice of “outer” agent solve “inner” Bellman equation. However, I am not sure, how to represent “all choices” of the outer agent. I think, that each “choice” to be whole policy function (to get the law of motion, subject to which “inner” agent optimize). Should I use full discretization, and then try all possible permutations of mapping from state (point) to choice (in discrete setting, number of these permutations will be finite)? Is this a correct way, how to solve a multiagent problem, or did I understand “DP^2” in an incorrect way?

Any guidance or reference will be very welcome!

Best
Jan Žemlička

1 Like

Hi @Honza9723, sorry I missed this. Tom Sargent is more the expert in this area. (I usually work with single agent problems embedded in a heterogeneous agent framework, which is easier!) But your understanding of DP^2 is the same as mine.

I would say that you should avoid full discretization, since it’s completely subject to the curse of dimensionality (due to lack of “space-filling” properties). I would use interpolation and my standard go-to is linear interpolation because it’s simple and shape preserving. (Monotonicity and concavity/convexity of sampled functions is preserved in the interpolation.) This library is fast because it’s JIT-compiled. (Or are you using Julia?) There’s examples of how to apply it scattered through our lectures.

I hope that helps, at least a little.