Help with the code in `DiscreteDP` (_s_wise_matrix)

used4regnum · January 22, 2025, 9:23pm

I’m reading the chapter on 4. Discrete State Dynamic Programming. As I see in the code of the class DiscreteDP (from ddp.py), the main heavy lifting of the algorithm happens behind the two auxiliary functions (I’ve made them in functional style in order to avoid classes self everywhere):

def _s_wise_max(a_indices, a_indptr, vals, out_max):
n = len(out_max)   # grid_size
    
for i in range(n):
    if a_indptr[i] != a_indptr[i+1]:
        m = a_indptr[i]
        for j in range(a_indptr[i]+1, a_indptr[i+1]):
            if vals[j] > vals[m]:
                m = j
        out_max[i] = vals[m]

which in its turn depends on:

# from utilities
def generate_a_indptr(num_states, s_indices, out):  #expects: out= empty a_indprt
    """
    helper for _s_wise_max
    """
    idx = 0
    out[0] = 0
    for s in range(num_states-1):
        while(s_indices[idx] == s):
            idx += 1
        out[s+1] = idx
    out[num_states] = len(s_indices)

This functions take L size vector R + beta * Q.dot(v) and transform it to grid_size Tv, which is the main part of Bellman operator routing.

Can someone explain the logic behind the procedure, i.e. what this a_indprt index does. Or may be there is a description of this algorithm. The reference also would be great!

oyama.daisuke · January 29, 2025, 12:11am

It is based on the “compressed sparse row” (CSR) data structure. See for example:

Wikipedia entry or
Documentation for scipy.sparse. csr_matrix.

Topic		Replies	Views
Generalizing the mapping of indices with more than two state variable Lectures	3	1466	July 14, 2017
More efficient way to compute the Q matrix for discrete dynamic programming Lectures	2	1190	July 25, 2018
Discrete DP error - ' Rows of P must sum to 1'	4	1234	February 12, 2019
DiscreteDP: Multiple Actions	4	832	January 22, 2025
More than 3 dimensions in Discrete Dynamic Programming	3	1408	May 21, 2019

Help with the code in `DiscreteDP` (_s_wise_matrix)

Related topics