Collocation with NLsolve.jl

Hi everyone,

I would like to ask you for advice about my collocation code. I am trying to solve a simple deterministic optimal growth model using collocation with Chebyshev polynomials. To solve these equations, I use NLsolve.jl package. Unfortunately, solvers are diverging (both newton and trusted region methods), even after a massive increase in the allowed number of iterations, except when they start exactly at the solution… Would anybody able to help me with this issue?

I found a way around this by applying an optimization package to the square of residuals. It works robustly, but it is quite slow, so I would like to find a way, how to solve collocation equations using standard Newton-type methods. Thanks in advance!

#----------------------------------------------*
#-----Solve simple deterministic growth model using ChebyshevProjection-----
#----------------------------------------------*
#(1) Install and initialize packages
using Pkg
Pkg.add(“NLsolve”)
Pkg.add(“Distributed”)
Pkg.add(“Plots”)
Pkg.add(“LinearAlgebra”)
Pkg.add(“Random”)
Pkg.add(“Distributions”)
Pkg.add(“Parameters”)
Pkg.add(“Optim”)
using NLsolve
using Distributed
using Parameters
using LinearAlgebra
using Random
using Distributions
using Plots
using Optim

#(2) Include ChebyFunctions
#--------------------------------*
#Transform general yϵ[a,b] domain into xϵ[-1,1] interval
function Transform(y,a,b)
let
x = 2 .((y.-a)./(b.-a)) .-1
return x
end
end
#Inverse transformation, from [-1,1] to [a,b]
function iTransform(x,a,b)
let
y = a .+ (b.-a).
(x.+1)./2
return y
end
end
#--------------------------------*
#Compute zeros of Chebyshev polynomial (on basic [-1,1] range)
function ChebyZeroBase(n)
let
Zrs = Array{Float64}(undef,n,1)
for i = 1:n
Zrs[i] = -cos(pi*(2i-1)/(2n))
end
return Zrs
end
end
#---------------------------------
#Build Chebyshev polynomial
function Cheby(n,x)
let
m = n+1
if(n<0)
println(“Error, please set order to n >=0”)
return
elseif(n==0)
y = 1.0
return y
elseif(n==1)
y = x
return y
elseif(n>=2)
T = Array{Float64}(undef,m,1)
T[1,1] = 1.0
T[2,1] = x
for i = 3:m
T[i,1] = 2xT[i-1,1] - T[i-2,1]
end
y = T[m,1]
return y
end
end
end
function ChebyDir(n,x)
let
y = cos.(n.acos(x))
return y
end
end
#
----------------------------------
#Chebyshev orthogonal regression
function BuildX(Param,n,ζ)
let
@unpack a,b = Param
X = Array{Float64}(undef,length(ζ),n+1)
𝞯 = Transform(ζ,a,b)
for i in 1:length(𝞯),j in 1:(n+1)
X[i,j] = Cheby(j-1,𝞯[i])
end
return X
end
end
#---------------------------------
#Build Chebyshev approximation function
function ChebyAprox(Param,θ,ζ)
let
@unpack a,b = Param
X = BuildX(Param,length(θ)-1,ζ)
φ = Xθ
return φ
end
end
function ChebyApIn(Param,θ,y)
let
@unpack a,b = Param
ϕ = Array{Float64}(undef,length(θ),1)
for i in 1:length(θ)
ϕ[i] = θ[i]Cheby(i-1,Transform(y,a,b))
end
π = sum(ϕ)
return π
end
end
function ChebyApOt(Param,θ,ζ)
let
φ = Array{Float64}(undef,length(ζ),1)
for i in 1:length(ζ)
φ[i] = ChebyApIn(Param,θ,ζ[i])
end
return φ
end
end
function NGsteady(Econ)
@unpack α,β,δ = Econ
Kss = (α/(1/β - (1-δ)))^(1/(1-α))
Css = Kss^α - δ
Kss
return Kss,Css
end
function BMPolicy(Econ,K,A=nothing)
@unpack α,β = Econ
if(A==nothing)
A = 1.0
C = (1-α
β).*A.K.^α
else
C = (1-α
β).*A.*K.^α
end
return C
end

#(4) Calibrate parameter values
@with_kw struct Param
a::Float64 = 0.5-1.5^(1/3)
b::Float64 = 0.5+2^(1/3)
end
@with_kw struct Econ
α::Float64 = 1/3
β::Float64 = 1/(1+0.05)
γ::Float64 = 1
δ::Float64 = 0.85
end
Econ1 = Econ(δ=1.0)
Kss = NGsteady(Econ1)[2]
Param2 = Param(a=Kss0.25,b=Kss5.5)
@unpack a,b = Param2
τ = 15

#(3) Build grid
ζ = iTransform(ChebyZeroBase(500),a,b)
kGrid = iTransform(ChebyZeroBase(τ+1),a,b)

#(4) Prepare initial guess
Y = log.(BMPolicy(Econ1,ζ))
X = BuildX(Param2,τ,ζ)
θ = inv(transpose(X)*X)*transpose(X)*Y
ψ = exp.(ChebyApOt(Param2,θ,ζ))
display(Plots.plot(ζ,exp.(Y),title=“Policy function computed by ChebyshevProjection”,
xlabel=“K(t)”,ylabel=“C(t)”,legend=:bottomright,label=“BrockMirman”))
display(Plots.plot!(ζ,ψ,label=“LeastSquares”))

#(5) Prepare residual function
function EulerCollocation(θ,Econ,Param,Grid)
let
@unpack α,β,γ,δ = Econ
@unpack a,b = Param
#(2) Build residuals on a grid
Eu = Array{Float64}(undef,length(Grid),1)
for i in 1:length(Grid)
#Compute t,t+1 consumption
Ct = exp.(ChebyApIn(Param,θ,Grid[i]))
Kt1 = (1-δ)Grid[i] + Grid[i]^α - Ct
Ct1 = exp.(ChebyApIn(Param,θ,Kt1))
#Now, prevent consumption to be negative
#Then compute euler equation residuals
Eu[i] = (Ct^(-γ) - β
(Ct1^(-γ))(Kt1^(α-1))+(1-δ)))
end
ρ = sum(Eu)
return Eu
end
end
Econ2 = Econ(δ=0.99)

EulerCollResid(θ) = EulerCollocation(θ,Econ1,Param2,kGrid)

Sol = nlsolve(EulerCollResid,θ,ftol=10^(-14))

θ2 = Sol.zero

ψ2 = exp.(ChebyApOt(Param2,θ2,ζ))
display(Plots.plot!(ζ,ψ2,label=“Collocation”))

EulerCollResid(θ) = EulerCollocation(θ2,Econ2,Param2,kGrid)
Sol2 = nlsolve(EulerCollResid,θ2,ftol=10^(-12),iterations=100000)

Best,
Honza

You can try Anderson iteration, which is supported by NLsolve.

But in general, you really need Jacobians to do this sort of thing efficiently (For either the solver or the nlls approach). I usually make sure that sparse auto differentiation works on the function. Then if NLsolve doesn’t do it, you could try commercial solvers.

But the basic problem with these methods is that the functions are not convex… So perhaps it isn’t converging because it shouldn’t converge from your starting point.

@jlperla Thank you for a response! I will definitely try to do that. So it is possible, that NLsove failed because finite differencing is inaccurate? Should I try ForwadDiff.jl or Zygote.jl?

Actually, I think, that I “should” work, starting from this initial condition. It is (pretty accurate) least-squares approximation of closed-form solution, and I tried to use it just for depreciation rate = 0.99, which is kind of “epsilon” deviation. Actually, I saw Matlab codes which used as guess linear approximation, which is much worse than this. That is, why I am worried, wheter I didn’t mess up something about NLsolve.jl syntax.

That can happen, but it is more likely that it will be very slow so you will find it hard to do a multi-start approach.

Zygote is reverse-mode and only appropriate for thigns like R^N → R. ForwardDiff is good for the R^N → R^N, but the real benefit comes from the sparsity of the jacobian. See GitHub - JuliaDiff/SparseDiffTools.jl: Fast jacobian computation through sparsity exploitation and matrix coloring for more. The sparisty is independent of the difference scheme (i.e. you can use finite differences or auto-diff).

Over the next months, another option will be GitHub - YingboMa/ForwardDiff2.jl - which is intended for vectorized functions, but I am not sure it is ready for you to play with, or that Sparsity is conveniently wrapped. It is “the future” for this sort of problem, ethough.

If you have a solution that you know is locally correct (and isn’t a saddlepoint, etc), and it isn’t converging from there, then I don’t see why NLsolve would be any different than the others. You might be right that there is a mistake.

You have a lot of code there, and may have made a mistake in the implementation. I would try to use packages whereever possible, such as GitHub - QuantEcon/BasisMatrices.jl: Routines for constructing BasisMatrices of different types (Chebyshev polynomials, B-Splines, piecewise linear, complete monomials, Smolyak...) or look at ApproxFun.jl for the cheybysehv basis.

Basically, avoid writing a single line of code you don’t need to, because otherwise you need to test all of those functions in isolation, one by one…

Thank you for your suggestions. I will try that! I coded this stuff by myself just to see, how these things work in detail, in any “real” project, I am using packages written by people, who understood thing better than me… :smiley: The rest of the code should be ok (I used it to solve the model with BlackBoxOptim.jl, code was the same, except squaring residuals, and it found a solution without any problem…).

1 Like