Python library recommendations for AD, Polynomial Basis, and Constrained Nonlinear Least Squares


#1

To implement a number of pseudo-spectral and other algorithms, I often require several libraries:

  • High quality auto-differentiation (AD) capable of supporting thousands or hundreds of thousands of variables (with a high degree of sparsity, of course). Support for sparsity patterns is essential.

  • Constrained nonlinear least squares (or at least a constrained nonlinear optimizer), also capable of handling big sparse systems

  • Polynomial basis and quadrature (in one or two dimensions)

In Matlab, the answer is clear: the commercial Tomlab library has AD and wrappers to high-quality commercial optimizers like Knitro, NPSOL, etc. CompEcon provides quadrature rules and basis matrices.

What are the appropriate Python libraries for this that are of commercial quality, with comprehensive tests and easy installation via conda etc.?

  • AD: Many of the libraries, like Theano, work on symbolic expressions, which won’t work here. I found https://github.com/HIPS/autograd/ but it seems that tensorflow also has an AD implementation (as far as I can tell). Gradients and Jacobians are essential for my immediate requirements, and Hessians are essential in general.

  • Constrained nonlinear least squares: I may have hundreds of thousands of variables with high sparsity and linear constraints, so a high-quality library is essential. I didn’t have much luck with Ipopt on a previous project, for example, which led me to license Knitro/NPSOL.

  • Constrained Optimizer: More generally for other problems, what are the nonlinear optimizers people have had luck with in the python ecosystem that can handle big problems?

  • Polynomial Basis: In particular, one- or two-dimensional Chebyshev polynomials. For Python, do people use NumPy/SciPy for this (e.g. https://docs.scipy.org/doc/numpy/reference/routines.polynomials.chebyshev.html)? It was unclear from the documentation how you would get a basis matrix and the differentiation matrix, if this is the library people use.

  • Quadrature: Chebyshev, Laguerre, and Gauss-Lobatto quadrature weights and rules. I see the https://docs.scipy.org/doc/numpy/reference/routines.polynomials.chebyshev.html routines, but couldn’t find Gauss-Lobatto. What about https://pypi.python.org/pypi/quadpy?
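For concreteness, here is roughly what I mean by a basis matrix and a differentiation matrix, sketched with `numpy.polynomial.chebyshev` (my own hypothetical glue code; I am assuming `chebvander`, `chebder`, and `chebval` are the right building blocks, and a real library would do this more carefully):

```python
import numpy as np
from numpy.polynomial import chebyshev as C

deg = 5
# Chebyshev-Gauss-Lobatto (extrema) nodes on [-1, 1]
x = np.cos(np.pi * np.arange(deg + 1) / deg)

# Basis (Vandermonde) matrix: V[i, k] = T_k(x_i)
V = C.chebvander(x, deg)

# Matrix of basis derivatives: Vd[i, k] = T_k'(x_i)
I = np.eye(deg + 1)
Vd = np.column_stack([C.chebval(x, C.chebder(I[:, k])) for k in range(deg + 1)])

# Nodal differentiation matrix: maps f(x_i) -> f'(x_i) for degree-deg interpolants
D = Vd @ np.linalg.inv(V)

# Sanity check: differentiates f(x) = x**3 exactly (degree 3 <= deg)
f = x**3
assert np.allclose(D @ f, 3 * x**2)
```

This composes the derivative basis with the inverse Vandermonde, which is fine at small degree; a production implementation would presumably use a more stable construction.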

Thanks for any thoughts. More generally, I think it would be very valuable to collect a curated list of libraries for QuantEcon so economists stop writing (and insufficiently testing) so much of their own code from scratch.


#2

Hey Jesse – lots of good questions.

Thanks for posting.

I’ll respond to the items I know something about.

AD

Python

A quick google search turned this up: https://arxiv.org/pdf/1606.06311.pdf

Hopefully it gives a good sense of what performance is like.

Julia

Julia actually has very good AD support.

Both ForwardDiff.jl and ReverseDiff.jl are well-written implementations of their respective flavors of AD. Which is better is usually application-dependent, so you should run some benchmarks on your problem.

Jarrett Revels is the primary author of these (and the up-and-coming Cassette.jl), and I’ll rope him in and ask him to comment more if necessary.
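For intuition on the forward-mode flavor: ForwardDiff.jl is built on dual numbers, and the core idea fits in a few lines of toy (decidedly non-library-quality) Python:

```python
import math

class Dual:
    """Minimal forward-mode AD scalar: a value plus a derivative component."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule on the derivative component
        return Dual(self.val * other.val, self.val * other.dot + self.dot * other.val)
    __rmul__ = __mul__

def sin(x):
    # chain rule for an elementary function
    if isinstance(x, Dual):
        return Dual(math.sin(x.val), math.cos(x.val) * x.dot)
    return math.sin(x)

def derivative(f, x):
    return f(Dual(x, 1.0)).dot  # seed dx/dx = 1

# f(x) = x*sin(x) + 3x  ->  f'(x) = sin(x) + x*cos(x) + 3
f = lambda x: x * sin(x) + 3 * x
assert abs(derivative(f, 1.5) - (math.sin(1.5) + 1.5 * math.cos(1.5) + 3)) < 1e-12
```

The real packages do this over generic element types with far more operator coverage, chunking, and nesting support, but the mechanics are the same.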

Quadrature

QuantEcon has quadrature routines in both Python and Julia. The Python docs list the available routines.

They might not be the very fastest implementations out there, but they work for me.
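One caveat on the Gauss-Lobatto request above: I don’t believe NumPy/SciPy ship Lobatto rules, but for the Chebyshev weight the Lobatto nodes and weights have a simple closed form, so a hand-rolled sketch (the helper name is my own) is easy:

```python
import numpy as np

def cheb_gauss_lobatto(n):
    """Nodes and weights for integrating f(x) / sqrt(1 - x**2) over [-1, 1],
    including the endpoints +-1. Exact for polynomials up to degree 2n - 1."""
    x = np.cos(np.pi * np.arange(n + 1) / n)
    w = np.full(n + 1, np.pi / n)
    w[0] = w[-1] = np.pi / (2 * n)  # endpoint weights are halved
    return x, w

# Integral of x^2 / sqrt(1 - x^2) over [-1, 1] equals pi/2
x, w = cheb_gauss_lobatto(8)
assert np.isclose(w @ x**2, np.pi / 2)
```

Legendre-Gauss-Lobatto (unit weight) is messier since the nodes are roots of P'_{n}, which is where a dedicated package like quadpy earns its keep.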

Polynomial basis

BasisMatrices.jl is what I’d recommend in Julia. You ask later on about robustness of testing. I think coverage is something like 95%. I’d give the test suite an overall grade of B+ or A-. There is always more we can test, but I try to cover all the basics and at least make sure the math is correct.

For python there was some work in the albop/linear_basis branch of interpolation.py. This is not finished yet, but wouldn’t take too much work to polish up if need be. cc @pablo.winant.

I needed a version of BasisMatrices.jl in python a few months ago so I wrote this up: https://gist.github.com/sglyon/7790b35049342be7cc688549ad69285c. There isn’t an included test suite yet (no time), but I did test against BasisMatrices.jl when developing. Performance was pretty good…

Optimization

I don’t have too much experience here, so I’ll just refer you to the JuliaOpt home page so you can see what is available in Julia. I would imagine that all the wrapped C libraries also have Python wrappers.

If I were writing this in C++ and only needed box constraints on parameters, I would like to experiment with Google’s Ceres Solver. I think there are some Python bindings, but they might be out of date or limited.


#3

Also, re: TensorFlow. If you can express your algorithm in terms of TensorFlow operations, then you can have it use AD to give you derivative info.

Sometimes that’s harder than it sounds though…


#4

Thanks for the response (and sorry my reply is so late). One general question I have is: why are so many of the suggested routines from the QuantEcon library? For Julia I get that there is a limited number of libraries out there, so you sometimes need to fill in the blanks, but shouldn’t Python have a whole bunch of well-tested libraries that are easily installed with pip and conda? Or perhaps the Python ecosystem is more sparse than I thought…

AD:
I had looked at that Python benchmark previously, but it gives little sense of the usability of the routines (e.g. whether they work well with external routines). I have used both ADOL-C and CppAD within C++, and the paper says there are Python wrappers for those routines. Do they really work for non-trivial setups with nested functions, etc.?

Does anyone have experience using any of these? https://github.com/HIPS/autograd looks pretty solid, but I prefer to use the best-supported options out there. Thanks for confirming that the symbolic differentiation in TensorFlow is tough to use.

Quadrature and Polynomial Basis
Is there a reason you wrote something for python? Are there really not commonly used libraries for this kind of stuff?

Optimization
For Python, has anyone used Pyomo (https://github.com/Pyomo) for general modelling etc.? Is it the rough analogue of the JuliaOpt project? Is it too AMPL-like to be usable for most economics problems?


#5

Both Julia’s and Python’s ecosystems are pretty solid at this point. Most of the lectures could probably be rewritten as calls to pretty well-tested packages, but that isn’t always the best choice for teaching. As for the other questions like polynomial bases: yes, there are things in libraries like SciPy you can use (with varying degrees of efficiency…), but sometimes it’s easier to teach by writing it out.

Can you point to a topic in which Julia has a limited number of libraries? I don’t find that to be the case in any of the disciplines I work in, whereas in Python I have found some major limitations in customizability.

Pyomo is similar to Julia’s JuMP. I haven’t used it myself, but from what I know it should cover most problems.


#6

Thanks for the response, and your amazing libraries for Julia.

My question was more generally about python libraries to use and tell students about, rather than specifically about the lectures, but I think it is an important philosophical point to consider.

Perhaps it is “easier to teach by writing it out” if one is teaching a class on computational methods, but not if we are trying to teach the use of computational methods for a field. In that case, it is best to teach people the right algorithms for the right jobs, and get them to use high-quality libraries where there is a high likelihood of continued maintenance. People using these lecture notes will be basing code on them, and are not going to hunt for libraries on their own. Moreover, I don’t want people thinking that general-purpose code is really that easy to write, test, and maintain. For example, I don’t want my students having to worry about cache locality, that even something as seemingly trivial as sum might be best handled by Kahan-Babuska-Neumaier summation, or how to check whether the compiler has unrolled loops for SIMD and whether you need to tweak them manually, etc.
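To make the sum example concrete, here is a sketch of the Kahan-Babuska-Neumaier compensated sum (the function name is mine; a real library version would be properly tested):

```python
def neumaier_sum(xs):
    """Compensated summation: tracks the low-order bits naive addition discards."""
    s = 0.0
    c = 0.0  # running compensation for lost low-order bits
    for x in xs:
        t = s + x
        if abs(s) >= abs(x):
            c += (s - t) + x  # low-order digits of x were lost in s + x
        else:
            c += (x - t) + s  # low-order digits of s were lost in s + x
        s = t
    return s + c

big = [1.0, 1e100, 1.0, -1e100]
assert sum(big) == 0.0           # naive summation loses both 1.0s entirely
assert neumaier_sum(big) == 2.0  # compensated summation recovers them
```

Exactly the kind of subtlety students should get from a library (e.g. `math.fsum`) rather than rediscover the hard way.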

The broader issue is that the biggest problems with computational economics at this point are: (1) a do-it-yourself attitude, rather than leveraging the work of experts in OR, applied math, etc.; and (2) insufficient training in software engineering. If there is a mess of libraries with “varying degrees of efficiency”, then I hope we can help people figure out the best-supported winners and how to patch them together. I couldn’t (for Python, at least).

Yes, with Julia it is much easier to get a sense of specialized packages, and they naturally work well together. As far as I can tell, Python seems to have a number of huge, monolithic libraries (which might come from it being a terrible language for scientific computing, requiring tricks to get acceptable performance that make interoperability of libraries difficult?).

For what it is worth, I have decided Python is too much of a step backwards towards a 1990s OO-mindset, and am going to have my students work in Julia. And I am looking forward to telling them about https://github.com/JuliaDiffEq !


#7

Hi @jlperla, thanks for all your comments.

I’m not sure that Python is a ‘terrible language for scientific computing’, although it does of course have limitations given its design. But anyway, focusing on libraries, part of what we’d like to do at QuantEcon is to help coordinate economists on the best libraries for the job in the Python / Julia ecosystems. There are a few lectures where we use more of a ‘bare hands’ approach to help people understand algorithms, but for the most part we’d rather leave the heavy lifting to the best tools for the job.

It would be great if you could point to a specific instance where the lectures use a QuantEcon library even though, in your view, some external library would be more suitable. That kind of feedback is very helpful.

As for the contents of QuantEcon.py and QuantEcon.jl, that’s largely up to the package maintainers, but in my experience they are very open to suggestions on their respective issue trackers.

I’m also increasingly relying on Julia for my research (dynamic macro), although I don’t have the negative feelings about Python regarding 90s OO mindset. It’s still elegant and useful as far as I’m concerned.

Thanks again for your feedback. I want the lectures to be as helpful as possible for the broader econ community.


#8

You guys have no idea how thankful I am for all the information I found here. I couldn’t have imagined that I would get all this information simply by visiting a single post! Thanks for everything, guys. I really need to learn, and thanks to you I will surely be able to do so! I just wanted to ask whether you would mind if I have some questions for you later? Thanks a lot!


#9

Hi @Whily1981,

We’re glad this discussion was useful to you :slight_smile: Please post on the forum if you have any more questions.


#10

Thanks @john.stachurski

Again, thanks to all of you guys for the spectacular book and the software. I am excited to teach from it. Consider my comments about packages more as a discussion of where general-purpose code could go in the next iteration, rather than questioning where it came from originally (i.e., when you started this project, many of these packages didn’t exist).

One important philosophical point to consider: Languages with modern package management systems encourage focused, loosely coupled packages

  • I think that Perl’s CPAN, R’s CRAN, and even Stata’s SSC are part of the reason for their success. Even terrible languages can have great package management systems.
  • Since Matlab has no package system, it leads to libraries like CompEcon (or the Tomlab software I license) being self-contained, integrated, monolithic libraries. These may have been the constrained-optimal approach to getting economists to use libraries at the time, but that is mostly due to limitations in the Matlab language (e.g. no namespaces) and the lack of a package system.
  • Even though it has strong package management, Python is a little harder to navigate and more monolithic (due to the difficulty of implementing AD and interoperability with Cython code?), but searching with https://pypi.python.org/pypi?%3Aaction=search&term=kalman&submit=search I find things like https://github.com/rlabbe/filterpy

For Julia: the GitHub focus has the advantage that smaller, loosely coupled packages can be grouped into a semi-official organization (which can act as a coordination device).

It probably makes more sense to do a thorough analysis after Julia reaches 0.7/1.0 and Pkg3 is rolled out. Until then, here is a list of major Julia organizations that I have been tracking. I am certain the QuantEcon maintainers already know about all of them, but I hope it is useful for other people reading this thread.

Other organizations worth pointing people to, even if they don’t currently replace something in the notes:

There are many reasons to avoid duplication of functionality (mostly the worry of spreading precious maintenance resources too thin), but with a language supporting generic types (like Julia) there is another reason: pervasive auto-differentiation support. Having AD automagically propagate throughout all your Julia code leads to a revolutionary change in the expressiveness, clarity, and performance of algorithms. It also opens up all sorts of different algorithms which are closer to white-board math (e.g. moving away from fragile and bug-prone nested iterative fixed-point algorithms towards ones exploiting derivatives and constrained optimization).
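To illustrate why generic code matters for derivative propagation, here is a small Python analogue (complex-step differentiation rather than dual numbers, but the same principle: a function written generically over its scalar type yields derivatives essentially for free; helper names are my own):

```python
import cmath
import math

def f(x):
    # written generically over the scalar type: no hard-coded float-only operations
    return x * cmath.sin(x) + 3 * x

def complex_step(f, x, h=1e-20):
    # f'(x) ~ Im(f(x + i*h)) / h: no subtractive cancellation, so h can be tiny
    return f(complex(x, h)).imag / h

analytic = math.sin(1.5) + 1.5 * math.cos(1.5) + 3
assert abs(complex_step(f, 1.5) - analytic) < 1e-12
```

If `f` had hard-coded `math.sin` (float-only), the trick would fail with a TypeError; that is exactly the genericity constraint that language-wide AD imposes on library code.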


#11

I came across this issue in QuantEcon, which is a case in point for how game-changing pervasive AD would be: https://discourse.julialang.org/t/interpolation-in-constraint/7325

But part of getting AD (as well as GPU vectors, etc.) to work correctly is writing generic code, which is tricky. If you want to see how hard it is to write and design generic code for general-purpose utilities, read through https://github.com/JuliaLang/julia/issues/24595 which summarizes a lot of the discussion around deprecating ones and zeros: https://github.com/JuliaLang/julia/issues/24444

A lot comes down to the careful mathematics of what the identity operator is for different operations (i.e. additive vs. multiplicative identity) and using it for the particular type at hand rather than just for Float64, etc.


#12

Hi @jlperla,

Thanks for this great list! I hear what you are saying about duplication. Avoiding it means chasing a moving target and we need to keep on top of that. For my part the truth is that I’m behind the curve on a lot of these libraries at the moment, having had my head buried deep in maths problems over the last year or so.

I’m really excited to learn more about AD for solving continuous-time dynamic systems and high-dimensional optimization and integration problems. Lately I’ve been working on things that are orthogonal to AD, but calculus is such a powerful tool that when a new way to leverage it comes along, it’s exciting.

I’ll be looking at your list over the coming months and thinking about how we can better exploit these tools in the lectures with Tom Sargent. All additional suggestions are greatly appreciated.


#13

AD works, but it’s usually not the right tool. You can AD all the way through DifferentialEquations.jl’s native Julia solvers (make sure the initial condition and time are in terms of dual numbers: https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/issues/225), which is a great testament to how generically AD works in Julia. I’m sure lots of the integration tools can be similarly AD-able if they aren’t already. Generic AD is just absolutely fantastic when you’re lazy, so for now we use it in our parameter estimation stuff. However, there are better ways to handle derivatives than straightforward AD on many problems. For ODEs, it is through forward and adjoint sensitivity equations (so we’re building libraries to make getting derivatives through these easier). So it’s not the “endgame” in many disciplines, but it’s definitely a tool that can get quite close quite easily (if it’s at the language level, like in Julia).


#14

AD on the solution of a differential equation? Truly amazing that it works at all! But I think you are a few steps ahead of where most economists are in thinking about the possibilities of AD.

@ChrisRackauckas What you may not realize about economists outside of this discourse is that, for more code than you could possibly imagine, the decision is not whether to hand-code derivatives and/or use specialized algorithms vs. use AD.

Instead: people are using optimizers, solvers, and fixed-point iterations with either (1) no derivative information at all (e.g. blindly using Nelder–Mead where it is wholly inappropriate, because they don’t realize how useful derivatives are to optimizers) or (2) finite differences to calculate gradients and Jacobians, sometimes without understanding machine-precision issues. Fixing the “laziness” issue is the first-order concern!
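A quick Python sketch of the machine-precision issue with finite differences (the numbers are specific to this toy example): shrinking the step size eventually makes the estimate worse, because rounding error grows like eps/h while truncation error shrinks like h.

```python
import math

def fwd_diff(f, x, h):
    # forward difference: truncation error O(h), rounding error O(eps / h)
    return (f(x + h) - f(x)) / h

exact = math.cos(1.0)  # true derivative of sin at 1.0
err_good = abs(fwd_diff(math.sin, 1.0, 1e-8) - exact)   # near the sqrt(eps) sweet spot
err_tiny = abs(fwd_diff(math.sin, 1.0, 1e-14) - exact)  # cancellation dominates

assert err_good < 1e-6
assert err_tiny > 1e-3  # a smaller h is *worse* once rounding takes over
```

AD (or even the complex-step trick) sidesteps this trade-off entirely, which is much of the argument for getting it into economists’ standard toolkit.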


#15

I added it to the FAQ:

http://docs.juliadiffeq.org/latest/basics/faq.html#Are-the-native-Julia-solvers-compatible-with-autodifferentiation?-1


#16

Just chiming in to point to the Python Tangent library for AD developed by Google. It’s very new, so they are still actively adding functionality.

Here’s a list of Python optimization libraries that I’ve found useful at some point:

  • Pygmo: Developed by the European Space Agency, it has several flavors of differential evolution optimizers and archipelago algorithms for global optimization of non-convex problems. It also provides a nice interface to the NLopt library.

  • Mystic (trac.mystic.cacr.caltech.edu/project/mystic/wiki.html): Lots of cool stuff for global optimization with easy parallelization.

  • Pyomo: Has been mentioned above. It’s a really cool and powerful algebraic modeling library. You still need to plug it into an optimizer like Ipopt or Knitro. Also, you need to code your problems in a slightly different way than we’re used to in economics (kind of like AMPL), but for most problems it’s actually closer to how you would formulate them with pen and paper. The Julia equivalent is JuMP, which I believe is not as powerful, but is very easy to use.

  • CVXPY and CVXOPT: For convex optimization. I actually haven’t used these, but they are very well documented and widely used in industry.
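On the original large-sparse constrained NLS question: `scipy.optimize.least_squares` supports bounds and a user-supplied Jacobian sparsity pattern (so finite differencing only probes the nonzero columns). A toy banded sketch, with a made-up problem whose known solution is x_i = i/n:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.sparse import lil_matrix

n = 200  # scales to much larger problems thanks to the sparse Jacobian handling

def residuals(x):
    # chained residuals: x_{i+1} - x_i = 1/n, with x_0 pinned to 0
    r = np.empty(n)
    r[:-1] = x[1:] - x[:-1] - 1.0 / n
    r[-1] = x[0]
    return r

# Jacobian sparsity pattern: banded plus one corner entry
pattern = lil_matrix((n, n), dtype=int)
for i in range(n - 1):
    pattern[i, i] = pattern[i, i + 1] = 1
pattern[n - 1, 0] = 1

sol = least_squares(residuals, np.zeros(n), jac_sparsity=pattern,
                    method="trf", bounds=(-1.0, 2.0))
assert sol.cost < 1e-10
assert np.allclose(sol.x, np.arange(n) / n, atol=1e-3)
```

This won’t match Knitro on hard problems, but for bound-constrained sparse NLS it may be the lowest-friction option in the standard Python stack.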


#17

Also, PyTorch is the bomb. I wouldn’t call it an optimizer; it’s very similar to TensorFlow but significantly easier to use (because of dynamically generated graphs; read the docs). It’s relatively new, so it’s improving rapidly. It’s open source, originally developed by Facebook. It’s great for prototyping and research, and one can then port the algorithms to Caffe for production if that’s something you need.
You can code almost anything in PyTorch, and it requires a bit of manual labor in setting up the optimizer, like actually writing the loop over optimization steps, computing the gradient and step, etc. It turns out to be quite easy, and there’s a lively community at https://discuss.pytorch.org/.

All that being said, finding the right optimizer is still always a struggle for me.


#18

I forgot to add that there’s a pretty good tutorial on using Mystic by its creator, from the SciPy conference in 2017: https://www.youtube.com/watch?v=geFER2oVvvU


#19

I have often used CasADi with Ipopt as the optimizer for several constrained nonlinear problems. With some wrapper code, I can easily define my objective and constraint functions, and I get solutions quite fast thanks to CasADi’s automatic differentiation along with its efficient handling of the huge expression graphs of objectives and constraints.

I can easily test a variety of solvers, and for the mixed-integer NLP (MINLP) problems I encountered recently, access to the Bonmin solver within CasADi was also quite a pleasant surprise!

So I would highly recommend it, although it is meant more for engineering applications.

Sam