Poor Performance Code in the Julia Introduction About Performance


#1

I noticed an error on this page:

which talks about performance, but then uses

type AR1_explicit
    a::Real
    b::Real
    sigma::Real
    phi::Distribution
end

as its type definition. Since this is abstractly typed, there is inferability issues in the code that uses this, and thus this is not recommended. Instead, this should be parameterically typed

type AR1_explicit{T<:Real,T2<:Distribution}
    a::T
    b::T
    sigma::T
    phi::T2
end

or there should be a note that the non-parameterized form will get poor performance.


#2

@ChrisRackauckas Great feedback.

Thanks for reporting this. We should fix this. We introduce type parameters immediately after that, so we should either

  1. Reorganize the lecture a little bit or
  2. Be less general about the set of AR(1)s that get considered and restrict a, b, and sigmato Float64 or another concrete type or
  3. Explain that this isn’t the most performant version of this type and let them know that we will explain why in the next section.

I am a little torn on 3, though I think I prefer it. I like the idea of having an example to discourage people from these types of performance issues (type inference problems), but don’t want people who skim over this and develop bad habits (though I guess this is their own fault…).

@spencer.lyon and @john.stachurski, do you have preferences/thoughts on this?


#3

@ChrisRackauckas Thanks a lot for the comment. It’s really much appreciated. I’m a long way from an authority on Julia but many of the useful bits of knowledge I’ve collected in my head so far are thanks to reading your excellent blog.

So, just to be sure I understand, in the example in question, when you rewrite AR1_explicit as parametrically typed, instances are concretely typed at the time of creation, and that concrete type information is available at runtime to methods that take this instance as an argument. Hence there is no problem with type inference.

@cc7768 Assuming my understanding is correct, I think the right thing to do here is to use concrete types as in your option 2. Then, in the section below where parametric types are introduced, revisit this example and explain how we can loosen up the type restrictions without affecting performance. It’s actually a nice example of how parametric types are useful.


#4

That is essentially correct.


#5

I think the right thing to do here is to use concrete types as in your option 2. Then, in the section below where parametric types are introduced, revisit this example and explain how we can loosen up the type restrictions without affecting performance. It’s actually a nice example of how parametric types are useful.

Great idea!! This has my vote


#6

That is a very good idea. I’ll probably steal this idea :wink:.

FYI, I’ve been going through your stuff because I have been planning a workshop and am updating my notes to be more “beginner friendly”. I think your tutorials do a good job at it, but I am a stickler for making sure people are lead to “proper performant and generic Julia code”… so I’ve always had type parameters very early on. This suggestion is definitely a good compromise: starting with simpler and performant but not yet non-generic code, and generalizing it later.


#7

Please feel free to steal.

The new discussion is here. Thanks everyone for their input.

I actually rewrote that lecture to try to make it clearer, and added a new lecture purely on performance. Hopefully it’s useful for beginners who are starting to think about performance.

That leaves the issue of code in the remainder of the lectures, which should be adjusted in places to reflect our discussion. I’ll put some notes in the issue tracker.

@ChrisRackauckas Thanks again for kicking off this discussion.


#8

This might be complexity creep, but I think you should add something on dynamic dispatch, explaining what it is, why it’s costly (and how much it costs), and why it shows up in these cases. I think it makes the “compiler information” idea more concrete.


#9

I don’t know enough about dynamic dispatch to do that right now (I thought all dispatch in Julia was dynamic…?) and I’m trying to finish off some research and not get distracted, so I doubt I’ll edit again in the short term. But thanks for the suggestion, I appreciate it.


#10

The reason why type inference makes things fast is because if everything is inferred, then inside the function the local variables become statically typed and the dispatches all become static dispatch. When something is not inferred, that’s when you have dynamic dispatch, i.e. at runtime it has to find out what type it is, check the methods table, and use that to find out what method to call (~100ns). But it’s easy to see that if all of the types are known at compile-time, the compiler can check the methods table and simply insert the correct dispatches, making them all static. That, plus variable boxing, is why lack of type inference and type stability makes code slow.


#11

Just saw this, but wanted to let y’all know that is addressed in the new version which is under development.


#12

Great, thanks, can’t wait to read it.