Comments (54)
Isn't this handled by closures already?
from forwarddiff.jl.
True but in iterative algorithms where the second argument changes, the closure has to be redefined in each step/iteration, and this is not always the best design for extraneous reasons.
from forwarddiff.jl.
The closure "updates" together with the captured variables. There is no need to redefine it.
from forwarddiff.jl.
That's good to know. At the same time it is good to allow for extra dummy input arguments (dummy from the autodiff perspective) as it would enable more diverse development. For example I have separated parameters (methods) from states (data), so my software engineering approach is conceptually orthogonal to closures.
from forwarddiff.jl.
I still don't understand why closures doesn't solve this.
If you have a function f(x, data, parameters, dummy, whatever)
and want to take the derivative of f w.r.t x you just create f(x) = f(x, data, parameters, dummy, whatever)
where the arguments now are actual instantiated variables and pass f to ForwardDiff. This is how all the optimization and solver packages does it. Your method is similar to what you do in c where you pass a void pointer as the second argument which you can cast to whatever you want but closures solves this imo.
from forwarddiff.jl.
The issue of allowing additional arguments to function was discussed in #32. It was also concluded that closures do indeed are a solution. A potential worry is that creating many closure --- for instance in MCMC
case you might to create a closure for each iteration --- may end up with a big performance penalty. I have done some benchmarking and it seems that creating closures (many closures) is not a big issue as the penalty is an order of magnitude smaller than the cost of obtaining the function being differentiated.
from forwarddiff.jl.
Closing as duplicate of #32.
@scidom Feel free to reopen if you think this needs to be revisited, but I really think closures are the best way to do this. If it helps to approach it a different way, remember that ForwardDiff now supports differentiating callable types.
from forwarddiff.jl.
Thanks @jrevels. I wanted to kindly ask if it would be possible to add this as a feature request. I understand that closures is a possible solution, but I want as a programmer and ForwardDiff user to have the alternative option of inputing a function with more than one arguments for two reasons, to avoid any small performance penalty (however small it may be, no penalty is better than small penalty), and most importantly because it may be helpful to provide an alternative usage case/programming paradigm.
If of course the answer is that it is not possible, then it is ok, but wanted to make the effort and ask for this feature.
from forwarddiff.jl.
I tried the following:
a = rand(2000)
x = rand(2000)
n = 1000000
function f(x)
x+a
end
function g(x, y)
x+y
end
@time for i in 1:n
f(x)
end
@time for i in 1:n
g(x, a)
end
It seems that actually f is a tiny bit faster than g, i.e. closures are a tiny bit faster, is this possible?
from forwarddiff.jl.
Now you are using a global variable. That is not the same as creating an closure inside another function that captures function local variables.
from forwarddiff.jl.
Consider what it would be like if we DID have this feature - I think the package would actually be worse off. Here are some reasons why:
- It'd make the API more complicated, and the API is already somewhat complicated.
- We'd have to write (and maintain) tests that cover all the new behaviors enabled by this feature.
- A general, maintainable implementation of this feature could very well be slower than just letting the user pass in closures and callable types, and could make our performance model even harder to understand.
from forwarddiff.jl.
Try this
n = 1000000
a = rand(n)
x = rand(n)
function g(x, y)
x+y
end
@time for i in 1:n
g(x[i], a[i])
end
@time for i in 1:n
cl(x) = g(x, a[i])
cl(x[i])
end
from forwarddiff.jl.
You will not get any useful data by benchmarking functions accessing global variables.
from forwarddiff.jl.
Thanks for the feedback @KristofferC and for the snippet @gragusa.
@jrevels I can see your point about complexity of development. My primary criterion during development tends to be performance. Along these lines, to quantify performance, shall we try to think of a simple benchmark, without using globals as @KristofferC pointed, just to see what is the penalty of closures if any? It is hard for all of us to know the pros and cons without measuring them... I will try to think of a reasonable example, not necessarily related to ForwardDiff.
from forwarddiff.jl.
You can always enclose the above into a function. But why accessing global variables should be different for closures?
from forwarddiff.jl.
@gragusa you mean sth like the example below?
n = 1000000
a = rand(n)
x = rand(n)
function g(x, y)
x+y
end
function z()
for i in 1:n
g(x[i], a[i])
end
end
function w()
for i in 1:n
cl(x) = g(x, a[i])
cl(x[i])
end
end
@time z()
@time w()
This still uses globals, plus we redefine the closure in each iteration, so we may want to improve on these.
from forwarddiff.jl.
@scidom I was thinking more along the lines of
function f()
n = 1000000
a = rand(n)
x = rand(n)
function g(x, y)
x+y
end
@time for i in 1:n
cl(x) = g(x, a[i])
cl(x[i])
end
end
This is the worst case scenario --- when closures need to be created at each iteration. Think about MCMC within Gibbs: at each iteration you have to create a closure enclosing the blocks of variables.
from forwarddiff.jl.
Global variables can change type at any time and must therefore be boxed. Local variables can be reasoned about. This does not change when using closures.
Regarding performance. We are already passing in a function as argument to ForwardDiff which has its own overhead so the closure overhead should be amortized. For best performance, create a functor (type which overloads Base.call) and pass that to ForwardDiff.
from forwarddiff.jl.
Yes, this Gibbs scenario is exactly the situation I am dealing with 👍
from forwarddiff.jl.
@KristofferC Yes, I understand the boxing. The fact that "this does not change when using closures" is what I thought --- and so the benchmarking in this case is not misleading because both functions deal with global in the same way.
from forwarddiff.jl.
@gragusa I tried
function f()
n = 1000000
a = rand(n)
x = rand(n)
function g(x, y)
x+y
end
@time for i in 1:n
cl(x) = g(x, a[i])
cl(x[i])
end
@time for i in 1:n
g(x[i], a[i])
end
end
f()
and there is massive gap in performance. Performance with Gibbs sampling is even worse than our toy benchmark because each Gibbs iteration requires redefining several closures - pretty much as many as the number of sampled parameters.
@KristofferC can you provide a toy example of what you meant about overloading Base.call()
? It doesn't have to be a benchmark, just an example of what you have in mind about functors.
from forwarddiff.jl.
See https://github.com/JuliaLang/julia/blob/master/base/functors.jl
This types could also hold states. When passing these as arguments julia can do inlining and all that jazz.
from forwarddiff.jl.
Thanks, I also found the following in the documentation:
http://docs.julialang.org/en/release-0.4/manual/methods/#call-overloading-and-function-like-objects
from forwarddiff.jl.
Also what is interesting is not the cost of the closure vs calling a normal function but the cost of passing a closure vs a function + arguments to ForwardDiff. My gut says there will be very little difference. I'm in bed now so can't test it.
from forwarddiff.jl.
Here's a concrete example of making a callable type and using it with ForwardDiff:
julia> immutable Foo{T}
x::T
y::T
end
julia> call(f::Foo, a) = a^(f.x) + (a / f.y)
call (generic function with 1033 methods)
julia> f = Foo(3, 2)
Foo{Int64}(3,2)
julia> f(2.5)
16.875
julia> using ForwardDiff
julia> ForwardDiff.derivative(f, 2.5)
19.25
As @KristofferC said, you can use these types instead of closures. Any time where you would make a new closure to update the values, you can instead make a new instance of one of these types. If your type is mutable, you could just update the fields as you go instead of making new instances (though keeping it immutable could work better if you're storing isbits
field types, like numeric primitives).
AFAIK, the plan is for closures to start using this callable type strategy "under the hood" in v0.5 (see JuliaLang/julia#13412).
from forwarddiff.jl.
That's amazing, I must say. I am feeling a bit puzzled though in two respects; i) in your example f
is of type Foo
and not of type Function
, right? How is it possible then that Forward.derivative(f, 2.5)
perceives f
as a generic function instead of Foo
?! ii) Jeff mentioned in the issue you forwarded that the call
function will be deprecated; in this case, would your nice example remain applicable? Bed time for me too, but will see your reply tomorrow.
from forwarddiff.jl.
Actually, I found the answer to the second question, I think call
will be replaced by (f::Foo)(a) = a^(f.x) + (a / f.y)
in our example...
from forwarddiff.jl.
I am trying to understand how this would work in practice, if I would need to introduce a new log-target type and make it callable, too tired to think clearly now, tomorrow again...
from forwarddiff.jl.
I wanted to add 2 more thoughts. Firstly, the functors are a great idea but (as any tool) they are not the proper means for every situation. More specifically, consider the case of having a functor f
with field f.y
for holding the field that would otherwise be enclosed in a closure, and let's say we call f(x)
, where f
in its body uses f.y
. This is all fine. But once we consider a second functor g
with the same field g.y
and call g(x)
this becomes problematic if we would want f.y
and g.y
to point to a common vector (because f
and g
have their own copies of y
).
The second remark relates to the comment made by @KristofferC; I wrote a simple example for which in one case we pass a closure f
whereas in another case we pass a function g
plus an argument a
. If the closure doesn't need to be redefined, then there seems to be no difference in performance:
function mytest()
len = 2000
a = rand(len)
x = rand(len)
n = 1000000
function f(x)
x+a
end
function g(x, y)
x+y
end
function fwrapper(f, x)
f(x)
end
function gwrapper(g, x, y)
g(x, y)
end
@time for i in 1:n
fwrapper(f, x)
end
@time for i in 1:n
gwrapper(g, x, a)
end
end
mytest()
The only problem of course is development-related, i.e. having one version of user-defined function g(x, a)
and then another internal version of it calling the relevant closure h(x)
. If this needs to be done for many functions, I need to maintain double copies for each function to communicate with the user, one external g(x, a)
that doesn't enclose a
(because as far as the API goes I do need to have a version of g
that is not a closure so that l can create the closure later on, once a
has been defined) and an internal that provides the respective closure. This is my glue code solution to proceed, with some extra memory penalty of course... but that's the only way I can reconcile the restriction posed by only allowing closures with ForwardDiff in the context of a general Gibbs scheme.
from forwarddiff.jl.
There is no reason why you couldn't make two separate instances of a functor hold references to the same vector.
Regarding recreating the closure, why not do something like this:
function f()
n = 1000000
a = rand(n)
x = rand(n)
function g(x, y)
x+y
end
@time for i in 1:n
cl(x) = g(x, a[i])
cl(x[i])
end
a_val = 0.0
cl2(x) = g(x, a_val)
@time for i in 1:n
a_val = a[i]
cl2(x[i])
end
@time for i in 1:n
g(x[i], a[i])
end
end
Since the captured variable updates there is no need to recreate the closure unless you will change the type of captured variable in every iteration (seems unlikely).
from forwarddiff.jl.
I agree @KristofferC, there is no need to redefine the closure. The only difficulty is that I want to be able to define method fields in a parameter type independently of the enclosed value, which means that I need to delay the (one and only) definition of the closure until I know the delayed value.
About functors, you are actually right. For example the following works (therefore I will consider more seriously the possibility of using functors):
immutable Foo1{T}
y::Vector{T}
end
immutable Foo2{T}
y::Vector{T}
end
y = [2., 3.]
f1 = Foo1(y)
f2 = Foo2(y)
call(f::Foo1, a) = a*f.y
call(f::Foo2, a) = a+f.y
f1(2.)
f2(2.)
y[2] = 30.
f1(2.)
f2(2.)
from forwarddiff.jl.
P.S. I find the functor solution quite ingenious the more I think about it.
from forwarddiff.jl.
Note that in the (not so far away) future, the plan for julia is for closures to basically lower into their functor equivalent version making them just as performant.
from forwarddiff.jl.
This is great. I find functors more handy than closures in Lora because they are more low level exactly, which makes dev easier.
from forwarddiff.jl.
After more thinking and an attempt to implement the idea, it turns out functors won't work unfortunately for the application at hand. The problem is that the definition call(f::Foo, x)
is Foo
-specific and not f
-specific, meaning that the overloading of call
is universal for Foo
, whereas I need different calls for different instances f
of Foo
. It seems after all that closures are the only way out probably.
from forwarddiff.jl.
After having spent a good deal of time thinking over this problem and after having tried to tackle it in practice, I would like to share my thoughts and to make a case for adding the extra feature of a wrt
optional argument.
I am convinced by now that there is no performance penalty when using closures, so performance is not a concern anymore. The reason I think the wrt
feature is important is programming-related only. Alternative solutions have been proposed but none of them truly resolves the development scenario I am dealing with.
Functors are indeed a great solution but the main issue, as explained above, is that the definition of call()
is made on the basis of the underlying functor, whereas in my case conceptually the call()
function varies between instances of the functor.
Closures are great, but in a larger scale project of a "real-life" application they can become impractical. For instance, if we were to ask the user to pass just one closure to gradient()
this would make sense. However, in many cases this is impractical because i) many closures may be involved and ii) most importantly the additional argument enclosed by the closure is not naturally known to the user, or at least not early in model specification, so the API for a model would become obfuscated and not transparent by asking the user to create a dummy variable with a specific name (so that it can later by modified before calling the closure).
The bottom line is that there is no performance consideration, but there are important use cases where it becomes less natural and more programmatically impractical to request the user to define a closure.
Along these lines, I want to kindly re-open issue #32 and consider adding the wrt
option. This will not make ForwardDiff more complicated for the user, because wrt
can have a default value; it will heve some impact only on maintenance of ForwardDiff but I think it is worth the effort in order to provide this extra feature that can prove to be invaluable in practice when using ForwardDiff inside other prackages.
from forwarddiff.jl.
the
call()
function varies between instances of the functor
@scidom, could you explain what you mean by this?
from forwarddiff.jl.
I will try to explain using a fictitious toy example:
immutable LogPrior
y::Vector
end
call(lp::LogPrior, x::Vector{Float64}) = x+lp.y
y = [1., 2.5]
logprior = LogPrior(y)
logprior([2., 5.])
This works fine, but the problem is that I may want to define more than one priors. Since multiple dispatch for call(::LogPrior, ::Vector{Float64})
for the functor LogPrior
can be exploited only once, I can not define, say
call(logprior::LogPrior, x::Vector{Float64}) = x-0.5y
for another prior...
from forwarddiff.jl.
I could code-generate many functors for different priors with @gensym LogPrior
names but not sure if this is a reasonable coding approach.
from forwarddiff.jl.
What would the wrt
form look like?
from forwarddiff.jl.
I'm not sure how the issue is any different in that case
from forwarddiff.jl.
I was thinking that if we have a function say f(x, y)
and we can call
ForwardDiff.gradient(f, wrt=:x)
or perhaps
ForwardDiff.gradient(f, wrt=1)
then this may allow passing a function f
with more than one arguments. I am not sure if this is the best solution, just trying to think of a way of maintaining a reasonable API and interface between ForwardDiff and Lora.
from forwarddiff.jl.
To explain my problem, if I ask the user to provide a closure f(x)
with a mangled name _y
inside the body of the closure for the enclosed value, and also define a dummy _y = sth
wouldn't that be ugly on the Lora end?
from forwarddiff.jl.
The other solution is that I ask the user to provide f(x, y)
and I codegen the closure f(x)
internally which is possible.
from forwarddiff.jl.
The functor object can just store the user-provided function and the arguments.
from forwarddiff.jl.
All the solver packages, and optimization packages seems to do just fine with only closures so I am sure there should be some way to solve this.
Regarding extra arguments in ForwardDiff. My first thought is that allowing 1 extra argument is completely arbitrary. What if you want to pass multiple extra arguments? The only way you can do that in a type stable way is to write a tailor made type for your specific function. This means that you suddenly impose on the user to always write their functions as f(x, extra_parameters), instead of letting them write them however they want and just ask them to pass the closure to the library. They also have to deal with the extra boilerplate of defining a new type for each function.
The functor object can just store the user-provided function and the arguments.
Isn't that pretty much exactly what a closure is?
from forwarddiff.jl.
@KristofferC, I agree, one would need to generalize the way wrt
would operate, the above was just a first naive example.
@mlubin your solution works for me, it is really great, thank you:
immutable LogPrior
f::Function
y::Vector
end
call(lp::LogPrior, x::Vector{Float64}) = lp.f(x, lp.y)
y = [1., 2.5]
logprior = LogPrior((x, y) -> x+y, y)
logprior([2., 5.])
I will then close #32 and will use the proposed solution (and yes, @KristofferC, I presume this functor definition is pretty much conceptually equivalent to a closure).
from forwarddiff.jl.
One more thing, the reason this lower-level functor approach works better for me than just a closure is because I can then define the constructor (using pseudocode below)
LogPrior(f::Function, n::Int) = LogPrior(f, Array(MyLoraType, n))
so the user doesn't need to define the dummy variable y
anymore, nor I need to worry about reserving a name for it.
from forwarddiff.jl.
I may be doing sth wrong (highly likely due to being late), but I get an error when I try to use ForwardDiff with the above solution:
import ForwardDiff
immutable LogPrior
f::Function
y::Vector
end
call(lp::LogPrior, x::Vector) = lp.f(x, lp.y)
function f(x, y)
x+y
end
y = [1., 2.5]
logprior = LogPrior(f, y)
logprior([2., 5.])
g = ForwardDiff.gradient(logprior)
g([2., 5.])
Although g
is defined via ForwardDiff without issues, calling it throws
ERROR: MethodError: `convert` has no method matching convert(::Type{ForwardDiff.GradientNumber{2,Float64,Tuple{Float64,Float64}}}, ::Array{ForwardDiff.GradientNumber{2,Real,Tuple{Real,Real}},1})
This may have arisen from a call to the constructor ForwardDiff.GradientNumber{2,Float64,Tuple{Float64,Float64}}(...),
since type constructors fall back to convert methods.
Closest candidates are:
ForwardDiff.GradientNumber{N,T,C}(::Any, ::Any)
call{T}(::Type{T}, ::Any)
convert{T<:Real}(::Type{T<:Real}, ::Complex{T<:Real})
...
in _calc_gradient at /Users/theodore/.julia/v0.4/ForwardDiff/src/api/gradient.jl:86
in g at /Users/theodore/.julia/v0.4/ForwardDiff/src/api/gradient.jl:54
from forwarddiff.jl.
So with current version how can I use additional arguments to function being differentiated?
from forwarddiff.jl.
Same as before. Pass a closure as your function to be differentiated where it closes over your parameters.
from forwarddiff.jl.
With Call overloading?
Okay I tried like this.. it's working.
I will apply this to my code. Thanks!
immutable LogLike
f::Function
y::Vector
end
call(ll::LogLike, params1::Vector) = ll.f(params1, ll.y)
function f(params1, y)
dot(params1, y).^2+1
end
y = [1., 3, 10.]
LL = LogLike(f, y)
param = [2., 5., 1.]
LL(param)
res = GradientResult(param)
g = ForwardDiff.gradient!(res,LL,param)
ForwardDiff.gradient(res)
from forwarddiff.jl.
Call overloading is not needed anymore for performance. It is easier and more flexible to create a closure. Read through this thread and you will find examples. I am on mobile so can't post a full code example.
from forwarddiff.jl.
Oh.. you mean like this?
import ForwardDiff
using ForwardDiff
param = [2., 5., 1.]
y = [1., 3, 10.]
function test(param, y)
function g(params1)
dot(params1, y).^2+1
end
LL = g(param)
res = GradientResult(param)
ForwardDiff.gradient!(res,g,param)
ForwardDiff.gradient(res)
end
test(param,y)
from forwarddiff.jl.
Related Issues (20)
- AD in-place instead of broadcast HOT 1
- Is the mutating code the problem here? How could I debug this? HOT 2
- Rationals and Modulo
- `NaNMath` (and `SpecialFunctions`) as extensions? HOT 5
- Broken external link
- `construct_seeds` for types where `typeof(one(T)) !=T` is broken HOT 1
- incorrect 2nd derivative of complex exponential HOT 2
- Can you take derivative of complicated function whose symbolic form is not explicit or not known?
- Cancellation with sparse arrays HOT 5
- Implement hessian! for scalar x
- Implement gammalogccdf for ForwardDiff HOT 1
- `ForwardDiff.jacobian` throws error for `fft` HOT 1
- Correctly forming nested dual numbers. HOT 8
- Derivative of a function of derivatives HOT 7
- Symbolics.jl compatibility HOT 1
- Support derivative(f, ::Complex)
- `ForwardDiff` fails to compute correct derivative HOT 3
- Incorrect Hessian by `exp` function HOT 1
- Method ambiguities reported by Aqua HOT 3
- Document internals? HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from forwarddiff.jl.