
juliadecisionfocusedlearning / ImplicitDifferentiation.jl


Automatic differentiation of implicit functions

Home Page: https://juliadecisionfocusedlearning.github.io/ImplicitDifferentiation.jl/

License: MIT License

Julia 100.00%
julia automatic-differentiation machine-learning autodiff implicit optimization

implicitdifferentiation.jl's Introduction

ImplicitDifferentiation.jl


ImplicitDifferentiation.jl is a package for automatic differentiation of functions defined implicitly, i.e., forward mappings

$$x \in \mathbb{R}^n \longmapsto y(x) \in \mathbb{R}^m$$

whose output is defined by conditions

$$c(x,y(x)) = 0 \in \mathbb{R}^m$$

Background

Implicit differentiation is useful to differentiate through two types of functions:

  • Those for which automatic differentiation fails. Reasons vary depending on your backend, but the most common are calls to external solvers, mutating operations, or type restrictions.
  • Those for which automatic differentiation is very slow. Common examples are iterative procedures such as fixed-point equations or optimization algorithms.

If you just need a quick overview, check out our JuliaCon 2022 talk. If you want a deeper dive into the theory, you can refer to the paper Efficient and modular implicit differentiation by Blondel et al. (2022).

Getting started

To install the stable version, open a Julia REPL and run:

using Pkg; Pkg.add("ImplicitDifferentiation")

For the latest version, run this instead:

using Pkg; Pkg.add(url="https://github.com/JuliaDecisionFocusedLearning/ImplicitDifferentiation.jl")

Please read the documentation, especially the examples and FAQ.
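
As a quick taste of the API, here is a minimal sketch using the toy square-root example from the docs, assuming Zygote.jl for reverse mode; call and return conventions have shifted between versions, so see the documentation and the issues below for details.

using ImplicitDifferentiation, Zygote

# Forward mapping: any black-box solver computing y(x); here just a stand-in.
forward(x) = sqrt.(x)

# Conditions c(x, y) = 0 characterizing the output.
conditions(x, y) = y .^ 2 .- x

# The wrapper gets differentiation rules derived from the implicit function theorem.
implicit = ImplicitFunction(forward, conditions)

x = [4.0, 9.0]
implicit(x)                      # ≈ [2.0, 3.0]
Zygote.jacobian(implicit, x)[1]  # ≈ diagonal matrix with entries 0.5 ./ sqrt.(x)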

Related projects

In Julia:

In Python:

  • google/jaxopt: hardware accelerated, batchable and differentiable optimizers in JAX

implicitdifferentiation.jl's People

Contributors

baggepinnen, gdalle, github-actions[bot], mohamed82008, pitmonticone, sfalmo, thorek1, yingboma


implicitdifferentiation.jl's Issues

More examples

We should add more illustrations of what implicit differentiation can do.

A few ideas:

  • Constrained optimization (projected / proximal gradient, Frank-Wolfe)
  • Optimal transport
  • ODEs

Test static arrays and use direct linear solver

I think we should add tests for static array inputs and outputs. Ideally we should remain static throughout the whole computation. A nice heuristic could also be to use a direct linear solver instead of GMRES whenever static arrays are used; a rough sketch follows.
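
A rough sketch of such a heuristic, assuming a custom linear solver can be supplied as a callable (A, b) -> (u, stats) as in the direct-solver issue further below; the function name and the dispatch on the right-hand side type are illustrative only.

using LinearAlgebra, StaticArrays
using Krylov: gmres

# Hypothetical heuristic: static right-hand sides get a dense direct solve
# (converted to dense arrays here; a fully static path would keep SArray types),
# everything else goes through GMRES.
function heuristic_linear_solver(A, b)
    if b isa StaticArray
        return Matrix(A) \ Vector(b), (solved=true,)
    else
        return gmres(A, b)  # Krylov.gmres already returns (solution, stats)
    end
end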

Issue with second derivative

Hi,
I have been trying to use ImplicitDifferentiation.jl for higher-order JVPs, but without luck.

I want to use it for BifurcationKit.jl, where I need third-order JVPs.

Basic example:

using ImplicitDifferentiation
using Optim
using Random
using Zygote


Random.seed!(63)

function dumb_identity(x)
    f(y) = sum(abs2, y-x)
    y0 = zero(x)
    res = optimize(f, y0, LBFGS(); autodiff=:forward)
    y = Optim.minimizer(res)
    return y
end;

zero_gradient(x, y) = 2(y - x);
implicit = ImplicitFunction(dumb_identity, zero_gradient);
x = rand(3, 2)
h = rand(3, 2)
D(x,h) = Zygote.jacobian(t->implicit(x .+ t .* h), 0)[1]
D(x,h) # works
D2(x,h1,h2) = Zygote.jacobian(t->D(x .+ t .* h2,h1), 0)[1]
D2(x,h,h) # does not work

I also have this code with ForwardDiff; I am not sure whether the problem is the same:

using ForwardDiffChainRules, ForwardDiff
@ForwardDiff_frule (f::typeof(implicit))(x::AbstractMatrix{<:ForwardDiff.Dual})
D(x,h) = ForwardDiff.derivative(t->implicit(x .+ t .* h), 0)
D(x,h) # works
D2(x,h1,h2) = ForwardDiff.derivative(t->D(x .+ t .* h2,h1), 0)
D2(x,h,h) # does not work

please register

While I understand it is WIP, this package is already useful, nicely written and documented, so please consider registering it.

Type inference in linear solve

At the moment, since IterativeSolvers can return a convergence history depending on a keyword argument, our pushforward and pullback are type-unstable.

If this isn't fixed by #5, we can consider annotations like

linear_solver(A, b)::AbstractVector = gmres(A, b)

or

u::AbstractVector = linear_solver(A, b)

Complex-valued arguments

Currently, the implicit function is defined for real arguments only

function (implicit::ImplicitFunction)(
    x_and_dx::AbstractArray{Dual{T,R,N}}; kwargs...
) where {T,R,N}

It would be useful to have it work for complex-valued arguments as well. Naively defining the same function with a modified type signature did not seem to work.

Document custom rrule trick for ComponentArray constructor

Regarding #23, I was trying to re-implement this in Julia.

I found that ComponentArrays may not work properly with Zygote when the optimization parameters are given from another source (here, a neural network), as in the following example (with v0.5.0-DEV).

I think this is related to jonniedie/ComponentArrays.jl#176.
Therefore, it may be better to clarify the limitation of ComponentArrays for multiple-argument cases (or just avoid it for now).

Case 1: with ComponentArrays (not working)

Code

using ImplicitDifferentiation
using Flux
using Convex, ECOS
using ComponentArrays
using Random


Random.seed!(2024)
i_max = 10
n = 3
m = 2
model = Chain(
              Dense(3, 64, Flux.leakyrelu),
              Dense(64, i_max*(m+1)),
             )
u_min = -2*ones(m)
u_max = +2*ones(m)


function minimise_logsumexp(θ; T, u_min, u_max, initial_guess, solver=() -> ECOS.Optimizer())
    # A = θ[:, 1:end-1]
    # B = θ[:, end]
    (; A, B) = θ
    m = size(A)[2]
    u = Convex.Variable(m)
    if initial_guess != nothing
        u.value = initial_guess
    end
    obj = T * Convex.logsumexp((1/T)*(A*u + B))
    prob = Convex.minimize(obj)
    prob.constraints += [u >= u_min]
    prob.constraints += [u <= u_max]
    solve!(prob, solver(), silent_solver=true, verbose=false)
    minimiser = typeof(u.value) <: Number ? [u.value] : u.value[:]  # to make it a vector
    return minimiser
end


function forward_cstr_optim(θ; kwargs...)
    u = minimise_logsumexp(θ; kwargs...)
    return u
end


function proj_hypercube(p; u_min, u_max)
    return max.(u_min, min.(u_max, p))
end

function conditions_cstr_optim(θ, u; kwargs...)
    (; A, B) = θ
    # A = θ[:, 1:end-1]
    # B = θ[:, end]
    ∇₂f = A' * Flux.softmax(A*u+B)
    η = 0.1
    return u .- proj_hypercube(u .- η .* ∇₂f; kwargs...)
end


function implicit_cstr_optim_func(θ; T, u_min, u_max, initial_guess, solver)
    tmp = ImplicitFunction(
                           θ -> forward_cstr_optim(θ; T, u_min, u_max, initial_guess, solver),
                           (θ, u) -> conditions_cstr_optim(θ, u; u_min, u_max),
                          )
    tmp(θ)
end

x = 100*(2*rand(n) .- 1)
val, grads = Flux.withgradient(model) do _model
    AB = reshape(_model(x), i_max, m+1)
    A = AB[:, 1:m]
    B = AB[:, m+1]
    θ = ComponentArray(A=A, B=B)
    # θ = reshape(_model(x), i_max, m+1)
    u_star = implicit_cstr_optim_func(θ; T=1, u_min=-ones(m), u_max=+ones(m), initial_guess=nothing, solver=() -> ECOS.Optimizer())
    sum(u_star)
end

Result

ERROR: Mutating arrays is not supported -- called push!(Vector{Any}, ...)
This error occurs when you ask Zygote to differentiate operations that change
the elements of arrays in place (e.g. setting values with x .= ...)

Possible fixes:
- avoid mutating operations (preferred)
- or read the documentation and solutions for this error
  https://fluxml.ai/Zygote.jl/latest/limitations

Case 2: without ComponentArrays (working)

Code

using ImplicitDifferentiation
using Flux
using Convex, ECOS
using ComponentArrays
using Random


Random.seed!(2024)
i_max = 10
n = 3
m = 2
model = Chain(
              Dense(3, 64, Flux.leakyrelu),
              Dense(64, i_max*(m+1)),
             )
u_min = -2*ones(m)
u_max = +2*ones(m)


function minimise_logsumexp(θ; T, u_min, u_max, initial_guess, solver=() -> ECOS.Optimizer())
    # A = θ[:, 1:end-1]
    # B = θ[:, end]
    (; A, B) = θ
    m = size(A)[2]
    u = Convex.Variable(m)
    if initial_guess != nothing
        u.value = initial_guess
    end
    obj = T * Convex.logsumexp((1/T)*(A*u + B))
    prob = Convex.minimize(obj)
    prob.constraints += [u >= u_min]
    prob.constraints += [u <= u_max]
    solve!(prob, solver(), silent_solver=true, verbose=false)
    minimiser = typeof(u.value) <: Number ? [u.value] : u.value[:]  # to make it a vector
    return minimiser
end


function forward_cstr_optim(θ; kwargs...)
    u = minimise_logsumexp(θ; kwargs...)
    return u
end


function proj_hypercube(p; u_min, u_max)
    return max.(u_min, min.(u_max, p))
end

function conditions_cstr_optim(θ, u; kwargs...)
    # (; A, B) = θ
    A = θ[:, 1:end-1]
    B = θ[:, end]
    ∇₂f = A' * Flux.softmax(A*u+B)
    η = 0.1
    return u .- proj_hypercube(u .- η .* ∇₂f; kwargs...)
end


function implicit_cstr_optim_func(θ; T, u_min, u_max, initial_guess, solver)
    tmp = ImplicitFunction(
                           θ -> forward_cstr_optim(θ; T, u_min, u_max, initial_guess, solver),
                           (θ, u) -> conditions_cstr_optim(θ, u; u_min, u_max),
                          )
    tmp(θ)
end

x = 100*(2*rand(n) .- 1)
val, grads = Flux.withgradient(model) do _model
    AB = reshape(_model(x), i_max, m+1)
    # A = AB[:, 1:m]
    # B = AB[:, m+1]
    # θ = ComponentArray(A=A, B=B)
    θ = reshape(_model(x), i_max, m+1)
    u_star = implicit_cstr_optim_func(θ; T=1, u_min=-ones(m), u_max=+ones(m), initial_guess=nothing, solver=() -> ECOS.Optimizer())
    sum(u_star)
end

Result

(val = 1.2111089906277515, grad = ((layers = ((weight = Float32[-1.9894879 0.016161848 -1.9257156; -0.009999948 8.12358f-5 -0.009679403; … ; 0.31266153 -0.0025399441 0.30263928; -1.2044753 0.009784702 -1.1658663], bias = Float32[-0.026939616, -0.0001354091, 0.0067554815, -0.012562656, -0.0116912015, -0.0072106766, -0.019949805, -0.012255933, -0.008887393, -0.00019493952  …  8.364284f-6, 0.0029845645, 0.00020423913, -0.00016008505, -0.00011763882, 0.01743382, 0.015867122, 0.013238616, 0.0042337435, -0.016309775], σ = nothing), (weight = Float32[2.7479534f-31 -8.035564f-33 … 6.655762f-31 8.295855f-31; 4.978378f-17 -1.455777f-18 … 1.2058026f-16 1.502933f-16; … ; 5.054912f-10 -1.478157f-11 … 1.2243399f-9 1.5260381f-9; 3.615537f-16 -1.0572551f-17 … 8.757118f-16 1.0915021f-15], bias = Float32[2.8529254f-32, 5.168552f-18, 1.702376f-20, -0.011397834, 0.012743557, 3.4305703f-10, 2.0157113f-23, -0.010961587, 9.4864516f-12, 7.707587f-18  …  1.4720016f-31, 2.3545986f-17, 7.705814f-20, -0.035908565, 0.057037998, 1.6192033f-9, 9.8762665f-23, -0.021129435, 5.2480097f-11, 3.7536507f-17], σ = nothing)),),))
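
For reference, the "custom rrule trick" from the issue title could look roughly like the following sketch: wrap the keyword construction in a small positional helper (make_theta is hypothetical) and give it an explicit reverse rule, so Zygote never has to differentiate the constructor itself. The pullback assumes the cotangent comes back laid out like θ (A first, then B); a robust rule would also have to handle structured tangent types.

using ChainRulesCore, ComponentArrays

make_theta(A, B) = ComponentArray(A=A, B=B)

function ChainRulesCore.rrule(::typeof(make_theta), A, B)
    θ = make_theta(A, B)
    nA = length(A)
    function make_theta_pullback(Δθ)
        Δvec = collect(Δθ)                     # flat cotangent, same layout as θ
        ΔA = reshape(Δvec[1:nA], size(A))      # entries of A (column-major)
        ΔB = reshape(Δvec[nA+1:end], size(B))  # entries of B
        return NoTangent(), ΔA, ΔB
    end
    return θ, make_theta_pullback
end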

Compatibility with scalar functions

At the moment our implementation fails for scalar functions, even though they should be a special case of vectors. We need to investigate why.

Basic question on square root example

Hi there,

Nice package! I'm wondering if someone could suggest how to use this package to compute the derivative of a function that computes the square root of a number using a solver. I asked this question on Discourse here.

Thanks

Add precompilation workflow

Seems difficult since the base package doesn't have access to any autodiff backend. My best guess would be to put it in the extensions.

Direct linear solver not efficiently handled

https://github.com/gdalle/ImplicitDifferentiation.jl/blob/7f9fa8fd5b3851fd55421501b1f061f784c68ab0/ext/ImplicitDifferentiationChainRulesExt.jl#L63

Currently, to use a direct linear solver, I need to pass in (A, b) -> (Matrix(A) \ b, (solved=true,)). There are two problems with that. First, it is inconvenient to pass it in like this. Second, it is inefficient because it calls Matrix(A) and factorises it every time the pullback function is called. Calling Matrix(A) and factorising it can be expensive, and it only needs to be done once and then reused for every pullback call. This currently makes Jacobian computations with direct solvers much slower than necessary.
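
One possible workaround, purely as a sketch (not the package API): memoise the dense factorisation per operator, so that Matrix(A) and the factorisation happen once per Jacobian instead of once per pullback call. The global cache and its unbounded growth are obvious rough edges of this illustration.

using LinearAlgebra

const FACTORIZATION_CACHE = IdDict{Any,Any}()

# Same (A, b) -> (u, stats) calling convention as above, but the expensive
# Matrix(A) + lu(...) is only computed once per distinct operator A.
function cached_direct_solver(A, b)
    F = get!(FACTORIZATION_CACHE, A) do
        lu(Matrix(A))
    end
    return F \ b, (solved=true,)
end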

ForwardDiff compatibility

At the moment, autodiff of implicit functions relies on ChainRules, which means ForwardDiff won't work.

Suggested fix: manually overload the callable on ForwardDiff.Dual

QR decomposition

Hi! First, have I already said this package is really cool? If not, it is. Really exciting stuff.

I thought I would try to use the package to get rules for a QR decomposition. This is very complex with the usual approach (see JuliaDiff/ChainRules.jl#651 for lots of linked PRs and issues). But the invariant satisfied by QR is pretty clean, so I thought it could be a fun test run for this package. I don't have any pressing application at this point; this is just a test run.

So here's my first shot at it:

using LinearAlgebra, ImplicitDifferentiation, ChainRules

function qr_constraints(x,QR)
    Q, R = QR

    mynorm(x) = sum(xj -> xj^2, x)
    mynorm(Q' * Q - I) + mynorm(Q * R - x)
end

myqr = let
    function primal(x)
        Q,R = qr(x)
        Matrix(Q), R
    end
    ImplicitFunction(primal, qr_constraints)
end

But if I try this out, say with

using Zygote

function my_test_function(x)
    Q,R = myqr(x)
    sum(R)
end

I get

julia> x = randn(3,2)
3×2 Matrix{Float64}:
 0.380649  -0.540067
 0.756046  -1.30804
 1.07281   -0.514084

julia> Zygote.withgradient(my_test_function, x)
ERROR: MethodError: no method matching vec(::ChainRulesCore.Tangent{Any, Tuple{ChainRulesCore.ZeroTangent, FillArrays.Fill{Float64, 2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}}})
Closest candidates are:
  vec(::Transpose{<:Number, <:AbstractVector}) at ~/julia/julia-1.8.0/share/julia/stdlib/v1.8/LinearAlgebra/src/adjtrans.jl:219
  vec(::Adjoint{<:Real, <:AbstractVector}) at ~/julia/julia-1.8.0/share/julia/stdlib/v1.8/LinearAlgebra/src/adjtrans.jl:220
  vec(::SparseArrays.AbstractSparseVector) at ~/julia/julia-1.8.0/share/julia/stdlib/v1.8/SparseArrays/src/sparsevector.jl:965
  ...
Stacktrace:
 [1] (::ImplicitDifferentiation.var"#implicit_pullback#11"{Float64, Matrix{Float64}, LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.var"#mul_Bᵀ!#10"{ComposedFunction{typeof(last), Zygote.var"#ad_pullback#50"{Tuple{ImplicitDifferentiation.var"#conditions_x#7"{Tuple{Matrix{Float64}, Matrix{Float64}}, typeof(qr_constraints)}, Matrix{Float64}}, typeof((λ))}}, Tuple{Matrix{Float64}, Matrix{Float64}}}, Nothing, Nothing, Vector{Float64}}, LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.var"#mul_Aᵀ!#9"{ComposedFunction{typeof(last), Zygote.var"#ad_pullback#50"{Tuple{ImplicitDifferentiation.var"#conditions_y#8"{Matrix{Float64}, typeof(qr_constraints)}, Tuple{Matrix{Float64}, Matrix{Float64}}}, typeof((λ))}}, Tuple{Matrix{Float64}, Matrix{Float64}}}, Nothing, Nothing, Vector{Float64}}, typeof(Krylov.gmres)})(dy::ChainRulesCore.Tangent{Any, Tuple{ChainRulesCore.ZeroTangent, FillArrays.Fill{Float64, 2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}}})
   @ ImplicitDifferentiation ~/.julia/packages/ImplicitDifferentiation/90Hcu/src/implicit_function.jl:107
 [2] ZBack
   @ ~/.julia/packages/Zygote/PD12J/src/compiler/chainrules.jl:206 [inlined]
 [3] Pullback
   @ ./REPL[162]:2 [inlined]
 [4] (::typeof((my_test_function)))(Δ::Float64)
   @ Zygote ~/.julia/packages/Zygote/PD12J/src/compiler/interface2.jl:0
 [5] (::Zygote.var"#60#61"{typeof((my_test_function))})(Δ::Float64)
   @ Zygote ~/.julia/packages/Zygote/PD12J/src/compiler/interface.jl:45
 [6] withgradient(f::Function, args::Matrix{Float64})
   @ Zygote ~/.julia/packages/Zygote/PD12J/src/compiler/interface.jl:124
 [7] top-level scope
   @ REPL[164]:1

Any idea what's going on? Or maybe it's a deeper Zygote issue?

How to do implicit differentiation for neural network parameters?

Hi!

I'm trying to do implicit differentiation of the parameters for a given neural network.

For example, nn is a neural network constructed by Flux.jl. I can get the parameters by Flux.params(nn).

In this tutorial, I need to provide the parameters as the arguments of the forward solver function (here, it's lasso(data::ComponentArray)).

But I don't know how to do this for my case; namely, optimization_prob(parameters) = ...?

If I can overwrite the network parameters with p = Flux.params(nn), then I would be able to do it like this:

function optimization_prob(parameters)  # will be provided by `Flux.params(nn)` outside this function
    load_parameters!(nn, parameters)
    # etc...
end

Is there any workaround for this?
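
One possible workaround, as a sketch only (assuming Flux.destructure is acceptable for your model): flatten the parameters into a single vector and rebuild the network inside the forward (and conditions) functions, so that the flat vector becomes the differentiated input of the implicit function.

using Flux

nn = Chain(Dense(3, 8, Flux.relu), Dense(8, 1))  # example network
flat, rebuild = Flux.destructure(nn)             # flat parameter vector + re-builder

function optimization_prob(parameters)
    model = rebuild(parameters)  # reconstruct the network from the flat vector
    # hypothetical: use `model` to set up and solve the optimization problem,
    # returning its solution
end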

QR decomposition example

Here is an example of differentiating a QR decomposition using this package. It would be nice to have an example for each factorization.

using LinearAlgebra, ImplicitDifferentiation, ComponentArrays, Zygote

function qr_conditions(A, x)
  Q, R = x.Q, x.R
  return vcat(
    vec(UpperTriangular(Q' * Q) + LowerTriangular(R) - I - Diagonal(R)),
    vec(Q * R - A),
  )
end
function qr_forward(A)
  qr_res = qr(A)
  Q = copy(qr_res.Q[:, 1:size(A, 2)])
  R = qr_res.R
  return ComponentVector(; Q, R)
end

diff_qr = ImplicitFunction(qr_forward, qr_conditions)

A = rand(10, 4)
x = diff_qr(A)

J = Zygote.jacobian(diff_qr, A)[1]

Jacobian byproduct assumed to be constant in x - incompatible with higher order AD

Currently the byproduct z is assumed to be constant in x. In optimization, z may be used to return a Jacobian approximation for use in the conditions function. This is fine as long as we only need the gradient of the optimization solution y. If we need higher-order derivatives, then we would need to propagate the derivatives back through the byproduct z, which is currently not possible because we always define the cotangent wrt z to be NoTangent(). I am not sure if there is a way to handle this correctly, except perhaps to document it.

Generic input types

At the moment, the library is designed with vector inputs x and outputs y in mind. We should be more generic than that, which may require some form of flattening mechanism.

Reverse mode AD with forward mode backend for conditions (and vice versa)

I have a case where I need to autodiff through scalar functions and matrix operations. I understand forward mode is better for the former and reverse mode more efficient for the latter.

What might solve this "problem" is using reverse mode on the overall chain and forward mode as a backend on the conditions involving the scalar functions.

Looking at the current implementation, I would suggest making the backend an argument, similar to return_byproduct.

Add example use of byproduct

I just realized that using the byproduct in the conditions is fairly easy, but using the byproduct in the condition derivatives is harder, unless the user defines their own chain rule. What do you think?
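
For concreteness, here is a minimal sketch of the easy part (reading the byproduct inside the conditions), written against the v0.4-style formulation with Val(true) that appears in the "byproduct not returned" issue further below; the exact constructor arguments differ across versions, so treat this as illustrative only.

using ImplicitDifferentiation

# Toy forward solver: Newton iterations solving y^2 = x elementwise, returning
# the iteration count as a byproduct z (treated as a constant in x).
function forward(x)
    y = copy(x)
    iterations = 0
    for _ in 1:20
        y = 0.5 .* (y .+ x ./ y)
        iterations += 1
    end
    return y, (; iterations)
end

# The byproduct z is available to the conditions, but no derivatives are
# propagated through it (its cotangent is NoTangent()).
conditions(x, y, z) = y .^ 2 .- x

implicit = ImplicitFunction(forward, conditions, Val(true))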

Allow forward pass to return Jacobian

In some cases, the forward pass will compute a Jacobian along with the output. We should use it instead of duplicating the work in the conditions.

Assert that the output from the forward function is a tuple if byproducts are allowed

I noticed that if the forward function returns an array, the following destructuring syntax still works, assigning 1 to y, 2 to z, and ignoring 3.

y, z = [1, 2, 3]

This can make incorrect code work and lead to incorrect behaviour if the user allows a byproduct but forgets to return it from the forward function. We should probably add an assertion that the output from the forward function is a tuple of length 2 if byproducts are allowed; a sketch follows.
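
A sketch of the proposed check, with a hypothetical helper that would replace the bare destructuring inside the package:

function check_forward_output(out)
    out isa Tuple && length(out) == 2 || throw(ArgumentError(
        "the forward function must return a 2-tuple (y, z) when byproducts are allowed"))
    return out
end

# inside the package, instead of `y, z = forward(x)`:
y, z = check_forward_output(forward(x))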

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

make additional information z optional

What about making z optional?

I see benefits in cleaner, more concise code when writing the forward and condition functions, and possibly minor speed gains.

Documentation inconsistency for v0.4.4

I was trying to implement my own implicit differentiation function for the minimization of a log-sum-exp function, and I found that the documentation for the stable version is somewhat different from what I actually obtained by running the code.

For example, according to the doc of the stable version (https://gdalle.github.io/ImplicitDifferentiation.jl/stable/examples/4_constrained_optim/), I wrote this code:

using ImplicitDifferentiation
using Flux
using Convex, ECOS
using ComponentArrays


# x = rand(2) .+ [0, 1]
i_max = 10
m = 2
A = 2 * rand(i_max, m) .- 1
B = 2 * rand(i_max) .- 1
θ = ComponentArray(A=A, B=B)
u_min = -2*ones(m)
u_max = +2*ones(m)


# function mysqrt_cstr_optim(x)
function minimise_logsumexp(θ; T, u_min, u_max, initial_guess, solver=() -> ECOS.Optimizer())
    (; A, B) = θ
    m = size(A)[2]
    u = Convex.Variable(m)
    if initial_guess != nothing
        u.value = initial_guess
    end
    obj = T * Convex.logsumexp((1/T)*(A*u + B))
    prob = Convex.minimize(obj)
    prob.constraints += [u >= u_min]
    prob.constraints += [u <= u_max]
    solve!(prob, solver(), silent_solver=true, verbose=false)
    minimiser = typeof(u.value) <: Number ? [u.value] : u.value[:]  # to make it a vector
    return minimiser
end


function forward_cstr_optim(x; kwargs...)
    # y = mysqrt_cstr_optim(x)
    y = minimise_logsumexp(x; kwargs...)
    # z = 0
    return y
end


# function proj_hypercube(p)
#     return max.(0, min.(1, p))
# end
function proj_hypercube(p; u_min, u_max)
    return max.(u_min, min.(u_max, p))
end

# function conditions_cstr_optim(x, y)
function conditions_cstr_optim(θ, u; kwargs...)
    # ∇₂f = 2 .* (y .^ 2 .- x)
    (; A, B) = θ
    ∇₂f = A' * Flux.softmax(A*u+B)
    η = 0.1
    # return y .- proj_hypercube(y .- η .* ∇₂f)
    return u .- proj_hypercube(u .- η .* ∇₂f; kwargs...)
end


function implicit_cstr_optim_func(θ; T, u_min, u_max, initial_guess, solver)
    tmp = ImplicitFunction(
                           θ -> forward_cstr_optim(θ; T, u_min, u_max, initial_guess, solver),
                           (θ, u) -> conditions_cstr_optim(θ, u; u_min, u_max),
                          )
    tmp(θ)
end

implicit_cstr_optim_func(θ; T=1, u_min=-ones(m), u_max=+ones(m), initial_guess=nothing, solver=() -> ECOS.Optimizer())

According to the docs, calling the implicit function is expected to return an array.
With v0.5.0-DEV, the result indeed looks fine:

julia> implicit_cstr_optim_func(θ; T=1, u_min=-ones(m), u_max=+ones(m), initial_guess=nothing, solver=() -> ECOS.Optimizer())

2-element Vector{Float64}:
  0.11430458077043977
 -0.9999999846444508

However, with v0.4.4 (the stable version), it gives me a tuple like this:

julia> implicit_cstr_optim_func(θ; T=1, u_min=-ones(m), u_max=+ones(m), initial_guess=nothing, solver=() -> ECOS.Optimizer())
(0.710863877230062, -0.932445241056164)

Support several inputs instead of just `x`

At the moment, if derivatives wrt several inputs are needed, our recommendation is to use ComponentArrays.jl to concatenate them into a single vector. However, this is not very handy and can lead to performance issues: the naive keyword-based constructor for ComponentVector is type-unstable:

julia> using ComponentArrays, Test

julia> @inferred ComponentVector(a=1, b=[2, 3])
ERROR: return type ComponentVector{Int64, Vector{Int64}, Tuple{Axis{(a = 1, b = 2:3)}}} does not match inferred return type Any
Stacktrace:
   [1] error(s::String)
     @ Base ./error.jl:35

julia> f() = ComponentVector(a=1, b=[2, 3])
f (generic function with 1 method)

julia> @inferred f()
ERROR: return type ComponentVector{Int64, Vector{Int64}, Tuple{Axis{(a = 1, b = 2:3)}}} does not match inferred return type Any
Stacktrace:
   [1] error(s::String)
     @ Base ./error.jl:35

In fact, achieving type-stable construction would require knowing the array sizes statically (jonniedie/ComponentArrays.jl#126 (comment)), which is impossible in many cases.

An alternative would be to accept an arbitrary number of input arrays on our end. After brief experimentation, I fear this would introduce new sources of type instability as we loop through the args and splat them, storing results in vectors or tuples.

My proposal is to manually add support for two arguments. It will lead to code duplication but ensure type stability, and should cover a lot of use cases. Most notably, it will be necessary to implement implicit linear solvers, and thus enable second-order differentiation out of the box (#26).

Working with the tangential space of a manifold

Hi there, thanks for your amazing work!

I was wondering if I could use this approach to iteratively work out the tangent space of the manifold defined by f(x) = 0, with x ∈ ℝⁿ, f ∈ ℝᵈ. Right now I'm using the implicit function theorem to calculate the tangent vectors. However, with the naive implementation (using DiffOpt.jl), the accuracy is very low compared to the tangent vectors I get by sampling points and taking (x - x')/|x - x'| for x - x' -> 0. I do not know the concrete details of your implementation, but I have the impression that it is superior to my naive ansatz.

Further, do you think it could be possible to calculate the curvature tensor (and hence the parallel transport) of the manifold at each point? In the end I'm interested in singularities (i.e. bifurcations), and for this I need to be able to calculate derivatives to high precision.

Best,

v.

Fix type stability tests in reverse mode

At the moment, the JET tests in reverse mode are breaking due to runtime dispatch. This is presumably happening because Zygote-computed pullbacks introduce type instability.
The solution may be to write the conditions with explicit rrules, as in the sketch below.
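
For example, with the toy conditions c(x, y) = y .^ 2 .- x, an explicit reverse rule (a sketch, not part of the package) keeps Zygote's generated pullback out of the picture entirely:

using ChainRulesCore

conditions(x, y) = y .^ 2 .- x

# Hand-written rule: ∂c/∂x = -I and ∂c/∂y = Diagonal(2y), so the cotangents
# are -Δc and 2 .* y .* Δc respectively.
function ChainRulesCore.rrule(::typeof(conditions), x, y)
    c = conditions(x, y)
    conditions_pullback(Δc) = (NoTangent(), -Δc, 2 .* y .* Δc)
    return c, conditions_pullback
end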

byproduct not returned

Hi,

here is some code that works with v0.3.0 but is missing the byproducts, and below is another implementation using v0.4.4 (same on the master branch) that tries and fails to get the byproducts. Any help in retrieving them is much appreciated.

block_solver_AD(parameters_and_solved_vars::Vector{<: Real}, 
    n_block::Int, 
    ss_solve_blocks::Function, 
    guess::Vector{Float64}, 
    lbs::Vector{Float64}, 
    ubs::Vector{Float64};
    tol::AbstractFloat = eps(Float64),
    verbose::Bool = false) = ImplicitFunction(x -> block_solver(x,
                                                            n_block, 
                                                            ss_solve_blocks,
                                                            guess,
                                                            lbs,
                                                            ubs;
                                                            tol = tol,
                                                            verbose = verbose)[1],  
                                        (x,y) -> ss_solve_blocks(x,y))

function block_solver(parameters_and_solved_vars::Vector{Float64}, 
    n_block::Int, 
    ss_solve_blocks::Function, 
    guess::Vector{Float64}, 
    lbs::Vector{Float64}, 
    ubs::Vector{Float64};
    tol::AbstractFloat = eps(),
    verbose::Bool = false)

    #=
    ...
    =#

    return sol_values, sol_minimum
end


block_solver_RD = block_solver_AD([alp, bet, psi, del, m, ➕₂, ➕₃, ➕₄, ➕₅], 1, 𝓂.ss_solve_blocks[1], inits, lbs, ubs, verbose = true)

solution = block_solver_RD([alp, bet, psi, del, m, ➕₂, ➕₃, ➕₄, ➕₅])

When I try to get the byproduct using the formulation below with v0.4.4, it does not return the byproduct:

block_solver_AD(parameters_and_solved_vars::Vector{<: Real}, 
    n_block::Int, 
    ss_solve_blocks::Function, 
    guess::Vector{Float64}, 
    lbs::Vector{Float64}, 
    ubs::Vector{Float64};
    tol::AbstractFloat = eps(Float64),
    verbose::Bool = false) = ImplicitFunction(x -> block_solver(x,
                                                            n_block, 
                                                            ss_solve_blocks,
                                                            guess,
                                                            lbs,
                                                            ubs;
                                                            tol = tol,
                                                            verbose = verbose),  
                                        (x,y,z) -> ss_solve_blocks(x,y), Val(true))

Sorry for not providing an MWE; the whole code is a bit involved in parts unrelated to ImplicitDifferentiation. I hope the point comes across nonetheless.

Fix kwargs support

At the moment, kwargs give rise to a strange type inference bug (see the discussion in #19), which is only fixed with a contortion (here).
