Comments (8)
Sure, I personally don't mind additional dependencies. We can think about putting parallel execution behind a feature gate; the only thing I consider important is that whether something runs in parallel or not remains the choice of the user. If I want to run thousands of optimizations on a cluster, or if my cost function implementation is itself parallelized, I often don't want the evaluations to run in parallel. This doesn't necessarily need to be handled by a feature gate, though; it could also be solved in another way.
At which level would you perform the parallelization? Maybe extending the `ArgminOp` trait with the methods `apply_par`, `gradient_par` and so on makes sense? They could be automatically implemented like this (I'm not sure if this compiles):
```rust
fn apply_par(&self, params: &[Self::Param]) -> Result<Vec<Self::Output>, Error> {
    params.par_iter().map(|p| self.apply(p)).collect()
}
```
This would give automatic parallelization, but it would also leave the implementer of the `ArgminOp` trait the choice of changing it if necessary.
It could look like this behind a feature gate:
```rust
fn apply_par(&self, params: &[Self::Param]) -> Result<Vec<Self::Output>, Error> {
    #[cfg(feature = "rayon")]
    {
        params.par_iter().map(|p| self.apply(p)).collect()
    }
    #[cfg(not(feature = "rayon"))]
    {
        params.iter().map(|p| self.apply(p)).collect()
    }
}
```
Similarly, we could have another `compute_all_the_things` (name is open for suggestions ;)) method which computes cost, gradient and Hessian in parallel (or a subset of them, depending on what is needed for a particular solver).
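As a rough, argmin-independent illustration of what such a combined method could look like (the function names here are toy stand-ins, and `std::thread::scope` is used instead of rayon to keep the sketch dependency-free), cost and gradient could be evaluated concurrently and joined afterwards:

```rust
use std::thread;

// Toy stand-ins for a cost function and its gradient; in argmin these
// would be the user's `apply` and `gradient` implementations.
fn cost(p: &[f64]) -> f64 {
    p.iter().map(|x| x * x).sum()
}

fn gradient(p: &[f64]) -> Vec<f64> {
    p.iter().map(|x| 2.0 * x).collect()
}

// Hypothetical combined evaluation: cost and gradient are computed on
// separate scoped threads. A real implementation would more likely use
// rayon::join so the work lands on the existing thread pool.
fn cost_and_gradient(p: &[f64]) -> (f64, Vec<f64>) {
    thread::scope(|s| {
        let c = s.spawn(|| cost(p));
        let g = s.spawn(|| gradient(p));
        (c.join().unwrap(), g.join().unwrap())
    })
}
```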
Automatically deciding whether parallelization makes sense also sounds like a good idea, given that we already measure the time of each iteration. But as a user, I'd also like to be able to opt out of this.
What do you think about this? I have to admit I haven't thought too much about this yet and my experience with rayon is rather limited.
---
I'm trying to implement Parallel Tempering with argmin, and something like this would make it much easier.
---
I was playing with this myself and it seems really easy to implement thanks to the way `collect` handles `Result`:
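For context, this relies on the standard library's `FromIterator` implementation for `Result`: collecting an iterator of `Result<T, E>` into `Result<Vec<T>, E>` short-circuits on the first `Err` and returns it. A minimal, self-contained sketch:

```rust
// Collecting an iterator of Result<T, E> into Result<Vec<T>, E> stops
// at the first Err and returns it; otherwise all Ok values are
// gathered into a Vec. The same mechanism works with rayon's par_iter.
fn parse_all(inputs: &[&str]) -> Result<Vec<i32>, std::num::ParseIntError> {
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}
```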
```rust
#[cfg(feature = "rayon")]
fn apply_par(&self, params: &[Self::Param]) -> Result<Vec<Self::Output>, Error> {
    params.par_iter().map(|p| self.apply(p)).collect()
}

#[cfg(feature = "rayon")]
fn modify_par(
    &self,
    params: &[Self::Param],
    extents: &[Self::Float],
) -> Result<Vec<Self::Param>, Error> {
    params
        .par_iter()
        .zip_eq(extents)
        .map(|(p, e)| self.modify(p, *e))
        .collect()
}
```
And solvers can use a default sequential version of the trait method if the user didn't implement a parallel one.
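The fallback described here is just Rust's default trait methods: a trait can ship a sequential bulk implementation that implementers may override. A toy, argmin-independent sketch (the trait and method names are illustrative, not argmin's actual API):

```rust
// A toy operator trait: `apply` is required, while the bulk variant has
// a sequential default that implementers may override (e.g. with rayon).
trait Op {
    fn apply(&self, param: &f64) -> Result<f64, String>;

    // Default: evaluate sequentially. An implementer who wants
    // parallelism can override this method.
    fn apply_bulk(&self, params: &[f64]) -> Result<Vec<f64>, String> {
        params.iter().map(|p| self.apply(p)).collect()
    }
}

struct Square;

impl Op for Square {
    fn apply(&self, param: &f64) -> Result<f64, String> {
        Ok(param * param)
    }
    // `apply_bulk` is inherited from the trait's default implementation.
}
```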
---
Thanks for having a look at this! I like this approach; however, if we extend the `ArgminOp` trait with `apply_par` and `modify_par`, then I think the `rayon` feature gate should be moved inside the default implementation of the methods:
```rust
fn par_apply(&self, params: &[Self::Param]) -> Result<Vec<Self::Output>, Error> {
    #[cfg(feature = "rayon")]
    {
        params.par_iter().map(|p| self.apply(p)).collect()
    }
    #[cfg(not(feature = "rayon"))]
    {
        params.iter().map(|p| self.apply(p)).collect()
    }
}

fn par_modify(
    &self,
    params: &[Self::Param],
    extents: &[Self::Float],
) -> Result<Vec<Self::Param>, Error> {
    #[cfg(feature = "rayon")]
    {
        params
            .par_iter()
            .zip_eq(extents)
            .map(|(p, e)| self.modify(p, *e))
            .collect()
    }
    #[cfg(not(feature = "rayon"))]
    {
        params
            .iter()
            .zip(extents)
            .map(|(p, e)| self.modify(p, *e))
            .collect()
    }
}
```
Then it is not necessary to deal with the feature gate in the solvers. Also, in contrast to my previous suggestion, I would prefer `par_{apply,modify}` over `{apply,modify}_par`, since it is more in line with `iter` and `par_iter`.
@TheIronBorn what do you think? Would you be willing to provide a PR for this?
---
I looked into that, but it seemed it adds more trait requirements like `Send`/`Sync`. I don't know how reasonable that is for sequential projects. Maybe there's some trait magic we could do.
Also, it might be good to let the user choose what to parallelize, since it might be only the cost function or the Jacobian etc. that is expensive, and parallelizing everything might slow things down.
If the final version doesn't end up parallel by default, perhaps we should rename the methods to `{apply,...}_many` or `bulk_{apply,...}`, though that might make their use case less obvious.
---
Those are all very good points. How do you feel about implementing the sequential (i.e. non-parallelized) version by default, and leaving the parallel implementation to the user if needed (for now)? Then users can choose how they want to parallelize their code. In that case, `bulk_{...}` is probably the better choice for the method names.
I feel this would be a rather temporary solution until we come up with something more convenient, but it would at least allow users to provide a parallelized version if they want to.
---
Yeah I'll do that
---
Implemented in #219.
The `rayon` feature enables parallel evaluation of the cost function, gradient, operator, Jacobian and Hessian via the `bulk_*` methods. One can still opt out of parallelization for individual problem-related traits. Say you don't want to compute the Hessian in parallel: by implementing the `parallelize` method of the `Hessian` trait to return `false`, it will be computed sequentially.
The `bulk_*` methods can of course also be overwritten.
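To illustrate the opt-out design with a self-contained toy (this is not argmin's actual trait; the signatures, the `Sync` supertrait, and the use of `std::thread::scope` in place of rayon are all assumptions made for the sketch), a `parallelize` switch on the trait can decide which path the bulk method takes:

```rust
use std::thread;

// Simplified, argmin-independent sketch of the opt-out mechanism:
// `parallelize` controls whether `bulk_hessian` evaluates in parallel.
// The Sync supertrait lets &Self be shared across threads.
trait Hessian: Sync {
    fn hessian(&self, param: &f64) -> f64;

    // Returning false opts this trait's bulk evaluation out of parallelism.
    fn parallelize(&self) -> bool {
        true
    }

    fn bulk_hessian(&self, params: &[f64]) -> Vec<f64> {
        if self.parallelize() {
            // "Parallel" path: one scoped thread per parameter (argmin
            // would use rayon's thread pool instead).
            thread::scope(|s| {
                let handles: Vec<_> = params
                    .iter()
                    .map(|p| s.spawn(move || self.hessian(p)))
                    .collect();
                handles.into_iter().map(|h| h.join().unwrap()).collect()
            })
        } else {
            // Sequential path, used when `parallelize` returns false.
            params.iter().map(|p| self.hessian(p)).collect()
        }
    }
}

struct MyProblem;

impl Hessian for MyProblem {
    fn hessian(&self, param: &f64) -> f64 {
        2.0 * param
    }

    // Opt out: evaluate sequentially even when parallelism is available.
    fn parallelize(&self) -> bool {
        false
    }
}
```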
The only solver which uses this so far is Particle Swarm Optimization.
I'm happy to hear feedback on the design, in particular if anyone is using this for their own solvers.