Comments (5)

Pratyush commented on June 24, 2024

PS @ebfull, did you have any notes on why Rayon was slower? In Zexe we switched to using only Rayon, and it has made the MSM ~10-15% slower, even though the new code is structured almost identically to the old code.

from bellman.

hdevalence commented on June 24, 2024

I don't know about bellman, but in a very early version of Bulletproofs we tried using Rayon for the parallel parts of the inner product proof and got minimal speedup, even though the task was highly parallel. I don't think we dug very far into it, but the perf counters suggested that it was doing a ton of context switches. Perhaps the work-stealing has more overhead than expected?

Pratyush commented on June 24, 2024

Hmm, how large were those inner product proofs? Perhaps the overhead dominates on small instance sizes? In our case, an MSM over large inputs does speed up noticeably when parallelized, but it is still slower than bellman's MSM on the same input size.

Moreover, futures_cpupool also does work-stealing. It would be quite interesting to investigate if, in some cases, futures_cpupool achieves lower overhead for work-stealing than rayon; maybe the lessons learnt could be used to improve rayon as well.

hdevalence commented on June 24, 2024

I don't remember the size, but I believe the timings were in the range of 1-20 ms.

Pratyush commented on June 24, 2024

OK, so I did some investigation and came up with the following hypothesis: using futures_cpupool in the Groth16 prover gives better performance than using rayon because futures_cpupool::CpuPool somehow schedules the MSM tasks better than rayon does when multiple MSMs happen in quick succession.

To test this hypothesis, I modified the multiexp code to create a new CpuPool for each invocation (so CpuPools are not shared between different MSMs). The resulting code has roughly the same performance as the rayon-ized version. When used in the Groth16 prover, it performs worse than what's currently in master.
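
For readers who want to reproduce the shape of that experiment without pulling in either crate, here is a minimal sketch using only the standard library. It is not bellman's multiexp code and it uses neither futures_cpupool nor rayon: `Pool`, `parallel_sum`, `shared_pool`, and `pool_per_call` are hypothetical names, the pool has a single shared queue with no work-stealing, and a wrapping sum over `u64` values stands in for the MSM. It only illustrates the two setups being compared: one pool reused across several back-to-back jobs versus a fresh pool per invocation.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical fixed-size pool (an assumption for illustration only).
pub struct Pool {
    sender: Option<Sender<Job>>,
    workers: Vec<JoinHandle<()>>,
}

impl Pool {
    pub fn new(threads: usize) -> Self {
        let (sender, receiver) = channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        let workers = (0..threads.max(1))
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                thread::spawn(move || loop {
                    // The lock guard is dropped at the end of this statement,
                    // so the job itself runs without holding the lock.
                    let next = receiver.lock().unwrap().recv();
                    match next {
                        Ok(job) => job(),
                        Err(_) => break, // channel closed: the pool was dropped
                    }
                })
            })
            .collect();
        Pool { sender: Some(sender), workers }
    }

    pub fn execute<F: FnOnce() + Send + 'static>(&self, f: F) {
        let sender = self.sender.as_ref().expect("pool already shut down");
        sender.send(Box::new(f)).expect("worker threads have exited");
    }
}

impl Drop for Pool {
    fn drop(&mut self) {
        // Closing the channel lets every worker leave its loop; then join them.
        drop(self.sender.take());
        for worker in self.workers.drain(..) {
            let _ = worker.join();
        }
    }
}

/// Stand-in for one MSM: split the input into chunks and sum them on the pool.
pub fn parallel_sum(pool: &Pool, data: &Arc<Vec<u64>>, chunks: usize) -> u64 {
    let chunks = chunks.max(1);
    let chunk_len = ((data.len() + chunks - 1) / chunks).max(1);
    let (tx, rx) = channel::<u64>();
    let mut start = 0;
    while start < data.len() {
        let end = (start + chunk_len).min(data.len());
        let chunk_data = Arc::clone(data);
        let tx = tx.clone();
        pool.execute(move || {
            let partial = chunk_data[start..end]
                .iter()
                .fold(0u64, |acc, x| acc.wrapping_add(*x));
            let _ = tx.send(partial);
        });
        start = end;
    }
    drop(tx); // the receiver loop ends once every job has dropped its sender
    rx.iter().fold(0u64, |acc, x| acc.wrapping_add(x))
}

/// One pool reused across all jobs in quick succession.
pub fn shared_pool(inputs: &[Arc<Vec<u64>>], threads: usize) -> Vec<u64> {
    let pool = Pool::new(threads);
    inputs.iter().map(|d| parallel_sum(&pool, d, threads)).collect()
}

/// A fresh pool created (and torn down) for every job.
pub fn pool_per_call(inputs: &[Arc<Vec<u64>>], threads: usize) -> Vec<u64> {
    inputs
        .iter()
        .map(|d| {
            let pool = Pool::new(threads);
            parallel_sum(&pool, d, threads)
        })
        .collect()
}
```

Both functions return the same results, so timing each call with `std::time::Instant` on the same inputs isolates the cost of pool setup and teardown. Any numbers obtained this way describe this toy pool only, not futures_cpupool or rayon.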
