Comments (7)

mtfishman commented on September 26, 2024

Thanks for the report. @b-kloss looks like an issue with the custom _Allreduce function we wrote.

from itensorparallel.jl.

b-kloss commented on September 26, 2024

_Allreduce is written so that summing the contributions from different terms in different nonzero blocks is handled by MPI (with + as the reduction operator), and I doubt that threading over blocks can be used there. Outside of what _Allreduce does (i.e. when applying the environment tensors to the site tensor), threaded_blocksparse should not pose a problem, though.
So it could be a matter of controlling which thread calls MPI and making sure that all threads have finished before MPI is called.
For the former, see
https://juliaparallel.org/MPI.jl/v0.13/environment.html#MPI.ThreadLevel
For the latter, I am not sure how to acquire and release a lock for threading at the threaded_blocksparse level.

mtfishman commented on September 26, 2024

It doesn't make much sense to me since _Allreduce should be totally independent of the block sparse multithreading, which is only used in tensor contraction. Basically it is organized as follows:

  1. Per MPI process, it performs tensor contractions within the eigensolver (either multithreaded over blocks or not).
  2. Reduce over the result of each process.

Somehow the code is giving different results depending on whether step 1 uses multithreading. That makes me think that the multithreaded contract code is doing something strange and outputting different tensors than the non-multithreaded contract code, and that those tensors then cause issues for _Allreduce.

b-kloss commented on September 26, 2024

Yes, I agree. I also tried disabling threaded_blocksparse within the call to product(P::MPIProjMPOSum,v) and enabling it afterwards, and it still gives the BoundsError.
1.) Do we assume that the blocks are of the maximally possible size (allowed by quantum numbers) in the _Allreduce?
2.) Does threaded_blocksparse fragment blocks that are compatible quantum-number-wise into subblocks depending on the number of threads?

mtfishman commented on September 26, 2024

> 1.) Do we assume that the blocks are of the maximally possible size (allowed by quantum numbers) in the _Allreduce?

No, most of the logic is analyzing the blocks that actually exist in the tensors and sharing that information across the processes.
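
To illustrate the idea of reducing over "the blocks that actually exist", here is a minimal sketch that does not use MPI or ITensors at all. The `BlockSparse` type and the `reduce_blocksparse` function are hypothetical names made up for this example, and the "processes" are simply entries of a vector; this is not the actual _Allreduce implementation.

```julia
# Hypothetical stand-in for a block-sparse tensor: block index => dense block.
const BlockSparse = Dict{Tuple{Int,Int},Matrix{Float64}}

# Sum block-sparse tensors whose sets of stored blocks may differ.
# In a real MPI run the block lists would be exchanged between processes;
# here each "process" is just one entry of `locals`.
function reduce_blocksparse(locals::Vector{BlockSparse})
  # Step 1: collect the blocks that exist on at least one process.
  allblocks = Set{Tuple{Int,Int}}()
  for t in locals
    union!(allblocks, keys(t))
  end
  # Step 2: add up each block, skipping processes that do not store it.
  result = BlockSparse()
  for b in sort!(collect(allblocks))
    for t in locals
      haskey(t, b) || continue
      result[b] = haskey(result, b) ? result[b] + t[b] : copy(t[b])
    end
  end
  return result
end

# Two "processes" that share block (2, 2) but otherwise store different blocks.
t1 = BlockSparse((1, 1) => [1.0 2.0], (2, 2) => fill(3.0, 1, 1))
t2 = BlockSparse((2, 2) => fill(1.0, 1, 1), (3, 1) => [5.0 6.0])
total = reduce_blocksparse([t1, t2])
```

The point of the sketch is only that no maximal block structure is assumed: the result contains exactly the union of the blocks that were present in the inputs.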

> 2.) Does threaded_blocksparse fragment blocks that are compatible quantum-number-wise into subblocks depending on the number of threads?

No, it only threads over the list of blocks, not within blocks (ideally threading within blocks would get taken care of by BLAS, but currently BLAS and Julia threads don't compose with each other).
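
As a rough picture of "threading over the list of blocks, not within blocks", the following sketch uses plain Julia threads on a list of dense matrices. The function `contract_blocks` and the data layout are assumptions made for illustration and are not how NDTensors stores or contracts blocks.

```julia
using Base.Threads: @threads

# Hypothetical block-wise "contraction": each output block is an independent
# dense matrix product, so different blocks can go to different threads.
function contract_blocks(A::Vector{Matrix{Float64}}, B::Vector{Matrix{Float64}})
  C = Vector{Matrix{Float64}}(undef, length(A))
  @threads for i in eachindex(A)
    # One thread handles a whole block; the block itself is never split up.
    C[i] = A[i] * B[i]
  end
  return C
end

A = [rand(2, 3), rand(4, 4), rand(1, 5)]
B = [rand(3, 2), rand(4, 4), rand(5, 1)]
C = contract_blocks(A, B)
```

Each iteration writes to its own slot of `C`, so no locking is needed in this simplified setting. Start Julia with more than one thread (for example `julia -t 4`) to actually run the loop in parallel.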

b-kloss commented on September 26, 2024

Here's a hack that fixes the issue for me. Add this function redefinition to the top of your script:

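using ITensors  # must be loaded before the redefinition below

# Workaround: always take the non-threaded code path when computing block offsets.
function ITensors.NDTensors.contract_blockoffsets(args...)
  # Threaded branch, disabled by this workaround:
  #if using_threaded_blocksparse() && nthreads() > 1
  #  return _threaded_contract_blockoffsets(args...)
  #end
  return ITensors.NDTensors._contract_blockoffsets(args...)
end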

I don't expect the hack to incur a big performance penalty by using the non-threaded version since it seems to me that the above function only does preparation/logic for the contraction and not the contraction itself, but I may be wrong about this.

Update:
Adding an extra sort statement in _threaded_contract_blockoffsets(args...) like

contraction_plan = reduce(vcat, contraction_plans)
sort!(contraction_plan; by=last)

fixes the issue (without the above redefinition). I'll put in a PR for this issue in ITensors.jl after the weekend.
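
To show what the extra sort does to the concatenated plan, here is a toy example with made-up data. The `PlanEntry` layout (block of A, block of B, index of the output block) is an assumption for illustration, not the actual NDTensors data structure, and the thread does not confirm which downstream code depends on the ordering.

```julia
# Hypothetical plan entry: (block of A, block of B, index of the output block).
const PlanEntry = Tuple{Int,Int,Int}

# Two partial plans, as two threads might produce them (made-up data).
contraction_plans = [
  PlanEntry[(1, 1, 1), (3, 2, 3)],
  PlanEntry[(2, 1, 2), (4, 2, 3), (5, 3, 1)],
]

# Plain concatenation keeps the per-thread order, so the output-block
# indices come out as 1, 3, 2, 3, 1.
contraction_plan = reduce(vcat, contraction_plans)
unsorted_outputs = last.(contraction_plan)

# Sorting by the last element groups the entries by output block: 1, 1, 2, 3, 3.
sort!(contraction_plan; by=last)
sorted_outputs = last.(contraction_plan)
```

Without the sort, the order of the combined plan depends on how the work was split across threads; with it, the order is determined by the output block alone.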

mtfishman commented on September 26, 2024

@nbaldelli could you test out the latest version of the package which includes #13? That should fix the issue you first reported.
