b3sum incredibly slow (blake3 issue, CLOSED)

tERyceNzAchE commented on August 23, 2024
b3sum incredibly slow

from blake3.

Comments (13)

oconnor663 commented on August 23, 2024

You know, I wonder if the performance problem here is audible :)

tERyceNzAchE commented on August 23, 2024

> I wonder if the performance problem here is audible :)

It does seem to be.

tERyceNzAchE commented on August 23, 2024

That took it from minutes to seconds!

tERyceNzAchE commented on August 23, 2024

Yes, this is from a spinning drive.

xzfc commented on August 23, 2024

This isn't documented in b3sum itself, but you can disable multi-threading by setting the RAYON_NUM_THREADS=1 environment variable.
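
As a minimal sketch of that suggestion, assuming a POSIX shell with b3sum on the PATH: the helper name `with_one_thread` and the file name `big_file` are illustrative placeholders, not part of b3sum.

```shell
# Run any command with RAYON_NUM_THREADS=1 set for that one invocation.
# with_one_thread is a hypothetical helper name, not something b3sum ships.
with_one_thread() {
    RAYON_NUM_THREADS=1 "$@"
}

# big_file is a placeholder path; the guard keeps the snippet harmless
# when b3sum or the file is missing.
if command -v b3sum >/dev/null 2>&1 && [ -f big_file ]; then
    with_one_thread b3sum big_file
fi
```

Setting the variable inline this way avoids exporting it into the rest of the shell session, so later b3sum runs keep their default thread count.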

tERyceNzAchE commented on August 23, 2024

> This isn't documented in b3sum itself, but you can disable multi-threading by setting the RAYON_NUM_THREADS=1 environment variable.

This didn't make a significant difference (unlike b3sum < big_file, which did).

oconnor663 commented on August 23, 2024

Out of curiosity, what happens if you hash via stdin like this: b3sum < big_file? (Does that work on Windows?) I ask because that'll have the side effect of disabling memory mapping and multi-threading, and there's a chance that having a bunch of threads thrashing your disk could trigger some pathological performance issue.
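
A rough way to compare the two modes side by side, sketched for a POSIX shell with b3sum on the PATH. The function names and the `big_file` path are hypothetical, and `date +%s` only gives whole-second resolution, which is enough for a minutes-versus-seconds difference.

```shell
# Path argument: b3sum opens the file itself (per the comment above, this is
# the mode that may use memory mapping and multiple threads).
hash_via_path() {
    b3sum "$1"
}

# Stdin redirect: the shell opens the file and b3sum reads it as a stream.
hash_via_stdin() {
    b3sum < "$1"
}

# Print the wall-clock seconds a command takes, discarding its stdout.
timed() {
    start=$(date +%s)
    "$@" >/dev/null
    end=$(date +%s)
    echo "$1: $((end - start))s"
}

# big_file is a placeholder; nothing runs if b3sum or the file is missing.
if command -v b3sum >/dev/null 2>&1 && [ -f big_file ]; then
    timed hash_via_path big_file
    timed hash_via_stdin big_file
fi
```

If the file fits in memory, the second run may be served from the OS cache rather than the disk, so the comparison is most meaningful on a file larger than RAM or with the order swapped on a repeat run.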

oconnor663 commented on August 23, 2024

Cool. "Thrashing the disk" is my current theory for your slow numbers. The next thing I'd be curious to try is a file small enough to fit in memory. That should work better with all the threads, because you won't actually be hitting the disk so much. Also, do you happen to have a spinning drive?

oconnor663 commented on August 23, 2024

"Don't bother with multi-threading when you're on a spinning drive and the file is too large to fit in cache" would be a decent performance heuristic, but I don't know of any simple or reliable way to detect either of those two conditions...
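
The decision rule itself (with "too large", as corrected later in the thread) can be written down as a small sketch. It deliberately leaves out the hard part named above: the caller has to supply whether the drive is rotational and how much memory is available for caching. The helper name `choose_threading` and its arguments are hypothetical.

```shell
# Hypothetical helper: prints "single" or "multi".
#   $1 = 1 if the drive is rotational, 0 otherwise (caller must determine this)
#   $2 = file size in bytes
#   $3 = memory available for caching, in bytes (caller must estimate this)
choose_threading() {
    if [ "$1" -eq 1 ] && [ "$2" -gt "$3" ]; then
        echo single
    else
        echo multi
    fi
}

# Example: a 64 GiB file on a spinning drive with 16 GiB of cache available.
choose_threading 1 68719476736 17179869184   # prints "single"
```

A result of "single" would then map onto one of the workarounds mentioned in this thread, such as RAYON_NUM_THREADS=1 or reading via stdin.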

tERyceNzAchE commented on August 23, 2024

> "Don't bother with multi-threading when you're on a spinning drive and the file is too small to fit in cache" would be a decent performance heuristic, but I don't know of any simple or reliable way to detect either of those two conditions...

This isn't a small file. Perhaps a command line flag to disable multi-threading?

oconnor663 commented on August 23, 2024

Oops, meant to say "too large" above. Some flag like that might be useful for benchmarking and testing (situations like this), and we could totally add it. That said, I don't expect that most users facing this sort of issue will be able to discover that the --single-threaded flag (or whatever we call it) fixes things.

oconnor663 commented on August 23, 2024

Interesting. I wonder if memory mapping itself is leading to thrashing, even if the access pattern of the mapping is serial? I don't know enough about the inner workings of memory mapping to say, and especially not on Windows.

oconnor663 commented on August 23, 2024

I'm going to close this issue for now, but we can re-open if it starts to look like there's something b3sum itself should be doing differently in the default case.
