Comments (3)
mold does not use SIMD instructions explicitly, but SIMD is used in a lot of places in mold, because many library functions are implemented using SIMD. For example, glibc's strlen is implemented using SSE 4.2 instructions, I believe. Another example is xxhash3. There might be other places where I can use SIMD to improve mold's performance, but I can't come up with anything right now.
As for cryptographic hashing, I actually tried BLAKE3. We are currently using SHA-256 for Identical Comdat Folding and Build-ID computation. For the former use case, we need to compute a cryptographic hash of small data (typically less than 100 bytes). For the latter, we compute a SHA-256 of the entire output file, which can be multiple gigabytes in size.
It looks like BLAKE3 is slower than SHA-256 for small data, at least on my machine. This is perhaps due to high initialization and finalization costs. For large data, BLAKE3 is indeed faster than SHA-256 by a factor of two. If we have enough cores, build-id computation is bounded by memory bandwidth even with SHA-256, so I don't see an immediate need to switch to BLAKE3, though.
from mold.
SIMD can speed up large loops where each iteration does the same thing (no input-dependent branches), and each iteration does not depend on the previous one. For example, SIMD can speed up most kinds of non-cryptographic checksum calculation.
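To make the distinction concrete, here is a minimal sketch in portable C++ with no intrinsics. The function names `byte_sum` and `fnv1a` are made up for illustration and are not taken from mold. The first loop only carries an associative sum, so a compiler is free to split it across SIMD lanes; the second carries a state that each step multiplies, so every iteration has to wait for the previous one.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical additive checksum. The only value carried between
// iterations is a sum, and addition is associative, so partial sums
// can be accumulated in separate SIMD lanes and combined at the end.
uint32_t byte_sum(const uint8_t *data, size_t len) {
  uint32_t sum = 0;
  for (size_t i = 0; i < len; i++)
    sum += data[i];
  return sum;
}

// Contrast: 32-bit FNV-1a. Each step XORs a byte into the state and
// then multiplies the state, so iteration i needs the result of
// iteration i-1. This serial chain cannot be split across lanes.
uint32_t fnv1a(const uint8_t *data, size_t len) {
  uint32_t h = 2166136261u;   // FNV offset basis
  for (size_t i = 0; i < len; i++) {
    h ^= data[i];
    h *= 16777619u;           // FNV prime
  }
  return h;
}
```

Whether the first loop actually gets vectorized depends on the compiler and the optimization flags, so it is worth checking the generated assembly rather than assuming.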
SIMD cannot speed up anything else, including branchy code, recursion, loops that are too short, non-loops, and most data structure operations (arrays are fine, few other things are). Sometimes it's possible to rewrite a function into a SIMD-friendlier form, but it's rare.
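As a small illustration of what a SIMD-friendlier rewrite can look like, here is a sketch that counts occurrences of a byte in two ways. Both function names are hypothetical and have nothing to do with mold's code. The second version replaces the input-dependent branch with an unconditional add of the comparison result.

```cpp
#include <cstddef>
#include <cstdint>

// Branchy form: whether the counter is touched depends on the input.
size_t count_byte_branchy(const uint8_t *data, size_t len, uint8_t needle) {
  size_t n = 0;
  for (size_t i = 0; i < len; i++) {
    if (data[i] == needle)
      n++;
  }
  return n;
}

// Branch-free form: the comparison yields 0 or 1, which is added on
// every iteration, so each iteration executes the same instructions.
size_t count_byte_branchless(const uint8_t *data, size_t len, uint8_t needle) {
  size_t n = 0;
  for (size_t i = 0; i < len; i++)
    n += (data[i] == needle);
  return n;
}
```

For a loop this simple, an optimizing compiler will often perform the conversion on its own, so the two versions may compile to the same code. The manual rewrite matters more when the body of the branch is too complex for the compiler to convert.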
I think mold spends most of its time in TBB hashmaps. The hash calculation may be SIMDable, unless TBB already does that; the rest of the hashmap is, as far as I know, not SIMDable.
SIMD can also only speed up your own code. Disk access belongs to the kernel, not to mold.
I agree with you; I was mostly referring to the symbol resolution step. simdjson uses some clever SIMD tricks, discussed in their paper, to avoid branching while still being able to resolve symbols, but I'm not certain how applicable they would be here.
As for hashing, https://github.com/BLAKE3-team/BLAKE3 (6.8 GiB/s versus SHA1's 1 GiB/s in their benchmark) would be a good candidate, but I imagine @rui314 isn't keen on adding more dependencies unless they're absolutely necessary.