Comments (7)
In this case some algorithms that avoid SSE4.2 may start using it. This allows simplifying/optimizing bitset vectorization; also, maybe some of the reversing algorithms could use pshufb directly.
What is the fraction of SSE4.2 but not AVX?
from stl.
In this case some algorithms that avoid SSE4.2 may start using it.
Ah, even better! 😻
What is the fraction of SSE4.2 but not AVX?
In theory I could get accurate numbers, but I'd have to ask around through several people. The 2024-02 Steam Hardware Survey suggests a lower bound: it says that 0.46% of CPUs don't support SSE4.2, while 6.75% don't support AVX2. Given how close the SSE4.2 figure is to the number I heard, the survey looks like a vaguely reasonable lower bound across all CPUs that we target, not just performance-minded gamers. I'd guess the actual fraction without AVX2 for us is in the range of 10-20%, so AVX2 optimizations benefit the vast majority of users, but we won't be able to assume its existence for a decade.
Oh, I see there are still a lot of machines with AVX and not AVX2...
I wanted to know if the SSE code path is still useful at all, but looks like it is.
(It is possible to rewrite some of the AVX2 algorithms to use AVX, but let's not do that for various reasons, including the one behind this issue.)
Oh, I see there are still a lot of machines with AVX and not AVX2...
Exactly.
Ideally vector_algorithms should provide something like (IMO):
if (Has_AVX2()) {
...
} else if (Has_AVX()) {
...
} else if (Has_SSE()) { //All SSEs (2, 3, 4.*).
...
} else { //Scalar.
...
}
AVX512 is out of this equation for the next decade I believe.
Adding codepaths to distinguish AVX1 from AVX2 raises the same sort of hazards that I'm concerned about with SSE2 versus SSE4.2. Although the AVX1/AVX2 delta is maybe 4% of processors, I think the risk isn't worth it.
Although the AVX1/AVX2 delta is maybe 4% of processors, I think the risk isn't worth it.
Then, am I right that your suggestion is:
//vector_algorithms.cpp
if (Has_AVX()) { //Exactly AVX1 and AVX2.
...
} else if (Has_SSE()) { //All SSEs (2, 3, 4.*).
...
} else { //Scalar.
...
}
Yes, and we already have these functions (they are properly named _Use_avx2() and _Use_sse42()), so we just need to fuse the _Use_sse2() codepaths:
STL/stl/src/vector_algorithms.cpp
Lines 25 to 39 in be81252