
Comments (7)

AlexGuteniev commented on August 20, 2024

In this case some algorithms that avoid SSE4.2 may start using it. This allows simplifying/optimizing the bitset vectorization; also, maybe some of the reversing algorithms could use pshufb directly.

What is the fraction of SSE4.2 but not AVX?

from stl.

StephanTLavavej commented on August 20, 2024

In this case some algorithms that avoid SSE4.2 may start using it.

Ah, even better! 😻

What is the fraction of SSE4.2 but not AVX?

In theory I could get accurate numbers, but I'd have to ask around through several people. The 2024-02 Steam Hardware Survey suggests a lower bound: it says that 0.46% of CPUs don't support SSE4.2, while 6.75% don't support AVX2. The SSE4.2 figure is similar to the number I heard, which indicates that the survey is a vaguely reasonable lower bound for the numbers across all CPUs that we target, not just performance-minded gamers. I'd guess the actual share of CPUs without AVX2 for us is in the range of 10-20%, so AVX2 optimizations benefit the vast majority of users, but we won't be able to assume its existence for a decade.


AlexGuteniev commented on August 20, 2024

Oh, I see there are still a lot of machines with AVX and not AVX2...
I wanted to know if the SSE code path is still useful at all, but looks like it is.
(It is possible to rewrite some of the AVX2 algorithms to use AVX, but let's not do that for various reasons, including the one behind this issue.)


jovibor commented on August 20, 2024

Oh, I see there are still a lot of machines with AVX and not AVX2...

Exactly.
Ideally vector_algorithms should provide something like (IMO):

if (Has_AVX2()) {
    ...
} else if (Has_AVX()) {
    ...
} else if (Has_SSE()) { // All SSEs (2, 3, 4.*).
    ...
} else { // Scalar.
    ...
}

AVX512 is out of this equation for the next decade I believe.
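
The four-tier chain above can be sketched as a small selection function. This is only an illustration under stated assumptions: the `Feature` bits, the `Tier` enum, and `select_tier` are hypothetical names that do not come from the thread or from the STL sources, and the feature mask is passed in as a parameter instead of being read from the CPU.

```cpp
#include <cstdint>

// Hypothetical feature bits (illustration only). A real implementation
// would obtain the mask from CPUID or from a runtime-provided variable.
enum Feature : std::uint32_t {
    kSse2  = 1u << 0,
    kSse42 = 1u << 1,
    kAvx   = 1u << 2,
    kAvx2  = 1u << 3,
};

// The four code paths named in the proposed if/else chain.
enum class Tier { Scalar, Sse, Avx, Avx2 };

// Checks run from the widest instruction set to the narrowest, so the
// widest tier that the mask supports is the one selected.
constexpr Tier select_tier(std::uint32_t features) noexcept {
    if (features & kAvx2) return Tier::Avx2;
    if (features & kAvx) return Tier::Avx;
    if (features & kSse2) return Tier::Sse;
    return Tier::Scalar;
}
```

With this sketch, a mask of `kSse2 | kSse42 | kAvx` (AVX without AVX2) selects `Tier::Avx`, and an empty mask selects `Tier::Scalar`. Every extra tier is one more branch that needs its own implementation and test coverage.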


StephanTLavavej commented on August 20, 2024

Adding codepaths to distinguish AVX1 from AVX2 raises the same sort of hazards that I'm concerned about with SSE2 versus SSE4.2. Although the AVX1/AVX2 delta is maybe 4% of processors, I think the risk isn't worth it.


jovibor commented on August 20, 2024

Although the AVX1/AVX2 delta is maybe 4% of processors, I think the risk isn't worth it.

Then, am I right that your suggestion is:

// vector_algorithms.cpp

if (Has_AVX()) { // Exactly AVX1 and AVX2.
    ...
} else if (Has_SSE()) { // All SSEs (2, 3, 4.*).
    ...
} else { // Scalar.
    ...
}


StephanTLavavej commented on August 20, 2024

Yes, and we already have these functions (they are properly named _Use_avx2() and _Use_sse42()), so we just need to fuse the _Use_sse2() codepaths:

bool _Use_avx2() noexcept {
    return __isa_enabled & (1 << __ISA_AVAILABLE_AVX2);
}
bool _Use_sse42() noexcept {
    return __isa_enabled & (1 << __ISA_AVAILABLE_SSE42);
}
bool _Use_sse2() noexcept {
#ifdef _M_IX86
    return __isa_enabled & (1 << __ISA_AVAILABLE_SSE2);
#else
    return true;
#endif
}
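
One possible reading of "fusing the _Use_sse2() codepaths" is that the separate SSE2 tier disappears and only the AVX2 and SSE4.2 checks remain. The sketch below illustrates that reading; it is an assumption, not the actual STL change. The names `IsaIndex`, `Path`, `use_avx2`, `use_sse42`, and `select_path` are hypothetical, the index values are stand-ins chosen for illustration, and the mask is passed as a parameter so the example stays portable.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the runtime's feature indices. The values are
// illustrative and are not claimed to match any real header.
enum IsaIndex : unsigned { kIsaSse2 = 1, kIsaSse42 = 2, kIsaAvx2 = 5 };

// Only three code paths remain once the SSE2 tier is fused away.
enum class Path { Scalar, Sse42, Avx2 };

constexpr bool use_avx2(std::uint32_t isa_enabled) noexcept {
    return (isa_enabled & (1u << kIsaAvx2)) != 0;
}

constexpr bool use_sse42(std::uint32_t isa_enabled) noexcept {
    return (isa_enabled & (1u << kIsaSse42)) != 0;
}

// Assumed fused dispatch: SSE2 support alone is never consulted, so a CPU
// with SSE2 but without SSE4.2 takes the scalar path.
constexpr Path select_path(std::uint32_t isa_enabled) noexcept {
    if (use_avx2(isa_enabled)) return Path::Avx2;
    if (use_sse42(isa_enabled)) return Path::Sse42;
    return Path::Scalar;
}
```

Under these assumptions, a mask with only the SSE2 bit set yields `Path::Scalar`, while a mask with the SSE2 and SSE4.2 bits set (for example, a CPU with AVX but not AVX2) yields `Path::Sse42`.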

