vectorclass / add-on
Add-on packages for Vector class library
License: Apache License 2.0
Hi,
First of all, thanks for your great code!
I'm seeing a strange bug with version 2.0 of the Ranvec1 class when using generator type 3 (gtype = 3),
under Windows 10 x64 with the VS2019 compiler on a Ryzen 3900X processor.
The following code works flawlessly with version 1.25 of the Ranvec1 class:
static const Vec8f BernoulliVecFloat(const Float& keepRate)
{
    __declspec(align(64)) static thread_local Ranvec1* generator;
    if (!generator)
    {
        generator = new Ranvec1(3);
        generator->init(int(std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::high_resolution_clock::now().time_since_epoch()).count() + std::hash<std::thread::id>()(std::this_thread::get_id())));
    }
    return select((*generator).random8f() < keepRate, Vec8f(1), Vec8f(0));
}
I had to make a small change to the Ranvec1base class to get the alignment requirements right:
__declspec(align(64)) class Ranvec1base {
public:
    Ranvec1base(int gtype = 3); // Constructor
    void* operator new(size_t i)
    {
        return _aligned_malloc(i, 64);
    }
    void operator delete(void* p)
    {
        _aligned_free(p);
    }
The bug doesn't happen in Debug mode, and strangely it doesn't occur on an Intel Haswell platform either.
Could this have something to do with the problem Zen 2 has with the rdrand instruction?
Thanks
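A side note on the alignment workaround above: with a C++17 compiler, a type declared with alignas(64) should get correctly aligned storage from plain new, and a thread_local object (rather than a thread_local pointer) is aligned by the compiler. The sketch below illustrates this with a hypothetical stand-in type, GeneratorState, which is not part of the library. Whether it removes the need for the custom operator new/delete in this particular setup is an assumption that would have to be checked with /std:c++17 or later.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for a 64-byte-aligned generator state.
// It is NOT the library's Ranvec1base; it only illustrates the alignment rules.
struct alignas(64) GeneratorState {
    std::uint32_t words[16] = {};
};

// True if the pointer lies on a 64-byte boundary.
inline bool isAligned64(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 64 == 0;
}

// A thread_local object (instead of a thread_local pointer) is placed and
// aligned by the compiler, with one instance per thread.
inline GeneratorState& threadState() {
    static thread_local GeneratorState state;
    assert(isAligned64(&state));
    return state;
}

// Since C++17, plain new (used by make_unique) honours alignas(64)
// for over-aligned types.
inline std::unique_ptr<GeneratorState> makeHeapState() {
    return std::make_unique<GeneratorState>();
}
```

The heap variant is only needed if the object really has to be allocated dynamically; for the per-thread case the thread_local object is usually the simpler choice.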
Hi. Would it be possible to add the license file in this add-on repository? As it is, the only mention (by name only) of a license seems to be in the manuals. Thanks.
First of all, this is a great library. It has been really helpful in my machine learning algorithm implementations.
I wanted to suggest a change to the load method in vector_containers.h: it should take const void* instead of void*.
Related issue
In the excellent vc_manual, on page 73, there is the following snippet:
// Access array as single elements :
float* mydataf = (float*)mydata ;
int i;
for (i = 0; i < datasize ; ++i) {
    mydataf[i] = (float)i ;
}
However, I have read that although casting a pointer to float into a pointer to a SIMD type is fine, the reverse direction may not be:
float x ;
__m128* x_simd = reinterpret_cast<__m128*>(&x) ; // <--- This is fine
but
__m128 x_simd;
float* x = reinterpret_cast<float*>(&x_simd) ; // <--- This may not be fine
For reference, see this Stack Overflow answer and this Stack Overflow post.
I raised a similar question for the Vc library, which is also used in some of the codebases I have worked on.
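One way to sidestep the aliasing question is to copy the vector into a float array, either with a store intrinsic or with std::memcpy, instead of reading through a cast pointer. The sketch below uses raw SSE intrinsics rather than the vector class library, and the helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <xmmintrin.h>  // SSE intrinsics (x86 only)

// Copy the four lanes of an SSE register into a plain float array.
// _mm_storeu_ps writes through a float*, so no __m128 object is ever
// read through a float*.
inline void lanesViaStore(__m128 v, float out[4]) {
    _mm_storeu_ps(out, v);
}

// Same result with std::memcpy, which copies the object representation
// without any pointer cast.
inline void lanesViaMemcpy(__m128 v, float out[4]) {
    static_assert(sizeof(__m128) == 4 * sizeof(float), "unexpected __m128 size");
    std::memcpy(out, &v, sizeof(v));
}

// Read a single lane by index via a temporary array.
inline float lane(__m128 v, int i) {
    assert(i >= 0 && i < 4);
    float tmp[4];
    lanesViaStore(v, tmp);
    return tmp[i];
}
```

Compilers typically turn these copies into plain register moves, so the safer form need not cost anything at run time, although that depends on the compiler and optimization level.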
Hi,
I had to make a small addition next to the default constructor of the Ranvec1 class. I couldn't use the provided init(int seed,...) functions to set the desired seeds because I use the generator as a static thread_local object, so in this case the seeding must happen in the constructor. Could this be added to your repository, in case it is useful for other projects?
thanks
#if defined(DNN_AVX512BW) || defined(DNN_AVX512)
typedef Vec16f VecFloat;
constexpr auto VectorSize = 16ull;
#elif defined(DNN_AVX2) || defined(DNN_AVX)
typedef Vec8f VecFloat;
constexpr auto VectorSize = 8ull;
#elif defined(DNN_SSE42) || defined(DNN_SSE41)
typedef Vec4f VecFloat;
constexpr auto VectorSize = 4ull;
#endif
inline static auto BernoulliVecFloat(const Float p = Float(0.5)) noexcept
{
    static thread_local auto generator = Ranvec1(Seed<int>(), static_cast<int>(std::hash<std::thread::id>()(std::this_thread::get_id())), 3);
#if defined(DNN_AVX512BW) || defined(DNN_AVX512)
    return select(generator.random16f() < p, VecFloat(1), VecFloat(0));
#elif defined(DNN_AVX2) || defined(DNN_AVX)
    return select(generator.random8f() < p, VecFloat(1), VecFloat(0));
#elif defined(DNN_SSE42) || defined(DNN_SSE41)
    return select(generator.random4f() < p, VecFloat(1), VecFloat(0));
#endif
}
The small addition next to the default constructor (ranvec1.h, beginning at line 290):
/******************************************************************************
Ranvec1: Class for combined random number generator
Make one instance of Ranvec1 for each thread.
Remember to initialize it with a seed.
Each instance must have a different seed if you want different random sequences
******************************************************************************/
// Combined random number generator. Derived class with various output functions
// (Total size depends on INSTRSET and MAX_VECTOR_SIZE)
class Ranvec1 : public Ranvec1base {
public:
    // Constructor
    Ranvec1(int gtype = 3) : Ranvec1base(gtype), buf32(this), buf64(this), buf128(this)
#if MAX_VECTOR_SIZE >= 256
        , buf256(this)
#endif
#if MAX_VECTOR_SIZE >= 512
        , buf512(this)
#endif
    {
        randomixInterval = randomixLimit = 0;
    }
    Ranvec1(int seed1, int gtype = 3) : Ranvec1base(gtype), buf32(this), buf64(this), buf128(this)
#if MAX_VECTOR_SIZE >= 256
        , buf256(this)
#endif
#if MAX_VECTOR_SIZE >= 512
        , buf512(this)
#endif
    {
        randomixInterval = randomixLimit = 0;
        Ranvec1base::init(seed1);
        resetBuffers();
    }
    Ranvec1(int seed1, int seed2, int gtype = 3) : Ranvec1base(gtype), buf32(this), buf64(this), buf128(this)
#if MAX_VECTOR_SIZE >= 256
        , buf256(this)
#endif
#if MAX_VECTOR_SIZE >= 512
        , buf512(this)
#endif
    {
        randomixInterval = randomixLimit = 0;
        Ranvec1base::init(seed1, seed2);
        resetBuffers();
    }
    Ranvec1(int32_t const seeds[], int numSeeds, int gtype = 3) : Ranvec1base(gtype), buf32(this), buf64(this), buf128(this)
#if MAX_VECTOR_SIZE >= 256
        , buf256(this)
#endif
#if MAX_VECTOR_SIZE >= 512
        , buf512(this)
#endif
    {
        randomixInterval = randomixLimit = 0;
        Ranvec1base::initByArray(seeds, numSeeds);
        resetBuffers();
    }
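If changing the library is not an option, a small wrapper whose constructor calls init should achieve the same effect for a static thread_local object, and it constructs the generator in place without copying it. This is a minimal sketch with a hypothetical stand-in class, ToyGenerator (a simple linear congruential generator, not Ranvec1); only the wrapper pattern is the point.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <thread>

// Hypothetical stand-in for a generator that must be seeded through init().
class ToyGenerator {
public:
    explicit ToyGenerator(int gtype = 3) : gtype_(gtype) {}
    void init(int seed) {
        state_ = static_cast<std::uint32_t>(seed);
        seeded_ = true;
    }
    bool seeded() const { return seeded_; }
    int type() const { return gtype_; }
    std::uint32_t next() {
        assert(seeded_);
        state_ = state_ * 1664525u + 1013904223u;  // LCG step, illustration only
        return state_;
    }
private:
    int gtype_;
    std::uint32_t state_ = 0;
    bool seeded_ = false;
};

// Wrapper: the constructor seeds the generator in place, so no copy is made.
struct SeededGenerator {
    ToyGenerator gen;
    explicit SeededGenerator(int seed) : gen(3) { gen.init(seed); }
};

// Seed derived from the thread id, as in the snippet above.
inline int threadSeed() {
    return static_cast<int>(std::hash<std::thread::id>()(std::this_thread::get_id()));
}

// One lazily constructed, already seeded generator per thread.
inline ToyGenerator& threadGenerator() {
    static thread_local SeededGenerator holder(threadSeed());
    return holder.gen;
}
```

The constructor overloads proposed above are more convenient for callers; the wrapper is only a workaround that keeps the library header untouched.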
When you set MAX_VECTOR_SIZE=256, complexvec1.h fails to compile:
In file included from ../src/functionality/generateBalloon.cpp:10:
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1010:5: error: unknown type name 'Vec16f'
Vec16f y; // vector of 8 floats
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1058:15: error: unknown type name 'Vec16f'
Complex8f(Vec16f const x) { // constructor to convert from emulated Vec16f
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1076:5: error: unknown type name 'Vec16f'
Vec16f to_vector() const {
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1025:13: error: use of undeclared identifier 'Vec16f'
y = Vec16f(Vec8f(a2), Vec8f(a2));
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1034:13: error: use of undeclared identifier 'Vec16f'
y = Vec16f(Vec8f(a2), Vec8f(a2));
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1039:13: error: use of undeclared identifier 'Vec16f'
y = Vec16f(Vec8f(a), Vec8f(b));
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1044:13: error: use of undeclared identifier 'Vec16f'
y = Vec16f(Vec8f(Complex4f(Complex2f(a0,a1),Complex2f(a2,a3))),
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1081:13: error: use of undeclared identifier 'Vec16f'
y = Vec16f().load(p);
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1201:15: error: unknown type name 'Vec16fb'
static inline Vec16fb operator == (Complex8f const a, Complex8f const b) {
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1208:12: error: use of undeclared identifier 'Vec16fb'
return Vec16fb(a.get_low() == b.get_low(), a.get_high() == b.get_high());
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1213:15: error: unknown type name 'Vec16fb'
static inline Vec16fb operator != (Complex8f const a, Complex8f const b) {
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1261:38: error: use of undeclared identifier 'Vec16f'
return Complex8f(a.to_vector() / Vec16f(b));
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1331:33: error: unknown type name 'Vec16fb'
static inline Complex8f select (Vec16fb const s, Complex8f const a, Complex8f const b) {
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1947:5: error: unknown type name 'Vec8d'
Vec8d y; // vector of 4 doubles
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1985:15: error: unknown type name 'Vec8d'
Complex4d(Vec8d const x) {
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1993:29: error: unknown type name 'Vec8d'
Complex4d & operator = (Vec8d const x) {
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:2005:5: error: unknown type name 'Vec8d'
Vec8d to_vector() const {
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1961:13: error: use of undeclared identifier 'Vec8d'
y = Vec8d(Complex2d(a).to_vector(), Complex2d(a).to_vector());
^
../secret-project-name/external/vectorclass-version2/include/add-on/complex/complexvec1.h:1966:13: error: use of undeclared identifier 'Vec8d'
y = Vec8d(Vec4d(Complex2d(a0,a1)), Vec4d(Complex2d(a2,a3)));
Complex4d is not used, of course.
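The errors suggest that the Complex8f and Complex4d definitions rely on 512-bit vector types that are not declared when MAX_VECTOR_SIZE is 256. One possible fix, sketched below with toy types rather than the real headers, is to wrap such definitions in the same kind of MAX_VECTOR_SIZE guard that the Ranvec1 constructors use for buf512. All TOY_ and Toy names are hypothetical.

```cpp
// Toy illustration of guarding wide types with a size macro.
// TOY_MAX_VECTOR_SIZE and the Toy* types are hypothetical, not library names.
#ifndef TOY_MAX_VECTOR_SIZE
#define TOY_MAX_VECTOR_SIZE 256
#endif

struct ToyVec8f { float v[8]; };      // always available (256 bits)

#if TOY_MAX_VECTOR_SIZE >= 512
struct ToyVec16f { float v[16]; };    // only declared for 512-bit builds

// Depends on ToyVec16f, so it must live inside the same guard.
struct ToyComplex8f { ToyVec16f y; };
#endif

// Code that has to compile in both configurations picks the widest
// type that actually exists.
constexpr int widestComplexLanes() {
#if TOY_MAX_VECTOR_SIZE >= 512
    return static_cast<int>(sizeof(ToyComplex8f) / (2 * sizeof(float)));
#else
    return static_cast<int>(sizeof(ToyVec8f) / (2 * sizeof(float)));
#endif
}
```

With the default of 256 used here, the 512-bit types are never mentioned outside the guard, so the translation unit compiles without them.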
The code in physical_processors.cpp is not wrapped in VCL_NAMESPACE, which causes a conflict with the prototypes in vectorclass/instrset.h when VCL_NAMESPACE is defined.
// functions in instrset_detect.cpp:
#ifdef VCL_NAMESPACE
namespace VCL_NAMESPACE {
#endif
int instrset_detect(void); // tells which instruction sets are supported
bool hasFMA3(void); // true if FMA3 instructions supported
bool hasFMA4(void); // true if FMA4 instructions supported
bool hasXOP(void); // true if XOP instructions supported
bool hasAVX512ER(void); // true if AVX512ER instructions supported
bool hasAVX512VBMI(void); // true if AVX512VBMI instructions supported
bool hasAVX512VBMI2(void); // true if AVX512VBMI2 instructions supported
// function in physical_processors.cpp:
int physicalProcessors(int * logical_processors = 0);
#ifdef VCL_NAMESPACE
}
#endif
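A fix along these lines would wrap the definition in physical_processors.cpp in the same conditional namespace as the prototype. The sketch below shows the pattern with hypothetical names (TOY_NAMESPACE, logicalProcessorsToy) so that it compiles on its own. The function body is a placeholder based on std::thread::hardware_concurrency, not the library's implementation.

```cpp
#include <cassert>
#include <thread>

#define TOY_NAMESPACE toyvcl   // hypothetical; plays the role of VCL_NAMESPACE

// --- header part: prototype inside the optional namespace ---
#ifdef TOY_NAMESPACE
namespace TOY_NAMESPACE {
#endif
int logicalProcessorsToy();
#ifdef TOY_NAMESPACE
}
#endif

// --- source part: the definition uses the same guard, so it defines the
// declared function instead of a second, unrelated global one ---
#ifdef TOY_NAMESPACE
namespace TOY_NAMESPACE {
#endif
int logicalProcessorsToy() {
    // Placeholder body: may return 0 if the count cannot be determined.
    return static_cast<int>(std::thread::hardware_concurrency());
}
#ifdef TOY_NAMESPACE
}
#endif

// Caller-side helper that works with or without the namespace.
inline int callLogicalProcessors() {
#ifdef TOY_NAMESPACE
    int n = TOY_NAMESPACE::logicalProcessorsToy();
#else
    int n = logicalProcessorsToy();
#endif
    assert(n >= 0);
    return n;
}
```

Without the guard around the definition, the namespaced prototype would stay undefined and the linker would report an unresolved symbol for callers that use it.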
When the load method is called, the elements of the source array are not modified, so the pointer parameter should be const.
This is already the case for the vectorclass members, whose load functions take a pointer to const.
Since the load method of ContainerV just calls these iteratively, the change can be made safely.
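To illustrate why the const qualifier matters to callers, here is a minimal sketch with a hypothetical, simplified container (ToyContainer, not the library's ContainerV). With a const void* parameter, read-only data can be loaded without a const_cast.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, simplified container; NOT the library's ContainerV.
template <int N>
class ToyContainer {
public:
    // Taking const void* documents that the source is only read, and lets
    // callers pass pointers to const data directly.
    void load(const void* p) {
        assert(p != nullptr);
        std::memcpy(data_, p, sizeof(data_));
    }
    float operator[](int i) const {
        assert(i >= 0 && i < N);
        return data_[i];
    }
private:
    float data_[N] = {};
};

// Read-only source data, for example a lookup table.
static const float kTable[4] = {0.5f, 1.5f, 2.5f, 3.5f};
```

If load took a plain void*, the call c.load(kTable) would not compile, because a pointer to const float does not convert implicitly to void*.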
Hi,
I think it would be very useful to have a built-in utility to convert from Vec4f real, Vec4f imag to Complex4f complexVec and vice versa, i.e. from Complex4f complexVec to Vec4f real, Vec4f imag.
Sometimes you may want to use atan2, which requires separate x and y vectors, or you have to do calculations on complex numbers that are stored in separate real / imag arrays because your function is called from an external library, etc.
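Until such a utility exists, the conversion can be done on plain arrays before loading or after storing. The sketch below is a scalar version with made-up helper names; it assumes the complex vector stores its elements as interleaved (real, imaginary) pairs, which should be checked against the manual.

```cpp
#include <cassert>
#include <cstddef>

// Interleave separate real and imaginary arrays into (re, im) pairs.
// out must have room for 2 * n floats.
inline void interleave(const float* re, const float* im, float* out, std::size_t n) {
    assert(re && im && out);
    for (std::size_t i = 0; i < n; ++i) {
        out[2 * i]     = re[i];
        out[2 * i + 1] = im[i];
    }
}

// Split (re, im) pairs back into separate real and imaginary arrays.
// in must hold 2 * n floats; re and im must each have room for n floats.
inline void deinterleave(const float* in, float* re, float* im, std::size_t n) {
    assert(in && re && im);
    for (std::size_t i = 0; i < n; ++i) {
        re[i] = in[2 * i];
        im[i] = in[2 * i + 1];
    }
}
```

A vectorized version would use shuffle or permute operations instead of the loops, but the scalar form shows the intended data layout.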