Comments (30)
@jishnub The default algorithm seems to call syevd
and not syev
, based on the docs. But all the 3 algorithms are failing. :-(
from julia.
The segfault happens for all the algorithms
from julia.
We just bumped openblas yesterday, so it is quite likely an upstream bug.
On my mac I'm not getting a segmentation fault, but it seems to keep running endlessly. We should report this upstream.
from julia.
Works fine for size 63x63
for me and stops working 64x64
onwards.
from julia.
For the record, I am being able to call dsyev
in C using OpenBLAS's master branch when compiling with gcc
directly.
E.g., using the following code:
#include <stdio.h>
#include <stdlib.h>
#include <lapacke.h>
int main() {
int n = 64;
double *a;
a = (double*) malloc(n*n*sizeof(double));
// Check if the memory has been successfully
// allocated by malloc or not
if (a == NULL) {
printf("Memory not allocated.\n");
exit(0);
}
for(int row=0; row<n; row++){
for(int col=0; col<=row; col++){
a[row*n + col] = (double)rand()/(double)(RAND_MAX);
}
}
// Allocate space for eigenvalues
double *w;
w = (double*) malloc(n * sizeof(double));
if (w == NULL) {
printf("Memory not allocated.\n");
exit(0);
}
// Leading dimension of the matrix
int lda = n;
// Compute eigenvalues and eigenvectors
LAPACKE_dsyev(LAPACK_ROW_MAJOR, 'N', 'U', n, a, lda, w);
// Print eigenvalues
printf("Eigenvalues:\n");
for (int i = 0; i < n; i++) {
printf("%f\n", w[i]);
}
free(a);
free(w);
return 0;
}
Compiling with
gcc -o hermeig hermeig.c -Iopenblasinstallation/include -Lopenblasinstallation/lib -lopenblas
This runs correctly and produces results.
from julia.
I've noticed that this error occurs in julia when I set BLAS.set_num_threads(4)
or higher. There is no segfault for 3 or fewer threads.
rr trace attached: https://julialang-dumps.s3.amazonaws.com/reports/2024-08-13T11-47-58-jishnub.tar.zst
from julia.
For the record, I am being able to call dsyev in C using OpenBLAS's master branch when compiling with gcc directly.
It'd be useful to make sure you can reproduce the segfault when linking to Julia's libopenblas build: if not, that test may not be useful.
from julia.
Doesn't segfault when linking to our openblas with 1 or 4 threads. Used slightly modified driver below:
➜ lib git:(master) gcc -o hermeig hermeig.c -Iopenblasinstallation/include -Lopenblasinstallation/lib -L. -lopenblas64_ -I../include -Wno-implicit-function-declaration
ld: warning: search path 'openblasinstallation/lib' not found
ld: warning: reexported library with install name '@rpath/libunwind.1.dylib' found at '/Users/viral/julia/usr/lib/libunwind.1.0.dylib' couldn't be matched with any parent library and will be linked directly
➜ lib git:(master) OPENBLAS_NUM_THREADS=4 DYLD_LIBRARY_PATH=. ./hermeig
Eigenvalues:
0.000008
0.022556
0.036368
0.050084
#include <stdio.h>
#include <stdlib.h>
#include <lapacke.h>
int main() {
long n = 64;
double *a;
a = (double*) malloc(n*n*sizeof(double));
openblas_set_num_threads64_(4);
// Check if the memory has been successfully
// allocated by malloc or not
if (a == NULL) {
printf("Memory not allocated.\n");
exit(0);
}
for(long row=0; row<n; row++){
for(long col=0; col<=row; col++){
a[row*n + col] = (double)rand()/(double)(RAND_MAX);
}
}
// Allocate space for eigenvalues
double *w;
w = (double*) malloc(n * sizeof(double));
if (w == NULL) {
printf("Memory not allocated.\n");
exit(0);
}
// Leading dimension of the matrix
long lda = n;
// Compute eigenvalues and eigenvectors
LAPACKE_dsyev64_(LAPACK_ROW_MAJOR, 'N', 'U', n, a, lda, w);
// Print eigenvalues
printf("Eigenvalues:\n");
for (int i = 0; i < n; i++) {
printf("%f\n", w[i]);
}
free(a);
free(w);
return 0;
}
from julia.
Then that doesn't seem to be a good reproducer 🙂
from julia.
Or there could perhaps be a bug on the Julia side?
The LAPACKE interface seems to hide the lwork
discovery.
from julia.
Or there could perhaps be a bug on the Julia side?
Perhaps, but I'd spend more time to make sure the C reproducer is 100% faithful to what we're doing on the Julia side. Also, if you want to build OpenBLAS locally, to reproduce the Yggdrasil build I'd suggest compiling OpenBLAS with something like
make USE_THREAD=1 GEMM_MULTITHREADING_THRESHOLD=400 NO_AFFINITY=1 NO_STATIC=1 NUM_THREADS=512 INTERFACE64=1 SYMBOLSUFFIX=64_ DYNAMIC_ARCH=1 TARGET=GENERIC all
from julia.
Also, do we have a backtrace with gdb?
from julia.
svd
also fails. The failures are only on Hermitian matrices 64x64 and larger. Other factorizations seem to be working fine on Hermitians.
Of course svd failing may not be additional data, in that it is probably using the same internal LAPACK routines.
from julia.
@giordano I could try get a gdb trace, but on M1 mac, it just goes into an infinite loop (should I ctrl-C?) - not a segfault. @jishnub Which platform are you seeing the segfault on?
from julia.
Also, do we have a backtrace with gdb?
julia> eigen(H);
Thread 9 "julia" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffd23f9640 (LWP 3380628)]
0x00007fffde86df6c in dgemm_itcopy_ZEN () from /home/mose/.julia/juliaup/julia-nightly/bin/../lib/julia/libopenblas64_.so
(gdb) where
#0 0x00007fffde86df6c in dgemm_itcopy_ZEN () from /home/mose/.julia/juliaup/julia-nightly/bin/../lib/julia/libopenblas64_.so
#1 0x00007fffdda5f70f in dsyr2k_UN () from /home/mose/.julia/juliaup/julia-nightly/bin/../lib/julia/libopenblas64_.so
#2 0x00007fffddb99cd8 in exec_threads () from /home/mose/.julia/juliaup/julia-nightly/bin/../lib/julia/libopenblas64_.so
#3 0x00007fffddb99efc in blas_thread_server () from /home/mose/.julia/juliaup/julia-nightly/bin/../lib/julia/libopenblas64_.so
#4 0x00007ffff7e08ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#5 0x00007ffff7e9a850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
So, it looks to me the segmentation fault is in OpenBLAS. Certainly the issue could be in what Julia passes to OpenBLAS, but this again begs for a more faithful C reproducer.
from julia.
x86_64-linux-gnu
from julia.
So, if I set BLAS num threads to 1, then no crash on the Hermitian 64x64.
julia> H =Hermitian(rand(64,64))
julia> BLAS.set_num_threads(4)
julia> eigen(H)
^C^C^C^C^CWARNING: Force throwing a SIGINT
^C^C^C^C^CERROR: ^C^C^C^C^C^CInterruptException:
julia> ^C
julia> BLAS.set_num_threads(1)
julia> eigen(H)
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}}
values:
64-element Vector{Float64}:
-4.105758427499317
-3.971213922931517
from julia.
I believe a more faithful reproducer using the direct fortran API calling the workspace query, which does not crash:
➜ lib git:(vs/55471) ✗ gcc -g -o hermeig hermeig.c -L. -lopenblas64_ -I../include -Wno-implicit-function-declaration && DYLD_LIBRARY_PATH=. ./hermeig
ld: warning: reexported library with install name '@rpath/libunwind.1.dylib' found at '/Users/viral/julia/usr/lib/libunwind.1.0.dylib' couldn't be matched with any parent library and will be linked directly
First 5 Eigenvalues:
-4.396484
-4.053237
-3.783926
-3.699819
-3.197730
#include <stdio.h>
#include <stdlib.h>
int main() {
long n = 64;
double *a;
a = (double*) malloc(n*n*sizeof(double));
openblas_set_num_threads64_(4);
// Check if the memory has been successfully
// allocated by malloc or not
if (a == NULL) {
printf("Memory not allocated.\n");
exit(0);
}
for(long row=0; row<n; row++){
for(long col=0; col<=row; col++){
a[row*n + col] = (double)rand()/(double)(RAND_MAX);
}
}
// Allocate space for eigenvalues
double *w;
w = (double*) malloc(n * sizeof(double));
if (w == NULL) {
printf("Memory not allocated.\n");
exit(0);
}
// Leading dimension of the matrix
long lda = n;
// Allocate work
long lwork = -1;
double *work = (double*) malloc(sizeof(double));
long info = -100;
char nchar = 'N';
char uchar = 'U';
// Compute eigenvalues and eigenvectors
dsyev_64_(&nchar, &uchar, &n, a, &lda, w, work, &lwork, &info);
lwork = (long)work[0];
work = (double*)realloc(work, lwork*sizeof(double));
dsyev_64_(&nchar, &uchar, &n, a, &lda, w, work, &lwork, &info);
// Print eigenvalues
printf("First 5 Eigenvalues:\n");
for (int i = 0; i < 5; i++) {
printf("%f\n", w[i]);
}
free(a);
free(w);
return 0;
}
from julia.
@jishnub can you try with https://github.com/giordano/OpenBLAS_jll.jl/releases/download/OpenBLAS-v0.3.28%2B1/OpenBLAS.v0.3.28.x86_64-linux-gnu-libgfortran5.tar.gz?
I get no crashes:
julia> using LinearAlgebra
julia> BLAS.lbt_forward("./libopenblas64_.so"; clear=true)
5037
julia> H = Hermitian(ones(1024, 1024));
julia> eigen(H);
julia>
This was compiled with GCC 12. As I mentioned in OpenMathLib/OpenBLAS#4868 (comment), the segmentation fault appears to go away depending on the version of GCC used (but I can't tell whether it's a bug in OpenBLAS which is hidden/surfaced by some compiler versions, or a genuine compiler error)
from julia.
Oddly, I seem to still face the issue using this:
julia> using LinearAlgebra
julia> BLAS.lbt_forward("./libopenblas64_.so"; clear=true)
5037
julia> H = Hermitian(ones(1024, 1024));
julia> eigen(H);
julia> BLAS.get_num_threads()
1
julia> BLAS.set_num_threads(4)
julia> eigen(H);
[88907] signal 11 (1): Segmentation fault
[88907] signal 11 (1): Segmentation fault
in expression starting at REPL[7]:1
Allocations: 2069146 (Pool: 2069051;Allocations: 2069146 (Pool: 2069051; Big: 95); GC: 4
Allocations: 2069146 (Pool: 2069051; Big: 95); GC: 4
[1] 88907 segmentation fault (core dumped) julia +nightly
from julia.
Aaah, right, when forwarding the blas library the number of threads is reset.
from julia.
which does not crash:
I think we need two cases:
- a code which linked to julia's build of openblas crashes in the same way as within julia
- the exact same code which doesn't crash when linked to a different build of openblas, but please provide all information of compiler used and build configuration
Any other combination of reproducers which doesn't involve the two above is probably not very useful to narrow down the issue, because it doesn't necessarily show much.
from julia.
@jishnub can you please test again https://github.com/giordano/OpenBLAS_jll.jl/releases/download/OpenBLAS-v0.3.28%2B1/OpenBLAS.v0.3.28.x86_64-linux-gnu-libgfortran5.tar.gz (same URL as before, but it's a new build, which includes OpenMathLib/OpenBLAS#4871)? It seems to work for me now:
julia> using LinearAlgebra
julia> BLAS.lbt_forward("./libopenblas64_.so"; clear=true)
5037
julia> BLAS.set_num_threads(4)
julia> H = Hermitian(ones(1024, 1024));
julia> eigen(H);
julia>
from julia.
This works for me as well, thanks!
from julia.
A more minimal Julia reproducer is
using LinearAlgebra
LAPACK.syevd!('V', 'U', ones(1024, 1024));
We're entering this function
julia/stdlib/LinearAlgebra/src/lapack.jl
Line 5432 in e1aefeb
DSYEVD
. Can someone please confirm this is a sensible program (NOTE: I updated the code compared the a previous version of this message):
#include <stdio.h>
#include <stdlib.h>
int main() {
long n = 1024;
double *a;
a = (double*) malloc(n*n*sizeof(double));
openblas_set_num_threads64_(64);
// Check if the memory has been successfully
// allocated by malloc or not
if (a == NULL) {
printf("Memory not allocated.\n");
exit(0);
}
for(long row=0; row<n; row++){
for(long col=0; col<=row; col++){
a[row*n + col] = 1.0;
}
}
// Allocate space for eigenvalues
double *w;
w = (double*) malloc(n * sizeof(double));
if (w == NULL) {
printf("Memory not allocated.\n");
exit(0);
}
// Leading dimension of the matrix
long lda = n;
// Allocate work
double *work = (double*) malloc(sizeof(double));
long lwork = -1;
long *iwork = (long*) malloc(sizeof(long));
long liwork = -1;
long info = -100;
char jobz = 'V';
char uplo = 'U';
dsyevd_64_(&jobz, &uplo, &n, a, &lda, w, work, &lwork, iwork, &liwork, &info, 1, 1);
// Print eigenvalues
printf("First 5 Eigenvalues:\n");
for (int i = 0; i < 5; i++) {
printf("%f\n", w[i]);
}
free(a);
free(w);
return 0;
}
But this doesn't crash for me
$ gcc -o test test.c -L${HOME}/.julia/juliaup/julia-nightly/lib/julia -lopenblas64_ -Wl,-rpath,${HOME}/.julia/juliaup/julia-nightly/lib/julia
test.c: In function ‘main’:
test.c:9:5: warning: implicit declaration of function ‘openblas_set_num_threads64_’ [-Wimplicit-function-declaration]
openblas_set_num_threads64_(64);
^~~~~~~~~~~~~~~~~~~~~~~~~~~
test.c:46:5: warning: implicit declaration of function ‘dsyevd_64_’ [-Wimplicit-function-declaration]
dsyevd_64_(&jobz, &uplo, &n, a, &lda, w, work, &lwork, iwork, &liwork, &info, 1, 1);
^~~~~~~~~~
$ ./test
First 5 Eigenvalues:
0.000000
0.000000
0.000000
0.000000
0.000000
from julia.
You have to call dsyevd
twice. First time for the workspace query. Then take that value and allocate the work array and then actually call it again. It should segfault in the second call where the computation will actually happen.
#include <stdio.h>
#include <stdlib.h>
int main() {
long n = 1024;
double *a;
a = (double*) malloc(n*n*sizeof(double));
openblas_set_num_threads64_(64);
// Check if the memory has been successfully
// allocated by malloc or not
if (a == NULL) {
printf("Memory not allocated.\n");
exit(0);
}
for(long row=0; row<n; row++){
for(long col=0; col<=row; col++){
a[row*n + col] = 1.0;
}
}
// Allocate space for eigenvalues
double *w;
w = (double*) malloc(n * sizeof(double));
if (w == NULL) {
printf("Memory not allocated.\n");
exit(0);
}
// Leading dimension of the matrix
long lda = n;
// workspace query
double *work = (double*) malloc(sizeof(double));
long lwork = -1;
long *iwork = (long*) malloc(sizeof(long));
long liwork = -1;
long info = -100;
char jobz = 'V';
char uplo = 'U';
dsyevd_64_(&jobz, &uplo, &n, a, &lda, w, work, &lwork, iwork, &liwork, &info, 1, 1);
// Workspace allocation
lwork = work[0];
work = (double *) malloc(lwork * sizeof(double));
liwork = iwork[0];
iwork = (long *) malloc(liwork * sizeof(long));
dsyevd_64_(&jobz, &uplo, &n, a, &lda, w, work, &lwork, iwork, &liwork, &info, 1, 1);
// Print eigenvalues
printf("First 5 Eigenvalues:\n");
for (int i = 0; i < 5; i++) {
printf("%f\n", w[i]);
}
return 0;
}
from julia.
Thanks. Sadly, adding
lwork = (long)work[0];
work = (double*)realloc(work, lwork*sizeof(double));
liwork = (long)iwork[0];
iwork = (long*)realloc(iwork, liwork*sizeof(long));
dsyevd_64_(&jobz, &uplo, &n, a, &lda, w, work, &lwork, iwork, &liwork, &info, 1L, 1L);
after the first dsyevd_64_
call doesn't seem to change the outcome 😞 However it looks like OpenMathLib/OpenBLAS#4879 works for me on the machines where I was still observing a segmentation fault after the first patch. Hopefully we'll be good after that, but I'm still puzzled by the fact we can't prepare a C reproducer. I also tried to use dlopen
/dlsym
, to even more closely match what we do in Julia, to no avail.
from julia.
The bug is real, but it seems that Julia is just better at triggering it than C - whether it is our allocator or LBT or something else.
from julia.
The C script does crash for me with the second call to dsyevd on mac.
from julia.
The C script does crash for me with the second call to dsyevd on mac.
Ah, interesting, I can confirm it segfaults for me as well on an M1 with OpenBLAS from Julia nightly, but not with OpenBLAS from Julia v1.10.4:
% clang -Wno-implicit-function-declaration -o test test.c -L${HOME}/repo/julia/usr/lib -Wl,-rpath,${HOME}/repo/julia/usr/lib -lopenblas64_
% ./test
zsh: segmentation fault ./test
% clang -Wno-implicit-function-declaration -o test test.c -L${HOME}/.julia/juliaup/julia-1.10.4+0.aarch64.apple.darwin14/lib/julia -Wl,-rpath,${HOME}/.julia/juliaup/julia-1.10.4+0.aarch64.apple.darwin14/lib/julia -lopenblas64_
% ./test
First 5 Eigenvalues:
-0.000000
-0.000000
-0.000000
-0.000000
-0.000000
However it never segfaults for me on an x86_64-linux-gnu machine where I could reproduce the segfault in Julia
from julia.
Related Issues (20)
- `@profile` segfault when threading on x64 linux HOT 3
- Markdown regression on 1.11, probably by 16x (3x for PythonCall) HOT 11
- `rand_twiceprecision` tests intermittently failing on 32-bit Linux HOT 1
- `@time_imports` not showing extension parents HOT 3
- Type instabilities when using `mapslices` HOT 5
- Allocations of `eachslice` with variable `dims` HOT 3
- 4x regression for loading JLLs in 1.11, or likely the first one, such as micromamba_jll HOT 7
- `pkgdir` returns wrong dir for extension if extension is in its own folder
- Extension not loaded if trigger is a strong dependency [on 1.10] HOT 6
- `libutf8proc.a: error adding symbols: file format not recognized` blocking build on Windows 11 on ARM via WSL HOT 8
- 136x regression for LazyAritifacts HOT 4
- LinearAlgebra/addmul all-NAN batman? HOT 5
- `Markdown.parse(::LazyString)` errors
- Allow `stack` on empty collection HOT 1
- Learn program HOT 1
- `+` function string concatenation hint should not apply to zero args HOT 1
- Typed comprehension with begin/end is illegal syntax HOT 1
- Julia v1.11 (all RCs) breaks package SolidStateDetectors (not caught by PkgEval due to timeout)
- REPL precompilation workload is broken HOT 1
- `MethodError` thrown while preparing to throw `InexactError` in conversion of `Rational` to `Integer`
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from julia.