szcompressor / sz

Error-bounded Lossy Data Compressor (for floating-point/integer datasets)

Home Page: http://szcompressor.org

License: Other

Makefile 5.31% Shell 5.95% C 86.65% M4 0.11% Fortran 0.78% C++ 0.68% CMake 0.31% Python 0.02% SWIG 0.19%

sz's People

Contributors: ayzk, borelset, dingwentao, disheng222, ggorman, jschueller, lxaltria, munnybearz, oliver-pola, robertu94, the-alchemist, vasole, xantares

sz's Issues

Missing parentheses in function convertBytesToSZParams

It seems that in ByteToolkit.c, function convertBytesToSZParams(unsigned char* bytes, sz_params* params), line 963 should be

exe_params->optQuantMode = (flag1 & 0x40) >> 6;

Without the parentheses, C's operator precedence makes the expression parse as flag1 & (0x40 >> 6), i.e. flag1 & 1, which yields an incorrect optQuantMode.
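A standalone sketch of the precedence issue (flag1 and the 0x40 mask are from the report; the rest is illustrative):

#include <stdio.h>

int main(void)
{
    unsigned char flag1 = 0x40;                  /* bit 6 set */
    int wrong = flag1 & 0x40 >> 6;               /* parses as flag1 & (0x40 >> 6) == flag1 & 1 == 0 */
    int right = (flag1 & 0x40) >> 6;             /* isolates bit 6, then shifts: 1 */
    printf("wrong=%d right=%d\n", wrong, right); /* prints: wrong=0 right=1 */
    return 0;
}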

1D and 2D array dimensions.

A question about the API for 1D and 2D arrays. Are these dimension settings equivalent?

r5=0 r4=0 r3=0 r2=1 r1=1000
r5=0 r4=0 r3=0 r2=0 r1=1000

In both cases, I pass a 1D float array that is 1000 floats.

In the first case, it encounters a divide-by-zero FPE. Looking at the code, it seems to use the 2D case for compressing.
In the second case, it works fine.
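For reference, a minimal sketch of the two calls in question, using the SZ_compress_args signature quoted elsewhere on this page (the SZ_FLOAT/ABS arguments and the 1E-4 bound are illustrative):

#include <stdlib.h>
#include "sz.h"

void demo(void)
{
    size_t outSize;
    static float data[1000];
    /* r2=1, r1=1000: treated as a 2D (1 x 1000) array, takes the 2D code path, hits the FPE */
    unsigned char *a = SZ_compress_args(SZ_FLOAT, data, &outSize, ABS, 1E-4, 0, 0,
                                        0, 0, 0, 1, 1000);
    /* r2=0, r1=1000: treated as a 1D array of 1000 floats, works fine */
    unsigned char *b = SZ_compress_args(SZ_FLOAT, data, &outSize, ABS, 1E-4, 0, 0,
                                        0, 0, 0, 0, 1000);
    free(a); free(b);
}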

cmake: headers are not installed

with cmake, sz.h and its companion headers are not installed.
Note that dictionary.h and iniparser.h must not be installed, as they are part of the bundled iniparser.

Also, the test executables probably don't need to be installed.

errors on small data with small abs error bound

I updated sztest to take rows/cols/abs bound exp as args. I get the following errors:

./sztest-static 16 17 -5
size 16x17, 2176 bytes
[SZ] Reading SZ configuration file (sz.config) ...

== RAND ==
out_size = -5 (-0.0023)
out = 0x1998d20
abs bound = 0.0000100000000000
Error: zlib_uncompress2: stream.avail_in != cmpSizeWrong version: 
Compressed-data version (0.0.0)
Current sz version: (1.4.9)

This also happens for smaller values of rows/cols.

If I go larger, I get a segfault:

./sztest-static 17 17 -5
size 17x17, 2312 bytes
[SZ] Reading SZ configuration file (sz.config) ...

== RAND ==
out_size = 1302 (0.5631)
out = 0x1d99f70
abs bound = 0.0000100000000000
Segmentation fault (core dumped)

Here is a backtrace for the segfault:

Program received signal SIGSEGV, Segmentation fault.
decode (out=0x11cf330, t=0x73f2f0, targetLength=289, s=0x6499cc "\377\177")
    at src/Huffman.c:275
275			if (n->t) {
(gdb) backtrace
#0  decode (out=0x11cf330, t=0x73f2f0, targetLength=289, s=0x6499cc "\377\177")
    at src/Huffman.c:275
#1  decode_withTree (s=0x6499c0 "", targetLength=289, out=0x11cf330)
    at src/Huffman.c:671
#2  0x0000000000404fe9 in decompressDataSeries_double_2D (data=0x7fffffffca88, 
    r1=17, r2=17, tdps=0x649920) at src/TightDataPointStorageD.c:364
#3  0x0000000000408b1c in getSnapshotData_double_2D (
    data=data@entry=0x7fffffffca88, r1=r1@entry=17, r2=r2@entry=17, 
    tdps=tdps@entry=0x649920, errBoundMode=errBoundMode@entry=0)
    at src/TightDataPointStorageD.c:1256
#4  0x000000000041a3f5 in SZ_decompress_args_double (
    newData=newData@entry=0x7fffffffca88, r5=<optimized out>, r4=0, 
    r3=r3@entry=0, r2=r2@entry=17, r1=r1@entry=17, cmpBytes=<optimized out>, 
    cmpSize=1302) at src/sz_double.c:1107
#5  0x0000000000401f54 in SZ_decompress (r1=17, r2=17, r3=0, 
    r4=<optimized out>, r5=<optimized out>, byteLength=<optimized out>, 
    bytes=<optimized out>, dataType=1) at src/sz.c:369
#6  SZ_decompress_args (dataType=<optimized out>, bytes=<optimized out>, 
    byteLength=<optimized out>, decompressed_array=0x649010, 
    r5=<optimized out>, r4=<optimized out>, r3=0, r2=17, r1=17) at src/sz.c:394
#7  0x0000000000401732 in test_roundtrip (data=0x647420, n=289, 
    abs_bound=1.0000000000000001e-05, r5=0, r4=0, r3=0, r2=17, r1=17)
    at sztest.c:125

installation of SZ

Hello, I'm new to Linux and I'm very confused.
I'm installing SZ from this link:
git clone https://github.com/disheng222/SZ
and now I am trying to do step 2:

mkdir build && cd build
cmake .. -DCMAKE_INSTALL_PREFIX:PATH=[INSTALL_DIR]
make
make install

I created the build directory, cd'd into build, and then ran:

/lin/build$ cmake -- -DCMAKE_INSTALL_PREFIX:PATH=/home/mainuu/lin/build

I get this error. How can I fix it, please?

CMake Error: The source directory "/home/mainuu/lin/build/--" does not exist.
Specify --help for usage, or press the help button on the CMake GUI.
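For what it's worth, the "--" is being parsed as the source-directory argument, which is why CMake looks for "/home/mainuu/lin/build/--". From inside the build directory the source path should be "..", e.g. (assuming the checkout is /home/mainuu/lin; the install prefix here is just an example choice):

cmake .. -DCMAKE_INSTALL_PREFIX:PATH=/home/mainuu/sz-install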

unable to build from github checkout

I can run ./configure, but I get the following error when running make:

CDPATH="${ZSH_VERSION+.}:" && cd . && /bin/sh /home/bda/codar/SZ/missing aclocal-1.13 
/home/bda/codar/SZ/missing: line 81: aclocal-1.13: command not found
WARNING: 'aclocal-1.13' is missing on your system.
         You should only need it if you modified 'acinclude.m4' or
         'configure.ac' or m4 files included by 'configure.ac'.
         The 'aclocal' program is part of the GNU Automake package:
         <http://www.gnu.org/software/automake>
         It also requires GNU Autoconf, GNU m4 and Perl in order to run:
         <http://www.gnu.org/software/autoconf>
         <http://www.gnu.org/software/m4/>
         <http://www.perl.org/>
make: *** [Makefile:349: aclocal.m4] Error 127

There are also several files checked into source control that should probably be removed and ignored. After running ./configure, the following files are modified:

	modified:   Makefile
	modified:   config.log
	modified:   config.status
	modified:   libtool

free of uninitialized pointer causing segfault

I wrote a test program and I'm getting segfaults when calling SZ_compress multiple times. Source is here: https://github.com/bd4/sztest/blob/master/sztest.c

It looks like it may be an issue with SZ_compress_args_double_withinRange initializing the tdps struct itself, and not setting all the values (seems like it should be using the new_..._Empty routine). Here is the valgrind output:

==6406== Memcheck, a memory error detector
==6406== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==6406== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==6406== Command: ./sztest
==6406== 
==6406== Conditional jump or move depends on uninitialised value(s)
==6406==    at 0x4E46DF4: free_TightDataPointStorageD (TightDataPointStorageD.c:1667)
==6406==    by 0x4E535C3: SZ_compress_args_double_withinRange (sz_double.c:930)
==6406==    by 0x4E56544: SZ_compress_args_double (sz_double.c:984)
==6406==    by 0x4E56B06: SZ_compress_args (sz.c:279)
==6406==    by 0x400FCC: test_roundtrip (sztest.c:86)
==6406==    by 0x400DD2: main (sztest.c:56)
==6406==  Uninitialised value was created by a heap allocation
==6406==    at 0x4C2AB8D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6406==    by 0x4E53531: SZ_compress_args_double_withinRange (sz_double.c:910)
==6406==    by 0x4E56544: SZ_compress_args_double (sz_double.c:984)
==6406==    by 0x4E56B06: SZ_compress_args (sz.c:279)
==6406==    by 0x400FCC: test_roundtrip (sztest.c:86)
==6406==    by 0x400DD2: main (sztest.c:56)
==6406== 
==6406== Conditional jump or move depends on uninitialised value(s)
==6406==    at 0x4E41461: new_TightDataPointStorageD_fromFlatBytes (TightDataPointStorageD.c:86)
==6406==    by 0x4E53688: SZ_decompress_args_double (sz_double.c:1082)
==6406==    by 0x4E56CEF: SZ_decompress (sz.c:362)
==6406==    by 0x4E56DAB: SZ_decompress_args (sz.c:387)
==6406==    by 0x401033: test_roundtrip (sztest.c:93)
==6406==    by 0x400DD2: main (sztest.c:56)
==6406==  Uninitialised value was created by a heap allocation
==6406==    at 0x4C2AB8D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6406==    by 0x4E53531: SZ_compress_args_double_withinRange (sz_double.c:910)
==6406==    by 0x4E56544: SZ_compress_args_double (sz_double.c:984)
==6406==    by 0x4E56B06: SZ_compress_args (sz.c:279)
==6406==    by 0x400FCC: test_roundtrip (sztest.c:86)
==6406==    by 0x400DD2: main (sztest.c:56)
==6406== 
==6406== Conditional jump or move depends on uninitialised value(s)
==6406==    at 0x4E4146A: new_TightDataPointStorageD_fromFlatBytes (TightDataPointStorageD.c:91)
==6406==    by 0x4E53688: SZ_decompress_args_double (sz_double.c:1082)
==6406==    by 0x4E56CEF: SZ_decompress (sz.c:362)
==6406==    by 0x4E56DAB: SZ_decompress_args (sz.c:387)
==6406==    by 0x401033: test_roundtrip (sztest.c:93)
==6406==    by 0x400DD2: main (sztest.c:56)
==6406==  Uninitialised value was created by a heap allocation
==6406==    at 0x4C2AB8D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6406==    by 0x4E53531: SZ_compress_args_double_withinRange (sz_double.c:910)
==6406==    by 0x4E56544: SZ_compress_args_double (sz_double.c:984)
==6406==    by 0x4E56B06: SZ_compress_args (sz.c:279)
==6406==    by 0x400FCC: test_roundtrip (sztest.c:86)
==6406==    by 0x400DD2: main (sztest.c:56)
==6406== 
==6406== Conditional jump or move depends on uninitialised value(s)
==6406==    at 0x4E536AB: SZ_decompress_args_double (sz_double.c:1086)
==6406==    by 0x4E56CEF: SZ_decompress (sz.c:362)
==6406==    by 0x4E56DAB: SZ_decompress_args (sz.c:387)
==6406==    by 0x401033: test_roundtrip (sztest.c:93)
==6406==    by 0x400DD2: main (sztest.c:56)
==6406==  Uninitialised value was created by a heap allocation
==6406==    at 0x4C2AB8D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6406==    by 0x4E53531: SZ_compress_args_double_withinRange (sz_double.c:910)
==6406==    by 0x4E56544: SZ_compress_args_double (sz_double.c:984)
==6406==    by 0x4E56B06: SZ_compress_args (sz.c:279)
==6406==    by 0x400FCC: test_roundtrip (sztest.c:86)
==6406==    by 0x400DD2: main (sztest.c:56)
==6406== 
==6406== Conditional jump or move depends on uninitialised value(s)
==6406==    at 0x4E536E2: SZ_decompress_args_double (sz_double.c:1112)
==6406==    by 0x4E56CEF: SZ_decompress (sz.c:362)
==6406==    by 0x4E56DAB: SZ_decompress_args (sz.c:387)
==6406==    by 0x401033: test_roundtrip (sztest.c:93)
==6406==    by 0x400DD2: main (sztest.c:56)
==6406==  Uninitialised value was created by a heap allocation
==6406==    at 0x4C2AB8D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6406==    by 0x4E53531: SZ_compress_args_double_withinRange (sz_double.c:910)
==6406==    by 0x4E56544: SZ_compress_args_double (sz_double.c:984)
==6406==    by 0x4E56B06: SZ_compress_args (sz.c:279)
==6406==    by 0x400FCC: test_roundtrip (sztest.c:86)
==6406==    by 0x400DD2: main (sztest.c:56)
==6406== 
[SZ] Reading SZ configuration file (sz.config) ...
== SET ==
out_size = 70
out = 0x595c0b0
roundtrip differs

== RAND ==
out_size = 16
out = 0x5a1fd10
roundtrip differs

== LIN ==
out_size = 16
out = 0x5a20470
roundtrip differs

== CONST ==
out_size = 16
out = 0x5a20bd0
==6406== 
==6406== HEAP SUMMARY:
==6406==     in use at exit: 240 bytes in 6 blocks
==6406==   total heap usage: 139 allocs, 133 frees, 57,332,215 bytes allocated
==6406== 
==6406== 16 bytes in 1 blocks are definitely lost in loss record 1 of 6
==6406==    at 0x4C2AB8D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6406==    by 0x4E466E2: convertTDPStoFlatBytes_double (TightDataPointStorageD.c:1443)
==6406==    by 0x4E53593: SZ_compress_args_double_withinRange (sz_double.c:925)
==6406==    by 0x4E56544: SZ_compress_args_double (sz_double.c:984)
==6406==    by 0x4E56B06: SZ_compress_args (sz.c:279)
==6406==    by 0x400FCC: test_roundtrip (sztest.c:86)
==6406==    by 0x400DD2: main (sztest.c:56)
==6406== 
==6406== 16 bytes in 1 blocks are definitely lost in loss record 2 of 6
==6406==    at 0x4C2AB8D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6406==    by 0x4E466E2: convertTDPStoFlatBytes_double (TightDataPointStorageD.c:1443)
==6406==    by 0x4E53593: SZ_compress_args_double_withinRange (sz_double.c:925)
==6406==    by 0x4E56544: SZ_compress_args_double (sz_double.c:984)
==6406==    by 0x4E56B06: SZ_compress_args (sz.c:279)
==6406==    by 0x400FCC: test_roundtrip (sztest.c:86)
==6406==    by 0x400E26: main (sztest.c:58)
==6406== 
==6406== 16 bytes in 1 blocks are definitely lost in loss record 3 of 6
==6406==    at 0x4C2AB8D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6406==    by 0x4E466E2: convertTDPStoFlatBytes_double (TightDataPointStorageD.c:1443)
==6406==    by 0x4E53593: SZ_compress_args_double_withinRange (sz_double.c:925)
==6406==    by 0x4E56544: SZ_compress_args_double (sz_double.c:984)
==6406==    by 0x4E56B06: SZ_compress_args (sz.c:279)
==6406==    by 0x400FCC: test_roundtrip (sztest.c:86)
==6406==    by 0x400E7A: main (sztest.c:60)
==6406== 
==6406== LEAK SUMMARY:
==6406==    definitely lost: 48 bytes in 3 blocks
==6406==    indirectly lost: 0 bytes in 0 blocks
==6406==      possibly lost: 0 bytes in 0 blocks
==6406==    still reachable: 192 bytes in 3 blocks
==6406==         suppressed: 0 bytes in 0 blocks
==6406== Reachable blocks (those to which a pointer was found) are not shown.
==6406== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==6406== 
==6406== For counts of detected and suppressed errors, rerun with: -v
==6406== ERROR SUMMARY: 18 errors from 8 contexts (suppressed: 0 from 0)
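A minimal standalone sketch of the suspected pattern and the usual fix (illustrative only; the struct and field names are stand-ins for the real TightDataPointStorageD fields):

#include <stdlib.h>

typedef struct {
    unsigned char *leadNumArray;    /* stand-in field names */
    unsigned char *exactMidBytes;
} tdps_t;

int main(void)
{
    /* buggy pattern: malloc leaves every field holding garbage, so a later
       free_...() that does `if (tdps->exactMidBytes) free(...)` branches on
       an uninitialized pointer, which is what valgrind reports above */
    /* tdps_t *tdps = (tdps_t *)malloc(sizeof(tdps_t)); */

    /* fix: zero-initialize, as a new_..._Empty-style constructor would */
    tdps_t *tdps = (tdps_t *)calloc(1, sizeof(tdps_t));
    if (tdps->exactMidBytes != NULL)   /* now a well-defined test */
        free(tdps->exactMidBytes);
    free(tdps);
    return 0;
}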

Low compression ratio on Parihaka Seismic Data

Hi @sheltongeosx @MauricioAP @jiemeng-total

I'm opening a new thread in the SZ repo (referencing this issue in cuSZ), so that the SZ team can help you solve the issue of the low compression ratio on your data set.

Hi @disheng222,

Shelton faced an issue to use SZ or cuSZ to compress his seismic data. Using the value-range-based relative error bound of 1e-4 can only provide a compression ratio of about 6. The data can be downloaded here. The official SGY format data can be found here. The data size is "-3 1168 1126 922". Do you think we can look into this?

$ sz -z -f -c sz.config -M REL -R 1e-4 -i /work/06128/dtao/frontera/dataset/reduction/08_SGY_922_1126_1168=1212584896/Parihaka_PSTM_far_stack.f32 -3 1168 1126 922
compression time = 38.944234 seconds
compressed data file: /work/06128/dtao/frontera/dataset/reduction/08_SGY_922_1126_1168=1212584896/Parihaka_PSTM_far_stack.f32.sz
$ sz -x -s /work/06128/dtao/frontera/dataset/reduction/08_SGY_922_1126_1168=1212584896/Parihaka_PSTM_far_stack.f32.sz -i /work/06128/dtao/frontera/dataset/reduction/08_SGY_922_1126_1168=1212584896/Parihaka_PSTM_far_stack.f32 -3 1168 1126 922 -a
Min=-6893.359375, Max=5448.8828125, range=12342.2421875
Max absolute error = 1.2342240810
Max relative error = 0.000100
Max pw relative error = 89749692.983013
PSNR = 84.876211, NRMSE= 5.7041304938182735089E-05
normError = 24515.434433, normErr_norm = 0.002501
acEff=0.999997
compressionRatio=6.210478
decompression time = 27.484778 seconds.
decompressed data file: /work/06128/dtao/frontera/dataset/reduction/08_SGY_922_1126_1168=1212584896/Parihaka_PSTM_far_stack.f32.sz.out

h5repack pointer being freed was not allocated

Hello SZ,
When running the h5repack code, it failed with the error below.
Could anyone help look into it?

Thanks.
Bin

h5repack(11138,0x10bd43e00) malloc: *** error for object 0x7f926e922030: pointer being freed was not allocated
h5repack(11138,0x10bd43e00) malloc: *** set a breakpoint in malloc_error_break to debug
./h5repack.sh: line 12: 11138 Abort trap: 6 h5repack -f UD=32017,0 -i $inputFile -o $outputFile

When running it via lldb, it shows the info below.

% lldb h5repack
(lldb) run -f UD=32017,0 -i testfloat_8_8_128.h5 -o testfloat_8_8_128_sz.h5
Process 11129 launched: '/Users/dbin/work/soft/hdf5-1.12.0/build/bin/h5repack' (x86_64)
h5repack(11129,0x10013be00) malloc: *** error for object 0x101fbd030: pointer being freed was not allocated
h5repack(11129,0x10013be00) malloc: *** set a breakpoint in malloc_error_break to debug
Process 11129 stopped

  * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
    frame #0: 0x00007fff203c1462 libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill:
->  0x7fff203c1462 <+10>: jae    0x7fff203c146c            ; <+20>
    0x7fff203c1464 <+12>: movq   %rax, %rdi
    0x7fff203c1467 <+15>: jmp    0x7fff203bb6a1            ; cerror_nocancel
    0x7fff203c146c <+20>: retq
Target 0: (h5repack) stopped.
(lldb) bt
  * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x00007fff203c1462 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff203ef610 libsystem_pthread.dylib`pthread_kill + 263
    frame #2: 0x00007fff20342720 libsystem_c.dylib`abort + 120
    frame #3: 0x00007fff20223430 libsystem_malloc.dylib`malloc_vreport + 548
    frame #4: 0x00007fff202264c8 libsystem_malloc.dylib`malloc_report + 151
    frame #5: 0x0000000101e861b2 libhdf5sz.so`H5Z_filter_sz(flags=0, cd_nelmts=<unavailable>, cd_values=<unavailable>, nbytes=32768, buf_size=0x00007ffeefbfe378, buf=0x00007ffeefbfe420) at H5Z_SZ.c:0 [opt]
    frame #6: 0x00000001008b79dd libhdf5.200.dylib`H5Z_pipeline + 2797
    frame #7: 0x00000001002a07f2 libhdf5.200.dylib`H5D__chunk_flush_entry + 1922
    frame #8: 0x0000000100298da8 libhdf5.200.dylib`H5D__chunk_flush + 536
    frame #9: 0x00000001002e7ea8 libhdf5.200.dylib`H5D__flush_real + 616
    frame #10: 0x00000001002e6c96 libhdf5.200.dylib`H5D_close + 502
    frame #11: 0x0000000100897bd7 libhdf5.200.dylib`H5VL__native_dataset_close + 391
    frame #12: 0x0000000100862ecf libhdf5.200.dylib`H5VL__dataset_close + 703
    frame #13: 0x0000000100862af0 libhdf5.200.dylib`H5VL_dataset_close + 832
    frame #14: 0x00000001002f297b libhdf5.200.dylib`H5D__close_cb + 475
    frame #15: 0x00000001004c1911 libhdf5.200.dylib`H5I_dec_ref + 401
    frame #16: 0x00000001004c15f5 libhdf5.200.dylib`H5I_dec_app_ref + 181
    frame #17: 0x00000001004c1aa5 libhdf5.200.dylib`H5I_dec_app_ref_always_close + 181
    frame #18: 0x000000010028052d libhdf5.200.dylib`H5Dclose + 877
    frame #19: 0x000000010000fa9e h5repack`do_copy_objects + 14510
    frame #20: 0x000000010000b5d0 h5repack`copy_objects + 8416
    frame #21: 0x0000000100004685 h5repack`h5repack + 101
    frame #22: 0x000000010000191d h5repack`main + 413
    frame #23: 0x00007fff2040a621 libdyld.dylib`start + 1

A missing-unlock bug

Hi, it seems the lock best->mutex is not released before the return at line 815. Is this a bug?

SZ/zstd/dictBuilder/cover.c

Lines 801 to 824 in c30cb79

ZSTD_pthread_mutex_lock(&best->mutex);
--best->liveJobs;
liveJobs = best->liveJobs;
/* If the new dictionary is better */
if (compressedSize < best->compressedSize) {
  /* Allocate space if necessary */
  if (!best->dict || best->dictSize < dictSize) {
    if (best->dict) {
      free(best->dict);
    }
    best->dict = malloc(dictSize);
    if (!best->dict) {
      best->compressedSize = ERROR(GENERIC);
      best->dictSize = 0;
      return;
    }
  }
  /* Save the dictionary, parameters, and size */
  memcpy(best->dict, dict, dictSize);
  best->dictSize = dictSize;
  best->parameters = parameters;
  best->compressedSize = compressedSize;
}
ZSTD_pthread_mutex_unlock(&best->mutex);
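A minimal sketch of the fix this suggests (assuming the intent is that the early-return path must also release the lock):

best->dict = malloc(dictSize);
if (!best->dict) {
  best->compressedSize = ERROR(GENERIC);
  best->dictSize = 0;
  ZSTD_pthread_mutex_unlock(&best->mutex); /* release before the early return */
  return;
}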

Error with "this" keyword in TightDataPointStorageI.h

I got the following error when I tried to compile SZ with a c++ application:
/Users/jyc/sw/sz/1.4.11.0/clang/include/TightDataPointStorageI.h:47:58: error: expected ')'
void new_TightDataPointStorageI(TightDataPointStorageI **this,
^
/Users/jyc/sw/sz/1.4.11.0/clang/include/TightDataPointStorageI.h:47:32: note: to match this '('
void new_TightDataPointStorageI(TightDataPointStorageI **this,

It turns out the "this" keyword in TightDataPointStorageI.h causes the error with C++, because "this" is a reserved word there. If I simply remove "this" from the header, it works (parameter names are optional in declarations). Could SZ remove the "this" keyword from the header files?
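A sketch of the two usual fixes (the second parameter here is illustrative, not the real signature):

/* breaks C++ ("this" is reserved): */
void new_TightDataPointStorageI(TightDataPointStorageI **this, size_t len);
/* fix 1: drop the name; parameter names are optional in C declarations */
void new_TightDataPointStorageI(TightDataPointStorageI **, size_t len);
/* fix 2: rename it to something C++-safe */
void new_TightDataPointStorageI(TightDataPointStorageI **self, size_t len);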

Poor compression & quality for difficult-to-compress data

I am doing some compression studies that involve difficult-to-compress (even incompressible) data. Consider the chaotic data generated by the logistic map xᵢ₊₁ = 4xᵢ(1 − xᵢ):

#include <cstdio>

int main()
{
  double x = 1. / 3;
  for (int i = 0; i < 256 * 256 * 256; i++) {
    fwrite(&x, sizeof(x), 1, stdout);
    x = 4 * x * (1 - x);
  }
  return 0;
}

We wouldn't expect this data to compress at all, but the inherent randomness at least suggests a predictable relationship between (RMS) error, E, and rate, R. Let σ = 1/√8 denote the standard deviation of the input data and define the accuracy gain as

α = log₂(σ / E) - R.

Then each increment in storage, R, by one bit should result in a halving of E, so that α is essentially constant. The limit behavior is slightly different as R → 0 or E → 0, but over a large range α ought to be constant.

Below is a plot of α(R) for SZ 2.1.12.3 and other compressors applied to the above data interpreted as a 3D array of size 256 × 256 × 256. Here SZ's absolute error tolerance mode was used: sz -d -3 256 256 256 -M ABS -A tolerance -i input.bin -z output.sz. The tolerance was halved for each subsequent data point, starting with tolerance = 1.

The plot suggests an odd relationship between R and E, with very poor compression observed for small tolerances. For instance, when the tolerance is in {2⁻¹³, 2⁻¹⁴, 2⁻¹⁵, 2⁻¹⁶}, the corresponding rate is {13.9, 15.3, 18.2, 30.8}, while we would expect R to increase by one bit in each case. Is this perhaps a bug in SZ? Similar behavior is observed for other difficult-to-compress data sets (see rballester/tthresh#7).

[Figure: accuracy gain α(R) for SZ 2.1.12.3 and other compressors on the logistic-map data]

error larger than abs error param on linear data

I'm not 100% sure this isn't an error in my test code, but I'm seeing an absolute error that is larger than the abs error param I passed. I've only seen this happen with certain sizes and small error bounds.

For example:

./sztest-static 10000 1000 -5
size 10000x1000, 80000000 bytes
[SZ] Reading SZ configuration file (sz.config) ...

== LIN ==
out_size = 99809 (0.0012)
out = 0x2104080
abs bound = 0.0000100000000000
roundtrip differs
ERR: data error out of range: -0.0000100000033854

SZ not compiling out of the box on Mac with latest release and master

when building in a separate directory (e.g. build):

../../zstd/./compress/zstd_compress_internal.h:21:10: fatal error: 'zstd_internal.h'
      file not found
#include "zstd_internal.h"
         ^
1 error generated.
../../zstd/./decompress/huf_decompress.c:39:10: fatal error: 'compiler.h' file not
      found
#include "compiler.h"
         ^
../../zstd/./decompress/zstd_decompress.c:59:10: fatal error: 'cpu.h' file not found
#include "cpu.h"
         ^
../../zstd/./compress/zstdmt_compress.c:27:10: fatal error: 'pool.h' file not found
#include "pool.h"        /* threadpool */
         ^
1 error generated.
1 error generated.
make[2]: *** [compress/libzstd_la-zstd_lazy.lo] Error 1
make[2]: *** Waiting for unfinished jobs....
make[2]: *** [decompress/libzstd_la-huf_decompress.lo] Error 1
make[2]: *** [decompress/libzstd_la-zstd_decompress.lo] Error 1
1 error generated.
make[2]: *** [compress/libzstd_la-zstdmt_compress.lo] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

when building in the source directory:

src/utility.c:17:10: fatal error: 'zstd.h' file not found
#include "zstd.h"
         ^
1 error generated.
make[2]: *** [src/libSZ_la-utility.lo] Error 1
make[2]: *** Waiting for unfinished jobs....

Thanks in advance.

CMake error for module libzstd

During the installation process, when running cmake .. -DCMAKE_INSTALL_PREFIX:PATH=<path>, I get the following error:

-- Checking for one of the modules 'libzstd'
CMake Error at sz/CMakeLists.txt:90 (install):
  install TARGETS given target "zstd" which does not exist in this directory.


-- Configuring incomplete, errors occurred!

Am I missing some library I should have installed first? I didn't see anything about dependencies in the user guide ...

Seg fault when calling SZ_Init_Params

    sz_params sz;
    memset(&sz, 0, sizeof(sz_params));
    sz.max_quant_intervals = 65536;
    sz.quantization_intervals = 0;
    // sz.dataEndianType = LITTLE_ENDIAN_DATA;
    //    sz.sysEndianType = LITTLE_ENDIAN_DATA;
    sz.sol_ID = SZ;
    // sz.layers = 1;
    sz.sampleDistance = 100;
    sz.predThreshold = 0.99;
    //    sz.offset = 0;
    sz.szMode = SZ_BEST_COMPRESSION; // SZ_BEST_SPEED; //SZ_BEST_COMPRESSION;
    sz.gzipMode = 1;
    sz.errorBoundMode = ABS;
    sz.absErrBound = 1E-4;
    sz.relBoundRatio = 1E-3;
    sz.psnr = 80.0;
    sz.pw_relBoundRatio = 1E-5;
    sz.segment_size = (int)pow(5, (double)ndims);
    sz.pwr_type = SZ_PWR_MIN_TYPE;

   Params::const_iterator it;
    for (it = parameters.begin(); it != parameters.end(); it++)
    {
        std::cout << it->first << " => " << it->second << '\n';
        if (it->first == "init")
        {
            use_configfile = 1;
            sz_configfile = std::string(it->second);
        }
        else if (it->first == "max_quant_intervals")
        {
            sz.max_quant_intervals = std::stoi(it->second);
        }
        else if (it->first == "quantization_intervals")
        {
            sz.quantization_intervals = std::stoi(it->second);
        }
        else if (it->first == "sol_ID")
        {
            sz.sol_ID = std::stoi(it->second);
        }
        else if (it->first == "sampleDistance")
        {
            sz.sampleDistance = std::stoi(it->second);
        }
        else if (it->first == "predThreshold")
        {
            sz.predThreshold = std::stof(it->second);
        }
        else if (it->first == "szMode")
        {
            int szMode = SZ_BEST_SPEED;
            if (it->second == "SZ_BEST_SPEED")
            {
                szMode = SZ_BEST_SPEED;
            }
            else if (it->second == "SZ_BEST_COMPRESSION")
            {
                szMode = SZ_BEST_COMPRESSION;
            }
            else if (it->second == "SZ_DEFAULT_COMPRESSION")
            {
                szMode = SZ_DEFAULT_COMPRESSION;
            }
            else
            {
                std::cout << "[WARN] An unknown szMode: " << it->second
                          << std::endl;
            }
            sz.szMode = szMode;
        }
        else if (it->first == "gzipMode")
        {
            sz.gzipMode = std::stoi(it->second);
        }
        else if (it->first == "errorBoundMode")
        {
            int errorBoundMode = ABS;
            if (it->second == "ABS")
            {
                errorBoundMode = ABS;
            }
            else if (it->second == "REL")
            {
                errorBoundMode = REL;
            }
            else if (it->second == "ABS_AND_REL")
            {
                errorBoundMode = ABS_AND_REL;
            }
            else if (it->second == "ABS_OR_REL")
            {
                errorBoundMode = ABS_OR_REL;
            }
            else if (it->second == "PW_REL")
            {
                errorBoundMode = PW_REL;
            }
            else
            {
                std::cout << "[WARN] An unknown errorBoundMode: " << it->second
                          << std::endl;
            }
            sz.errorBoundMode = errorBoundMode;
        }
        else if (it->first == "absErrBound")
        {
            sz.absErrBound = std::stof(it->second);
        }
        else if (it->first == "relBoundRatio")
        {
            sz.relBoundRatio = std::stof(it->second);
        }
        else if (it->first == "pw_relBoundRatio")
        {
            sz.pw_relBoundRatio = std::stof(it->second);
        }
        else if (it->first == "segment_size")
        {
            sz.segment_size = std::stoi(it->second);
        }
        else if (it->first == "pwr_type")
        {
            int pwr_type = SZ_PWR_MIN_TYPE;
            if ((it->first == "MIN") || (it->first == "SZ_PWR_MIN_TYPE"))
            {
                pwr_type = SZ_PWR_MIN_TYPE;
            }
            else if ((it->first == "AVG") || (it->first == "SZ_PWR_AVG_TYPE"))
            {
                pwr_type = SZ_PWR_AVG_TYPE;
            }
            else if ((it->first == "MAX") || (it->first == "SZ_PWR_MAX_TYPE"))
            {
                pwr_type = SZ_PWR_MAX_TYPE;
            }
            else
            {
                std::cout << "[WARN] An unknown pwr_type: " << it->second
                          << std::endl;
            }
            sz.pwr_type = pwr_type;
        }
        else if ((it->first == "abs") || (it->first == "absolute") ||
                 (it->first == "accuracy"))
        {
            sz.errorBoundMode = ABS;
            sz.absErrBound = std::stod(it->second);
        }
        else if ((it->first == "rel") || (it->first == "relative"))
        {
            sz.errorBoundMode = REL;
            sz.relBoundRatio = std::stof(it->second);
        }
        else if ((it->first == "pw") || (it->first == "pwr") ||
                 (it->first == "pwrel") || (it->first == "pwrelative"))
        {
            sz.errorBoundMode = PW_REL;
            sz.pw_relBoundRatio = std::stof(it->second);
        }
        else if ((it->first == "zchecker") || (it->first == "zcheck") ||
                 (it->first == "z-checker") || (it->first == "z-check"))
        {
            use_zchecker = (it->second == "") ? 1 : std::stof(it->second);
        }
        else
        {
            std::cout << "[WARN] An unknown SZ parameter: " << it->first
                      << std::endl;
        }
    }
    SZ_Init_Params(&sz); //seg fault happens here

valgrind output signals an invalid write of size 4:

BPWriteReadSZ.ADIOS2BPWriteRead1D
debugMode:1
accuracy => 0.001
sz.max_quant_intervals: 65536
sz.quantization_intervals: 0
sz.sol_ID: 101
sz.sampleDistance: 100
sz.predThreshold: 0.99
sz.szMode: 1
sz.gzipMode: 1
sz.errorBoundMode: 0
sz.absErrBound: 0.001
sz.relBoundRatio: 0.001
sz.psnr: 80
sz.pw_relBoundRatio: 1e-05
sz.segment_size: 5
sz.pwr_type: 0
==22812== Invalid write of size 4
==22812==    at 0x72AB4F6: SZ_Init_Params (in /opt/sz/2.0.2.0/lib/libSZ.so.1.0.4)
==22812==    by 0x5812752: adios2::core::compress::CompressSZ::Compress(void const*, std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&) const (CompressSZ.cpp:264)
==22812==    by 0x578C645: void adios2::format::BP3SZ::SetDataCommon<float>(adios2::core::Variable<float> const&, adios2::core::Variable<float>::Info const&, adios2::core::Variable<float>::Operation const&, adios2::BufferSTL&) const (BP3SZ.tcc:35)

Undefined symbol _deflateInit_ when linking to SZ after make install

mbpwfg:bin wfg$ ./TestBPWriteReadSZ 
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from BPWriteReadSZ
[ RUN      ] BPWriteReadSZ.ADIOS2BPWriteRead1D
debugMode:1
accuracy => 0.001
sz.max_quant_intervals: 65536
sz.quantization_intervals: 0
sz.sol_ID: 101
sz.sampleDistance: 100
sz.predThreshold: 0.99
sz.szMode: 1
sz.gzipMode: 1
sz.errorBoundMode: 0
sz.absErrBound: 0.001
sz.relBoundRatio: 0.001
sz.psnr: 80
sz.pw_relBoundRatio: 1e-05
sz.segment_size: 5
sz.pwr_type: 0
dyld: lazy symbol binding failed: Symbol not found: _deflateInit_
  Referenced from: /opt/sz/2.0.2.0/lib//libSZ.1.dylib
  Expected in: flat namespace

dyld: Symbol not found: _deflateInit_
  Referenced from: /opt/sz/2.0.2.0/lib//libSZ.1.dylib
  Expected in: flat namespace

Trace/BPT trap: 5

In fact, the symbols are undefined:

mbpwfg:bin wfg$ nm /opt/sz/2.0.2.0/lib/libSZ.dylib | grep deflateInit
                 U _deflateInit2_
                 U _deflateInit_
mbpwfg:bin wfg$ nm /opt/sz/2.0.2.0/lib/libSZ.1.dylib | grep deflateInit
                 U _deflateInit2_
                 U _deflateInit_

Python bindings fail to build in current release

The SWIG python bindings fail with the following error:

[build/v2.1.11.1]$ cmake -DBUILD_PYTHON_WRAPPER=ON ../../source/v2.1.11.1
...
[build/v2.1.11.1]$ make
...
[ 97%] Building CXX object swig/CMakeFiles/pysz.dir/CMakeFiles/pysz.dir/pyszPYTHON_wrap.cxx.o
In file included from /home/tmpuser/sz/build/swig/CMakeFiles/pysz.dir/pyszPYTHON_wrap.cxx:2844:0:
/home/tmpuser/sz/source/swig/pysz.h: In member function ‘exafelSZ_params* ExaFELConfigBuilder::build()’:
/home/tmpuser/sz/source/swig/pysz.h:90:13: error: ‘exafelSZ_params’ has no member named ‘peaks’
     params->peaks = new uint8_t [peaks.size()];
             ^
/home/tmpuser/sz/source/swig/pysz.h:92:59: error: ‘exafelSZ_params’ has no member named ‘peaks’
     std::copy(std::begin(peaks), std::end(peaks), params->peaks);
                                                           ^
/home/tmpuser/sz/source/swig/pysz.h: In static member function ‘static void ExaFELConfigBuilder::free(exafelSZ_params*)’:
/home/tmpuser/source/swig/pysz.h:108:22: error: ‘exafelSZ_params’ has no member named ‘peaks’
     delete[] params->peaks;
                      ^
make[2]: *** [swig/CMakeFiles/pysz.dir/CMakeFiles/pysz.dir/pyszPYTHON_wrap.cxx.o] Error 1
make[1]: *** [swig/CMakeFiles/pysz.dir/all] Error 2
make: *** [all] Error 2
[build/v2.1.11.1]$ 

segmentation fault in H5Z-SZ

Hello, SZ Community,
I had the following segmentation fault while running the szToHDF5 code. Could anyone please help identify the cause and suggest a solution?

Thanks,
Bin

System is Mac 11.2.3 (20D91), hdf5-1.10.7,
mpicc --version
Apple clang version 12.0.0 (clang-1200.0.32.29)
Target: x86_64-apple-darwin20.3.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

Compiling is OK, but it reports a segmentation fault when running the szToHDF5 code.

MBP H5Z-SZ % ./test/szToHDF5 -u16 sz.config ../../example/testdata/x86/testint16_8x8x8.dat 8 8 8
config file = sz.config
cfgFile=sz.config
outputfile=../../example/testdata/x86/testint16_8x8x8.dat.sz.h5
Dimension sizes: n5=0, n4=0, n3=8, n2=8, n1=8
sz filter is available for encoding and decoding.
....Writing SZ compressed data.............
original data = 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ....
zsh: segmentation fault ./test/szToHDF5 -u16 sz.config ../../example/testdata/x86/testint16_8x8x8.dat

When using lldb to dump the trace, it shows the information below.

MBP H5Z-SZ % lldb ./test/szToHDF5
(lldb) target create "./test/szToHDF5"
Current executable set to '/Users/dbin/work/soft/SZ/hdf5-filter/H5Z-SZ/test/szToHDF5' (x86_64).
(lldb) run -u16 sz.config ../../example/testdata/x86/testint16_8x8x8.dat 8 8 8
Process 37002 launched: '/Users/dbin/work/soft/SZ/hdf5-filter/H5Z-SZ/test/szToHDF5' (x86_64)
config file = sz.config
cfgFile=sz.config
outputfile=../../example/testdata/x86/testint16_8x8x8.dat.sz.h5
Dimension sizes: n5=0, n4=0, n3=8, n2=8, n1=8
sz filter is available for encoding and decoding.
....Writing SZ compressed data.............
original data = 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ....
Process 37002 stopped

  * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0xc)
    frame #0: 0x00000001002d89b7 libSZ.1.dylib`SZ_Init + 23
libSZ.1.dylib`SZ_Init:
->  0x1002d89b7 <+23>: movl   $0x8, 0xc(%rax)
    0x1002d89be <+30>: movq   0x28eeb(%rip), %rax       ; confparams_cpr
    0x1002d89c5 <+37>: xorl   %ebx, %ebx
    0x1002d89c7 <+39>: cmpl   $0x3, 0x20(%rax)
Target 0: (szToHDF5) stopped.
(lldb) bt
  * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0xc)
  * frame #0: 0x00000001002d89b7 libSZ.1.dylib`SZ_Init + 23
    frame #1: 0x00000001018cf318 libhdf5sz.so`H5Z_sz_set_local(dcpl_id=720575940379279377, type_id=216172782113783878, chunk_space_id=288230376151711748) at H5Z_SZ.c:0 [opt]
    frame #2: 0x00000001005e8d5a libhdf5.103.dylib`H5Z__prelude_callback(pline=0x00007ffeefbfd688, dcpl_id=720575940379279377, type_id=216172782113783878, space_id=288230376151711748, prelude_type=<unavailable>) at H5Z.c:779:29 [opt]
    frame #3: 0x00000001005e890e libhdf5.103.dylib`H5Z__prepare_prelude_callback_dcpl(dcpl_id=720575940379279377, type_id=216172782113783878, prelude_type=H5Z_PRELUDE_SET_LOCAL) at H5Z.c:865:21 [opt]
    frame #4: 0x00000001005e8ac3 libhdf5.103.dylib`H5Z_set_local(dcpl_id=720575940379279377, type_id=216172782113783878) at H5Z.c:936:9 [opt]
    frame #5: 0x00000001003c8712 libhdf5.103.dylib`H5D__create(file=<unavailable>, type_id=<unavailable>, space=<unavailable>, dcpl_id=<unavailable>, dapl_id=<unavailable>) at H5Dint.c:1238:16 [opt]
    frame #6: 0x00000001003d5410 libhdf5.103.dylib`H5O__dset_create(f=<unavailable>, _crt_info=<unavailable>, obj_loc=0x00007ffeefbfd9f0) at H5Doh.c:299:24 [opt]
    frame #7: 0x00000001004bcd38 libhdf5.103.dylib`H5O_obj_create(f=<unavailable>, obj_type=<unavailable>, crt_info=<unavailable>, obj_loc=0x00007ffeefbfd9f0) at H5Oint.c:2495:37 [opt]
    frame #8: 0x000000010048854d libhdf5.103.dylib`H5L__link_cb(grp_loc=0x00007ffeefbfdcd0, name="testdata_compressed", lnk=<unavailable>, obj_loc=<unavailable>, _udata=0x00007ffeefbfe1a8, own_loc=0x00007ffeefbfdcec) at H5L.c:1651:53 [opt]
    frame #9: 0x00000001004594fc libhdf5.103.dylib`H5G__traverse_real(_loc=<unavailable>, name="testdata_compressed", target=<unavailable>, op=(libhdf5.103.dylib`H5L__link_cb at H5L.c:1627), op_data=<unavailable>) at H5Gtraverse.c:623:16 [opt]
    frame #10: 0x0000000100458631 libhdf5.103.dylib`H5G_traverse(loc=0x00007ffeefbfe2f8, name="testdata_compressed", target=0, op=(libhdf5.103.dylib`H5L__link_cb at H5L.c:1627), op_data=<unavailable>) at H5Gtraverse.c:847:8 [opt]
    frame #11: 0x0000000100487422 libhdf5.103.dylib`H5L__create_real(link_loc=<unavailable>, link_name="testdata_compressed", obj_path=0x0000000000000000, obj_file=0x0000000000000000, lnk=<unavailable>, ocrt_info=<unavailable>, lcpl_id=720575940379279374) at H5L.c:1845:8 [opt]
    frame #12: 0x0000000100487574 libhdf5.103.dylib`H5L_link_object(new_loc=<unavailable>, new_name=<unavailable>, ocrt_info=<unavailable>, lcpl_id=<unavailable>) at H5L.c:1604:8 [opt]
    frame #13: 0x00000001003c7a0f libhdf5.103.dylib`H5D__create_named(loc=0x00007ffeefbfe2f8, name="testdata_compressed", type_id=216172782113783878, space=0x0000000101b41210, lcpl_id=720575940379279374, dcpl_id=720575940379279376, dapl_id=720575940379279367) at H5Dint.c:337:8 [opt]
    frame #14: 0x00000001003a49bf libhdf5.103.dylib`H5Dcreate2(loc_id=72057594037927936, name="testdata_compressed", type_id=<unavailable>, space_id=<unavailable>, lcpl_id=720575940379279374, dcpl_id=720575940379279376, dapl_id=<unavailable>) at H5D.c:151:24 [opt]
    frame #15: 0x0000000100004c78 szToHDF5`main(argc=7, argv=0x00007ffeefbff738) at szToHDF5.c:269:21
    frame #16: 0x00007fff2036b621 libdyld.dylib`start + 1
    frame #17: 0x00007fff2036b621 libdyld.dylib`start + 1

Decompressed data becomes half length

I am converting a 1-D array (len = 40485). Here is the code:

print(arr.shape) 
# (40485, )
np.savetxt('output.dat', arr)

Then running the following commands for compression and decompression

sz -z -f -i output.dat -1 48045
sz -x -f -s output.dat.sz -1 48045

Reading the decompressed file in numpy gives the following results

output_arr = np.fromfile('output.dat.sz.out')
print(f'Before compression shape {arr.shape} | After compression shape {output_arr.shape}')
# Before compression shape (48045, ) | After compression shape (24022, )

How can I retrieve the original-size array?
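A likely explanation, hedged since the full script isn't shown: np.fromfile defaults to dtype=np.float64, while sz -f reads and writes 4-byte floats, so every pair of float32 values is parsed as one float64 and the element count is halved. Reading with np.fromfile('output.dat.sz.out', dtype=np.float32) should recover the original length. Note also that np.savetxt writes ASCII text, not the raw binary that sz -i expects; arr.astype(np.float32).tofile('output.dat') writes the binary form.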

SZ fails to losslessly compress/decompress data

SZ fails to losslessly compress/decompress data.

Input file: http://www.cs.txstate.edu/~burtscher/research/datasets/FPsingle/msg_sppm.sp.spdp
Note that the file is compressed with SPDP. You have to decompress it first.

To compress I used the following command:
./sz -i /msg_sppm.sp -f -z tmp.data -1 34874483 -A .0 -p

compressed data file: tmp.data
=================SZ Compression Meta Data=================
Version:                        	 40.181.47
Constant data?:                 	 YES
Lossless?:                      	 YES
Size type (size of # elements): 	 8 bytes
Num of elements:                	 1391624735126898859
Data type:                      	 FLOAT
quantization_intervals:         	 145244089
max_quant_intervals:            	 - 0
dataEndianType (prior raw data):	 LITTLE_ENDIAN
sysEndianType (at compression): 	 LITTLE_ENDIAN
sampleDistance:                 	 20723
predThreshold:                  	 2.573500
szMode:                         	 SZ_BEST_SPEED (without Gzip)
gzipMode:                       	 Z_BEST_SPEED
errBoundMode:                   	 ABS
absErrBound:                    	 -11068089228422190989312.000000

To decompress I ran:
./sz -z tmp.data -x tmp_decompressed.data -1 34874483 -p -A .0

=================SZ Compression Meta Data=================
Version:                        	 2.1.3
Constant data?:                 	 NO
Lossless?:                      	 NO
Size type (size of # elements): 	 8 bytes
Num of elements:                	 34874483
Data type:                      	 FLOAT
quantization_intervals:         	 65536
max_quant_intervals:            	 - 0
dataEndianType (prior raw data):	 LITTLE_ENDIAN
sysEndianType (at compression): 	 LITTLE_ENDIAN
sampleDistance:                 	 100
predThreshold:                  	 0.990000
szMode:                         	 SZ_BEST_COMPRESSION (with Gzip)
gzipMode:                       	 Z_BEST_SPEED
errBoundMode:                   	 ABS
absErrBound:                    	 0.000719

There's a significant difference after a given index in the decompressed data.

Any ideas what could be causing this?
I initially discovered the bug when using the API and reproduced it using the utility binary provided in the project.

sz_omp.c:19:19: error: ‘CLOCK_MONOTONIC_RAW’ undeclared

it seems CLOCK_MONOTONIC_RAW is Linux-specific and won't compile for win32:

#ifdef _OPENMP
    return omp_get_wtime();
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC_RAW, &ts);

    return (double)ts.tv_sec + (double)ts.tv_nsec / 1000000000.0;
#endif
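A hypothetical portability sketch (not SZ code): fall back to CLOCK_MONOTONIC where CLOCK_MONOTONIC_RAW is missing, and to QueryPerformanceCounter on Windows, which lacks clock_gettime in older toolchains:

#ifdef _WIN32
#include <windows.h>
#else
#include <time.h>
#endif

static double sz_wtime(void)
{
#ifdef _WIN32
    LARGE_INTEGER freq, count;
    QueryPerformanceFrequency(&freq);   /* ticks per second */
    QueryPerformanceCounter(&count);    /* current tick count */
    return (double)count.QuadPart / (double)freq.QuadPart;
#else
#ifndef CLOCK_MONOTONIC_RAW
#define CLOCK_MONOTONIC_RAW CLOCK_MONOTONIC  /* portable fallback */
#endif
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1000000000.0;
#endif
}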

Any API or fundamental changes in v 2.1.12?

Just wondering whether there have been any API or fundamental algorithm changes in v2.1.12? We have a bunch of tests written in ADIOS for SZ compression, which had been working for quite a long time across a number of versions, but 2.1.12 broke almost all of them. I haven't had a chance to look into the details yet, but it seems to me that something in 2.1.12 is very different from previous versions. Any hints would be very much appreciated. Thanks.

spack package and release policy

I created a spack package for SZ in a fork, and would like to submit a pull request soon:
https://github.com/bd4/spack

Tarballs changing without the version also changing is problematic for spack packages, because the spack package definition has a tarball checksum associated with each version. Is 1.4.9-beta ready, and if so can it be frozen so any further changes get a new version number?

HDF5 version hdf5-1.12.0

Hello SZ,
Sorry that I have lots of issues here, and thanks for the help fixing them.
I tried the code on the latest version of HDF5, i.e., hdf5-1.12.0.

It reports the same error as the h5repack one in #73.

See below for the results from running the code with hdf5-1.12.0.

Thanks,
Bin

./szToHDF5 -d sz.config ../../../example/testdata/x86/testdouble_8_8_128.dat 8 8 128

config file = sz.config
cfgFile=sz.config
output file: ../../../example/testdata/x86/testdouble_8_8_128.dat.sz.h5
Dimension sizes: n5=0, n4=0, n3=128, n2=8, n1=8
sz filter is available for encoding and decoding.
....Writing SZ compressed data.............
original data = 0.225612 0.225635 0.225691 0.225739 0.225738 0.225691 0.225623 0.225564 0.225612 0.225635 0.225691 0.225739 0.225738 0.225691 0.225623 0.225564 0.225612 0.225635 0.225691 0.225739 ....
szToHDF5(12144,0x10bee6e00) malloc: *** error for object 0x7fadd99a5030: pointer being freed was not allocated
szToHDF5(12144,0x10bee6e00) malloc: *** set a breakpoint in malloc_error_break to debug
zsh: abort ./szToHDF5 -d sz.config ../../../example/testdata/x86/testdouble_8_8_128.dat

new feature: timing trace file

For analyzing the cost of adding compression to real science applications, we'd like a way of logging the time each compress and decompress operation takes. I'm thinking of a library-level function that sets the log file:

SZ_set_timing_log_file(char *path)

If this is set, each compress/decompress routine saves the time at start and logs a single line with the duration. Making it thread-safe would require some locking.

Is this something you would be interested in adding, or would accept as a pull request?
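A minimal sketch of the proposal (the function name is from this issue; everything else is hypothetical):

#include <stdio.h>

static FILE *sz_timing_log = NULL;

void SZ_set_timing_log_file(char *path)
{
    sz_timing_log = fopen(path, "a");   /* append so multiple runs accumulate */
}

/* to be called by each compress/decompress routine with its measured duration */
static void sz_log_duration(const char *op, double seconds)
{
    if (sz_timing_log != NULL)
        fprintf(sz_timing_log, "%s %.9f\n", op, seconds); /* one line per operation */
}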

make install fails with "file INSTALL cannot find "[...]/bin/testint_compress".

This is how I try to build sz:

version="master"
git clone https://github.com/szcompressor/SZ.git "${PETSC_DIR}/${PETSC_ARCH}/externalpackages/sz-${version}_src"
mkdir "${PETSC_DIR}/${PETSC_ARCH}/externalpackages/sz-${version}"
cd "${PETSC_DIR}/${PETSC_ARCH}/externalpackages/sz-${version}"
cmake "../sz-${version}_src" -DCMAKE_INSTALL_PREFIX:PATH="${PETSC_DIR}/${PETSC_ARCH}/externalpackages/sz-${version}"
make
make install

cmake and make both succeed, but make install quits with this message:

CMake Error at example/cmake_install.cmake:47 (file):
  file INSTALL cannot find
  "/soft/petsc-3.14.3/foss_debug/externalpackages/sz-master/bin/testint_compress".
Call Stack (most recent call first):
  cmake_install.cmake:56 (include)


Makefile:128: recipe for target 'install' failed
make: *** [install] Error 1

Problem using SZ filter from python with h5py>=3

For the last few months I've been successfully using the SZ HDF5 filter to write compressed data in netCDF format from Python.
However, while it works great with h5py 2.10, with h5py versions from 3.0 on the filter does not work anymore.
This is problematic for us because we can't stay pinned to an old h5py version, as other libraries require later releases.

The specific error thrown by HDF5 is:

HDF5-DIAG: Error detected in HDF5 (1.10.5) thread 0:
  #000: H5T.c line 1876 in H5Tget_class(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.10.5) thread 0:
  #000: H5E.c line 1417 in H5Epush2(): can't push error on stack
    major: Error API
    minor: Can't set value
  #001: H5I.c line 1417 in H5I_inc_ref(): can't locate ID
    major: Object atom
    minor: Unable to find atom information (already closed?)
  #002: H5T.c line 1876 in H5Tget_class(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type

The python stack:

  File ".../xarray/backends/h5netcdf_.py", line 315, in prepare_variable
    nc4_var = self.ds.create_variable(
  File ".../h5netcdf/core.py", line 535, in create_variable
    return group._create_child_variable(
  File ".../h5netcdf/core.py", line 512, in _create_child_variable
    self._h5group.create_dataset(
  File ".../h5py/_hl/group.py", line 153, in create_dataset
    dsid = dataset.make_new_dset(group, shape, dtype, data, name, **kwds)
  File ".../h5py/_hl/dataset.py", line 134, in make_new_dset
    dset_id = h5d.create(parent.id, name, tid, sid, dcpl=dcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5d.pyx", line 87, in h5py.h5d.create
ValueError: Unable to create dataset (error during user callback)

Any idea about what is happening here?

Link error on macOS

Hi, I'm getting link errors when I try to build SZ using cmake on my Mac.

[ 45%] Linking C shared library ../lib/libSZ.dylib
Undefined symbols for architecture x86_64:
  "_ZSTD_compress", referenced from:
      _sz_lossless_compress in utility.c.o
  "_ZSTD_decompress", referenced from:
      _sz_lossless_decompress in utility.c.o
      _sz_lossless_decompress65536bytes in utility.c.o
  "_ZSTD_getFrameContentSize", referenced from:
      _is_lossless_compressed_data in utility.c.o
  "_compress2", referenced from:
      _zlib_compress in callZlib.c.o
     (maybe you meant: _zlib_compress2)
  "_deflate", referenced from:
      _zlib_compress2 in callZlib.c.o
      _zlib_compress3 in callZlib.c.o
      _zlib_compress4 in callZlib.c.o
      _zlib_compress5 in callZlib.c.o
  "_deflateBound", referenced from:
      _zlib_compress in callZlib.c.o
      _zlib_compress2 in callZlib.c.o
      _zlib_compress4 in callZlib.c.o
      _zlib_compress5 in callZlib.c.o
  "_deflateEnd", referenced from:
      _zlib_compress2 in callZlib.c.o
      _zlib_compress3 in callZlib.c.o
      _zlib_compress4 in callZlib.c.o
      _zlib_compress5 in callZlib.c.o
  "_deflateInit2_", referenced from:
      _zlib_compress2 in callZlib.c.o
      _zlib_compress3 in callZlib.c.o
      _zlib_compress4 in callZlib.c.o
  "_deflateInit_", referenced from:
      _zlib_compress5 in callZlib.c.o
  "_inflate", referenced from:
      _zlib_uncompress2 in callZlib.c.o
      _zlib_uncompress3 in callZlib.c.o
      _zlib_uncompress4 in callZlib.c.o
      _zlib_uncompress65536bytes in callZlib.c.o
      _zlib_uncompress5 in callZlib.c.o
  "_inflateEnd", referenced from:
      _zlib_uncompress2 in callZlib.c.o
      _zlib_uncompress3 in callZlib.c.o
      _zlib_uncompress4 in callZlib.c.o
      _zlib_uncompress65536bytes in callZlib.c.o
      _zlib_uncompress5 in callZlib.c.o
  "_inflateInit_", referenced from:
      _zlib_uncompress2 in callZlib.c.o
      _zlib_uncompress3 in callZlib.c.o
      _zlib_uncompress4 in callZlib.c.o
      _zlib_uncompress65536bytes in callZlib.c.o
      _zlib_uncompress5 in callZlib.c.o
  "_uncompress", referenced from:
      _zlib_uncompress in callZlib.c.o
     (maybe you meant: _zlib_uncompress, _zlib_uncompress5, _zlib_uncompress3, _zlib_uncompress2, _zlib_uncompress65536bytes, _zlib_uncompress4)

I'm using macOS 10.13.6 and GCC8.2.0.

SZ 1.4.6-beta

SZ 1.4.6-beta is the recommended version (Nov., 2016)

fortran build no longer works

Looks like the argument changes from the error-handling refactor broke the Fortran compat code:

src/szf.c: In function 'sz_batch_decompress_c_':
src/szf.c:514:2: error: too few arguments to function 'SZ_batch_decompress'
  SZ_batch_decompress(bytes, *byteLength);
  ^~~~~~~~~~~~~~~~~~~
In file included from src/szf.c:14:0:
./include/sz.h:421:12: note: declared here
 SZ_VarSet* SZ_batch_decompress(unsigned char* compressedStream, int length, int *status);

user manual: 2d array dimension example error?

In the manual, pages 6-7, it says that if the data array is 2D (M x N), then the args should be r5=0, r4=0, r3=0, r2=M, r1=N. Usually M x N means M rows and N columns, and C 2D arrays use row-major order, so shouldn't it be r2=N, r1=M?
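For concreteness, the row-major layout the question is about (a standalone sketch; it illustrates the memory order, not which convention SZ uses):

#include <assert.h>

int main(void)
{
    enum { M = 3, N = 4 };             /* M rows, N columns */
    float a[M][N];
    float *flat = &a[0][0];
    a[1][2] = 7.0f;
    assert(flat[1 * N + 2] == 7.0f);   /* row-major: the column index (length N) varies fastest */
    return 0;
}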

API is not thread-safe

The compress and decompress functions are not thread-safe. The reason is that the global contexts confparams_cpr, confparams_dec, and exe_params are modified by some of the calls. Example:

unsigned char* SZ_compress_args(int dataType, void *data, size_t *outSize, int errBoundMode, double absErrBound,
	double relBoundRatio, double pwrBoundRatio, size_t r5, size_t r4, size_t r3, size_t r2, size_t r1)
{
	confparams_cpr->dataType = dataType;
	/* ... */
}

void *SZ_decompress(int dataType, unsigned char *bytes, size_t byteLength, size_t r5, size_t r4, size_t r3, size_t r2, size_t r1)
{
	if(confparams_dec==NULL)
		confparams_dec = (sz_params*)malloc(sizeof(sz_params));
	memset(confparams_dec, 0, sizeof(sz_params));
	if(exe_params==NULL)
		exe_params = (sz_exedata*)malloc(sizeof(sz_exedata));
	memset(exe_params, 0, sizeof(sz_exedata));
	exe_params->SZ_SIZE_TYPE = 8;
	/* ... */
}

I think it would be beneficial to guarantee that calling the compress/decompress is thread safe. Otherwise, the user would have to introduce a single lock for the compress / decompress paths which would lead to contention and kill multi-threaded performance.
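A sketch of the user-side workaround described above (hypothetical wrapper, using the SZ_compress_args signature quoted in this issue; this is exactly the single-lock contention being warned about, not a fix):

#include <pthread.h>
#include "sz.h"

static pthread_mutex_t sz_lock = PTHREAD_MUTEX_INITIALIZER;

unsigned char *sz_compress_locked(int dataType, void *data, size_t *outSize,
                                  int errBoundMode, double absErrBound,
                                  double relBoundRatio, double pwrBoundRatio,
                                  size_t r5, size_t r4, size_t r3, size_t r2, size_t r1)
{
    pthread_mutex_lock(&sz_lock);    /* serialize every call that touches the globals */
    unsigned char *out = SZ_compress_args(dataType, data, outSize, errBoundMode,
                                          absErrBound, relBoundRatio, pwrBoundRatio,
                                          r5, r4, r3, r2, r1);
    pthread_mutex_unlock(&sz_lock);
    return out;
}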

Makefile change to build libhdf5sz.so

Hello!

Building the shared libhdf5sz.so library failed for me due to missing symbols from the HDF5 library. Below is the altered build command that worked:

$(CC) -O3 -shared -o $(LIB)/$(SHARED) $(OBJS) $(SZFLAGS) -L$(HDF5PATH)/lib -lc -lSZ -lhdf5 -lzlib -lzstd

Can lossless compression be used

SZ is a lossy compressor, but some compressors, for example fpzip and zfp, can also provide lossless compression. I want to know whether SZ can do lossless compression, and what I should do to use it if so.

confusing libtool error when fortran compiler is not present

If the fortran compiler is missing, configure still succeeds but make fails when invoking libtool with this stumper:

/bin/bash ../libtool  --tag=FC   --mode=link    -version-info  1:4:0  -o libsz.la -rpath /usr/local/lib libsz_la-ByteToolkit.lo libsz_la-dataCompression.lo libsz_la-DynamicIntArray.lo libsz_la-iniparser.lo libsz_la-CompressElement.lo libsz_la-DynamicByteArray.lo libsz_la-rw.lo libsz_la-TightDataPointStorageD.lo libsz_la-TightDataPointStorageF.lo libsz_la-conf.lo libsz_la-DynamicDoubleArray.lo libsz_la-TypeManager.lo libsz_la-dictionary.lo libsz_la-DynamicFloatArray.lo libsz_la-VarSet.lo libsz_la-test_zlib.lo libsz_la-Huffman.lo libsz_la-sz_float.lo libsz_la-sz_double.lo libsz_la-sz.lo libsz_la-TightDataPointStorageF_pwr.lo libsz_la-TightDataPointStorageD_pwr.lo libsz_la-sz_float_pwr.lo libsz_la-sz_double_pwr.lo  
libtool: link: unrecognized option `-ersion-info'

My best guess is that there is something missing in the autoconf dependencies that is allowing this to happen.

EDIT: this is using the sz-1.4.9-beta tarball, using just ./configure, without enable fortran option.

szf.c:414:2: error: too few arguments to function ‘SZ_batchAddVar’

The latest release (2.1.5.0) fails to build with :

szf.c:414:2: error: too few arguments to function ‘SZ_batchAddVar’
  414 |  SZ_batchAddVar(s2, SZ_FLOAT, data, *errBoundMode, *absErrBound, *relBoundRatio, 0.1, 0, 0, 0, 0, *r1);
      |  ^~~~~~~~~~~~~~
In file included from sz/src/SZ-2.1.5.0/sz/include/sz.h:21,
                 from sz/src/SZ-2.1.5.0/sz/src/szf.c:14:
sz/src/SZ-2.1.5.0/sz/include/VarSet.h:62:6: note: declared here
   62 | void SZ_batchAddVar(int var_id, char* varName, int dataType, void* data,

SZ_batchAddVar has one more argument:

void SZ_batchAddVar(int var_id, char* varName, int dataType, void* data,
                        int errBoundMode, double absErrBound, double relBoundRatio,
                        double pwRelBoundRatio,
                        size_t r5, size_t r4, size_t r3, size_t r2, size_t r1);

test_int.sh fails each decompress

When running the test_int.sh integer example script, each of the decompress steps fails. Tested on an Ubuntu 20.04 VM.

Input

git clone https://github.com/szcompressor/SZ
cd SZ
./configure --prefix=$HOME/.local
make
make install
cd example
./test_int.sh 

Result

compression
cfgFile=sz_int.config
timecost=0.001358, output compressed file: testdata/x86/testint8_8x8x8.dat.sz
done
double free or corruption (!prev)
./test_int.sh: line 5: 333527 Aborted                 (core dumped) testint_decompress -i8 sz_int.config testdata/x86/testint8_8x8x8.dat.sz 8 8 8
cfgFile=sz_int.config
timecost=0.000973, output compressed file: testdata/x86/testint16_8x8x8.dat.sz
done
double free or corruption (!prev)
./test_int.sh: line 8: 333530 Aborted                 (core dumped) testint_decompress -i16 sz_int.config testdata/x86/testint16_8x8x8.dat.sz 8 8 8
cfgFile=sz_int.config
timecost=0.000939, output compressed file: testdata/x86/testint32_8x8x8.dat.sz
done
double free or corruption (!prev)
./test_int.sh: line 11: 333533 Aborted                 (core dumped) testint_decompress -i32 sz_int.config testdata/x86/testint32_8x8x8.dat.sz 8 8 8
cfgFile=sz_int.config
timecost=0.001055, output compressed file: testdata/x86/testint64_8x8x8.dat.sz
done
double free or corruption (!prev)
./test_int.sh: line 14: 333536 Aborted                 (core dumped) testint_decompress -i64 sz_int.config testdata/x86/testint64_8x8x8.dat.sz 8 8 8
cfgFile=sz_int.config
timecost=0.000899, output compressed file: testdata/x86/testint8_8x8x8.dat.sz
done
double free or corruption (!prev)
./test_int.sh: line 17: 333539 Aborted                 (core dumped) testint_decompress -ui8 sz_int.config testdata/x86/testint8_8x8x8.dat.sz 8 8 8
cfgFile=sz_int.config
timecost=0.001012, output compressed file: testdata/x86/testint16_8x8x8.dat.sz
done
double free or corruption (!prev)
./test_int.sh: line 20: 333542 Aborted                 (core dumped) testint_decompress -ui16 sz_int.config testdata/x86/testint16_8x8x8.dat.sz 8 8 8
cfgFile=sz_int.config
timecost=0.000977, output compressed file: testdata/x86/testint32_8x8x8.dat.sz
done
double free or corruption (!prev)
./test_int.sh: line 23: 333545 Aborted                 (core dumped) testint_decompress -ui32 sz_int.config testdata/x86/testint32_8x8x8.dat.sz 8 8 8

Possible to have incorrect numbers with PW_REL?

I am trying to check whether I am getting correct values with the PW_REL mode.
With the following lines in "sz.config" (the other parts are the same as the example file):

errorBoundMode = PW_REL
relBoundRatio = 1E-2

After running the following command and then decompressing, the point-wise relative error I calculated is different from what I expected (I expected it to be lower than 1E-1):

$ ./testdouble_compress sz.config testdouble_8_8_128.dat 8 128  

My calculated value is 0.159883. I used code like this:

    double rel, relMax = 0.0;
    for (i = 0; i < nbEle; i++)
    {
        rel = fabs(ori_data[i] - data[i]) / fabs(ori_data[i]);
        if (relMax < rel) relMax = rel;
    }
    printf("Max pw_relative error = %g\n", relMax);

I am wondering whether this is the expected result or whether I am using it incorrectly.
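One thing that may be worth checking: if the example sz.config defines a separate pw_relBoundRatio key, the PW_REL mode may read that key rather than relBoundRatio, in which case a fragment like the following (key name is an assumption here) would be the one that actually controls the point-wise bound:

errorBoundMode = PW_REL
pw_relBoundRatio = 1E-2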

Build failing on Summit

Build fails on Summit with gcc 8.1.0 with the following message:

../../zstd/./decompress/zstd_decompress.c:59:10: fatal error: cpu.h: No such file or directory
 #include "cpu.h"
           ^~~~~~~
compilation terminated.

out of range access

When using SZ compression through ADIOS2, memory corruption occurs in certain cases. More specifically, compressing a 5x5 float array shows the issue. I don't have a simple example to reproduce it right now (in fact, I'm pretty sure there is more than one problem), but if you're not already aware of it, the address sanitizer feature of modern compilers (or valgrind) will help you pinpoint and debug the problem.

The first example occurs in sz_float.c around line 5896. Printing cur_data_pos - oriData shows:

cur_data_pos 6
cur_data_pos 20
cur_data_pos 18
cur_data_pos 28
=================================================================
==97516==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60b0000034f0 at pc 0x000105b66364 bp 0x7ffeebc00290 sp 0x7ffeebc00288
READ of size 4 at 0x60b0000034f0 thread T0

Since it's a 5x5 array, only positions 0..24 are valid. I'm pretty sure this isn't the only problem, since the issue that started my debugging session was a write access, not a read access.
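A minimal reproducer sketch, assuming the SZ 2.x API (SZ_Init/SZ_compress_args/SZ_Finalize); running it against an address-sanitizer build of libSZ should flag the out-of-range read described above:

#include <stdio.h>
#include <stdlib.h>
#include "sz.h"

int main(void)
{
    float data[25];                    /* 5x5 array: valid positions 0..24 */
    for (int i = 0; i < 25; i++)
        data[i] = (float)i;

    SZ_Init(NULL);                     /* NULL: fall back to default parameters */
    size_t outSize = 0;
    /* 2-D compression: r2 = 5 rows, r1 = 5 columns, absolute bound 1e-3 */
    unsigned char *bytes = SZ_compress_args(SZ_FLOAT, data, &outSize,
                                            ABS, 1e-3, 0, 0,
                                            0, 0, 0, 5, 5);
    printf("compressed to %zu bytes\n", outSize);
    free(bytes);
    SZ_Finalize();
    return 0;
}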

sz 1.4.9 beta

Support for lossy compression based on a point-wise relative error bound.
Decompression no longer depends on SZ_Init().
Fortran compilation is optional.

SZ 1.4.8-beta

Compared with SZ 1.4.6-beta, we fixed several memory-leak bugs, fixed the crashes and segmentation faults that occurred when the number of data points is very large, fixed the segmentation fault that occurred when the data size is very small, and fixed cases in which the error bound was not guaranteed.

In SZ 1.4.8-beta, we also increased the maximum number of quantization intervals. Users can set max_quant_intervals in the configuration file to tune compression performance and compression ratio for different precisions. Please see the description in sz.config for details.
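For example, a fragment along these lines in sz.config (the value is illustrative; the shipped file documents the recommended range):

max_quant_intervals = 65536

A larger value tends to help when the error bound is tight, at the cost of a larger Huffman code table.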

About random access or random decompression

I am curious whether the random decompression feature is already available. The function SZ_decompress_args_randomaccess_float() in sz_randomaccess.c is only declared, not implemented.

Is it possible to call SZ_Finalize on a per-task basis? Or provide a thread-safe API?

We are currently running into segfaults when compressing and decompressing in multiple threads that call the SZ API. When a compression/decompression task in one thread finishes, calling SZ_Finalize releases all memory buffers that SZ has allocated, including those allocated by other threads, which then causes a segfault. If we don't call SZ_Finalize, we leak memory instead. Any suggestions for handling this kind of multi-threaded workflow? Thanks.
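A sketch of the workflow in question (API usage assumed from the SZ 2.x headers); the comments mark where a per-task SZ_Finalize goes wrong:

#include <pthread.h>
#include <stdlib.h>
#include "sz.h"

static void *compress_task(void *arg)
{
    size_t outSize = 0;
    /* 1-D array of 1000 floats, absolute error bound 1e-3 */
    unsigned char *bytes = SZ_compress_args(SZ_FLOAT, arg, &outSize,
                                            ABS, 1e-3, 0, 0,
                                            0, 0, 0, 0, 1000);
    free(bytes);
    /* Calling SZ_Finalize() here would free global buffers still in use
     * by the other thread -- the segfault described above. */
    return NULL;
}

int main(void)
{
    static float a[1000], b[1000];
    pthread_t t1, t2;
    SZ_Init(NULL);
    pthread_create(&t1, NULL, compress_task, a);
    pthread_create(&t2, NULL, compress_task, b);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    SZ_Finalize();    /* safe only once every task has finished */
    return 0;
}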

Compilation errors

Various compilation errors of the type:

 void sz_compress_d1_float_(float* data, unsigned char *bytes, size_t *outSize, int *r1) 
      ^
In file included from ./include/sz.h:28:0,
                 from src/szf.c:14:
./include/szf.h:23:6: note: previous declaration of 'sz_compress_d1_float_' was here

The function signatures in the .c and .h files do not match.

Testing with gcc 4.9.3 on Titan.
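Presumably the fix is for szf.h to carry the same prototype as the definition in szf.c, e.g.:

void sz_compress_d1_float_(float* data, unsigned char *bytes,
                           size_t *outSize, int *r1);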

SZ fails decompressing N-Dimensional Data

Hi everybody,

I'm currently trying to compress N-dimensional numpy arrays using SZ with a fixed absolute error bound, e.g. 0.1.
For that, I'm generating some random data like this:

import numpy as np

# Array size
n = 1000000
np.random.seed(123)
zexact = np.random.randn(n)
# Reshape to 3-D
zexact = zexact.reshape((100,100,100))
zexact = zexact.astype(np.float32)

# Save file:
filename = 'zexact.dat'
zexact.tofile(filename)

Next, I run the following commands:

!./sz -z -f  -M ABS -A 0.1 -i zexact.dat -3 100 100 100
!./sz -x -f -s zexact.dat.sz -3 100 100 100 -a -i zexact.dat

The output I get is:

Min=-4.6071300506591796875, Max=4.6275801658630371094, range=9.234710693359375
Max absolute error = 179516227584.0000000000
Max relative error = 19439290368.000000
Max pw relative error = 1672789715114.853271
PSNR = -158.353887, NRMSE= 82735967.982407003641
normError = 764042728252.572754, normErr_norm = 1.000000
acEff=-0.001861
compressionRatio=5.487985
decompression time = 0.058381 seconds.
decompressed data file: zexact.dat.sz.out

As you can see, the error bound is not respected. However, when I reshape the array to be 1-D, it works.

!./sz -z -f  -M ABS -A 0.1 -i zexact.dat -1 1000000
!./sz -x -f -s zexact.dat.sz -1 1000000 -a -i zexact.dat

The output:

Min=-4.6071300506591796875, Max=4.6275801658630371094, range=9.234710693359375
Max absolute error = 0.1000547409
Max relative error = 0.010835
Max pw relative error = 9727.824053
PSNR = 44.072084, NRMSE= 0.0062574274392154221464
normError = 57.785532, normErr_norm = 0.057676
acEff=0.998335
compressionRatio=6.511529
decompression time = 0.024181 seconds.
decompressed data file: zexact.dat.sz.out

Is there any way you could help me? Thank you 🦊
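For an independent check of the bound, a small C sketch (the file names match the commands above; the size assumes the 100x100x100 float32 array):

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(void)
{
    const size_t n = 100 * 100 * 100;
    float *a = malloc(n * sizeof *a);
    float *b = malloc(n * sizeof *b);
    FILE *fa = fopen("zexact.dat", "rb");
    FILE *fb = fopen("zexact.dat.sz.out", "rb");
    if (!a || !b || !fa || !fb || fread(a, sizeof *a, n, fa) != n
                                || fread(b, sizeof *b, n, fb) != n) {
        fprintf(stderr, "read failed\n");
        return 1;
    }
    double maxErr = 0;
    for (size_t i = 0; i < n; i++) {
        double err = fabs((double)a[i] - (double)b[i]);
        if (err > maxErr) maxErr = err;
    }
    printf("max abs error = %g (requested bound 0.1)\n", maxErr);
    return 0;
}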
