
leopard's People

Contributors

catid, cskiraly, gbletr42, musalbas, nemequ, thebluematt

leopard's Issues

recovery_count > original_count

As the title says, I have a small original count and a large recovery count, and need to pad up to 32k/32k. The complexity is not pretty (though still the best one can get for an MDS code).
Would it be possible to arrange the field so as to avoid the overhead?

I can see some attempts in encodeL(), but indeed, as the prototype code says, it doesn't seem to work, or it would need a specialized decoder. Is this an inherent limitation of the algorithm, or is it just not implemented?
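
For context, the workaround I'm using now is to zero-pad the originals until original_count >= recovery_count, roughly like this (an untested sketch; pad_originals is a hypothetical helper, not part of the library):

    #include <cstdint>
    #include <cstdlib>
    #include <vector>

    // Hypothetical helper: pad a small original set with all-zero buffers so
    // that original_count >= recovery_count before calling leo_encode. The
    // decoder must know the padded count and can treat the zero symbols as
    // always-present originals, so they never need to be transmitted.
    static std::vector<void*> pad_originals(const std::vector<void*>& originals,
                                            unsigned recovery_count,
                                            uint64_t buffer_bytes)
    {
        std::vector<void*> padded = originals;
        while (padded.size() < recovery_count)
            padded.push_back(calloc(1, buffer_bytes)); // implicit zero symbol
        return padded;
    }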

Port to ARM NEON

I had one contact who may have found this useful on cell phones, so I should port it to ARM NEON.

Decoder to correct both erasure and error

I found a new paper about these FFT-based Reed-Solomon codes. It presents erasure-and-error decoding of an (n, k) RS code, with a decoding complexity of only O(n log n + (n − k) log²(n − k)). For reference, the complexity of the current erasure-only decoding is O(n log n).

"On fast Fourier transform-based decoding of Reed-Solomon codes"
by Yunghsiang S. Han, Chao Chen, Sian-Jheng Lin, Baoming Bai
http://ct.ee.ntust.edu.tw/IJAHUC2021.pdf

I'm posting the link here as a notice; if you know it already, just ignore this issue. I cannot follow the theory in the paper myself. Error correction would probably be needed only rarely, since erasure correction is enough for most uses, but the possibility of error correction in your leopard-RS library would be welcome.

Question regarding the API

I'm writing a Go wrapper for this implementation; by the way, the C wrapper makes that really convenient, thanks for that!

I struggle to completely understand the API, though. What I'm trying to do is quite basic: given n original data buffers or symbols, I want to generate n "parity symbols" such that I can lose any n (of the now 2*n total) symbols and still recover the data.

I mainly experimented with the API in benchmark.cpp. OK, it sounds like I would need to call leo_encode with original_count == recovery_count for that, right? leo_encode_work_count returns 2*original_count for these parameters, and calling leo_encode yields an encode_work_data of size 2*original_count, as expected. So far so good. What is surprising, though, is that I would have expected encode_work_data to contain the original data too; instead, it seems to contain 2*original_count recovery or parity symbols. So is there no way to achieve what I described above using the public C API, or am I using recovery_count wrong? A sketch of my current understanding follows below.

Thanks in advance :-)
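
For reference, here is roughly what I'm doing, as a minimal sketch based on my reading of leopard.h and benchmark.cpp (error handling and filling the buffers with real data are omitted):

    #include <cstdint>
    #include <cstdlib>
    #include <vector>
    #include "leopard.h"

    int main()
    {
        if (leo_init() != 0) return 1;        // initialize the library once

        const uint64_t buffer_bytes = 64;     // must be a multiple of 64
        const unsigned original_count = 4;
        const unsigned recovery_count = 4;    // n parity symbols for n originals

        std::vector<void*> original(original_count);
        for (auto& p : original) p = calloc(1, buffer_bytes);

        // work_count exceeds recovery_count because the FFT needs scratch space.
        const unsigned work_count = leo_encode_work_count(original_count, recovery_count);
        std::vector<void*> work(work_count);
        for (auto& p : work) p = malloc(buffer_bytes);

        if (leo_encode(buffer_bytes, original_count, recovery_count, work_count,
                       original.data(), work.data()) != Leopard_Success)
            return 1;

        // My current understanding: the parity symbols are work[0..recovery_count-1]
        // and the remaining work buffers are scratch; the originals are never copied
        // into the work area. Is that right?
        return 0;
    }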

New benchmark for ECC libraries

https://github.com/Bulat-Ziganshin/ECC-Benchmark

I made an ECC benchmark that mainly includes your libraries :) The reason is that I don't know of better alternatives to them, so if anyone has better ideas, I'm all ears.

Currently it lacks a proper build system (just compile.cmd, which works for me), as well as an automatic data gathering/publishing system (I'm working on it).

Nevertheless, to me it looks better than your own comparison of your own libraries, so you may like to link to it from your readme and/or publish any data gathered this way yourself.

BTW, the benchmark can be built with -fopenmp too.

Progressive/incremental encoding ?

Hi, is it possible to do progressive/incremental encoding with this? It would significantly reduce the memory usage if we could feed in one block of the input file at a time instead of having to keep the whole file in memory.
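
For illustration, here is the kind of striping workaround I have in mind, as an untested sketch against the public API (each stripe of original_count buffers is encoded as an independent code word):

    #include <cstdint>
    #include <cstdio>
    #include <cstdlib>
    #include <cstring>
    #include <vector>
    #include "leopard.h"

    // Encode a file one stripe at a time; only one stripe plus its work
    // buffers is ever resident in memory.
    void encode_in_stripes(FILE* in, uint64_t buffer_bytes,
                           unsigned original_count, unsigned recovery_count)
    {
        const unsigned work_count = leo_encode_work_count(original_count, recovery_count);
        std::vector<void*> original(original_count), work(work_count);
        for (auto& p : original) p = malloc(buffer_bytes);
        for (auto& p : work)     p = malloc(buffer_bytes);

        for (;;) {
            size_t got = 0;
            for (auto p : original) {               // read one stripe
                memset(p, 0, buffer_bytes);         // zero-pad a short final stripe
                got += fread(p, 1, buffer_bytes, in);
            }
            if (got == 0) break;                    // end of file
            leo_encode(buffer_bytes, original_count, recovery_count,
                       work_count, original.data(), work.data());
            // work[0..recovery_count-1] now hold this stripe's parity; persist them.
        }
    }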

No description how to restore missed original data after decoding

I've looked into benchmark.cpp, and the only place decode_work_data is used is at:

if (!CheckPacket(decode_work_data[i], params.buffer_bytes))

Thus I conjecture that decode_work_data[i] has a CRC32 in the first 4 bytes, then the length of the data in the next 4 bytes, and then the data itself (corresponding to the missing original_data[i]) in the remaining (length) bytes(?).

But if this hypothesis were correct, then each decode_work_data[i] would need an extra 8 bytes of room, yet the code uses only buffer_bytes (exactly as for any other buffer).

If the hypothesis is wrong, then where should we extract the missing original_data[i] from?
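
For concreteness, here is the call sequence I'm looking at, sketched from the signatures in leopard.h. My alternative hypothesis would be that the recovered original_data[i] lands verbatim in decode_work_data[i], with the CRC and length being part of the benchmark's own test payload rather than anything the library adds:

    #include <cstdint>
    #include <cstdlib>
    #include <vector>
    #include "leopard.h"

    // Sketch: the originals and parity buffers came from leo_encode earlier;
    // suppose original symbol 3 was lost.
    void recover(std::vector<void*>& original, std::vector<void*>& recovery,
                 uint64_t buffer_bytes, unsigned original_count, unsigned recovery_count)
    {
        free(original[3]);
        original[3] = nullptr;  // NULL marks a lost buffer

        const unsigned work_count = leo_decode_work_count(original_count, recovery_count);
        std::vector<void*> decode_work(work_count);
        for (auto& p : decode_work) p = malloc(buffer_bytes);

        if (leo_decode(buffer_bytes, original_count, recovery_count, work_count,
                       original.data(), recovery.data(), decode_work.data()) == Leopard_Success) {
            // Hypothesis: decode_work[3] now holds the raw bytes of the missing
            // original[3], with no CRC or length header prepended.
        }
    }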

Allow more parity symbols aka losses than data

We've run into some hiccups making this work with more parity symbols than data symbols.

There is a check at leopard.cpp:135 to prevent this, and if it is removed, the benchmark segfaults:

#0  leopard::SIMDSafeFree (ptr=0xf8913455fd9fcc4a) at /tmp/leopard/tests/../LeopardCommon.h:480
#1  Benchmark (params=...) at /tmp/leopard/tests/benchmark.cpp:452

due to the benchmark/test code assuming all lost symbols come from the data symbols at benchmark.cpp:452. I suppose the test cases should cover losing only some of the original data; it should presumably be straightforward to (randomly?) replace fewer than `losses` data symbols with parity symbols, though (see the sketch below).

We also ran into a strange slowdown in leo_encode when adjusting the code to produce more parity symbols than data symbols. At first blush, there seems to be some runaway code that fills up many gigabytes of memory, which is strange since the allocations seemingly occur outside leo_encode.

I've not yet fully understood the algorithm, so I'd love your input on whether you think either the encode or decode algorithm should run into performance or other hiccups when handling more parity chunks than data chunks. Thoughts?
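
For what it's worth, this untested sketch is what I mean by randomizing the losses across both symbol sets:

    #include <algorithm>
    #include <numeric>
    #include <random>
    #include <vector>

    // Hypothetical loss pattern for the test: drop `losses` symbols chosen
    // uniformly from the combined original+recovery index range, instead of
    // always dropping from the originals as benchmark.cpp:452 assumes today.
    std::vector<unsigned> pick_losses(unsigned original_count, unsigned recovery_count,
                                      unsigned losses, std::mt19937& rng)
    {
        std::vector<unsigned> idx(original_count + recovery_count);
        std::iota(idx.begin(), idx.end(), 0u);
        std::shuffle(idx.begin(), idx.end(), rng);
        idx.resize(losses);  // entries < original_count are data losses
        return idx;
    }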

Test(s) which terminate

AFAICT, both bench_leopard and experiment_leopard just go on forever, or at least for a very long time. It would be nice to have something that could be run on CI.

Speeding up mulE

The mulE description says it returns a*exp[b] over GF(2^r), so it's a simple multiplication in GF(2^r) that can be optimized with PSHUFB; they just precompute the logarithms of the twiddle factors to speed up their implementation.
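
For illustration, here is the classic PSHUFB nibble-table trick over GF(2^8), as an untested sketch (leopard's ff16 path works over GF(2^16) and needs more tables per constant, but the idea is the same; in real code the tables would be precomputed once per constant rather than per call):

    #include <cstdint>
    #include <tmmintrin.h>  // SSSE3: _mm_shuffle_epi8 (PSHUFB)

    // Scalar GF(2^8) multiply (polynomial 0x11D), used only to build tables.
    static uint8_t gf256_mul(uint8_t a, uint8_t b)
    {
        uint8_t p = 0;
        while (b) {
            if (b & 1) p ^= a;
            b >>= 1;
            a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1D : 0));
        }
        return p;
    }

    // Multiply 16 GF(2^8) symbols by the constant c at once. Since
    // c*(hi<<4 ^ lo) = c*(hi<<4) ^ c*lo, two 16-entry nibble tables suffice.
    static __m128i gf256_mul_const(__m128i x, uint8_t c)
    {
        uint8_t lo[16], hi[16];
        for (int i = 0; i < 16; ++i) {
            lo[i] = gf256_mul(c, (uint8_t)i);         // c * low nibble
            hi[i] = gf256_mul(c, (uint8_t)(i << 4));  // c * high nibble
        }
        const __m128i tlo  = _mm_loadu_si128((const __m128i*)lo);
        const __m128i thi  = _mm_loadu_si128((const __m128i*)hi);
        const __m128i mask = _mm_set1_epi8(0x0F);
        const __m128i xlo  = _mm_and_si128(x, mask);
        const __m128i xhi  = _mm_and_si128(_mm_srli_epi64(x, 4), mask);
        return _mm_xor_si128(_mm_shuffle_epi8(tlo, xlo),
                             _mm_shuffle_epi8(thi, xhi));
    }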

Typo in the library name

Traditionally, libraries are named lib[name], but in your CMakeLists.txt you have the library name as libleopard, which CMake turns into liblibleopard. If possible, I feel it would be good to change it to leopard, so that it follows the standard naming.
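
Concretely, the change would look like this (${LEOPARD_SOURCES} stands in for whatever source list the actual CMakeLists.txt uses):

    # Current: CMake prepends the platform prefix, so the target "libleopard"
    # produces liblibleopard.a:
    add_library(libleopard STATIC ${LEOPARD_SOURCES})

    # Renaming the target to "leopard" yields the conventional libleopard.a:
    add_library(leopard STATIC ${LEOPARD_SOURCES})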

GCC ≤ 4.6 build fails

See https://travis-ci.org/nemequ/leopard/jobs/395895113

/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1480:41: error: a brace-enclosed initializer is not allowed here before ‘{’ token
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1480:42: sorry, unimplemented: non-static data member initializers
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1480:42: error: ‘constexpr’ needed for in-class initialization of static data member ‘Words’ of non-integral type
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1484:46: error: a brace-enclosed initializer is not allowed here before ‘{’ token
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1484:47: sorry, unimplemented: non-static data member initializers
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1484:47: error: ‘constexpr’ needed for in-class initialization of static data member ‘BigWords’ of non-integral type
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1487:43: error: a brace-enclosed initializer is not allowed here before ‘{’ token
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1487:44: sorry, unimplemented: non-static data member initializers
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1487:44: error: ‘constexpr’ needed for in-class initialization of static data member ‘BiggestWords’ of non-integral type
/home/travis/build/nemequ/leopard/LeopardFF16.cpp: In member function ‘void leopard::ff16::ErrorBitfield::Set(unsigned int)’:
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1492:9: error: ‘Words’ was not declared in this scope
/home/travis/build/nemequ/leopard/LeopardFF16.cpp: In member function ‘bool leopard::ff16::ErrorBitfield::IsNeeded(unsigned int, unsigned int) const’:
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1504:26: error: ‘BiggestWords’ was not declared in this scope
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1509:26: error: ‘BigWords’ was not declared in this scope
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1511:22: error: ‘Words’ was not declared in this scope
/home/travis/build/nemequ/leopard/LeopardFF16.cpp: In member function ‘void leopard::ff16::ErrorBitfield::Prepare()’:
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1528:24: error: ‘Words’ was not declared in this scope
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1545:32: error: ‘Words’ was not declared in this scope
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1551:9: error: ‘BigWords’ was not declared in this scope
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1563:28: error: ‘BigWords’ was not declared in this scope
/home/travis/build/nemequ/leopard/LeopardFF16.cpp:1569:5: error: ‘BiggestWords’ was not declared in this scope
make[2]: *** [CMakeFiles/libleopard.dir/LeopardFF16.cpp.o] Error 1
make[2]: Leaving directory `/home/travis/build/nemequ/leopard/build'
make[1]: *** [CMakeFiles/libleopard.dir/all] Error 2
make[1]: Leaving directory `/home/travis/build/nemequ/leopard/build'
make: *** [all] Error 2

4.7+ works.
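
For reference, a minimal reproduction of the rejected construct (the names are illustrative, not the actual LeopardFF16.cpp code): GCC only gained support for C++11 non-static data member initializers in 4.7.

    #include <stdint.h>

    // C++11 in-class initializer: GCC <= 4.6 reports "sorry, unimplemented:
    // non-static data member initializers" on lines like this.
    struct ErrorBitfieldLike {
        uint64_t Words[4] = {};  // needs GCC 4.7+
    };

    // Pre-C++11 alternative: initialize in the constructor instead.
    struct ErrorBitfieldCompat {
        uint64_t Words[4];
        ErrorBitfieldCompat() {
            for (int i = 0; i < 4; ++i) Words[i] = 0;
        }
    };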
