Comments (8)
Thanks for fixing this! I'm glad my bug report was so useful to you. It was really easy to find this edge case using proptest: it was the second or third test run that found it. If you like, I'd be happy to set up proptest to test lexical and lexical-core, and send you a PR. That way it should be easy to find any more values like this lurking in the corners.
from rust-lexical.
@Alexhuszagh I'm happy that the issue was caught and fixed so fast!
I've checked this case in a simplified Scala implementation of your algorithm for base 10 and found that it successfully passes through the medium path that uses a big lookup table (653 values of 64-bit coefficients) and one operation of 64-bit multiplication of mantissa.
Have you considered providing the specialized implementation for base 10?
For cases when you need to parse 64-bit float values that were previously serialized from the binary representation, it allows minimizing the need of subsequent slower paths down to 0.22% (for arbitrary binary representation).
from rust-lexical.
Thank you for finding this, let me first confirm the issues compared to other parsers, and I'll triage it accordingly.
from rust-lexical.
@dangrabcad Confirmed, it's a core error in the slow-path algorithm used by default, let me check if this exists in other versions.
Specifically, bhcomp
is using a b
value of 7.689539722041642e164
, and is using the internal routine small_atof
to generate the significant digits. This is an amazing catch, thank you for finding this.
The core bug seems to be in small_atof
in bhcomp.rs
, which is giving me a value lower than the compared digits for real_digits
, so the comparison suggests that we should round-down, rather than round-up, maybe due to an overflow or underflow during the correct scaling. Quite a nice find, thank you.
from rust-lexical.
@dangrabcad I've confirmed the source of the issue, the scaling in small_atof
is exactly 10 off for the real digits as opposed to the b+h
digits, which is likely due to calculating the exponent in this case off-by-1 exactly. Should be an easy patch.
from rust-lexical.
@dangrabcad The integration tests should fail (an unrelated bug I need to patch), but the following should fix the error. I've diagnosed the source of the issue, and it's a rather larger bug than I was hoping, and a very good catch. I'll add in more unittests to ensure this is never an issue again.
In short, any floating point-number with trailing 0s in the fraction part that forced the slow path (IE, more than 17 digits and near a halfway point) may have been rounded incorrectly. I'll be updating the README accordingly, crediting you for finding this, and adding in further tests to ensure this never happens again. I cannot thank you enough.
After patching the various dependency issues on older Rust versions (the issue with the integrations), I should push a commit later today.
Once again, thank you. Quite a catch.
from rust-lexical.
Fixed as of v0.4.3.
from rust-lexical.
@dangrabcad I already have some integrations for proptest, but clearly they aren't comprehensive enough. Please send over anything you have. This discovery sqaushed a class of bugs, and was beyond useful.
Also, @plokhotnyuk, I've been fairly swamped lately, but I would to. Also, I've been very glad my work has been useful to more than just the Rust ecosystem.
from rust-lexical.
Related Issues (20)
- [OTHER] Improve internal safety comments and architecture
- [BUG] Unsoundness: `try_parse_{4,8}digits` appear to advance iterators out of bounds
- [BUG] Unsoundness in `Bytes::read()` HOT 1
- [BUG] BytesIter should be an `unsafe trait` or private
- [BUG] Bounds for integer parsing are not correctly checked HOT 1
- [FEATURE] Update Dragonbox Algorithm HOT 2
- [BUG] `umul192_lower128` can error on due to overflow on addition
- [FEATURE] Add thousands digit separator for ```WriteIntegerOptions``` to use with ```to_string_with_options```
- parse_partial ignoring minus sign in presence of trailing characters HOT 3
- [BUG] Assertion failure on parsing hexadecimal floats HOT 1
- [BUG] Tests fail to compile because quickcheck is missing from Cargo.toml in the crates.io tarball. HOT 4
- [QUESTION] Use of min_significant_digits with max_significant digits
- [BUG] Integer overflow checking logic is incorrect
- [FEATURE] `write` APIs should take `buffer: &mut [MaybeUninit<u8>]`
- [QUESTION] support custom types (i256)
- [BUG] -0.0 printed as "0.0" HOT 1
- [BUG] Safety comments for MaybeUninit::assume_init calls are wrong, calls are UB
- [BUG] consecutive_digit_separator has no effect
- [BUG] Inconsistent error between int and float parsing
- [BUG] `no_integer_leading_zeros(true)` incorrectly parses input
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from rust-lexical.