
fastdoubleparser's People

Contributors

chenzhongpu, kosak, laino, lemire, marschall, pjfanning, vlsi, wrandelshofer, xtonik

fastdoubleparser's Issues

Incorrect maven command sequence

The test can be done with javac and java directly, but it does NOT work as expected with maven.

After mvn clean package, the command below raises an error "Error: Could not find or load main class ch.randelshofer.fastdoubleparserdemo.Main in module ch.randelshofer.fastdoubleparserdemo":

java -XX:CompileCommand=inline,java/lang/String.charAt -p fastdoubleparser/target:fastdoubleparserdemo/target -m ch.randelshofer.fastdoubleparserdemo/ch.randelshofer.fastdoubleparserdemo.Main --markdown

I checked the jars inside fastdoubleparser/target and fastdoubleparserdemo/target, and found that they contain nothing but a META-INF folder!

jar xvf fastdoubleparser-0.7.0.jar
  created: META-INF/
 inflated: META-INF/MANIFEST.MF

So the maven command does not produce a correct jar, and I think this is caused by an incorrect Maven project structure and POM configuration. BTW, I think the current multi-release jar here is a little overkill.

float parser

Hi - thanks for all the great work on the double parser. I've been experimenting with it for possible inclusion in jackson-core.

Parsing floats using the double parser is also much faster than using Float.parseFloat, but unfortunately casting doubles to floats can often give you a different result from plain Float.parseFloat.

Would it be possible to consider also supporting a dedicated float parser?

An example is 7.006492321624086e-46 which Float.parseFloat returns as 1.4E-45 but using FastDoubleParser:

        double dbl = FastDoubleParser.parseDouble("7.006492321624086e-46");
        System.out.println("double=" + dbl); //7.006492321624085E-46
        System.out.println("float=" + (float)dbl); //0.0
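
The difference comes from rounding twice (decimal to double, then double to float) instead of once. A minimal sketch that uses only the JDK parsers to show the effect for the value above; the class and method names are made up for illustration:

```java
final class FloatDoubleRounding {
    // Two rounding steps: decimal -> binary64 -> binary32.
    static float viaDouble(String s) {
        return (float) Double.parseDouble(s);
    }

    // One rounding step: decimal -> binary32.
    static float direct(String s) {
        return Float.parseFloat(s);
    }

    public static void main(String[] args) {
        String s = "7.006492321624086e-46";
        float a = direct(s);
        float b = viaDouble(s);
        System.out.println("direct    = " + a);
        System.out.println("viaDouble = " + b);
        // Compare bit patterns so that 0.0 and -0.0 would also count as different.
        System.out.println("same bits = "
                + (Float.floatToIntBits(a) == Float.floatToIntBits(b)));
    }
}
```

The input lies just above the midpoint between 0 and Float.MIN_VALUE. The nearest double is exactly that midpoint, so the narrowing cast resolves the tie towards 0.0, while the direct parse rounds up.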

possible performance issue with very big doubles

JavaDoubleParser seems to be slower than Double.parseDouble for very large numbers (thousands of digits).
Malicious actors often create input files with large numbers to try to cause denial of service issues.

I have a jmh benchmark at https://github.com/pjfanning/jackson-number-parse-bench

./gradlew jmh

It's worth checking the build.gradle file as I have a param that controls which benchmark to run.

jmh {
    includes = ['org.example.jackson.bench.DoubleParserBench']
}

I'm wondering if it would be possible to disregard the least significant digits. If there are 1000 digits, only the first 30 or 40 digits should really impact the double value - even if you were conservative and limited it to 100 or 200, this would limit the risk vector.
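
A rough sketch of that idea, under strong assumptions: it handles only a plain run of ASCII digits (no sign, decimal point, or exponent), and `capDigits` is a hypothetical helper, not part of fastdoubleparser. Dropping digits truncates towards zero, so in rare halfway cases the result could differ from a correctly rounded parse by one unit in the last place.

```java
final class DigitCap {
    // Hypothetical helper: keeps at most maxDigits significant digits and
    // replaces the dropped ones with a decimal exponent.
    // Input that is not a non-empty run of ASCII digits is returned unchanged.
    static String capDigits(String digits, int maxDigits) {
        if (digits.isEmpty() || maxDigits < 1) {
            return digits;
        }
        for (int i = 0; i < digits.length(); i++) {
            char c = digits.charAt(i);
            if (c < '0' || c > '9') {
                return digits;
            }
        }
        // Leading zeros are not significant, so skip them before counting.
        int start = 0;
        while (start < digits.length() - 1 && digits.charAt(start) == '0') {
            start++;
        }
        String significant = digits.substring(start);
        if (significant.length() <= maxDigits) {
            return significant;
        }
        int dropped = significant.length() - maxDigits;
        return significant.substring(0, maxDigits) + "e" + dropped;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 300; i++) {
            sb.append('7');
        }
        String capped = capDigits(sb.toString(), 40);
        System.out.println(capped);
        System.out.println(Double.parseDouble(capped));
    }
}
```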

Publish a multi-release JAR

"We use your java8 code in jackson-core. If you publish a jar with your java8 branch code that would be great - we would change our build to use your published jars and that shades the class packages to include them in jackson-core jar.

One solution would be to append '-java8' to the artifact name (and '-java17' for the java17 jar). Or maven supports 'classifiers' which basically lead to a similar result."

Originally posted by @pjfanning in #22 (comment)

Document which code signing keys will be used for published artifacts.

Looks like artifacts are being signed with this key:
https://keyserver.ubuntu.com/pks/lookup?search=6ead752b3e2b38e8e2236d7ba9321edaa5cb3202&fingerprint=on&op=index

If that is the correct key can you add a section to the readme confirming that is the key that is expected to be used for code signing on the artifacts released from this repo? Thanks. :)

Examples of other libs that provide docs for the code signing key used are here:

Double.parseDouble("0e555") != FastDoubleParser.parseDouble("0e555")

Double.parseDouble and FastDoubleParser.parseDouble return different results for the string "0e555":

Double.parseDouble("0e555"): 0.0
FastDoubleParser.parseDouble("0e555"): Infinity

Edit: I believe that is caused by the special case at

return negative ? Double.NEGATIVE_INFINITY : Double.POSITIVE_INFINITY;

which does not handle the even more special case of the mantissa being 0.
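
One way such a shortcut could guard against a zero mantissa, as a sketch only: the method name, its parameters, the threshold of 308, and the NaN "no shortcut" sentinel are assumptions for illustration and do not mirror the library's actual code.

```java
final class ZeroMantissaCheck {
    // Hypothetical overflow shortcut for a decimal significand and exponent.
    // Returns NaN when no shortcut applies and the regular path should run.
    static double overflowShortcut(boolean negative, long significand, int exponent) {
        if (significand == 0) {
            // Zero times any power of ten is zero; keep the sign.
            return negative ? -0.0 : 0.0;
        }
        if (exponent > 308) {
            // A non-zero integer significand times 10^309 or more overflows.
            return negative ? Double.NEGATIVE_INFINITY : Double.POSITIVE_INFINITY;
        }
        return Double.NaN;
    }

    public static void main(String[] args) {
        System.out.println(overflowShortcut(false, 0, 555));
        System.out.println(overflowShortcut(false, 1, 555));
        System.out.println(Double.parseDouble("0e555"));
    }
}
```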

BigDecimal parser

Thanks for all the hard work on the double and float parsers. Would there be any chance that you could consider adding support for BigDecimal parsing? A lot of the low level parser could be reused.

BigInteger parser

@wrandelshofer I'm using v0.5.2 and have found that `JavaBigIntegerParser.parseBigInteger(CharSequence str)` accepts hex values like "AAAA" but `new BigInteger(String)` throws a NumberFormatException with "AAAA".

Would it be possible to support being able to disable hex support?
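
For reference, the JDK behaviour being compared against can be checked with a few lines. This sketch uses only `java.math.BigInteger`; the class and helper names are made up for illustration.

```java
import java.math.BigInteger;

final class BigIntegerRadixDemo {
    // True if the one-argument (radix 10) constructor accepts the text.
    static boolean acceptedAsDecimal(String s) {
        try {
            new BigInteger(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptedAsDecimal("AAAA"));   // rejected as decimal
        System.out.println(new BigInteger("AAAA", 16));  // accepted with an explicit radix
    }
}
```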

Originally posted by @pjfanning in #24 (comment)

make it allocation free on happy path

Double d = FastDoubleMath.hexFloatLiteralToDouble(index, isNegative, digits, exponent, virtualIndexOfPoint, exp_number, isDigitsTruncated, skipCountInTruncatedDigits);

why allocate here?? by that time you know it's not a NaN for sure... so instead of returning null you can just return Double.NaN or whatever special constant.
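
A sketch of what is being asked for: a fast path that returns a primitive `double` and uses NaN as the "not handled" marker, so nothing is boxed when the fast path succeeds. The method names and the simple exactness condition (significand below 2^53, exponent from 0 to 22) are assumptions for illustration, not the library's code.

```java
final class SentinelDemo {
    // Powers of ten that are exactly representable as doubles.
    private static final double[] POW10 = {
        1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9, 1e10, 1e11,
        1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22
    };

    // Returns the exact result, or NaN when the fast path does not apply.
    // No Double object is created on either branch.
    static double tryFastPath(long significand, int exponent) {
        if (significand >= 0 && significand <= (1L << 53) - 1
                && exponent >= 0 && exponent <= 22) {
            return significand * POW10[exponent];
        }
        return Double.NaN;
    }

    static double parseParts(long significand, int exponent) {
        double d = tryFastPath(significand, exponent);
        if (!Double.isNaN(d)) {
            return d;
        }
        // Slow path; it allocates, but only when the fast path declined.
        return Double.parseDouble(significand + "e" + exponent);
    }

    public static void main(String[] args) {
        System.out.println(parseParts(15, 1));
        System.out.println(parseParts(15, 30));
    }
}
```

This only works because the caller has already ruled out NaN as a legitimate result by the time the fast path runs, which is the point made above.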

lack of jmh tests is also troubling :(

1.0.0 release only supports very recent JVMs

Jackson still supports Java 8 but fastdoubleparser has at least some classes that have class file major version 66 - might be java 22

Jackson built fine with fastdoubleparser 0.9.0.

This could be a shortcoming of maven plugins - that don't know about Java 22. In fairness, Java 22 is only early access and many build tools really struggle to keep up.

Error:  Failed to execute goal org.apache.maven.plugins:maven-shade-plugin:3.5.1:shade (shade-jackson-core) on project jackson-core: Error creating shaded jar: Problem shading JAR /home/runner/.m2/repository/ch/randelshofer/fastdoubleparser/1.0.0/fastdoubleparser-1.0.0.jar entry META-INF/versions/22/ch/randelshofer/fastdoubleparser/FastDoubleSwar.class: java.lang.IllegalArgumentException: Unsupported class file major version 66 

Edit: This seems to be a shortcoming of maven-shade-plugin but I think I have managed to work around it by excluding the java 22 classes that are in META-INF/versions/22/ch/randelshofer/fastdoubleparser
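
To confirm which Java release a class file targets, the major version can be read from its header: the 4-byte magic number 0xCAFEBABE is followed by a 2-byte minor and a 2-byte major version. A small sketch with made-up class and method names:

```java
final class ClassFileVersion {
    // Reads the major version from the first 8 bytes of a class file.
    static int majorVersion(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("class file header too short");
        }
        int magic = ((classBytes[0] & 0xFF) << 24)
                | ((classBytes[1] & 0xFF) << 16)
                | ((classBytes[2] & 0xFF) << 8)
                | (classBytes[3] & 0xFF);
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        // Bytes 4-5 hold the minor version, bytes 6-7 the major version.
        return ((classBytes[6] & 0xFF) << 8) | (classBytes[7] & 0xFF);
    }

    // Maps a major version to the Java release number (Java 5 and later).
    static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 66};
        int major = majorVersion(header);
        System.out.println("major " + major + " -> Java " + javaRelease(major));
    }
}
```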

Double.parseDouble(...) != FastDoubleParser.parseDouble(...)

I have found another input string for which the return values of Double.parseDouble and FastDoubleParser.parseDouble differ. This one is less important than #6 though as it implies only a very minor loss in precision:

Double.parseDouble("-2.2222222222223e-322"): -2.2E-322
FastDoubleParser.parseDouble("-2.2222222222223e-322"): 0.0

Both this issue and #6 have been found with the open-source JVM fuzzer Jazzer. If you are interested in these kinds of findings, I could add the fuzzer to the project as a PR.
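
Independently of the fuzzer, findings like these can be pinned down with a small differential check against Double.parseDouble. This is a sketch with made-up names; the parser under test is passed in as a function, so nothing here assumes a particular fastdoubleparser API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

final class DifferentialCheck {
    // Returns the inputs for which the candidate parser disagrees with the JDK.
    static List<String> mismatches(ToDoubleFunction<String> candidate, String... inputs) {
        List<String> out = new ArrayList<>();
        for (String in : inputs) {
            // Compare bit patterns so that 0.0 and -0.0 count as different.
            long expected = Double.doubleToLongBits(Double.parseDouble(in));
            long actual = Double.doubleToLongBits(candidate.applyAsDouble(in));
            if (expected != actual) {
                out.add(in);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A deliberately broken candidate that always returns 0.0.
        System.out.println(mismatches(s -> 0.0, "0e555", "-2.2222222222223e-322"));
    }
}
```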

Is this a mistake with hex float parsing?

In this section of FastFloatMath, it checks the significand against a 53-bit number (as if it were testing to see if it is an exactly representable double), but then casts to float, despite the comments repeatedly referring to the code as using doubles. I think the cast to float should probably be a cast to double (and d should be a double), but I'm not familiar with the code.

if (Long.compareUnsigned(significand, 0x1fffffffffffffL) <= 0) {
    // convert the integer into a double. This is lossless since
    // 0 <= i <= 2^53 - 1.
    float d = (float) significand;
    //
    // The general idea is as follows.
    // If 0 <= s < 2^53 then
    // 1) Both s and p can be represented exactly as 64-bit floating-point
    // values (binary64).
    // 2) Because s and p can be represented exactly as floating-point values,
    // then s * p will produce correctly rounded values.
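
The concern can be illustrated without the library: a float has a 24-bit significand, so integers above 2^24 are not all exactly representable, whereas a double holds every integer up to 2^53. A minimal sketch with made-up names:

```java
final class CastExactness {
    // Both helpers are meant for 0 <= v <= 2^53 - 1, the range the check above allows.
    static boolean exactAsFloat(long v) {
        return (long) (float) v == v;
    }

    static boolean exactAsDouble(long v) {
        return (long) (double) v == v;
    }

    public static void main(String[] args) {
        long justAboveFloatLimit = (1L << 24) + 1; // 16777217
        long maxAllowed = (1L << 53) - 1;          // 0x1fffffffffffffL
        System.out.println(exactAsFloat(justAboveFloatLimit));
        System.out.println(exactAsDouble(justAboveFloatLimit));
        System.out.println(exactAsFloat(maxAllowed));
        System.out.println(exactAsDouble(maxAllowed));
    }
}
```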

Please bundle LICENSE/NOTICE files in the produced jar files

I'm upgrading jackson in Apache JMeter, and I found the new jackson version depends on fastdoubleparser.
It turns out fastdoubleparser does not ship with the license, so it is problematic for the consumers.

See apache/jmeter#5831, and the build failure: https://github.com/apache/jmeter/actions/runs/4823397202/jobs/8592678119?pr=5831#step:4:1857

I have created a lot of similar requests, and almost all of them got fixed eventually, see Dependency with "manual" license configuration in apache/jmeter#469

Current issues

  1. The current license is MIT: https://github.com/wrandelshofer/FastDoubleParser/blob/aeeab26365235cc2fbfb68fea2145a4b86a800fd/LICENSE
    However, please note that there's no canonical MIT license text. Every MIT license is different since the copyright is a part of the license text.
    In other words, the line Copyright (c) 2021 Werner Randelshofer, Switzerland is a part of the license, and the license text requires that The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software

It is hard for consumers to comply with the requirement above, especially if fastdoubleparser.jar does not include the license text.

  2. The pom file for fastdoubleparser refers to a different license. See https://repo1.maven.org/maven2/ch/randelshofer/fastdoubleparser/0.8.0/fastdoubleparser-0.8.0.pom
    The URL there is http://www.opensource.org/licenses/mit-license.php, which does not mention Werner Randelshofer.

  3. fastdoubleparser.jar lacks a reference to the license. There are cases when fastdoubleparser.jar appears without the corresponding pom.xml, so if you consider fastdoubleparser.jar alone, it is hard to tell what the license for that artifact is.

Consider relicensing with Apache-2.0

You might want to consider switching to Apache-2.0 license. It has several advantages for the consumers:

  1. The copyright and the license text are separate. In other words, there's a canonical Apache-2.0 license text, and you can put your copyright notice into NOTICE file. In general, it becomes easier to review, since every MIT license is different while every Apache-2.0 is the same.
  2. Apache-2.0 license mentions Grant of Patent License while MIT does not mention patents
  3. With Apache-2.0 you can have a canonical license URL right in pom.xml and MANIFEST.MF
  4. With MIT, you literally force everyone to double-check the license text since no one knows if you have other modifications than the custom copyright.

If you absolutely like MIT, you might go with MIT or Apache-2.0, however, I'm not sure if you want that complication (as it would be impossible to express in pom.xml)

Fix steps

  1. Include the license text into the jars under META-INF/LICENSE, META-INF/NOTICE, etc. It would enable consumers to get up-to-date licenses when they depend on fastdoubleparser.
  2. Fix pom.xml to point to the proper license text (e.g. a permalink to GitHub). The current link http://www.opensource.org/licenses/mit-license.php is invalid as it points to a wrong license text.
  3. Add Bundle-License: Apache-2.0 (or Bundle-License: MIT; link=...) manifest entry (where Apache-2.0 is SPDX identifier, see https://osgi.org/specification/osgi.core/7.0.0/framework.module.html#framework.module-bundle-license )

FastDoubleParser doesn't support all input formats as the default OpenJDK Float/Double parsers

The FastDoubleParser was recently introduced in Jackson through the issue FasterXML/jackson-core#577 and is 3-4x faster than the version that's implemented in OpenJDK. This is fantastic news, since many numerical processing workloads would benefit from this.

However, the OpenJDK Double/Float parsers support a variety of input formats that the FastDoubleParser will fail on, so it can cause unexpected regressions when used.

For example, the FastDoubleParser will fail with a NumberFormatException on these example patterns (there are more to be found in the OpenJDK Double/Float tests):

1.1e-23f
0x.003p12f
0x1.17742db862a4P-1d

I think apart from the first one in this list, the rest are all hexadecimal if I'm not mistaken.
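
For comparison, the JDK side can be checked directly. The grammar documented for Double.valueOf allows an optional trailing f, F, d, or D type suffix on both decimal and hexadecimal forms. This sketch uses only the JDK; the class name is made up.

```java
final class JdkFloatSyntaxDemo {
    public static void main(String[] args) {
        String[] inputs = {"1.1e-23f", "0x.003p12f", "0x1.17742db862a4P-1d"};
        for (String in : inputs) {
            // Each of these parses without a NumberFormatException on the JDK.
            System.out.println(in + " -> " + Double.parseDouble(in));
        }
    }
}
```

The second and third inputs are hexadecimal floating-point strings (significand in hex, binary exponent after p or P); the first is a plain decimal with a float suffix.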

issue with module-info classes in v0.9.0 release

There are multiple module-info classes in the v0.9.0 jar. In v0.8.0, there was just the versions/9/module-info.class.

In v0.9.0, there are module-info.class files in all the versions dirs.

This is causing FasterXML/jackson-core#1027

Would it be possible to get some background on the v0.9.0 changes, so that I can work out what to do with the jackson-core issues?
