Comments (8)

matthew-dean avatar matthew-dean commented on August 17, 2024

Actually, I wonder if it'd be even faster to generate an enum for literals that consists of a single object that maps used literals to their numeric equivalents for ultra-fast comparison. As in:

var Literal = {
  w:  119,
  h: 104
};
...
{"name": "TokenName", "symbols": [ Literal.w, Literal.h ]},

Although each Rule could probably be optimized as well.
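
A minimal sketch of how numeric literals might be compared, assuming a hypothetical `matches` helper and `matchRule` driver (neither is part of nearley; they only illustrate the char-code comparison):

```javascript
// Hypothetical lookup table: literal character -> char code.
var Literal = {
  w: 119,
  h: 104
};

// Hypothetical rule whose symbols are numbers instead of {"literal": "w"} objects.
var rule = { name: "TokenName", symbols: [Literal.w, Literal.h] };

// Compare one input character against a numeric symbol by char code.
function matches(symbol, input, pos) {
  return input.charCodeAt(pos) === symbol;
}

// Check that the whole input matches the rule's symbols, one character each.
function matchRule(rule, input) {
  if (input.length !== rule.symbols.length) return false;
  return rule.symbols.every(function (symbol, i) {
    return matches(symbol, input, i);
  });
}
```

Whether this is actually faster than comparing one-character strings would need to be measured; the sketch only shows the shape of the idea.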

from nearley.

matthew-dean avatar matthew-dean commented on August 17, 2024

Something like this:

    {"name": " string$18", "symbols": [{"literal":"-"}, {"literal":"-"}], "postprocess": function joiner(d) {
        return d.join('');
    }},
    {"name": "UnaryOperator", "symbols": [" string$18"]},
    {"name": "UnaryOperator", "symbols": [{"literal":"+"}]},
    {"name": "UnaryOperator", "symbols": [{"literal":"-"}]},
    {"name": "UnaryOperator", "symbols": [{"literal":"~"}]},
    {"name": "UnaryOperator", "symbols": [{"literal":"!"}]},

... could probably have at least duplication removed and be like:

    new Rule(Token.string_18, [
        { "symbols": [Literal.DASH, Literal.DASH], "postprocess": function joiner(d) {
            return d.join('');
        } }
    ]),
    new Rule(Token.UnaryOperator, [
        { "symbols": [Token.string_18] },
        { "symbols": [Literal.PLUS] },
        { "symbols": [Literal.DASH] },
        { "symbols": [Literal.TILDE] },
        { "symbols": [Literal.EXCLAMATION_POINT] }
    ])
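
To make the grouping concrete, here is a rough sketch, assuming a hypothetical `Rule` container with a `flatten` method (this is not nearley's actual `Rule` class). It shows that the grouped form carries the same information as the flat list, since it can be expanded back into one record per alternative:

```javascript
// Hypothetical grouped-rule container: one name, many alternatives.
function Rule(name, alternatives) {
  this.name = name;
  this.alternatives = alternatives;
}

// Expand the grouped rule back into flat {name, symbols, postprocess} records.
Rule.prototype.flatten = function () {
  var name = this.name;
  return this.alternatives.map(function (alt) {
    var flat = { name: name, symbols: alt.symbols };
    if (alt.postprocess) flat.postprocess = alt.postprocess;
    return flat;
  });
};

var unary = new Rule("UnaryOperator", [
  { symbols: [{ literal: "+" }] },
  { symbols: [{ literal: "-" }] },
  { symbols: [{ literal: "~" }] },
  { symbols: [{ literal: "!" }] }
]);

// One flat record per alternative, each repeating the shared name.
var flat = unary.flatten();
```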

kach avatar kach commented on August 17, 2024

Hey @matthew-dean

This isn't something I've thought too much about, since nobody has written a grammar large enough for this to be an issue yet. I think the JIT point you bring up is by far the most important since it might be a significant speed boost.

Initially, of course, back in the days of yore when there was no .ne language, I wrote the initial (subsequently bootstrapped) grammar in the JS object format by hand, and so it needed to be verbose and easy to debug.

.... I suppose you're going to make me write this, lol.

If you have the time, then that would be amazing! PRs are always welcome here. :-)

matthew-dean avatar matthew-dean commented on August 17, 2024

The only problem with me doing this is I haven't gotten my grammar to work yet 😢. Rewriting the JS output would make more sense when testing against a working grammar. I suppose though I could use the JavaScript one, but I think you've said in the past that it may have issues, no? Is there a support group where I could talk about my feelings re: nearley?

kach avatar kach commented on August 17, 2024

> The only problem with me doing this is I haven't gotten my grammar to work yet

:-(

Anything I can help with?

> Rewriting the JS output would make more sense when testing against a working grammar.

There exist a couple nice, large in-the-wild grammars you can play with. Nearley's own bootstrapped grammar is pretty stable. Shrdlite and milk-lang are two others.

> I suppose though I could use the JavaScript one, but I think you've said in the past that it may have issues, no?

Yeah, someday over the river @JacobEdelman will good-ify it. :-)

> Is there a support group where I could talk about my feelings re: nearley?

If you need, uh, counseling, you can pm me on IRC/Freenode (hardmath123). I can register #nearley if more people want to chat.

(Generic emotional support when Earley parsing gets you down is available at #marpa from the amazing @jeffreykegler, King of the Earley Parsers.)

matthew-dean avatar matthew-dean commented on August 17, 2024

Are you on Twitter? You could DM me on https://twitter.com/matthewdeaners.

kach avatar kach commented on August 17, 2024

No, sorry, I don't do social media. You can email me.

tjvr avatar tjvr commented on August 17, 2024

> Nearley doesn't need to know the names of the tokens, does it?

If we want to output sensible error messages (Expected string), then the names of the tokens are useful.

> Objects that have the same "shape" can be optimized by the JIT compiler, whereas { "literal": "w" } and { "literal": "h" } won't necessarily be detected as having the same shape, IIRC

Actually, V8 should share pseudoclasses between plain objects too!


My feelings about all this are that yes, you could shave a few bytes off the file size by using integers instead of names; but the difference will disappear after gzipping anyway, so why make your life difficult? :)
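
One way to check this for a given grammar is to compare raw and gzipped sizes directly. The sketch below uses Node's built-in `zlib.gzipSync`; the `buildRules` and `sizes` helpers and the synthetic rule table are assumptions made up for illustration, so the actual numbers will differ for a real compiled grammar:

```javascript
var zlib = require("zlib");

// Build a synthetic rule table; `useIntegers` swaps rule names for numeric ids.
function buildRules(count, useIntegers) {
  var rules = [];
  for (var i = 0; i < count; i++) {
    rules.push({
      name: useIntegers ? i % 10 : "UnaryOperator" + (i % 10),
      symbols: [{ literal: "-" }, { literal: "+" }]
    });
  }
  return JSON.stringify(rules);
}

// Measure the serialized text before and after gzip compression.
function sizes(text) {
  return {
    raw: Buffer.byteLength(text),
    gzipped: zlib.gzipSync(text).length
  };
}

var named = sizes(buildRules(500, false));
var numbered = sizes(buildRules(500, true));

console.log("names:   ", named);
console.log("integers:", numbered);
```

Running the same measurement on the real generated file would show whether the gap between the two forms survives compression.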
