This is not meant to be a criticism but rather, I'm wondering why the generated JS is

Why is the generated JS so verbose? about nearley HOT 8 CLOSED

matthew-dean commented on August 17, 2024

Why is the generated JS so verbose?

from nearley.

Comments (8)

matthew-dean commented on August 17, 2024

Actually, I wonder if it'd be even faster to generate an enum for literals that consists of a single object that maps used literals to their numeric equivalents for ultra-fast comparison. As in:

var Literal = {
  w:  119,
  h: 104
};
...
{"name": "TokenName", "symbols": [ Literal.w, Literal.h ]},

Although each Rule could probably be optimized as well.
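A minimal sketch of the idea above, assuming literals are single characters compared by UTF-16 code unit. The `matchesLiteral` and `matchesRule` helpers are hypothetical names used only for illustration; they are not part of nearley, and nearley's real parser does not work this way.

```javascript
// Hypothetical sketch: not nearley's actual output or API.
// Map each literal used by the grammar to its UTF-16 code unit.
const Literal = {
  w: "w".charCodeAt(0), // 119
  h: "h".charCodeAt(0), // 104
};

// A rule whose symbols are numeric codes instead of {"literal": "w"} objects.
const rule = { name: "TokenName", symbols: [Literal.w, Literal.h] };

// Compare the input character at `pos` against one numeric symbol.
// charCodeAt returns NaN past the end of the input, so this is false there.
function matchesLiteral(symbol, input, pos) {
  return input.charCodeAt(pos) === symbol;
}

// Check whether `input` starts with every symbol of `rule`, in order.
function matchesRule(rule, input) {
  return rule.symbols.every((symbol, i) => matchesLiteral(symbol, input, i));
}
```

Whether an integer comparison like this is measurably faster than comparing one-character strings would need to be benchmarked; the sketch only shows the shape of the data.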

matthew-dean commented on August 17, 2024

Something like this:

{"name": " string$18", "symbols": [{"literal":"-"}, {"literal":"-"}], "postprocess": function joiner(d) {
        return d.join('');
    }},
    {"name": "UnaryOperator", "symbols": [" string$18"]},
    {"name": "UnaryOperator", "symbols": [{"literal":"+"}]},
    {"name": "UnaryOperator", "symbols": [{"literal":"-"}]},
    {"name": "UnaryOperator", "symbols": [{"literal":"~"}]},
    {"name": "UnaryOperator", "symbols": [{"literal":"!"}]},

... could probably at least have the duplication removed, and look like:

    new Rule(Token.string_18, [
     { "symbols": [Literal.DASH, Literal.DASH], "postprocess": function joiner(d) {
        return d.join('');
     } } ]),
    new Rule(Token.UnaryOperator, [
     { "symbols": [Token.string_18] },
     { "symbols": [Literal.PLUS] },
     { "symbols": [Literal.DASH] },
     { "symbols": [Literal.TILDE] },
     { "symbols": [Literal.EXCLAMATION_POINT] } ] )
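As a rough illustration of the grouping proposed above, the sketch below stores each rule name once with a list of alternatives, then expands it back into the flat `{name, symbols, postprocess}` entries shown in the generated output. The `Rule` class and `flatten` function are hypothetical names for this example and are not part of nearley.

```javascript
// Hypothetical sketch: `Rule` and `flatten` are illustrative, not nearley APIs.
class Rule {
  constructor(name, alternatives) {
    this.name = name;                 // rule name, stored once
    this.alternatives = alternatives; // each has `symbols` and optional `postprocess`
  }
}

function joiner(d) {
  return d.join("");
}

// The grouped form: one Rule per name instead of one entry per alternative.
const grouped = [
  new Rule(" string$18", [
    { symbols: [{ literal: "-" }, { literal: "-" }], postprocess: joiner },
  ]),
  new Rule("UnaryOperator", [
    { symbols: [" string$18"] },
    { symbols: [{ literal: "+" }] },
    { symbols: [{ literal: "-" }] },
    { symbols: [{ literal: "~" }] },
    { symbols: [{ literal: "!" }] },
  ]),
];

// Expand grouped rules back into the flat one-entry-per-alternative list.
function flatten(rules) {
  return rules.flatMap((rule) =>
    rule.alternatives.map((alt) => ({ name: rule.name, ...alt }))
  );
}

const flat = flatten(grouped);
```

Because `flatten` recovers the original list, the grouped form carries the same information; it only avoids repeating the rule name for every alternative.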

kach commented on August 17, 2024

Hey @matthew-dean

This isn't something I've thought too much about, since nobody has written a grammar large enough for this to be an issue yet. I think the JIT point you bring up is by far the most important since it might be a significant speed boost.

Initially, of course, back in the days of yore when there was no .ne language, I wrote the initial (subsequently bootstrapped) grammar in the JS object format by hand, and so it needed to be verbose and easy to debug.

.... I suppose you're going to make me write this, lol.

If you have the time, then that would be amazing! PRs are always welcome here. :-)

matthew-dean commented on August 17, 2024

The only problem with me doing this is I haven't gotten my grammar to work yet 😢. Rewriting the JS output would make more sense when testing against a working grammar. I suppose though I could use the JavaScript one, but I think you've said in the past that it may have issues, no? Is there a support group where I could talk about my feelings re: nearley?

kach commented on August 17, 2024

> The only problem with me doing this is I haven't gotten my grammar to work yet

:-(

Anything I can help with?

> Rewriting the JS output would make more sense when testing against a working grammar.

There exist a couple nice, large in-the-wild grammars you can play with. Nearley's own bootstrapped grammar is pretty stable. Shrdlite and milk-lang are two others.

> I suppose though I could use the JavaScript one, but I think you've said in the past that it may have issues, no?

Yeah, someday over the river @JacobEdelman will good-ify it. :-)

> Is there a support group where I could talk about my feelings re: nearley?

If you need, uh, counseling, you can pm me on IRC/Freenode (hardmath123). I can register #nearley if more people want to chat.

(Generic emotional support when Earley parsing gets you down is available at #marpa from the amazing @jeffreykegler, King of the Earley Parsers.)

matthew-dean commented on August 17, 2024

Are you on Twitter? You could DM me on https://twitter.com/matthewdeaners.

kach commented on August 17, 2024

No, sorry, I don't do social media. You can email me.

tjvr commented on August 17, 2024

> Nearley doesn't need to know the names of the tokens, does it?

If we want to output sensible error messages (Expected string), then the names of the tokens are useful.

> Objects that have the same "shape" can be optimized by the JIT compiler, whereas { "literal": "w" } and { "literal": "h" } won't necessarily be detected as having the same shape, IIRC

Actually, V8 should share pseudoclasses between plain objects too!

My feelings about all this are that yes, you could shave a few bytes off the file size by using integers instead of names; but the difference will disappear after gzipping anyway, so why make your life difficult? :)
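One way to check the gzip point for a particular grammar is to measure it. The sketch below assumes Node.js and uses its built-in `zlib` module; the `sizes` helper and the two sample strings are made up for illustration and are not real nearley output.

```javascript
const zlib = require("zlib");

// Return the raw and gzipped byte sizes of a piece of generated source text.
function sizes(source) {
  const buffer = Buffer.from(source, "utf8");
  return { raw: buffer.length, gzipped: zlib.gzipSync(buffer).length };
}

// Two synthetic encodings of the same 200 literal symbols (illustrative only).
const verbose = Array(200).fill('{"literal":"-"}').join(",");
const compact = Array(200).fill("45").join(",");

console.log("verbose:", sizes(verbose));
console.log("compact:", sizes(compact));
```

Running the same comparison on a real compiled grammar file, in both its current and a compacted form, would show how much of the difference survives compression. It says nothing about parse speed, which is a separate question.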
