Comments (6)

renatahodovan commented on May 13, 2024

Hi @kaby76

It's not a bell curve but a (1/x)^n curve (in this case (1/2)^n), which is exactly what we expect from quantifiers by definition/implementation. Quantifiers are generated according to the following pseudocode:

source_text = UnparserRule(name='source_text')
while random_decision():
    source_text += UnparserRule(name='description')

This means that, with a fair coin, the probability of generating at least one description is 1/2, at least two descriptions is (1/2)^2, at least three is (1/2)^3, etc., i.e., (1/2)^n for at least n, which is what your plot shows as well.
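To make the shape of that curve concrete, here is a small simulation of the loop above. It is only a sketch: it assumes random_decision() is a fair coin flip, and the helper names (count_repetitions, histogram, p_exactly) are made up for this illustration and are not part of grammarinator. Under that assumption the probability of exactly n repetitions is (1/2)^(n+1), i.e. n heads followed by one tail, so about half of the runs should produce zero descriptions.

```python
import random
from collections import Counter


def count_repetitions(rng):
    """Count how many times the loop body runs before the coin says stop."""
    n = 0
    while rng.random() < 0.5:  # assumed fair coin, standing in for random_decision()
        n += 1
    return n


def histogram(samples, seed=0):
    """Tally repetition counts over `samples` independent runs."""
    rng = random.Random(seed)
    return Counter(count_repetitions(rng) for _ in range(samples))


def p_exactly(n):
    """Theoretical probability of exactly n repetitions: n heads, then one tail."""
    return 0.5 ** (n + 1)


if __name__ == "__main__":
    total = 10_000
    counts = histogram(total)
    # Print the observed frequency next to the theoretical value for each count.
    for n in sorted(counts):
        print(n, counts[n] / total, p_exactly(n))
```

The observed frequencies should fall off by roughly a factor of two per step, which is a geometric distribution rather than a bell curve.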

from grammarinator.

akosthekiss commented on May 13, 2024

@kaby76 I was just about to leave a comment pointing you to models, in case you want to tweak the default "let's flip a coin" approach. You can write your own decision model that has the same API as DefaultModel. Every random decision of the generated fuzzer (e.g., how to choose an alternative from A | B or how many times to iterate over *) actually happens in the model. And the default model can be replaced even from the command line using the -m or --model switch:

https://github.com/renatahodovan/grammarinator/blob/master/grammarinator/generate.py#L237-L238

As the documentation of models is incomplete (so to speak), let me introduce quantify(self, node, idx, min, max). Whenever a quantifier is reached during test case generation, the model's quantify method is called in a for loop. quantify should be a generator, and it should yield as many times as the loop is expected to iterate, which must be between min and max times (inclusive). To help quantify make the decision, the current node, for which children are being generated, is passed as an argument; e.g., node.name is the name of the grammar rule corresponding to the node. Moreover, idx is also passed as an argument; it uniquely identifies the quantifier within the rule. (E.g., in S: A* B?;, * has index 0 and ? has index 1.)
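As a rough sketch of that contract, the snippet below pairs a quantify generator with the kind of for loop that consumes it. CoinFlipModel and expand are hypothetical names for this illustration only; this is not grammarinator's DefaultModel or its generated code, and the fair coin is an assumption.

```python
import random
from math import inf


class CoinFlipModel:
    """Hypothetical stand-in for a decision model; not grammarinator's DefaultModel."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)

    def quantify(self, node, idx, min, max):
        # Yield once per iteration the caller is expected to perform.
        count = 0
        # Mandatory iterations: always yield at least `min` times.
        while count < min:
            yield
            count += 1
        # Optional iterations: continue on heads, never exceeding `max`.
        while count < max and self._rng.random() < 0.5:
            yield
            count += 1


def expand(model, node, idx, min, max):
    """Mimic the caller: the loop body runs once per yield from quantify."""
    children = 0
    for _ in model.quantify(node, idx, min=min, max=max):
        children += 1  # a real generator would create a child node here
    return children


if __name__ == "__main__":
    model = CoinFlipModel(seed=42)
    # A `*` quantifier corresponds to min=0, max=inf; `?` to min=0, max=1.
    print(expand(model, None, 0, min=0, max=inf))
    print(expand(model, None, 1, min=0, max=1))
```

The parameter names min and max shadow the Python builtins here only because they mirror the signature quoted above.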

I know that the above is a bit brief, but I hope it helps.

BTW, there is also a subclass of DefaultModel, called DispatchingModel. It simplifies tweaking the random decisions in some selected rules by writing methods named like quantify_<RULE>. E.g., in your example:

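import grammarinator.runtime

class VerilogModel(grammarinator.runtime.DispatchingModel):
    # Called only for quantifiers inside the source_text rule.
    def quantify_source_text(self, node, idx, min, max):
        # Each yield lets the generation loop run once more.
        yield
        yield
        yield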

(And this would create test cases that always contained exactly three descriptions. The rest of the quantifiers would still use the flip-the-coin approach.)
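For readers wondering how a method name like quantify_<RULE> can end up being picked, here is a toy stand-in that looks the method up by name and otherwise falls back to coin flips. It only illustrates the naming idea: ToyDispatchingModel and ThreeDescriptions are invented for this sketch, the SimpleNamespace node is a placeholder, and none of this is claimed to be how grammarinator's DispatchingModel is actually implemented.

```python
import random
from types import SimpleNamespace


class ToyDispatchingModel:
    """Toy illustration of name-based dispatch; not grammarinator's DispatchingModel."""

    def quantify(self, node, idx, min, max):
        # Prefer a rule-specific method named quantify_<RULE> if one exists.
        handler = getattr(self, 'quantify_' + node.name, None)
        if handler is not None:
            yield from handler(node, idx, min, max)
            return
        # Fallback: mandatory iterations first, then coin flips up to max.
        count = 0
        while count < min or (count < max and random.random() < 0.5):
            yield
            count += 1


class ThreeDescriptions(ToyDispatchingModel):
    def quantify_source_text(self, node, idx, min, max):
        # Always iterate exactly three times for the source_text rule.
        for _ in range(3):
            yield


def iterations(model, rule_name, min, max):
    """Count how many times a quantifier in the given rule would iterate."""
    node = SimpleNamespace(name=rule_name)  # minimal stand-in for a tree node
    return sum(1 for _ in model.quantify(node, 0, min, max))


if __name__ == "__main__":
    print(iterations(ThreeDescriptions(), 'source_text', 0, float('inf')))  # prints 3
```

Rules without a matching method (any rule other than source_text in this toy) still go through the coin-flip fallback.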


CityOfLight77 commented on May 13, 2024

I'm facing the same issue: all the grammars I tested generated empty files, but I don't know whether that's intended or not.


renatahodovan commented on May 13, 2024

Hi @kaby76 and @CityOfLight77

It's not a surprise if you look carefully at the grammar the test cases are generated from. In the case of VerilogGenerator, the start rule used in the example is source_text. Its definition in the grammar is:

// START SYMBOL
source_text
	: description* EOF
	;

It means that source_text must be constructed from zero or more descriptions (due to the Kleene star quantifier * after description), i.e., empty files should be recognized by a Verilog parser.
Grammarinator does exactly the same in the opposite direction: before every generation it rolls the dice to decide whether to generate zero or more descriptions (i.e., whether to generate an empty file or not).

Although this random decision about expanding a zero-or-more quantifier is quite useful deeper in the derivation tree to avoid infinite recursion, near the start rule it's worth manually replacing the * with + (Kleene plus, the "one or more" quantifier) to avoid empty output files.

I hope this helps!

@CityOfLight77 If it doesn't solve your problem with empty files, please share the grammar and I'll look into it.

Cheers,
Reni


kaby76 commented on May 13, 2024

I ran grammarinator-generate.exe VerilogGenerator.VerilogGenerator --sys-path . -d 10 -n 100 -r source_text --serializer grammarinator.runtime.simple_space_serializer, then used Trash to get the number of children of the source_text rule (for i in tests/*; do trparse -t gen $i 2>/dev/null | trxgrep ' /source_text/*' | trtext -c ; done > o), and made a histogram of the number of children of source_text across the 100 generated tests. It seems the "sampling" for the LL-derivations follows a bell curve. Why is that?

[Figure: histogram of the number of children of source_text across the 100 generated tests]


kaby76 commented on May 13, 2024

Thanks. That explains quite a bit of what the generated code is doing. I can now follow through on what for _ in self._model.quantify(current, 0, min=0, max=inf) does.

