
Comments (3)

Zac-HD commented on August 17, 2024

That's really nice work, thanks for sharing it!

I think pysource-codegen is a great complement to hypothesmith, which works somewhat differently. (I'll focus on from_node() here, since from_grammar() is older and pretty janky)

  • hypothesmith uses the libcst concrete syntax tree to generate source code, by starting from a node type and recursively choosing valid subcomponents.
    • Hypothesis focusses on smaller test cases, with a semi-arbitrary limit of 8K of entropy because skipping larger inputs usually finds more bugs per minute (with more, diverse, smaller inputs). If you need 10kloc though, a new tool is probably the way to go.
    • Sometimes generating weird strings, comments, whitespace, etc. uncovers bugs too.
    • I ensure valid generation via a combination of declaring how to generate valid instances of each node type, and rejection sampling. This takes a while to implement but means there's no fixup stage, and keeps internal reduction working.
  • Hypothesis' shrinking works by replaying generation, but with different choices. Here's the paper on how that works (pdf).
    • As a neat side effect, this allows us to use targeted PBT, coverage-guided fuzzing, SMT solving with crosshair, and all the other nice stuff built into or integrated with Hypothesis.

Overall though, hypothesmith is an unfinished proof-of-concept and I'm delighted to see other useful tools being created! I think it's unlikely that we could directly merge our tools, but if you'd like to contribute to hypothesmith or try using libcst I'd be happy to help out 🙂

from hypothesmith.

15r10nk commented on August 17, 2024

Thank you, your answer was very helpful.

I think both tools have their strengths.
I intentionally chose the Python ast over libcst, because my goal is to test executing.
executing operates on Python bytecode, which is mostly independent of whitespace. Logical errors are more important in this case. Reusing the same name for several variables leads to more special cases than giving everything a different name (function scoping, nonlocal/global variables).
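
A small sketch of both points, using only the standard library. It is an illustration rather than code from pysource-codegen or executing; the source strings and the helper `run_outer` are made up for this example. The first part shows that formatting and comments do not survive into the ast, and the second shows how reusing one name changes behavior depending on scoping declarations.

```python
import ast

# Same statement, different whitespace and a trailing comment.
compact = "x=1+2"
spaced = "x   =   1 +   2   # a comment"

# ast.dump() omits line/column attributes by default, so the trees match.
same_tree = ast.dump(ast.parse(compact)) == ast.dump(ast.parse(spaced))
print(same_tree)

# Reusing the name `x` in a nested function: the assignment creates a new
# local variable and leaves the outer `x` untouched.
SHADOWING = """
def outer():
    x = 1
    def inner():
        x = 2
    inner()
    return x
"""

# With `nonlocal`, the same assignment rebinds the outer `x` instead.
REBINDING = """
def outer():
    x = 1
    def inner():
        nonlocal x
        x = 2
    inner()
    return x
"""


def run_outer(source):
    """Execute the source in a fresh namespace and call its outer()."""
    namespace = {}
    exec(source, namespace)
    return namespace["outer"]()


print(run_outer(SHADOWING), run_outer(REBINDING))
```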

As I said earlier, I am a big fan of Hypothesis. Thank you for your great work. Maybe I will find a way to integrate pysource-codegen into the Hypothesis world. I will keep it in mind while I refactor my code.


Zac-HD commented on August 17, 2024

Sounds great!

Hypothesis has some internal logic to generate things like duplicated names (based on mutating existing test cases), but I think at a lower rate than pysource-codegen 🙂

