pschanely / hypothesis-crosshair

Level-up your Hypothesis tests with CrossHair

License: MIT License

Python 100.00%

hypothesis-crosshair's Issues

Two problems on first try

Hi,

I was curious to try hypothesis-crosshair, this is just reporting some issues which popped up in my first attempt.

1. Import of deprecated package

  File "/home/jobh/mambaforge/envs/hypothesis/lib/python3.11/site-packages/crosshair/libimpl/relib.py", line 2, in <module>
    from sre_parse import ANY  # type: ignore
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jobh/mambaforge/envs/hypothesis/lib/python3.11/sre_parse.py", line 2, in <module>
    warnings.warn(f"module {__name__!r} is deprecated",
DeprecationWarning: module 'sre_parse' is deprecated
================================================================================= short test summary info ==================================================================================
FAILED crosshair/test_basic.py::test_needs_solver - DeprecationWarning: module 'sre_parse' is deprecated
==================================================================================== 1 failed in 0.07s =====================================================================================
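One possible way to avoid the warning above is a version-conditional import. The sketch below is an assumption rather than CrossHair's actual fix: it relies on `re._parser`, a private CPython module that took over from `sre_parse` in Python 3.11 and may change without notice, and the alias `sre_parse_compat` is made up for illustration.

```python
import sys
import warnings

# Escalate DeprecationWarning to an error so a bad import path fails loudly.
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    if sys.version_info >= (3, 11):
        # Private module (assumption: it keeps exposing the same constants).
        from re import _parser as sre_parse_compat
    else:
        import sre_parse as sre_parse_compat

# The constant that the failing import in the traceback was after.
ANY = sre_parse_compat.ANY
```

Because the module is private, pinning this behind a single compatibility alias keeps any future breakage in one place.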

I think this module should be imported as re._parser on Python 3.11+. Next, with -Wignore:

2. Error on first run

  File "/home/jobh/src/hypothesis/crosshair/test_basic.py", line 4, in test_needs_solver
    @given(st.integers())
                   ^^^
  File "/home/jobh/src/hypothesis/hypothesis-python/src/hypothesis/core.py", line 1658, in wrapped_test
    raise the_error_hypothesis_found
  File "/home/jobh/mambaforge/envs/hypothesis/lib/python3.11/site-packages/crosshair/libimpl/builtinslib.py", line 1101, in __hash__
    return self.__index__().__hash__()
           ^^^^^^^^^^^^^^^^
  File "/home/jobh/mambaforge/envs/hypothesis/lib/python3.11/site-packages/crosshair/libimpl/builtinslib.py", line 1112, in __index__
    space = context_statespace()
            ^^^^^^^^^^^^^^^^^^^^
  File "/home/jobh/mambaforge/envs/hypothesis/lib/python3.11/site-packages/crosshair/statespace.py", line 247, in context_statespace
    raise CrosshairInternal
crosshair.util.CrosshairInternal
================================================================================= short test summary info ==================================================================================
FAILED crosshair/test_basic.py::test_needs_solver - crosshair.util.CrosshairInternal
==================================================================================== 1 failed in 0.13s =====================================================================================

Interestingly, this failure goes away if I run again (until the .hypothesis db directory is removed). So it seems that the example has been found, and has been recorded by hypothesis, before the error happens.

================================================================================= short test summary info ==================================================================================
FAILED crosshair/test_basic.py::test_needs_solver - assert 123456789 != 123456789
==================================================================================== 1 failed in 0.37s =====================================================================================

(yay!)

`Duplicate type "<class 'array.array'>" registered` from repeated imports?

I'm not really sure what's causing this, but it accounts for 351 out of 386 failures on the latest nocover run of HypothesisWorks/hypothesis#4034. I suspect most of the affected tests would then pass, but either way triage would be a lot easier.

Build logs here; representative traceback:

  File "python3.10/site-packages/hypothesis/core.py", line 1699, in wrapped_test
    raise the_error_hypothesis_found
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "python3.10/site-packages/hypothesis_crosshair_provider/crosshair_provider.py", line 8, in <module>
    import crosshair.core_and_libs  # Needed for patch registrations
  File "python3.10/site-packages/crosshair/core_and_libs.py", line 129, in <module>
    _make_registrations()
  File "python3.10/site-packages/crosshair/core_and_libs.py", line 74, in _make_registrations
    arraylib.make_registrations()
  File "python3.10/site-packages/crosshair/libimpl/arraylib.py", line 157, in make_registrations
    register_type(array, make_array)
  File "python3.10/site-packages/crosshair/core.py", line 553, in register_type
    raise CrosshairInternal(f'Duplicate type "{typ}" registered')
crosshair.util.CrosshairInternal: Duplicate type "<class 'array.array'>" registered

`TypeError` on `Fraction` + `math.log`

https://crosshair-web.org/?crosshair=0.1&python=3.8&gist=53cb15615819ffd4e29fe0c29b1cfe38

This may not be the smallest counterexample, but it was as small as I could easily make it. The min(n, 1) is there to avoid crosshair finding domain errors.

This is the root cause of hypothesis-crosshair erroring on this property:

from hypothesis import given, strategies as st

@given(st.decimals(min_value=1.0, max_value=1.5))
def f(n):
    pass
f()

Can we ask Z3 to maximize the value of arguments to `hypothesis.target()`?

Talking to a friend about using hypothesis+crosshair as an easy-to-use Z3 interface, my main caveat was that we can find counterexamples but not use Z3's optimization power. But if we hacked on this a bit, maybe we could...

Obviously it'd only work for cases where hypothesis.target() is called with a symbolic object, but so long as we let you add a shim to report those cases I think this would work!

Infinite recursion with nested properties

It's possible this is a symptom of a separate issue. I haven't looked into it and am just logging it to ensure we don't forget.

This seems superficially similar to #11, but I don't know if it's related. I don't know if using timedelta is critical to reproducing or not (when I tried to manually shrink via st.timedeltas -> st.integers, I got #11).

from hypothesis import given, strategies as st

@given(st.timedeltas())
def outer(val):
    @given(st.timedeltas(min_value=val, max_value=val))
    def inner(v):
        assert v == val
    inner()
outer()

Traceback (most recent call last):
  File "/Users/tybug/Desktop/Liam/coding/sandbox/hypothesis_/sandbox4.py", line 31, in <module>
    outer()
  File "/Users/tybug/Desktop/Liam/coding/sandbox/hypothesis_/sandbox4.py", line 26, in outer
    def outer(val):
                   ^
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/core.py", line 1657, in wrapped_test
    raise the_error_hypothesis_found
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/core.py", line 1624, in wrapped_test
    state.run_engine()
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/core.py", line 1152, in run_engine
    runner.run()
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/engine.py", line 757, in run
    self._run()
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/engine.py", line 1219, in _run
    self.generate_new_examples()
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/engine.py", line 975, in generate_new_examples
    self.test_function(data)
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/engine.py", line 531, in test_function
    raise Flaky(
hypothesis.errors.Flaky: Inconsistent results from replaying a failing test case!
  last: INTERESTING from RecursionError at /opt/homebrew/lib/python3.12/site-packages/crosshair/tracers.py:163
  this: OVERRUN

Rare `PathTimeout` errors in `provider.realize(...)`

#20 did indeed fix the dozens of "returning a symbolic type" errors that I was getting before, but in a few rare cases it looks like that has uncovered a new failure mode: this CI job has three tests failing with

  File ".../hypothesis_crosshair_provider/crosshair_provider.py", line 293, in realize
    return self.export_value(value)
  File ".../hypothesis_crosshair_provider/crosshair_provider.py", line 280, in export_value
    return deep_realize(value)
  File ".../crosshair/core.py", line 258, in deep_realize
    return deepcopyext(value, CopyMode.REALIZE, {})
  File ".../crosshair/copyext.py", line 47, in deepcopyext
    obj = obj.__ch_realize__()  # type: ignore
  File ".../crosshair/libimpl/builtinslib.py", line 3819, in __ch_realize__
    return bytes(tracing_iter(self.inner))
  File ".../crosshair/tracers.py", line 463, in tracing_iter
    value = next(itr)
  File ".../crosshair/libimpl/builtinslib.py", line 2537, in __iter__
    if not space.smt_fork(idx < my_smt_len):
  File ".../crosshair/statespace.py", line 1043, in smt_fork
    return self.choose_possible(expr, probability_true)
  File ".../crosshair/statespace.py", line 854, in choose_possible
    self.check_timeout()
  File ".../crosshair/statespace.py", line 841, in check_timeout
    raise PathTimeout
crosshair.util.PathTimeout

I'm not really sure what we should do here. Maybe we should have a known exception to indicate that the current test case can't be realized so that Hypothesis can skip it as if for assume(False)? (raised/caught here) Increasing the timeout will only make the problem less common rather than fixing it, but at the cost of rather slow tests. Maybe that - or removing it entirely - is worth it though?
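The "known exception" idea could look roughly like the sketch below. Everything here is hypothetical: `CannotRealize`, `Status`, and `run_test_case` are illustrative names, not part of Hypothesis or CrossHair, and the real hook points would differ.

```python
from enum import Enum
from typing import Any, Callable


class Status(Enum):
    VALID = "valid"
    INVALID = "invalid"  # treated like assume(False): skipped, not a failure


class CannotRealize(Exception):
    """Hypothetical signal: the backend could not produce a concrete value."""


def run_test_case(realize: Callable[[], Any], test: Callable[[Any], None]) -> Status:
    """Run one test case, discarding it if the backend cannot realize inputs."""
    try:
        value = realize()
    except CannotRealize:
        return Status.INVALID
    test(value)
    return Status.VALID


def always_times_out() -> Any:
    """Toy realizer that mimics a path timeout during realization."""
    raise CannotRealize("path timed out during realization")
```

Under this scheme a timeout during realization would cost one discarded test case rather than an error, at the price of silently losing whatever that path might have found.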

`NotDeterministic` with `st.text()` with no decisions made

I think this only happens under certain conditions since pschanely/CrossHair#263 uses st.text() and works fine. Maybe when no meaningful decisions are made on the symbolic?

This issue also occurs with st.characters(). Found while trying to run hypothesis' test suite under crosshair (HypothesisWorks/hypothesis#4022 (review)). I'm tentatively calling this a crosshair-side issue, but if you discover otherwise let me know 😄

from hypothesis import given, settings, strategies as st

@settings(backend="crosshair", deadline=None)
@given(st.text())
def f(s):
    pass
f()

My debug output:

1272686.164|               |pre_path_hook() No coverage biasing in effect. ( 0  code locations)
1272686.165|             |condition_parser() Using parsers:  []
1272686.166|            |per_test_case_context_manager() starting iteration 1
1272686.168|                    |_next_name() Drawing str_01
1272686.172|                 |find_model_value()  *** Begin Not Deterministic Debug *** 
1272686.173|                 |find_model_value() Model value node expected; found <class 'crosshair.statespace.DeatchedPathNode'> instead.
1272686.173|                 |find_model_value()   Traceback:  (<module> sandbox4.py:36) (f sandbox4.py:33) (wrapped_test core.py:1623) (run_engine core.py:1151) (run engine.py:736) (_run engine.py:1198) (generate_new_examples engine.py:881) (cached_test_function engine.py:1432) (test_function engine.py:434) (__stoppable_test_function engine.py:307) (_execute_once_for_engine core.py:1047) (execute_once core.py:961) (__exit__ contextlib.py:144) (per_test_case_context_manager crosshair_provider.py:110) (deep_realize core.py:251) (deepcopyext copyext.py:37) (__ch_realize__ builtinslib.py:3106) (__index__ builtinslib.py:1113) (find_model_value statespace.py:937) (test_stack util.py:221)
1272686.173|                 |find_model_value()  *** End Not Deterministic Debug *** 
1272686.174|            |per_test_case_context_manager() ended iteration (exception: NotDeterministic: ) (per_test_case_context_manager crosshair_provider.py:110) (deep_realize core.py:251) (deepcopyext copyext.py:37) (__ch_realize__ builtinslib.py:3106) (__index__ builtinslib.py:1113) (find_model_value statespace.py:939)
1272686.174|            |per_test_case_context_manager() no decisions made; ignoring this iteration
1272686.197|         |export_value() WARNING: export_value() requested before test case complered (<module> sandbox4.py:36) (f sandbox4.py:33) (wrapped_test core.py:1623) (run_engine core.py:1151) (run engine.py:736) (_run engine.py:1198) (generate_new_examples engine.py:881) (cached_test_function engine.py:1432) (test_function engine.py:458) (post_test_case_hook crosshair_provider.py:286) (export_value crosshair_provider.py:277) (test_stack util.py:221)
1272686.203|                       |__init__() CrosshairInternal 
1272686.204|                       |__init__()  Stack trace:
  File "/Users/tybug/Desktop/Liam/coding/sandbox/hypothesis_/sandbox4.py", line 36, in <module>
    f()
  File "/Users/tybug/Desktop/Liam/coding/sandbox/hypothesis_/sandbox4.py", line 33, in f
    @given(st.text())
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/core.py", line 1623, in wrapped_test
    state.run_engine()
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/core.py", line 1151, in run_engine
    runner.run()
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/engine.py", line 736, in run
    self._run()
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/engine.py", line 1198, in _run
    self.generate_new_examples()
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/engine.py", line 881, in generate_new_examples
    zero_data = self.cached_test_function(bytes(BUFFER_SIZE))
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/engine.py", line 1432, in cached_test_function
    self.test_function(data)
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/engine.py", line 510, in test_function
    self.__stoppable_test_function(data)
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/engine.py", line 307, in __stoppable_test_function
    self._test_function(data)
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/core.py", line 1041, in _execute_once_for_engine
    result = self.execute_once(data)
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/core.py", line 964, in execute_once
    result = self.test_runner(data, run)
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/core.py", line 737, in default_executor
    return function(data)
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/core.py", line 875, in run
    kw, argslices = context.prep_args_kwargs_from_strategies(
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/control.py", line 157, in prep_args_kwargs_from_strategies
    obj = check(self.data.draw(s, observe_as=f"generate:{k}"))
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/data.py", line 2423, in draw
    return strategy.do_draw(self)
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/strategies/_internal/lazy.py", line 167, in do_draw
    return data.draw(self.wrapped_strategy)
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/data.py", line 2417, in draw
    return strategy.do_draw(self)
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/strategies/_internal/strings.py", line 117, in do_draw
    return data.draw_string(
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/data.py", line 2182, in draw_string
    node = self._pop_ir_tree_node("string", kwargs, forced=forced)
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/data.py", line 2330, in _pop_ir_tree_node
    if not ir_value_permitted(node.value, node.ir_type, kwargs):
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/data.py", line 1089, in ir_value_permitted
    if len(value) < kwargs["min_size"]:
  File "/opt/homebrew/lib/python3.12/site-packages/crosshair/libimpl/builtinslib.py", line 1112, in __index__
    space = context_statespace()
  File "/opt/homebrew/lib/python3.12/site-packages/crosshair/statespace.py", line 247, in context_statespace
    raise CrosshairInternal
  File "/opt/homebrew/lib/python3.12/site-packages/crosshair/util.py", line 594, in __init__
    debug(" Stack trace:\n" + "".join(traceback.format_stack()))

Traceback (most recent call last):
  File "/Users/tybug/Desktop/Liam/coding/sandbox/hypothesis_/sandbox4.py", line 36, in <module>
    f()
  File "/Users/tybug/Desktop/Liam/coding/sandbox/hypothesis_/sandbox4.py", line 33, in f
    @given(st.text())
                   ^^^
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/core.py", line 1656, in wrapped_test
    raise the_error_hypothesis_found
  File "/opt/homebrew/lib/python3.12/site-packages/crosshair/libimpl/builtinslib.py", line 1112, in __index__
    space = context_statespace()
            ^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.12/site-packages/crosshair/statespace.py", line 247, in context_statespace
    raise CrosshairInternal
crosshair.util.CrosshairInternal

Invalid combination of arguments to `draw_boolean(...)`

These assertions in Hypothesis check that if forced is True, then p > 0, and vice-versa at the other end - because it would be invalid to generate the forced value otherwise. However, the following example fails with Crosshair:

from hypothesis import settings
from hypothesis.stateful import RuleBasedStateMachine, rule, run_state_machine_as_test

@run_state_machine_as_test
@settings(backend="crosshair", deadline=None)
class IntListRules(RuleBasedStateMachine):

    @rule()
    def a(self): pass

    @rule()
    def b(self): pass

I'm reporting it here because of the draw_boolean() symptom, but I actually find it pretty plausible that it's an issue in Crosshair's reasoning about the internal FeatureStrategy (see here). As usual, found in HypothesisWorks/hypothesis#4034 🙂
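For readers unfamiliar with the invariant described at the top of this issue, it can be restated as a small standalone check. This is a sketch of the rule as described here, not a copy of the Hypothesis assertions; `check_draw_boolean_args` and `is_valid_combination` are hypothetical names.

```python
from typing import Optional


def check_draw_boolean_args(p: float, forced: Optional[bool] = None) -> None:
    """Raise ValueError if the forced value would be impossible under p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"p={p!r} is not a probability")
    if forced is True and p == 0.0:
        raise ValueError("forced=True requires p > 0")
    if forced is False and p == 1.0:
        raise ValueError("forced=False requires p < 1")


def is_valid_combination(p: float, forced: Optional[bool]) -> bool:
    """Boolean wrapper around the check above."""
    try:
        check_draw_boolean_args(p, forced)
    except ValueError:
        return False
    return True
```

A backend that hands back a symbolic `p` would need to satisfy the same rule once the value is made concrete.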

Integrating with observability features so maintainers and users can tell what's happening

Hypothesis' observability features were originally designed for the benefit of users (HypothesisWorks/hypothesis#3797), but have also turned out to be useful for developers - e.g. this discussion led to bugfixes and performance improvements.

Recent work on HypothesisWorks/hypothesis#4034 has me thinking that observability information from Crosshair would actually be pretty valuable for us as maintainers, to answer questions like:

- how many paths did we ignore due to #19?
- what arguments or values were realized during the test?
  - (if it's all of them, are we adding much value? what metrics would answer that?)
- how many paths have we explored? how many remain? how many are we unable to explore?
- timing information would help us understand whether #21 is an anomaly or something we need to deal with regularly, and tune a solution accordingly.
- (I'm sure there are many more that will become obvious later)

Practically speaking, what do we actually need to do here?

1. decide what to measure (schema here)
   - status_reason, if something on the crosshair side was responsible for a Status.INVALID result, as if for assume(False). I considered whether we need custom statuses for e.g. #21 and concluded that the status_reason metadata was a better fit.
   - not features; they're intended to be about the runtime behavior of the code under test.
     - while I'm thinking about it, event() is currently disabled under Crosshair to reduce premature realization, but we could support it nicely by e.g. deferring it to the end of the test case where we realize anyway.
   - timing observations are meant to be disjoint, so that the sum is total runtime. We therefore probably want to put internal timings in the metadata instead.
   - metadata: the catch-all put-anything-here section.
2. connect it up
   - metadata is easy, just add an optional method to the PrimitiveProvider protocol and call it here
   - we could handle status_reason similarly, with an extra clause here, but I'm not sure what the provider would actually want to say.
3. use the new information to understand what's going on, improve hypothesis-crosshair, iterate on observability, etc.

Would you find this useful? If so, setting up metadata passthrough would be pretty easy 🙂
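As a rough illustration of the metadata passthrough, the sketch below uses an optional method on a provider object. The method name `observe_metadata`, the helper `build_observation`, and the metric names are all assumptions made up for this example; the real protocol and observation schema may look different.

```python
from typing import Any, Dict


class PlainProvider:
    """A provider that does not implement the optional hook."""


class CountingProvider:
    """A provider that tracks a few (hypothetical) exploration metrics."""

    def __init__(self) -> None:
        self.paths_explored = 0
        self.paths_ignored = 0

    def observe_metadata(self) -> Dict[str, Any]:
        # Hypothetical optional hook: return JSON-friendly backend details.
        return {
            "paths_explored": self.paths_explored,
            "paths_ignored": self.paths_ignored,
        }


def build_observation(provider: Any, status: str) -> Dict[str, Any]:
    """Assemble one observation record, merging backend metadata if offered."""
    observation: Dict[str, Any] = {"status": status, "metadata": {}}
    hook = getattr(provider, "observe_metadata", None)
    if callable(hook):
        observation["metadata"]["backend"] = hook()
    return observation
```

Looking the hook up with `getattr` keeps it optional, so existing providers would not need to change.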

Zero inputs tried for bounded floats

This doesn't find an error:

from hypothesis import given, settings, strategies as st

@settings(backend="crosshair", deadline=None, database=None)
@given(st.floats(1.0, 2.0)) # or other bounds
def f(x):
    assert False
f()

I see lots of IgnoreAttempt in debug logs. Probably crosshair is giving up generating inputs here when it shouldn't be?

`smallest_nonzero_magnitude` inclusive

Here's a quick one to either fix or dismiss as intended; I'm fairly sure the condition bounds for smallest_nonzero_magnitude should be inclusive <= as smallest_nonzero_magnitude is allowed. https://github.com/pschanely/hypothesis-crosshair/blob/main/hypothesis_crosshair_provider/crosshair_provider.py#L222-L227.
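To make the inclusive reading concrete, here is a small standalone predicate. It is a sketch of the intended semantics as described in this issue, not the provider's code; `magnitude_permitted` is a hypothetical name.

```python
def magnitude_permitted(x: float, smallest_nonzero_magnitude: float) -> bool:
    """True if x is zero or its magnitude is at least the configured minimum."""
    # Inclusive comparison: a value exactly at the bound is allowed.
    return x == 0.0 or abs(x) >= smallest_nonzero_magnitude
```

With an exclusive `>` instead, a value sitting exactly on the bound would be rejected even though it is a legal draw.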

`CrosshairInternal` on nested properties

from hypothesis import given, settings
from hypothesis import strategies as st

settings.register_profile("crosshair", backend="crosshair", deadline=None)
settings.load_profile("crosshair")

@given(st.integers())
def outer(n1):
    @given(st.integers(min_value=n1, max_value=n1))
    def inner(n2):
        assert n1 == n2
    inner()
outer()

stacktrace:

  ... (truncated)

  File "/Users/tybug/Desktop/Liam/coding/sandbox/hypothesis_/sandbox4.py", line 36, in outer
    def outer(n1):
                   
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/core.py", line 830, in test
    return self.test(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tybug/Desktop/Liam/coding/sandbox/hypothesis_/sandbox4.py", line 37, in outer
    @given(st.integers(min_value=n1, max_value=n1))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/strategies/_internal/numbers.py", line 112, in integers
    @defines_strategy(force_reusable_values=True)
                   ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/strategies/_internal/utils.py", line 79, in cached_strategy
    if cache_key in cache:
       ^^^^^^^^^^^^^^^^^^
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/cache.py", line 97, in __contains__
    return key in self.keys_to_indices
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.12/site-packages/crosshair/libimpl/builtinslib.py", line 908, in __eq__
    return numeric_binop(ops.eq, self, other)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.12/site-packages/crosshair/libimpl/builtinslib.py", line 462, in numeric_binop
    raise CrosshairInternal("Numeric operation on symbolic while not tracing")
crosshair.util.CrosshairInternal: Numeric operation on symbolic while not tracing

I think we're using the kwargs to st.integers as a cache key to reuse strategies, which includes n1, and somehow n1 is realized outside of a crosshair scope (or maybe just outside of its own outer crosshair scope since we may have jumped to a new one with inner?).
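The suspected mechanism can be reproduced without either library. In the sketch below, `FakeSymbolicInt`, `tracing`, and `cached_strategy` are hypothetical stand-ins: the fake symbolic value refuses to be hashed outside a tracing scope, and the cache builds its key from keyword arguments in roughly the way described above.

```python
import contextlib
from typing import Any, Dict, Iterator, Tuple

_tracing = False


@contextlib.contextmanager
def tracing() -> Iterator[None]:
    """Toy tracing scope; symbolic operations are only legal inside it."""
    global _tracing
    previous, _tracing = _tracing, True
    try:
        yield
    finally:
        _tracing = previous


class FakeSymbolicInt:
    """Stand-in for a symbolic integer that needs an active tracing scope."""

    def __init__(self, concrete: int) -> None:
        self._concrete = concrete

    def __hash__(self) -> int:
        if not _tracing:
            raise RuntimeError("Numeric operation on symbolic while not tracing")
        return hash(self._concrete)

    def __eq__(self, other: object) -> bool:
        if not _tracing:
            raise RuntimeError("Numeric operation on symbolic while not tracing")
        other_value = other._concrete if isinstance(other, FakeSymbolicInt) else other
        return self._concrete == other_value


def cached_strategy(cache: Dict[Tuple, str], **kwargs: Any) -> str:
    """Look up (or create) a cached entry keyed on the keyword arguments."""
    key = tuple(sorted(kwargs.items()))
    if key not in cache:  # hashing the key hashes every kwarg value
        cache[key] = f"strategy#{len(cache)}"
    return cache[key]
```

Calling `cached_strategy(cache, min_value=n1, max_value=n1)` works inside `with tracing():` and raises outside it, which matches the shape of the failure in the stack trace.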

Idle musing: something we should consider more broadly is nested backend scopes. A user may request backend A on outer and a different backend B on inner. Ideally this would just work™️, but of course it may not be that simple to support.

Counterexample extracted from https://github.com/HypothesisWorks/hypothesis/blob/d8c17832141f587fec9ef89895ed01a4e9c1650d/hypothesis-python/tests/cover/test_datetimes.py#L52-L54.

Add generic TypeError suppression

Regular CrossHair checks TypeError messages for certain CrossHair type names and ignores paths when it finds them. We probably need this logic in the hypothesis plugin as well.
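A minimal sketch of such a check follows. The pattern is an assumption: it matches names that start with `Symbolic` or `Shell`, chosen only because `SymbolicInt` and `ShellMutableMap` show up in tracebacks on this page, and it is not the list CrossHair itself uses. `looks_like_symbolic_type_error` is a hypothetical helper.

```python
import re

# Assumed naming convention for internal proxy types; adjust to the real list.
_SYMBOLIC_NAME_PATTERN = re.compile(r"\b(?:Symbolic|Shell)\w+")


def looks_like_symbolic_type_error(exc: BaseException) -> bool:
    """True if exc is a TypeError whose message names an internal proxy type."""
    if not isinstance(exc, TypeError):
        return False
    return _SYMBOLIC_NAME_PATTERN.search(str(exc)) is not None
```

A plugin could then treat a matching TypeError as "ignore this path" and let every other TypeError propagate as a genuine test failure.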

Do something when argument realization fails

When life gets too hard for the solver, it'll raise an UnknownSatisfiability error, and then there's a good chance that our argument realization code produces the same exception, failing to find concrete values for the draws. We probably need to extend the solver timeout during realization and/or wipe the solver state and pretend it was a useless/trivial iteration.

This is the root cause of hypothesis test failures like returning a symbolic type from provider.realize(obj) (comment, CI)
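Both options (a longer timeout, then giving up on the iteration) can be combined in one retry loop. The sketch below is purely illustrative: the `UnknownSatisfiability` class here is a local stand-in for the CrossHair exception of the same name, `realize_with_fallback` and `slow_solver` are hypothetical, and the timeout values are arbitrary.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


class UnknownSatisfiability(Exception):
    """Local stand-in for the solver giving up within its time budget."""


def realize_with_fallback(
    realize: Callable[[float], T],
    timeouts: Sequence[float] = (1.0, 10.0),
) -> Tuple[bool, Optional[T]]:
    """Try realization with growing timeouts; report failure instead of raising."""
    for timeout in timeouts:
        try:
            return True, realize(timeout)
        except UnknownSatisfiability:
            continue  # try again with the next, larger budget
    # Caller can now treat this iteration as trivial and discard it.
    return False, None


def slow_solver(timeout: float) -> int:
    """Toy realizer that only succeeds when given at least five seconds."""
    if timeout < 5.0:
        raise UnknownSatisfiability()
    return 42
```

Returning a success flag rather than raising leaves the decision about how to record the discarded iteration to the caller.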

Crosshair takes significant time to generate valid examples for some strategies

I'm seeing many related internal IgnoreAttempt raises here.

For instance:

from hypothesis import given, settings, strategies as st

@given(st.datetimes())
@settings(backend="crosshair", deadline=None)
def f(n):
    print("call")
f()

takes ~75 IgnoreAttempts over 5 seconds until it prints its first call.

Other offenders are st.dates() (not that bad) and st.emails().

This may not be surprising given crosshair is working with a low level IR, but I wonder if this could be alleviated in some way?

descriptor `items` for `dict` objects doesn't apply to a `ShellMutableMap` object

Here's a nice cursed one for you, probably low priority:

from hypothesis import strategies as st
from hypothesis import given, settings

settings.register_profile("crosshair", backend="crosshair", deadline=None)
settings.load_profile("crosshair")

@given(st.builds(dict).map(dict.items))
def f(x):
    print(x)
f()

  ...
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/data.py", line 2447, in draw
    return strategy.do_draw(self)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/strategies/_internal/strategies.py", line 844, in do_draw
    result = self.pack(x)  # type: ignore
             ^^^^^^^^^^^^
TypeError: descriptor 'items' for 'dict' objects doesn't apply to a 'ShellMutableMap' object
while generating 'x' from builds(dict).map(dict.items)

Hypothesis uses this to register a strategy for typing.ItemsView.
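The same error can be reproduced with the standard library alone, which suggests it comes from calling the unbound `dict.items` descriptor on an object that is a mapping but not a `dict` subclass. In this sketch `FakeShellMap` is a hypothetical stand-in for `ShellMutableMap`.

```python
from collections.abc import MutableMapping
from typing import Any, Dict, Iterator


class FakeShellMap(MutableMapping):
    """Mapping that behaves like a dict but does not inherit from dict."""

    def __init__(self) -> None:
        self._data: Dict[Any, Any] = {}

    def __getitem__(self, key: Any) -> Any:
        return self._data[key]

    def __setitem__(self, key: Any, value: Any) -> None:
        self._data[key] = value

    def __delitem__(self, key: Any) -> None:
        del self._data[key]

    def __iter__(self) -> Iterator[Any]:
        return iter(self._data)

    def __len__(self) -> int:
        return len(self._data)


def items_via_descriptor(mapping: Any) -> Any:
    """Mirrors .map(dict.items): requires a real dict instance."""
    return dict.items(mapping)


def items_via_method(mapping: Any) -> Any:
    """Duck-typed alternative that works for any mapping."""
    return mapping.items()
```

`items_via_method` works on a `FakeShellMap`, while `items_via_descriptor` raises a TypeError of the same form as the one shown above.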

`CrosshairInternal` on `st.dates` / `st.datetimes`

Somehow, kwargs["max_value"] (and only max_value) on an integer ir node becomes symbolic here and errors when we go to cache based off it. This only happens on dates/datetimes so I'm suspecting crosshair interception there causes some of our code to be colored symbolically, but hypothesis' date strategy code may also be at fault.

from hypothesis import given, strategies as st

@given(st.dates())
def f(d):
    pass
f()

  ... (truncated)
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/engine.py", line 475, in test_function
    self._cache(data)
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/engine.py", line 381, in _cache
    self.__data_cache_ir[key] = result
    ~~~~~~~~~~~~~~~~~~~~^^^^^
  File "/Users/tybug/Desktop/Liam/coding/hypothesis/hypothesis-python/src/hypothesis/internal/cache.py", line 111, in __setitem__
    i = self.keys_to_indices[key]
        ~~~~~~~~~~~~~~~~~~~~^^^^^
  File "/opt/homebrew/lib/python3.12/site-packages/crosshair/libimpl/builtinslib.py", line 1143, in __hash__
    return self.__index__().__hash__()
           ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.12/site-packages/crosshair/libimpl/builtinslib.py", line 1154, in __index__
    space = context_statespace()
            ^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.12/site-packages/crosshair/statespace.py", line 247, in context_statespace
    raise CrosshairInternal
crosshair.util.CrosshairInternal

patch

Here's a logging patch that narrows the problem, if it's helpful:

diff --git a/hypothesis-python/src/hypothesis/internal/conjecture/engine.py b/hypothesis-python/src/hypothesis/internal/conjecture/engine.py
index eb326f59a..708e48904 100644
--- a/hypothesis-python/src/hypothesis/internal/conjecture/engine.py
+++ b/hypothesis-python/src/hypothesis/internal/conjecture/engine.py
@@ -456,6 +456,7 @@ class ConjectureRunner:
                 if self.settings.backend != "hypothesis":
                     for node in data.examples.ir_tree_nodes:
                         value = data.provider.post_test_case_hook(node.value)
+                        print("kwargs should not be symbolic", type(node.kwargs["max_value"]))
                         expected_type = {
                             "string": str,
                             "float": float,

With this, I get

kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'int'>
kwargs should not be symbolic <class 'crosshair.libimpl.builtinslib.SymbolicInt'>

followed by the above stacktrace error.

Test against hypothesis-jsonschema

Let's see what works and what doesn't!

Issue with multiply registered contracts

import time as tm, pytest
from functools import wraps
from hypothesis import given, settings, strategies as st

@settings(backend="crosshair")
@given(st.text("abcdefg"))
def test_hits_internal_assert(x):
    assert set(x).issubset(set("abcdefg"))

@pytest.fixture(scope="function")
def _patch_time(monkeypatch):
    def time():
        nonlocal current_time
        current_time += 0.1
        return current_time

    current_time = tm.time()
    monkeypatch.setattr(tm, "time", wraps(tm.time)(time))
    monkeypatch.setattr(tm, "monotonic", wraps(tm.monotonic)(time))
    # monkeypatch.setattr(tm, "perf_counter", wraps(tm.perf_counter)(time))

@settings(backend="crosshair")
@given(n=st.integers())
def test_crosshair_does_not_like_monkeypatching(_patch_time, n): ...
    # crosshair.register_contract.ContractRegistrationError: 
    # Pre- and postconditons and skip_body should not differ when registering multiple contracts for the same function

Originally posted by @Zac-HD in HypothesisWorks/hypothesis#3914 (comment)

respect `{min, max}_size` in `draw_string`

Right now these parameters are ignored, which leads to crosshair generating unsound IR nodes:

from hypothesis import given, strategies as st

@given(st.characters())
def f(c):
    assert len(c) == 1
f()

The above gives a flaky error because of this; the root cause (invalid lengths) is perhaps clearer with this example:

from hypothesis import given, strategies as st

@given(st.characters())
def f(c):
    print(len(c)) # 0, 1, 2, ...
f()
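For comparison, here is what respecting the size bounds looks like for a plain, non-symbolic generator. This is only a concrete analogue: a symbolic backend would instead add the equivalent length constraints to the solver. The function below is a hypothetical sketch, and the fallback cap of `min_size + 10` is arbitrary.

```python
import random
from typing import Optional


def draw_string(
    rng: random.Random,
    alphabet: str,
    min_size: int = 0,
    max_size: Optional[int] = None,
) -> str:
    """Draw a string whose length always lies in [min_size, max_size]."""
    if max_size is None:
        max_size = min_size + 10  # arbitrary cap for this sketch
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    length = rng.randint(min_size, max_size)  # inclusive on both ends
    return "".join(rng.choice(alphabet) for _ in range(length))
```

A single-character draw corresponds to `min_size=1, max_size=1`, so every result has length exactly one.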

Storing symbolic values outside of the test case lifetime

from hypothesis import given, settings, strategies as st

values = []
@settings(backend="crosshair", deadline=None, database=None)
@given(st.integers())
def f(x):
    values.append(x)

f()
# crosshair.util.CrosshairInternal: Numeric operation on symbolic while not tracing
print("values", values)

We do indeed call post_test_case_hook -> deep_realize here, but presumably that returns the realized value rather than realizing the value and any references in place. Is deep-realizing all references even possible to achieve?

Arguably this is an abuse of hypothesis, but we do use this technique in hypothesis tests. cc @Zac-HD, since this might involve a wider viability discussion.
