lycantropos / hypothesis_sqlalchemy

hypothesis strategies for generating SQLAlchemy objects

License: MIT License

Python 96.39% Shell 1.42% Dockerfile 1.07% PowerShell 1.12%

hypothesis testing quickcheck sqlalchemy

hypothesis_sqlalchemy's Introduction

hypothesis_sqlalchemy

In what follows python is an alias for python3.7 or pypy3.7 or any later version (python3.8, pypy3.8 and so on).

Installation

Install the latest versions of the pip & setuptools packages

python -m pip install --upgrade pip setuptools

User

Download and install the latest stable version from the PyPI repository

python -m pip install --upgrade hypothesis_sqlalchemy

Developer

Download the latest version from the GitHub repository

git clone https://github.com/lycantropos/hypothesis_sqlalchemy.git
cd hypothesis_sqlalchemy

Install dependencies

python -m pip install -r requirements.txt

Install

python setup.py install

Usage

With setup

>>> import warnings
>>> from hypothesis.errors import NonInteractiveExampleWarning
>>> # ignore hypothesis warnings caused by `example` method call
... warnings.filterwarnings('ignore', category=NonInteractiveExampleWarning)

let's take a look at what can be generated and how.

Tables

We can write a strategy that produces tables

>>> from hypothesis_sqlalchemy import scheme
>>> from sqlalchemy.engine.default import DefaultDialect
>>> dialect = DefaultDialect()
>>> tables = scheme.tables(dialect,
...                        min_size=3,
...                        max_size=10)
>>> table = tables.example()
>>> from sqlalchemy.schema import Table
>>> isinstance(table, Table)
True
>>> from sqlalchemy.schema import Column
>>> all(isinstance(column, Column) for column in table.columns)
True
>>> 3 <= len(table.columns) <= 10
True

Records

Suppose we have a table

>>> from sqlalchemy.schema import (Column,
...                                MetaData,
...                                Table)
>>> from sqlalchemy.sql.sqltypes import (Integer,
...                                      String)
>>> metadata = MetaData()
>>> user_table = Table('user', metadata,
...                    Column('user_id', Integer,
...                           primary_key=True),
...                    Column('user_name', String(16),
...                           nullable=False),
...                    Column('email_address', String(60)),
...                    Column('password', String(20),
...                           nullable=False))

and we can write a strategy that

produces single records (as tuples)

>>> from hypothesis import strategies
>>> from hypothesis_sqlalchemy.sample import table_records
>>> records = table_records(user_table, 
...                         email_address=strategies.emails())
>>> record = records.example()
>>> isinstance(record, tuple)
True
>>> len(record) == len(user_table.columns)
True
>>> all(column.nullable and value is None
...     or isinstance(value, column.type.python_type) 
...     for value, column in zip(record, user_table.columns))
True

produces lists of records (with configurable list size bounds)

>>> from hypothesis_sqlalchemy.sample import table_records_lists
>>> records_lists = table_records_lists(user_table,
...                                     min_size=2,
...                                     max_size=5, 
...                                     email_address=strategies.emails())
>>> records_list = records_lists.example()
>>> isinstance(records_list, list)
True
>>> 2 <= len(records_list) <= 5
True
>>> all(isinstance(record, tuple) for record in records_list)
True
>>> all(len(record) == len(user_table.columns) for record in records_list)
True

Development

Bumping version

Preparation

Install bump2version.

Pre-release

Choose which version number category to bump following semver specification.

Test bumping version

bump2version --dry-run --verbose $CATEGORY

where $CATEGORY is the target version number category name, possible values are patch/minor/major.

Bump version

bump2version --verbose $CATEGORY

This will set version to major.minor.patch-alpha.

Release

Test bumping version

bump2version --dry-run --verbose release

Bump version

bump2version --verbose release

This will set version to major.minor.patch.
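
The two-step flow above can be pictured with a small plain-Python sketch. It only illustrates the version strings described here and is not how bump2version is implemented; the helper names `bump` and `release` are hypothetical, and resetting the lower categories to zero is an assumption taken from the semver specification.

```python
def bump(version: str, category: str) -> str:
    """Return the pre-release version obtained by bumping `category`."""
    major, minor, patch = (int(part) for part in version.split('.'))
    if category == 'major':
        major, minor, patch = major + 1, 0, 0
    elif category == 'minor':
        minor, patch = minor + 1, 0
    elif category == 'patch':
        patch += 1
    else:
        raise ValueError(f'unknown category: {category!r}')
    return f'{major}.{minor}.{patch}-alpha'


def release(version: str) -> str:
    """Drop the `-alpha` suffix, leaving major.minor.patch."""
    suffix = '-alpha'
    return version[:-len(suffix)] if version.endswith(suffix) else version


pre_release = bump('1.2.3', 'minor')
final = release(pre_release)
```

Under these assumptions, bumping `minor` on `1.2.3` gives `1.3.0-alpha`, and the release step turns that into `1.3.0`.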

Running tests

Install dependencies

python -m pip install -r requirements-tests.txt

Plain

pytest

Inside Docker container:

with CPython

docker-compose --file docker-compose.cpython.yml up

with PyPy

docker-compose --file docker-compose.pypy.yml up

Bash script:

with CPython
```
./run-tests.sh
```
or
```
./run-tests.sh cpython
```
with PyPy
```
./run-tests.sh pypy
```

PowerShell script:

with CPython
```
.\run-tests.ps1
```
or
```
.\run-tests.ps1 cpython
```
with PyPy
```
.\run-tests.ps1 pypy
```

hypothesis_sqlalchemy's Issues

UUID not supported

UUID is not a supported inferred strategy, but I believe it should be. Maybe it is as simple as adding a mapping of UUID to the UUID strategy in this dictionary:
https://github.com/lycantropos/hypothesis_sqlalchemy/blob/master/hypothesis_sqlalchemy/columnar/values.py#L51-L63

Here is how you reproduce the issue:

import sqlalchemy as sa
import sqlalchemy.dialects.postgresql as pg
from hypothesis_sqlalchemy import tabular

metadata = sa.MetaData()

user = sa.Table(
    "users",
    metadata,
    sa.Column("user_id", pg.UUID(), primary_key=True, unique=True),
    sa.Column("login", pg.VARCHAR(255), nullable=True),
)

tabular.records.factory(user) # throws NotImplementedError

The end of the stack trace looks like:

.../python3.6/site-packages/hypothesis_sqlalchemy/columnar/values.py in from_type(type_)
     66 @singledispatch
     67 def from_type(type_: TypeEngine) -> Strategy[Any]:
---> 68     return values_by_python_types[type_.python_type]
     69
     70

.../python3.6/site-packages/sqlalchemy/sql/type_api.py in python_type(self)
    394
    395         """
--> 396         raise NotImplementedError()
    397
    398     def with_variant(self, type_, dialect_name):

NotImplementedError:

[Question] How to customize strategies?

It is not clear to me how to customize strategies so that they generate customized tables and records. Given your records usage example, how could I generate values for user_id in a customized range of integer values, let's say between 0 and 1000?

Add support for unique constraints

We currently support single unique columns, and we should add support for unique constraints too (both generating them and handling them during record creation).

Variant support

First off, thanks for the incredibly useful library. It's exactly what I needed, except for one pitfall: Variants.

I have a Column with a Variant, which I've been trying to manually register:

# Provide a proxy type that will allow us to use Integer for sqlite and BigInteger for other DB
# engines to avoid sqlite not allowing autoincrement BigIntegers.
AutoBigIntType = BigInteger().with_variant(sqlite.INTEGER(), "sqlite")

...

@from_type.register(AutoBigIntType)
def autobiginttype_values_factory(type_: AutoBigIntType) -> Strategy[int]:
    return strategies.integers(min_value=0)

>>> E   TypeError: Invalid annotation for 'type_'. Variant() is not a class.

This is somewhat related to #21 because supporting UUIDs is sometimes solved with Variants. However, I think there are various use cases for Variants outside of the example I provided and Postgres UUIDs, so it makes sense to come up with some kind of cohesive strategy for the general usage.

I have no idea if it's correct, but my naive intuition is that when you discover a Variant, you could resolve to MyTable.columns["variant_column_name"].type.impl.
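
One way to picture the idea in the last paragraph: `functools.singledispatch` dispatches on classes, so a wrapper object can be supported by registering the wrapper's class and delegating to the type it wraps. The sketch below uses hypothetical stand-in classes (`BigInteger`, `Variant`) and a hypothetical `strategy_name` function rather than the real SQLAlchemy or hypothesis_sqlalchemy ones; it only assumes that the wrapper exposes the wrapped type as `impl`, as suggested in this issue.

```python
from functools import singledispatch


class BigInteger:
    """Hypothetical stand-in for a concrete column type."""


class Variant:
    """Hypothetical stand-in for a wrapper that holds a base type in `impl`."""

    def __init__(self, impl):
        self.impl = impl


@singledispatch
def strategy_name(type_):
    # Fallback for types nobody registered.
    raise NotImplementedError(type(type_).__name__)


@strategy_name.register(BigInteger)
def _(type_):
    return 'integers'


@strategy_name.register(Variant)
def _(type_):
    # Register the wrapper *class*, then dispatch again on the wrapped type.
    return strategy_name(type_.impl)


auto_big_int = Variant(BigInteger())
resolved = strategy_name(auto_big_int)
```

Here `resolved` is `'integers'`: the call reaches the `Variant` handler first and is then re-dispatched on `auto_big_int.impl`.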

Python 3.7 Crash on import

Using with Python 3.7 results in crash on import.

Sample code:

from hypothesis import strategies
from hypothesis_sqlalchemy import tables

Stack trace:

Traceback (most recent call last):
  File "example_failure.py", line 6, in <module>
    from hypothesis_sqlalchemy import tables
  File "/usr/local/lib/python3.7/site-packages/hypothesis_sqlalchemy/tables/__init__.py", line 2, in <module>
    from .tables import factory
  File "/usr/local/lib/python3.7/site-packages/hypothesis_sqlalchemy/tables/tables.py", line 7, in <module>
    from hypothesis_sqlalchemy import columns
  File "/usr/local/lib/python3.7/site-packages/hypothesis_sqlalchemy/columns.py", line 104, in <module>
    def non_all_unique_lists_factory(lists: Strategy = lists_factory()
  File "/usr/local/lib/python3.7/site-packages/hypothesis_sqlalchemy/columns.py", line 87, in lists_factory
    max_size=max_size)
TypeError: lists() got an unexpected keyword argument 'average_size'

uuid factory

I've noticed that with hypothesis v6.39.6 hypothesis_sqlalchemy will fail to produce examples of tables with UUID columns. It seems to fail because hypothesis adopted a func(*, keyword_argument=default) pattern for their function signatures and hypothesis_sqlalchemy attempts to call strategies.uuids(version), when the function signature is uuids(*, version: Optional[int] = None).

I understand that this is not the version of hypothesis listed in the requirements; however, this is the version required by Pydantic, so at the moment users are forced to choose between generating examples for SQLAlchemy or Pydantic, or abandoning UUID fields. It seems like a simple fix and would be backwards compatible.
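
The failure mode described above is ordinary Python behaviour for keyword-only parameters, and it can be reproduced without Hypothesis. In this minimal sketch `make_uuid` is a hypothetical stand-in that only mirrors the `(*, version=None)` shape of the signature quoted in this issue; it is not Hypothesis code.

```python
import uuid
from typing import Optional


def make_uuid(*, version: Optional[int] = None) -> uuid.UUID:
    """Hypothetical stand-in with a keyword-only `version` parameter."""
    if version not in (None, 4):
        raise ValueError('this sketch only supports version 4')
    return uuid.uuid4()


try:
    make_uuid(4)  # positional call: rejected because `version` is keyword-only
except TypeError as error:
    positional_error = error
else:
    positional_error = None

# Passing the argument by keyword is accepted whether or not the parameter
# is keyword-only, which is why this form is backwards compatible.
value = make_uuid(version=4)
```

The positional call raises `TypeError`, while the keyword call succeeds.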

Records lists_factory unique_by with multiple columns

Hi,

First of all, I wanted to thank you for all the work, this project has really helped me in setting up an easily testable codebase!
I've found an issue when you have a model with multiple unique columns. Specifically, the lists_factory returns records that are unique by the combination of multiple columns, but not as individual columns. I have, for example, the following model:

class Country(BaseEntity):
  """
  A country is an overarching entity that can hold information about all
  locations assigned to it.
  """
  __tablename__ = "country"
  id = db.Column(db.Integer, primary_key=True)
  title = db.Column(db.String, unique=True, nullable=False)
  vat = db.Column(db.Integer)
  currency = db.Column(db.String(3), default='EUR', nullable=False)

Passing this model to the lists_factory returns a lists strategy with the unique_by function lambda row: (row[0], row[1]), which means that the combination of id and title must be unique, but not id and title individually. This results in errors when I try to generate multiple records, as those fields may not be unique, especially when hypothesis is shrinking.

I would propose to change the unique_by-generating function to return a tuple of functions, one for each column. (lambda row: row[0], lambda row: row[1]) would only result in records that differ on both columns. I wouldn't mind opening a PR for this, if you agree that it would be nice to have this functionality.

Thanks in advance!

Ruben
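
The difference between combined and per-column uniqueness described in this issue can be illustrated without Hypothesis. In the sketch below the helper `is_unique_by` and the sample records are hypothetical; the two key styles correspond to the single `lambda row: (row[0], row[1])` and to the proposed tuple of per-column functions. For reference, Hypothesis documents that the `unique_by` argument of its `lists` strategy may be a tuple of callables, each of which is checked separately.

```python
from typing import Callable, Hashable, Sequence, Tuple

Record = Tuple[int, str]


def is_unique_by(records: Sequence[Record],
                 key: Callable[[Record], Hashable]) -> bool:
    """Check that no two records share the same key."""
    keys = [key(record) for record in records]
    return len(keys) == len(set(keys))


# Every (id, title) pair is distinct, yet id 1 and title 'Belgium' repeat.
records = [(1, 'Netherlands'), (1, 'Belgium'), (2, 'Belgium')]

# One key over the combination of both columns.
combined = is_unique_by(records, lambda row: (row[0], row[1]))

# One key per column, each checked on its own.
per_column = all(is_unique_by(records, key)
                 for key in (lambda row: row[0], lambda row: row[1]))
```

The sample list passes the combined check (`combined` is `True`) but fails the per-column check (`per_column` is `False`), which is the situation that leads to constraint violations on insertion.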

Add support for check constraints

We should add support for check constraints (both generating them and handling them during record creation).

Ability to change module's context

To make the package more flexible, we need to provide a simple and convenient way of changing the module's context (such as global variables and functions).

Allow application of extra functions to strategy in lists_factory

I often generate lists of models using the lists_factory in hypothesis_sqlalchemy.tabular.records, but sometimes I have the issue that I need to apply a filter function to each generated model. For example, I have a model that has always one of three foreign keys set, so I want to filter out all examples where this is not the case. However, if I apply .filter to the entire list of generated models, hypothesis will take a very long time to figure out what is wrong and what to avoid, so I'd rather just filter out those specific examples that violate the constraint, while maintaining replayability.

Would it be possible to allow lists_factory to accept a function that is then applied to each individual values_tuples pair it generates?

Custom type support?

Interesting package, thank you for the effort!

Have you considered adding support for custom types? For example, suppose I use a custom type such as CurrencyType, which basically maps to a SQLA String type; it would then be useful to either

use the column’s type to map that to a strategy; or
use the column’s Python type to map that to a strategy (if the Python type is a simple one); or
use a user-provided custom strategy.

With the current version I receive an error:

>>> hypothesis_sqlalchemy.sample.table_records(Test.__table__)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../.venv/lib/python3.10/site-packages/hypothesis_sqlalchemy/core/table_records.py", line 15, in instances
    return columns_records.instances(list(table.columns),
  File "/.../.venv/lib/python3.10/site-packages/hypothesis_sqlalchemy/core/columns_records.py", line 29, in instances
    return strategies.tuples(*map(column_scalars, columns))
  File "/.../.venv/lib/python3.10/site-packages/hypothesis_sqlalchemy/core/column.py", line 39, in scalars
    result = column_type.scalars(column.type)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/functools.py", line 889, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/.../.venv/lib/python3.10/site-packages/hypothesis_sqlalchemy/core/column_type.py", line 134, in scalars
    return _values_by_python_types[type_.python_type]
KeyError: <class 'sqlalchemy_utils.primitives.currency.Currency'>

Here it would be useful to either:

return a text(max_size=3) strategy (a poor strategy based on the custom type’s String(3) wrapped type, but it’d work); or
return a sampled_from(list(babel.core.get_global("all_currencies").keys())) strategy for the Python type Currency (although this wouldn’t be possible to derive from the Currency code); or
let the user provide the sampled_from() strategy.

Add `Framework :: Hypothesis` PyPI trove classifier

The Python Packaging Authority has recently created a new trove classifier to help users find Hypothesis-related packages, as described and demonstrated in HypothesisWorks/hypothesis#1663.

It would be great if this package used it too!

Shrinking nullable columns

According to the hypothesis docs, st.one_of shrinks towards elements earlier in the list. For nullable columns, you use result |= st.none(), which results in st.one_of(<other strategy>, st.none()).

Would it be desirable to interchange the order so the strategy will shrink towards NULLS rather than filled values?

[Question] Dependent data for correct database insertion

Thanks for this library! It was exactly what I needed and thought I'd have to write it till I saw this. :)

One question though, how have you handled tables with foreign keys? More generally, how have you approached generating records that are dependent on other records. The hypothesis docs seem to suggest using composite or using flatmap as in this Django example. Having to roll these one-off functions is unsatisfactory since the FK information has enough information to create such composites. How do you approach this in your apps?

I'm thinking about porting this clojure library to hypothesis. It has a general way of thinking about declaring dependent data with graphs and then generating example data from the graph in a succinct way. I'm thinking something like this paired with hypothesis-sqlalchemy could provide a slick way of generating complex DB data. Thoughts?

Receive None for columns with default value.

In my model I have columns with default values. When the strategy generates values for these columns, it sometimes generates None. Is there a way to receive the default value instead of None?

I created a solution that works, but I was wondering if there is another way.

def _set_default_value(field, field_value):
    # Fall back to the column default when the strategy produced None.
    if field.default is not None and field_value is None:
        return field.default.arg
    return field_value

videos = tables.records.factory(Video.__table__).map(
    lambda record: {column.name: _set_default_value(column, record[index])
                    for index, column in enumerate(Video.__table__.columns)}
)

Support `sqlalchemy` 2

Hi, this is a great project. sqlalchemy 2 has been released, and it would be great if this supported it.

I'd be happy to open a PR.

Add max size to email strategy

It should be possible to call strategies.emails(max_size=N), as a database column for emails may have a length constraint much smaller than 255.

Currently I work around it with:

@defines_strategy_with_reusable_values
def emails(max_size: int = None,) -> SearchStrategy[str]:
    """A strategy for generating email addresses as unicode strings. The
    address format is specified in :rfc:`5322#section-3.4.1`. Values shrink
    towards shorter local-parts and host domains.

    This strategy is useful for generating "user data" for tests, as
    mishandling of email addresses is a common source of bugs.
    """
    from hypothesis.provisional import domains

    local_chars = string.ascii_letters + string.digits + "!#$%&'*+-/=^_`{|}~"
    local_part = text(local_chars, min_size=1, max_size=64)
    # TODO: include dot-atoms, quoted strings, escaped chars, etc in local part
    return builds("{}@{}".format, local_part, domains()).filter(
        lambda addr: len(addr) <= (max_size if max_size is not None else 254)
    )