lycantropos / hypothesis_sqlalchemy

hypothesis strategies for generating SQLAlchemy objects

License: MIT License


hypothesis_sqlalchemy's Introduction

hypothesis_sqlalchemy

In what follows python is an alias for python3.7 or pypy3.7 or any later version (python3.8, pypy3.8 and so on).

Installation

Install the latest versions of the pip and setuptools packages

python -m pip install --upgrade pip setuptools

User

Download and install the latest stable version from the PyPI repository

python -m pip install --upgrade hypothesis_sqlalchemy

Developer

Download the latest version from the GitHub repository

git clone https://github.com/lycantropos/hypothesis_sqlalchemy.git
cd hypothesis_sqlalchemy

Install dependencies

python -m pip install -r requirements.txt

Install

python setup.py install

Usage

With setup

>>> import warnings
>>> from hypothesis.errors import NonInteractiveExampleWarning
>>> # ignore hypothesis warnings caused by `example` method call
... warnings.filterwarnings('ignore', category=NonInteractiveExampleWarning)

let's take a look at what can be generated and how.

Tables

We can write a strategy that produces tables

>>> from hypothesis_sqlalchemy import scheme
>>> from sqlalchemy.engine.default import DefaultDialect
>>> dialect = DefaultDialect()
>>> tables = scheme.tables(dialect,
...                        min_size=3,
...                        max_size=10)
>>> table = tables.example()
>>> from sqlalchemy.schema import Table
>>> isinstance(table, Table)
True
>>> from sqlalchemy.schema import Column
>>> all(isinstance(column, Column) for column in table.columns)
True
>>> 3 <= len(table.columns) <= 10
True

Records

Suppose we have a table

>>> from sqlalchemy.schema import (Column,
...                                MetaData,
...                                Table)
>>> from sqlalchemy.sql.sqltypes import (Integer,
...                                      String)
>>> metadata = MetaData()
>>> user_table = Table('user', metadata,
...                    Column('user_id', Integer,
...                           primary_key=True),
...                    Column('user_name', String(16),
...                           nullable=False),
...                    Column('email_address', String(60)),
...                    Column('password', String(20),
...                           nullable=False))

and we can write a strategy that

  • produces single records (as tuples)
    >>> from hypothesis import strategies
    >>> from hypothesis_sqlalchemy.sample import table_records
    >>> records = table_records(user_table, 
    ...                         email_address=strategies.emails())
    >>> record = records.example()
    >>> isinstance(record, tuple)
    True
    >>> len(record) == len(user_table.columns)
    True
    >>> all(column.nullable and value is None
    ...     or isinstance(value, column.type.python_type) 
    ...     for value, column in zip(record, user_table.columns))
    True
  • produces records lists (with configurable list size bounds)
    >>> from hypothesis_sqlalchemy.sample import table_records_lists
    >>> records_lists = table_records_lists(user_table,
    ...                                     min_size=2,
    ...                                     max_size=5, 
    ...                                     email_address=strategies.emails())
    >>> records_list = records_lists.example()
    >>> isinstance(records_list, list)
    True
    >>> 2 <= len(records_list) <= 5
    True
    >>> all(isinstance(record, tuple) for record in records_list)
    True
    >>> all(len(record) == len(user_table.columns) for record in records_list)
    True

Development

Bumping version

Preparation

Install bump2version.

Pre-release

Choose which version number category to bump, following the semver specification.

Test bumping version

bump2version --dry-run --verbose $CATEGORY

where $CATEGORY is the name of the target version number category; possible values are patch, minor and major.

Bump version

bump2version --verbose $CATEGORY

This will set version to major.minor.patch-alpha.

Release

Test bumping version

bump2version --dry-run --verbose release

Bump version

bump2version --verbose release

This will set version to major.minor.patch.
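
The two-step flow above (pre-release, then release) can be pictured with a small sketch. This is not bump2version itself and it does not read the project's configuration; the helper names are hypothetical, and it assumes versions of the form major.minor.patch with an optional -alpha suffix.

```python
from typing import Tuple


def _parse(version: str) -> Tuple[int, int, int]:
    # Drop an optional "-alpha" suffix, then split into numeric parts.
    major, minor, patch = version.split('-')[0].split('.')
    return int(major), int(minor), int(patch)


def pre_release(version: str, category: str) -> str:
    """Bump the chosen category and mark the result as alpha."""
    major, minor, patch = _parse(version)
    if category == 'major':
        major, minor, patch = major + 1, 0, 0
    elif category == 'minor':
        minor, patch = minor + 1, 0
    elif category == 'patch':
        patch += 1
    else:
        raise ValueError('unknown category: ' + category)
    return '{}.{}.{}-alpha'.format(major, minor, patch)


def release(version: str) -> str:
    """Drop the alpha marker, leaving major.minor.patch."""
    return '{}.{}.{}'.format(*_parse(version))
```

Under these assumptions, pre_release('1.2.3', 'minor') gives '1.3.0-alpha' and release('1.3.0-alpha') gives '1.3.0'.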

Running tests

Install dependencies

python -m pip install -r requirements-tests.txt

Plain

pytest

Inside Docker container:

  • with CPython
    docker-compose --file docker-compose.cpython.yml up
  • with PyPy
    docker-compose --file docker-compose.pypy.yml up

Bash script:

  • with CPython

    ./run-tests.sh

    or

    ./run-tests.sh cpython
  • with PyPy

    ./run-tests.sh pypy

PowerShell script:

  • with CPython
    .\run-tests.ps1
    or
    .\run-tests.ps1 cpython
  • with PyPy
    .\run-tests.ps1 pypy

hypothesis_sqlalchemy's People

Contributors

dycw, joshrosen, lycantropos, ptallada

hypothesis_sqlalchemy's Issues

UUID not supported

UUID is not a supported inferred strategy, but I believe it should be. Maybe it is as simple as adding a mapping from UUID to the UUID strategy in this dictionary:
https://github.com/lycantropos/hypothesis_sqlalchemy/blob/master/hypothesis_sqlalchemy/columnar/values.py#L51-L63

Here is how you reproduce the issue:

import sqlalchemy as sa
import sqlalchemy.dialects.postgresql as pg
from hypothesis_sqlalchemy import tabular

metadata = sa.MetaData()

user = sa.Table(
    "users",
    metadata,
    sa.Column("user_id", pg.UUID(), primary_key=True, unique=True),
    sa.Column("login", pg.VARCHAR(255), nullable=True),
)

tabular.records.factory(user) # throws NotImplementedError

The end of the stack trace looks like:

.../python3.6/site-packages/hypothesis_sqlalchemy/columnar/values.py in from_type(type_)
     66 @singledispatch
     67 def from_type(type_: TypeEngine) -> Strategy[Any]:
---> 68     return values_by_python_types[type_.python_type]
     69
     70

.../python3.6/site-packages/sqlalchemy/sql/type_api.py in python_type(self)
    394
    395         """
--> 396         raise NotImplementedError()
    397
    398     def with_variant(self, type_, dialect_name):

NotImplementedError:
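
The trace shows from_type to be a singledispatch function that falls back to a lookup by python_type. As a rough sketch of one possible fix, a handler registered for the column type class never touches python_type. Every class below is a stand-in rather than SQLAlchemy's or the library's own, and the "strategies" are plain zero-argument callables to keep the example dependency-free.

```python
import uuid
from functools import singledispatch
from typing import Any, Callable


class TypeEngine:
    """Stand-in for a column type; by default it has no Python type."""

    @property
    def python_type(self) -> type:
        raise NotImplementedError()


class Integer(TypeEngine):
    @property
    def python_type(self) -> type:
        return int


class UUID(TypeEngine):
    """Stand-in for a dialect type that does not define ``python_type``."""


# Hypothetical registry of value factories keyed by Python type.
values_by_python_types = {int: lambda: 0}


@singledispatch
def from_type(type_: TypeEngine) -> Callable[[], Any]:
    # Generic fallback: look the factory up through ``python_type``.
    return values_by_python_types[type_.python_type]


@from_type.register(UUID)
def _(type_: UUID) -> Callable[[], Any]:
    # Dispatching on the column type class avoids touching ``python_type``.
    return uuid.uuid4
```

With this in place, from_type(UUID())() returns a uuid.UUID, while a type with neither a handler nor a python_type still raises NotImplementedError.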

Variant support

First off, thanks for the incredibly useful library. It's exactly what I needed, except for one pitfall: Variants.

I have a Column with a Variant, which I've been trying to manually register:

# Provide a proxy type that will allow us to use Integer for sqlite and BigInteger for other DB
# engines to avoid sqlite not allowing autoincrement BigIntegers.
AutoBigIntType = BigInteger().with_variant(sqlite.INTEGER(), "sqlite")

...

@from_type.register(AutoBigIntType)
def autobiginttype_values_factory(type_: AutoBigIntType) -> Strategy[int]:
    return strategies.integers(min_value=0)

>>> E   TypeError: Invalid annotation for 'type_'. Variant() is not a class.

This is somewhat related to #21 because supporting UUIDs is sometimes solved with Variants. However, I think there are various use cases for Variants outside of the example I provided and Postgres UUIDs, so it makes sense to come up with some kind of cohesive strategy for the general usage.

I have no idea if it's correct, but my naive intuition is that when you discover a Variant, you could resolve to MyTable.columns["variant_column_name"].type.impl.
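
The intuition in the last paragraph can be sketched with functools.singledispatch alone. The classes here are stand-ins, not SQLAlchemy's, and the returned string is a placeholder for a real strategy. The point is that singledispatch registers classes, so one handler for the wrapper class can delegate to whatever sits in impl, whereas registering a particular instance is rejected.

```python
from functools import singledispatch


class BigInteger:
    """Stand-in for a concrete column type."""


class Variant:
    """Stand-in for a wrapper that keeps the base type in ``impl``."""

    def __init__(self, impl):
        self.impl = impl


@singledispatch
def from_type(type_):
    raise NotImplementedError(type(type_).__name__)


@from_type.register(BigInteger)
def _(type_):
    return 'non-negative integers'  # placeholder for a real strategy


@from_type.register(Variant)
def _(type_):
    # Register the class once, then delegate to the wrapped type.
    return from_type(type_.impl)


auto_big_int_type = Variant(BigInteger())

try:
    # Registering an instance (not a class) is rejected by singledispatch.
    from_type.register(auto_big_int_type)
except TypeError:
    instance_registration_failed = True
else:
    instance_registration_failed = False
```

Whether impl is the right attribute to follow for every real Variant is an assumption taken from the issue text, not something verified here.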

Python 3.7 Crash on import

Using with Python 3.7 results in crash on import.

Sample code:

from hypothesis import strategies
from hypothesis_sqlalchemy import tables

Stack trace:

Traceback (most recent call last):
  File "example_failure.py", line 6, in <module>
    from hypothesis_sqlalchemy import tables
  File "/usr/local/lib/python3.7/site-packages/hypothesis_sqlalchemy/tables/__init__.py", line 2, in <module>
    from .tables import factory
  File "/usr/local/lib/python3.7/site-packages/hypothesis_sqlalchemy/tables/tables.py", line 7, in <module>
    from hypothesis_sqlalchemy import columns
  File "/usr/local/lib/python3.7/site-packages/hypothesis_sqlalchemy/columns.py", line 104, in <module>
    def non_all_unique_lists_factory(lists: Strategy = lists_factory()
  File "/usr/local/lib/python3.7/site-packages/hypothesis_sqlalchemy/columns.py", line 87, in lists_factory
    max_size=max_size)
TypeError: lists() got an unexpected keyword argument 'average_size'

uuid factory

I've noticed that with hypothesis v6.39.6 hypothesis_sqlalchemy will fail to produce examples of tables with UUID columns. It seems to fail because hypothesis adopted a func(*, keyword_argument=default) pattern for their function signatures and hypothesis_sqlalchemy attempts to call strategies.uuids(version), when the function signature is uuids(*, version: Optional[int] = None).

I understand that this is not the version of hypothesis listed in the requirements; however, it is the version required by Pydantic, so at the moment users are forced to choose between generating examples for SQLAlchemy or Pydantic, or abandoning UUID fields. The fix seems simple and would be backwards compatible.
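
To illustrate why passing the argument by keyword would be backwards compatible, here is a small sketch. Both functions are stand-ins that only mimic the two signature shapes described above; neither is the hypothesis implementation.

```python
from typing import Optional


def uuids_positional(version: Optional[int] = None) -> str:
    """Stand-in for an older signature that accepts a positional argument."""
    return 'uuids(version={})'.format(version)


def uuids_keyword_only(*, version: Optional[int] = None) -> str:
    """Stand-in for the newer signature: ``version`` is keyword-only."""
    return 'uuids(version={})'.format(version)


def call_portably(factory, version: Optional[int] = None) -> str:
    # Passing ``version`` by keyword works with both signatures.
    return factory(version=version)
```

Calling uuids_keyword_only(4) raises TypeError, while call_portably works with either stand-in.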

Records lists_factory unique_by with multiple columns

Hi,

First of all, I wanted to thank you for all the work, this project has really helped me in setting up an easily testable codebase!
I've found an issue when you have a model with multiple unique columns. Specifically, the lists_factory returns records that are unique by the combination of multiple columns, but not as individual columns. I have, for example, the following model:

class Country(BaseEntity):
  """
  A country is an overarching entity that can hold information about all
  locations assigned to it.
  """
  __tablename__ = "country"
  id = db.Column(db.Integer, primary_key=True)
  title = db.Column(db.String, unique=True, nullable=False)
  vat = db.Column(db.Integer)
  currency = db.Column(db.String(3), default='EUR', nullable=False)

Passing this model to the lists_factory returns a lists strategy with unique_by function
lambda row: (row[0], row[1]), which means that the combination of id and title must be unique, but not id and title individually. This results in errors when I try to generate multiple records, as those fields may not be unique, especially when hypothesis is shrinking.

I would propose to change the unique_by-generating function to return a tuple of functions, one for each column. (lambda row: row[0], lambda row: row[1]) would only result in records that differ on both columns. I wouldn't mind opening a PR for this, if you agree that it would be nice to have this functionality.

Thanks in advance!

Ruben
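
The difference between one combined key and one key per column can be checked in plain Python. This is only an illustration of the two notions of uniqueness; the helper is_unique_by is hypothetical and is not how the library or hypothesis enforces them.

```python
from typing import Callable, Hashable, Sequence, Tuple

Record = Tuple[int, str]


def is_unique_by(records: Sequence[Record],
                 key: Callable[[Record], Hashable]) -> bool:
    """Check that ``key`` yields a distinct value for every record."""
    keys = [key(record) for record in records]
    return len(keys) == len(set(keys))


# Two rows that share ``id`` but differ in ``title``.
records = [(1, 'Belgium'), (1, 'France')]

# One combined key: the pairs differ, so the list passes the check.
combined_ok = is_unique_by(records, lambda row: (row[0], row[1]))

# One key per column: the repeated ``id`` is caught.
per_column_ok = all(is_unique_by(records, key)
                    for key in (lambda row: row[0], lambda row: row[1]))
```

Here combined_ok is True and per_column_ok is False, which is the situation that would violate a primary key or unique constraint on insertion.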

Ability to change module's context

To make the package more flexible, we need to provide a simple and convenient way of changing a module's context (such as global variables and functions).

Allow application of extra functions to strategy in lists_factory

I often generate lists of models using the lists_factory in hypothesis_sqlalchemy.tabular.records, but sometimes I have the issue that I need to apply a filter function to each generated model. For example, I have a model that has always one of three foreign keys set, so I want to filter out all examples where this is not the case. However, if I apply .filter to the entire list of generated models, hypothesis will take a very long time to figure out what is wrong and what to avoid, so I'd rather just filter out those specific examples that violate the constraint, while maintaining replayability.

Would it be possible to allow lists_factory to accept a function that is then applied to each individual values_tuples pair it generates?
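
The idea of rejecting individual rows instead of whole lists can be sketched without hypothesis at all. The code below is plain rejection sampling with the standard random module, and the row layout (three nullable foreign keys) is an assumption based on the description above. With hypothesis, the analogous move would be to call .filter on the element strategy before it is wrapped in a list.

```python
import random
from typing import List, Optional, Tuple

Row = Tuple[Optional[int], Optional[int], Optional[int]]


def exactly_one_set(row: Row) -> bool:
    """The constraint from the example: one of three foreign keys is set."""
    return sum(value is not None for value in row) == 1


def draw_row(rng: random.Random) -> Row:
    # Each foreign key is independently either missing or a small integer.
    return tuple(rng.choice([None, rng.randint(1, 9)]) for _ in range(3))


def draw_valid_rows(rng: random.Random, size: int) -> List[Row]:
    """Reject invalid rows one at a time instead of rejecting whole lists."""
    rows = []
    while len(rows) < size:
        row = draw_row(rng)
        if exactly_one_set(row):
            rows.append(row)
    return rows


rows = draw_valid_rows(random.Random(0), size=5)
```

Filtering whole lists would have to wait for a draw in which every row happens to be valid, which becomes less likely as the list grows; filtering per row discards only the offending element.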

Custom type support?

Interesting package, thank you for the effort!

Have you considered adding support for custom types? For example, suppose I use a custom type such as CurrencyType, which basically maps to a SQLA String type (src); then it would be useful to either

  • use the column’s type to map that to a strategy; or
  • use the column’s Python type to map that to a strategy (if the Python type is a simple one); or
  • use a user-provided custom strategy.

With the current version I receive an error:

>>> hypothesis_sqlalchemy.sample.table_records(Test.__table__)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../.venv/lib/python3.10/site-packages/hypothesis_sqlalchemy/core/table_records.py", line 15, in instances
    return columns_records.instances(list(table.columns),
  File "/.../.venv/lib/python3.10/site-packages/hypothesis_sqlalchemy/core/columns_records.py", line 29, in instances
    return strategies.tuples(*map(column_scalars, columns))
  File "/.../.venv/lib/python3.10/site-packages/hypothesis_sqlalchemy/core/column.py", line 39, in scalars
    result = column_type.scalars(column.type)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/functools.py", line 889, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/.../.venv/lib/python3.10/site-packages/hypothesis_sqlalchemy/core/column_type.py", line 134, in scalars
    return _values_by_python_types[type_.python_type]
KeyError: <class 'sqlalchemy_utils.primitives.currency.Currency'>

Here it would be useful to either:

  • return a text(max_size=3) strategy (a poor strategy based on the custom type’s wrapped String(3) type, but it’d work); or
  • return a sample_from(list(babel.core.get_global("all_currencies").keys())) strategy for the Python type Currency (although this wouldn’t be possible to derive from the Currency code); or
  • let the user provide the sample_from() strategy.
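
The options listed above amount to a resolution order: a user-supplied factory first, then a registry keyed by Python type. A minimal sketch of that order follows. Column, Currency, by_python_type and resolve are all hypothetical stand-ins, and the factories are plain callables rather than hypothesis strategies. The per-column keyword shown in the README (email_address=strategies.emails()) is the same kind of override.

```python
from typing import Any, Callable, Dict, Optional

Factory = Callable[[], Any]


class Currency:
    """Stand-in for a custom Python type used by a column."""


class Column:
    """Stand-in for a table column with a name and a Python type."""

    def __init__(self, name: str, python_type: type) -> None:
        self.name = name
        self.python_type = python_type


# Hypothetical built-in registry; it knows nothing about ``Currency``.
by_python_type: Dict[type, Factory] = {int: lambda: 0, str: lambda: ''}


def resolve(column: Column,
            overrides: Optional[Dict[str, Factory]] = None) -> Factory:
    """Prefer a user-supplied factory, then fall back to the registry."""
    overrides = overrides or {}
    if column.name in overrides:
        return overrides[column.name]
    return by_python_type[column.python_type]


currency_column = Column('currency', Currency)
factory = resolve(currency_column, overrides={'currency': lambda: 'EUR'})
```

Without the override, resolve(currency_column) fails with a KeyError for Currency, mirroring the traceback above.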

Shrinking nullable columns

According to the hypothesis docs, st.one_of shrinks towards elements earlier in the list. For nullable columns, you use result |= st.none(), which results in st.one_of(<other strategy>, st.none()).

Would it be desirable to interchange the order so the strategy will shrink towards NULLS rather than filled values?

[Question] Dependent data for correct database insertion

Thanks for this library! It was exactly what I needed and thought I'd have to write it till I saw this. :)

One question though: how have you handled tables with foreign keys? More generally, how have you approached generating records that depend on other records? The hypothesis docs seem to suggest using composite or using flatmap as in this Django example. Having to roll these one-off functions is unsatisfactory, since the FK metadata carries enough information to create such composites. How do you approach this in your apps?

I'm thinking about porting this clojure library to hypothesis. It has a general way of thinking about declaring dependent data with graphs and then generating example data from the graph in a succinct way. I'm thinking something like this paired with hypothesis-sqlalchemy could provide a slick way of generating complex DB data. Thoughts?
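
A rough sketch of the graph idea, using only the standard library (graphlib requires Python 3.9 or later): tables are sorted so that referenced tables come first, and each child row draws its foreign key from rows already generated. The schema, the column naming (parent + '_id') and the generate helper are made up for illustration; nothing here uses hypothesis or SQLAlchemy metadata.

```python
import random
from graphlib import TopologicalSorter  # Python 3.9+
from typing import Dict, List

# Hypothetical schema: each table maps to the tables its foreign keys reference.
dependencies = {'country': set(),
                'city': {'country'},
                'address': {'city'}}

# Parents come before the tables that reference them.
order = list(TopologicalSorter(dependencies).static_order())


def generate(rng: random.Random, rows_per_table: int) -> Dict[str, List[dict]]:
    """Fill tables in dependency order, drawing foreign keys from parents."""
    data: Dict[str, List[dict]] = {}
    for table in order:
        data[table] = [
            {'id': index,
             **{parent + '_id': rng.choice(data[parent])['id']
                for parent in dependencies[table]}}
            for index in range(1, rows_per_table + 1)]
    return data


data = generate(random.Random(0), rows_per_table=3)
```

In a hypothesis-based version the random draws would presumably be replaced by strategies chained with flatmap or composite, as the question suggests.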

Receive None for columns with default value.

In my model I have columns with default values. When the strategy generates values for these columns, it sometimes generates None. Is there a way to receive the default value instead of None?

I created a solution that works, but I was wondering if there is another way.

videos = tables.records.factory(Video.__table__).map(
    lambda record: {column.name: _set_default_value(column, record[index]) for index, column in enumerate(Video.__table__.columns)}
)

def _set_default_value(field, field_value):
    if field.default is not None and field_value is None:
        return field.default.arg
    return field_value

Support `sqlalchemy` 2

Hi, this is a great project. sqlalchemy 2 has been released, and it would be great if this supported it.

I'd be happy to open a PR.

Add max size to email strategy

It should be possible to call strategies.emails(max_size=N), as a database column for emails may have a length constraint much smaller than 255.

Currently I work around it with:

@defines_strategy_with_reusable_values
def emails(max_size: int = None,) -> SearchStrategy[str]:
    """A strategy for generating email addresses as unicode strings. The
    address format is specified in :rfc:`5322#section-3.4.1`. Values shrink
    towards shorter local-parts and host domains.

    This strategy is useful for generating "user data" for tests, as
    mishandling of email addresses is a common source of bugs.
    """
    from hypothesis.provisional import domains

    local_chars = string.ascii_letters + string.digits + "!#$%&'*+-/=^_`{|}~"
    local_part = text(local_chars, min_size=1, max_size=64)
    # TODO: include dot-atoms, quoted strings, escaped chars, etc in local part
    return builds("{}@{}".format, local_part, domains()).filter(
        lambda addr: len(addr) <= (max_size if max_size is not None else 254)
    )
