guyskk / validr
A simple, fast, extensible Python library for data validation.
License: Other
I'm trying to make a choice validator like test_custom_validator in test_custom_validator.py, but it would be better with a "choices" parameter, so I made this:
def choice_validator(choices: list or tuple):
    @validator_wrap(string=False)
    def _validator(value):
        if value not in choices:
            raise Invalid('invalid choice')
        return value
    return _validator
and used it like:
SP = SchemaParser(validators={'choice': choice_validator})
SP.parse({'type?choice(["A", "B"])': 'blahblah'})
Unfortunately, this raises an exception:
validr._exception.SchemaError: invalid JSON value in '["A", "B"]'
I found that the parameters between "()" are split by ",", so is there no way to pass an array to a custom validator?
An example of a custom validator with parameters would be very helpful.
Replace & with . and remove the @ grammar.
Overview:
# old
name?str&strip&default="world"&desc="Your name"
# new-yaml
name: str.strip.default="world".desc="Your name"
# new-python
name: T.str.strip.default("world").desc("your name")
# new refer
pet: T.ref("http://example.com/schema.json#Pet").optional.desc('description')
scalar:
    validator.bool.key=value
list:
    - validator.bool.key=value
    - arg0
    - arg1
dict:
    $self: validator.bool.key=value
    key0: value
    key1: value
refer:
    pet: ref("http://example.com/schema.json#Pet").optional.desc('description')
    pet:
        - ref.optional.desc('description')
        - http://example.com/schema.json#Pet
from validr import T

Welcome = T.dict(
    # old: message='str.desc="Welcome message"'
    message=T.str.desc("Welcome message")
).optional.desc('Welcome Object')

@route('/')
def welcome(
    # old: name: 'str.strip.default="world".desc="Your name"',
    name: T.str.strip.default("world").desc("Your name"),
) -> T.list(Welcome).minlen(3):
    return [{'message': 'hello ' + name}] * 3
Currently the datetime.strptime method is not flexible; it only supports a very strict format.
Using https://github.com/closeio/ciso8601 is a better way to parse the ISO 8601 datetime format.
Also add a tzaware option: if tzaware=True, return a datetime object with timezone info (UTC).
The change is backward compatible, with no breaking changes.
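A rough sketch of what the tzaware option could mean. To keep the example free of third-party dependencies, the standard library's datetime.fromisoformat stands in for ciso8601. The helper name parse_datetime, treating offset-less input as UTC, and returning a naive UTC datetime when tzaware=False are all assumptions for illustration, not validr's confirmed behavior.

```python
from datetime import datetime, timezone


def parse_datetime(text, tzaware=False):
    """Parse an ISO 8601 string and normalize it to UTC.

    Hypothetical helper: fromisoformat stands in for ciso8601 here.
    """
    dt = datetime.fromisoformat(text)
    if dt.tzinfo is None:
        # Assumption: input without an offset is treated as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    dt = dt.astimezone(timezone.utc)
    # tzaware=True keeps the UTC tzinfo; otherwise return a naive datetime.
    return dt if tzaware else dt.replace(tzinfo=None)


aware = parse_datetime("2018-01-01T08:00:00+08:00", tzaware=True)
naive = parse_datetime("2018-01-01T08:00:00+08:00")
print(aware.isoformat())  # 2018-01-01T00:00:00+00:00
print(naive.isoformat())  # 2018-01-01T00:00:00
```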
Steps to reproduce:
$ python3.10 -m venv py310
$ py310/bin/pip install validr
Output:
Installing collected packages: validr
Running setup.py install for validr ... error
error: subprocess-exited-with-error
× Running setup.py install for validr did not run successfully.
│ exit code: 1
╰─> [1381 lines of output]
VALIDR_SETUP_MODE=c
running install
/home/krat/Projects/py310/lib/python3.10/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.10
creating build/lib.linux-x86_64-3.10/validr
copying src/validr/exception.py -> build/lib.linux-x86_64-3.10/validr
copying src/validr/schema.py -> build/lib.linux-x86_64-3.10/validr
copying src/validr/validator.py -> build/lib.linux-x86_64-3.10/validr
copying src/validr/__init__.py -> build/lib.linux-x86_64-3.10/validr
copying src/validr/model.py -> build/lib.linux-x86_64-3.10/validr
copying src/validr/_validator_py.py -> build/lib.linux-x86_64-3.10/validr
copying src/validr/_exception_py.py -> build/lib.linux-x86_64-3.10/validr
creating build/lib.linux-x86_64-3.10/validr/_vendor
copying src/validr/_vendor/email_validator.py -> build/lib.linux-x86_64-3.10/validr/_vendor
copying src/validr/_vendor/__init__.py -> build/lib.linux-x86_64-3.10/validr/_vendor
copying src/validr/_vendor/durationpy.py -> build/lib.linux-x86_64-3.10/validr/_vendor
copying src/validr/_vendor/fqdn.py -> build/lib.linux-x86_64-3.10/validr/_vendor
running egg_info
writing src/validr.egg-info/PKG-INFO
writing dependency_links to src/validr.egg-info/dependency_links.txt
writing requirements to src/validr.egg-info/requires.txt
writing top-level names to src/validr.egg-info/top_level.txt
reading manifest file 'src/validr.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'src/validr.egg-info/SOURCES.txt'
copying src/validr/_exception_c.c -> build/lib.linux-x86_64-3.10/validr
copying src/validr/_exception_c.pyx -> build/lib.linux-x86_64-3.10/validr
copying src/validr/_validator_c.c -> build/lib.linux-x86_64-3.10/validr
copying src/validr/_validator_c.pyx -> build/lib.linux-x86_64-3.10/validr
copying src/validr/model.pyi -> build/lib.linux-x86_64-3.10/validr
copying src/validr/schema.pyi -> build/lib.linux-x86_64-3.10/validr
running build_ext
building 'validr._exception_c' extension
creating build/temp.linux-x86_64-3.10
creating build/temp.linux-x86_64-3.10/src
creating build/temp.linux-x86_64-3.10/src/validr
x86_64-linux-gnu-gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/krat/Projects/py310/include -I/usr/include/python3.10 -c src/validr/_exception_c.c -o build/temp.linux-x86_64-3.10/src/validr/_exception_c.o
src/validr/_exception_c.c: In function ‘__Pyx_call_return_trace_func’:
src/validr/_exception_c.c:1075:15: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘use_tracing’; did you mean ‘tracing’?
1075 | tstate->use_tracing = 0;
| ^~~~~~~~~~~
| tracing
*** a lot of similar lines go here ***
/usr/include/python3.10/cpython/unicodeobject.h:446:26: note: declared here
446 | static inline Py_ssize_t _PyUnicode_get_wstr_length(PyObject *op) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
error: command '/usr/bin/x86_64-linux-gnu-gcc' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure
For Python 3.11 the output differs, but it's still a compilation error.
python3 --version
Python 3.6.9
pip3 --version
pip 20.0.2 from /home/yumi/.local/lib/python3.6/site-packages/pip (python 3.6)
pip3 install validr
Processing => validr-1.1.3-cp36-cp36m-linux_x86_64.whl
python3
>>> from validr import T
>>> T.enum([1,2,3])
....
File "/home/yumi/.local/lib/python3.6/site-packages/validr/schema.py", line 424, in _check_items
raise SchemaError('items must be bool, int, float or str')
validr._exception_c.SchemaError: items must be bool, int, float or str
>>> T.enum(1,2,3)
...
File "/home/matu/.local/lib/python3.6/site-packages/validr/schema.py", line 362, in __call__
raise SchemaError("can't call with more than one positional argument")
validr._exception_c.SchemaError: can't call with more than one positional argument
Hi @guyskk
I found your validr_uncython.py script quite useful. After modifying it for my needs, I placed it as a standalone project here: https://github.com/JohannesBuchner/uncythonize and also uploaded it to PyPI. I am sure I will use it in the future.
Thank you!
Some changes:
import validr instead of import validater
ValidatorString, validr.validators, build_re_validator and builtin_validators
To handle invalid values more flexibly, two params will be added to all validators:
invalid_to(value): replace an invalid value with the specified value
invalid_to_default: replace an invalid value with the default value; the default must be set
Also, a value attribute will be added to the Invalid exception, and the error message will include the invalid value (long text will be truncated).
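The proposed behavior can be sketched in plain Python. Everything here is a stand-in: the Invalid class below is not validr's real exception, and with_fallback, validate_int and the 20-character truncation limit are hypothetical names and values chosen only to illustrate the idea.

```python
class Invalid(Exception):
    """Stand-in for validr's Invalid exception (not the real class)."""

    def __init__(self, message, value=None):
        # Truncate long values so the error message stays readable.
        text = repr(value)
        if len(text) > 20:
            text = text[:20] + "..."
        super().__init__("%s, value=%s" % (message, text))
        self.value = value


_UNSET = object()  # sentinel, so that None can be a legitimate fallback


def with_fallback(validate, invalid_to=_UNSET):
    """Wrap a validate function; on Invalid, return invalid_to if given."""
    def wrapper(value):
        try:
            return validate(value)
        except Invalid:
            if invalid_to is _UNSET:
                raise
            return invalid_to
    return wrapper


def validate_int(value):
    try:
        return int(value)
    except (TypeError, ValueError):
        raise Invalid("invalid int", value=value)


lenient = with_fallback(validate_int, invalid_to=0)
print(lenient("42"))   # 42
print(lenient("abc"))  # 0
```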
Validr is currently unsupported for server applications that require verifying POST requests. Since Validr has implemented AsciiTable as opposed to either a dictionary or a list of errors, the thrown exception is absolutely useless when creating RESTful APIs.
A better implementation would perhaps be the ability to define some variable within the base Model classes to give better control of the error:
@modelclass
class Model:
    error_type = dict  # proposed setting: report errors as a dict

class Person(Model):
    name = T.str.maxlen(16).desc('at most 16 chars')
    website = T.url.optional.desc('website is optional')
This way, when an error is found, the server could easily report what went wrong.
try:
    test = Person(name=True, website='')
except Exception as e:
    return json({'error': 'Invalid key(s) input.', 'keys': e})
Where the error e would be:
{'name': 'invalid string'}
And therefore the server is able to respond:
{
    'error': 'Invalid key(s) input.',
    'keys': {
        'name': 'invalid string'
    }
}
For now, unfortunately, I will have to go back to another json verifier. Great project nonetheless!
To support validating data against multiple schemas, similar to the anyOf and oneOf features in jsonschema, I propose a union schema.
The union schema covers two usage scenarios.
schema_by_type_or_keys = T.list(T.union([
    T.str,
    T.list(T.str),
    T.dict(key1=T.str),
    T.dict(key2=T.str, key3=T.str),
]))
valid_values = [
    "string",
    ["list", "of", "string"],
    {"key1": "key1 value"},
    {"key2": "key2 value", "key3": "key3 value"}
]
schema in json format:
[
    "union",
    "str",
    ["list", "str"],
    {"key1": "str"},
    {"key2": "str", "key3": "str"}
]
validate process:
def union_validator(compiler, items):
    scalar_inner = None
    list_inner = None
    dict_inners = {}
    for schema in items:
        assert schema.validator != 'union', 'ambiguous schema'
        assert not schema.optional and not schema.default, 'ambiguous schema'
        if schema.validator == 'list':
            assert list_inner is None, 'ambiguous schema'
            list_inner = compiler.compile(schema)
        elif schema.validator == 'dict':
            key = required_fields_of(schema)
            assert key not in dict_inners, 'ambiguous schema'
            # TODO: make sure only one inner schema will be selected
            dict_inners[key] = compiler.compile(schema)
        else:
            assert scalar_inner is None, 'ambiguous schema'
            scalar_inner = compiler.compile(schema)

    def validate(value):
        if isinstance(value, list):
            return list_inner(value)
        elif isinstance(value, dict):
            # TODO: optimize select inner schema
            for keys, inner in dict_inners.items():
                if keys.issubset(value.keys()):
                    return inner(value)
            return
        else:
            return scalar_inner(value)
    return validate
Example of selecting the inner schema:
dict schemas and their keys:
    schema1: k1,k2
    schema2: k1,k2,k3
    schema3: k1,k2,k4
    schema4: k1,k2,k5,k6
value keys and matched schema:
    k1,k2,k3,k4,k5,k6 -> schema4
    k1,k2,k3,k4,k5 -> schema3
    k1,k2,k3 -> schema2
    k1,k2 -> schema1
Logic: match the longest subset schema.
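The "longest subset" rule can be sketched as a small standalone function. The name select_schema is hypothetical. In the example above, schema2 and schema3 both match the keys k1..k5 with the same size, and schema3 is the listed result, so this sketch assumes that a later-declared schema wins a tie; the proposal does not state a tie-breaking rule.

```python
def select_schema(schemas, value_keys):
    """Return the name of the schema whose required keys form the
    longest subset of value_keys, or None if nothing matches."""
    value_keys = frozenset(value_keys)
    best_name, best_size = None, -1
    for name, required in schemas.items():
        required = frozenset(required)
        # ">=" lets a later schema win a tie (an assumption, see above).
        if required <= value_keys and len(required) >= best_size:
            best_name, best_size = name, len(required)
    return best_name


schemas = {
    "schema1": {"k1", "k2"},
    "schema2": {"k1", "k2", "k3"},
    "schema3": {"k1", "k2", "k4"},
    "schema4": {"k1", "k2", "k5", "k6"},
}
print(select_schema(schemas, ["k1", "k2", "k3", "k4", "k5", "k6"]))  # schema4
print(select_schema(schemas, ["k1", "k2", "k3", "k4", "k5"]))        # schema3
print(select_schema(schemas, ["k1", "k2", "k3"]))                    # schema2
print(select_schema(schemas, ["k1", "k2"]))                          # schema1
```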
schema_by_specified_field = T.list(T.union(
    smtp=T.dict(
        host=T.str,
        port=T.int,
        username=T.str,
        password=T.str,
    ),
    slack=T.dict(
        endpoint=T.url,
        token=T.str,
    )
).by('type'))
valid_values = [
    {
        "type": "smtp",
        "host": "localhost",
        "port": 25,
        "username": "guyskk",
        "password": "123456",
    },
    {
        "type": "slack",
        "endpoint": "https://api.slack.com",
        "token": "xxxxxx",
    },
]
schema in json format:
{
    "$self": "union.by('type')",
    "smtp": {
        "host": "str",
        "port": "int",
        "username": "str",
        "password": "str"
    },
    "slack": {
        "endpoint": "str",
        "token": "str"
    }
}
validate process:
def union_validator(compiler, items, by):
    inners = {}
    for key, schema in items.items():
        assert not schema.optional and not schema.default, 'ambiguous schema'
        inners[key] = compiler.compile(schema)

    def validate(value):
        by_type = value.get(by)
        return inners[by_type](value)
    return validate
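A self-contained sketch of dispatching on a specified field, with explicit errors for a missing or unknown tag. It uses plain callables instead of compiled validr schemas; union_by and require_keys are hypothetical helpers, and ValueError stands in for validr's Invalid.

```python
def union_by(inners, by):
    """Build a validate function that dispatches on value[by]."""
    def validate(value):
        if not isinstance(value, dict):
            raise ValueError("expected a dict")
        tag = value.get(by)
        if not isinstance(tag, str) or tag not in inners:
            raise ValueError("unknown %s: %r" % (by, tag))
        return inners[tag](value)
    return validate


def require_keys(*keys):
    """Toy inner validator: keep only the named keys, all required."""
    def validate(value):
        missing = [k for k in keys if k not in value]
        if missing:
            raise ValueError("missing fields: %s" % ", ".join(missing))
        return {k: value[k] for k in keys}
    return validate


validate = union_by(
    {
        "smtp": require_keys("type", "host", "port"),
        "slack": require_keys("type", "endpoint", "token"),
    },
    by="type",
)
slack = {"type": "slack", "endpoint": "https://api.slack.com", "token": "xxxxxx"}
print(validate(slack))
```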
Check List:
I am trying out a simple schema and I realize it is not possible to use optional while having additional attributes declared in a dict schema. Am I doing it correctly? I cannot find any relevant example regarding this usage.
As such,
from validr import T, modelclass, asdict, ValidrError

@modelclass
class Model:
    """Base Model"""

class Person(Model):
    url=T.url.desc("url")
    myObj = T.dict.optional(
        idx=T.int.optional.default(0)
    )

try:
    result = Person(url="http://test.com", myObj={
        "idx": 1
    })
    print(asdict(result))
except ValidrError as err:
    print(err.message)
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "", line 1, in
File "/tmp/pip-build-qqy3vem8/validr/setup.py", line 8, in
long_description = f.read()
File "/usr/lib/python3.5/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 576: ordinal not in range(128)
It looks like installation is broken with some system encodings.
I suggest replacing line 7 of setup.py
with open(os.path.join(dirname(__file__), 'README.md')) as f:
with
with open(os.path.join(dirname(__file__), 'README.md'), 'r', encoding='utf-8') as f:
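A minimal demonstration of why the explicit encoding matters. It writes a temporary README.md containing non-ASCII text (the first character encodes to bytes starting with 0xe7 in UTF-8) and reads it back with encoding='utf-8', which works regardless of the system locale. The file content is made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "README.md")
    # Write a README with non-ASCII content.
    with open(path, "w", encoding="utf-8") as f:
        f.write("validr \u7b80\u5355")
    # Passing encoding explicitly avoids falling back to the locale
    # default (e.g. ASCII), which is what triggers UnicodeDecodeError.
    with open(path, encoding="utf-8") as f:
        long_description = f.read()

print(len(long_description))  # 9
```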
Similar to the JSON Schema patternProperties and "additionalProperties": false/true features.
These are important features for supporting docker-compose config schemas: https://github.com/docker/compose/tree/master/compose/config
T.dict.key(T.str.match("^[a-zA-Z0-9._-]+$")).value(T.int)
{
"abc.A-B-C_123": 123,
"ABC.abc-4_5_6": 456,
}
T.dict(key=T.int).extra('discard')
T.dict(key=T.int) # default: discard
{"key": 123, "xxx": 123} -> {"key": 123}
T.dict(key=T.int).extra('keep')
{"key": 123, "xxx": 123} -> {"key": 123, "xxx": 123}
T.dict(key=T.int).extra('error')
{"key": 123, "xxx": 123} -> raise Invalid("xxx fields")
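The three extra modes above can be sketched on plain dicts. The function apply_extra is a hypothetical illustration of the proposed semantics, not part of validr, and ValueError stands in for Invalid.

```python
def apply_extra(value, known_keys, extra="discard"):
    """Handle keys not declared in the schema according to `extra`."""
    if extra not in ("discard", "keep", "error"):
        raise ValueError("extra must be 'discard', 'keep' or 'error'")
    unknown = sorted(k for k in value if k not in known_keys)
    if extra == "error" and unknown:
        raise ValueError("unexpected fields: " + ", ".join(unknown))
    if extra == "keep":
        return dict(value)
    # "discard" (and "error" with nothing unknown): keep declared keys only.
    return {k: v for k, v in value.items() if k in known_keys}


data = {"key": 123, "xxx": 123}
print(apply_extra(data, {"key"}))          # {'key': 123}
print(apply_extra(data, {"key"}, "keep"))  # {'key': 123, 'xxx': 123}
```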
I am working on the English documentation; suggestions and pull requests are welcome.
Currently a validator can output either a string or an object (non-string), and there is no convenient way to control its accept and output types. For example, the datetime validator always outputs a string, but sometimes it's better to output a datetime object.
To solve the problem I will introduce accept and output parameters to @validator().
accept parameter:
email, phone, idcard
dict, list
datetime, date, time, url, uuid, ipv4, ipv6
output parameter:
str, email, phone, idcard
dict, list, int, float
object parameter to control which to output, eg: datetime, date, time, url, uuid, ipv4, ipv6. To reduce conflict, the object parameter will be renamed to output_object in the validator's signature.
Usage example:
@validator(accept=(str, object), output=(str, object))
def datetime_validator(output_object=False):
    def validate(value):
        # do validation
        if output_object:
            return ...  # return datetime object
        else:
            return ...  # return datetime string
    return validate

datetime_str_schema = T.datetime
datetime_obj_schema = T.datetime.object
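A runnable sketch of the same idea without the @validator decorator, so it does not depend on validr. It accepts either a str or a datetime and uses output_object to pick the output type. The format string and the use of ValueError in place of Invalid are assumptions made for the example.

```python
from datetime import datetime

# Assumed wire format for this sketch only.
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def datetime_validator(output_object=False):
    def validate(value):
        # accept=(str, object): parse strings, pass datetime objects through.
        if isinstance(value, str):
            try:
                value = datetime.strptime(value, DATETIME_FORMAT)
            except ValueError:
                raise ValueError("invalid datetime") from None
        elif not isinstance(value, datetime):
            raise ValueError("invalid datetime")
        # output=(str, object): output_object selects which one to return.
        return value if output_object else value.strftime(DATETIME_FORMAT)
    return validate


to_str = datetime_validator()
to_obj = datetime_validator(output_object=True)
print(to_str(datetime(2018, 1, 1, 12, 30)))   # 2018-01-01T12:30:00.000000Z
print(to_obj("2018-01-01T12:30:00.000000Z"))  # 2018-01-01 12:30:00
```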
Special case for the str validator:
By default the str validator only accepts the str type, because all Python objects implement the __str__ method and simply converting an object to str would cause unwanted behavior.
So the str validator will have an accept_object parameter to control whether it should convert objects to str.
Backward compatibility:
The original string parameter will be deprecated but not removed, so there are no breaking changes.
string=True is equal to accept=(str, object), output=str
string=False is equal to accept=(str, object), output=object
The feature will be added in v1.1, maybe in a few months.
I have a schema like
plain_rule_schema_in = T.dict(
    enabled=T.bool.optional.default,
    ttl=T.int.optional,
    settings=T.dict(
        addr=T.netaddr
    )
)
plain_rule_list_in = T.dict(
    name=T.str,
    enabled=T.bool,
    description=T.str.optional.default(''),
    category=T.enum(
        ','.join([category.name for category in RuleListCategoriesEnum])
    ),
    rules=T.list(
        plain_rule_schema_in,
    ).maxlen(100000)
)
with a relatively big list of rule entities inside the list container. It processes fast enough (0.23 s for 20000 entities), but that is before I add a unique checker to the list. The time instantly jumps to 15-16 s!
I can imagine that the unique check is a little slow, yes, but not THAT slow. There may be a catch here, of course: for example, we could dump list/dict objects to JSON, and many others to str, to do substantially faster checks. You may think of something even faster, but for now this great check is literally unusable.
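The JSON idea mentioned above can be sketched as a linear-time uniqueness check: each item is serialized to canonical JSON text, which is hashable and can go into a set. This is an illustration of the suggestion, not how validr implements unique; find_duplicate is a hypothetical name. One caveat of this approach is that values that compare equal in Python but serialize differently (for example 1 and 1.0) are treated as distinct.

```python
import json


def find_duplicate(items):
    """Return the index of the first repeated item, or None if all are unique."""
    seen = set()
    for index, item in enumerate(items):
        # sort_keys makes dicts with the same content serialize identically.
        key = json.dumps(item, sort_keys=True)
        if key in seen:
            return index
        seen.add(key)
    return None


rules = [{"ttl": i, "settings": {"addr": "10.0.0.0/8"}} for i in range(20000)]
print(find_duplicate(rules))               # None
print(find_duplicate(rules + [rules[5]]))  # 20000
```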
I use validr==1.0.4, netaddr check is performed via
@validator(string=True)
def netaddr_validator(compiler):
    """Custom validator

    Params:
        compiler: can be used for compile inner schema
        items: optional, and can only be scalar type, passed by schema in `T.validator(items)` form
        some_param: other params
    Returns:
        validate function
    """
    def validate(value):
        """Validate function

        Params:
            value: data to be validate
        Returns:
            valid value or converted value
        Raises:
            Invalid: value invalid
        """
        try:
            value = str(netaddr.IPNetwork(value))
        except netaddr.core.AddrFormatError as ex:
            raise Invalid('{} is invalid addr'.format(value))
        return value
    return validate
I get this error while installing in an empty environment (virtual environment) with python 3.8.10:
pip install validr
Collecting validr
Using cached validr-1.2.1.tar.gz (291 kB)
Collecting idna>=2.5
Using cached idna-3.3-py3-none-any.whl (61 kB)
Collecting pyparsing>=2.1.0
Downloading pyparsing-3.0.8-py3-none-any.whl (98 kB)
|████████████████████████████████| 98 kB 154 kB/s
Collecting terminaltables>=3.1.0
Downloading terminaltables-3.1.10-py2.py3-none-any.whl (15 kB)
Building wheels for collected packages: validr
Building wheel for validr (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: /home/shayan/dev/test-validr/env/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-mtopyp8x/validr/setup.py'"'"'; __file__='"'"'/tmp/pip-install-mtopyp8x/validr/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-kx_fpz1i
cwd: /tmp/pip-install-mtopyp8x/validr/
Complete output (7 lines):
VALIDR_SETUP_MODE=c
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: setup.py --help [cmd1 cmd2 ...]
or: setup.py --help-commands
or: setup.py cmd --help
error: invalid command 'bdist_wheel'
----------------------------------------
ERROR: Failed building wheel for validr
Running setup.py clean for validr
Failed to build validr
Installing collected packages: idna, pyparsing, terminaltables, validr
Running setup.py install for validr ... done
Successfully installed idna-3.3 pyparsing-3.0.8 terminaltables-3.1.10 validr-1.2.1
Also I can't install it in my env created using pipenv with python 3.10.3 and I get:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2018' in position 41: ordinal not in range(128)
Is there a way to create a custom validator that can nest other validators (as list and dict do)?
I know I'm using validr in an unexpected way, but here is the case: I have DB results which I want to pack into an arbitrary data structure. One of the fields in the list of nested dicts (representing DB rows) is a bytea in PostgreSQL (Python bytes) holding a packed JSON dump -> str -> bytes. So I need to unpack it and, ideally, check the result as a dict.
Is it possible?
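One possible shape for such a validator, sketched without validr: decode the bytes, parse the JSON, then hand the result to an inner validate function. The names json_bytes_validator and require_dict are hypothetical, ValueError stands in for Invalid, and in a real validr custom validator the inner function would presumably come from compiling an inner schema with the compiler argument rather than being passed in directly.

```python
import json


def json_bytes_validator(inner):
    """Decode bytes -> str -> JSON, then apply the inner validator."""
    def validate(value):
        if not isinstance(value, (bytes, bytearray, memoryview)):
            raise ValueError("expected bytes")
        try:
            decoded = json.loads(bytes(value).decode("utf-8"))
        except ValueError:
            # Both UnicodeDecodeError and JSONDecodeError subclass ValueError.
            raise ValueError("invalid JSON payload") from None
        return inner(decoded)
    return validate


def require_dict(value):
    """Toy inner validator: the decoded payload must be a JSON object."""
    if not isinstance(value, dict):
        raise ValueError("expected a JSON object")
    return value


validate = json_bytes_validator(require_dict)
row = {"id": 1, "payload": b'{"addr": "10.0.0.0/8", "ttl": 60}'}
print(validate(row["payload"]))  # {'addr': '10.0.0.0/8', 'ttl': 60}
```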
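Improve the reference syntax
Currently references cannot take parameters, so there is no way to mark referenced data as optional.
Multiple references are not supported either, so several schemas cannot be combined the way multiple inheritance (mixins) would allow.
Also, the ? in "$self?&optional" is awkward and should be removed.
The improved syntax supports the following forms:
"?validater(arg1,arg2...)&key=value&..."
"(arg1,arg2...)&key=value&..."
"&key=value&..."
"@refer&key=value&..."
"@refer@refer&key=value&..."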