Comments (17)
@george-zubrienko in v1 all these coercions should be a feature flag. It may still be the default behavior btw.
from dataclasses-json.
Bumping to a new version and then documenting seems to me the best thing to do in this case. In the release notes for 0.6.0, just specify that there's a fix for a regression introduced in 0.5.9, and then explain what is the intended behavior from this version onward.
from dataclasses-json.
Alright, then this is what we'll do. I think we'll have some commits to release by end of this week, so expect 0.6.0 around Sunday :)
from dataclasses-json.
Opened an issue for v1 - when we get there, would be nice if you have time to participate in the discussion :) Thanks a lot for bringing this up!
from dataclasses-json.
This is not even type coercion, this is defo a regression because 1234 is not a boolean value. I'm looking into this.
from dataclasses-json.
This is the culprit in core.py
lines 247-250
elif _issubclass_safe(field_type, (int, float, str, bool)):
res = (field_value
if isinstance(field_value, field_type)
else field_type(field_value))
I'll check where this was introduced. This PR: #375
from dataclasses-json.
Now to sum up:
- The
42: str
to42: int
is a valid auto-conversion in many frameworks and DCJ should not be an exception, thus this should not raise, so
d = {"this_must_be_int": 1234, "this_must_be_string": 1234}
MyClass.from_dict(d) # MyClass(this_must_be_int=1234, this_must_be_string="1234")
should be fine. Sames goes for floats etc. But for bool this goes side way because bool(any non-empty value)
in python will always return True
and that is definitely wrong. I will submit a PR fixing this for bool
, let me know your thoughts on this change, if you think this warrants a bump to 0.6 @lidatong @s-vitaliy @matt035343 @USSX-Hares
from dataclasses-json.
Not sure minor bump is needed, since I would categorize it as a patch
from dataclasses-json.
@USSX-Hares I agree that it should be a feature flag. @george-zubrienko please add to task list.
I changed my mind, I recommend bumping to 0.6.0.
from dataclasses-json.
Thanks everyone for the quick feedback! I still have some thoughts on this issue:
- The
42: str
to42: int
is a valid auto-conversion in many frameworks and DCJ should not be an exception, thus this should not raise, so [...] Sames goes for floats etc.
I do not use Python frameworks on a daily basis and I do not know any that would do that to strings and numerals. I think this is a bad idea, for three reasons:
-
As python is a strongly typed language, no Python developer should ever expect a type change at runtime.
-
Type annotations are annotations. I think they should stay that way. It feels wrong to have them suddenly begin to have any operational value at runtime.
-
Regardless of the above (those are just my two cents, you guys do your stuff and it's fine), it still is an unexpected change of behavior between DCJ versions. Whatever you guys decided to do with booleans, I strongly believe the other types have to be treated the same way on that basis alone.
Thanks again all!
from dataclasses-json.
As python is a strongly typed language, no Python developer should ever expect a type change at runtime.
Python also has dynamic typing and is an interpreted language, the very existence of Union
type contradicts this statement. Type coercion is a common thing where JSON/YAML are used in real life, esp when dealing with configurations, so this is in fact a common practice. If you use app configurations (appsettings.json
, HOCON
etc), this is daily life, unless you go strict mode for everything. In HOCON you can implicitly convert stuff like 2s
to timespan of 2 seconds and this is actually an awesome feature. If you want a strict typecheck on JSON read, this is possible to do, but I agree this should be a feature flag and v0 of DCJ doesn't offer that, but I am working on v1 PR that will enable that.
Type annotations are annotations. I think they should stay that way. It feels wrong to have them suddenly begin to have any operational value at runtime.
This statement is the opposite of the first one. If we treat everything as recommendation, then we should have coercions as default - remember JSON is a text format and doesn't carry schema information like AVRO or PARQUET. In fact, it is a quite simple thing, how target types should be treated, but you seem to be more inclined towards strict compliance, which is one way to look at this, but not the only way ;)
it still is an unexpected change of behavior between DCJ versions
I'd argue with this. As stated in the reasoning from the original author #375, since DCJ has marshmellow
interop, quote
This differs from marshmallow decode behavior which properly converts to the integer value.
we should behave the way marshmellow users expect, and lack of type coercion was a bug, that turned out to be a feature for some. Thus, I agree that for some users - definitely, for others - long awaited fix that doesn't warrant a minor bump.
However, even the boolean change I've opened a PR for can cause even more issues to be opened, where people either:
- are annoyed coercion happens first place
- are annoyed coercion happens, but stuff like
[]
or{}
are not converted into booleans. Btw, conversion of "empty" objects into booleans is language-implementation specific, so we risk a huge flame war if we try to enforce something specific here
Thus, I will keep this issue opened and ref it to #442 where we will make both crowds happy by moving this behaviour to a feature flag, so everybody can have their own little DCJ doing the right thing. Until then, help welcome/stay tuned :)
from dataclasses-json.
Python also has dynamic typing and is an interpreted language, the very existence of
Union
type contradicts this statement.
I don't want this to sidetrack into unrelated discussions and all, but I should point out that you are confusing dynamic/static with weak/strong typing. You should know that python is both dynamic AND strongly typed. Names that refer to objects are not bound to any type and may be reassigned freely. That's the dynamic part, and that's what Union
is for. Union is not a type for the object it annotates, i.e. it's not your object that is "Union", it's the annotation itself. And that tells the type checker that the name X may refer to an object of either of A or B (or more) type. At runtime X will be strictly typed, and it will be A or B, and Union will have no existence. That's all what this means. Type is not dynamic. References are.
That's the context of my intervention here. Like I said, you proceed as you see fit, but I do believe that treating python dataclasses as JSON/YAML-like objects (by default or not) is, though convenient, definitely not pythonic. 😅
from dataclasses-json.
I don't want this to sidetrack into unrelated discussions and all
That is exactly what is happening here. If you are unhappy with the response to the issue and you think some changes are in order, that is one thing and can be discussed, but so far I do not see any response to the reasons outlined in my response, including the marshmellow
interop alignment, which was the main reason for this change in the first place. We have to take into account the broader audience here :)
In case you create an issue and put a bug label on it, we must investigate and decide on the best course of action, which includes you as the issue author, taking responsibility and helping us in this journey - for example push for minor version bump - do you think this is needed, or not? The but I do believe that treating python dataclasses as JSON/YAML-like objects (by default or not) is, though convenient, definitely not pythonic
does not really help to decide the course of action and frankfully just adds clutter to the issue thread, we could have gone with a Discussion
instead then ;)
You should know that python is both dynamic AND strongly typed
I never said the opposite :) Important part to note that we are dealing with text format serde in a language where you effectively have very little control of the actual type you are getting after deserialization, the magic "at runtime strictly typed" X.a
, if you seen that part of DCJ source code, is actually decided by the type hint on a dataclass field, because there is no other way to get that information. This can lead to issues, like the ones we see with booleans. That being said, please focus on the original problem and what do you think of the proposed solution above. I'd like to read that part instead.
from dataclasses-json.
Those are fair points. My apologies. I do acknowledge there is a broader context here with DCJ, which I tend to forget. I mainly use from_dict()
, with which the stakes are simpler. I have nothing else to say about the solution, in the sense that in the end I believe it is the developers' call. Bumping to 0.6.0 is fine! 😃
from dataclasses-json.
Bumping to 0.6.0 is fine
Good, I tend to think we should have probably done this too, which @matt035343 agreed as well. But given that the damage has already been done, what do you think is the best course of action? Should we just bump the next release, will that be an acceptable solution?
from dataclasses-json.
Small update, moving this to next week as we might have another commit which can introduce some funky behaviours, so I'm pushing 0.6.0 to next release some time next week
from dataclasses-json.
@jfsmith-at-coveo v0.6.0 released with a note on type coercion and a few other things. I'm closing this issue as resolved and promise this will be further addressed in v1 API when it comes out.
from dataclasses-json.
Related Issues (20)
- [FEATURE] V1 - JsonSerializer should support `list`, `dict` and types from `typing` library that are supported in v0
- [FEATURE] V1 - Class Hierarchy SerDe
- [FEATURE] V1 - Automatic serializer registration from module
- [BUG] Incompatibility with from __future__ import annotations Affects Overriding of Global Config in Python 3.11 HOT 4
- Dataclasses containing variables that reference themselves in list/dict fail to re-create as original type HOT 13
- [BUG] Generated schema doesn't take into account custom decoder
- [Guess] Looks like we have CRLF sneaked to the source code tree HOT 4
- [BUG] global configuration of encoder/decoder for type T is not used for Optional[T] field HOT 5
- _decode_dataclass ignores InitVars because of use of fields HOT 5
- [BUG] "Supported types" documentation is not up to date HOT 1
- [FEATURE] V1 - Type coercions feature
- [BUG] PyPI release broken due to GH changes to env protection HOT 16
- [BUG] Type hinting HOT 1
- [FEATURE] Use existing marshmallow schema
- [BUG] from_json of class with attribute of list of types which inherit basic types results in wrong result HOT 3
- [BUG] Dataclass.from_json with Union type fields infers missing fields and fails in the wrong spot HOT 1
- [minimal-repro-testcase] from v0.5.12 on, tuple fields get truncated HOT 5
- [BUG] No way to default explicit nulls/Nones to a value
- [BUG] Numpy decode issue...
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from dataclasses-json.