Subject of the issue
When a dataclass defines an InitVar attribute, TypedJsonMixin's __post_init__ method tries to validate the type of the InitVar attribute, which doesn't exist as an actual attribute on the class. Additionally, from_dict (and consequently from_json as well) fails to initialize a new instance because of a missing positional argument (more explanation follows).
Steps to reproduce
import dataclasses

from typed_json_dataclass import TypedJsonMixin

@dataclasses.dataclass
class Test(TypedJsonMixin):
    init: dataclasses.InitVar[str]
    a: int = 0
    b: str = ''

    def __post_init__(self, init: str) -> None:
        self.a = len(init)
        self.b = init[0]
        super().__post_init__()  # Necessary to call mixin

Test('foo')
# AttributeError: 'Test' object has no attribute 'init'
This is due to __post_init__ retrieving fields from the __dataclass_fields__ attribute, which includes the initialization variables. However, looking up such a field on the instance fails, because initialization variables are never stored on the instance; they are only passed to the __post_init__ method to finish initialization. The dataclasses.fields() function does not return these initialization variables, so using it instead would be a possible solution.
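The difference between the two lookups can be checked with a plain dataclass. This is a minimal sketch that does not use TypedJsonMixin at all; the class name Example is made up for illustration, and it only shows what the standard library reports, not how the mixin is implemented.

```python
import dataclasses


@dataclasses.dataclass
class Example:  # hypothetical stand-in for the Test class in this issue
    init: dataclasses.InitVar[str]
    a: int = 0

    def __post_init__(self, init: str) -> None:
        self.a = len(init)


# The raw mapping also lists the init-only pseudo-field.
raw_names = list(Example.__dataclass_fields__)

# dataclasses.fields() filters out InitVar (and ClassVar) pseudo-fields.
field_names = [f.name for f in dataclasses.fields(Example)]

instance = Example('foo')
# The InitVar value is only handed to __post_init__, never stored.
has_init_attr = hasattr(instance, 'init')
```

Here raw_names contains both 'init' and 'a', while field_names contains only 'a', so iterating over dataclasses.fields() would presumably avoid the AttributeError.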
However, instantiating a dataclass from a dictionary or JSON string would still fail, since these initialization variables are required arguments to the dataclass's __init__. See below.
import dataclasses

from typed_json_dataclass import TypedJsonMixin

@dataclasses.dataclass
class Test(TypedJsonMixin):
    init: dataclasses.InitVar[str]
    a: int = 0
    b: str = ''

    def __post_init__(self, init: str) -> None:
        self.a = len(init)
        self.b = init[0]
        # super().__post_init__()

f = Test('foo')
g = Test.from_json(f.to_json())
# TypeError: __init__() missing 1 required positional argument: 'init'
Disabling the super call to __post_init__ skips the type check but uncovers another issue. Since the to_dict method delegates to dataclasses.asdict, these initialization variables are not saved in the dictionary. They cannot be saved in the dictionary, as they no longer exist on the dataclass instance once initialization has finished. Therefore, I believe there is no solution to that issue that does not require either significant hacking of dataclass internals, dropping InitVar from a dataclass declaration, or using an inelegant workaround.
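The round-trip failure can be reproduced without the mixin. The sketch below assumes that from_dict ultimately passes the dictionary's entries to __init__ as keyword arguments (an assumption about the library, imitated here with Example(**as_dict)); Example is a hypothetical name.

```python
import dataclasses


@dataclasses.dataclass
class Example:  # hypothetical stand-in for the Test class in this issue
    init: dataclasses.InitVar[str]
    a: int = 0
    b: str = ''

    def __post_init__(self, init: str) -> None:
        self.a = len(init)
        self.b = init[0]


original = Example('foo')

# asdict() only serializes real fields, so 'init' is absent.
as_dict = dataclasses.asdict(original)

# Rebuilding from the dictionary cannot supply the required 'init' argument.
try:
    Example(**as_dict)
    round_trip_error = None
except TypeError as exc:
    round_trip_error = exc
```

The dictionary holds only 'a' and 'b', and the attempted reconstruction raises a TypeError for the missing 'init' argument, matching the error shown above.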
Two workarounds I have identified so far:
- Supply a default value to the initialization variable.
@dataclasses.dataclass
class Test(TypedJsonMixin):
    init: dataclasses.InitVar[str] = None
    a: int = 0
    b: str = ''

    def __post_init__(self, init: str) -> None:
        if init is None:
            # from_dict
            return
        self.a = len(init)
        self.b = init[0]
This turns it into a non-required argument and allows instantiation with __init__ to succeed. However, it requires us to check whether the initialization variable still has its default value, which is the case when the instance is created by from_dict. Furthermore, instantiations that do not supply the initialization variable, such as Test(), are now legal (but don't make sense).
- Mimic the initialization variable as a private field that is not considered for comparisons and __repr__.
@dataclasses.dataclass
class Test(TypedJsonMixin):
    _init: str = dataclasses.field(compare=False, repr=False)
    a: int = 0
    b: str = ''

    def __post_init__(self) -> None:
        self.a = len(self._init)
        self.b = self._init[0]
        super().__post_init__()
This makes it impossible to instantiate the dataclass without a value for _init, but keeps this value around throughout the lifetime of the instance. The __post_init__ method would also be called again to process _init after deserialization from JSON. It is possible to del the _init attribute at the end of __post_init__, but this again leads to issues when converting the dataclass to and from a dictionary.
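The trade-offs of both workarounds can be observed with plain dataclasses. This is only a sketch: the class names WithDefault and WithPrivateField are made up, the mixin is left out, and Cls(**dataclasses.asdict(obj)) stands in for a to_dict/from_dict round trip, which is an assumption about how the library rebuilds instances.

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class WithDefault:  # hypothetical: first workaround
    init: dataclasses.InitVar[Optional[str]] = None
    a: int = 0
    b: str = ''

    def __post_init__(self, init: Optional[str]) -> None:
        if init is None:
            return  # e.g. when rebuilt from a dictionary
        self.a = len(init)
        self.b = init[0]


@dataclasses.dataclass
class WithPrivateField:  # hypothetical: second workaround
    _init: str = dataclasses.field(compare=False, repr=False)
    a: int = 0
    b: str = ''

    def __post_init__(self) -> None:
        self.a = len(self._init)
        self.b = self._init[0]


# First workaround: the round trip works, but a bare call is legal too.
empty = WithDefault()
restored_default = WithDefault(**dataclasses.asdict(WithDefault('foo')))

# Second workaround: _init is serialized, and __post_init__ runs again,
# overwriting any change made to 'a' after construction.
private = WithPrivateField('foo')
private.a = 10
private_dict = dataclasses.asdict(private)
restored_private = WithPrivateField(**private_dict)
```

In this sketch empty ends up with the untouched defaults, and restored_private.a is recomputed from _init rather than keeping the modified value of 10.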
Neither is really a good solution to the problem. I'd love to hear your thoughts on this.