larryhastings / co_annotations
License: Other
version: da54ee
OS: Ubuntu 20.04.2 LTS (Focal Fossa)
Steps to reproduce:
$ ./configure --with-pydebug && make -j4
$ wget https://github.com/python/mypy/raw/2ac722ac7d844f4c5ae2f3c36f3470543cf01b11/mypy/checker.py
$ vim checker.py # Add `from __future__ import co_annotations`
$ ./python -m py_compile checker.py
python: Python/compile.c:5766: stackdepth_push: Assertion `b->b_startdepth < 0 || b->b_startdepth == depth' failed.
Aborted
In the python-dev thread I called it a "memory error" because I had built Python without pydebug. This is a copy from that mail, showing what I got when I used the co_annotations branch without pydebug:
$ ../co_annotations/python -m compileall mypy
Listing 'mypy'...
Compiling 'mypy/checker.py'...
free(): corrupted unsorted chunks
Aborted
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff7c73859 in __GI_abort () at abort.c:79
#2 0x00007ffff7cde3ee in __libc_message
(action=action@entry=do_abort, fmt=fmt@entry=0x7ffff7e08285 "%s\n") at
../sysdeps/posix/libc_fatal.c:155
#3 0x00007ffff7ce647c in malloc_printerr
(str=str@entry=0x7ffff7e0a718 "free(): corrupted unsorted chunks") at
malloc.c:5347
#4 0x00007ffff7ce81c2 in _int_free (av=0x7ffff7e39b80 <main_arena>,
p=0x555555d1db30, have_lock=<optimized out>) at malloc.c:4356
#5 0x0000555555603906 in PyMem_RawFree (ptr=<optimized out>) at
Objects/obmalloc.c:1922
#6 _PyObject_Free (ctx=<optimized out>, p=<optimized out>) at
Objects/obmalloc.c:1922
#7 _PyObject_Free (ctx=<optimized out>, p=<optimized out>) at
Objects/obmalloc.c:1913
#8 0x000055555567caa9 in compiler_unit_free (u=0x555555ef0fd0) at
Python/compile.c:583
#9 0x000055555568aea5 in compiler_exit_scope (c=0x7fffffffc3d0) at
Python/compile.c:760
#10 compiler_function (c=0x7fffffffc3d0, s=<optimized out>,
is_async=0) at Python/compile.c:2529
#11 0x000055555568837d in compiler_visit_stmt (s=<optimized out>,
c=0x7fffffffc3d0) at Python/compile.c:3665
#12 compiler_body (c=c@entry=0x7fffffffc3d0, stmts=0x555556222450) at
Python/compile.c:1977
#13 0x0000555555688e51 in compiler_class (c=c@entry=0x7fffffffc3d0,
s=s@entry=0x555556222a60) at Python/compile.c:2623
#14 0x0000555555687ce3 in compiler_visit_stmt (s=<optimized out>,
c=0x7fffffffc3d0) at Python/compile.c:3667
#15 compiler_body (c=c@entry=0x7fffffffc3d0, stmts=0x5555563014c0) at
Python/compile.c:1977
#16 0x000055555568db00 in compiler_mod (filename=0x7ffff72e6770,
mod=0x5555563017b0, c=0x7fffffffc3d0) at Python/compile.c:2001
Another problem brought up by Joseph Perez on python-dev:
from dataclasses import dataclass

@dataclass
class User:
    name: str
    friends: list[User]
This will break because the implementation of the dataclass decorator will access the class's __annotations__, but at that point User is not defined yet, so the user will get a NameError.
This is a similar problem to #1, and possible solutions are similar, e.g. explicitly quoting the forward reference: list["User"].
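The quoting workaround can be checked on stock Python today; a minimal sketch (the field values and the default_factory are illustrative additions, not part of the original report):

```python
from dataclasses import dataclass, field

@dataclass
class User:
    name: str
    # Quoting defers evaluation of the name, so the self-reference
    # is legal even though User is not fully defined at this point.
    friends: list["User"] = field(default_factory=list)

alice = User(name="alice")
alice.friends.append(User(name="bob"))
print(alice.friends[0].name)  # -> bob
```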
Hey there -
This is a SQLAlchemy 2.0 ORM mapping:
from __future__ import annotations

from sqlalchemy import ForeignKey
from sqlalchemy.orm import DeclarativeBase
from sqlalchemy.orm import Mapped
from sqlalchemy.orm import mapped_column
from sqlalchemy.orm import relationship

class Base(DeclarativeBase):
    pass

class Parent(Base):
    __tablename__ = "parent_table"

    id: Mapped[int] = mapped_column(primary_key=True)
    child_id: Mapped[int] = mapped_column(ForeignKey("child_table.id"))
    child: Mapped[Child] = relationship(back_populates="parents")

class Child(Base):
    __tablename__ = "child_table"

    id: Mapped[int] = mapped_column(primary_key=True)
    parents: Mapped[list[Parent]] = relationship(back_populates="child")
Above, SQLAlchemy 2.0 uses runtime inspection of __annotations__ to see things like "the Parent class has a link to the Child class" as well as "the Child class has a link to the Parent class".
I just read PEP 649 and this is going to make things more difficult for us; probably not impossible, but very difficult to work around. Currently, the runtime inspection of __annotations__ occurs within the creation of the Parent and Child classes themselves, that is, using __init_subclass__(). Based on what I see in PEP 649, this would raise a TypeError (or something) because Parent class creation would consume __annotations__ and Child is not defined yet.
The irony here is that non-typed SQLAlchemy, which has been around for a decade and looks like this:

# SQLAlchemy pre pep-484 integration
class Parent(Base):
    __tablename__ = "parent_table"

    id = Column(Integer, primary_key=True)
    child_id = Column(ForeignKey("child_table.id"))
    child = relationship("Child", back_populates="parents")

above actually does defer the evaluation (but not the consumption, since it's just a string) of the "Child" string name until all classes have been defined. This made it easy for us to move to using __annotations__, since with ForwardRef("Child"), we get that same string just as we always have.
PEP 649 means we can no longer get that name at all, even though it's right there in the source code of the file, until it's importable. I'm not familiar with anything else in Python that works this way; even with "from foo import bar" you can override __import__() to see the strings coming in.
Has it been considered that if __annotations__ becomes an evaluable descriptor, there would be some other way to simply get the raw names from a class without using eval()? That would keep things open for tools like SQLAlchemy that have spent a lot of time working with current approaches.
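For reference, the behavior SQLAlchemy relies on today can be reduced to a small stand-alone sketch (the Base class and the links attribute here are illustrative, not the real SQLAlchemy API): under PEP 563, __init_subclass__ can read the raw annotation strings before the referenced class exists.

```python
from __future__ import annotations


class Base:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Under PEP 563 the annotation values are plain strings,
        # so no eval() is needed to learn that Parent refers to "Child".
        cls.links = dict(cls.__dict__.get("__annotations__", {}))


class Parent(Base):
    child: Child  # Child is undefined here; we still get the string "Child"


class Child(Base):
    parents: list[Parent]


print(Parent.links)  # -> {'child': 'Child'}
print(Child.links)   # -> {'parents': 'list[Parent]'}
```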
63b415c papers over the issue, but we should figure out why it's too small in the first place and fix the root cause.
So far I haven't been able to find a repro case; I'm not sure where the checker.py mentioned in the commit message comes from, as there's no checker.py in the CPython repo.
As mentioned in a couple of other tickets, I was recently re-reading https://lukasz.langa.pl/61df599c-d9d8-4938-868b-36b67fdb4448/ (prompted by the release of Python 3.11) and considering the concerns it raises with PEP 649.
The circular reference and if TYPE_CHECKING: questions are already discussed in #1, #2, and #17, so I won't repeat them here.
This ticket is instead about two of the article's other concerns: using string annotations to backport newer type hinting syntax, and the fate of from __future__ import annotations.
For the first, I think "Explicitly quote annotations that use type hinting syntax that only becomes valid at runtime in later Python versions" is sufficient to address the point. Even string annotations would need to rely on explicit quoting to backport type hinting syntax that only becomes valid at compile time in future versions, and code object annotations would be starting from the much richer Python 3.12 type hinting syntax baseline.
For the second, PEP 649 doesn't currently spell out the expected fate of from __future__ import annotations. Łukasz's article assumes it would be deprecated, emit deprecation warnings, and eventually stop working outright, but I'm not sure that's the right thing for it to do (due to the cross-version compatibility concerns that Łukasz raises in his post).
Instead, the following seems like a cleaner way forward:
- While the PEP 649 semantics remain opt-in via their own __future__ import, from __future__ import annotations continues to function as it does in 3.9, without any deprecation warnings (just as regular annotations wouldn't have any deprecation warnings).
- Once the PEP 649 semantics become the default, they also apply to code using from __future__ import annotations (so the semantics of the latter import still change from what they were previously, just as they would for code with no __future__ import, but there's no hard error on the __future__ import itself).
That way, just as concerns with migrating from eager annotations to lazy annotations can be resolved during the __future__ import period, so can concerns with migrating from string annotations to code object based annotations, with the goal being that the eventual semantic change doesn't cause any significant problems in either case.
This idea assumes that the practical issues with migrating from string annotations to code object based annotations are resolved, but those are presumably going to have to be resolved anyway in order for PEP 649 to be accepted by the SC.
I'm trying to get feature parity with existing (non-stringized) annotations. Consider this example:
# from __future__ import co_annotations

class C:
    abc = 'xyz'
    def method(self, a: abc = abc):
        ...

import inspect
print(inspect.signature(C.method))
That code works fine in Python 3.9. But if you try it in the co_annotations branch and uncomment the first line, it no longer works.
I'm pretty sure it's because __co_annotations__ is a function, the function is being defined inside a class, and Python doesn't let functions defined inside of classes see the class namespace. That's why it works if co_annotations isn't turned on; the __annotations__ dict is generated from within the class namespace, so the lookup works fine.
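The scoping rule in question can be demonstrated with stock Python: default values are evaluated directly in the class namespace while the class body runs, whereas a function body (which is what a co_annotations function would be) skips the class namespace entirely.

```python
class C:
    abc = 'xyz'

    # Default value: evaluated in the class namespace at def time,
    # so the name lookup succeeds.
    def with_default(self, a=abc):
        return a

    # Body reference: executed later as ordinary function code,
    # which does not see class-level names.
    def from_body(self):
        return abc  # NameError at call time


print(C().with_default())  # -> xyz
try:
    C().from_body()
except NameError as e:
    print("NameError:", e)
```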
But, well. co_annotations functions are special, aren't they? And we want them to see fields defined in the class namespace. In fact we want them to be closures, keeping references to those values alive.
I'm posting this here in case anybody seeing this (Guido, Jelle) can suggest how to fix this. My best guess so far is that I need to add a fourth _Py_block_ty to symtable.h for co_annotations functions. This would be a nested block inside a ClassBlock that behaved like a FunctionBlock except it was allowed to see the definitions inside the ClassBlock. My fear is that there might be baked-in assumptions inside classes that e.g. a ClassBlock would never ever contain a... cell variable? Is that right? I'm still mostly working in the dark when it comes to how closures are implemented in Python.
Context: I'm aiming to test import performance of __future__.co_annotations (relative to __future__.annotations) on a large realistic annotation-heavy codebase, as I offered to do in https://mail.python.org/archives/list/[email protected]/message/VBG2LXU6OHROQ3NPF373L7W4W23B24DE/
I'm finding that the current build doesn't seem to be able to even import the added Lib/test/future_co_annotations.py file without a crash; I'm wondering if I'm doing something wrong, or if this is a known issue?
The issue I'm seeing is that the compiler puts a tuple of (<co_annotations code object>, locals()) on the stack in a case where the annotations make use of local names (e.g. the Nested.f case in future_co_annotations.py), but there doesn't seem to be any handling of such a tuple in the MAKE_FUNCTION target in ceval; it just crashes on the assert(TOP()->ob_type == &PyCode_Type); because the stack top is a tuple, not a code object.
If this is a known issue that needs fixing, I may be able to work on a fix; I just wanted to check first whether this is the right rabbit hole to go down.
(Adding the issue here rather than the CPython repo since the PEP 649 implementation hasn't landed yet)
Since Python 3.2, functools.update_wrapper has eagerly copied o.__annotations__ from the wrapped object to the wrapper object.
This will do the wrong thing under PEP 649, since it will trigger early evaluation of the annotations on wrapped functions (and hence potentially fail if the annotations contain forward references).
I think the following changes will be sufficient to resolve the incompatibility:
- drop __annotations__ from the default WRAPPER_ASSIGNMENTS list
- set __annotate__ on the wrapper object as described below
For maximum compatibility, I think the wrapper annotations will need to be retrieved as follows:
def __annotate__(format):
    if format == 1:
        # Ensure previously cached annotations are respected
        return wrapped.__annotations__
    # Ugh, maybe the guts of `get_annotations` could live down in `functools`?
    import inspect
    return inspect.get_annotations(wrapped, format)

wrapper.__annotate__ = __annotate__
I initially thought we could just add __annotate__ to WRAPPER_ASSIGNMENTS, but I realised that doesn't play nice if a previous decorator has already eagerly evaluated the annotations, and perhaps modified them (in which case, wrapped.__annotate__ may even be None if the other decorator has been updated to behave as PEP 649 recommends).
The comment in the implementation sketch relates to the fact that inspect imports functools, so having functools depend on an inspect API introduces a circular dependency. Perhaps the piece that update_wrapper needs could be factored out to a lower level functools._get_annotations helper API that inspect.get_annotations also invokes? If we did that, the PEP 649 compatible update_wrapper code would become:
def __annotate__(format):
    return _get_annotations(wrapped, format)

wrapper.__annotate__ = __annotate__
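The eager copy described above is easy to observe on current Python (this snippet only shows today's behavior; the string annotation is a made-up forward reference, and __annotate__ plays no part here):

```python
import functools


def wrapped(x: "NotDefinedAnywhere") -> None:
    ...


@functools.wraps(wrapped)
def wrapper(*args, **kwargs):
    return wrapped(*args, **kwargs)


# update_wrapper copies __annotations__ over at decoration time,
# so the wrapper ends up carrying the wrapped function's annotations.
# Today the quoted annotation is just a harmless string; under PEP 649
# this same copy would force evaluation of the annotations.
print(wrapper.__annotations__)
```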
Currently __prepare__ only requires that the returned object implement __setitem__ and __getitem__. It appears that co_annotations requires that the object actually be a subclass of dict. Minimum code to reproduce:
class TestDict:
    def __init__(self):
        self.data = {}

    def __setitem__(self, key, value):
        self.data[key] = value

    def __getitem__(self, item):
        return self.data[item]

class CustomType(type):
    @classmethod
    def __prepare__(metacls, name, bases):
        return TestDict()

    def __new__(metacls, name, bases, namespace, /, **kwargs):
        namespace = namespace.data
        cls = super().__new__(metacls, name, bases, namespace, **kwargs)
        return cls

class A(metaclass=CustomType):
    val: str

A.__annotations__  # errors here
type.__new__ does require that its third argument be a subclass of dict, but any metaclass that converts the namespace as above avoids that issue.
I'm not sure if this is just a bad assertion and it's not actually a problem for the code, or if it is a genuine issue.
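For comparison, the same metaclass pattern runs cleanly on stock Python when the prepared namespace subclasses dict; a hedged workaround sketch (DictNamespace is a made-up name), not a fix for the branch itself:

```python
class DictNamespace(dict):
    """A dict subclass, so it satisfies consumers that require a real dict."""


class CustomType(type):
    @classmethod
    def __prepare__(metacls, name, bases):
        # Because this is a dict subclass, type.__new__ accepts it directly;
        # no conversion step is needed in __new__.
        return DictNamespace()


class A(metaclass=CustomType):
    val: str


print(A.__annotations__)  # -> {'val': <class 'str'>}
```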
The following operators should be explicitly disallowed in annotation functions when co_annotations is active:
- := (walrus),
- yield and yield from, and
- await.
Their use in an annotation should result in a compile-time error. There should be tests for each of these in the standard library.
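Such tests might probe the compiler directly; a rough sketch of the shape they could take (note that stock Python currently accepts a parenthesized walrus inside an annotation, while await and yield in these positions are already rejected because the annotation is evaluated in the enclosing non-async, non-generator scope — under co_annotations all three should become compile-time errors):

```python
# Probe how the compiler treats each operator inside an annotation.
samples = {
    "walrus": "def f(x: (y := 3)): pass",
    "yield": "def g(x: (yield)): pass",
    "await": "async def h(x: await q()): pass",
}

results = {}
for name, src in samples.items():
    try:
        compile(src, "<annotation-test>", "exec")
        results[name] = "accepted"
    except SyntaxError:
        results[name] = "SyntaxError"

print(results)
```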
This was brought up by Joseph Perez on the mailing list. The problem is that this is a fairly common idiom:
from __future__ import annotations
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from expensive_module import SomeType

def f(x: SomeType) -> None: ...

f.__annotations__  # NameError under current PEP 649
My only idea for a solution is to make an undefined name produce some special object, like typing.ForwardRef, in an annotation. But that may introduce more complexity, because annotations aren't just names. I can think of three more operations we'd have to support with current standard library typing:
- SomeType may be generic, so we'll have to support SomeType[int];
- unions written as SomeType | int;
- appearing nested in another annotation, e.g. list[SomeType], where the special object could get caught up by overzealous runtime typechecking. For example, typing.Union would currently reject it.
(I opened this issue because I feel like it's an easier way to have a focused discussion on a single problem. If you disagree, feel free to let me know.)
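The "overzealous runtime typechecking" concern can be seen with a plain stand-in object on stock Python (Missing is a hypothetical sentinel for illustration, not anything proposed by the PEP):

```python
import typing


class Missing:
    """Hypothetical stand-in for an undefined annotation name."""


m = Missing()

# Bare containers don't validate their arguments, so nesting works...
alias = list[m]
print(alias)

# ...but typing's constructors do validate, and reject arbitrary objects.
try:
    typing.Union[m, int]
except TypeError as exc:
    print("typing.Union rejected it:", exc)
```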