drivendataorg / erdantic
Entity relationship diagrams for Python data model classes like Pydantic
Home Page: https://erdantic.drivendata.org/
License: MIT License
Workflow failed: tests #207
Logging this issue so that other people don't have to suffer as much as I did.
I copied the same dataclass example into a newdataclass.py Python file, ran erdantic newdataclass -o diagram.png, and got the following error.
After about 30 minutes of debugging by running
from importlib import import_module
import_module("newdataclass")
I figured out it was because PYTHONPATH wasn't set to the current directory.
Traceback (most recent call last):
File "/Users/bhavaniravi/.virtualenvs/python-everyday/lib/python3.9/site-packages/erdantic/cli.py", line 131, in import_object_from_name
return import_module(full_obj_name)
File "/opt/homebrew/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'newdataclass'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/bhavaniravi/.virtualenvs/python-everyday/bin/erdantic", line 8, in <module>
sys.exit(app())
File "/Users/bhavaniravi/.virtualenvs/python-everyday/lib/python3.9/site-packages/erdantic/cli.py", line 108, in main
model_or_module_objs = [import_object_from_name(mm) for mm in models_or_modules]
File "/Users/bhavaniravi/.virtualenvs/python-everyday/lib/python3.9/site-packages/erdantic/cli.py", line 108, in <listcomp>
model_or_module_objs = [import_object_from_name(mm) for mm in models_or_modules]
File "/Users/bhavaniravi/.virtualenvs/python-everyday/lib/python3.9/site-packages/erdantic/cli.py", line 135, in import_object_from_name
module_name, obj_name = full_obj_name.rsplit(".", 1)
ValueError: not enough values to unpack (expected 2, got 1)
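The root cause described above (the directory containing the module is not on the import path) can be reproduced and worked around in a few lines. This is a minimal sketch, not erdantic's own fix: the helper name import_from_directory and the module name newdataclass_demo are made up for illustration. On a POSIX shell, setting PYTHONPATH=. before the erdantic command should have the same effect.

```python
import importlib
import os
import sys
import tempfile


def import_from_directory(module_name, directory):
    """Import `module_name` after making sure `directory` is on sys.path.

    Hypothetical helper for illustration; erdantic's CLI may handle this differently.
    """
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        # Equivalent in spirit to adding the directory to PYTHONPATH.
        sys.path.insert(0, directory)
    # Make sure the import system notices files created after startup.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Demo: write a throwaway module to a temp directory and import it by name.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "newdataclass_demo.py"), "w") as f:
        f.write("VALUE = 42\n")
    module = import_from_directory("newdataclass_demo", tmp)
    loaded_value = module.VALUE
```

Without the sys.path entry, import_module raises ModuleNotFoundError, which is the first exception in the traceback above.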
Workflow failed: tests #209
It would be cool to have a CLI since making a diagram is a pretty simple action.
erdantic mymodule.MyModel --out diagram.png
Is it possible to fine-tune the diagram generation by passing edge and node attributes down to Graphviz, to approximate UML class diagrams?
If a composition tree is enormous, it may be useful to split it up into multiple diagrams.
One way to do that is to let users specify terminal nodes, drawn in a way that shows the tree has been cut at that point.
Would be a nice convenience feature if users could specify a module (or multiple modules) and have erdantic loop through its namespace and find any data model classes.
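The namespace scan requested here could look roughly like the following. This is a sketch under assumptions: it only detects standard library dataclasses, find_dataclasses is a hypothetical name, and the demo module is built in memory so the example is self-contained.

```python
import dataclasses
import inspect
import sys
import types


def find_dataclasses(module):
    """Return dataclass types defined in `module` itself (not merely imported into it)."""
    return [
        obj
        for _, obj in inspect.getmembers(module, inspect.isclass)
        if dataclasses.is_dataclass(obj) and obj.__module__ == module.__name__
    ]


# Demo: build a throwaway module in memory instead of reading one from disk.
DEMO_SOURCE = '''
import dataclasses

@dataclasses.dataclass
class Quest:
    name: str

@dataclasses.dataclass
class Adventurer:
    name: str
    quest: Quest

class NotAModel:
    pass
'''

demo = types.ModuleType("demo_models")
sys.modules["demo_models"] = demo  # register so class lookups by module name succeed
exec(DEMO_SOURCE, demo.__dict__)

found_names = [cls.__name__ for cls in find_dataclasses(demo)]
```

The same loop could be extended with a check for Pydantic models; that part is left out so the sketch has no third-party dependencies.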
Currently, fields with lists of models will render edges for lists. However, constrained lists will not. Here is a basic example:
from pydantic import BaseModel, conlist, Field
from typing import List
import erdantic as erd

class Foo(BaseModel):
    name: str

class Bar(BaseModel):
    foo1: List[Foo]
    foo2: conlist(Foo, unique_items=True)
    foo3: List[Foo] = Field(..., unique_items=True)

diagram = erd.create(Bar)
print(diagram)
"EntityRelationshipDiagram(models=[PydanticModel(Bar), PydanticModel(Foo)], edges=[Edge(source=PydanticModel(Bar), source_field=<PydanticField: 'foo1', List[Foo]>, target=PydanticModel(Foo))])"
diagram.draw("foobar.png")
Dataclasses will show List[erdantic.examples.dataclasses.Adventurer] instead of List[Adventurer].
Example:
import dataclasses
import typing
import erdantic as erd

@dataclasses.dataclass
class A:
    bees: typing.List['B']

@dataclasses.dataclass
class B:
    x: int

erd.draw(A, B, out="diagram.png")
This produces:
It seems that since both classes are known by erd, it should be possible to substitute the forward reference during draw.
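One possible way to do that substitution uses only the standard library: typing.get_type_hints evaluates forward references against a namespace, so passing the classes handed to the drawing call as that namespace resolves 'B'. This is a sketch of the idea, not a description of how erdantic resolves types; the known_models dictionary is an illustrative stand-in for the classes passed by the user.

```python
import dataclasses
import typing


@dataclasses.dataclass
class A:
    bees: typing.List["B"]


@dataclasses.dataclass
class B:
    x: int


# The raw annotation still holds an unresolved forward reference to 'B'.
raw_annotation = {f.name: f.type for f in dataclasses.fields(A)}["bees"]

# Namespace built from the classes the caller supplied (assumption for this sketch).
known_models = {cls.__name__: cls for cls in (A, B)}

# get_type_hints evaluates forward references using the supplied namespace.
resolved = typing.get_type_hints(A, localns=known_models)
```

After this, resolved["bees"] is typing.List[B] with the real class in place of the string.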
GitHub Actions workflow tests #295 failed.
Event: schedule
Branch: main
Commit: 5bb6c04cddfb93aae3ad674e4c39ef0f09938c23
Created by jayqi/failed-build-issue-action
GitHub Actions workflow tests #285 failed.
Event: schedule
Branch: main
Commit: c094337c79029d3d6cc530748b5fb80a46d58ab0
Created by jayqi/failed-build-issue-action
I suggest supporting the alias argument of pydantic.Field. Currently erdantic ignores it.
Here is an example with a simple Gift class:
from pydantic import BaseModel, Field

class Gift(BaseModel):
    for_: str = Field(alias="for")

erdantic produces:
However, pydantic displays the field as for:
>>> Gift.schema()
{'title': 'Gift', 'type': 'object', 'properties': {'for': {'title': 'For', 'type': 'string'}},
'required': ['for']}
Libraries like FastAPI use that schema in Swagger generations.
Workflow failed: tests #163
The workflow tests.yml references the action nwtgck/actions-netlify at v1.1. However, this reference is missing the commit 223b5b5981680adfe8ec4c9f471620cbbdbfadec, which may contain a fix for a vulnerability.
The vulnerability fix missing from this version of the action could be related to:
(1) a CVE fix
(2) an upgrade of a vulnerable dependency
(3) a fix for a secret leak, among others.
Please consider updating the reference to the action.
Workflow failed: tests #184
As a user
I want to see the relation between a class and the class it inherits from, with each member shown on the class that defines it
So that I get a correct UML view of the classes.
example: https://github.com/drivendataorg/erdantic/blob/main/erdantic/examples/pydantic.py
Additional class:
class PlannedParty(Party):
    """A planned group of adventurers finding themselves doing and saying things altogether unexpected.

    Attributes:
        name (str): Name that party is known by
        formed_datetime (datetime): Timestamp of when the party was formed
        members (List[Adventurer]): Adventurers that belong to this party
        active_quest (Optional[Quest]): Current quest that party is actively tackling
        planned_quests (List[Quest]): A list of quests that the party can tackle
    """

    planned_quests: List[Quest] = []
(Although this is an erd package, it can have added value. I understand if this issue is closed as it might not fit the vision.)
Hey,
I would also like to display the description and other attributes of a field in extra columns. Is this possible with erdantic?
Field
Model
test case:
https://github.com/adsharma/fquery/blob/master/tests/mock_user.py
Right now, I get only one class in the diagram, namely: MockUser. Would like to see "friends" and "reviews" edges/objects in the output.
Workflow failed: tests #212
Microsoft Windows [Version 10.0.22000.1098]
(c) Microsoft Corporation. All rights reserved.
C:\Users\Max>pip install https://github.com/drivendataorg/erdantic.git#egg=erdantic
Collecting erdantic
Downloading https://github.com/drivendataorg/erdantic.git
| 214.4 kB 29.0 kB/s 0:00:07
ERROR: Cannot unpack file C:\Users\Max\AppData\Local\Temp\pip-unpack-fpbmoqaz\erdantic.git (downloaded from C:\Users\Max\AppData\Local\Temp\pip-install-6ohmio06\erdantic_d45a27d4d8f1499baaee813a654749bf, content-type: text/html; charset=utf-8); cannot detect archive format
ERROR: Cannot determine archive format of C:\Users\Max\AppData\Local\Temp\pip-install-6ohmio06\erdantic_d45a27d4d8f1499baaee813a654749bf
Currently erdantic is the working name of this package, a play on "ERD" and "pydantic".
What this package does is create entity relationship diagrams (ERDs or ER diagrams) for Python data modeling classes, including standard library dataclasses and Pydantic models. It's written to be generalizable, so it may support other frameworks in the future (for example: marshmallow, attrs). Pydantic is what inspired me to create it, but the name may undersell that generalizability.
Some other names I've considered at some point:
@glipstein let me know if you have any ideas or opinions about some of these names.
GitHub Actions workflow tests #222 failed.
Event: push
Branch: main
Commit: f57fb4eefb8e1f80476342f7de8fd2b1f84b9d5b
Created by jayqi/failed-build-issue-action
Python 3.6 has been past EOL for several months now, so it probably makes sense to drop support.
Also should add Python 3.10 to test matrix.
Pandera lets you create schemas and validations for pandas dataframes. Would it be possible for erdantic to support Pandera schemas? They can be exported to YAML, so perhaps that's an accessible way to read them?
Workflow failed: tests #206
I believe that the current way of defining the modality is not fully correct. If the cardinality is many, the modality becomes zero, which means you can never get a one-to-many relationship. I propose that the modality be determined only by whether the field is nullable, without also checking whether the cardinality is many.
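The difference between the rule as described in this issue and the proposed rule can be sketched as follows. The enum and function names are made up for illustration and are not taken from erdantic's source; the "current" function only encodes the behavior as the issue describes it.

```python
from enum import Enum


class Modality(Enum):
    ZERO = "zero"  # the related entity may be absent
    ONE = "one"    # at least one related entity is required


def modality_as_described(is_many, is_nullable):
    """Behavior described in the issue: 'many' forces the modality to zero."""
    return Modality.ZERO if (is_many or is_nullable) else Modality.ONE


def modality_proposed(is_many, is_nullable):
    """Proposed rule: only nullability decides the modality."""
    return Modality.ZERO if is_nullable else Modality.ONE


# A required List[X] field: many, not nullable.
described = modality_as_described(is_many=True, is_nullable=False)
proposed = modality_proposed(is_many=True, is_nullable=False)
```

For a required list field, the described rule yields zero-to-many while the proposed rule yields one-to-many; for nullable fields both rules agree.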
from typing import Any
from erdantic.typing import repr_type
repr_type(Any)
ends up with:
TypeError Traceback (most recent call last)
<ipython-input-14-a7b820b2367e> in <module>
----> 1 repr_type(Any)
~/repos/erdantic/erdantic/typing.py in repr_type(tp)
82 if tp is Ellipsis:
83 return "..."
---> 84 if issubclass(tp, Enum):
85 return repr_enum(tp)
86 return tp.__name__
TypeError: issubclass() arg 1 must be a class
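The traceback shows issubclass being called on an object that is not a class. A common guard is to check isinstance(tp, type) first and fall back to repr for everything else. The function below is a simplified stand-in written for this example, not the actual erdantic.typing.repr_type, and it leaves out the enum-specific branch.

```python
from typing import Any


def safe_repr_type(tp):
    """Simplified, hypothetical variant of a type-repr helper with a class guard."""
    if tp is Ellipsis:
        return "..."
    if isinstance(tp, type):
        # Only real classes are safe to pass to issubclass() or to ask for __name__.
        return tp.__name__
    # Special forms such as typing.Any (on older Pythons) are not classes;
    # fall back to their repr and drop the module prefix.
    return repr(tp).replace("typing.", "")


any_repr = safe_repr_type(Any)
int_repr = safe_repr_type(int)
ellipsis_repr = safe_repr_type(...)
```

Whether typing.Any is a class depends on the Python version, so the guard covers both cases.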
Right now we don't fully check rendered content on diagrams, but we should.
We should do this in a parameterized way to reduce the amount of testing code to be maintained. We can use pytest's fixtures to load the example classes and/or created diagram objects. Addressed in #21.
#21 addressed creating the static outputs and has a test that checks them against the static DOT files. We still need tests that check the PNG and SVG formats.
Workflow failed: tests #127
Sometimes data classes have docstrings, for the overall class or for individual attributes. It may be useful to be able to show those docstrings.
How the visual design will work is an open question. The tables are currently compact and easy to read, so it may require some creativity to add docstrings in a way that doesn't detract too much from that.
When I stumbled on this project, what I was really looking for was a way to generate an ERD in PlantUML. I've implemented that for personal use, but I'm happy to contribute the work back if it would be considered for inclusion. The changes are quite small.
This was a remarkably easy code base to grok and dig into, so kudos on a useful project!
The workflow release.yml references the action nwtgck/actions-netlify at v1.1. However, this reference is missing the commit 223b5b5981680adfe8ec4c9f471620cbbdbfadec, which may contain a fix for a vulnerability.
The vulnerability fix missing from this version of the action could be related to:
(1) a CVE fix
(2) an upgrade of a vulnerable dependency
(3) a fix for a secret leak, among others.
Please consider updating the reference to the action.
Hi! Pydantic V2 just dropped, including a number of breaking changes. Running erdantic, I see the following:
$ erdantic
Traceback (most recent call last):
File ".../src/.venv/bin/erdantic", line 5, in <module>
from erdantic.cli import app
File ".../src/.venv/lib/python3.11/site-packages/erdantic/__init__.py", line 2, in <module>
import erdantic.pydantic # noqa: F401
^^^^^^^^^^^^^^^^^^^^^^^^
File ".../src/.venv/lib/python3.11/site-packages/erdantic/pydantic.py", line 11, in <module>
class PydanticField(Field[pydantic.fields.ModelField]):
^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'pydantic.fields' has no attribute 'ModelField'
Right now it's kind of overkill to set up optional dependencies when we're only supporting dataclasses and Pydantic. However, if we add more frameworks, it may make sense to turn them into optional dependencies.
It hopefully shouldn't have much impact on UX, since a user would have to have whatever library installed to define their models before they can use erdantic anyways.
Workflow failed: tests #187
Dataclasses resolves Optional[X] into Union[X, None]. It would look more compact and readable to display that as Optional.
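Detecting the Union[X, None] shape and rendering it as Optional[X] can be done with typing.get_origin and typing.get_args (available from Python 3.8). The function name repr_with_optional is hypothetical and the handling of the inner type is deliberately minimal; this is a sketch of the idea rather than erdantic's implementation.

```python
import typing


def repr_with_optional(tp):
    """Render Union[X, None] as 'Optional[X]'; other types fall back to their name or repr."""
    origin = typing.get_origin(tp)
    args = typing.get_args(tp)
    none_type = type(None)
    if origin is typing.Union and len(args) == 2 and none_type in args:
        # Pick whichever of the two arguments is not NoneType.
        inner = args[0] if args[1] is none_type else args[1]
        return f"Optional[{getattr(inner, '__name__', repr(inner))}]"
    return getattr(tp, "__name__", repr(tp))


union_repr = repr_with_optional(typing.Union[int, None])
optional_repr = repr_with_optional(typing.Optional[str])
plain_repr = repr_with_optional(int)
```

Unions with more than two members, or without None, are left to the fallback branch.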
There are likely many version incompatibility problems because the typing module has had a lot of churn between Python 3.6 and 3.9. The current code is only known to work with version 3.8. Will probably need some backports or conditional workarounds.
Some likely culprits:
- typing._GenericAlias, which mypy doesn't like. This might not exist at all before 3.8. In 3.9 it was formally moved to and exposed as types.GenericAlias.
- typing.get_args and typing.get_origin were added in Python 3.8.
- Built-in collection types (list, dict) support [] as of Python 3.9. Need to make sure everything still works.
First of all, thanks for creating erdantic in the first place! I really like how it's possible to generate ERDs from pydantic models and output them via SVG.
I just really wish it were possible to provide custom labels for the resulting graphs, or at least to remove the erdantic watermark "Created by erdantic v0.5.0 <https://github.com/drivendataorg/erdantic>".
From my personal view, this is a drawback as it distracts the viewer from the main content. Moreover, it feels as if the resulting diagram was generated with shareware or a free license of proprietary software that requires self-credits. Actually, the total opposite is the case because you've chosen the MIT license (thanks for that!). I strongly believe that your package does not require the watermark and would enjoy even wider usage without it.
Currently, I hesitate to include it in autodoc_pydantic without any customization. @yves-renier has created a great PR that leverages erdantic to improve the auto-documentation of pydantic models.
I would be willing to provide a PR upstream adding this feature if you agree.
Workflow failed: tests #183
Workflow failed: tests #211
I was playing around with erdantic, but I could not get satisfactory results that would give me a proper ERD. I've built some helper scripts with which I can create an ERD from a Patito model (pydantic for Polars dataframes).
The issue is that erdantic assumes the cardinality based on the field type. A string or int becomes a 1:1 or 0:1 relationship, and only a List[str] gets the many relationship. In many cases this is wrong: a table can have multiple records that match a foreign key even if the key's type is just string or int. It would be very useful in some cases to provide the cardinality myself.
I will open a PR for this and try to change the behaviour by allowing additional fields in the Edge class.
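The requested behavior amounts to a type-based default that the user can override per relationship. The sketch below illustrates that idea only; the Cardinality enum, infer_cardinality, and the override parameter are hypothetical names and do not describe erdantic's Edge class.

```python
import typing
from enum import Enum
from typing import List


class Cardinality(Enum):
    ONE = "one"
    MANY = "many"


def infer_cardinality(tp):
    """Type-based guess: collection annotations mean 'many', anything else 'one'."""
    origin = typing.get_origin(tp)
    return Cardinality.MANY if origin in (list, set, frozenset, tuple) else Cardinality.ONE


def edge_cardinality(tp, override=None):
    """Use the user-supplied cardinality when given, else fall back to the type-based guess."""
    return override if override is not None else infer_cardinality(tp)


inferred_scalar = edge_cardinality(str)
inferred_list = edge_cardinality(List[str])
# A plain string foreign key that the user knows can match many records.
overridden = edge_cardinality(str, override=Cardinality.MANY)
```

Keeping the type-based inference as the default means existing diagrams would not change unless an override is supplied.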
Workflow failed: tests #208