beanieodm / beanie
Asynchronous Python ODM for MongoDB
Home Page: http://beanie-odm.dev/
License: Apache License 2.0
Should we add pytest to pre-commit? It would slow down commits but you would probably want to run it anyway.
Are there any plans to implement write concern support (https://pymongo.readthedocs.io/en/3.12.0/api/pymongo/write_concern.html)?
It seems fairly straightforward to do, though I am unsure how you would want to handle the API for it. Happy to mock something up.
Hey Roman,
I tried to use the max() aggregation, but it raises an error when no documents are found, because it tries to access the 0th element of the found documents.
Is this intended? Or is it supposed to return a default value like 0?
Cheers.
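Until the intended behaviour is settled, a caller-side guard is one option. The sketch below is plain Python and does not call Beanie at all; first_value_or_default is a hypothetical helper name, and the list-of-dicts input is an assumption about what a grouped aggregation returns.

```python
from typing import Any, List, Mapping, Optional


def first_value_or_default(
    results: List[Mapping[str, Any]],
    key: str,
    default: Optional[Any] = None,
) -> Any:
    """Return results[0][key], or `default` when nothing matched."""
    if not results:
        # Indexing results[0] here is what would raise IndexError.
        return default
    return results[0].get(key, default)


# Assumed shape: a grouped aggregation yields zero or one document.
empty_result: List[Mapping[str, Any]] = []
one_result = [{"_id": None, "max": 42.5}]

print(first_value_or_default(empty_result, "max"))
print(first_value_or_default(one_result, "max"))
```

Returning None rather than 0 keeps "no documents" distinguishable from a real maximum of 0; a caller that prefers 0 can pass default=0.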
Hello Roman, I have a question about running migrations: where do I run "beanie new-migration -n migration_name -p relative/path/to/migrations/directory/" to perform a migration?
class Product(Document):
    price: float

async def main():
    ...
    async for result in Product.find():
        ...
When iterating on the results of a find operation, the code works but Pylint (2.11.1) complains with the following error:
main.py:9:24: E1133: Non-iterable value Product.find() is used in an iterating context (not-an-iterable)
When a new collection is created with a field used as a custom index, and that field is declared through pydantic's Field with an alias, I expect the index in MongoDB to be created on the alias. Instead, it is created on the field name, which can break the application logic.
import asyncio
from beanie import Document, init_beanie
from motor.motor_asyncio import AsyncIOMotorClient
from pydantic import Field
from beanie.odm.fields import Indexed

class Program(Document):
    class Collection:
        name = "test-program"

    program_id: Indexed(str, unique=True) = Field(..., alias="programId")

async def test():
    new_db_client = AsyncIOMotorClient("mongodb://localhost:27017")
    new_db = new_db_client["testdb"]
    await init_beanie(database=new_db, document_models=[Program])
    program1 = Program(program_id='test-program-1')
    await program1.insert()  # OK
    program2 = Program(program_id='test-program-2')
    await program2.insert()  # Error: E11000 duplicate key error collection: testdb.test-program index: program_id_1 dup key: { program_id: null }, full error: {'index': 0, 'code': 11000, 'keyPattern': {'program_id': 1}, 'keyValue': {'program_id': None}, 'errmsg': 'E11000 duplicate key error collection: testdb.test-program index: program_id_1 dup key: { program_id: null }'}

asyncio.run(test())
The expected result is that both documents are stored in the collection.
By contrast, when the alias attribute is removed from Field, it works well, because the document is stored with the attribute program_id.
Add lib source to the Pyright checks and fix all the issues.
The current result annotation is BaseModel, which is not reflecting the real use-cases. It should be Generic for the projection model or for the original document class.
Current search criteria typing is Union[Mapping[str, Any], Any]. Any is too wide for this. It must be bool or a subtype of bool.
First of all, thank you for creating this library. I just started using it with FastAPI and it seems to be a good match.
However, I noticed that when a Document is used as response_model in FastAPI, a revision_id field is returned.
Like in the example below from the beanie_fastapi_demo with the beanie version set to 1.7.0.
This code (I only included the relevant parts):
class Note(Document):
    title: str
    text: Optional[str]
    tag_list: List[Tag] = []

@notes_router.post("/notes/", response_model=Note)
async def create_note(note: Note):
    await note.create()
    return note
When called with
curl -X 'POST' \
  'http://127.0.0.1:10001/v1/notes/' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "title": "my test",
    "text": "with some text",
    "tag_list": []
  }'
Will return:
{
    "_id": "617ebbc46c9ce4b67d333f4f",
    "revision_id": null,
    "title": "my test",
    "text": "with some text",
    "tag_list": []
}
Looking through the code, and using git blame, I think revision_id was introduced in 1.6.0. The documentation on this feature is sparse, so I'm not really sure what to do with this field. In any case, I would like to hide the field in the serialized output and schema, as I don't want it to be part of the API schema. So far, I have not found a way to do this other than creating a specific (pydantic) model like so:
class NoteResp(BaseModel):
    id: PydanticObjectId
    title: str
    text: Optional[str]
    tag_list: List[Tag] = []

class Note(Document, NoteResp):
    pass

@notes_router.post("/notes/", response_model=NoteResp)
async def create_note(note: Note):
    await note.create()
    return NoteResp(**note.dict())
However, this really feels like a workaround. Maybe there is a way to 'disable' the revision_id field that I haven't found yet?
First and foremost thank you for creating such a useful tool! It's exactly what I've been looking for.
Now for context: I was looking to use a motor method, count_documents(), that was not exposed at the Document level. I noticed that the documentation suggests using Document.get_motor_collection().count_documents(). This indeed worked, but it's not very clean imo. I could also explicitly redefine in my User class any methods I wanted to use from motor that weren't defined in Document, but this too felt like extra code I shouldn't have to write. So instead I created this metaclass to call them dynamically.
What are your thoughts about this? Is there a better way? Are there any gotchas I should be concerned with? If you believe this change would actually be helpful to others would you like me to submit a PR?
from beanie import Document
from motor.motor_asyncio import AsyncIOMotorCollection  # Note: I had to use AsyncIOMotorCollection because I ran into recursive issues due to other areas in the codebase where you call "getattr".
from pydantic.main import ModelMetaclass  # Note: Pydantic's BaseModel has a custom metaclass; therefore to use a metaclass of our own we have to extend theirs.

class DocumentMetaClass(ModelMetaclass):
    def __getattr__(cls, attr):
        if callable(getattr(AsyncIOMotorCollection, attr, None)):  # Note: I only want to execute this code if the attr that's missing exists in motor.
            motor_collection = cls.get_motor_collection()
            motor_func = classmethod(getattr(motor_collection, attr))
            setattr(cls, attr, motor_func)
            return getattr(cls, attr)
        # Anything motor does not define is a genuinely missing attribute.
        raise AttributeError(attr)

class User(Document, metaclass=DocumentMetaClass):  # Note: This assignment of metaclass could be done at the Document level if it's helpful to others.
    username: str
    email: str

# skipping async code...etc
await User.count_documents({'username': user.username}, limit=1)
MyPy (rightfully) complains when you override the id field's type in a subclass. I'm not sure if this is fixable, but maybe we could change the default type so that every override looks like a subtype?
Many of the doc strings have hard-coded links to the documentation. When the documentation changes these point to the wrong location.
Implement upsert method.
MongoDB doc: https://docs.mongodb.com/manual/reference/method/db.collection.update/#syntax
Usage examples:
class Product(Document):
    name: str
    description: str
    price: float

await Product.find_one(Product.name == "Test").upsert(Set({Product.price: 100}))  # Will raise error about description field

await Product.find_one(
    Product.name == "Test"
).upsert(
    Set({Product.price: 100})
).on_insert(
    {Product.description: "some description"}
)
Or come up with a better name
I don't know if it is my setup, but migrations/test_free_fall.py fails on my machine.
Running MongoDB v4.4.4 on Arch Linux, Python 3.9.5.
The state controller is needed to add the ability to insert only changed fields to the database.
Users should be able to choose if it is needed or not, as it will use more memory.
It should not duplicate data without reason. Only if a field was changed, it should create a dump.
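As a rough illustration of the idea, the sketch below keeps a saved snapshot and reports only the fields that differ from it. It is plain Python over dictionaries, not Beanie code; StateTracker and its methods are hypothetical names, and it takes the simple route of copying the whole state on save rather than dumping lazily per changed field.

```python
import copy
from typing import Any, Dict


class StateTracker:
    """Hypothetical helper: remembers a snapshot and reports changed fields."""

    def __init__(self, enabled: bool = True) -> None:
        # Tracking is optional because the snapshot costs extra memory.
        self.enabled = enabled
        self._saved: Dict[str, Any] = {}

    def save(self, current: Dict[str, Any]) -> None:
        """Remember the current state as the baseline for later comparison."""
        if self.enabled:
            self._saved = copy.deepcopy(current)

    def changes(self, current: Dict[str, Any]) -> Dict[str, Any]:
        """Return only the fields whose values differ from the baseline."""
        if not self.enabled:
            # Without a baseline, every field has to be written.
            return dict(current)
        return {
            key: value
            for key, value in current.items()
            if key not in self._saved or self._saved[key] != value
        }


tracker = StateTracker()
product = {"name": "Test", "price": 1.0}
tracker.save(product)
product["price"] = 2.0
print(tracker.changes(product))
```

The dictionary returned by changes() is the kind of payload that could be sent in a partial update instead of replacing the whole document.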
I am working through the cocktail API tutorial on https://developer.mongodb.com/article/beanie-odm-fastapi-cocktails/. Basically, everything works fine, but I get a strange typing error from Pylance in VSCode. I am not certain whether I should be worried about it.
For this function...
async def get_cocktail(cocktail_id: PydanticObjectId) -> Cocktail:
    """Helper function to look up a cocktail by id"""
    cocktail = await Cocktail.get(cocktail_id)
    if cocktail is None:
        raise HTTPException(status_code=404, detail="Cocktail not found")
    return cocktail
... I get the following warning.
From my point of view, Pylance highlights a typing problem: Cocktail.get returns a Document and not the subclass Cocktail.
So, as I am just 30 minutes into the whole Beanie adventure, I might be entirely wrong. Or does Pylance have an issue here?
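One common way to make a classmethod's return type follow the subclass is a TypeVar bound to the base class. The sketch below shows the pattern on a stand-in class; BaseDoc, its in-memory _store, and the string ids are assumptions made for illustration and are not Beanie's actual implementation.

```python
import asyncio
from typing import Dict, Optional, Type, TypeVar

# Bound TypeVar: "some subclass of BaseDoc", resolved per call site.
DocT = TypeVar("DocT", bound="BaseDoc")


class BaseDoc:
    # Stand-in for a database collection, keyed by id.
    _store: Dict[str, "BaseDoc"] = {}

    def __init__(self, doc_id: str) -> None:
        self.doc_id = doc_id

    @classmethod
    async def get(cls: Type[DocT], doc_id: str) -> Optional[DocT]:
        """Return the stored document typed as the calling class, or None."""
        found = cls._store.get(doc_id)
        return found if isinstance(found, cls) else None


class Cocktail(BaseDoc):
    pass


BaseDoc._store["abc"] = Cocktail("abc")

# A type checker infers Optional[Cocktail] here, not Optional[BaseDoc].
cocktail = asyncio.run(Cocktail.get("abc"))
missing = asyncio.run(Cocktail.get("nope"))
print(type(cocktail).__name__, missing)
```

With the annotation written this way, a helper declared as returning Cocktail no longer triggers the warning once the None case has been handled.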
Hello,
How can I create a time series collection?
https://docs.mongodb.com/manual/core/timeseries-collections/#create-a-time-series-collection
The mongo query as an example:
db.createCollection(
    "weather24h",
    {
        timeseries: {
            timeField: "timestamp",
            metaField: "metadata",
            granularity: "hours"
        },
        expireAfterSeconds: 86400
    }
)
Hi,
I am new to Python web development, so I am sorry if I missed something. From beanie version 4.0.0 onwards, I have been having an issue with the change from update to update_dict in __modify_schema__ for PydanticObjectId:
(condaenv) user@ubuntu-Ubuntu:~/Documents/projects/project/backend/project$ uvicorn main:app --reload
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: Started reloader process [58319] using statreload
/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/jose/backends/cryptography_backend.py:18: CryptographyDeprecationWarning: int_from_bytes is deprecated, use int.from_bytes instead
  from cryptography.utils import int_from_bytes, int_to_bytes
INFO: Started server process [58321]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: 127.0.0.1:50488 - "GET / HTTP/1.1" 404 Not Found
INFO: 127.0.0.1:50488 - "GET /favicon.ico HTTP/1.1" 404 Not Found
INFO: 127.0.0.1:50492 - "GET /docs HTTP/1.1" 200 OK
INFO: 127.0.0.1:50492 - "GET /openapi.json HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
  File "/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 396, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
    return await self.app(scope, receive, send)
  File "/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/fastapi/applications.py", line 199, in __call__
    await super().__call__(scope, receive, send)
  File "/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/starlette/applications.py", line 111, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/starlette/middleware/errors.py", line 181, in __call__
    raise exc from None
  File "/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/starlette/middleware/errors.py", line 159, in __call__
    await self.app(scope, receive, _send)
  File "/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/starlette/exceptions.py", line 82, in __call__
    raise exc from None
  File "/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/starlette/exceptions.py", line 71, in __call__
    await self.app(scope, receive, sender)
  File "/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/starlette/routing.py", line 566, in __call__
    await route.handle(scope, receive, send)
  File "/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/starlette/routing.py", line 227, in handle
    await self.app(scope, receive, send)
  File "/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/starlette/routing.py", line 41, in app
    response = await func(request)
  File "/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/fastapi/applications.py", line 152, in openapi
    return JSONResponse(self.openapi())
  File "/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/fastapi/applications.py", line 130, in openapi
    self.openapi_schema = get_openapi(
  File "/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/fastapi/openapi/utils.py", line 354, in get_openapi
    definitions = get_model_definitions(
  File "/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/fastapi/utils.py", line 24, in get_model_definitions
    m_schema, m_definitions, m_nested_models = model_process_schema(
  File "pydantic/schema.py", line 548, in pydantic.schema.model_process_schema
  File "pydantic/schema.py", line 589, in pydantic.schema.model_type_schema
  File "pydantic/schema.py", line 236, in pydantic.schema.field_schema
  File "pydantic/schema.py", line 303, in pydantic.schema.get_field_schema_validations
  File "/home/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/beanie/odm/fields.py", line 43, in __modify_schema__
    field_schema.update_dict(
AttributeError: 'dict' object has no attribute 'update_dict'
I looked through the code and saw that there is a function update_dict in general_utils.py, however this is a function and not a method. Is there some library that I am missing? I used pip to install the package previously, and poetry to install as well, but encountered the same error.
Thanks for your help.
Currently, there is no syntactic sugar for deleting a document and retrieving the deletion result, to know whether the doc existed or not.
result = await MyDocument.find_one({...}).delete()
This does not work, as result is always None.
The only way to do this for now is to use the Motor API:
deletion: DeleteResult = await MyDocument.get_motor_collection().delete_one({...})
# deletion.deleted_count can be used to know whether a document was deleted or not
Something like this doesn't work:
class Task(Document):
    id: str = Field(..., alias="_id")
    updated: datetime = Field(..., default_factory=datetime.utcnow)
It would be cool to be able to set _id manually.
The Indexed type annotation raises a warning from Pylance (and other type checkers), because the result of Indexed() is not annotated as any particular type.
class Address(Document):
    street_number: int
    street_name: str
    city: str
    country: str
    longitude: str
    latitude: str
    postal_code: Indexed(str)  # We sort on postal codes, so index them

    class Collection:
        name = "addresses"
I think, looking at the code, it's a Union[T, _Indexed] where:
class _Indexed:
    _indexed: Tuple[Any, Dict]

T = TypeVar("T")

def Indexed(typ: T, index_type=ASCENDING, **kwargs) -> Union[T, _Indexed]:
    ...
Hey! So first off: I'm digging Beanie a lot so far, and have been using it for a project. The entities that I'm using for this project, however, do not rely on the default _id unique id field of MongoDB and instead rely on a custom UUID for the API interface. So I was wondering what you thought about extending the Document.get method to use _id by default, but also be able to perform find_one under the hood for alternate fields.
I'm able to work around it easily - I just use Entity.find_one( {"uuid": uuid} ) instead of Entity.get(uuid), where it presumes I mean to reference _id. You may think that it's a pretty niche use-case, and I'd probably agree with you, but overall I was still curious what you thought about making that aspect of Beanie more flexible.
Thanks!
I couldn't find anything in the docs to do text search (https://docs.mongodb.com/manual/reference/operator/query/text/)
I have a document:
class Location(Document):
    name: str
    private: bool = False

    class Meta:
        table = "locations"
I want to do a search like "where name contains 'New York'"
Is this possible at the moment?
Active Record pattern allows using callback functions on specific database-related events.
It should support pre- and post- events on:
Usage:
Users must be able to mark any async or sync method of the document model with the event decorator and specify one or many event types as decorator parameters.
Example:
class Sample(Document):
    number: int

    @after_event([Insert, Replace])
    def update_cache(self):
        cache[self.id] = self.number
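To make the proposal concrete, here is one way such a decorator could be wired up, written as a standalone sketch with no database involved. Everything in it is an assumption for illustration: the marker classes Insert and Replace, the _after_events attribute, the EventedBase class and its run_after method are invented names, and only sync methods are handled.

```python
from typing import Callable, Dict, Iterable


class Insert:
    """Marker type for an insert event (hypothetical)."""


class Replace:
    """Marker type for a replace event (hypothetical)."""


def after_event(event_types: Iterable[type]) -> Callable:
    """Tag a method with the event types it should run after."""
    registered = tuple(event_types)

    def decorator(func: Callable) -> Callable:
        func._after_events = registered
        return func

    return decorator


class EventedBase:
    def run_after(self, event_type: type) -> None:
        """Call every method of this object tagged for `event_type`."""
        for name in dir(type(self)):
            attr = getattr(type(self), name, None)
            if event_type in getattr(attr, "_after_events", ()):
                attr(self)


cache: Dict[int, int] = {}


class Sample(EventedBase):
    def __init__(self, id: int, number: int) -> None:
        self.id = id
        self.number = number

    @after_event([Insert, Replace])
    def update_cache(self) -> None:
        cache[self.id] = self.number

    def insert(self) -> None:
        # A real implementation would write to the database here.
        self.run_after(Insert)


sample = Sample(id=1, number=10)
sample.insert()
print(cache)
```

Supporting async callbacks would additionally require run_after to be a coroutine that awaits any tagged method returning an awaitable.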
Details:
bar = await Product.get("608da169eb9e17281f0ab2ff") # not working,
bar = await Product.get(PydanticObjectId("608da169eb9e17281f0ab2ff")) # working.
There are two official example projects:
Both should:
The current version of the tutorial does not look good:
Each point needs improvement.
Links:
So it works great. Seriously love this lib. Is there any way to get the fields to be visible when typing in PyCharm? Does it work in any other IDEs? Everything looks fine when I pop it under the debugger.
Is there a way to automate relationships between documents in beanie or do you have to do it manually? If manually, are we able to access the ObjectId parameter of the document or should we just ensure that we have an index on a unique field? Also, can we set the unique field of an index to assist with this? I didn't see anything in the documentation about support of the unique flag of indexes. This may actually be multiple issues and let me know and I can split it up if you'd prefer.
Nice project, looks promising. Is there (or will there be) a way to specify indexes? Something like what MongoEngine is doing would totally suffice if you want inspiration :)
https://docs.mongoengine.org/guide/defining-documents.html?highlight=index#indexes
Hello Roman !
I am using projections to limit the amount of data returned from a collection, but when I do, the _id field is not projected.
Is this normal behavior?
from beanie import Document, PydanticObjectId
from pydantic import BaseModel

class EpicShortView(BaseModel):
    _id: PydanticObjectId
    external_id: str = None
    business: str = None

class Epic(Document):
    class Collection:
        name = 'epics'

    external_id: str = None
    business: str = None
    epic_name: str = None
    epic_owner: str = None
    epic_idea: str = None
    epic_description: str = None
Epic.find(Epic.business == business).project(EpicShortView).to_list() does not return _id.
Thank you
class OutTestModel(BaseModel):
    id: Optional[PydanticObjectId]
    name: str

class TestModel(OutTestModel, Document):
    id: Optional[PydanticObjectId]
    secret_message: str

@Router.get("/failtest", response_model=OutTestModel)
async def wontwork():
    mymodel = TestModel(name="jeff", secret_message="this is gonna break")
    await mymodel.save()
    return mymodel
With the above code, OutTestModel will not get the id value from TestModel.
I believe the cause is
https://github.com/roman-right/beanie/blob/1641dd81be64dd1dc11af667deb2e50feb2de2be/beanie/odm/documents.py#L922-L927
How hard would it be to optionally use _id only for the to-Mongo and from-Mongo conversions? The only other workaround is to not use an alias for the response model, which opens a whole other can of worms.
https://roman-right.github.io/beanie/document/ is loading but not loading properly and most of the links are 404ing, e.g.:
There is some ORM functionality, listed below, that I don't see in beanie.
Are these features already present? If so, can you point me to the documentation?
Custom MongoDB-specific fields:
If the field type is a Document subclass, then only the id should be stored there, and the whole subdocument must be stored in a separate collection.
Example:
class Window(Document):
    width: int
    height: int

class House(Document):
    address: str
    windows: List[Window]
    favorite_window: Window
Problems:
House.find(House.favorite_window.width == 1)
house.set({House.favorite_window.width: 1})
Hi,
beanie initialization takes a long time, often several minutes:
await init_beanie(database=client.db_name, document_models=[Product])
While debugging, I found that the delay comes from await asyncio.gather(*collection_inits) in the file beanie/odm/utils/general.py,
which is the last line in the init_beanie source code:
async def init_beanie(
    database: AsyncIOMotorDatabase,
    document_models: List[Union[Type["DocType"], str]],
    allow_index_dropping: bool = True,
):
    """
    Beanie initialization

    :param database: AsyncIOMotorDatabase - motor database instance
    :param document_models: List[Union[Type[DocType], str]] - model classes
    or strings with dot separated paths
    :param allow_index_dropping: bool - if index dropping is allowed.
    Default True
    :return: None
    """
    collection_inits = []
    for model in document_models:
        if isinstance(model, str):
            model = get_model(model)
        collection_inits.append(
            model.init_collection(
                database, allow_index_dropping=allow_index_dropping
            )
        )

    await asyncio.gather(*collection_inits)
Please advise on steps to resolve the issue. At times init_beanie takes 3+ minutes. I am loading around 9 model files in beanie.
Is this affected by other asynchronous processes being initialized? I have Kafka initialization and other async processes being executed as part of initialization as well.
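Because gather only returns once every awaitable has finished, one way to narrow this down is to time each awaitable individually and see which one dominates. The sketch below does that with stand-in coroutines; timed and fake_init are hypothetical names, and the sleeps merely simulate work. In a real application each fake_init(...) call would be replaced by the model.init_collection(database, allow_index_dropping=...) call shown in the source above.

```python
import asyncio
import time
from typing import Awaitable, Dict, Tuple


async def timed(label: str, awaitable: Awaitable[object]) -> Tuple[str, float]:
    """Await `awaitable` and report how long it took under `label`."""
    start = time.perf_counter()
    await awaitable
    return label, time.perf_counter() - start


async def fake_init(delay: float) -> None:
    # Stand-in for a per-model collection init; the delay simulates I/O.
    await asyncio.sleep(delay)


async def main() -> Dict[str, float]:
    timings = await asyncio.gather(
        timed("Product", fake_init(0.01)),
        timed("Order", fake_init(0.1)),
    )
    return dict(timings)


durations = asyncio.run(main())
slowest = max(durations, key=durations.get)
print(f"slowest init: {slowest} ({durations[slowest]:.3f}s)")
```

If all models show similarly long times, the event loop itself is probably being starved by other startup work; if one model stands out, its indexes are a likely place to look.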
Hello,
What wonderful work you've done! I'm starting to use Beanie with FastAPI, and I couldn't find anything in the docs or in the code that allows me to use db.collection.distinct from the MongoDB docs.
I'm using the following document:
class Customer(BaseModel):
    name: str
    email: Optional[str]
    # some fields

class Invoice(Document):
    customer: Customer
    # some fields
Coming from pymongo, I use collection.distinct('customer') to get the list of distinct customers from the collection 'invoices'. Is there a way to do this (or something equivalent using aggregate and project) with Beanie?
First of all, thanks for creating this project, it has been fantastic to work with. I have a feature request, which I am happy to implement, I would like the find, find_many, find_all to project our model so that we get back only the data we are interested in.
This helps reduce database load.
An example of where we might use this is when we have a database model that holds a whole bunch of data, but we are loading a summary view which only requires 2 or 3 of these fields.
Example:
class AnalyticsLabellingTask(Document):
    reported_by: str = Field(...)
    device_id: str = Field(...)
    priming_start_date: datetime = Field(...)
    event_start_date: datetime = Field(...)
    end_date: datetime = Field(...)
    reviewers: List[str] = Field(...)
    comments: List[Comments] = Field(default_factory=list)
    reviewed_events: List[ReviewedEvents] = Field(default_factory=list)
    water_data: List[TimeSeriesData] = Field(...)
    archived: bool = Field(...)

    class Collection:
        name = "AnalyticsLabellingTasks"

class AnalyticsLabellingTasksSummary(Document):
    device_id: str = Field(...)
    event_start_date: datetime = Field(...)
    end_date: datetime = Field(...)
    reported_by: str = Field(...)

    class Collection:
        name = "AnalyticsLabellingTasks"
I am happy to implement this, I would do it by updating the find calls:
# get the field list of interest, this could be generated
# at the __init__ stage and be stored in memory to prevent
# having to build this many times
fields = test.__fields__
projection = {}
for name, field in fields.items():
    if field.alias:
        projection[field.alias] = 1
    else:
        projection[name] = 1
cursor = cls.get_motor_collection().find(filter_query, projection, **kwargs)
Let me know your thoughts and I can put together a PR next week
MyPy (rightfully) complains that Indexed(type) should be Indexed[].
This leaves two questions:
I need to be able to sort documents by multiple fields in a case-insensitive way. In MongoDB I could do this with a collation of strength 1. I have not found a way to use collations with beanie.
When executing an expression of a nested object field of a Document class, the value of the aliases is not used to create the expression string for the mongo query.
E.g.
class HeaderObject(BaseModel):
    header_id: str = Field(alias="headerId")

class Header(Document):
    class Collection:
        name = "test-header"

    header: HeaderObject

print(Header.header.header_id == 1)  # actual => {'header.header_id': 1}; expected => {'header.headerId': 1}
A possible solution is for beanie, during class init, to check the type of each field and, if it is a nested model, descend into it to pick up the alias.
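The recursive lookup can be illustrated without pydantic. The sketch below uses standard-library dataclasses as a stand-in for the models and stores the alias in field metadata; resolve_path and the "alias" metadata key are assumptions for illustration, not part of beanie or pydantic.

```python
from dataclasses import dataclass, field, fields, is_dataclass


@dataclass
class HeaderObject:
    # The alias lives in metadata, mimicking Field(alias="headerId").
    header_id: str = field(default="", metadata={"alias": "headerId"})


@dataclass
class Header:
    header: HeaderObject = field(default_factory=HeaderObject)


def resolve_path(model: type, dotted: str) -> str:
    """Translate a dotted attribute path into its aliased query path."""
    parts = []
    current = model
    for name in dotted.split("."):
        if not is_dataclass(current):
            raise ValueError(f"cannot descend into {current!r} for {name!r}")
        match = next((f for f in fields(current) if f.name == name), None)
        if match is None:
            raise AttributeError(f"{current.__name__} has no field {name!r}")
        # Use the alias when one is declared, otherwise the field name.
        parts.append(match.metadata.get("alias", name))
        # Continue the walk inside the nested model's type.
        current = match.type
    return ".".join(parts)


print(resolve_path(Header, "header.header_id"))
```

The same walk over a pydantic model would read the alias from the model's field definitions instead of dataclass metadata.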
Consider the following test:
async def test_find_changed(preset_documents):
    query1 = Sample.find_many(Sample.integer > 1)
    query2 = query1.find_many(Sample.nested.optional == None)
    assert await query1.to_list() != await query2.to_list()
This fails as the query2 = ... line also changes the first query. Should this instead return a new object with the parameters set differently?
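For comparison, this is what the "return a new object" behaviour looks like in a minimal builder. The sketch is plain Python and unrelated to beanie's internals; the Query class, its criteria tuple, and to_filter are hypothetical names used only to show the pattern.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class Query:
    # Immutable tuple of filter documents collected so far.
    criteria: Tuple[Dict[str, Any], ...] = ()

    def find_many(self, *conditions: Dict[str, Any]) -> "Query":
        """Return a NEW query with the extra conditions; self is untouched."""
        return Query(self.criteria + conditions)

    def to_filter(self) -> Dict[str, Any]:
        """Combine the collected criteria into a single filter document."""
        if not self.criteria:
            return {}
        if len(self.criteria) == 1:
            return dict(self.criteria[0])
        return {"$and": [dict(c) for c in self.criteria]}


query1 = Query().find_many({"integer": {"$gt": 1}})
query2 = query1.find_many({"nested.optional": None})

print(query1.to_filter())
print(query2.to_filter())
```

The trade-off is an extra object per chained call, in exchange for being able to reuse a base query safely.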
First of all, I have to say this library is concise and beautiful, and I'm happy I stumbled upon it!
I wondered if Document inheritance will be supported in the future, such as in mongoengine.
The problem arises when I have documents with mostly similar fields and some other varying fields, and in the corresponding algorithmic code in the backend, I got a representation of the same data with object inheritance.
When I subclass a Document class (and set the same collection name), no error is thrown, but when I try to use the find_all() method on a base Document class, all of the child documents are projected onto the base class representation. I'd like to have a find function that would return all of the documents in their corresponding children Document classes.
My current workaround is using a single class definition along with a Union[] field that unifies different classes that represent "extra fields" on the single document.