mongoengine / mongoengine

A Python Object-Document-Mapper for working with MongoDB

Home Page: http://mongoengine.org

License: MIT License

Python 100.00%

python mongodb mongodb-orm mongo pymongo orm odm hacktoberfest

mongoengine's Introduction

MongoEngine

Info: MongoEngine is an ORM-like layer on top of PyMongo.
Repository: https://github.com/MongoEngine/mongoengine
Author: Harry Marr (http://github.com/hmarr)
Maintainer: Stefan Wójcik (http://github.com/wojcikstefan)

About

MongoEngine is a Python Object-Document Mapper for working with MongoDB. Documentation is available at https://mongoengine-odm.readthedocs.io - there is currently a tutorial, a user guide, and an API reference.

Supported MongoDB Versions

MongoEngine is currently tested against MongoDB v3.6, v4.0, v4.4, v5.0, v6.0 and v7.0. Future versions should be supported as well, but aren't actively tested at the moment. Make sure to open an issue or submit a pull request if you experience any problems with a more recent MongoDB version.

Installation

We recommend the use of virtualenv and of pip. You can then use python -m pip install -U mongoengine. You may also have setuptools and thus you can use easy_install -U mongoengine. Another option is pipenv. You can then use pipenv install mongoengine to both create the virtual environment and install the package. Otherwise, you can download the source from GitHub and run python setup.py install.

Support for Python 2 was dropped with MongoEngine 0.20.0.

Dependencies

All of the dependencies can easily be installed via python -m pip. At the very least, you'll need this package to use MongoEngine:

pymongo>=3.4

If you utilize a DateTimeField, you might also use a more flexible date parser:

dateutil>=2.1.0

If you need to use an ImageField or ImageGridFsProxy:

Pillow>=2.0.0

If you need to use signals:

blinker>=1.3

Examples

Some simple examples of what MongoEngine code looks like:

Tests

To run the test suite, ensure you are running a local instance of MongoDB on the standard port and have pytest installed. Then, run pytest tests/.

To run the test suite on every supported Python and PyMongo version, you can use tox. You'll need to make sure you have each supported Python version installed in your environment and then:

# Install tox
$ python -m pip install tox
# Run the test suites
$ tox

Community

Contributing

We welcome contributions! See the Contribution guidelines

mongoengine's Issues

FloatField shouldn't be validated when not present

When using only() to limit the fields of a query result, if a FloatField is not among the arguments passed to only(), trying to save the document raises ValidationError: float value too small.

Bad performance with django pagination in certain occasion

Hello, I'm having problems when using a Django generic view with pagination. With a page size of 12 objects, many requests are made to the database when the number of resulting elements is greater than 12 (more than one page). I investigated and found that 17 queries were made to the database under these conditions. Thank you.

Update Docs regarding validation errors

FROM: https://groups.google.com/d/topic/mongoengine-users/mfQj7M0UXag/discussion

Ok, I got it. In case anyone lands in here.

I have a profile object with a homepage field which is a URLField type.

Put the following on top of your python script:

from mongoengine.base import ValidationError

Then you can do an actual check before you save and you will get an errors dictionary on the ValidationError object:

try:
    profile.validate()
except ValidationError as e:
    if e.errors.get('homepage'):
        print("INVALID_URL")

The ValidationError.errors object looks like this (it will list all fields that have errors):

e.errors
{'homepage': ValidationError(Invalid URL: htp://loco.com ("homepage"),)}

Hope this helps someone out there...

Unable to add index on 'id'

I have the following model

class Blog(Document, TimeStamped):
    meta = {
        'indexes': [
            ['categories', 'id']
        ]
    }

    title = StringField(required=True)
    description = StringField(required=True)
    categories = ListField(ReferenceField(Category))

However, it fails with InvalidQueryError: Cannot resolve field "id"

Performance

Need to improve perf of ListFields / DictFields (complex fields in general)

Currently, we iterate and convert too much in operations like to_python and to_mongo, so we should do less work and convert only when conversion is actually needed. Alternatively, we could use a _raw field that stores the raw MongoDB form of the data, so saving is cheap.
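The _raw idea can be sketched in plain Python. This is only an illustration under assumptions: LazyList and the convert callable are hypothetical names, not MongoEngine APIs. Items stay in their stored form and are converted only on first access, so writing unchanged data back needs no conversion at all.

```python
class LazyList:
    """Hypothetical container: keeps raw items and converts them on first access."""

    def __init__(self, raw, convert):
        self._raw = list(raw)      # stored ("mongo") form, kept as-is
        self._convert = convert    # per-item conversion to the Python form
        self._cache = {}           # index -> already converted item

    def __len__(self):
        return len(self._raw)

    def __getitem__(self, index):
        # Integer indexes only; convert lazily and remember the result.
        if index not in self._cache:
            self._cache[index] = self._convert(self._raw[index])
        return self._cache[index]

    def to_raw(self):
        """Return the stored form without converting anything."""
        return list(self._raw)


calls = []  # records which raw items were actually converted


def convert(item):
    calls.append(item)
    return int(item)


values = LazyList(["1", "2", "3"], convert)
first = values[0]        # converts only the first item
raw = values.to_raw()    # no conversion happens here
```

A real implementation would also have to track modified items so that to_raw() reflects changes; that part is left out of the sketch.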

BaseDynamicField move to fields?

I'd like to use BaseDynamicField as a field in my document or embedded document, so I can specify a particular field that can be polymorphic in type.

Right now, I have to explicitly do

from mongoengine.base import BaseDynamicField

Should this BaseDynamicField be moved to fields.py and exposed to users?

query speed incredibly slow in 0.6.3

I have a query like this

Entry.objects(types__in=types,public=True).order_by(order).exclude('resources','comments','creater').skip(offset).limit(limit)

This code runs fast in 0.5.2, but after upgrading to 0.6.3 it becomes incredibly slow. I have about 400K entries, and the query takes about 200 seconds to finish when offset=30 and limit=30.

It seems MongoEngine is unable to map this query onto an index, so MongoDB has to do a full scan.

a really strange thing happening with get_or_create

Hi there. I love your engine. Usually it treats me rather well. A very funny thing seems to have happened by a freak coincidence. I am parsing some XML files. I have a Word class to record their occurrences. It looks like this:

class Word(Document):
    stem = StringField()
    count = IntField(default=1)
    forms = ListField(StringField(), default=list)
    occurs = ListField(EmbeddedDocumentField(Occurrence), default=list)

(My version of Python is 2.7.1, running on OSX Lion; this happens from an IDLE launched in a ZSH iTerm2 console, and as executed in a script launched from same.)

I ran into the issue while using a SAX parser; it kept choking on the word greenvill (which happened to be the very first word! I thought it wasn't working at all, at first). It hadn't done this to me ever before. So I pulled the class into an IDLE session, just to see if other stuff worked, or if it worked in different contexts or something, and discovered something bizarre. A quick sesh below:

>>> Word.objects.get_or_create(stem='g')
(<Word: Word object>, True)
>>> Word.objects.get_or_create(stem='gr')
(<Word: Word object>, True)
>>> Word.objects.get_or_create(stem='gre')
(<Word: Word object>, True)
>>> Word.objects.get_or_create(stem='gree')
(<Word: Word object>, True)
>>> Word.objects.get_or_create(stem='green')
(<Word: Word object>, False)
>>> Word.objects.get_or_create(stem='greenv')
(<Word: Word object>, True)
>>> Word.objects.get_or_create(stem='greenvi')
(<Word: Word object>, True)
>>> Word.objects.get_or_create(stem='greenvil')
(<Word: Word object>, True)
>>> Word.objects.get_or_create(stem='greenvill')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/mongoengine/queryset.py", line 799, in get_or_create
    doc = self.get(*q_objs, **query)
  File "/Library/Python/2.7/site-packages/mongoengine/queryset.py", line 761, in get
    result1 = self.next()
  File "/Library/Python/2.7/site-packages/mongoengine/queryset.py", line 916, in next
    return self._document._from_son(self._cursor.next())
  File "/Library/Python/2.7/site-packages/mongoengine/base.py", line 968, in _from_son
    else field.to_python(value))
  File "/Library/Python/2.7/site-packages/mongoengine/base.py", line 291, in to_python
    value_dict = dict([(key, self.field.to_python(item)) for key, item in value.items()])
  File "/Library/Python/2.7/site-packages/mongoengine/fields.py", line 414, in to_python
    return self.document_type._from_son(value)
  File "/Library/Python/2.7/site-packages/mongoengine/base.py", line 950, in _from_son
    class_name = son.get(u'_cls', cls._class_name)
AttributeError: 'unicode' object has no attribute 'get'
>>> Word.objects.get_or_create(stem='l')
Traceback (most recent call last):
# the Traceback is identical, and so omitted

So, as you can see, for whatever reason 'greenvill' and 'l' (I haven't discovered any others, or a pattern) cause the Attribute error. Why could this be? I am so confused. And even weirder: I tried it with a different class and label name, to much better effect:

>>> Leaf.objects.get_or_create(name='greenvill')
(<Leaf: Leaf object>, True)

so I added an artificial stem attribute to the leaf class, which by the way now looks like this:

class Leaf(Document):
    stem = StringField()
    name = StringField()
    kind = StringField()
    type = StringField()
    count = IntField(default=1)
    occurs = ListField(EmbeddedDocumentField(Occurrence))

and tried again:

>>> Leaf.objects.get_or_create(stem='greenvill')
(<Leaf: Leaf object>, True)

still good! What is it about the Word class? I changed the attribute name that I wanted, so that the Word class now looks like this:

class Word(Document):
    stems = StringField()
    count = IntField(default=1)
    forms = ListField(StringField(), default=list)
    occurs = ListField(EmbeddedDocumentField(Occurrence), default=list)

and it works!

>>> Word.objects.get_or_create(stems='greenvill')
(<Word: Word object>, True)

but then when I switch back to stem instead of stems and try it again, still the same error with the same trace as above.

So, you know, this isn't really a big deal. But I am mystified, and I wonder if you might have some idea about what is causing this problem. In the meantime I'll just rename my attribute.

Implement choices parameter for GenericEmbeddedDocumentField()

I would like to request that this be implemented:

values = ListField(GenericEmbeddedDocumentField(choices=('InputOption','QuizAnswerOption')))

in a Document class definition. Assume we have defined classes InputOption(EmbeddedDocument) and QuizAnswerOption(EmbeddedDocument).

Thanks,
Jake

Bi-Directional relationships + reverse_delete_rule

I've discovered a potential bug when implementing Bi-Directional relationships + reverse_delete_rule.

Here is the code that works:

from mongoengine import *
class Foo(Document):
    bar = ReferenceField('Bar')

class Bar(Document):
    foo = ReferenceField(Foo)

Here is the code that does NOT work:

from mongoengine import *
class Foo(Document):
    bar = ReferenceField('Bar', reverse_delete_rule=NULLIFY)

class Bar(Document):
    foo = ReferenceField(Foo, reverse_delete_rule=NULLIFY)

Undefined document fields should not override instance methods

Mongoengine should not set attributes and potentially override instance methods when initializing a model with fields that are not defined in the document structure. For example, the following code should either trigger an error when creating the document, or create the document silently while keeping the instance method and print hello as expected:

from mongoengine import Document

class Doc(Document):
    def method(self):
        return 'hello'

doc = Doc(method='something')
print(doc.method())

Similarly, when a document of the structure { "method": "something" } exists in the database, loading that document (e.g. using get()) should not override the instance method.
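The shadowing itself is ordinary Python behaviour and can be reproduced without MongoEngine. The sketch below uses plain classes as stand-ins (NaiveDoc, GuardedDoc and the _fields tuple are hypothetical, not part of the library): the naive constructor lets a keyword replace the method on the instance, while the guarded one refuses undeclared names that collide with a class attribute.

```python
class NaiveDoc:
    """Stand-in for a document whose constructor sets every keyword blindly."""

    def __init__(self, **values):
        for name, value in values.items():
            setattr(self, name, value)

    def method(self):
        return 'hello'


class GuardedDoc:
    """Same idea, but undeclared names may not shadow class attributes."""

    _fields = ()  # declared field names (none in this sketch)

    def __init__(self, **values):
        for name, value in values.items():
            if name not in self._fields and hasattr(type(self), name):
                raise AttributeError(
                    "%r is not a declared field and would shadow an attribute" % name
                )
            setattr(self, name, value)

    def method(self):
        return 'hello'


naive = NaiveDoc(method='something')
shadowed = naive.method          # now the string, no longer callable

try:
    GuardedDoc(method='something')
    guarded_error = None
except AttributeError as exc:
    guarded_error = exc          # construction is rejected instead
```

Raising is only one of the two behaviours the report asks for; silently storing the value somewhere other than the instance attribute would be the alternative.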

Add exception handling and rollback to get_or_create

Continue to educate people that upserts exist and are a better solution.
Handle index issues: the race condition doesn't exist where unique=True, exactly the same as in Django (indexes that enforce uniqueness will stop multiple items being added). We should enforce a safe write and catch the index error.
Code a rollback if created: add an extra query to match all items and delete all but the first ordered by object id (creation time). Those would be items created in a race; then return the results of a get with created = False.

refs hmarr#478
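The rollback step can be sketched against an in-memory stand-in. Everything here is an assumption made for illustration: the collection is a plain list of dicts, integer _id values stand in for ObjectIds (whose ordering roughly follows creation time), and rollback_duplicates is a hypothetical helper, not a MongoEngine function.

```python
def rollback_duplicates(collection, query, inserted_id):
    """Keep the oldest match, delete the rest, report whether ours survived.

    Assumes the caller has just inserted a matching document, so at least
    one match exists.
    """
    matches = sorted(
        (doc for doc in collection
         if all(doc.get(key) == value for key, value in query.items())),
        key=lambda doc: doc["_id"],
    )
    survivor = matches[0]
    for duplicate in matches[1:]:
        collection.remove(duplicate)   # items created in the race
    created = survivor["_id"] == inserted_id
    return survivor, created


collection = [
    {"_id": 1, "stem": "green"},   # written first by another process
    {"_id": 2, "stem": "green"},   # our own insert, which lost the race
    {"_id": 3, "stem": "blue"},
]
doc, created = rollback_duplicates(collection, {"stem": "green"}, inserted_id=2)
```

In this run the caller gets back the older document with created set to False, which matches the behaviour described for the losing side of the race.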

cannot import name QuerySet

Well everything is working fine except this. I tried to follow the doc example: http://mongoengine-odm.readthedocs.org/en/latest/guide/querying.html#custom-querysets

I have in my code:
from mongoengine import *

Any idea?

get_or_create does not handle required fields

Given this architecture:

class A(Document):
    foo = StringField(required=True)

query = {'foo': 'bar', 'defaults': {'foo': 'bar'}}
A.objects.get_or_create(query)

The get_or_create query throws an error because of the document validation, although the only required field is specified in the query:

mongoengine.base.ValidationError: Errors encountered validating document:
foo: Field is required ("foo")

I think the expected behavior would be:

When no document is found, it creates the document with the object attribute specified in the query (even the ones that are not specified in the default).
This should avoid validating errors when the query has the right parameters.

In the end it would be nice to be able to do:

A.objects.get_or_create(foo='bar')

The objects queryset is used for all instances

class Foo(db.Document):
    bar = db.StringField(default='bar')
    active = db.BooleanField(default=False)

    @db.queryset_manager
    def objects(doc_cls, queryset):
        return queryset(active=True)

    @db.queryset_manager
    def with_inactive(doc_cls, queryset):
        return queryset(active=False)

foo1 = Foo(active=True); foo1.save()
foo2 = Foo(active=False); foo2.save()

inactive_foo = Foo.with_inactive.first(); inactive_foo.delete()

This silently fails to delete the inactive foo despite the fact that a custom queryset manager was used. Shouldn't queryset methods be based off the queryset used for the query?

Calling repr alters the cursor

In [40]: ms = Message.objects()

In [42]: ms.count()
Out[42]: 8679

In [43]: ms
Out[43]: [<Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, <Message: ...>, '...(remaining elements truncated)...']

In [45]: ms.count()
Out[45]: 21
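One mechanism that would produce exactly this output is a __repr__ that slices the query object itself, leaving a limit behind. This is an assumption, not something the report confirms: LazyQuery and REPR_OUTPUT_SIZE below are hypothetical, and the value 20 was chosen only because 20 + 1 matches the count of 21 above.

```python
REPR_OUTPUT_SIZE = 20  # assumed display limit


class LazyQuery:
    """Hypothetical lazy query over a list; slicing narrows it in place."""

    def __init__(self, rows):
        self._rows = rows
        self._limit = None

    def __getitem__(self, key):
        # Slices only: the limit is stored on this same object, not on a copy.
        self._limit = key.stop
        return self

    def _window(self):
        return self._rows if self._limit is None else self._rows[:self._limit]

    def count(self):
        return len(self._window())

    def __repr__(self):
        preview = self[:REPR_OUTPUT_SIZE + 1]   # side effect: sets self._limit
        return repr(preview._window())


class SafeLazyQuery(LazyQuery):
    """Variant whose repr reads a preview without touching its own state."""

    def __repr__(self):
        return repr(self._window()[:REPR_OUTPUT_SIZE + 1])


ms = LazyQuery(list(range(8679)))
before = ms.count()
repr(ms)
after = ms.count()      # the limit set by repr is still in effect

safe = SafeLazyQuery(list(range(8679)))
repr(safe)
safe_after = safe.count()
```

The general point of the sketch is that display code should work on a copy or a read-only view, never on the object being displayed.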

_delta clean up

Currently, .save not only has a large performance impact, it also causes unexpected behaviour. Refactor _delta / dirty data tracking: it's costly and has some fragile edge cases.

Custom queryset manager with arguments?

Read : https://groups.google.com/forum/?fromgroups#!forum/mongoengine-users

So should we use this project to add issues or the hmarr/mongoengine?

db_field and property AttributeError

If you have a field on a Document that uses a db_field value that is the same as a property on the Document itself, then an AttributeError is raised when trying to fetch that instance from Mongo.

I've created an example here:
https://gist.github.com/2944918

I've reproduced the bug in 0.6.10/python 2.7 but it worked fine in 0.5.2/python 2.7

Constructing documents with embedded documents should be a single operation

Given the following very simple case:

>>> class ED(mongoengine.EmbeddedDocument):
...     name = mongoengine.StringField()
... 
>>> class D(mongoengine.Document):
...     doc_name = mongoengine.StringField()
...     doc = mongoengine.EmbeddedDocumentField(ED)
...

The only way to create a document D with an ED doc, is to separately create the document ED. It should be as easy as:

>>> d = D(**{'doc_name': 'hello', 'doc': {'name': 'world'}})
>>> d.save()

However this results in a validation error.

It's really nice to be able to take advantage of the validation that mongoengine provides, but if I need to deconstruct and reconstruct a dictionary just to save it, it's far less desirable. I don't think it should be that hard to implement, as mongoengine should understand that if it gets a dictionary instead of an embedded document, it should go ahead and create the embedded document automatically.

Extract ObjectIds without loading ReferenceField

I have a reference field to a large object. Using lazy loading is causing performance issues as is select_related. I really need to do an in_bulk type operation and then map/join the data myself.

objectids = slist.scalar("sessionexception__distinctlog")

sessionexception is an EmbeddedDocument while distinctlog is a referencefield.

If I do a str() on the objectids array elements I get "DistinctLog" which is the ReferenceField's document type.

The problem is if I then do something like objectids[0].id it loads the entire document.

I can't for the life of me figure out a way to extract the ObjectId from the ReferenceField. Something like

objectids = slist.scalar("sessionexception__distinctlog__id")

gives me an error that I cannot do a join in mongodb. I'm trying to tell mongoengine that I just want the id to come out of the ReferenceField object. Any help would be appreciated. I might just have to make do with a raw ObjectId field type, either instead or in addition.

GeoPointField Index bug

I've found a bug in the recursive handling of geo_indexs when the name of the geo_index is the same as the name used to refer to the containing object by the parent.

Try running the two example scripts.

The does_not_work script creates a 2d index on the location field for the Parent object, which is obviously wrong, and gives a mongoengine.queryset.OperationError: Could not save document (location object expected, location array not in correct format) error on save().

Constructing queryset managers that use variables is broken

I added this as a comment to commit 8879d55, but figured I should add an issue for it as well.

This commit breaks implementing queryset manager functions that use variables.

@queryset_manager
def works(doc_cls, queryset):
    return queryset.filter(happy=True)

@queryset_manager
def broken(doc_cls, queryset):
    filter_dict = dict(happy=True)
    return queryset.filter(**filter_dict)

This is broken because co_varnames includes all the variables defined in the function. Did you mean to count arguments with co_argcount instead?

Even still, using a partial here breaks things when you try to use the QuerySet:

>>> works(title="foo")
[]

>>> broken(title="foo")
TypeError: broken() got an unexpected keyword argument 'title'

and

>>> works.filter(title="foo")
[]

>>> broken.filter(title="foo")
AttributeError: 'functools.partial' object has no attribute 'filter'

Although I guess the intention here is that you'd call broken(somearg) to get the actual QuerySet and then use that. I think just changing len(co_varnames) to co_argcount will fix things?
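The difference between the two code-object attributes is easy to check in isolation. The sketch below defines the two functions from the report without the decorator and inspects them; FakeQuerySet is a hypothetical stand-in so that the functions are callable, and __code__ is the Python 2.6+/3 spelling of the code object attribute.

```python
class FakeQuerySet:
    """Minimal stand-in so the functions below can be called."""

    def filter(self, **kwargs):
        return self


def works(doc_cls, queryset):
    return queryset.filter(happy=True)


def broken(doc_cls, queryset):
    filter_dict = dict(happy=True)
    return queryset.filter(**filter_dict)


for func in (works, broken):
    code = func.__code__
    # co_varnames lists arguments AND local variables;
    # co_argcount counts positional arguments only.
    print(func.__name__, len(code.co_varnames), code.co_argcount)
# works 2 2
# broken 3 2
```

Both functions take two arguments, so a check based on co_argcount treats them identically, whereas len(co_varnames) changes as soon as a local variable is introduced.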

FileField requires file object instead of file-like object

GridFS requires the argument to put() to be a file-like object that implements read(), yet mongoengine demands the value passed to put() to be an actual instance of a file object.

Django's InMemoryUploadedFile and TemporaryUploadedFile classes, both of which implement read(), therefore cannot be passed directly to a FileField, but must be converted to an actual string or file object. In my case, it's not a terribly big deal, but it seems a bit counterintuitive. Is this discrepancy between GridFS and MongoEngine by design?
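The discrepancy comes down to a type check versus a duck-typed check. The following sketch shows both on a hypothetical wrapper class (UploadedFile and is_file_like are illustrative names; this is not the code MongoEngine or Django use):

```python
import io


class UploadedFile:
    """Hypothetical upload wrapper: exposes read() but is not a real file."""

    def __init__(self, data):
        self._buffer = io.BytesIO(data)

    def read(self, size=-1):
        return self._buffer.read(size)


def is_file_like(obj):
    """Duck-typed check: anything with a callable read() qualifies."""
    return callable(getattr(obj, "read", None))


upload = UploadedFile(b"Hello, World!")
strict = isinstance(upload, io.IOBase)   # type check rejects the wrapper
relaxed = is_file_like(upload)           # duck typing accepts it
```

Accepting anything that passes the relaxed check would line the field up with the GridFS requirement described above.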

BinaryField as primary_key fails on delete()

mongoengine version 0.6.16

import uuid
from mongoengine import *

connect('test')


class Doc(Document):
    id = BinaryField(primary_key=True)

Doc.drop_collection()

d = Doc(id=uuid.uuid4().bytes).save()
d.delete()

Throws on delete():

Traceback (most recent call last):
  File "/tmp/testmongo.py", line 13, in <module>
    d.delete()
  File "/tmp/virtualenv/lib/python2.6/site-packages/mongoengine/document.py", line 294, in delete
    self.__class__.objects(pk=self.pk).delete(safe=safe)
  File "/tmp/virtualenv/lib/python2.6/site-packages/mongoengine/queryset.py", line 1338, in delete
    self._collection.remove(self._query, safe=safe)
  File "/tmp/virtualenv/lib/python2.6/site-packages/pymongo/collection.py", line 481, in remove
    safe, kwargs, self.__uuid_subtype), safe)
  File "/tmp/virtualenv/lib/python2.6/site-packages/pymongo/message.py", line 162, in delete
    encoded = bson.BSON.encode(spec, False, uuid_subtype)
  File "/tmp/virtualenv/lib/python2.6/site-packages/bson/__init__.py", line 533, in encode
    return cls(_dict_to_bson(document, check_keys, uuid_subtype)) 
bson.errors.InvalidStringData: strings in documents must be valid UTF-8

Issues with Custom Managers and Testing

Hello you studly MongoEngine devs!!

Currently I am working on a project using Django and MongoEngine, and I am trying to define a custom manager for some of my models.

Here is the code for my model class:

import django
from django.db import models
from managers import *
from mongoengine import *
from common.models import BaseDocument
from datetime import datetime

class BaseActivity(BaseDocument):
    created_time     = DateTimeField()
    updated_time     = DateTimeField(default = datetime.now())
    performed_by     = ListField()   #user type and id
    performed_by_url = StringField() #possibly change to URLField() depending on how we want to do these
    performed_on     = ListField()   #user type and id
    performed_on_url = StringField()
    location         = StringField()

    #objects = ActivityManager

    meta = {'allow_inheritance' : True, 'queryset_class' : ActivityManager}

#BaseActivity.add_to_class('objects', ActivityManager(BaseActivity, BaseActivity.objects._collection))

I left those comments in because those are other measures that I have tried with this class, none of which worked, so I stuck to the documented way to perform this action.

Here is the manager code that I am using:

from django.db import models
from mongoengine import *
from mongoengine.queryset import QuerySetManager, QuerySet

class ActivityManager(QuerySetManager):

    def __init__(self, *args):
        super(QuerySetManager, self).__init__()

    def get_recent_by_date(self, date):
        """
        gets recent activity from the start date given, to the current time
        """
        query_data = self.objects.filter(created_time__gte = date )
        return query_data

    def get_recent_by_user(self, user):
        """
        gets activity from the given user
        """
        query_data = self.objects.filter(performed_by= user)
        return query_data

    def get_activity(self, start_date, end_date):
        """
        gets all activity from the start date to the end date
        """
        query_data = self.objects.filter(created_time__lt = end_date).filter(created_time__gte = start_date)
        return query_data

This all seems pretty straightforward; nothing here looks suspicious to me.

Here is the test code:

import sys
from django.test import TestCase
from django.test.client import Client
from ..managers import *
from ..models import BaseActivity
from datetime import datetime

class ManagerTests(TestCase):

    def setUp(self):
        self.base_activity = BaseActivity(
            created_time     = datetime.today(),
            updated_time     = datetime.now(),
            performed_by     = ('Artist', 1),
            performed_by_url = 'Kyle',
            performed_on     = ('Venue', 1),
            performed_on_url = 'HoB',
            location         = "Boston",
        )
        self.base_activity2 = BaseActivity(
            created_time     = datetime(2011,7,1),
            updated_time     = datetime.now(),
            performed_by     = ('Artist', 1),
            performed_by_url = 'Kyle',
            performed_on     = ('Venue', 1),
            performed_on_url = 'HoB',
            location         = "Boston",
        )

        self.base_activity.save()
        self.base_activity2.save()

    def tearDown(self):
        BaseActivity.objects.delete()

    def test_get_recent_by_date(self):
        #print self.base_activity.objects.get_recent(datetime(2012, 7, 31))
        self.assertEqual(len(self.base_activity.objects.get_recent_by_date(datetime(2012, 7, 30))), 1)

    def test_get_recent_by_user(self):
        self.assertEqual(len(self.base_activity.objects.get_recent_by_user(('Artist', 1))), 2)

    def test_get_activity(self):
        self.assertEqual(len(self.base_activity.objects.get_activity(datetime(2012,7,29),datetime.today())), 1)

Here is the error that I get :

======================================================================
ERROR: test_get_recent_by_user (activity.tests.managers.ManagerTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/napa/kyle/source/trunk/napa/activity/tests/managers.py", line 30, in setUp
    self.base_activity.save()
  File "/home/napa/kyle/source/trunk/napa/common/models.py", line 29, in save
    return super(BaseDocument, self).save(*args, **kwargs)
  File "/home/napa/kyle/source/trunk/env/local/lib/python2.7/site-packages/mongoengine/document.py", line 194, in save
    collection = self.__class__.objects._collection
AttributeError: 'ActivityManager' object has no attribute '_collection'

Any insight into this issue would be greatly appreciated. As far as I can tell, I have done everything correctly for using a custom QuerySetManager, but inheritance doesn't seem to be right.

There is very little (if any) documentation on this type of thing, and the other stuff I found in my searches (i.e. that monkeypatch) still doesn't fix this issue.

This seems like a big issue since it could take down an entire site, but if this is an error of mine please point it out!!

Thanks so much for any assistance you can give!
Kyle

It is possible to search for numerical fields using strings

I found something that I think is a little inconsistent in the API,
and I'm wondering if I could report it as a bug to fix. It's not
technically a "bug" I suppose, but I think it is an inconsistency.

Suppose I have a simple model:

class A(Document):
    number = fields.IntField()
    a_list = fields.ListField()

If I want to find all A's that have number set to 5, I can do that
these two ways and they will both work:

A.objects.filter(number=5)
A.objects.filter(number='5')

That's fine and good (and makes mangopie easier to work on). Now,
suppose I'd like to filter by the length of the a_list field:

A.objects.filter(a_list__size=5) # This works
A.objects.filter(a_list__size='5') # This will silently fail, always returning 0 objects

We had a discussion about this on the mailing list:

http://groups.google.com/group/mongoengine-users/browse_thread/thread/e03fb7611026bda0

It seems to be the opinion that we should not allow searching for Ints using strings that could be interpreted as Ints. I think either way is fine, as long as we are consistent.

I would like to advocate that if we choose to disallow searching for Ints using strings that we raise exceptions when it is attempted instead of just returning nothing. Thank you!
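A strict check of the kind advocated here could look like the sketch below. check_size_operand is a hypothetical helper, not an existing MongoEngine function; it only illustrates raising early instead of sending a query that can never match.

```python
def check_size_operand(value):
    """Return value if it is usable as a list-size operand, otherwise raise."""
    # bool is a subclass of int, so it has to be rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(
            "size expects an integer, got %s: %r" % (type(value).__name__, value)
        )
    if value < 0:
        raise ValueError("size must be non-negative, got %d" % value)
    return value


accepted = check_size_operand(5)

try:
    check_size_operand('5')
    rejected = None
except TypeError as exc:
    rejected = exc   # the string is refused instead of silently matching nothing
```

The opposite policy, coercing '5' to 5 everywhere, would be equally consistent; the sketch only covers the stricter option.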

Converting mongoengine objects to JSON

It's entirely possible that I'm missing something, but it seems like the de facto way of converting mongoengine objects (my primary use-case is Document) to JSON is the following code:

https://groups.google.com/forum/#!activity/mongoengine-users/N7p0Mi065swJ/mongoengine-users/ya9XbrAVwi4/N7p0Mi065swJ

Am I missing a nicer way, and if not, would a pull request with an in-mongoengine-method be appreciated?

GridFS cleanups

FileFields don't clean up when deleted, and having the same default collection name makes drop_collection difficult.

Refs: hmarr#495 hmarr#496

Implement cleanup for:

import tempfile
import unittest
from mongoengine import *

import pymongo, gridfs

class GridFS(unittest.TestCase):


    def test_file_delete_cleanup(self):
        """Ensure that the gridfs file is deleted when a document
        with a GridFSProxied Field is deleted"""


        class RecursiveObject(EmbeddedDocument):
            obj = EmbeddedDocumentField('self')
            file = FileField()

        class TestFile(Document):
            recursive_obj = EmbeddedDocumentField(RecursiveObject)

        TestFile.drop_collection()

        def _create_testfile():
            testfile = TestFile(recursive_obj=RecursiveObject(obj=RecursiveObject()))
            testfile.recursive_obj.file.put('Hello, World!')
            testfile.recursive_obj.obj.file.put('MongoEngine')
            testfile.save()
            return testfile

        def _assert(testfile):
            testfile_grid_id = testfile.recursive_obj.file.grid_id
            testfile_fs = testfile.recursive_obj.file.fs

            testfile_grid_id_2 = testfile.recursive_obj.obj.file.grid_id
            testfile_fs_2 = testfile.recursive_obj.obj.file.fs

            self.assertFalse(testfile_fs.exists(testfile_grid_id))
            self.assertFalse(testfile_fs_2.exists(testfile_grid_id_2))

        # Test document.delete()
        testfile = _create_testfile()
        testfile.delete()
        _assert(testfile)

        # Test Queryset delete()
        testfile = _create_testfile()
        TestFile.objects().delete()
        _assert(testfile)

        # Test drop_collection
        testfile = _create_testfile()
        TestFile.drop_collection()
        _assert(testfile)

No reads go to the secondary server after enabling the replica set.

Here is the code that I use to connect to the db.(I call it in the
settings.py)
connect('TC', host='127.0.0.1:27017,127.0.0.1:27018,127.0.0.1:27019', replicaSet='myset', read_preference=ReadPreference.SECONDARY)

In one of my views, I call below:
TCUser.objects().slave_okay(True).only('first_name', 'last_name').get(id=uid)

I use mongostat to monitor the queries, but I haven't seen any query go to my secondary servers yet: the query column for my secondaries in mongostat has always been 0. If anyone has successfully directed reads to a secondary server in the past, then it's very likely that I am doing something wrong, either in reading mongostat or in setting up the replica set. I do see writes go to my secondary servers, though, according to mongostat.

FYI:
I am running Python 2.7.3, MongoEngine 0.6.5 and Django 1.3 on my MacBook, with a three-node replica set: one primary and two secondaries, no arbiter.

Searches for MapField

fail :/

import datetime
from mongoengine import *

connect('test')

class Boletim(Document):
    name = StringField()
    visited = MapField(DateTimeField())

Boletim.drop_collection()
# Use the declared field name `name` (the original used the undeclared `nome`).
b = Boletim(name="wilson", visited={'friends': datetime.datetime.now()})
b.save()

a = Boletim.objects(name="wilson", visited__friends__exists=True).first()

assert a == b

Implement queryset on abstract Documents

With the following architecture: class A is abstract, and classes B and C are concrete and both inherit from A.

class A(Document):
    meta = {'abstract': True, 'collection' : 'A'}
    foo = StringField()

class B(A):
    pass

class C(A):
    pass

I would like to perform some queries on the abstract class, like
A.objects.count() or A.objects(foo='bar'), which would respectively count the documents in collection A and retrieve the B and C objects whose attribute 'foo' equals 'bar'.

Currently, this does not seem possible, as 'objects' is not defined for abstract class A. I was wondering if this was done on purpose, because I would have thought that whether 'objects' exists was more a question of whether the 'collection' field is set or not.
One way I see to sort this out is for users to implement a home-made 'objects' method in class A that simply calls the 'objects' method of all known non-abstract subclasses.
Does it make sense for mongoengine to support this "natively"?
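For illustration, here is a rough sketch of the "home-made" approach described above, using plain Python classes as stand-ins for documents. Nothing here is MongoEngine API: the `abstract` flag, the `rows` lists and the helper names are all hypothetical, and a real version would delegate to each subclass's own `objects`.

```python
def concrete_subclasses(cls):
    """Yield every non-abstract descendant of cls, depth first."""
    for sub in cls.__subclasses__():
        # Check the class's own namespace so the flag is not inherited.
        if not sub.__dict__.get("abstract", False):
            yield sub
        yield from concrete_subclasses(sub)


# Plain classes standing in for documents (hypothetical, not MongoEngine).
class A:
    abstract = True


class B(A):
    rows = [{"foo": "bar"}, {"foo": "baz"}]


class C(A):
    rows = [{"foo": "bar"}]


def count(cls):
    """Count rows across all concrete subclasses of cls."""
    return sum(len(sub.rows) for sub in concrete_subclasses(cls))


def find(cls, **filters):
    """Return (class name, row) pairs whose fields match every filter."""
    return [
        (sub.__name__, row)
        for sub in concrete_subclasses(cls)
        for row in sub.rows
        if all(row.get(key) == value for key, value in filters.items())
    ]
```

Here `count(A)` adds up the rows of B and C, and `find(A, foo="bar")` returns matches from both concrete classes.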

Proposal: auto_now and auto_now_add options for DateTimeField

There are two very useful options for DateTimeField in Django: auto_now and auto_now_add.

From django docs:
"""
auto_now
Automatically set the field to now every time the object is saved. Useful for "last-modified" timestamps. Note that the current date is always used; it's not just a default value that you can override.

auto_now_add
Automatically set the field to now when the object is first created. Useful for creation of timestamps. Note that the current date is always used; it's not just a default value that you can override.
"""

I think it is a very simple task to implement at least the auto_now_add behavior. I understand that implementing auto_now is a bit harder. By the way, the fact that MongoEngine still has no support for these options makes me think that this is by design, which in my personal opinion is not right.
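To make the two behaviors concrete, here is a minimal plain-Python sketch of the semantics quoted from the Django docs. It is not a MongoEngine or Django implementation; the `Timestamped` class, its `created` and `modified` attributes and the injectable `clock` are assumptions made for illustration.

```python
from datetime import datetime, timezone


class Timestamped:
    """Plain-Python stand-in for a document with two timestamp fields."""

    def __init__(self, clock=None):
        # The clock is injectable so the behavior can be checked deterministically.
        self._clock = clock or (lambda: datetime.now(timezone.utc))
        self.created = None   # behaves like auto_now_add
        self.modified = None  # behaves like auto_now

    def save(self):
        now = self._clock()
        if self.created is None:
            # auto_now_add: set once, on the first save only.
            self.created = now
        # auto_now: overwritten on every save.
        self.modified = now
        return self
```

After two saves, `created` still holds the time of the first save while `modified` holds the time of the second.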

Incorrect source link on the mongoengine website

http://mongoengine.org/#source

Now: git://github.com/mongoengine/mongoengine.git

Should be: git clone git://github.com/MongoEngine/mongoengine.git

Deprecate single arg for queryset_manager

`scalar` should return only `ObjectId`s for relations, not fully dereference

If you put a reference field in scalar, it will fully dereference the related document.
Just returning the ObjectId sounds sane and in line with what you want to accomplish with scalar (do as little work as possible).

Testcase:


import unittest

from mongoengine import *
import mongoengine

import bson


class Person( Document ):
    name = StringField()


class Organization( Document ):
    name = StringField()
    owner = ReferenceField( 'Person' )


class ScalarTestCase( unittest.TestCase ):

    def setUp( self ):
        mongoengine.register_connection( mongoengine.DEFAULT_CONNECTION_NAME, 'mongoengine_test' )

    def tearDown( self ):
        pass

    def test_scalar( self ):
        person = Person( name="owner" )
        person.save()
        organization = Organization( name="company", owner=person )
        organization.save()

        scalar = Organization.objects.scalar( 'id', 'owner' )[0]
        assert isinstance(scalar[0], bson.ObjectId)

        # `owner` is a fully dereferenced Person object whereas it should just be an ObjectId or scalar misses its purpose
        assert isinstance(scalar[1], bson.ObjectId)

Cascade saves don't cascade through ListFields / dicts

Very expensive - but improving perf may give a way to achieve this..

QuerySet is edited in place, not cloned à la Django

Current behavior:

In [1]: from documents import Doc

In [2]: qs = Doc.objects

In [3]: qs
Out[3]: [<Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, <Doc: Doc object>, '...(remaining elements truncated)...'] 

In [4]: doc = qs.get(id='mydoc')

In [5]: doc
Out[5]: <Doc: Doc object>

In [6]: qs
Out[6]: [<Doc: Doc object>]

Expected behavior:

In the end, the queryset should still contain all the documents, not just the document that was obtained via get().
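As a rough illustration of the expected behavior, here is a small plain-Python sketch of a queryset that clones itself on every refinement instead of mutating in place. `CloningQuerySet` and its methods are hypothetical names for this sketch, not MongoEngine's QuerySet.

```python
class CloningQuerySet:
    """Toy queryset over a list of dicts; filtering returns a new object."""

    def __init__(self, items, filters=None):
        self._items = list(items)
        self._filters = dict(filters or {})

    def filter(self, **kwargs):
        # Build a new queryset; self._filters is left untouched.
        merged = dict(self._filters, **kwargs)
        return CloningQuerySet(self._items, merged)

    def get(self, **kwargs):
        # get() works on a clone, so the original keeps its full result set.
        matches = list(self.filter(**kwargs))
        if len(matches) != 1:
            raise LookupError("expected exactly one match, got %d" % len(matches))
        return matches[0]

    def __iter__(self):
        return (
            item for item in self._items
            if all(item.get(key) == value for key, value in self._filters.items())
        )
```

With this design, calling `qs.get(id='mydoc')` leaves `qs` iterating over all documents, as in the expected behavior above.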

provide a clean method for Document

In django:
https://docs.djangoproject.com/en/dev/ref/models/instances/?from=olddocs#django.db.models.Model.clean

https://github.com/django/django/blob/stable/1.4.x/django/db/models/base.py#L617
https://github.com/django/django/blob/stable/1.4.x/django/db/models/base.py#L809

example:

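from mongoengine import Document, ReferenceField, StringField, ValidationError


class Person(Document):
    name = StringField()
    status = StringField()
    # A partner is another Person, so reference the same class.
    partner = ReferenceField('self')

    def clean(self):
        # Cross-field check that a single field validator cannot express.
        if self.status == 'married' and not self.partner:
            raise ValidationError("Partner is missing", field_name="partner")
        elif self.status == 'single':
            self.partner = None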

ValidationError should give an object ID and class name

ValidationErrors don't give you any idea what object is failing validation. This makes it extremely hard to debug when these errors invariably pop up in log files. It would be very helpful if the document type and the mongo object ID were included in the exception error message.
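Something along these lines would already help when reading logs. This is only a sketch in plain Python: `DocumentValidationError` and its constructor arguments are made up for illustration and do not reflect MongoEngine's actual ValidationError signature.

```python
class DocumentValidationError(Exception):
    """Validation error whose message names the failing document."""

    def __init__(self, message, document_class, document_id=None):
        self.document_class = document_class
        self.document_id = document_id
        # Put the class name and id into the message so they show up in logs.
        super().__init__(
            "%s (%s, id=%s)" % (message, document_class, document_id)
        )
```

Raising `DocumentValidationError("Field is required", "Person", "abc123")` produces a message that identifies both the document type and the object.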

Py3 - str, unicode tests

Update tests to check unicode / byte strings - for python 3 support.

Bug in recursive save documents with FileField

Hello.

Here's a bug:

import os
import unittest

from mongoengine import *

TEST_IMAGE_PATH = os.path.join(os.path.dirname(__file__), 'mongoengine.png')

class DocumentTest(unittest.TestCase):

    def test_save_max_recursion_not_hit_with_file_field(self):

        class Foo(Document):
            name = StringField()
            file = FileField()
            bar = ReferenceField('self')

        Foo.drop_collection()

        a = Foo(name='hello')
        a.save()

        a.bar = a
        a.file = open(TEST_IMAGE_PATH, 'rb')
        a.save()

        # Confirm can save and it resets the changed fields without hitting
        # max recursion error
        b = Foo.objects.with_id(a.id)
        b.name='world'
        b.save()

        self.assertEquals(b.file, b.bar.file, b.bar.bar.file)

When saving, we have
b == b.bar == b.bar.bar,
and
b.file.grid_id == b.bar.file.grid_id == b.bar.bar.file.grid_id
but
b.file != b.bar.file != b.bar.bar.file
because each access returns a different GridFSProxy object.

P.S. This bug only reproduces with a real file stored in GridFS.

Pull request with test:
#25

Make documentation of map_reduce function 'output' options more clear.

I just began using MongoEngine a few days ago. I have used MongoDB in the past. I was building some map/reduce methods, when I realized that I was unsure how to specify when to "merge" the results of a reduce function, as opposed to using the "replace", "reduce", and "inline" output options. It wasn't immediately clear to me that the arguments to the output parameter in the map_reduce function call could be a dictionary of the form {"OPTION": "COLLECTION"}; see the paragraph titled "Options Outputs" for a description: http://www.mongodb.org/display/DOCS/MapReduce .

Obviously, this is almost identical to pymongo's syntax, but I wasn't able to confirm that until I actually looked in the MongoEngine package and examined the map_reduce function.

I recommend updating the description of the output parameter in the map_reduce function documentation at http://mongoengine-odm.readthedocs.org/en/latest/apireference.html to specify that the output argument can be a dictionary, and that it supports several output methods.
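For readers who land here, the dictionary form mentioned above can be sketched like this. The helper `build_output` and the collection name are hypothetical; only the {"OPTION": "COLLECTION"} shape and the replace, merge and reduce option names come from the description above, and the inline case is left out on purpose.

```python
# Output actions that write to a named collection.
COLLECTION_ACTIONS = ("replace", "merge", "reduce")


def build_output(action, collection):
    """Return an output spec of the form {"OPTION": "COLLECTION"}."""
    if action not in COLLECTION_ACTIONS:
        raise ValueError("unknown output action: %r" % (action,))
    if not collection:
        raise ValueError("a target collection name is required")
    return {action: collection}


# Hypothetical collection name, for illustration only.
merge_spec = build_output("merge", "word_counts")
```

The resulting dictionary is what would be passed as the output argument in place of a plain collection name.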

Thanks!

UUIDField should be implemented as mongodb's native uuid type and not a string

pymongo accepts uuid.UUID objects and transforms them to BSON UUID objects. These objects are saved in binary format and are more efficient than the current implementation.
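The size difference is easy to see with the standard library alone. This snippet only compares the two representations of a UUID value; it does not touch pymongo or MongoEngine, and the UUID itself is an arbitrary example.

```python
import uuid

# Arbitrary example value.
value = uuid.UUID("12345678-1234-5678-1234-567812345678")

as_string = str(value)   # hyphenated hex text, as a string field would store it
as_bytes = value.bytes   # raw 16-byte form, as a binary UUID would store it

print(len(as_string), len(as_bytes))  # prints: 36 16

# The binary form round-trips back to the same UUID.
restored = uuid.UUID(bytes=as_bytes)
```

So the binary form needs 16 bytes per value, against 36 characters for the string form.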

logic of including _cls and _types in BaseDocument.to_mongo() in mongoengine/base.py needs correction.

In the comments, it is stated

"# Only add _cls and _types if allow_inheritance is not False"

But '_cls' and '_types' are added to the dictionary when:
self._meta.get('allow_inheritance', True) == False)

The False on the right side of the condition should be changed to True.

python3k support

I will not ask you to rewrite everything for py3k. I just need a hint so I can do it myself ))
I've refactored the sources with 2to3 and, after some manual fixes, got stuck in an import loop.

Here it is:

document.py

from .base import (DocumentMetaclass, TopLevelDocumentMetaclass, BaseDocument,
              BaseDict, BaseList)

base.py (inside metaclass)

from mongoengine import Document, EmbeddedDocument, DictField

I think it is not so hard to fix the odd places and add 2to3 to setup.py. What do you think about that?

ListField(GeoPointField()) problem with namedtuple

mongoengine 0.6.13

from collections import namedtuple
from mongoengine import connect, Document, GeoPointField, ListField

connect('test', port=27019)

class Doc(Document):
    points = ListField(GeoPointField())


Point = namedtuple('Point', 'lon lat')
p = Point(lon=1, lat=2)

Doc(points=[tuple(p)]).save()
Doc(points=[p]).save()

The first save() works, but the second one throws an exception:

Traceback (most recent call last):
  File "./testm.py", line 14, in <module>
    Doc(points=[p]).save()
  File "/tmp/venv/lib/python2.6/site-packages/mongoengine/document.py", line 184, in save
    self.validate()
  File "/tmp/venv/lib/python2.6/site-packages/mongoengine/base.py", line 884, in validate
    for name, field in self._fields.items()]
  File "/tmp/venv/lib/python2.6/site-packages/mongoengine/base.py", line 292, in __get__
    value, max_depth=1, instance=instance, name=self.name
  File "/tmp/venv/lib/python2.6/site-packages/mongoengine/dereference.py", line 44, in __call__
    self.reference_map = self._find_references(items)
  File "/tmp/venv/lib/python2.6/site-packages/mongoengine/dereference.py", line 69, in _find_references
    for field_name, field in item._fields.iteritems():
AttributeError: 'tuple' object has no attribute 'iteritems'

Also tried with plain point = GeoPointField() and that worked either way.
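A likely explanation, judging only from the traceback: the dereferencing code calls `item._fields.iteritems()`, and a namedtuple happens to expose a `_fields` attribute too, except that it is a plain tuple of field names. The snippet below just demonstrates that standard-library behavior; it makes no claim about how MongoEngine decides which items to inspect.

```python
from collections import namedtuple

Point = namedtuple("Point", "lon lat")
p = Point(lon=1, lat=2)

# A namedtuple instance carries a `_fields` attribute...
namedtuple_has_fields = hasattr(p, "_fields")

# ...but it is a tuple of names, not a dict-like mapping of fields.
fields_value = p._fields

# Converting to a plain tuple drops the attribute, which matches the
# observation that Doc(points=[tuple(p)]).save() works.
plain_tuple_has_fields = hasattr(tuple(p), "_fields")
```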

Support addToSet and each

http://www.mongodb.org/display/DOCS/Updating#Updating-%24addToSetand%24each
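For reference, this is the shape of the raw update document that combines the two operators, together with a small plain-Python simulation of what it does to a list field. The helper names and the `tags` field are hypothetical, and how MongoEngine should expose this is left open.

```python
def add_to_set_each(field, values):
    """Build a raw MongoDB update that adds each value not already present."""
    return {"$addToSet": {field: {"$each": list(values)}}}


def apply_add_to_set(document, update):
    """Simulate the update on a plain dict (illustration only)."""
    result = dict(document)
    for field, spec in update["$addToSet"].items():
        current = list(result.get(field, []))
        for value in spec["$each"]:
            # Set semantics: skip values that are already in the list.
            if value not in current:
                current.append(value)
        result[field] = current
    return result


update = add_to_set_each("tags", ["python", "mongodb"])
after = apply_add_to_set({"tags": ["python"]}, update)
```

Only the value that was missing gets appended; the existing one is not duplicated.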

Need back reference from embedded document to parent

class Child(EmbeddedDocument):
    def save(self):
        pass  # impossible: no access to the parent document

    def change_some_parents_attr_val(self):
        pass  # impossible: no access to the parent document


class Parent(Document):
    child = EmbeddedDocumentField(Child)

I think it's not so hard to create a back reference to a parent (root) document during class initialization
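A minimal plain-Python sketch of the idea, assuming the parent registers itself on the child when the child is attached. The classes, the `_parent_ref` attribute and `rename_parent` are hypothetical, and a weak reference is used here only to avoid a reference cycle; this is not how MongoEngine is implemented.

```python
import weakref


class Child:
    def __init__(self):
        self._parent_ref = None  # set by the parent when attached

    @property
    def parent(self):
        # Resolve the weak reference; None if detached or collected.
        return self._parent_ref() if self._parent_ref is not None else None

    def rename_parent(self, new_name):
        parent = self.parent
        if parent is None:
            raise RuntimeError("child is not attached to a parent")
        parent.name = new_name


class Parent:
    def __init__(self, name, child):
        self.name = name
        self.child = child
        # Back reference from the embedded object to its owner.
        child._parent_ref = weakref.ref(self)
```

With the back reference in place, `parent.child.rename_parent("new")` can change an attribute on the owning object.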

Can't delete an EmbeddedDocument from a ListField

my documents

from mongoengine import *
from project.users.documents import User


class Collaborator(EmbeddedDocument):
    user = ReferenceField(User, unique=True)
    is_admin = BooleanField(default=False)

    def __unicode__(self):
        return u'%s - is_admin: %s' % (self.user, self.is_admin)


class Site(Document):
    name = StringField(max_length=75, unique=True, required=True)
    collaborators = ListField(EmbeddedDocumentField(Collaborator))

    def __unicode__(self):
        return u'%s' % self.name

    meta = {
        'indexes': ['name'],
    }

When I want to remove a specific collaborator from the site in the following way

Site.objects(id=site.id).update_one(pull__collaborators__user=user)

That returns 1, but when I check, the collaborator document is still there.
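For comparison, the raw MongoDB update one would expect for this case pulls list items matching a sub-document condition. Below is the shape of that update plus a plain-Python simulation of its effect; the sample data and `pull_matching` helper are made up, and whether MongoEngine generates this update from `pull__collaborators__user` is exactly what is in question here.

```python
def pull_matching(document, field, condition):
    """Return a copy of document without list items matching condition."""
    kept = [
        item for item in document.get(field, [])
        if not all(item.get(key) == value for key, value in condition.items())
    ]
    return dict(document, **{field: kept})


# Sample data with integer ids standing in for user references.
site = {
    "name": "example",
    "collaborators": [
        {"user": 1, "is_admin": False},
        {"user": 2, "is_admin": True},
    ],
}

# Expected raw update: remove collaborators whose `user` equals 1.
update = {"$pull": {"collaborators": {"user": 1}}}

field, condition = next(iter(update["$pull"].items()))
after = pull_matching(site, field, condition)
```

In the simulation only the collaborator with `user` equal to 1 is removed, which is the result the `update_one` call above was expected to produce.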