
davidlatwe / montydb


Monty, Mongo tinified. MongoDB implemented in Python !

License: BSD 3-Clause "New" or "Revised" License

Python 99.78% Makefile 0.22%
mongodb tinydb python pymongo mocking lmdb

montydb's Introduction


Monty, Mongo tinified. MongoDB implemented in Python!

Inspired by TinyDB and its extension TinyMongo

What is it?

A pure Python-implemented database that looks and works like MongoDB.

>>> from montydb import MontyClient

>>> col = MontyClient(":memory:").db.test
>>> col.insert_many( [{"stock": "A", "qty": 6}, {"stock": "A", "qty": 2}] )
>>> cur = col.find( {"stock": "A", "qty": {"$gt": 4}} )
>>> next(cur)
{'_id': ObjectId('5ad34e537e8dd45d9c61a456'), 'stock': 'A', 'qty': 6}

Most of the CRUD operators have been implemented. You can visit issue #14 to see the full list.
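
The query in the example above reads as "stock equals A and qty is greater than 4". As a rough illustration of that matching rule, here is a minimal pure-Python sketch. It is not montydb's query engine: the helper name `matches` is hypothetical, and only equality and `$gt` are handled.

```python
def matches(doc, query):
    """Return True if doc satisfies a tiny subset of MongoDB query syntax."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):
            # Operator form, e.g. {"$gt": 4}; only $gt is sketched here.
            for op, operand in cond.items():
                if op != "$gt":
                    raise NotImplementedError(op)
                if value is None or not value > operand:
                    return False
        elif value != cond:
            # Plain form, e.g. {"stock": "A"}, means equality.
            return False
    return True


docs = [{"stock": "A", "qty": 6}, {"stock": "A", "qty": 2}]
found = [d for d in docs if matches(d, {"stock": "A", "qty": {"$gt": 4}})]
```

Only the first document passes both conditions, which mirrors what the cursor returns in the example.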

This project is tested against:

  • MongoDB: 3.6, 4.0, 4.2 (4.4 on the way💦)
  • Python: 3.7, 3.8, 3.9, 3.10, 3.11

Install

pip install montydb
  • optional, to use real bson in operation (pymongo will be installed). For minimum requirements, montydb ships with its own fork of ObjectId in montydb.types, so you may ignore this option if ObjectId is all you need from bson

    pip install montydb[bson]
  • optional, to use lightning memory-mapped db as storage engine

    pip install montydb[lmdb]

Storage

🦄 Available storage engines:

  • in-memory
  • flat-file
  • sqlite
  • lmdb (lightning memory-mapped db)

Depending on which one you use, you may have to configure the storage engine before you start.

⚠️

The configuration process is only required when a repository is created or modified. Also, one repository (the parent level of databases) can be assigned only one storage engine.

To configure a storage engine, take flat-file storage as an example:

from montydb import set_storage, MontyClient


set_storage(
    # general settings
    
    repository="/db/repo",  # dir path for database to live on disk, default is {cwd}
    storage="flatfile",     # storage name, default "flatfile"
    mongo_version="4.0",    # try matching behavior with this mongodb version
    use_bson=False,         # default None, and will import pymongo's bson if None or True

    # any other kwargs are storage engine settings.
    
    cache_modified=10,       # the only setting that flat-file has
)

# ready to go

Once that is done, there should be a file named monty.storage.cfg saved in your db repository path, which is /db/repo in the above example.

Configuration

Now let's move on to each storage engine's config settings.

🌟 In-Memory

memory storage neither needs nor has any configuration, and nothing is saved to disk.

from montydb import MontyClient


client = MontyClient(":memory:")

# ready to go

🔰 Flat-File

flatfile is the default on-disk storage engine.

from montydb import set_storage, MontyClient


set_storage("/db/repo", cache_modified=5)  # optional step
client = MontyClient("/db/repo")  # use current working dir if no path given

# ready to go

FlatFile config:

[flatfile]
cache_modified: 0  # number of document CRUD operations cached before flushing to disk.
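
The config shown above looks like INI syntax, so it can probably be inspected with Python's standard `configparser`. The sketch below parses a sample string that mirrors the FlatFile section. Treat the exact on-disk layout of monty.storage.cfg as an assumption, since the real file may contain more sections.

```python
import configparser

# Sample text mirroring the FlatFile config shown above. This is an
# assumption about the format, not a copy of a real monty.storage.cfg.
SAMPLE = """
[flatfile]
cache_modified: 0  # number of document CRUD operations cached
"""

# Inline comments are not stripped by default, so enable the "#" prefix.
parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
parser.read_string(SAMPLE)
cache_modified = parser.getint("flatfile", "cache_modified")
```

To read a real repository, `parser.read(path)` could be pointed at the monty.storage.cfg file instead of the sample string.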

💎 SQLite

sqlite is NOT the default on-disk storage engine, so it must be configured before getting a client.

A pre-existing sqlite storage file saved by montydb<=1.3.0 is not readable or writable after montydb==2.0.0.

from montydb import set_storage, MontyClient


set_storage("/db/repo", storage="sqlite")  # required, to set sqlite as engine
client = MontyClient("/db/repo")

# ready to go

SQLite config:

[sqlite]
journal_mode = WAL
check_same_thread =   # Leave it empty as False, or any value will be True

Or,

repo = "path_to/repo"
set_storage(
    repository=repo,
    storage="sqlite",
    use_bson=True,
    # sqlite pragma
    journal_mode="WAL",
    # sqlite connection option
    check_same_thread=False,
)
client = MontyClient(repo)
...

SQLite write concern:

client = MontyClient("/db/repo",
                     synchronous=1,
                     automatic_index=False,
                     busy_timeout=5000)
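
The option names above (journal_mode, synchronous, automatic_index, busy_timeout) match SQLite pragmas of the same names, and check_same_thread matches a `sqlite3.connect` argument. As a sketch of what those settings mean at the SQLite level, the following uses only the standard `sqlite3` module on a throwaway file. How montydb applies them internally is not shown here and is an assumption.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # check_same_thread is a connection option, not a pragma.
    conn = sqlite3.connect(os.path.join(tmp, "demo.db"), check_same_thread=False)
    # WAL journaling needs a file-backed database; the pragma returns the mode.
    journal_mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute("PRAGMA synchronous=1")        # 1 means NORMAL
    conn.execute("PRAGMA automatic_index=OFF")
    conn.execute("PRAGMA busy_timeout=5000")    # milliseconds
    synchronous = conn.execute("PRAGMA synchronous").fetchone()[0]
    busy_timeout = conn.execute("PRAGMA busy_timeout").fetchone()[0]
    conn.close()
```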

🚀 LMDB (Lightning Memory-Mapped Database)

lightning is NOT the default on-disk storage engine, so it must be configured before getting a client.

Newly implemented.

from montydb import set_storage, MontyClient


set_storage("/db/repo", storage="lightning")  # required, to set lightning as engine
client = MontyClient("/db/repo")

# ready to go

LMDB config:

[lightning]
map_size: 10485760  # Maximum size database may grow to.

URI

Optionally, you can prefix the repository path with the montydb URI scheme.

client = MontyClient("montydb:///db/repo")
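
To make the relationship between the URI form and the plain path concrete, here is a small sketch using the standard `urllib.parse`. The helper name `repo_path` is hypothetical, and this is not necessarily how montydb itself parses the URI.

```python
from urllib.parse import urlparse


def repo_path(uri_or_path):
    """Strip a montydb:// prefix, leaving plain paths untouched."""
    parsed = urlparse(uri_or_path)
    if parsed.scheme == "montydb":
        # "montydb:///db/repo" has an empty host and the path "/db/repo".
        return parsed.path
    return uri_or_path
```

Under this reading, `repo_path("montydb:///db/repo")` and `repo_path("/db/repo")` name the same repository.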

Utilities

Pymongo's bson may be required.

  • montyimport

    Imports content from an Extended JSON file into a MontyCollection instance. The JSON file could be generated from montyexport or mongoexport.

    from montydb import open_repo, utils
    
    
    with open_repo("foo/bar"):
        utils.montyimport("db", "col", "/path/dump.json")
  • montyexport

    Produces a JSON export of data stored in a MontyCollection instance. The JSON file could be loaded by montyimport or mongoimport.

    from montydb import open_repo, utils
    
    
    with open_repo("foo/bar"):
        utils.montyexport("db", "col", "/data/dump.json")
  • montyrestore

    Loads a binary database dump into a MontyCollection instance. The BSON file could be generated from montydump or mongodump.

    from montydb import open_repo, utils
    
    
    with open_repo("foo/bar"):
        utils.montyrestore("db", "col", "/path/dump.bson")
  • montydump

    Creates a binary export from a MontyCollection instance. The BSON file could be loaded by montyrestore or mongorestore.

    from montydb import open_repo, utils
    
    
    with open_repo("foo/bar"):
        utils.montydump("db", "col", "/data/dump.bson")
  • MongoQueryRecorder

    Records MongoDB query results over a period of time. Requires access to the database profiler.

    This works by filtering the database profile data and reproducing the queries of find and distinct commands.

    from pymongo import MongoClient
    from montydb.utils import MongoQueryRecorder
    
    client = MongoClient()
    recorder = MongoQueryRecorder(client["mydb"])
    recorder.start()
    
    # Make some queries or run the App...
    recorder.stop()
    recorder.extract()
    {<collection_1>: [<doc_1>, <doc_2>, ...], ...}
  • MontyList

    Experimental. A subclass of list that combines the common CRUD methods from Mongo's Collection and Cursor.

    from montydb.utils import MontyList
    
    mtl = MontyList([1, 2, {"a": 1}, {"a": 5}, {"a": 8}])
    mtl.find({"a": {"$gt": 3}})
    MontyList([{'a': 5}, {'a': 8}])

Development

montydb uses Poetry to make it easy to manage dependencies and set up the development environment.

Initial setup

After cloning the repository, you need to run the following commands to set up the development environment:

make install

This will create a virtual environment and download the required dependencies.

Updating dependencies

To keep dependencies updated after git operations such as local updates or merging changes into a local dev branch:

make update

Makefile

A makefile is used to simplify common operations such as updating, testing, and deploying.

make or make help

install                        install all dependencies locally
update                         update project dependencies locally (run after git update)
ci                             Run all checks (codespell, lint, bandit, test)
test                           Run tests
lint                           Run linting with flake8
codespell                      Find typos with codespell
bandit                         Run static security analysis with bandit
build                          Build project using poetry
clean                          Clean project

Run mongo docker image

Most of our tests compare montydb CRUD results against a real mongodb instance, so a mongodb instance must be running before testing.

For example, if we want to test against mongo 4.4:

docker run --name monty-4.4 -p 30044:27017 -d mongo:4.4

Tests

poetry run pytest --storage {storage engine name} --mongodb {mongo instance url} [--use-bson]

Example:

poetry run pytest --storage memory --mongodb localhost:30044 --use-bson

Why did I make this?

Mainly for personal skill practicing and fun.

I work in the VFX industry, and some of my production needs (mostly edge cases) require running in a limited environment (e.g. outsourced render farms) that may have trouble running or connecting to a MongoDB instance. I found this project really helps.


This project is supported by JetBrains


montydb's People

Contributors

arieltorti, bobuk, cclauss, davidlatwe, dependabot[bot], jqueguiner, madeinoz67, michaelcurrin


montydb's Issues

Cannot upsert document with "_id" specified

Problem

Specifying the document "_id" field in an update raises WriteError: Performing an update on the path '_id' would modify the immutable field '_id', even if the "_id" did not change.

>>> col.update_one({"_id": "my-id"}, {"$set": {"_id": "my-id"}}, upsert=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "...\montydb\collection.py", line 285, in update_one
    updator = Updator(update, array_filters)
  File "...\montydb\engine\update.py", line 80, in __init__
    self.operations = OrderedDict(sorted(self.parser(spec).items()))
  File "...\montydb\engine\update.py", line 150, in parser
    raise WriteError(msg, code=66)
montydb.errors.WriteError: Performing an update on the path '_id' would modify the immutable field '_id'

Pymongo only raises this error when the update document uses a different _id from the filter document.

  • col.update_one({"_id": "some-id"}, {"$set": {"_id": "some-id", "foo": "barbar"}}, upsert=True)

    Valid, filter and update both specify the same _id.

  • col.update_one({"_id": "some-id"}, {"$set": {"_id": "other-id", "foo": "barbar"}}, upsert=True)

    Invalid, the _id is not the same.

  • col.update_one({"foo": "bar"}, {"$set": {"_id": "some-id", "foo": "barbar"}}, upsert=True)

    Maybe invalid, if the filter matches an existing document and its _id is not the same.
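
The three cases above can be summarized as one rule: a `$set` on `_id` is only a problem when it would actually change the `_id`. The following is a minimal sketch of that rule as described in this issue. The function name `id_conflict` and its `existing_id` parameter are hypothetical, and this is not montydb's or MongoDB's actual implementation.

```python
def id_conflict(filter_doc, set_doc, existing_id=None):
    """Return True if a $set on _id would change the document's _id."""
    if "_id" not in set_doc:
        return False
    if existing_id is not None:
        # An existing document matched the filter; its _id is fixed.
        target = existing_id
    else:
        # Upsert case: the new _id comes from the filter if present there.
        target = filter_doc.get("_id", set_doc["_id"])
    return set_doc["_id"] != target
```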

Positional operator issue

Doesn't work with positional operators. The same update_one succeeds with pymongo. Not sure whether monty supports this feature yet, but as far as I can see it causes an error.

update_one({'users': {'$elemMatch': {'_id': id_}}}, {'$set': {'invoices.$.name': name}})

montydb.errors.WriteError: The positional operator did not find the match needed from the query.

Improve maintainability

  • To be clear about what's implemented and what's not
  • Improve README
  • Add wiki
  • Add doc strings
  • Add type hints (after #28)
  • Add development guide (after #28)

Updating a document in an array leads to an error

Updating a document in an array leads to an error:
https://docs.mongodb.com/manual/reference/operator/update/positional/#update-documents-in-an-array

  collection.update_one(
      filter={
          "order": order_number,
          "products.product_id": product_id,
      },
      update={
          "$set": {
              "products.$.quantity": quantity
          }
      }
  )

leads to an exception:

            else:
                # Replace "$" into matched array element index
                matched = fieldwalker.get_matched()
>               position = matched.split(".")[0]
E               AttributeError: 'NoneType' object has no attribute 'split'
\field_walker.py:586: AttributeError
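
For readers unfamiliar with the positional operator, the idea is that "$" in the update path stands for the index of the first array element matched by the filter. Below is a minimal sketch of that substitution. The helper `resolve_positional` is hypothetical and far simpler than montydb's FieldWalker; it also raises a clear error when nothing matched instead of failing on `None`.

```python
def resolve_positional(doc, array_field, key, value, path):
    """Replace "$" in path with the index of the first matching element."""
    for index, element in enumerate(doc.get(array_field, [])):
        if isinstance(element, dict) and element.get(key) == value:
            return path.replace("$", str(index), 1)
    raise LookupError("positional operator did not find a match")


order = {
    "order": 1,
    "products": [
        {"product_id": "a", "quantity": 1},
        {"product_id": "b", "quantity": 2},
    ],
}
resolved = resolve_positional(
    order, "products", "product_id", "b", "products.$.quantity"
)
```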

Although README says bson is optional but still being required

Problem

>>> from montydb import MontyClient
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "...\montydb\__init__.py", line 19, in <module>
    from . import utils
  File "...\montydb\utils\__init__.py", line 2, in <module>
    from .io import (
  File "...\montydb\utils\io.py", line 7, in <module>
    from bson import decode_all, BSON
ModuleNotFoundError: No module named 'bson'

These bson import statements should be somewhere else:

from bson import decode_all, BSON
from bson.codec_options import CodecOptions
from bson.py3compat import string_type
from bson.json_util import (
    loads as _loads,
    dumps as _dumps,
    RELAXED_JSON_OPTIONS as _default_json_opts,
)
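
One common way to keep an optional dependency from breaking the top-level import is to defer it until the feature is used. The sketch below shows that pattern with the standard `importlib`. The function names are hypothetical, and this is not necessarily how the project resolved the issue.

```python
import importlib.util


def bson_available():
    """Report whether a top-level bson package can be imported."""
    return importlib.util.find_spec("bson") is not None


def load_bson():
    """Import bson on demand, with a clear error if it is missing."""
    if not bson_available():
        raise ImportError(
            "this utility needs bson; try: pip install montydb[bson]"
        )
    import bson  # deferred so that importing the package never fails

    return bson
```

With this shape, only the import and export utilities would call `load_bson()`, and a plain `from montydb import MontyClient` would not touch bson at all.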

Support regexes with `$not`

Seems that this is not implemented:

{"$not": {"$regex": "(?i)Hello World"}}

output:

{"time":"2023-07-19T20:12:37Z","message":"  File \"/usr/local/lib/python3.10/site-packages/montydb/engine/queries.py\", line 381, in _parse_not"}
{"time":"2023-07-19T20:12:37Z","message":"    raise OperationFailure(\"$not cannot have a regex\")"}
{"time":"2023-07-19T20:12:37Z","message":"montydb.errors.OperationFailure: $not cannot have a regex"}
{"time":"2023-07-19T20:12:37Z","message":"ERROR:root:$not cannot have a regex"}
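
For context, the requested semantics are that `$not` with `$regex` keeps documents whose field does not match the pattern, including documents where the field is missing. Here is a minimal sketch of that behavior with the standard `re` module. The helper `not_regex` is hypothetical and is not montydb code.

```python
import re


def not_regex(value, pattern):
    """True when value is not a string matching pattern."""
    return not (isinstance(value, str) and re.search(pattern, value))


docs = [{"msg": "hello world"}, {"msg": "goodbye"}, {"other": 1}]
kept = [d for d in docs if not_regex(d.get("msg"), "(?i)Hello World")]
```

The first document is dropped because the case-insensitive pattern matches it; the other two are kept.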

Support for Mongoengine

Montydb with the sqlite backend supports multi-process operation, at least in my initial trials with just 2 processes writing to the database simultaneously. This is clearly an advantage over mongita, which also provides a file/memory clone of pymongo but does not provide multi-process support. However, mongita does support Mongoengine, which was achieved recently.

Has anyone been able to use Montydb with Mongoengine?

bson.errors.InvalidBSON: objsize too large after update_one

Getting this error on any operation after editing the database. Using lmdb. I guess it started after a wrong update_one, but I'm not sure. Anyway, here is the original code that edits the db:

    record = {'free': 12313232, 'path': '/media/mnt/'}
    col = getattr(db, 'storages')

    record = {**models['storage'], **record}
    if col.count_documents({'path': record['path']}) > 0:
        col.update_one({'path': record['path']}, {'$set': {**record}})
    else:
        col.insert_one(record)

Error text:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/.local/lib/python3.8/site-packages/montydb/cursor.py", line 365, in next
    if len(self._data) or self._refresh():
  File "/home/user/.local/lib/python3.8/site-packages/montydb/cursor.py", line 354, in _refresh
    self.__query()
  File "/home/user/.local/lib/python3.8/site-packages/montydb/cursor.py", line 311, in __query
    for doc in documents:
  File "/home/user/.local/lib/python3.8/site-packages/montydb/storage/lightning.py", line 253, in <genexpr>
    docs = (self._decode_doc(doc) for doc in self._conn.iter_docs())
  File "/home/user/.local/lib/python3.8/site-packages/montydb/storage/__init__.py", line 227, in _decode_doc
    return bson.document_decode(
  File "/home/user/.local/lib/python3.8/site-packages/montydb/types/_bson.py", line 64, in document_decode
    return cls.BSON(doc).decode(codec_options)
  File "/home/user/.local/lib/python3.8/site-packages/bson/__init__.py", line 1258, in decode
    return decode(self, codec_options)
  File "/home/user/.local/lib/python3.8/site-packages/bson/__init__.py", line 970, in decode
    return _bson_to_dict(data, codec_options)
bson.errors.InvalidBSON: objsize too large

This is how the db looks in plain text:

$ cat db/storages.mdb @ @ ~60c9a9cf368c720edc2668a6{"path": "/mnt/hdd4", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a6"}}~60c9a9cf368c720edc2668a5{"path": "/boot/efi", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a5"}}y60c9a9cf368c720edc2668a4{"path": "/run", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a4"}}v60c9a9cf368c720edc2668a3{"path": "/", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a3"}} f*~s60c9a9cf368c720edc2668a3{"path": "/", "total": 9999, "used": 99, "free": 0, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a3"}}2~60c9a9cf368c720edc2668a6{"path": "/mnt/hdd4", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a6"}}~60c9a9cf368c720edc2668a5{"path": "/boot/efi", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a5"}}y60c9a9cf368c720edc2668a4{"path": "/run", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a4"} f*~sr60c9a9cf368c720edc2668a3{"path": "/", "total": 9999, "used": 99, "free": 0, "status": "busy", "_id": {"$oid": "60c9a9cf368c720edc2668a3"}}~60c9a9cf368c720edc2668a6{"path": "/mnt/hdd4", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a6"}}~60c9a9cf368c720edc2668a5{"path": "/boot/efi", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a5"}}y60c9a9cf368c720edc2668a4{"path": "/run", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a4"}}

Or this:

$ cat db/storages.mdb @ @ 0 documents 0 document 0 ฬฯ†ฤง^sTb_idฬฯ†ฤง^sTpath/media/user/ssd1totalusedfreestatusreadyr60c9a9cf368c720edc2668a3{"path": "/", "total": 9999, "used": 99, "free": 0, "status": "busy", "_id": {"$oid": "60c9a9cf368c720edc2668a3"}}~60c9a9cf368c720edc2668a6{"path": "/mnt/hdd4", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a6"}}~60c9a9cf368c720edc2668a5{"path": "/boot/efi", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a5"}}y60c9a9cf368c720edc2668a4{"path": "/run", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a4"}} 0 documents ฬฯ†ฤง^sTb_idฬฯ†ฤง^sTpath/media/user/ssd1totalusedfreestatusreadyr60c9a9cf368c720edc2668a3{"path": "/", "total": 9999, "used": 99, "free": 0, "status": "busy", "_id": {"$oid": "60c9a9cf368c720edc2668a3"}}~60c9a9cf368c720edc2668a6{"path": "/mnt/hdd4", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a6"}}~60c9a9cf368c720edc2668a5{"path": "/boot/efi", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a5"}}y60c9a9cf368c720edc2668a4{"path": "/run", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a4"} 0 0 ฬฯ†ฤง^sTb_idฬฯ†ฤง^sTpath/media/user/ssd1totalusedfreestatusreadyr60c9a9cf368c720edc2668a3{"path": "/", "total": 9999, "used": 99, "free": 0, "status": "busy", "_id": {"$oid": "60c9a9cf368c720edc2668a3"}}~60c9a9cf368c720edc2668a6{"path": "/mnt/hdd4", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a6"}}~60c9a9cf368c720edc2668a5{"path": "/boot/efi", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": "60c9a9cf368c720edc2668a5"}}y60c9a9cf368c720edc2668a4{"path": "/run", "total": 9999, "used": 99, "free": 9900, "status": "ready", "_id": {"$oid": 
"60c9a9cf368c720edc2668a4"}}

Does it store changes after updating?

Enable testing with multiple versions of MongoDB in one shot

Expose the mongo versions monty supports as pytest command line args to bind ports.
Allow multiple mongodb URIs to be set for testing, where each URI can point to a mongodb instance of a different version, e.g.

pytest --mongodb=localhost:27017 --mongodb=localhost:27018
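
A repeatable option like this is usually collected with the "append" action. The sketch below demonstrates the idea with the standard `argparse`. Wiring it into pytest (for example through a `pytest_addoption` hook in conftest.py) is left as an assumption and is not shown.

```python
import argparse

parser = argparse.ArgumentParser()
# Each occurrence of --mongodb adds one more URI to the list.
parser.add_argument("--mongodb", action="append", default=[])

args = parser.parse_args(
    ["--mongodb=localhost:27017", "--mongodb=localhost:27018"]
)
```

The test suite could then loop over `args.mongodb` and run the comparisons once per instance.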

Implement MongoDB aggregate

NotImplementedError: 'MontyCollection.aggregate' is NOT implemented !
It would be awesome to have aggregate. This project is very cool nonetheless!

Count documents

How do I properly count documents?

x = col.find()
x.count()  # deprecated, and on Python 3.5 raises "The JSON object must be str, not 'bytes'"
x = col.count()  # on Python 3.5 raises "The JSON object must be str, not 'bytes'"
x = count_documents()  # says the argument "filter" is missing, and then "item". Why is it required?

Multiple webserver workers have different db state for FlatFile

First of all, great lib for small prods and dev without installing and handling mongod or old-fashioned sqlite3!

I'm having a problem using supervisor with multiple workers, which basically runs several instances of the same Python script that connect to the database, writing to and reading from it.
The problem is that every worker has a different version of the database. My config:

from montydb import MontyClient
client = MontyClient("data")
client.cache_modified = 1
db = client.db

With cache_modified = 0 the problem is the same.
I thought montydb stores the database in memory and treats FlatFile as a cache, so setting cache_modified to 1 would help, but it doesn't. Maybe the problem has a different cause?

Confusing behavior of set_storage vs. MontyClient with memory

To create a memory-mapped DB per the docs I can do

from montydb import set_storage, MontyClient
MontyClient(":memory:")

which results in nothing being written to disk. In the above, ":memory:" is a special repository name.

However if I first use set_storage with ":memory:" as the repository name

set_storage(":memory:")
MontyClient(":memory:")

Then a directory called :memory: is written to the current directory with a monty.storage.cfg file in it, which actually has nothing to do with the created MontyClient.

Similarly, if I pass "memory" as the repository name to MontyClient, a directory called memory is created and the actual underlying storage type is flatfile

>>> mc=MontyClient("memory")
>>> mc
MontyClient(repository='memory', document_class=builtins.dict, storage_engine=MontyStorage(engine: 'FlatFileStorage'))

Finally, if I specify storage="memory" (note the absence of :), with or without a repository,

set_storage(":memory:", storage="memory")
MontyClient(":memory:")

or

set_storage(storage="memory")
MontyClient(":memory:")

then all works as expected and nothing is written to disk.

Strictly speaking none of these are "bugs", but personally I think this behavior is quite confusing and has a risk of unintended behavior. I suggest a few ways to address:

  1. Recognize ":memory:" as a special repo name in set_storage and prevent it from being created as a directory
  2. Warn the user if they pass "memory" as a repository name in MontyClient.__init__(). It is conceivable that someone might want their database to live in a folder called "memory", but I think it's also feasible that a user might pass that argument thinking they were getting a memory database
  3. Expose all the set_storage kwargs in MontyClient.__init__() so that one could say, for example MontyClient(storage='sqlite', use_bson=True) rather than having to understand the nuances of set_storage and invoke it separately.

I'm willing to work on implementing the above, but wanted to hear your feedback first to make sure I'm not misunderstanding some of the intended behavior.

update_one and update_many creating extra records for flatfile

Performing updates with the flat file is giving me duplicate documents. I'm running Python 3.8.9 and have tried it with both the pip install montydb install and pip install montydb[bson] install. This problem is not occurring when in sqlite mode.

Subsequent program runs after inserting a record are causing a duplicate document to be added with the same _id.

It looks like the OrderedDict cache update at
https://github.com/davidlatwe/montydb/blob/master/montydb/storage/flatfile.py#L79 is where the extra document is being added. Debugging the process shows that Python is adding a duplicate document because the keys in the ordered dict are actually different. One is an ObjectId object and the other is the binary serialized representation of that id. Here is a screenshot of the debugging output: [screenshot: debug output]

Here is the source code to reproduce. Note that you will have to run it twice because this only occurs on subsequent runs.

from montydb import MontyClient, set_storage

set_storage("./db/repo",  cache_modified=0)
client =  MontyClient("./db/repo")
coll = client.petsDB.pets

if  len([x for x in coll.find({"pet":  "cat"})])  ==  0:
    coll.insert_one({"pet":"cat",  "domestic?":True, "climate":  ["polar",  "equatorial",  "mountain"]})
    
coll.update_one({"pet":  "cat"},  {"$push":  {"climate":  "continental"}})
# This should only ever print 1 on subsequent runs.
print(len([x for x in coll.find({"pet":  "cat"})]))
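
The key mismatch described in this issue can be reproduced without montydb. In the sketch below, `FakeId` is a hypothetical stand-in for ObjectId, and the `normalize` helper is only one possible fix, not the change that was actually made in the project.

```python
from collections import OrderedDict


class FakeId:
    """Stand-in for ObjectId (hypothetical, for illustration only)."""

    def __init__(self, hex_str):
        self.hex_str = hex_str

    def __eq__(self, other):
        return isinstance(other, FakeId) and other.hex_str == self.hex_str

    def __hash__(self):
        return hash(self.hex_str)

    def encode(self):
        return self.hex_str.encode()


cache = OrderedDict()
cache[b"5ad34e53"] = {"pet": "cat"}         # key loaded from disk as bytes
cache[FakeId("5ad34e53")] = {"pet": "cat"}  # same id written as an object
duplicated = len(cache)                     # two entries for one document


def normalize(key):
    """Always use the serialized form as the dictionary key."""
    return key.encode() if isinstance(key, FakeId) else key


fixed = OrderedDict()
for key, doc in cache.items():
    fixed[normalize(key)] = doc
```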

Find by ObjectId

Could not find by ObjectId. Code:

from bson.objectid import ObjectId

x = collection.find()
i = list(x)[0]['_id']

y = collection.find({'_id': ObjectId(i)})
print(list(y))

I see that the database in mdb format has "_id": {"$oid": "id"}, and I tried to find the id as a string with {"_id.$oid": "string id"}, but that did not work either. Any suggestions? Thanks!

Refactor setup.py

  • Add an extra option, pip install montydb[bson], to install pymongo and bson as a dependency.

  • Change to automatically use bson if found.

Inactive project?

Hello guys,
I find your project very interesting but there has not been any development for quite a while. Is this project not under development anymore?
Kind regards

ConfigurationError: montydb has been config to use BSON and cannot be changed in current session.

I am trying to use two instances of MontyClient simultaneously - one in memory and another on disk. Is this possible? Right now I am receiving the following error, which I am having trouble understanding.

ConfigurationError: montydb has been config to use BSON and cannot be changed in current session.

In the README I see

The configuration process only required on repository creation or modification. And, one repository (the parent level of databases) can only assign one storage engine.

By "repository creation" do you mean "on instantiation of MontyClient"? Is there any way to set the storage on a per-instance basis so that I can move data between different repositories, and memory?

Thanks for any assistance!

`bytes` type unsupported

Issue

Cannot store bytes as values.

Env

Windows 10
Python 3.8.1
MontyDB 2.3.6

Actual error

>>> from montydb import MontyClient
>>> col = MontyClient(":memory:").db.test
>>> col.insert_one({'data': b'some bytes'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\user\.virtualenvs\name\lib\site-packages\montydb\collection.py", line 139, in insert_one
    result = self._storage.write_one(self, document)
  File "C:\Users\user\.virtualenvs\name\lib\site-packages\montydb\storage\__init__.py", line 45, in delegate
    return getattr(delegator, attr)(*args, **kwargs)
  File "C:\Users\user\.virtualenvs\name\lib\site-packages\montydb\storage\memory.py", line 120, in write_one
    self._col[b_id] = self._encode_doc(doc, check_keys)
  File "C:\Users\user\.virtualenvs\name\lib\site-packages\montydb\storage\__init__.py", line 183, in _encode_doc
    return bson.document_encode(
  File "C:\Users\user\.virtualenvs\name\lib\site-packages\montydb\types\_bson.py", line 236, in document_encode
    for s in _encoder.iterencode(doc):
  File "C:\Program Files\Python38\lib\json\encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "C:\Program Files\Python38\lib\json\encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "C:\Program Files\Python38\lib\json\encoder.py", line 438, in _iterencode
    o = _default(o)
  File "C:\Users\user\.virtualenvs\name\lib\site-packages\montydb\types\_bson.py", line 222, in default
    return NoBSON.JSONEncoder.default(self, obj)
  File "C:\Program Files\Python38\lib\json\encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type bytes is not JSON serializable

Same with PyMongo

>>> from pymongo import MongoClient
>>> col = MongoClient('127.0.0.1').tests.test1
>>> col.insert_one({'data': b'some bytes'})
<pymongo.results.InsertOneResult object at 0x000002BBB52BC7C0>
>>> next(col.find())
{'_id': ObjectId('60bdaa528ff3727b58f514f7'), 'data': b'some bytes'}
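
The traceback shows the failure comes from the standard JSON encoder, which has no rule for bytes. One way such values can be made serializable is a custom encoder that wraps them in an Extended JSON style `$binary` object. This is only a sketch with hypothetical names (`BytesEncoder`, `decode_hook`), not the fix adopted by montydb.

```python
import base64
import json


class BytesEncoder(json.JSONEncoder):
    """Encode bytes as {"$binary": {"base64": ..., "subType": "00"}}."""

    def default(self, obj):
        if isinstance(obj, bytes):
            payload = base64.b64encode(obj).decode("ascii")
            return {"$binary": {"base64": payload, "subType": "00"}}
        return super().default(obj)


def decode_hook(dct):
    """Turn a $binary wrapper back into bytes when loading."""
    binary = dct.get("$binary")
    if isinstance(binary, dict) and "base64" in binary:
        return base64.b64decode(binary["base64"])
    return dct


text = json.dumps({"data": b"some bytes"}, cls=BytesEncoder)
restored = json.loads(text, object_hook=decode_hook)
```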

FieldWalker2

Improvement

Refactoring montydb.engine.core.FieldWalker using node tree implementation.

The code is much cleaner, especially the logic of null querying, and getting values is about 40% faster than in the current version!

WIP Gist

FieldWalker2

Next

Implement node tree setting for projection and update.

$elemMatch in $elemMatch find nothing

Hi @davidlatwe,
Just discovered that monty (montydb-2.3.10) can't handle a query like:

x = col.find({"mapping": {'$elemMatch': {'$elemMatch': {'$in': ['https://accounts.google.com/o/oauth2/aut']}}}})

for an array-in-array element:

"mapping": [ ["https://accounts.google.com/o/oauth2/auth", "client_id", "redirect_uri", "scope", "response_type"] ]

It just doesn't find anything, unlike pymongo.
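
As a reference for the expected behavior, a nested `$elemMatch` with `$in` should match when at least one inner array contains at least one of the listed values. Here is a minimal pure-Python sketch of that rule. The helper `nested_elem_match_in` is hypothetical and is not part of montydb.

```python
def nested_elem_match_in(outer, candidates):
    """True if any inner array of outer holds at least one candidate."""
    return any(
        isinstance(inner, list) and any(item in candidates for item in inner)
        for inner in outer
    )


mapping = [
    ["https://accounts.google.com/o/oauth2/auth", "client_id", "redirect_uri"]
]
hit = nested_elem_match_in(
    mapping, ["https://accounts.google.com/o/oauth2/auth"]
)
miss = nested_elem_match_in(mapping, ["not-present"])
```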

Implement update operators

Field Update Operators
  • $inc
  • $min
  • $max
  • $mul
  • $rename
  • $set
  • $setOnInsert
  • $unset
  • $currentDate
Array Update Operators
  • $
  • $[]
  • $[<identifier>]
  • $addToSet
  • $pop
  • $pull
  • $push
  • $pullAll
Modifiers
  • $each
  • $position
  • $slice
  • $sort
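
To illustrate what a few of the operators listed above do, here is a small sketch that applies `$set`, `$inc`, and `$push` (with the `$each` modifier) to a flat dictionary. The function `apply_update` is hypothetical; it ignores dotted paths, validation, and every other operator, so it is not a substitute for montydb's update engine.

```python
def apply_update(doc, update):
    """Apply a small subset of update operators to a flat document."""
    for field, value in update.get("$set", {}).items():
        doc[field] = value
    for field, amount in update.get("$inc", {}).items():
        # A missing field is treated as 0 before incrementing.
        doc[field] = doc.get(field, 0) + amount
    for field, value in update.get("$push", {}).items():
        # $each appends several items; a bare value appends just one.
        if isinstance(value, dict) and "$each" in value:
            items = value["$each"]
        else:
            items = [value]
        doc.setdefault(field, []).extend(items)
    return doc


doc = apply_update(
    {"qty": 2, "tags": ["a"]},
    {
        "$inc": {"qty": 3},
        "$set": {"stock": "A"},
        "$push": {"tags": {"$each": ["b", "c"]}},
    },
)
```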

Indexes support

I hope to see create_index and drop_index in MontyDB (ideally, create_indexes/drop_indexes) and index support for queries.

Python 3 type hints

Python type hints would be an interesting direction once Python 2 support is dropped...

$ mypy --ignore-missing-imports .

montydb/montydb/types/_bson.py:279: error: Need type annotation for 'custom_json_hooks' (hint: "custom_json_hooks: Dict[<type>, <type>] = ...")
montydb/montydb/storage/memory.py:14: error: Need type annotation for '_repo'
montydb/montydb/storage/memory.py:15: error: Need type annotation for '_config'
montydb/montydb/storage/memory.py:94: error: Cannot assign to a method
montydb/montydb/storage/memory.py:94: error: Incompatible types in assignment (expression has type "Type[MemoryDatabase]", variable has type "Callable[[AbstractStorage], Any]")
montydb/montydb/storage/memory.py:146: error: Cannot assign to a method
montydb/montydb/storage/memory.py:146: error: Incompatible types in assignment (expression has type "Type[MemoryCollection]", variable has type "Callable[[AbstractDatabase], Any]")
montydb/montydb/storage/memory.py:167: error: Cannot assign to a method
montydb/montydb/storage/memory.py:167: error: Incompatible types in assignment (expression has type "Type[MemoryCursor]", variable has type "Callable[[AbstractCollection], Any]")
montydb/montydb/configure.py:24: error: Need type annotation for '_session' (hint: "_session: Dict[<type>, <type>] = ...")
montydb/montydb/storage/sqlite.py:313: error: Cannot assign to a method
montydb/montydb/storage/sqlite.py:313: error: Incompatible types in assignment (expression has type "Type[SQLiteDatabase]", variable has type "Callable[[AbstractStorage], Any]")
montydb/montydb/storage/sqlite.py:334: error: Argument 1 to "_ensure_table" has incompatible type "Callable[[SQLiteCollection, Any, Any], Any]"; expected "SQLiteCollection"
montydb/montydb/storage/sqlite.py:351: error: Argument 1 to "_ensure_table" has incompatible type "Callable[[SQLiteCollection, Any, Any, Any], Any]"; expected "SQLiteCollection"
montydb/montydb/storage/sqlite.py:414: error: Cannot assign to a method
montydb/montydb/storage/sqlite.py:414: error: Incompatible types in assignment (expression has type "Type[SQLiteCollection]", variable has type "Callable[[AbstractDatabase], Any]")
montydb/montydb/storage/sqlite.py:435: error: Cannot assign to a method
montydb/montydb/storage/sqlite.py:435: error: Incompatible types in assignment (expression has type "Type[SQLiteCursor]", variable has type "Callable[[AbstractCollection], Any]")
montydb/montydb/storage/lightning.py:171: error: Cannot assign to a method
montydb/montydb/storage/lightning.py:171: error: Incompatible types in assignment (expression has type "Type[LMDBDatabase]", variable has type "Callable[[AbstractStorage], Any]")
montydb/montydb/storage/lightning.py:191: error: Argument 1 to "_ensure_table" has incompatible type "Callable[[LMDBCollection, Any, Any, Any], Any]"; expected "LMDBCollection"
montydb/montydb/storage/lightning.py:201: error: Argument 1 to "_ensure_table" has incompatible type "Callable[[LMDBCollection, Any, Any, Any, Any], Any]"; expected "LMDBCollection"
montydb/montydb/storage/lightning.py:241: error: Cannot assign to a method
montydb/montydb/storage/lightning.py:241: error: Incompatible types in assignment (expression has type "Type[LMDBCollection]", variable has type "Callable[[AbstractDatabase], Any]")
montydb/montydb/storage/lightning.py:261: error: Cannot assign to a method
montydb/montydb/storage/lightning.py:261: error: Incompatible types in assignment (expression has type "Type[LMDBCursor]", variable has type "Callable[[AbstractCollection], Any]")
montydb/montydb/storage/flatfile.py:193: error: Cannot assign to a method
montydb/montydb/storage/flatfile.py:193: error: Incompatible types in assignment (expression has type "Type[FlatFileDatabase]", variable has type "Callable[[AbstractStorage], Any]")
montydb/montydb/storage/flatfile.py:220: error: Argument 1 to "_ensure_table" has incompatible type "Callable[[FlatFileCollection, Any, Any], Any]"; expected "FlatFileCollection"
montydb/montydb/storage/flatfile.py:233: error: Argument 1 to "_ensure_table" has incompatible type "Callable[[FlatFileCollection, Any, Any, Any], Any]"; expected "FlatFileCollection"
montydb/montydb/storage/flatfile.py:275: error: Cannot assign to a method
montydb/montydb/storage/flatfile.py:275: error: Incompatible types in assignment (expression has type "Type[FlatFileCollection]", variable has type "Callable[[AbstractDatabase], Any]")
montydb/montydb/storage/flatfile.py:298: error: Cannot assign to a method
montydb/montydb/storage/flatfile.py:298: error: Incompatible types in assignment (expression has type "Type[FlatFileCursor]", variable has type "Callable[[AbstractCollection], Any]")
Found 34 errors in 6 files (checked 100 source files)

MontyClient instantiation fails

Hey @davidlatwe, thanks for the great library!

There may be something wrong with MontyClient() called without any argument, as it fails when a .monty.storage file already exists.

Take a file like this (issue.py):

import montydb

montydb.MontyClient()

Run it twice and you should get this:

$ python issue.py
$ # no error yet
$ python issue.py
Traceback (most recent call last):
  File "issue.py", line 3, in <module>
    montydb.MontyClient()
  File ".venv/lib/python3.9/site-packages/montydb/client.py", line 50, in __init__
    self.__options = ClientOptions(options, wconcern)
  File ".venv/lib/python3.9/site-packages/montydb/base.py", line 205, in __init__
    self.__codec_options = bson.parse_codec_options(options)
TypeError: 'NoneType' object is not callable

If this is "normal", maybe an explicit error would make sense here. :)

For context, Mongo-Thingy is an ODM that supports Monty as a backend. One of our users tried to use Monty and got stuck on this issue: Refty/mongo-thingy#48

Implement query operators

  • comparison

    • $eq
    • $gt
    • $gte
    • $lt
    • $lte
    • $ne
    • $in
    • $nin
  • logical

    • $and
    • $or
    • $nor
    • $not
  • array

    • $all
    • $elemMatch
    • $size
  • element

    • $type
    • $exists
  • evaluation

    • $mod
    • $regex
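
To make the comparison and logical operators in the list above concrete, here is a simplified pure-Python sketch of how such a filter can be evaluated. It is not montydb's implementation: `matches` and `field_matches` are hypothetical helpers, only scalar top-level fields are handled, and embedded-document literals, arrays, `$not`, and MongoDB's cross-type ordering are left out.

```python
import operator

# Simplified comparison operators: scalar, top-level fields only.
COMPARISONS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, choices: value in choices,
    "$nin": lambda value, choices: value not in choices,
}


def field_matches(doc, field, condition):
    """Check one field against a literal or an {operator: operand} mapping."""
    if not isinstance(condition, dict):
        condition = {"$eq": condition}  # {"stock": "A"} is shorthand for $eq
    if field not in doc:
        # Simplification: a missing field only satisfies negative operators.
        return all(op in ("$ne", "$nin") for op in condition)
    return all(
        COMPARISONS[op](doc[field], operand) for op, operand in condition.items()
    )


def matches(doc, query):
    """Return True if doc satisfies every top-level clause of query."""
    for key, condition in query.items():
        if key == "$and":
            ok = all(matches(doc, sub) for sub in condition)
        elif key == "$or":
            ok = any(matches(doc, sub) for sub in condition)
        elif key == "$nor":
            ok = not any(matches(doc, sub) for sub in condition)
        else:
            ok = field_matches(doc, key, condition)
        if not ok:
            return False
    return True


# Same documents and filter as the README example.
docs = [{"stock": "A", "qty": 6}, {"stock": "A", "qty": 2}]
query = {"stock": "A", "qty": {"$gt": 4}}
selected = [d for d in docs if matches(d, query)]
```

Here `selected` contains only the document with `qty` 6, which is the document the README's `find` example returns for the same filter.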

Permission denied with empty collections on LMDB

Hi,
I've got a weird lmdb.ReadonlyError: mdb_dbi_open: Permission denied when using a cursor over an empty collection.

How to reproduce:

from montydb import set_storage, MontyClient

set_storage('some_file', storage='lightning')
client = MontyClient('some_file')

db = client['my_db']
db.create_collection('my_col')

db['my_col'].find().next()

Even a count raises the same error so I'm still trying to figure out a workaround.

How to use `montydb` with an existing SQLite database?

I have an existing SQLite database on disk (dispatcher_db.sqlite) and was hoping that I could use montydb to make it possible for me to interact with it in Python as if it were a Mongo database.

It wasn't clear from the README how one might (or might not!) be able to do this. Would you mind providing some insight? Thank you!!

asyncio support

Thanks for this package. I'm just starting to use it in my OS project and would like to see async supported.

I know it's ultimately accessing a local file; however, this has been done in a similar fashion for SQLite with the aiosqlite driver.
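
Until something like that exists, a caller can offload blocking calls to a worker thread, which is the same general idea aiosqlite relies on. The sketch below uses only the standard library; `blocking_find` is a hypothetical stand-in for a blocking, file-backed query, and whether montydb's storage engines are safe to call from multiple threads is an assumption that has not been verified here.

```python
import asyncio
import time
from functools import partial


def blocking_find(collection, query):
    """Hypothetical stand-in for a blocking, file-backed query."""
    time.sleep(0.01)  # simulate disk I/O
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in query.items())
    ]


async def find_async(collection, query):
    # Hand the blocking call to the default thread pool so the event loop
    # stays free to run other tasks in the meantime.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(
        None, partial(blocking_find, collection, query)
    )


async def main():
    docs = [{"stock": "A", "qty": 6}, {"stock": "B", "qty": 2}]
    # Both queries are submitted at once and run on worker threads.
    return await asyncio.gather(
        find_async(docs, {"stock": "A"}),
        find_async(docs, {"stock": "B"}),
    )


results = asyncio.run(main())
```

`asyncio.gather` returns the results in the order the coroutines were passed in, so `results[0]` holds the match for stock "A" and `results[1]` the match for stock "B".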

Dropping Python 3.4, 3.5 and adding 3.7 to CI

Dropping Python 3.4, 3.5 tests

In some test cases, for example:

  • test/test_engine/test_find.py

    • test_find_2
    • test_find_3
  • tests/test_engine/test_update/test_update.py

    • test_update_positional_filtered_near_conflict
    • test_update_positional_filtered_has_conflict_1
  • tests/test_engine/test_update/test_update_pull.py

    • test_update_pull_6
    • test_update_pull_7

These tests often failed at random because of dict key order at runtime. Unless the test documents are changed to OrderedDict, there is no way to ensure that the key order fed into monty and mongo is the same, and a different key order may produce different output.

Since this is not a problem with montydb's functionality, I'm dropping the Python 3.4 and 3.5 tests for good.
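
Key order matters here because MongoDB treats field order as significant when it compares embedded documents, while dict insertion order only became a language guarantee in Python 3.7. A small standard-library illustration (the sample documents are made up for this sketch):

```python
from collections import OrderedDict

# Plain dicts compare equal regardless of key order.
a = {"x": 1, "y": 2}
b = {"y": 2, "x": 1}
plain_equal = a == b  # True

# OrderedDict comparison is order-sensitive, which is closer to how
# MongoDB compares embedded documents.
oa = OrderedDict([("x", 1), ("y", 2)])
ob = OrderedDict([("y", 2), ("x", 1)])
ordered_equal = oa == ob  # False

# An OrderedDict always iterates in insertion order, on any Python version.
keys_in_order = list(oa)
```

Building the test documents as `OrderedDict` instances pins the key order on every interpreter version, which is the alternative to dropping the older versions mentioned above.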

Involving Python 3.7

Well, it's 2019.

Everything works well but super slow

I just got a chance to dump 26.5K documents from MongoDB into montydb, save them with flat-file storage, and push the result into a production environment.

And the query is super slow, haha.
Should be related to #12.

Implement basic CRUD methods

  • db.collection.insert_one()
  • db.collection.insert_many()
  • db.collection.find()
  • db.collection.update_one()
  • db.collection.update_many()
  • db.collection.replace_one()
  • db.collection.delete_one()
  • db.collection.delete_many()

What's working & what's not

basic CRUD methods

  • insert_one
  • insert_many
  • find
  • update_one
  • update_many
  • replace_one
  • delete_one
  • delete_many
  • bulkWrite

Operators

Query Ops

comparison
  • $eq
  • $gt
  • $gte
  • $lt
  • $lte
  • $ne
  • $in
  • $nin
logical
  • $and
  • $or
  • $nor
  • $not
array
  • $all
  • $elemMatch
  • $size
element
  • $type
  • $exists
evaluation
  • $mod
  • $regex

Projection Ops

  • $
  • $elemMatch
  • $slice

Update Ops

Field Update Operators
  • $inc
  • $min
  • $max
  • $mul
  • $rename
  • $set
  • $setOnInsert
  • $unset
  • $currentDate
Array Update Operators
  • $
  • $[]
  • $[<identifier>]
  • $addToSet
  • $pop
  • $pull
  • $push
  • $pullAll
Modifiers
  • $each
  • $position
  • $slice
  • $sort

Aggregation Ops

I haven't needed this yet, but it is planned.
