dyntastic


A DynamoDB library on top of Pydantic and boto3.

Installation

pip3 install dyntastic

If the Pydantic binaries are too large for you (they can exceed 90MB), use the following:

pip3 uninstall pydantic  # if pydantic is already installed
pip3 install dyntastic --no-binary pydantic

Usage

The core functionality of this library is provided by the Dyntastic class.

Dyntastic is a subclass of Pydantic's BaseModel, so can be used in all the same places a Pydantic model can be used (FastAPI, etc).

import uuid
from datetime import datetime
from typing import Optional

from dyntastic import Dyntastic
from pydantic import Field

class Product(Dyntastic):
    __table_name__ = "products"
    __hash_key__ = "product_id"

    product_id: str = Field(default_factory=lambda: str(uuid.uuid4()))
    name: str
    description: Optional[str] = None
    price: float
    tax: Optional[float] = None


class Event(Dyntastic):
    __table_name__ = "events"
    __hash_key__ = "event_id"
    __range_key__ = "timestamp"

    event_id: str
    timestamp: datetime
    data: dict

# All your favorite pydantic functionality still works:

p = Product(name="bread", price=3.49)
# Product(product_id='d2e91c30-e701-422f-b71b-465b02749f18', name='bread', description=None, price=3.49, tax=None)

p.model_dump()
# {'product_id': 'd2e91c30-e701-422f-b71b-465b02749f18', 'name': 'bread', 'description': None, 'price': 3.49, 'tax': None}

p.model_dump_json()
# '{"product_id": "d2e91c30-e701-422f-b71b-465b02749f18", "name": "bread", "description": null, "price": 3.49, "tax": null}'

Inserting into DynamoDB

Using the Product example from above, simply:

product = Product(name="bread", description="Sourdough Bread", price=3.99)
product.product_id
# d2e91c30-e701-422f-b71b-465b02749f18

# Nothing is written to DynamoDB until .save() is called:
product.save()

Getting Items from DynamoDB

Product.get("d2e91c30-e701-422f-b71b-465b02749f18")
# Product(product_id='d2e91c30-e701-422f-b71b-465b02749f18', name='bread', description="Sourdough Bread", price=3.99, tax=None)

The range key must be provided if one is defined:

Event.get("d2e91c30-e701-422f-b71b-465b02749f18", "2022-02-12T18:27:55.837Z")

Consistent reads are supported:

Event.get(..., consistent_read=True)

A DoesNotExist error is raised by get if a key is not found:

Product.get("nonexistent")
# Traceback (most recent call last):
#   ...
# dyntastic.exceptions.DoesNotExist

Use safe_get instead to return None if the key is not found:

Product.safe_get("nonexistent")
# None

Querying Items in DynamoDB

# A is shorthand for the Attr class (i.e. attribute)
from dyntastic import A

# auto paging iterable
for event in Event.query("some_event_id"):
    print(event)


Event.query("some_event_id", per_page=10)
Event.query("some_event_id")
Event.query("some_event_id", range_key_condition=A.timestamp < datetime(2022, 2, 13))
Event.query("some_event_id", filter_condition=A.some_field == "foo")

# query an index
Event.query(A.my_other_field == 12345, index="my_other_field-index")

# note: must provide a condition expression rather than just the value
Event.query(12345, index="my_other_field-index")  # errors!

# query an index with an optional filter expression
filter_expression = None
if filter_value:
    filter_expression = A('filter_field').eq(filter_value)
Event.query(
    A.my_other_field == 12345,
    index="my_other_field-index",
    filter_expression=filter_expression
)

# consistent read
Event.query("some_event_id", consistent_read=True)

# scan_index_forward controls the order of traversal by sort key value;
# the default (True) is ascending, False returns results in descending order
Event.query("some_event_id", range_key_condition=A.version.begins_with("2023"), scan_index_forward=False)

DynamoDB Indexes using a KEYS_ONLY or INCLUDE projection are supported:

for event in Event.query("2023-09-22", index="date-keys-only-index"):
    event.id
    # "..."
    event.timestamp
    # datetime(...)

    event.data
    # ValueError: Dyntastic instance was loaded from a KEYS_ONLY or INCLUDE index.
    #             Call refresh() to load the full item, or pass load_full_item=True
    #             to query() or scan()

# automatically fetch the full items
for event in Event.query("2023-09-22", index="date-keys-only-index", load_full_item=True):
    event.data
    # {...}

If you need to manually handle pagination, use query_page:

page = Event.query_page(...)
page.items
# [...]
page.has_more
# True
page.last_evaluated_key
# {"event_id": "some_event_id", "timestamp": "..."}

Event.query_page(..., last_evaluated_key=page.last_evaluated_key)
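The shape of a manual pagination loop can be sketched in plain Python. Note that `fake_query_page` below is a hypothetical stand-in for `Event.query_page` (paging through five stub items, two per page) so the example runs without a DynamoDB connection; the loop structure is what carries over.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Page:
    items: list
    has_more: bool
    last_evaluated_key: Optional[dict]

_DATA = list(range(5))  # stub data standing in for stored items

def fake_query_page(last_evaluated_key=None):
    # Stand-in for Event.query_page: returns one page of up to 2 items
    start = 0 if last_evaluated_key is None else last_evaluated_key["offset"]
    chunk = _DATA[start:start + 2]
    has_more = start + 2 < len(_DATA)
    key = {"offset": start + 2} if has_more else None
    return Page(items=chunk, has_more=has_more, last_evaluated_key=key)

# The pagination loop: keep passing last_evaluated_key until has_more is False
items = []
page = fake_query_page()
items.extend(page.items)
while page.has_more:
    page = fake_query_page(last_evaluated_key=page.last_evaluated_key)
    items.extend(page.items)

print(items)  # [0, 1, 2, 3, 4]
```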

Scanning Items in DynamoDB

Scanning is done identically to querying, except there are no hash key or range key conditions.

# auto paging iterable
for event in Event.scan():
    pass

Event.scan((A.my_field < 5) & (A.some_other_field.is_in(["a", "b", "c"])))
Event.scan(..., consistent_read=True)

Updating Items in DynamoDB

Examples:

my_item.update(A.my_field.set("new_value"))
my_item.update(A.my_field.set(A.another_field))
my_item.update(A.my_int.set(A.another_int - 10))
my_item.update(A.my_int.set(A.my_int + 1))
my_item.update(A.my_list.append("new_element"))
my_item.update(A.some_attribute.set_default("value_if_not_already_present"))

my_item.update(A.my_field.remove())
my_item.update(A.my_list.remove(2))  # remove by index

my_item.update(A.my_string_set.add("new_element"))
my_item.update(A.my_string_set.add({"new_1", "new_2"}))
my_item.update(A.my_string_set.delete("element_to_remove"))
my_item.update(A.my_string_set.delete({"remove_1", "remove_2"}))

# Multiple updates can be performed at once
my_item.update(
    A.my_field.set("new_value"),
    A.my_int.set(A.my_int + 1),
    ...
)

The data is automatically refreshed after the update request. To disable this behavior, pass refresh=False:

my_item.update(..., refresh=False)

Supports conditions:

my_item.update(..., condition=A.my_field == "something")

By default, if the condition is not met, the update call will be a noop. To instead error in this situation, pass require_condition=True:

my_item.update(..., require_condition=True)

Batch Reads

Multiple items can be read from a table at the same time using the batch_get function.

Note that DynamoDB limits the number of items that can be read at one time to 100 items or 16MB, whichever comes first.

Note that if any of the provided keys are missing from dynamo, they will simply be excluded in the result set.

MyModel.batch_get(["hash_key_1", "hash_key_2", "hash_key_3"])
# => [MyModel(...), MyModel(...)]

For models with a range key defined:

MyModel.batch_get([("hash_key_1", "range_key_1"), ("hash_key_2", "range_key_2")])
# => [MyModel(...), MyModel(...)]
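Because of the 100-item limit, large key lists must be split into chunks before calling BatchGetItem; dyntastic handles this internally, and the idea can be sketched in plain Python (`chunk_keys` is an illustrative helper, not part of the dyntastic API):

```python
def chunk_keys(keys, chunk_size=100):
    """Split a list of keys into chunks no larger than chunk_size."""
    return [keys[i:i + chunk_size] for i in range(0, len(keys), chunk_size)]

chunks = chunk_keys([f"hash_key_{i}" for i in range(250)])
print([len(c) for c in chunks])  # [100, 100, 50]
```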

Batch Writes

Save and delete operations may also be performed in batches.

Note that DynamoDB limits the number of items that can be written in a single batch to 25 items or 16MB, whichever comes first. Dyntastic will automatically batch in chunks of 25, or fewer if desired.

with MyModel.batch_writer():
    MyModel(id="0").delete()
    MyModel(id="1").save()
    MyModel(id="2").save()

# all operations are performed once the `with` context is exited

To configure a smaller batch size, for example when each item is relatively large:

with MyModel.batch_writer(batch_size=2):
    MyModel(id="1").save()
    MyModel(id="2").save()
    # the previous two models are written immediately, since the batch size was reached
    MyModel(id="3").save()

# The final operation is performed here now that the `with` context has exited
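The buffering behavior of a batch writer can be illustrated with a minimal context manager. This is a sketch of the general pattern (buffer operations, flush when `batch_size` is reached, flush the remainder on exit), not dyntastic's actual implementation:

```python
class BufferedWriter:
    """Illustrative sketch of batch-writer buffering, not the dyntastic internals."""

    def __init__(self, batch_size=25):
        self.batch_size = batch_size
        self.buffer = []
        self.flushed = []  # batches that have been "written"

    def submit(self, operation):
        self.buffer.append(operation)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.flushed.append(list(self.buffer))
            self.buffer.clear()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        # any remaining operations are flushed when the `with` block exits
        self.flush()

with BufferedWriter(batch_size=2) as writer:
    writer.submit("save-1")
    writer.submit("save-2")  # batch size reached, flushed immediately
    writer.submit("save-3")
# the final operation is flushed on exit

print(writer.flushed)  # [['save-1', 'save-2'], ['save-3']]
```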

Transactions

Dyntastic supports DynamoDB transactions. Transactions are performed using the transaction context manager and can be used to perform operations across one or multiple tables that reside in the same region.

from dyntastic import transaction

with transaction():
    item1 = SomeTable(...)
    item2 = AnotherTable.get(...)
    item1.save()
    item2.update(A.something.set("..."))

Note that DynamoDB limits the number of items that can be written in a single transaction to 100 items or 4MB, whichever comes first. Dyntastic can automatically flush the transaction in chunks of 100 (or fewer if desired) by passing auto_commit=True.

For example, to commit every 50 items:

with transaction(auto_commit=True, commit_every=50):
    item1 = SomeTable(...)
    item2 = AnotherTable.get(...)
    item1.save()
    item2.update(A.something.set("..."))

Create a DynamoDB Table

This functionality is currently meant only for use in unit tests as it does not support configuring throughput.

To create a table with no secondary indexes:

MyModel.create_table()

# Do not wait until the table creation is complete (subsequent operations
# may error if they are performed before the table creation is finished)
MyModel.create_table(wait=False)

To define global secondary indexes (creating local secondary indexes is not currently supported):

# All of the following are equivalent
index1 = "my_field"
index1 = Index("my_field")
index1 = Index("my_field", index_name="my_field-index")

# Range keys are also supported
index2 = Index("my_field", "my_second_field")
index2 = Index("my_field", "my_second_field", index_name="my_field_my_second_field-index")

MyModel.create_table(index1, index2)

Dynamic table names

In some circumstances you may want the table name to be defined dynamically. This can be done by setting the __table_name__ attribute to a Callable that returns the table name from the source of your choice. In the example below, we are using an environment variable.

import os
from dyntastic import Dyntastic

os.environ["MY_TABLE_NAME"] = "my_table"

class Product(Dyntastic):
    __table_name__ = lambda: os.getenv("MY_TABLE_NAME")
    __hash_key__ = "product_id"

    product_id: str

Custom dynamodb endpoint or region for local development

To explicitly define an AWS region or DynamoDB endpoint url (for using a local dynamodb docker instance, for example), set __table_region__ or __table_host__. These attributes can be a string or a Callable that returns a string.

from dyntastic import Dyntastic

class Product(Dyntastic):
    __table_name__ = "products"
    __table_region__ = "us-east-1"
    __table_host__ = "http://localhost:8000"
    __hash_key__ = "product_id"

    product_id: str

You can also set the environment variables DYNTASTIC_HOST and/or DYNTASTIC_REGION to control the behavior of the underlying boto3 client and resource objects.

Note: if both the environment variables and the class attributes are set, the class attributes will take precedence.

import os
from dyntastic import Dyntastic

os.environ["DYNTASTIC_HOST"] = "http://localhost:8000"
os.environ["DYNTASTIC_REGION"] = "us-east-1"

class Product(Dyntastic):
    __table_name__ = "products"
    __hash_key__ = "product_id"

    product_id: str
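The documented precedence (class attributes over environment variables, with callables resolved) can be illustrated with a small helper. `resolve_setting` below is a hypothetical function written for this example, not part of the dyntastic API:

```python
import os

def resolve_setting(class_attr, env_var):
    # Class attribute (possibly a callable) takes precedence over the env variable
    if class_attr is not None:
        return class_attr() if callable(class_attr) else class_attr
    return os.environ.get(env_var)

os.environ["DYNTASTIC_HOST"] = "http://localhost:8000"

print(resolve_setting(None, "DYNTASTIC_HOST"))
# http://localhost:8000 (falls back to the environment variable)
print(resolve_setting("http://other-host:9000", "DYNTASTIC_HOST"))
# http://other-host:9000 (class attribute wins)
```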

Contributing / Developer Setup

Make sure `just` is installed on your system.

To setup the dev environment and install dependencies:

# create and activate a new venv
python3 -m venv .venv
. .venv/bin/activate

# install all dev dependencies
just install-dev

# to automatically run pre-commit before all commits
pre-commit install

After making changes, lint all code + run tests:

just pre-commit

# or individually:
just isort
just black
just flake8
just mypy
just test

# run a specific test/tests
just test tests/test_save.py tests/test_get.py
just test tests/test_save.py::test_save_aliased_item

dyntastic's People

Contributors

bvsn, cooncesean, kvanopdorp, nayaverdier, peterb154, photonbit, strutt


dyntastic's Issues

Clarifications on item updates

Hi! Awesome library, been trying it out and have a couple of questions.

I made a simple class:

import uuid
from datetime import datetime

class Counter(Dyntastic):
    __table_name__ = "counters"
    __hash_key__ = "id"

    id: str = Field(default_factory=lambda: str(uuid.uuid4()))
    value: int = 0
    created: datetime = Field(default_factory=datetime.now)
    updated: datetime = Field(default_factory=datetime.now)

And I made a Lambda handler that does this (the intent is to increment the value by 1):

def run(event: EventBridgeEvent, context: LambdaContext):
    counter = Counter.safe_get(event.detail["id"])

    if not counter:
        counter = Counter(id=event.detail["id"])
        counter.save()

    counter.update(
        A.value.plus(1),
        A.updated.set(datetime.now()),
    )

Although I see the counter being initialized in Dynamo, the updates don't seem to take effect. For reference this does work:

def run(event: EventBridgeEvent, context: LambdaContext):
    counter = Counter.safe_get(event.detail["id"])

    if not counter:
        counter = Counter(id=event.detail["id"])
        counter.save()

    counter.value += 1
    counter.updated = datetime.now()
    counter.save()

So I'm just wondering if maybe I'm misunderstanding something, or if I've stumbled on a bug. I'm also not clear on the difference between these two ways to update, I guess counter.update() is more efficient?

Any help you can provide here would be awesome. Thank you for taking the time to read this over!

Batching

Is there any plan to wrap the batch read or write methods available in boto3?

Perhaps something like:

Event.batch_get(
    "key1",
    "key2",
    "key3"
)

serialization does not honor json_encoders from Pydantic Model

given the following code:

class DummyJobClass:
     pass
       
class JobType(Enum):
    DUMMY_JOB_TYPE = DummyJobClass
    
class Job(Dyntastic):
    __table_name__ = "jobs"
    __hash_key__ = "job_id"
    __table_host__ = "http://localhost:8000"
    __table_region__ = "us-east-1"

    job_id: int
    type: JobType
    
    class Config:
        json_encoders = {
           JobType: lambda obj: obj.name
        }

model = Job(job_id=1, type=JobType.DUMMY_JOB_TYPE)
model.save()

Error:

File "pydantic/json.py", line 90, in pydantic.json.pydantic_encoder
TypeError: Object of type 'type' is not JSON serializable

This happens because the serialize function uses pydantic_encoder, which works for dataclasses because they do not implement a .json() method.

https://github.com/nayaverdier/dyntastic/blob/4b90aaa5d8dec7013de4c9692da44a7aa2766b0d/dyntastic/attr.py#L17C1-L33C70

For Sets and Decimals, we can implement a json_encoder in Dyntastic, but it does not work if someone overrides the Config class.

Another option would be to implement something like this:
https://github.com/pydantic/pydantic/blob/d9c2af3a701ca982945a590de1a1da98b3fb4003/pydantic/main.py#L242-L245

in this part of serialize:

    else:
        # handle types like datetime
        return json.loads(json.dumps(data, default=pydantic_encoder))

Transaction methods?

First I wanted to say thank you for creating this library. I've been looking for a way to represent DynamoDB "models" using something like Pydantic but didn't want a full ORM like Pynamo. We're having a discussion about it on aws-powertools/powertools-lambda-python#2053.

Any thoughts on supporting TransactWriteItems and TransactGetItems? It would be cool to use a context manager with writing a transaction to automatically flush it once it reaches 100 items.

Feature request: __table_host__ or endpoint_url

Hi there! Is there any way to set a custom endpoint_url for DynamoDB? For local development, I'm using the dynamodb docker image, so I'm looking for a way to set my http://localhost:8000 for the boto3 resource/client.

Any suggestions on how to set it up? Thanks in advance.

Localstack support

Hi team,

I'm currently very interested in using this for my projects.

One thing that I'd like to get out of the box is Localstack support for the models so that the host automatically gets rewritten to the Localstack when Localstack is configured.

I'm adding this issue here as I think it would be a nice QOL upgrade, and perhaps I will get to it if I find time.

Attr query by class variables.

Hi team,

So I've had a look at the mechanism for querying and I feel as if it's very close to complete. My only question would be to perhaps rework the way that the query works to allow for the feeding in of class and its hinted variables. Then the work is offloaded to the Attr library internally.

This would allow for auto-completion when querying an object. I haven't played with the library but from the looks of it, you won't get static analysis on those variables from the Attr class.

This, I think, would be the final implementation change that would make the library extraordinary and bring it into wide usage.

I will reiterate that this library is fantastic and I'm extremely grateful for the work already done on it. I'm curious if I can find a way to implement the above, so I'll try to take a look at it.

Allowing updates without first fetching the record

Following on from #21 (comment).

I don't really have a clear path to this, but I'd expect something like:

Model.update("id", updates..., refresh=False (or true!))

Maybe it needs a new name beyond update, not sure. I definitely think this would be a great addition though.

[FEATURE] Support KEYS_ONLY secondary index

Given an example table with a secondary index that has projection_type = 'KEYS_ONLY':

class Example(Dyntastic):
    __table_name__ = "example"
    __hash_key__ = "id"
    __table_host__ = "http://localhost:8000"
    __table_region__ = "us-east-1"
    
    some_required_attribute: str

The results returned from DynamoDB only contain the __hash_key__ because of the projection type, which raises an error when calling the query() method because some_required_attribute is missing.

query_results = cls.query(
    A.is == id,
    index="example-index",
)

You can make all of the fields optional to make this work, but I think that is a bad idea when using schema validation, lol.

Feature Idea:
When querying secondary indexes with a KEYS_ONLY projection, it would be great to:
(1) disable validation of the pydantic schema when getting the data, validate only the keys, and return the object with the partition keys only.
(2) [optional argument for query] have the query method return the full object using GET operations after (1)

Async support?

While considering leveraging dyntastic for a Litestar based API, I was wondering whether you'd considered enabling async capabilities? I see dyntastic is built on top of boto3, and so perhaps the obvious way to support async would be via aioboto3. Do you have any plans or thoughts in this space?

support for env variable based metadata

Hey all, just stumbled on this project and I LOVE it so far.

I use the docker-based amazon/dynamodb-local:latest image for local testing and deploy tables into AWS with cdk, which makes the table names variable based on the deployment.

When testing locally, I have to change the model to set __table_host__ and __table_region__ to the local ddb instance, and remember to remove that before commit/deploy.

And for the __table_name__, I currently have to hard code the table name in cdk so that it matches the model definition. I'd rather use the dynamically generated table names and set them as an environment variable on my lambdas/containers.

I think it would be great if these props could be set natively using environment variables, rather than hard coded in the models. That way they would be easy to override for local testing/unit tests and could be set dynamically when deployed to AWS.

If you are in support of this idea, I'll create a PR to implement.

Dyntastic import error

I just installed version 0.13.0 with pydantic v2 support. When importing the Dyntastic class, I get the below error:

I believe it's a matter of importing FieldInfo vs setting FieldInfo to pydantic.fields.FieldInfo. I opened a pull request with slight mods. Feel free to pull or mod as makes more sense to you.

from dyntastic import Dyntastic
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/kirk/python-test/.venv/lib/python3.9/site-packages/dyntastic/__init__.py", line 1, in <module>
    from .attr import A, Attr
  File "/Users/kirk/python-test/.venv/lib/python3.9/site-packages/dyntastic/attr.py", line 10, in <module>
    from . import pydantic_compat
  File "/Users/kirk/python-test/.venv/lib/python3.9/site-packages/dyntastic/pydantic_compat.py", line 90, in <module>
    FieldInfo = pydantic.fields.FieldInfo
  File "/Users/kirk/python-test/.venv/lib/python3.9/site-packages/pydantic/__init__.py", line 363, in __getattr__
    return _getattr_migration(attr_name)
  File "/Users/kirk/python-test/.venv/lib/python3.9/site-packages/pydantic/_migration.py", line 306, in wrapper
    raise AttributeError(f'module {module!r} has no attribute {name!r}')
AttributeError: module 'pydantic' has no attribute 'fields'

Index documentation / source

I am migrating from pynamodb to dyntastic and I am having trouble understanding how to use indexes or which ones are supported. I've found some misleading clues:

https://github.com/nayaverdier/dyntastic/blob/main/dyntastic/main.py#L66

# TODO: support INCLUDE projection?
        self.projection = "KEYS_ONLY" if keys_only else "ALL"

https://github.com/nayaverdier/dyntastic/blob/main/dyntastic/main.py#L98

        validated, fields_set, errors = validate_model(model, item)
        if errors:
            # assume KEYS_ONLY or INCLUDE index

https://github.com/nayaverdier/dyntastic/blob/main/README.md?plain=1#L164

DynamoDB Indexes using a `KEYS_ONLY` or `INCLUDE` projection are supported:

So what are the projections supported?

Also, how could we use the dyntastic.Index class so I can add the index as a field to another model?

Consider removing the dependency pinning

Hi! We just started trying to integrate your library to our ecosystem but this is causing some conflicts with other libraries (because of the pinning of importlib-metadata at the moment, but the other two pinnings will eventually cause trouble as well). Would you consider removing the pinning?

Thank you!

`.query()` over large data sets with a `filter_condition` does not return expected results

Issue Description

When running Dyntastic.query() over a large data set, with a filter_condition, the resulting generator contains zero items (i.e. raises StopIteration). The exact same query, with the exact same filter_condition, executed on a smaller data set, returns the expected items from Dynamo.

Video

Slightly difficult to explain over text, so I created a Loom to demonstrate the issue.

Steps To Reproduce

  1. Create a new dyntastic model w/ at least one str attribute.
  2. Create >1000 records of that model associated to the same p_key
  3. Query that model using a filter_expression on the attribute.

Clarification on create_table limitations and intent

I saw that the README states that create_table is only meant for testing, but I'm curious if this is due to it not being ready "for production" or because it's never intended to be ready for production?

The obvious feature to add is BILLING_MODE, and I was going to look into doing this, but it occurred to me that it might be intentionally left out. Same for LSI. As both of these are supported by the DynamoDB client, I figure it's something like API limitations and/or maintenance limitations?

In my current workflow (using the Serverless framework) I can either define my tables in code, or define them in CloudFormation. Doing so in CloudFormation is quite tedious, but doing it in code is a bit awkward as it's not supported by dyntastic.

Any clarification you can provide here is appreciated!
