Comments (61)

seperman avatar seperman commented on August 15, 2024 7

Thanks to the great work by @victorhahncastell, this shouldn't be difficult to implement. We will try to implement it in the V3 release!

from deepdiff.

seperman avatar seperman commented on August 15, 2024 3

Hello @divamgupta, @ri0t, @laike9m and everybody.
The Delta object is finally ready for beta testing!
You can see the changes in this PR: #188
And please feel free to comment on that PR.

In order to test it, please pull down the dev branch.

2 example usages:

from deepdiff import Delta, DeepDiff

t1 = [[1, 2, 3, 4], [4, 2, 2, 1]]
t2 = [[4, 1, 1, 1], [1, 3, 2, 4]]

diff = DeepDiff(t1, t2, ignore_order=True, report_repetition=True)
delta = Delta(diff)
>>> t1 + delta
[[1, 2, 3, 4], [4, 1, 1, 1]]

or you could dump the delta:

In [1]: from deepdiff import DeepDiff, Delta                                                   

In [2]: t1 = [1, 2]                                                                            
In [3]: t2 = [1, 2, 3, 5]                                                                      
In [4]: diff = DeepDiff(t1, t2)                                                                
In [5]: diff                                                                                   
Out[5]: {'iterable_item_added': {'root[2]': 3, 'root[3]': 5}}

In [6]: dump = diff.to_delta_dump()
In [7]: dump                                                                                   
Out[7]: b'DeepDiff Delta Payload v0-0-1\n\x80\x04\x956\x00\x00\x00\x00\x00\x00\x00}\x94\x8c\x13iterable_item_added\x94}\x94(\x8c\x07root[2]\x94K\x03\x8c\x07root[3]\x94K\x05us.'

In [8]: delta = Delta(dump)                                                                    
In [9]: delta + t1 == t2 
Out[9]: True

There are many more examples in the tests.
Please let me know the results of your tests when you can!
Thanks

seperman avatar seperman commented on August 15, 2024 2

Heads up I'm working on a major release that adds the Delta object to DeepDiff. The goal is for any DeepDiff object to be able to be converted to a Delta object. Then the Delta object can be applied to any object with a similar structure. I will cut a beta release soon. If anybody is interested in helping with beta testing, please let me know.

 avatar commented on August 15, 2024 1

Event Sourcing/CQRS is gaining momentum and this feature ^^ would be a killer feature.
In Event Sourcing we need to apply millions of diffs per second to (sometimes nested) objects and there is currently no good way to do it.
To see the use case for our community just read pages 9 to 13 of this thesis
We basically store a diff each time an entity/aggregate is modified.
I would see Deepdiff being good for rehydrating entities/aggregates.
Event Sourcing

phiweger avatar phiweger commented on August 15, 2024 1

deepdiff == awesome. would be very useful to store (diff, new.json) as a form of version control that can recover old.json, did you make progress on the implementation @seperman?

seperman avatar seperman commented on August 15, 2024 1

What I have in mind is to use the to_json output, which is basically the text view result, as the delta object. The tricky part is dealing with custom objects. Perhaps we can expect the custom object to be present in the locals() when using the delta.

As far as the interface, I had plans to use Python's magic methods. So one could simply use obj + delta or obj - delta interfaces: https://github.com/seperman/deepdiff/blob/master/deepdiff/diff.py#L129
Were you interested in helping with this feature by any chance? @laike9m
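
For anyone following along, here is a rough sketch of how the obj + delta and delta + obj interfaces could be wired up with Python's magic methods. ToyDelta and its index-to-value mapping are made up for illustration and are not DeepDiff's implementation; the only point is that __add__ covers delta + obj while __radd__ covers obj + delta.

```python
class ToyDelta:
    """Hypothetical stand-in for a delta: maps list indexes to inserted values."""

    def __init__(self, items_added):
        # items_added: {index: value}, e.g. {2: 3, 3: 5}
        self.items_added = dict(items_added)

    def __add__(self, other):
        # Handles `delta + obj`; returns a new list and leaves `other` untouched.
        result = list(other)
        for index in sorted(self.items_added):
            result.insert(index, self.items_added[index])
        return result

    # Handles `obj + delta`: Python falls back to the right operand's
    # __radd__ when the left operand cannot handle the addition itself.
    __radd__ = __add__


t1 = [1, 2]
delta = ToyDelta({2: 3, 3: 5})
print(t1 + delta)  # [1, 2, 3, 5]
print(delta + t1)  # [1, 2, 3, 5]
```

Returning a new object instead of mutating the operand keeps the original around, which matters if the same delta is applied more than once.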

seperman avatar seperman commented on August 15, 2024 1

Cool! The TreeView is great but contains the entire objects that are being diffed. The delta object on the other hand is compact and only includes what the difference is. It is closer to the Text View than the Tree View. I'm planning to include different detail levels for the deltas corresponding to different "verbose_level" parameters that were used to make the diff object. That way there can be both directed or symmetric deltas depending on the user's needs.

laike9m avatar laike9m commented on August 15, 2024 1

This is great news, thank you for the fantastic work! I'll give it a try.

laike9m avatar laike9m commented on August 15, 2024 1

@seperman Thanks for the quick fix! Confirmed.

seperman avatar seperman commented on August 15, 2024

@divamgupta Right now DeepDiff doesn't provide this feature but if enough people need this feature, we can implement it. I don't personally have a use case for it. Where would you use it?

divamgupta avatar divamgupta commented on August 15, 2024

It can be used to log objects which change over time. Rather than logging the complete object, we can log the changes to that object. To view the object at any point in time, we would have to apply the diffs.

anatoly-kussul avatar anatoly-kussul commented on August 15, 2024

Also, if you have to transfer some changes of an object between two processes, this may be useful.

seperman avatar seperman commented on August 15, 2024

Cool. I will look into it when I have a chance.

giulioungaretti avatar giulioungaretti commented on August 15, 2024

I think it's a great idea too!
@seperman another use case is JSON config files, for example, where you just want to save the diff instead of the entire updated configuration, and then apply it!

It goes hand in hand with #43 I think!

 avatar commented on August 15, 2024

would be great to have a mode optimized to take less disk space as we store millions of diffs

 avatar commented on August 15, 2024

by optimized I mean simply use for example 'vc' instead of 'values_changed' as index...
no need for something ultra-optimized...

victorhahncastell avatar victorhahncastell commented on August 15, 2024

@divamgupta @anatoly-kussul @giulioungaretti @pouledodue This sounds quite interesting. Could you elaborate a bit on what exactly you would use this feature for? What's the reason for only storing the diff in the first place -- to conserve disk space? A bit of a problem might be that DeepDiff is currently not at all optimized to create small diffs. To the contrary -- since introducing TreeView with v3 we store a diff by keeping the complete object trees of both versions.

We might need to implement a feature to compact results, which implies throwing away information, which in turn means a lot of use cases won't work on the compacted versions. But as an optional feature this does sound interesting.

ri0t avatar ri0t commented on August 15, 2024

I'm building a migration system for jsonschema-defined data. I can apply diffs already. Not sure, if it handles special cases exactly right (probably not... ...yet)... Can show that soon, will publish the code @ https://github.com/Hackerfleet/hfos/tree/master/hfos/migration.py

tiwariayush avatar tiwariayush commented on August 15, 2024

@ri0t: It seems interesting how you are able to apply diffs for dictionary_items_added and dictionary_items_removed. Why not apply a similar approach for iterable_items_added and iterable_items_removed and then use it for dictionaries?

Seersucker avatar Seersucker commented on August 15, 2024

Hello folks, git newbie here. I attempted to merge my branch called "Seersucker:compare_objs" that does compare Python objects and JSON but the merge request failed because I do not have unit tests created. Can someone please help me develop these tests? I have my code working just not sure how to proceed. I'm familiar with pytest only for a test framework.

tpaulino7 avatar tpaulino7 commented on August 15, 2024

Any updates on this? It would be a very useful feature. Btw, great work!

laike9m avatar laike9m commented on August 15, 2024

Thanks for the information. I might not have time to implement it now, but I can definitely take a closer look.

laike9m avatar laike9m commented on August 15, 2024

So excited to hear things are moving 😄. I'm willing to help with testing. Please let me know what to do, or what part you need help with.

BTW, a previous comment mentioned that

To the contrary -- since introducing TreeView with v3 we store a diff by keeping the complete object trees of both versions.

I wonder if that's still the case. I care about it because, as described in the other issue, I want to use the diff object to hopefully save some memory when storing different versions of large objects. I also see other people had similar use cases.

ri0t avatar ri0t commented on August 15, 2024

Oooweee! I am very willing to assist :-)

laike9m avatar laike9m commented on August 15, 2024

Cool! The TreeView is great but contains the entire objects that are being diffed. The delta object on the other hand is compact and only includes what the difference is. It is closer to the Text View than the Tree View. I'm planning to include different detail levels for the deltas corresponding to different "verbose_level" parameters that were used to make the diff object. That way there can be both directed or symmetric deltas depending on the user's needs.

Nice

seperman avatar seperman commented on August 15, 2024

Also tagging @victorhahncastell in case.

ri0t avatar ri0t commented on August 15, 2024

From a first glance, everything seems to work as intended with my objects.

Now my next todo item is how to get this to interop with the JavaScript deep diff library I found: https://github.com/flitbit/diff
Probably just needs a translation of the deltas.

laike9m avatar laike9m commented on August 15, 2024

Some issues about dependencies. It seems 5.0 relies on numpy, but it's not in requirements.txt thus not collected by setup.py and not installed.

4.x only depends on ordered-set. I haven't looked at the code in detail, but numpy is a pretty heavy dependency, so I wonder if it's always needed.

seperman avatar seperman commented on August 15, 2024

@laike9m Numpy being a dependency is a mistake. Numpy is not technically required. Working on that.

seperman avatar seperman commented on August 15, 2024

@laike9m If you pull the dev branch again, Numpy shouldn't be required anymore.

laike9m avatar laike9m commented on August 15, 2024

Another question: do we have some optimizations for applying a series (>1) of Deltas?

seperman avatar seperman commented on August 15, 2024

@laike9m
There is no specific optimization. It applies the deltas one after the other. They need to be applied in the correct order though.

t1 = [1, 2]
t2 = [1, 2, 3, 5]
t3 = [{1}, 3, 5]
dump1 = DeepDiff(t1, t2).to_delta_dump()
dump2 = DeepDiff(t2, t3).to_delta_dump()

delta1 = Delta(dump1)
delta2 = Delta(dump2)

>>> t1 + delta1 + delta2 == t3
True

laike9m avatar laike9m commented on August 15, 2024

Got it. Just asking.

seperman avatar seperman commented on August 15, 2024

Also I pushed more changes to the dev branch last night so please pull again.

laike9m avatar laike9m commented on August 15, 2024

Just an idea: can we have a way to easily create an empty Delta (meaning "identical")?

Also, to know whether two objects are equal, we have to use DeepDiff == {}. Does it make sense to add a method like DeepDiff.is_empty()?

laike9m avatar laike9m commented on August 15, 2024

When using delta, I was kind of expecting an API like DeepDiff.to_delta. It feels a bit weird to have to instantiate a Delta from DeepDiff.to_delta_dump/dict.

seperman avatar seperman commented on August 15, 2024

@laike9m So right now there is to_delta_dict(), whose output one can then store in any way they want. However, when loading the Delta from that dictionary, one needs to use the Delta object. Originally I was gonna just keep it in DeepDiff. So it was gonna be DeepDiff.to_delta() and DeepDiff.from_delta() and then you would apply the DeepDiff object to anything. However, the DeepDiff object has a lot more going on than Delta. So I made a new class for Delta that is very lightweight. That makes it faster to load the delta and apply it. I could revert it back so it is abstracted in DeepDiff and the user doesn't even need to know about the delta object.

laike9m avatar laike9m commented on August 15, 2024

@laike9m So right now there is to_delta_dict(), whose output one can then store in any way they want. However, when loading the Delta from that dictionary, one needs to use the Delta object. Originally I was gonna just keep it in DeepDiff. So it was gonna be DeepDiff.to_delta() and DeepDiff.from_delta() and then you would apply the DeepDiff object to anything. However, the DeepDiff object has a lot more going on than Delta. So I made a new class for Delta that is very lightweight. That makes it faster to load the delta and apply it. I could revert it back so it is abstracted in DeepDiff and the user doesn't even need to know about the delta object.

How about making DeepDiff.to_delta() return Delta(DeepDiff.to_delta_dump())? It would just wrap the Delta creation to make it more intuitive, and everything else would stay intact. That's what I meant.

seperman avatar seperman commented on August 15, 2024

@laike9m
I'm adding the identical Delta. Basically it will be Delta({}).
Instead of checking DeepDiff == {}, you can check "not DeepDiff", since an empty diff is falsy.

laike9m avatar laike9m commented on August 15, 2024

@laike9m
I'm adding the identical Delta. Basically it will be Delta({}).
Instead of checking DeepDiff == {}, you can check "not DeepDiff", since an empty diff is falsy.

SG. Thanks.

seperman avatar seperman commented on August 15, 2024

@laike9m
Do you prefer to get the delta dictionary and serialize it yourself, or do you prefer the dump format, which is some header + pickle of that dictionary?

seperman avatar seperman commented on August 15, 2024

Basically, DeepDiff.to_delta_dump gives you a pickle object + headers. DeepDiff.to_delta_dict is the same object, but as a Python dictionary before it is serialized.
I'm adding to the docs but to make sure you are in the loop, the pickle loader that I'm using is safe and only loads a whitelisted list of items. So there is no security concern unlike the normal pickle.
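
To illustrate the idea of a whitelisting pickle loader, here is a minimal sketch built on the standard library's documented pickle.Unpickler.find_class hook. The names SAFE_TO_IMPORT, RestrictedUnpickler and safe_loads are assumptions for this example only; this is not DeepDiff's actual loader.

```python
import io
import pickle

# Hypothetical allowlist of "module.name" globals the loader may import.
SAFE_TO_IMPORT = {"builtins.set", "builtins.frozenset", "collections.OrderedDict"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global (class or function).
        if f"{module}.{name}" in SAFE_TO_IMPORT:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"{module}.{name} is not allowed")


def safe_loads(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain dicts, strings and ints need no globals, so they load fine.
payload = pickle.dumps({"iterable_item_added": {"root[2]": 3}}, protocol=4)
print(safe_loads(payload))

# A pickled reference to builtins.print is not on the allowlist.
blocked = pickle.dumps(print, protocol=4)
try:
    safe_loads(blocked)
    rejected = False
except pickle.UnpicklingError as exc:
    rejected = True
    print("rejected:", exc)
```

A loader like this refuses anything outside the allowlist instead of importing it, which is what removes the usual risk of loading an untrusted pickle.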

laike9m avatar laike9m commented on August 15, 2024

@laike9m
Do you prefer to get the delta dictionary and serialize it yourself, or do you prefer the dump format, which is some header + pickle of that dictionary?

In my use case everything is in memory, and I want to keep Delta objects directly if possible. Is the decision to create Delta objects on demand motivated by memory usage, or something else? I can imagine the pickled data consumes less memory, but the dict version should be roughly the same?

laike9m avatar laike9m commented on August 15, 2024

Anyway, if you don't plan to change it, current API is fine as well.

seperman avatar seperman commented on August 15, 2024

If you plan to keep a lot of the delta objects in memory, the pickled bytes take the least space. And you can load them back to Delta whenever needed. Regarding the API, thanks for giving the feedback. I am trying to make it more Pythonic.

laike9m avatar laike9m commented on August 15, 2024

Yeah, I'll store dumped delta anyway so it's fine.

laike9m avatar laike9m commented on August 15, 2024

I've added Delta to my program. For most test cases it works fine, except one. I created a self-contained script to reproduce (Python 3.8.0 + DeepDiff 98d7d2d):

from copy import deepcopy

from deepdiff import DeepDiff


def test_diff():
    class A:
        pass

    a1 = A()
    a2 = A()

    a1_old = deepcopy(a1)
    a1.x = a2
    print(DeepDiff(a1_old, a1))  # works
    print(DeepDiff(a1_old, a1).to_delta_dump())  # error


test_diff()

Gives:

python test/test_diff.py                                                                         
{'attribute_added': [root.x]}
Traceback (most recent call last):
  File "test/test_diff.py", line 37, in <module>
    test_diff()
  File "test/test_diff.py", line 34, in test_diff
    print(DeepDiff(a1_old, a1).to_delta_dump())
  File "/Users/laike9m/.pyenv/versions/3.8.0/envs/[email protected]/src/deepdiff/deepdiff/diff.py", line 1065, in to_delta_dump
    return pickle_dump(self.to_delta_dict())
  File "/Users/laike9m/.pyenv/versions/3.8.0/envs/[email protected]/src/deepdiff/deepdiff/serialization.py", line 92, in pickle_dump
    return header + b'\n' + pickle.dumps(obj, protocol=4, fix_imports=False)
AttributeError: Can't pickle local object 'test_diff.<locals>.A'

For now, I'm using Delta(DeepDiff) as a temporary solution.
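
The failure does not depend on DeepDiff itself: the standard pickle module stores classes by name, so an instance of a class defined inside a function cannot be pickled. A minimal sketch with the standard library only (make_local_instance and can_pickle are illustrative names):

```python
import pickle


def make_local_instance():
    class A:  # defined inside a function, so pickle cannot look it up by name
        pass

    return A()


def can_pickle(obj):
    try:
        pickle.dumps(obj, protocol=4)
        return True
    except (pickle.PicklingError, AttributeError):
        # Depending on the Python version, a local class surfaces as
        # AttributeError ("Can't pickle local object ...") or PicklingError.
        return False


print(can_pickle({"root.x": 1}))          # True
print(can_pickle(make_local_instance()))  # False
```

Moving the class to module level is the usual way to make such objects picklable with the standard library.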

laike9m avatar laike9m commented on August 15, 2024

As an experiment, I replaced stdlib's pickle with dill, and it actually works...

Step 1. Modify serialization.py like this

def pickle_dump(obj, header=BASIC_HEADER):
    if isinstance(header, str):
        header = header.encode('utf-8')
    # We expect at least python 3.5 so protocol 4 is good.
    import dill
    return header + b'\n' + dill.dumps(obj, fix_imports=False)

Step 2. Modify my test program

from copy import deepcopy

from deepdiff import DeepDiff, Delta


def test_diff():
    class A:
        pass

    a1 = A()
    a2 = A()

    a1_old = deepcopy(a1)
    a1.x = a2  # STORE_ATTR
    print(DeepDiff(a1_old, a1))  # works
    delta = Delta(
        DeepDiff(a1_old, a1).to_delta_dump(),
        safe_to_import={"dill._dill._create_type", "dill._dill._load_type"},
    )

    print((a1_old + delta).x)


test_diff()

Run

python test/test_diff.py                                                                         
{'attribute_added': [root.x]}
<__main__.A object at 0x1046035e0>

laike9m avatar laike9m commented on August 15, 2024

Interestingly enough, I find that when using pytest, dill doesn't work either. I'm not sure why, but it seems pytest does some magic behind the scenes which conflicts with the pickling process.

Same program, tested with pytest --assert=plain -k "test/test_diff.py", gives

______________________________________ ERROR collecting test/test_diff.py ______________________________________
test/test_diff.py:24: in <module>
    test_diff()
test/test_diff.py:17: in test_diff
    DeepDiff(a1_old, a1).to_delta_dump(),
../../../.pyenv/versions/3.8.0/envs/[email protected]/src/deepdiff/deepdiff/diff.py:1065: in to_delta_dump
    return pickle_dump(self.to_delta_dict())
../../../.pyenv/versions/3.8.0/envs/[email protected]/src/deepdiff/deepdiff/serialization.py:93: in pickle_dump
    return header + b'\n' + dill.dumps(obj, fix_imports=False)
../../../.pyenv/versions/[email protected]/lib/python3.8/site-packages/dill/_dill.py:265: in dumps
    dump(obj, file, protocol, byref, fmode, recurse, **kwds)#, strictio)
../../../.pyenv/versions/[email protected]/lib/python3.8/site-packages/dill/_dill.py:259: in dump
    Pickler(file, protocol, **_kwds).dump(obj)
../../../.pyenv/versions/[email protected]/lib/python3.8/site-packages/dill/_dill.py:445: in dump
    StockPickler.dump(self, obj)
../../../.pyenv/versions/3.8.0/lib/python3.8/pickle.py:485: in dump
    self.save(obj)
../../../.pyenv/versions/3.8.0/lib/python3.8/pickle.py:558: in save
    f(self, obj)  # Call unbound method with explicit self
../../../.pyenv/versions/[email protected]/lib/python3.8/site-packages/dill/_dill.py:912: in save_module_dict
    StockPickler.save_dict(pickler, obj)
../../../.pyenv/versions/3.8.0/lib/python3.8/pickle.py:969: in save_dict
    self._batch_setitems(obj.items())
../../../.pyenv/versions/3.8.0/lib/python3.8/pickle.py:1000: in _batch_setitems
    save(v)
../../../.pyenv/versions/3.8.0/lib/python3.8/pickle.py:558: in save
    f(self, obj)  # Call unbound method with explicit self
../../../.pyenv/versions/[email protected]/lib/python3.8/site-packages/dill/_dill.py:912: in save_module_dict
    StockPickler.save_dict(pickler, obj)
../../../.pyenv/versions/3.8.0/lib/python3.8/pickle.py:969: in save_dict
    self._batch_setitems(obj.items())
../../../.pyenv/versions/3.8.0/lib/python3.8/pickle.py:1000: in _batch_setitems
    save(v)
../../../.pyenv/versions/3.8.0/lib/python3.8/pickle.py:601: in save
    self.save_reduce(obj=obj, *rv)
../../../.pyenv/versions/3.8.0/lib/python3.8/pickle.py:685: in save_reduce
    save(cls)
../../../.pyenv/versions/3.8.0/lib/python3.8/pickle.py:558: in save
    f(self, obj)  # Call unbound method with explicit self
../../../.pyenv/versions/[email protected]/lib/python3.8/site-packages/dill/_dill.py:1356: in save_type
    StockPickler.save_global(pickler, obj)
../../../.pyenv/versions/3.8.0/lib/python3.8/pickle.py:1068: in save_global
    raise PicklingError(
E   _pickle.PicklingError: Can't pickle <class 'test_diff.test_diff.<locals>.A'>: it's not found as test_diff.test_diff.<locals>.A

laike9m avatar laike9m commented on August 15, 2024

Another bug found (not related to the previous): when initializing Delta from DeepDiff object, if the DeepDiff object evaluates to False, initialization will fail.

Clearly it's due to this line:
https://github.com/seperman/deepdiff/blob/dev/deepdiff/delta.py#L93

    def __init__(self, diff=None, delta_path=None, mutate=False, verify_symmetry=False, raise_errors=False, log_errors=True, safe_to_import=None):
        if diff:
            if isinstance(diff, DeepDiff):
                self.diff = diff.to_delta_dict()
            elif isinstance(diff, Mapping):
                self.diff = diff
            elif isinstance(diff, strings):
                self.diff = pickle_load(diff, safe_to_import=safe_to_import)
        elif delta_path:
            with open(delta_path, 'rb'):
                content = delta_path.read()
            self.diff = pickle_load(content, safe_to_import=safe_to_import)
        else:
>           raise ValueError('Either diff or delta_path need to be specified.')
E           ValueError: Either diff or delta_path need to be specified.

An empty diff is not taken care of properly.
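
The underlying pitfall is that an empty mapping is falsy, so "if diff:" cannot tell "no diff was passed" apart from "an empty diff was passed". A minimal sketch of the distinction, using hypothetical pick_source_* helpers rather than the real Delta.__init__:

```python
def pick_source_buggy(diff=None, delta_path=None):
    if diff:  # an empty dict is falsy, so {} falls through to the error
        return "diff"
    elif delta_path:
        return "delta_path"
    raise ValueError("Either diff or delta_path need to be specified.")


def pick_source_fixed(diff=None, delta_path=None):
    if diff is not None:  # only None means "not provided"
        return "diff"
    elif delta_path is not None:
        return "delta_path"
    raise ValueError("Either diff or delta_path need to be specified.")


empty_diff = {}  # what diffing two identical objects produces

try:
    pick_source_buggy(empty_diff)
    buggy_raised = False
except ValueError:
    buggy_raised = True

print(buggy_raised)                   # True
print(pick_source_fixed(empty_diff))  # diff
```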

Let me know if you prefer to create a separate issue.

seperman avatar seperman commented on August 15, 2024

Cool, I merged your PR.
Just curious, do you use dill? I have not had any use for it. It seems slower than cPickle too.
Also I'm doing some refactoring of the ignore_order=True algorithm to use dynamic programming. There are way too many re-calculations happening right now.

laike9m avatar laike9m commented on August 15, 2024

I don't have much experience with it, just know it exists. According to this answer, dill is indeed slower.

In the delta use case, I think being able to pickle more types of objects is more important than performance, and this file shows dill can pickle more stuff than cPickle. With that said, there are some ways to mitigate the performance issue:

  1. Choose cPickle by default, if failed, try dill, and add another field to record the method used.
  2. As the answer said, dill itself has different configurations, like byref=True, which makes it faster. Do we need it? I don't know.

    Other settings trade off picklibility for speed in selected objects.
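
Option 1 could look roughly like the sketch below: try serializers in order and prefix the payload with the name of the one that worked, so the loader knows which deserializer to use. All function names here are made up for illustration, and repr_dumps is only a stand-in for a more capable serializer such as dill, which is not imported.

```python
import pickle


def dumps_with_fallback(obj, serializers):
    """Try each (name, dumps) pair in order; prefix the payload with the name used."""
    errors = []
    for name, dumps in serializers:
        try:
            payload = dumps(obj)
        except Exception as exc:  # failure types vary between serializers
            errors.append(f"{name}: {exc}")
            continue
        return name.encode("utf-8") + b"\n" + payload
    raise ValueError("all serializers failed: " + "; ".join(errors))


def read_method(data):
    """Return the recorded serializer name and the raw payload."""
    name, _, payload = data.partition(b"\n")
    return name.decode("utf-8"), payload


def make_local_instance():
    class A:  # local class: the standard pickle module refuses it
        pass

    return A()


def repr_dumps(obj):
    # Stand-in for a fallback serializer; it is NOT reversible like dill.
    return repr(obj).encode("utf-8")


serializers = [("pickle", pickle.dumps), ("repr", repr_dumps)]

method, payload = read_method(dumps_with_fallback({"root[2]": 3}, serializers))
print(method)  # pickle

fallback_method, _ = read_method(dumps_with_fallback(make_local_instance(), serializers))
print(fallback_method)  # repr
```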

This is not too bad. What worries me more is that pickling can fail in unexpected ways (even with dill, as in the pytest case), which will force people to write the following code snippet every time they want to use to_delta_dump():

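from pickle import PicklingError

from deepdiff import DeepDiff, Delta

diff = DeepDiff(a1_old, a1)
try:
    dump = diff.to_delta_dump()
except (PicklingError, AttributeError):
    # The local-class failure above surfaced as AttributeError, not PicklingError.
    dump = diff  # or dump = diff.to_delta_dict()

# later in code
delta = Delta(dump)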

This is, IMHO, cumbersome, error-prone, and hard to use. We can discuss the solution, but I think the direction could be to encapsulate this process by a to_delta API, which accepts a mode parameter. It works roughly like this:

from enum import Enum
from pickle import PicklingError


class DeltaCreationMode(Enum):
    fastest_speed = 1
    least_space = 2
    balance = 3


class DeepDiff:
    ...

    def to_delta(self, mode: DeltaCreationMode):
        return Delta(self, mode)


class Delta:
    def __init__(self, diff: DeepDiff, mode: DeltaCreationMode):
        self.mode = mode
        self.diff = None
        self.dump = None
        if mode is DeltaCreationMode.fastest_speed:
            self.diff = diff.to_delta_dict()
        if mode in {DeltaCreationMode.balance, DeltaCreationMode.least_space}:
            try:
                self.dump = diff.to_delta_dump()
            except PicklingError:
                self.diff = diff.to_delta_dict()

    def __add__(self, other):
        # Any method that uses self.diff works like this; __add__ is just an example.
        if self.diff is None:
            assert self.dump
            self.diff = pickle_load(self.dump)

        # ... use self.diff

        if self.mode is DeltaCreationMode.least_space and self.dump:
            self.diff = None  # drop the loaded dict and keep only the dump

    def to_delta_dump(self):
        ...

    def to_delta_dict(self):
        ...

In this design, to_delta_dump and to_delta_dict are defined in the Delta class, which feels more natural to me (I could be wrong).

There are many things I didn't consider, and it's not necessarily the way to go, but that's my two cents.

laike9m avatar laike9m commented on August 15, 2024

Any thoughts? @seperman

seperman avatar seperman commented on August 15, 2024

@laike9m You brought up very good points. I have never had a need to use dill but I agree with you that the interface should easily allow one to switch pickle with any other serializer.
I followed your suggestion of moving the delta dump functions to Delta class itself.

The mode parameter is an interesting idea. I have to think a little more about it. What do you think about following the overall pattern in Python itself for the serializers? The common functions I see are dump(), dumps() and to_dict().

If we follow that convention, the DeepDiff.to_delta_dict function is now private. Instead, one can do Delta(diff_object).to_dict() to get the delta in dictionary format, and Delta(diff_object).dump() and dumps() to get the serialized output. The default serializer behind the scenes is pickle, but the user can pass a serializer and a deserializer through parameters. Since we are doing some custom stuff for defining safe-to-import modules, it was not as straightforward as replacing pickle with dill or with json. Please check the new changes and let me know what you think. I'm updating the docs now for other modules so this interface can still change based on the feedback!
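
As a rough picture of that convention, here is a toy class with the dump() / dumps() / to_dict() shape and a pluggable serializer. MiniDelta, its serializer parameter and json_serializer are hypothetical names for this sketch; the real Delta signature may differ.

```python
import io
import json
import pickle


class MiniDelta:
    """Hypothetical sketch, not DeepDiff's Delta: only the method shape matters."""

    def __init__(self, diff, serializer=pickle.dumps):
        self.diff = dict(diff)
        self.serializer = serializer  # any callable: obj -> bytes

    def to_dict(self):
        return dict(self.diff)

    def dumps(self):
        return self.serializer(self.diff)

    def dump(self, file):
        file.write(self.dumps())


def json_serializer(obj):
    return json.dumps(obj).encode("utf-8")


changes = {"iterable_item_added": {"root[2]": 3, "root[3]": 5}}

default_bytes = MiniDelta(changes).dumps()
json_bytes = MiniDelta(changes, serializer=json_serializer).dumps()

buffer = io.BytesIO()
MiniDelta(changes, serializer=json_serializer).dump(buffer)

print(pickle.loads(default_bytes) == changes)  # True
print(json.loads(json_bytes) == changes)       # True
print(buffer.getvalue() == json_bytes)         # True
```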

Btw, I just pushed some new changes that should make the ignore_order=True calculations exponentially faster than the previous commit on the dev branch.

laike9m avatar laike9m commented on August 15, 2024

I like the current implementation and the idea of specifying different serializers/deserializers. Originally I was thinking about storing the serialized object inside Delta, but now I realize it doesn't make much sense. Though I still wish I could use delta = diff.to_delta() to create a Delta object.

laike9m avatar laike9m commented on August 15, 2024

Meanwhile, I have a use case that I'd like to share and know what you think. Code is not opensourced yet so I use a screenshot.

image

seperman avatar seperman commented on August 15, 2024

Interesting. The DeepDiff object is way heavier than Delta, so I wouldn't deepcopy the diff object. Instead, DeepDiff._to_delta_dict should deepcopy its output, so that the delta object ends up sharing no objects with the diff. In fact I just pushed a commit that does that: 3a91682

That doesn't really make the Delta object immutable though. But at least mutating the diff object won't affect the delta object. We also have a reset() method on delta that resets it back to its original state; it is run automatically every time the delta is added to something.
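
The effect of deep-copying the delta dictionary can be seen with plain Python objects. In this sketch, Node and diff_result are hypothetical stand-ins for a user object and a diff's output; no DeepDiff code is involved.

```python
from copy import deepcopy


class Node:
    def __init__(self, y):
        self.y = y


child = Node(y=1)
diff_result = {"attribute_added": {"root.x": child}}  # hypothetical diff output

shared_delta = dict(diff_result)        # shallow copy: still references `child`
isolated_delta = deepcopy(diff_result)  # deep copy: owns its own Node

child.y = 2  # mutate the original object after the delta was built

print(shared_delta["attribute_added"]["root.x"].y)    # 2
print(isolated_delta["attribute_added"]["root.x"].y)  # 1
```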

laike9m avatar laike9m commented on August 15, 2024

Cool, this will solve my problem and I can remove deepcopy from my code. Thanks. In my case, it is not the diff object that gets modified, but the object it references, like this:

a1 = A()
a2 = A()
a1.x = a2
a1.x.y = 2  # a1.x is modified

"""
delta=<Delta: {
    'attribute_added': {
         'root.x': <test_attribute.test_attribute.<locals>.A object at 0x109a52910>
    }
}>
"""

laike9m avatar laike9m commented on August 15, 2024

It seems all features are stable. May I ask when you plan to release 5.0?

seperman avatar seperman commented on August 15, 2024

That's a great question. What is remaining is now getting to 100% test coverage. It is at 99%. And believe it or not that last 1% coverage always takes a little longer than it sounds like.
Also updating the docs.
Last but not least, there is a huge performance issue when it comes to Numpy since I'm not doing Numpy calculations efficiently and I'm working on it right now.
So I would say probably releasing by the end of this week.

laike9m avatar laike9m commented on August 15, 2024

No worries, performance is indeed important and worth spending time on.
