drewmccormack / ensembles
A synchronization framework for Core Data.
License: MIT License
In the event integrator, it is not necessary to process changes for an object if they all come from the local persistent store.
This is how it should work:
When a device first joins, it generates a baseline event with all of the objects in its database as inserts. This event will have a global count of 0.
When two baseline events are discovered, they need to be combined. If one is found to be a subset of the other, meaning every event it includes is also present in the other, it can be removed.
If neither baseline is a subset of the other, the union must be taken and put into a new baseline, with the original baselines deleted. We simply go through the inserts in each baseline, keeping one for each object, using global count to choose which one to keep, as usual.
Each baseline belongs to a baseline-history. As long as a baseline is just based on a previous baseline, plus any recent changes, it is in the same baseline history. However, if two disjoint baselines are merged with a union, the new baseline belongs to a new baseline-history.
The reason this is significant is that each event store will have to store the baseline-history id that the Core Data store is based on. If the history id of the current baseline differs from that stored in the event store, the Core Data store needs a rebuild from the new baseline. This involves applying the inserts from the baseline, and all newer change sets, to the existing Core Data store. A baseline can only ever include inserts, and a merge can only ever be a superset, so we should not need to figure out what objects in the current Core Data store need to be deleted or anything like that.
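The subset/union rule above can be sketched as follows. This is an illustrative model only, with a baseline represented as a dict mapping global object id to the global count of that object's insert; the real Ensembles types and APIs differ.

```python
def merge_baselines(a, b):
    """Combine two baselines per the rules above (illustrative model:
    each baseline maps global object id -> global count of its insert)."""
    # If one baseline's events are a subset of the other's, it is redundant.
    if set(a.items()) <= set(b.items()):
        return b
    if set(b.items()) <= set(a.items()):
        return a
    # Otherwise take the union, keeping one insert per object and using
    # the global count to choose which one to keep, as usual.
    merged = dict(a)
    for obj_id, count in b.items():
        if obj_id not in merged or count > merged[obj_id]:
            merged[obj_id] = count
    return merged
```

Because a merge of disjoint baselines is always a superset of both inputs, the result can safely replace either original.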
The baseline in a single baseline-history needs to be updated regularly, compacting recent events into it.
To determine which events to include, we can use the approach from this blog post: http://blog.helftone.com/clear-in-the-icloud/
In short, we decide on a global count that will be the cutoff. We determine the supremum of local revisions, and that becomes the set of events included, as well as the revision numbers assigned to the baseline. The global count used for the baseline is the minimum of the global counts on each device’s most recent event.
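The cutoff computation described above could be sketched as below, assuming we know the global counts of each device's events (names are illustrative):

```python
def baseline_global_count(event_counts_by_device):
    """Cutoff for baseline compaction: the minimum, over all devices,
    of each device's most recent (maximum) event global count.
    `event_counts_by_device` maps device id -> list of global counts."""
    return min(max(counts) for counts in event_counts_by_device.values())
```

Any event at or below this count is guaranteed to have been seen by every device, so it is safe to fold into the baseline.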
Introduce random delays in file copy operations and have merge operations occur simultaneously, to simulate unexpected changes
If a save to the monitored store should occur during leeching, data may be lost. Saves should be monitored, and the leech process reinitiated or terminated with an error if a save to the store is observed.
A file should be added for each persistent store identifier to the cloud when leeching. It could just contain the creation date, but it is not so important what is in it.
A check should be made on launch and before a merge, as to whether the file exists for the local store. If not, data corruption has occurred, and a forced deleech should be the result.
The leeched device should be added to the cloud files at the end of the leeching process, when everything is already imported. That way, if a crash occurs before that, it will be flagged as a problem next time a sync is done, and the store automatically deleeched.
To support legacy stores, keep a flag in the store metadata to indicate whether the device check should be carried out.
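The launch-time check described above, with one registration file per persistent store identifier, could look roughly like this. The directory layout and function names are hypothetical:

```python
import os

def device_file_path(cloud_dir, store_id):
    # Directory layout is illustrative; the framework chooses its own names.
    return os.path.join(cloud_dir, "stores", store_id)

def store_is_registered(cloud_dir, store_id, exists=os.path.exists):
    """Check at launch and before a merge. A missing file for the local
    store means data corruption has occurred, and a forced deleech
    should be the result."""
    return exists(device_file_path(cloud_dir, store_id))
```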
The CDEEventBuilder class currently uniques global identifiers across all entities. It would be better to unique on a per-entity basis.
E.g. a Tag entity might use its text as its global identifier (e.g. "Car"). Another entity may take a similar approach, making a collision on the global identifier likely.
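Per-entity uniquing amounts to keying the identifier map on an (entity, id) pair rather than the id alone. A minimal sketch (class and method names are hypothetical, not CDEEventBuilder's API):

```python
class GlobalIdentifierRegistry:
    """Unique global identifiers per (entity, id) pair, so a Tag "Car"
    cannot collide with another entity's "Car"."""
    def __init__(self):
        self._objects = {}

    def register(self, entity_name, global_id, obj):
        key = (entity_name, global_id)          # per-entity uniquing
        if key in self._objects:
            return self._objects[key]           # existing object wins
        self._objects[key] = obj
        return obj
```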
At the moment, CDEObjectChange handles to-many relationship changes by storing sets of added and removed global ids. This would need to be generalized to include an ordering parameter.
The removed set would not need indexes, but the added would. It may be difficult to guarantee proper ordering when there are conflicts.
Add an extra to-many entity to Idiomatic that can be used to store an image for each note.
To ensure that saved changes are not lost, it is important to monitor save notifications during import of a persistent store, and reimport if a save to the monitored store occurs.
In the full sync tests, add conflicting updates to relationships, and ensure all stores end up in the same state. Try all three relationship types.
Object changes from the local device that are first in a set of changes can be ignored.
You should be able to tap a setup button and choose from a list of cloud file systems.
This is because other code may have already modified values after the save, before Ensembles gets to access the objects.
Record a model hash in each event. When merging, check that all new events have hashes in the model versions. If there is one or more not known, the merge should fail with appropriate error code.
The ensemble should keep recording save events, but merges would require upgrading to get the new model version.
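The model-hash gate described above could be sketched as below. The function name, hash values, and error are all illustrative, not the Ensembles API:

```python
def can_merge(new_event_hashes, known_model_hashes):
    """All hashes recorded in new events must correspond to model
    versions known locally; otherwise the merge fails and the app must
    upgrade to the new model version before merging again."""
    unknown = set(new_event_hashes) - set(known_model_hashes)
    if unknown:
        raise ValueError("unknown model versions: %s" % sorted(unknown))
    return True
```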
Test that any fixes are properly captured in the store mod event.
There are several places that create store modification events. Some can take a while (e.g. import), and use a lot of memory.
To allow intermediate saves before the event is fully built, store an id for the currently building event in the event store metadata. If a crash occurs before the event can be finished, this can be detected at next launch, and the event deleted or otherwise handled as appropriate.
Events can be mandatory and non-mandatory. A mandatory event must complete, or the store will not be in a valid state. If a mandatory event is incomplete on startup, deleech and report error.
If a non-mandatory event is incomplete, just delete the event on launch.
Cases where incomplete events should be registered
In CDEPersistentStoreEnsemble, check on init for incomplete events
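The launch-time recovery described above might look like the following sketch, with events modelled as plain dicts and all names hypothetical:

```python
def recover_incomplete_events(metadata, events):
    """On launch, look up the ids of events that were still being built
    (recorded in the event store metadata). Delete incomplete
    non-mandatory events; an incomplete mandatory event means the store
    is not in a valid state, so return True to signal a deleech."""
    must_deleech = False
    for event_id in metadata.get("building_event_ids", []):
        event = events.get(event_id)
        if event is None:
            continue                     # event finished and was cleaned up
        if event["mandatory"]:
            must_deleech = True          # report error and deleech
        else:
            del events[event_id]         # safe to discard
    return must_deleech
```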
NSEnumerator was showing unexpected/buggy behaviour when used with arrays of fetched objects. The fetched arrays are probably not true NSArray objects, and perhaps do not work well with NSEnumerator.
A scan of the framework should be made to remove uses of NSEnumerator.
This should be in the CDEPersistentStoreEnsemble class.
Method could be persistentStoreEnsemble:shouldImportPersistentStoreAtURL:. Returning NO results in no import. Returning YES causes a basic migration to take place.
Manual migration could occur here if desired when returning NO.
This could be useful, though it might be better to write our own code rather than get into technical debt, if the code is not too complex.
At the moment, the whole merge fails. Should we just continue regardless?
I think we should fail, but with a clear error code indicating that the child context modified by the developer could not be saved.
If a tag is selected when creating a new note in idiomatic, add that tag automatically to the new note.
Needs new UI to choose a cloud syncing service, and login.
It is possible for someone to reinsert an object with the same global identifier as one that has previously been deleted. Make a test to ensure that if this happens, the object does end up reinserted.
At the moment, we take the events added since the last merge, and then add to that set the events that are concurrent with those new events.
It is conceivable that this may be inadequate. It may be necessary to recursively keep adding concurrent events until the set no longer changes.
Alternatively, we could move to the approach used by clear, where we include all events that have a global count greater or equal to the smallest global count in the new set of events.
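Both options above can be sketched briefly. The first is a fixpoint closure over concurrency; the second is the Clear-style global-count cutoff. The callback and event representation are illustrative only:

```python
def events_to_merge(new_events, concurrent_with):
    """Fixpoint closure: keep adding events concurrent with the set
    until it stops growing. `concurrent_with(e)` yields the events
    concurrent with e (hypothetical callback, not the real API)."""
    merged = set(new_events)
    while True:
        extra = set()
        for event in merged:
            extra |= set(concurrent_with(event)) - merged
        if not extra:
            return merged
        merged |= extra

def events_to_merge_by_count(all_events, new_events):
    """Clear-style alternative: include every event whose global count
    is >= the smallest global count among the new events. Events are
    (id, global_count) pairs in this sketch."""
    cutoff = min(count for _, count in new_events)
    return {e for e in all_events if e[1] >= cutoff}
```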
In the full sync tests, add a to-many relationship, with an update
Add a property on the ensemble to set a model configuration. Make sure this is used when accessing the model through the framework.
Cells not onscreen when the button is pressed do not show download progress. It would be necessary to add the appropriate observers when creating new cells.
We can add decorator classes to enhance cloud file systems. One option is an encrypting class. Another is a zipping/compressing class. These could even be used together.
Not really an issue, but a question.
I've coded a sample Mac app to test Ensembles, adding an "IdiomaticMac" target to the current sample project.
Since it's a big change, and I don't know whether you already have a Mac version of the test app planned, I thought I'd ask before sending a pull request. It's currently working quite well, and shares notes with the iOS version of the app just fine.
Since it's just for testing, I put most of the code in the app delegate instead of creating separate controllers.
You can have a look at it in this branch of my fork of Ensembles:
https://github.com/erndev/ensembles/tree/IdiomaticMac
Let me know if you find it interesting, and I can send the pull request. Also, any changes/improvements/suggestions are appreciated.
Cheers
Currently, we determine deltas for to-many relationships in the willSave notification. But objects can be changed after this, in the validation and merging phases.
So instead, just store the object ids for the to-many relationships in willSave, and use them to determine deltas in didSave.
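The delta computation itself is a simple set difference between the two snapshots. A minimal sketch (function name is illustrative):

```python
def to_many_delta(ids_at_will_save, ids_at_did_save):
    """Record the relationship's object ids in willSave, then diff them
    against the final membership in didSave, so changes made during
    validation and merging are not missed. Returns (added, removed)."""
    before, after = set(ids_at_will_save), set(ids_at_did_save)
    return after - before, before - after
```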
The Core Data documentation states that all access to a queue-concurrent MOC should be from performBlock:, including setting parent contexts and other initialisation.
Check that this is the case over the whole framework.
Just before the CDEEventIntegrator inserts a new object, it first checks if one exists with that global identifier, and will not regenerate the object if it does. This check should also confirm that the existing object is not already deleted. If it is, a new object should be created.
I'm playing with CDEAsynchronousTaskQueue. It's great, but it would be nice if it exposed some measure of progress, e.g. [tasks count] and, maybe, -currentTaskIndex.
The count seems simple enough. Setting the current index could be done after pulling the task from the enumerator in -startNextTask.
If this seems at all desirable, I'm happy to put together a pull request.
Should have a local cache on each device. Would locate another device using Bonjour, and should go through a pairing procedure. Each device would then store a 4-digit number that is used to handshake.
The sync procedure would simply involve each device sending the files it currently has. The other device would then send back any new files it has. This sync would not necessarily have to happen at the same time as an Ensembles merge. Usually, it would be wise to do the file sync first, and after that trigger a merge.
If a crash occurs at inopportune times during a merge, it is possible that the persistent store does not represent the ensemble events.
To detect this, store a unique token at the ‘point of no return’, when the results of the merge are being committed. Remove this token after a successful merge.
The token could be stored in the metadata of the CDEEventStore. It could track all building events as a set of tokens, and offer a method to add, remove, and retrieve the tokens.
If the token is discovered to exist at launch, it indicates a crash occurred during committing, and a forced deleech should occur.
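The token lifecycle described above, add, remove, and check at launch, could be sketched like this (class and method names are hypothetical):

```python
class CommitTokenStore:
    """Sketch of the 'point of no return' tokens kept in the event
    store metadata."""
    def __init__(self):
        self._tokens = set()

    def add(self, token):
        self._tokens.add(token)       # just before committing the merge

    def remove(self, token):
        self._tokens.discard(token)   # after a successful merge

    def needs_deleech(self):
        # Any token present at launch means a crash occurred mid-commit.
        return bool(self._tokens)
```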
(Turns out this is because ordered properties aren't supported yet)
Write tests to ensure pre- and post-merge delegate methods are called, and that changes made in the delegate method are captured in the merge event.
The idea would be to invoke the ‘decorator’ design pattern to make a class that conforms to CDECloudFileSystem, but also wraps an existing CDECloudFileSystem, and encrypts data as it is uploaded, and decrypts as it downloads. In this way, the rest of the framework would not need to know about the encryption, but the data in the cloud would be encrypted.
The init… method of the class would take another cloud file system, as well as any keys etc it needs to encrypt/decrypt.
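A minimal sketch of the decorator pattern described above. XOR stands in for a real cipher, the method names are illustrative rather than CDECloudFileSystem's, and MemoryFileSystem is a stand-in backend for demonstration:

```python
class MemoryFileSystem:
    """Stand-in backend; holds files in a dict."""
    def __init__(self):
        self.files = {}
    def upload(self, path, data):
        self.files[path] = data
    def download(self, path):
        return self.files[path]

class EncryptingFileSystem:
    """Decorator: wraps another file system, encrypting data as it is
    uploaded and decrypting as it is downloaded. The rest of the
    framework never sees the ciphertext."""
    def __init__(self, wrapped, key):
        self.wrapped = wrapped
        self.key = key

    def _crypt(self, data):
        # XOR is symmetric, so the same routine encrypts and decrypts.
        return bytes(b ^ self.key[i % len(self.key)] for i, b in enumerate(data))

    def upload(self, path, data):
        self.wrapped.upload(path, self._crypt(data))

    def download(self, path):
        return self._crypt(self.wrapped.download(path))
```

Because the decorator conforms to the same interface it wraps, decorators can be stacked, e.g. compression around encryption.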
Include updating an object never inserted
Test double deletion of an object
Test delete when object was never inserted
Test inserting after deletion. Object should survive
If it is possible for properties to change after willSave, we will need to adapt to ensure the true changes are committed, not the preliminary ones.
One thing to consider is whether the objects in the willSave notification have already had the merge policy applied. If not, they may not represent the actual values saved, and we should only trust final values in didSave, using the willSave notification just to store the committed values.
This will require enhancements to all of the classes involved in the merge.
A cancelled merge should produce an error with a ‘cancelled’ code.
Generate a random set of data for a standard model that includes all important attribute types and relationships.
Make random changes, and test that two coupled stores end up in the same state.
The CDESaveMonitor class currently stores updated values from the contextWillSave notification in a dictionary. If the context does not completely save, and then is released, the objects will be retained unnecessarily.
To prevent this, replace the dictionary with a NSMapTable, and make sure the key (context) is weak. (At the moment it is an NSValue.)
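The same weak-key pattern can be illustrated in Python, where `weakref.WeakKeyDictionary` plays the role of an NSMapTable with weak keys. The Context class is a stand-in for NSManagedObjectContext:

```python
import gc
import weakref

class Context:
    """Stand-in for NSManagedObjectContext; illustrative only."""
    pass

# Entries disappear automatically once the key (the context) is
# released, so an unfinished save cannot retain its values forever.
saved_values = weakref.WeakKeyDictionary()

def record_save(context, updated_values):
    saved_values[context] = updated_values   # key weak, values strong

ctx = Context()
record_save(ctx, {"updated": ["note1"]})
del ctx        # context released without completing its save
gc.collect()   # its entry is gone; nothing is retained unnecessarily
```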
I have this in another repository already, and will integrate it.
You can't just check out and build right now, because this file is missing: DBRestClient+OSX.h
It would be useful to have some high-level logging for tracing overall operations. This should not be excessive, but should indicate when the framework starts and finishes tasks.
Migrating a whole store is expensive. It takes a long time, and it uses a lot of memory if the whole store has been loaded into RAM.
To get around this, we will allow guidance from the app developer. They can add keys to the user info in the Core Data model, and these will be used to steer the migration.
The user info keys will basically set a migration priority on entities, and a batch size, with appropriate defaults when the keys are not included.
When no batch size is given, no intermediate saves will occur. The app developer will need to order the entities, and set batch saving options, to ensure that valid saves can occur.
Another similar option is to take the advice of Apple when it comes to migration. Apple recommends large migrations are broken into several smaller ones, or a few entities each. We could do the same by making entity groups (i.e. assign a tag to each entity for the migration group). Each group would be migrated together. But this may not be as granular as the batch option discussed above.
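The priority-plus-batching scheme could be sketched as below. The priorities and batch size would come from the user-info keys on the model; all names and the callback are illustrative:

```python
def migrate_in_batches(objects_by_entity, priorities, batch_size, save):
    """Migrate entities in priority order, saving every `batch_size`
    objects to bound memory use. A batch_size of 0 means no
    intermediate saves, matching the default described above."""
    ordered = sorted(objects_by_entity, key=lambda e: priorities.get(e, 0))
    pending = []
    for entity in ordered:
        for obj in objects_by_entity[entity]:
            pending.append((entity, obj))     # migrate obj here
            if batch_size and len(pending) >= batch_size:
                save(pending)                 # intermediate save
                pending = []
    if pending:
        save(pending)                         # final save
```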
From Apple Core Data Docs:
Multiple Passes—Dealing With Large Datasets
The basic approach shown above is to have the migration manager take two models, and then iterate over the steps (mappings) provided in a mapping model to move the data from one side to the next. Because Core Data performs a "three stage" migration—where it creates all of the data first, and then relates the data in a second stage—it must maintain “association tables" (which tell it which object in the destination store is the migrated version of which object in the source store, and vice-versa). Further, because it doesn't have a means to flush the contexts it is working with, it means you'll accumulate many objects in the migration manager as the migration progresses.
In order to address this, the mapping model is given as a parameter of the migrateStoreFromURL:type:options:withMappingModel:toDestinationURL:destinationType:destinationOptions:error: call itself. What this means is that if you can segregate parts of your graph (as far as mappings are concerned) and create them in separate mapping models, you could do the following:
Get the source and destination data models
Create a migration manager with them
Find all of your mapping models, and put them into an array (in some defined order, if necessary)
Loop through the array, and call migrateStoreFromURL:type:options:withMappingModel:toDestinationURL:destinationType:destinationOptions:error: with each of the mappings
This allows you to migrate "chunks" of data at a time, while not pulling in all of the data at once.
From a "tracking/showing progress” point of view, that basically just creates another layer to work from, so you'd be able to determine percentage complete based on number of mapping models to iterate through (and then further on the number of entity mappings in a model you've already gone through).
E.g. cloudFileSystem:didDetectNewFilesAtPaths:
This method could be used directly by the user code to decide to merge, or it could be used internally by the ensemble to generate a second merge hint delegate call for the user code. The second approach would be interesting, because the ensemble has more information, and can decide whether the existing files are enough to allow a merge to proceed.
Note that iCloud has means of determining when new files arrive, as does Dropbox (http://www.dropbox.com/developers/blog/63).