
ensembles's People

Contributors

cflorion, dgwilson, dhennessy, drewmccormack, evands, fdstevex, gorbannastya, johnnye, kellyroach, keremerkan, kunalsood, laynemoseley, readmecritic, terhechte, uberjason, xjki, xkmo

ensembles's Issues

Implement rebasing

This is how it should work:

When a device first joins, it generates a baseline event with all of the objects in its database as inserts. This event will have a global count of 0.

When two baseline events are discovered, they need to be combined. If one is discovered to be a subset of the other, because all events that it includes are also in the other event, then it can be removed.

If neither baseline is a subset of the other, the union must be taken and put into a new baseline, with the original baselines deleted. We simply go through the inserts in each baseline, keeping one for each object, using global count to choose which one to keep, as usual.
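
The subset test and the union described above can be sketched as follows. This is only an illustration, not the framework's implementation: the `Insert` record, the assumption that each insert carries its own global count, and the rule that the higher global count wins are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    global_id: str      # global identifier of the inserted object
    global_count: int   # assumed: global count of the event the insert came from
    values: tuple       # property values, kept opaque here


def is_subset(event_ids_a, event_ids_b):
    """True if every event folded into baseline A is also in baseline B."""
    return set(event_ids_a) <= set(event_ids_b)


def merge_baselines(a, b):
    """Return the union of two baselines, keeping one insert per object."""
    merged = {}
    for insert in list(a) + list(b):
        current = merged.get(insert.global_id)
        # Assumption: the higher global count wins; ties keep the first seen.
        if current is None or insert.global_count > current.global_count:
            merged[insert.global_id] = insert
    return list(merged.values())
```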

Each baseline belongs to a baseline-history. As long as a baseline is just based on a previous baseline, plus any recent changes, it is in the same baseline history. However, if two disjoint baselines are merged with a union, the new baseline belongs to a new baseline-history.

The reason this is significant is that each event store will have to store the baseline-history id that the Core Data store is based on. If the history id of the current baseline differs from that stored in the event store, the Core Data store needs a rebuild from the new baseline. This involves applying the inserts from the baseline, and all newer change sets, to the existing Core Data store. A baseline can only ever include inserts, and a merge can only ever be a superset, so we should not need to figure out what objects in the current Core Data store need to be deleted or anything like that.

The baseline in a single baseline-history needs to be updated regularly, compacting the new events.
To find the included events, we use this blog post: http://blog.helftone.com/clear-in-the-icloud/

In short, we decide on a global count that will be the cutoff. We determine the supremum of local revisions, and that becomes the set of events included, as well as the revision numbers assigned to the baseline. The global count used for the baseline is the minimum of the global counts on each device’s most recent event.
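
A rough sketch of the cutoff calculation, assuming events are plain dictionaries with hypothetical `device`, `revision` and `global_count` keys, and assuming that an event is included when its global count is at or below the cutoff:

```python
def baseline_cutoff(events):
    """Minimum of the global counts on each device's most recent event."""
    latest = {}
    for event in events:
        device = event["device"]
        if device not in latest or event["revision"] > latest[device]["revision"]:
            latest[device] = event
    return min(event["global_count"] for event in latest.values())


def included_events(events, cutoff):
    """Assumed rule: events at or below the cutoff are folded into the baseline."""
    return [event for event in events if event["global_count"] <= cutoff]


def revision_supremum(events):
    """Highest local revision per device among the given events."""
    supremum = {}
    for event in events:
        device = event["device"]
        supremum[device] = max(supremum.get(device, -1), event["revision"])
    return supremum
```

The supremum of the included events would then serve as the revision numbers assigned to the new baseline.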

System tests with random delays

Introduce random delays in file copy operations and have merge operations occur simultaneously, to simulate unexpected changes.

Add persistent store identifier to cloud files, and regularly check for existence.

A file should be added for each persistent store identifier to the cloud when leeching. It could just contain the creation date, but it is not so important what is in it.

A check should be made on launch and before a merge, as to whether the file exists for the local store. If not, data corruption has occurred, and a forced deleech should be the result.

The leeched device should be added to the cloud files at the end of the leeching process, when everything is already imported. That way, if a crash occurs before that, it will be flagged as a problem next time a sync is done, and the store automatically deleeched.

To support legacy stores, we need to keep a flag in the store metadata to indicate whether the device check should be carried out.
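
The check itself could be as small as the sketch below. The `stores/<identifier>` path layout and the function name are made up for illustration; the cloud directory is modelled as a plain set of paths.

```python
def should_force_deleech(cloud_paths, store_identifier, check_enabled):
    """Decide whether the local store must be deleeched.

    check_enabled mirrors the legacy flag in the store metadata:
    older stores never registered a file, so they skip the check.
    """
    if not check_enabled:
        return False
    # Hypothetical layout: one marker file per persistent store identifier.
    return f"stores/{store_identifier}" not in cloud_paths
```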

Unique global identifiers on a per-entity basis.

The CDEEventBuilder class currently uniques global identifiers across all entities. It would be better to unique on a per-entity basis.

For example, a Tag class might use its text (e.g. Car) as its global identifier. If another entity takes a similar approach, a conflict on the global identifier is likely to occur.
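
To illustrate the difference, here is a small sketch that uniques either across all entities or per entity. The function and the `(entity, global_id)` tuples are hypothetical and do not reflect how CDEEventBuilder is written.

```python
def find_conflicts(objects, per_entity):
    """Return the (entity, global_id) pairs that collide with an earlier one."""
    seen = set()
    conflicts = []
    for entity, global_id in objects:
        # Per-entity uniquing makes the entity name part of the key.
        key = (entity, global_id) if per_entity else global_id
        if key in seen:
            conflicts.append((entity, global_id))
        seen.add(key)
    return conflicts
```

With a Tag and a Vehicle that both use "Car" as identifier, only the cross-entity variant reports a conflict.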

Add support for ordered relationships

At the moment, CDEObjectChange handles to-many relationship changes by storing sets of added and removed global ids. This would need to be generalized to include an ordering parameter.

The removed set would not need indexes, but the added would. It may be difficult to guarantee proper ordering when there are conflicts.

Pause merges if an unknown model version is encountered

Record a model hash in each event. When merging, check that all new events have hashes that match a known model version. If one or more are unknown, the merge should fail with an appropriate error code.

The ensemble should keep recording save events, but merges would require upgrading to get the new model version.
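
A minimal sketch of the pre-merge check, assuming each event is a dictionary with a hypothetical `model_hash` key and that the error is surfaced as an exception:

```python
class UnknownModelVersionError(Exception):
    """Raised when an event was created with a model this app does not know."""


def check_model_hashes(new_events, known_hashes):
    """Fail the merge if any new event uses an unknown model version."""
    unknown = {event["model_hash"] for event in new_events} - set(known_hashes)
    if unknown:
        raise UnknownModelVersionError(sorted(unknown))
```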

Store token to detect crashes during event builds

There are several places that store mod events. Some can take a while (e.g. import) and use a lot of memory.

To allow intermediate saves before the event is fully built, store an id for the currently building event in the event store metadata. If a crash occurs before the event can be finished, this can be detected at next launch, and the event deleted, or selected as appropriate.

Events can be mandatory and non-mandatory. A mandatory event must complete, or the store will not be in a valid state. If a mandatory event is incomplete on startup, deleech and report error.

If a non-mandatory event is incomplete, just delete the event on launch.

  • Add support for saving/restoring incomplete event ids in CDEEventStore
  • Include a mandatory flag

Cases where incomplete events should be registered

  • Use this for saves (mandatory)
  • Merges (not mandatory)
  • Migrating events into store from files

In CDEPersistentStoreEnsemble, check on init for incomplete events

  • For a mandatory event, deleech and inform of error
  • For a non-mandatory event, delete the event (if found), and remove from incomplete list
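
The launch-time check might look roughly like this. The dictionary of incomplete event ids and the two callbacks are placeholders for whatever CDEEventStore and the ensemble would actually provide.

```python
def recover_on_launch(incomplete, delete_event, deleech):
    """Handle events that were still being built when the app last quit.

    incomplete maps event id -> mandatory flag.
    """
    if any(incomplete.values()):
        # A mandatory event did not finish: the store is not in a valid state.
        deleech()
        return "deleeched"
    for event_id in list(incomplete):
        delete_event(event_id)      # drop the partial, non-mandatory event
        del incomplete[event_id]    # and remove it from the incomplete list
    return "cleaned"
```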

Remove all uses of NSEnumerator

NSEnumerator was showing unexpected/buggy behaviour when used with arrays of fetched objects. The fetched arrays are probably not true NSArray objects, and perhaps do not work well with NSEnumerator.

A scan of the framework should be made to remove uses of NSEnumerator.

The system for determining the active events in a merge may not be correct

At the moment, we take the events added since the last merge, and then add to that set the events that are concurrent with those new events.

It is conceivable that this may be inadequate. It may be necessary to recursively keep adding concurrent events until the set no longer changes.

Alternatively, we could move to the approach used by Clear, where we include all events that have a global count greater than or equal to the smallest global count in the new set of events.
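
The recursive variant is a fixed-point iteration. The sketch below assumes each event carries a vector clock (a dictionary from device to revision) and treats two events as concurrent when neither clock dominates the other; both are assumptions made for the example.

```python
def concurrent(a, b):
    """Two vector clocks are concurrent if neither dominates the other."""
    devices = set(a) | set(b)
    a_le_b = all(a.get(d, 0) <= b.get(d, 0) for d in devices)
    b_le_a = all(b.get(d, 0) <= a.get(d, 0) for d in devices)
    return not a_le_b and not b_le_a


def active_events(clocks, new_ids):
    """Grow the set of new events until no further concurrent event is found."""
    active = set(new_ids)
    changed = True
    while changed:
        changed = False
        for event_id, clock in clocks.items():
            if event_id in active:
                continue
            if any(concurrent(clock, clocks[a]) for a in active):
                active.add(event_id)
                changed = True
    return active
```

An event that is concurrent only with an event added in an earlier pass is picked up in a later pass, which is exactly the case a single pass would miss.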

Support model configurations

Add a property on the ensemble to set a model configuration. Make sure this is used when accessing the model through the framework.

Idiomatic Sample App for Mac

Not really an issue, but a question.

I've coded a sample Mac app to test Ensembles, adding an "IdiomaticMac" target to the current sample project.
Since it's a big change, and I don't know whether you have already planned a Mac version of the test app, I thought I'd ask first before sending a pull request. It's currently working quite well, and shares Notes with the iOS version of the app just fine.

Since it is just for testing, I put most of the code in the app delegate instead of creating different controllers.

You can have a look at it in this branch of my fork of Ensembles:

https://github.com/erndev/ensembles/tree/IdiomaticMac

Let me know if you find it interesting and I can send the pull request. Also, any changes/improvements/suggestions are appreciated.

cheers

Determine to-many deltas in didSave notification

Currently, we determine deltas for to-many notifs in the willSave notif. But objects can be changed after this in the validation and merging phases.
So instead, just store object ids for the to-many relationships in willSave, and use them to determine deltas in didSave.
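
The delta itself is just a pair of set differences. In this sketch `committed_ids` stands for the ids captured in willSave and `final_ids` for the ids read in didSave; both names are illustrative.

```python
def to_many_delta(committed_ids, final_ids):
    """Return (added, removed) ids for one to-many relationship."""
    before = set(committed_ids)
    after = set(final_ids)
    return after - before, before - after
```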

Ensure existing objects found during an insert are not already deleted

Just before the CDEEventIntegrator inserts a new object, it first checks if one exists with that global identifier, and will not regenerate the object if it does. This check should also confirm that the existing object is not already deleted. If it is, a new object should be created.
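
As a sketch of the proposed check, with objects modelled as dictionaries and a hypothetical `is_deleted` flag standing in for the managed object's deleted state:

```python
def object_for_insert(objects_by_global_id, global_id, create):
    """Reuse a live object with this global id, otherwise create a new one."""
    existing = objects_by_global_id.get(global_id)
    if existing is not None and not existing.get("is_deleted", False):
        return existing
    # Either nothing was found, or what was found is already deleted.
    new_object = create(global_id)
    objects_by_global_id[global_id] = new_object
    return new_object
```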

Add measure of progress to CDEAsynchronousTaskQueue

I'm playing with CDEAsynchronousTaskQueue. It's great but it would be nice if it exposed some measure of progress, e.g. [tasks count] and, maybe, -currentTaskIndex.

The count seems simple enough. Setting the current index could be done after pulling the task from the enumerator in -startNextTask.

If this seems at all desirable I'm happy to put together a pull request.
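
For what it's worth, the idea could look something like the following. This is a synchronous stand-in written for illustration, not the CDEAsynchronousTaskQueue API; `task_count`, `current_task_index` and the `on_progress` callback are made-up names.

```python
class TaskQueue:
    """Runs tasks in order and exposes a simple measure of progress."""

    def __init__(self, tasks):
        self._tasks = list(tasks)
        self.current_task_index = -1   # -1 means no task has started yet

    @property
    def task_count(self):
        return len(self._tasks)

    def run(self, on_progress=None):
        for index, task in enumerate(self._tasks):
            # Set the index right after pulling the next task.
            self.current_task_index = index
            if on_progress is not None:
                on_progress(index, self.task_count)
            task()
```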

Add WiFi/Multipeer cloud file system

Should have a local cache on each device. Would locate another device using Bonjour, and should go through a pairing procedure. Each device would then store a 4-digit number that is used to handshake.

The sync procedure would simply involve each device sending the files that it currently has. The other device would then send back any new files it has. This sync would not necessarily have to happen at the same time as an ensembles merge. Usually, it would be wise to do the file sync first, and after that trigger a merge.
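
Ignoring discovery, pairing and transport, the exchange reduces to two set differences. In this sketch each device's local cache is modelled as a set of file names, which is an assumption made to keep the example small.

```python
def sync_files(device_a, device_b):
    """Exchange missing files between two caches, in place.

    Returns (files sent to B, files sent to A).
    """
    sent_to_b = device_a - device_b
    sent_to_a = device_b - device_a
    device_b.update(sent_to_b)
    device_a.update(sent_to_a)
    return sent_to_b, sent_to_a
```

After the exchange both caches hold the same files, and a merge can be triggered on each side.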

Store tokens to guarantee data consistency during merge operations

If a crash occurs at inopportune times during a merge, it is possible that the persistent store does not represent the ensemble events.

To detect this, store a unique token at the ‘point of no return’, when the results of the merge are being committed. Remove this token after a successful merge.

The token could be stored in the metadata of the CDEEventStore. It could track all building events as a set of tokens, and offer a method to add, remove, and retrieve the tokens.

If the token is discovered to exist at launch, it indicates a crash occurred during committing, and a forced deleech should occur.
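
One way to picture the token handling is a context manager wrapped around the commit phase. The metadata dictionary and the `merge_tokens` key are stand-ins for the CDEEventStore metadata, and an exception stands in for a crash.

```python
from contextlib import contextmanager


@contextmanager
def commit_token(metadata, token):
    """Record a token before committing; remove it only on success."""
    tokens = metadata.setdefault("merge_tokens", set())
    tokens.add(token)
    yield
    # Not reached if the body raises, so the token survives a failed commit.
    tokens.discard(token)


def needs_deleech(metadata):
    """A leftover token at launch means a commit was interrupted."""
    return bool(metadata.get("merge_tokens"))
```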

Add an encrypting CDECloudFileSystem

The idea would be to invoke the ‘decorator’ design pattern to make a class that conforms to CDECloudFileSystem, but also wraps an existing CDECloudFileSystem, and encrypts data as it is uploaded, and decrypts as it downloads. In this way, the rest of the framework would not need to know about the encryption, but the data in the cloud would be encrypted.

The init… method of the class would take another cloud file system, as well as any keys etc it needs to encrypt/decrypt.
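
The shape of the decorator might be as follows. The `upload`/`download` method names are simplified placeholders rather than the CDECloudFileSystem protocol, and the cipher is injected as two callables so that no particular encryption library is implied.

```python
class InMemoryFileSystem:
    """Trivial backing store used only for this sketch."""

    def __init__(self):
        self.files = {}

    def upload(self, path, data):
        self.files[path] = data

    def download(self, path):
        return self.files[path]


class EncryptingFileSystem:
    """Wraps another file system, encrypting on upload and decrypting on download."""

    def __init__(self, wrapped, encrypt, decrypt):
        self._wrapped = wrapped
        self._encrypt = encrypt   # callable: bytes -> bytes
        self._decrypt = decrypt   # callable: bytes -> bytes

    def upload(self, path, data):
        self._wrapped.upload(path, self._encrypt(data))

    def download(self, path):
        return self._decrypt(self._wrapped.download(path))
```

Because the wrapper exposes the same methods as the file system it wraps, the rest of the code can use either one without knowing which it has.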

Investigate if the context willSave notif has final object values

If it is possible for properties to change after willSave, we will need to adapt to ensure the true changes are committed, not the preliminary ones.
One thing to consider is whether the objects in the willSave notif have already had the merge policy applied. If not, they do not represent the actual values saved, and we should only trust the final values in didSave, and just store the committed values in willSave.

Add random data system tests

Generate a random set of data for a standard model that includes all important attribute types and relationships.

Make random changes, and test that two coupled stores end up in the same state.

CDESaveMonitor retains objects after context is released

The CDESaveMonitor class currently stores updated values from the contextWillSave notification in a dictionary. If the context does not completely save, and then is released, the objects will be retained unnecessarily.

To prevent this, replace the dictionary with an NSMapTable, and make sure the key (context) is weak. (At the moment it is an NSValue.)
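
The effect of a weak key can be demonstrated with Python's `weakref.WeakKeyDictionary`, used here as an analogue of a weak-keyed NSMapTable. The `SaveMonitor` and `Context` classes are simplified stand-ins, not the real CDESaveMonitor.

```python
import weakref


class Context:
    """Stand-in for a managed object context."""


class SaveMonitor:
    def __init__(self):
        # Entries disappear automatically once the context is released.
        self._changes_by_context = weakref.WeakKeyDictionary()

    def context_will_save(self, context, changed_values):
        self._changes_by_context[context] = changed_values

    def context_did_save(self, context):
        return self._changes_by_context.pop(context, None)

    def pending_count(self):
        return len(self._changes_by_context)
```

If a context is released between willSave and didSave, its entry and the values stored for it go away with it.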

Add tracing info-level logging

It would be useful to have some high-level logging for tracing overall operations. This should not be excessive, but should indicate when the framework starts and finishes tasks.

Provide support for guided migrations to keep memory usage low

Migrating a whole store is expensive. It takes a long time, and it uses a lot of memory if the whole store has been loaded into RAM.

To get around this, we will allow guidance from the app developer. They can add keys to the user info in the Core Data model. These will be used to

  1. Define a fixed order of migration for entities.
  2. Provide batching guidance on how often a save should occur.

The user info keys will basically be a migration priority on entities, and a batching size, with appropriate defaults when the keys are not included.

When no batch size is given, no saves will occur. The app developer will need to make sure to order entities, and set batch saving options, to ensure that valid saves can occur.
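
A sketch of how the two user info keys could drive a migration loop. The key names `migrationPriority` and `migrationBatchSize`, the rule that lower priority numbers migrate first, and the single save at the very end are all assumptions made for the example.

```python
def migrate(entities, objects_by_entity, migrate_object, save):
    """Migrate entities in priority order, saving every batch_size objects."""
    # Assumed: lower priority numbers migrate first; default priority is 0.
    ordered = sorted(entities, key=lambda e: e.get("migrationPriority", 0))
    for entity in ordered:
        batch_size = entity.get("migrationBatchSize")  # None: no intermediate saves
        pending = 0
        for obj in objects_by_entity.get(entity["name"], []):
            migrate_object(entity["name"], obj)
            pending += 1
            if batch_size and pending >= batch_size:
                save()
                pending = 0
    save()  # assumed final save once everything has been migrated
```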

Another similar option is to take the advice of Apple when it comes to migration. Apple recommends that large migrations are broken into several smaller ones, of a few entities each. We could do the same by making entity groups (i.e. assign a tag to each entity for the migration group). Each group would be migrated together. But this may not be as granular as the batch option discussed above.

From Apple Core Data Docs:
Multiple Passes—Dealing With Large Datasets

The basic approach shown above is to have the migration manager take two models, and then iterate over the steps (mappings) provided in a mapping model to move the data from one side to the next. Because Core Data performs a "three stage" migration—where it creates all of the data first, and then relates the data in a second stage—it must maintain “association tables" (which tell it which object in the destination store is the migrated version of which object in the source store, and vice-versa). Further, because it doesn't have a means to flush the contexts it is working with, it means you'll accumulate many objects in the migration manager as the migration progresses.

In order to address this, the mapping model is given as a parameter of the migrateStoreFromURL:type:options:withMappingModel:toDestinationURL:destinationType:destinationOptions:error: call itself. What this means is that if you can segregate parts of your graph (as far as mappings are concerned) and create them in separate mapping models, you could do the following:

  1. Get the source and destination data models
  2. Create a migration manager with them
  3. Find all of your mapping models, and put them into an array (in some defined order, if necessary)
  4. Loop through the array, and call migrateStoreFromURL:type:options:withMappingModel:toDestinationURL:destinationType:destinationOptions:error: with each of the mappings
This allows you to migrate "chunks" of data at a time, while not pulling in all of the data at once.

From a "tracking/showing progress" point of view, that basically just creates another layer to work from, so you'd be able to determine percentage complete based on number of mapping models to iterate through (and then further on the number of entity mappings in a model you've already gone through).

Add CDECloudFileSystem delegate call for when new files are detected

Eg. cloudFileSystem:didDetectNewFilesAtPaths:
This method could be used directly by the user code to decide to merge, or it could be used internally by the ensemble to generate a second merge hint delegate call for the user code. The second approach would be interesting, because the ensemble has more information, and can decide whether the existing files are enough to allow a merge to proceed.

Note that iCloud has a means of determining when new files arrive, as does Dropbox (http://www.dropbox.com/developers/blog/63).
