orbitdb-archive / ipfs-log
Append-only log CRDT on IPFS
Home Page: https://orbitdb.github.io/ipfs-log/
License: MIT License
I upgraded the following dependencies in preparation for js-ipfs 0.30 and ran the tests, they passed \o/:
"datastore-level": "~0.8.0",
"ipfs": "github:ipfs/js-ipfs",
"ipfs-repo": "~0.22.1",
The hashing part, object.get and object.put, should be refactored into its own module.
First, a quick recap.
OrbitDB is progressing towards dynamic access, aka dynamic permissions. As part of this effort, multiple access controller types need to be supported. Currently OrbitDB uses an IPFS-based access controller, and we would like to support OrbitDB-based and smart-contract-based access controllers.
So far so good. To support a smart contract ACL, we have a scheme like this: right now, every entry has a key, which is used to sign the contents of the entry. When a peer receives this update, they use the key to verify that the entry was properly signed.
A smart contract ACL would also like to include some piece of info signed by a key which is in the smart contract. For instance, the smart contract has a user's wallet public key. The user signs the OrbitDB (ODB) public key with that wallet, resulting in a chainSignature.
Just as we verify the entry's signature with the key, we also want to verify the chainSignature with the chainKey.
That was the recap, here is the problem. This chainSignature and chainKey have to be stored somewhere so that peers can use them. Where should we store them?
The entry's key property is one candidate. Optimally, the object would be self-describing, in typical IPFS style. Multikey? Multisig?
The intersection utility function is not used anymore, afaict. We should remove it from the project.
From: ipfs-inactive/dynamic-data-and-capabilities#50
Reminder to update package.json and use the orbit-db-identity-provider npm module here: https://github.com/orbitdb/ipfs-log/blob/master/package.json#L16
The bluebird module is currently used in Log._fetchRecursive() to run promises in series (instead of in parallel). However, bluebird accounts for up to 174kb of the ~300kb ipfs-log dist build, and that's a bit too much for one function.
To replace the Promise.map() provided by bluebird, we can implement a promise series either with .forEach or .reduce, and there are possibly other solutions.
Doing this would be a huge footprint gain for the dist build (300kb --> ~100kb?) and I would like to see this happen asap.
For reference:
https://pouchdb.com/2015/05/18/we-have-a-problem-with-promises.html
https://remysharp.com/2015/12/18/promise-waterfall
https://stackoverflow.com/questions/24586110/resolve-promises-one-after-another-i-e-in-sequence
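If bluebird is dropped, a minimal sequential helper built on .reduce could look like this (a sketch only; mapSeries, items and fn are illustrative names, not existing code):
const mapSeries = (items, fn) =>
  items.reduce(
    (acc, item) => acc.then((results) =>
      fn(item).then((res) => results.concat([res]))),
    Promise.resolve([])
  )
// usage: mapSeries(hashes, (hash) => fetchEntry(hash)).then((entries) => ...)
// fetchEntry here stands for whatever _fetchRecursive does per hash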
const options = { maxHistory: 1024 };
const log = new Log(new IPFS(), 'A', 'db name', options);
Instead of sending the hash of the new log entries to Pubsub, send the actual object. This should improve (IO) performance drastically.
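As a rough sketch of what that could look like with the js-ipfs pubsub API (the topic name and entry variable are illustrative):
const topic = 'QmFoo...' // hypothetical log topic
const message = Buffer.from(JSON.stringify(entry))
await ipfs.pubsub.publish(topic, message)
// subscribers can then append the received entry directly, without an extra fetch round-trip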
We should run the benchmarks in the browser. They're currently Node.js only.
From: ipfs-inactive/dynamic-data-and-capabilities#50
Analyze the memory footprint of the log instance and make it smaller or configurable. There are scenarios where it's ok to have more I/O as opposed to increased memory usage. One example is precisely viewing the history of changes, which is a feature used sporadically.
See also #136
We used to run tests with go-ipfs, too, and dropped the support some time ago. Now would be a good time to bring back the support.
Related to #154.
We currently have a bit of boilerplate to setup the test suite in each test file (requires, variables, starting IPFS, setting up paths/directories etc). We should try to refactor the tests' boilerplate to make it easier to add new tests and maintain existing ones.
We should take a similar approach as in OrbitDB, as that has worked well and also easily allows us to run the tests against go-ipfs again.
Tried with Node.js 6 LTS and NPM 3
I upgraded ipfs to 0.33.0-rc.4, datastore-level to ^0.9.0, and ipfs-repo to ^0.24.0, ran the tests, and they all passed \o/
From: ipfs-inactive/dynamic-data-and-capabilities#50
Support IPLD now that js-ipfs has it via the dag.put and dag.get functions. This allows us to use the IPLD query language and use explorer.ipld.io to debug.
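A rough sketch of what storing and reading an entry through the dag API could look like (the entry shape here is illustrative, not the actual Entry format):
const cid = await ipfs.dag.put(
  { payload: 'hello', next: [] }, // illustrative entry shape
  { format: 'dag-cbor', hashAlg: 'sha2-256' }
)
const result = await ipfs.dag.get(cid)
// result.value deep-equals the object that was put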
It should be possible to do:
const entry = new Entry("hello", "Qm...Foo");
log.add(entry);
With the new identity PR landing, the build size will go up significantly, mostly due to the keystore, which is exported by the Log module and depends on a lot of crypto modules. We should go through that and probably remove the keystore from ipfs-log to make the build smaller again.
ipfs-log is missing browser tests. We should add browser tests to be part of the npm test run.
To make it easy to find the CI for this repo, we should add a CircleCI badge to the README.md.
When the log is being fetched from history or synced, it should emit a 'progress' event with the number of items fetched so far and the total number to be fetched as arguments.
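A sketch of how consuming such an event might look, assuming the log exposed an emitter (e.g. a hypothetical log.events):
log.events.on('progress', (fetched, total) => {
  console.log(`Loaded ${fetched} / ${total} entries`)
})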
Since day 1 I've been wanting to try the log as an immutable data structure instead of being an object that mutates its state. I believe this would make it easier to reason about the log as a data structure.
In practice, it would mean instead of:
// log1: [1]
// log2: [2]
log1.join(log2).then((log) => /* log === log1, join mutates the instance */)
// log1: [1,2]
// log2: [2]
// log: [1,2]
We would write:
// log1: [1]
// log2: [2]
log1.join(log2).then((log) => /* log !== log1, log1 hasn't changed */)
// or
Log.join(log1, log2).then((log) => /* log !== log1, log1 hasn't changed */)
// log1: [1]
// log2: [2]
// log: [1,2]
And soon:
const log = await Log.join(log1, log2)
ipfs-log depends on a couple of lodash functions and Lazy.js (the whole library) atm. It would be good for the final build size to use only one of them.
Either use Lazy.js or lodash everywhere, but not both. All dependencies for both libraries are in src/log.js.
As far as I understand, they both have similar performance characteristics, but this should be benchmarked.
I upgraded ipfs in the devDependencies to 0.32.0-rc.2 and ran the tests - they passed \o/
Looking at the format ipfs-log uses to store objects in IPFS, it looks like it is not storing links to the next object inside Links but inside Data itself. Is there a reason for that? It means that IPFS does not know about the linked structure of the chain/log.
I upgraded the ipfs dependency in preparation for js-ipfs 0.31 and ran the tests, they passed \o/:
It would be great to have a CLI tool to manage logs.
The basic commands would be:
$ ipfs-log create
QmFoo1
$ ipfs-log append QmFoo1 "hello world"
QmFoo2
$ ipfs-log values QmFoo2
[{ payload: "hello world", ... }]
$ ipfs-log create --id 'logB'
QmFoo3
$ ipfs-log append QmFoo3 "hi"
QmFoo4
$ ipfs-log join QmFoo2 QmFoo4
QmFoo5
$ ipfs-log values QmFoo5 --size 2
[{ payload: "hello world", ... }, { payload: "hi", ... }]
I would use yargs to wrap the commands.
This one is up for grabs; if anyone wants to work on this, feel free to assign it to yourself or claim it here by saying so in a comment.
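For illustration, a minimal yargs skeleton for the commands above might look like this (handlers omitted, nothing here exists yet):
#!/usr/bin/env node
require('yargs')
  .command('create', 'Create a new log', {}, (argv) => { /* ... */ })
  .command('append <hash> <payload>', 'Append an entry to a log', {}, (argv) => { /* ... */ })
  .command('values <hash>', 'Print the entries of a log', {}, (argv) => { /* ... */ })
  .command('join <hashA> <hashB>', 'Join two logs', {}, (argv) => { /* ... */ })
  .demandCommand()
  .help()
  .argv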
Just wanted to start a conversation about potentially moving away from having a large number of arguments when you have a mix of both optional and required arguments. Looking at it from a readability and functional perspective, I see the following benefits of moving toward a single argument object and avoiding instance creation that looks like this:
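A hypothetical before/after to illustrate the point (argument names taken from the new Log() signature mentioned further below):
// positional arguments force callers to pad unused optional slots with null
const logA = new Log(ipfs, 'A', null, null, null, key)
// a single options object is self-describing and order-independent
const logB = new Log({ ipfs, id: 'A', key })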
I've run the tests of this module in preparation for ipfs/js-ipfs#1320.
All of the tests pass; however, note that the PubSub API did change, so consider updating carefully. See the change log in the release issue ipfs/js-ipfs#1320
What is the ultimate goal?
From: ipfs-inactive/dynamic-data-and-capabilities#50
Ensure that we have timeouts in place when reading/writing nodes in IPFS
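For example, a small wrapper along these lines could be used (a sketch; withTimeout and the 30s default are made up):
const withTimeout = (promise, ms = 30000) =>
  Promise.race([
    promise,
    new Promise((resolve, reject) =>
      setTimeout(() => reject(new Error('IPFS operation timed out')), ms))
  ])
// usage: const node = await withTimeout(ipfs.object.get(hash))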
Hi @haadcode! We are about to release the next version of js-ipfs. I've just run the tests with the new release and it seems it will require just one single migration: using the new simplified init setup.
This was already a change in v0.23.0, so I guess ipfs-daemon never got up to date.
The error:
/Users/koruza/code/ipfs-log/node_modules/ipfs-daemon/src/ipfs-node-daemon.js:83
this._daemon.load((err) => {
^
TypeError: this._daemon.load is not a function
at _daemon.config.set (/Users/koruza/code/ipfs-log/node_modules/ipfs-daemon/src/ipfs-node-daemon.js:83:30)
at waterfall (/Users/koruza/code/js-ipfs/node_modules/datastore-fs/src/index.js:172:17)
at /Users/koruza/code/js-ipfs/node_modules/async/internal/parallel.js:39:9
at /Users/koruza/code/js-ipfs/node_modules/async/internal/once.js:12:16
at replenish (/Users/koruza/code/js-ipfs/node_modules/async/internal/eachOfLimit.js:59:25)
at iterateeCallback (/Users/koruza/code/js-ipfs/node_modules/async/internal/eachOfLimit.js:49:17)
at /Users/koruza/code/js-ipfs/node_modules/async/internal/onlyOnce.js:12:16
at /Users/koruza/code/js-ipfs/node_modules/async/internal/parallel.js:36:13
at /Users/koruza/code/js-ipfs/node_modules/write-file-atomic/index.js:60:11
at LOOP (/Users/koruza/code/js-ipfs/node_modules/slide/lib/chain.js:7:26)
at /Users/koruza/code/js-ipfs/node_modules/slide/lib/chain.js:18:7
at FSReqWrap.oncomplete (fs.js:123:15)
npm ERR! Test failed. See above for more details.
To learn how to spawn a node today, you can check the README here: https://github.com/ipfs/js-ipfs#create-a-ipfs-node-instance
From: ipfs-inactive/dynamic-data-and-capabilities#50
Have a way to explicitly create a merge node when we are merging other replicas' nodes. One use case for that is when a certain number of heads is reached, we want to reduce them to just one (the merge node), so that the log occupies less space inside a CRDT.
Hello, like last time I need this for my current use case:
Everybody can write to my log through orbit-db, and I see that in the code there is an async validation function for each Entry.
How about exposing a way to add custom validation to each Entry? I need a way to make sure that my nodes don't end up replicating entries with malformed information: even though anyone can write to my db, I have a few rules about what kind of data is valid. This could also be used, if anyone wants to do so, to implement for example a proof-of-work requirement for writing to the DB.
Thoughts?
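A hypothetical shape for such a hook, just to make the idea concrete (none of this API exists yet):
const validateEntry = async (entry) => {
  const data = entry.payload
  return data && typeof data.title === 'string' && data.title.length < 256
}
// e.g. passed as an option when loading or joining:
// Log.fromMultihash(ipfs, hash, { validateEntry })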
Hi there,
First of all thanks a bunch for the great work you're doing.
I wanted to suggest some additions to the documentation:
- add the inputs of "new Log()", which are (ipfs, id, entries, heads, clock, key, keys = []) if I understand correctly
- emphasize the use of
var multihash = await log.toMultihash(ipfs, log)
var logs1fromIPFS = await Log.fromMultihash(ipfs, multihash)
as a way to communicate a log to another peer
Cheers
The tests for the log have become huge and are hard to change and use in development.
We should split the tests in log.spec.js into several test files. This could be done, for example, with one file per 'describe' block in the original file.
Currently, when a log is loaded, all the data is kept in memory in log._entryIndex. Also, constructing a log requires passing in all the entries, which in the case of OrbitDB Stores means reading all the data from disk (or just the N latest entries).
This is not ideal, as the load operation is O(n) and the memory footprint is an issue in mobile environments.
What are your thoughts on making the following changes:
- replace Entries and/or Heads with a nextsIndex (aka an entryHashIndex)
The following methods would also change:
- log.values would read from disk when entryIndex does not exist
- log.get(hash) would read from disk when entryIndex does not exist
- log.has(hash) would use nextsIndex when entryIndex does not exist
- log.toJSON / log.toSnapshot would include nextsIndex
The following methods would be added:
- log.audit (or similar) to go through and rebuild/validate the nextsIndex
This would allow using just a cache of the head and a nextsIndex to create a functional instance of Log. Also, using the nextsIndex one could navigate and load any portion of the log.
Happy to take this on and explore better ways to accomplish this but would like some feedback to make sure I'm not missing anything obvious.
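To make the idea concrete, a hypothetical traversal using only heads and a nextsIndex might look like this (loadEntry stands for a disk/IPFS read; nothing here is existing API):
const loadRange = async (heads, nextsIndex, amount, loadEntry) => {
  const result = []
  let stack = heads.slice()
  while (stack.length > 0 && result.length < amount) {
    const hash = stack.shift()
    result.push(await loadEntry(hash))
    stack = stack.concat(nextsIndex[hash] || [])
  }
  return result
}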
static findChildren(entry, values) {
var stack = []
var parent = values.find((e) => Entry.isParent(entry, e))
var prev = entry
while (parent) {
stack.push(parent)
prev = parent
parent = values.find((e) => Entry.isParent(prev, e))
}
stack = stack.sort((a, b) => a.clock.time > a.clock.time)
return stack
}
maybe "a.clock.time > a.clock.time" should be "a.clock.time > b.clock.time"?
It appears that with the identity provider now it would be easier to support/implement payload encryption by adding optional encrypt/decrypt to the identity provider as well. Any more thoughts on this? And/or is this part of a larger access control conversation?
Would be interested in implementing this. We currently implement an orbitdb store that wraps 'orbit-db-kvstore' and encrypts/decrypts values, but it would be nice to encrypt the entire payload at the log layer.
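A hypothetical sketch of optional encrypt/decrypt hooks on the identity provider (the hook names and the base64 'cipher' are placeholders, not a real scheme or existing API):
const identityProvider = {
  // ...existing identity provider fields (id, sign, verify, ...)...
  encrypt: async (payload) => Buffer.from(JSON.stringify(payload)).toString('base64'),
  decrypt: async (ciphertext) => JSON.parse(Buffer.from(ciphertext, 'base64').toString())
}
// the log would call encrypt before writing an entry's payload to IPFS and decrypt after fetching it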
.data inside an IPFS object is a Buffer, so doing a straight JSON.parse will fail (or should):
https://github.com/haadcode/ipfs-log/blob/master/src/log.js#L120-L128
Since this JSON.parse is wrapped in a promise and there is no catch, the error is silently swallowed.
Another issue is that since logData might be undefined, logData.items will throw an error too.
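A sketch of a safer parse ('node' stands for the object returned by ipfs.object.get):
const parseLogData = (node) => {
  let logData
  try {
    logData = JSON.parse(node.data.toString())
  } catch (e) {
    // surface the error instead of letting it disappear inside the promise chain
    throw new Error('Invalid log data: ' + e.message)
  }
  return (logData && logData.items) || []
}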