orbitdb-archive / ipfs-log
Append-only log CRDT on IPFS
Home Page: https://orbitdb.github.io/ipfs-log/
License: MIT License
I upgraded the following dependencies in preparation for js-ipfs 0.30 and ran the tests, they passed \o/:
"datastore-level": "~0.8.0",
"ipfs": "github:ipfs/js-ipfs",
"ipfs-repo": "~0.22.1",
The hashing part, object.get and object.put, should be refactored into its own module.
First, a quick recap.
OrbitDB is progressing towards dynamic access, aka dynamic permissions. As part of this effort, multiple access controller types need to be supported. Currently OrbitDB uses an IPFS-based access controller, and we would like to support OrbitDB-based and smart-contract-based access controllers.
So far so good. To support a smart contract ACL, we have a scheme like this: right now, every entry has a key, which is used to sign the contents of the entry. When a peer receives this update, they use the key to verify that the entry was properly signed.
A smart contract ACL would also like to include some piece of info signed by a key which is in the smart contract. For instance, the smart contract has a user's wallet public key. The user signs the OrbitDB (ODB) public key with that wallet, resulting in a chainSignature.
Just as we verify the entry's signature with the key, we also want to verify the chainSignature with the chainKey.
That was the recap, here is the problem. This chainSignature and chainKey have to be stored somewhere so that peers can use them. Where should we store them?
The entry's key property is one candidate. Optimally, the object would be self-describing, in typical IPFS style. Multikey? Multisig?
The intersection utility function is not used anymore, afaict. We should remove it from the project.
From: ipfs-inactive/dynamic-data-and-capabilities#50
Reminder to update package.json and use the orbit-db-identity-provider npm module here: https://github.com/orbitdb/ipfs-log/blob/master/package.json#L16
The bluebird module is currently used in Log._fetchRecursive() to run promises in series (instead of in parallel). However, bluebird accounts for up to 174kb of the ~300kb ipfs-log dist build, and that's a bit too much for one function.
To replace the Promise.map() provided by bluebird, we can implement a promise series either with .forEach or .reduce, and there are possibly other solutions.
Doing this would be a huge footprint gain for the dist build (300kb --> ~100kb?) and I would like to see this happen asap.
For reference:
https://pouchdb.com/2015/05/18/we-have-a-problem-with-promises.html
https://remysharp.com/2015/12/18/promise-waterfall
https://stackoverflow.com/questions/24586110/resolve-promises-one-after-another-i-e-in-sequence
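If bluebird is dropped, a minimal sequential helper built on .reduce could look like this (a sketch only; mapSeries, items and fn are illustrative names, not existing code):
const mapSeries = (items, fn) =>
  items.reduce(
    (acc, item) => acc.then((results) =>
      fn(item).then((res) => results.concat([res]))),
    Promise.resolve([])
  )
// usage: mapSeries(hashes, (hash) => fetchEntry(hash)).then((entries) => ...)
// fetchEntry here stands for whatever _fetchRecursive does per hash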
const options = { maxHistory: 1024 };
const log = new Log(new IPFS(), 'A', 'db name', options);
Instead of sending the hash of the new log entries to Pubsub, send the actual object. This should improve (IO) performance drastically.
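As a rough sketch of what that could look like with the js-ipfs pubsub API (the topic name and entry variable are illustrative):
const topic = 'QmFoo...' // hypothetical log topic
const message = Buffer.from(JSON.stringify(entry))
await ipfs.pubsub.publish(topic, message)
// subscribers can then append the received entry directly, without an extra fetch round-trip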
We should run the benchmarks in the browser. They're currently Node.js only.
From: ipfs-inactive/dynamic-data-and-capabilities#50
Analyze the memory footprint of the log instance and make it smaller or configurable. There are scenarios where it's ok to have more I/O as opposed to increased memory usage. One example is precisely viewing the history of changes, which is a feature used sporadically.
See also #136
We used to run tests with go-ipfs, too, and dropped the support some time ago. Now would be a good time to bring back the support.
Related to #154.
We currently have a bit of boilerplate to setup the test suite in each test file (requires, variables, starting IPFS, setting up paths/directories etc). We should try to refactor the tests' boilerplate to make it easier to add new tests and maintain existing ones.
We should take a similar approach as in OrbitDB, as that has worked well and also easily allows us to run the tests against go-ipfs again.
Tried with Node.js 6 LTS and NPM 3
I upgraded ipfs to 0.33.0-rc.4, datastore-level to ^0.9.0, and ipfs-repo to ^0.24.0, ran the tests, and they all passed \o/
From: ipfs-inactive/dynamic-data-and-capabilities#50
Support IPLD now that js-ipfs has it via the dag.put and dag.get functions. This allows us to use the IPLD query language and use explorer.ipld.io to debug.
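A rough sketch of what storing and reading an entry through the dag API could look like (the entry shape here is illustrative, not the actual Entry format):
const cid = await ipfs.dag.put(
  { payload: 'hello', next: [] }, // illustrative entry shape
  { format: 'dag-cbor', hashAlg: 'sha2-256' }
)
const result = await ipfs.dag.get(cid)
// result.value deep-equals the object that was put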
It should be possible to do:
const entry = new Entry("hello", "Qm...Foo");
log.add(entry);
With the new identity PR landing, the build size will go up significantly, mostly due to the keystore, which is exported by the Log module and depends on a lot of crypto modules. We should go through that and probably remove the keystore from ipfs-log to make the build smaller again.
ipfs-log is missing browser tests. We should add browser tests to be part of the npm test run.
To make it easy to find the CI for this repo, we should add a CircleCI badge to the README.md.
When the log is being fetched from history or synced, it should emit a 'progress' event with the number of items fetched so far and the total number to be fetched as arguments.
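A sketch of how consuming such an event might look, assuming the log exposed an emitter (e.g. a hypothetical log.events):
log.events.on('progress', (fetched, total) => {
  console.log(`Loaded ${fetched} / ${total} entries`)
})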
Since day 1 I've been wanting to try the log as an immutable data structure instead of being an object that mutates its state. I believe this would make it easier to reason about the log as a data structure.
In practice, it would mean instead of:
// log1: [1]
// log2: [2]
log1.join(log2).then((log) => /* log === log1, join mutates the instance */)
// log1: [1,2]
// log2: [2]
// log: [1,2]
We would write:
// log1: [1]
// log2: [2]
log1.join(log2).then((log) => /* log !== log1, log1 hasn't changed */)
// or
Log.join(log1, log2).then((log) => /* log !== log1, log1 hasn't changed */)
// log1: [1]
// log2: [2]
// log: [1,2]
And soon:
const log = await Log.join(log1, log2)
ipfs-log depends on a couple of lodash functions and Lazy.js (the whole library) atm. It would be good for the final build size to use only one of them.
Either use Lazy.js or lodash everywhere, but not both. All dependencies for both libraries are in src/log.js.
As far as I understand, they both have similar performance characteristics, but this should be benchmarked.
I upgraded ipfs in the devDependencies to 0.32.0-rc.2 and ran the tests - they passed \o/
Looking at the format ipfs-log uses to store objects in IPFS, it looks like it is not storing links to the next object inside Links but inside Data itself. Is there a reason for that? It means that IPFS does not know about the linked structure of the chain/log.
I upgraded the ipfs dependency in preparation for js-ipfs 0.31 and ran the tests, they passed \o/:
It would be great to have a CLI tool to manage logs.
The basic commands would be:
$ ipfs-log create
QmFoo1
$ ipfs-log append QmFoo1 "hello world"
QmFoo2
$ ipfs-log values QmFoo2
[{ payload: "hello world", ... }]
$ ipfs-log create --id 'logB'
QmFoo3
$ ipfs-log append QmFoo3 "hi"
QmFoo4
$ ipfs-log join QmFoo2 QmFoo4
QmFoo5
$ ipfs-log values QmFoo5 --size 2
[{ payload: "hello world", ... }, { payload: "hi", ... }]
I would use yargs to wrap the commands.
This one is up for grabs; if anyone wants to work on this, feel free to assign it to yourself or claim it here by saying so in a comment.
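For illustration, a minimal yargs skeleton for the commands above might look like this (handlers omitted, nothing here exists yet):
#!/usr/bin/env node
require('yargs')
  .command('create', 'Create a new log', {}, (argv) => { /* ... */ })
  .command('append <hash> <payload>', 'Append an entry to a log', {}, (argv) => { /* ... */ })
  .command('values <hash>', 'Print the entries of a log', {}, (argv) => { /* ... */ })
  .command('join <hashA> <hashB>', 'Join two logs', {}, (argv) => { /* ... */ })
  .demandCommand()
  .help()
  .argv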
Just wanted to start a conversation about potentially moving away from having a large number of arguments when you have a mix of both optional and required arguments. Looking at it from a readability and functional perspective, I see the following benefits of moving toward a single argument object and avoiding instance creation that looks like this:
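A hypothetical before/after to illustrate the point (argument names taken from the new Log() signature mentioned further below):
// positional arguments force callers to pad unused optional slots with null
const logA = new Log(ipfs, 'A', null, null, null, key)
// a single options object is self-describing and order-independent
const logB = new Log({ ipfs, id: 'A', key })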
I've run the tests of this module in preparation for ipfs/js-ipfs#1320.
All of the tests pass; however, note that the PubSub API did change, so consider updating carefully. See the change log in the release issue ipfs/js-ipfs#1320
What is the ultimate goal?
From: ipfs-inactive/dynamic-data-and-capabilities#50
Ensure that we have timeouts in place when reading/writing nodes in IPFS
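For example, a small wrapper along these lines could be used (a sketch; withTimeout and the 30s default are made up):
const withTimeout = (promise, ms = 30000) =>
  Promise.race([
    promise,
    new Promise((resolve, reject) =>
      setTimeout(() => reject(new Error('IPFS operation timed out')), ms))
  ])
// usage: const node = await withTimeout(ipfs.object.get(hash))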
Hi @haadcode! We are about to release the next version of js-ipfs. I've just run the tests with the new release and it seems it will require just one single migration: using the new simplified init setup.
This was already a change in v0.23.0, so I guess ipfs-daemon never got up to date.
The error:
/Users/koruza/code/ipfs-log/node_modules/ipfs-daemon/src/ipfs-node-daemon.js:83
this._daemon.load((err) => {
^
TypeError: this._daemon.load is not a function
at _daemon.config.set (/Users/koruza/code/ipfs-log/node_modules/ipfs-daemon/src/ipfs-node-daemon.js:83:30)
at waterfall (/Users/koruza/code/js-ipfs/node_modules/datastore-fs/src/index.js:172:17)
at /Users/koruza/code/js-ipfs/node_modules/async/internal/parallel.js:39:9
at /Users/koruza/code/js-ipfs/node_modules/async/internal/once.js:12:16
at replenish (/Users/koruza/code/js-ipfs/node_modules/async/internal/eachOfLimit.js:59:25)
at iterateeCallback (/Users/koruza/code/js-ipfs/node_modules/async/internal/eachOfLimit.js:49:17)
at /Users/koruza/code/js-ipfs/node_modules/async/internal/onlyOnce.js:12:16
at /Users/koruza/code/js-ipfs/node_modules/async/internal/parallel.js:36:13
at /Users/koruza/code/js-ipfs/node_modules/write-file-atomic/index.js:60:11
at LOOP (/Users/koruza/code/js-ipfs/node_modules/slide/lib/chain.js:7:26)
at /Users/koruza/code/js-ipfs/node_modules/slide/lib/chain.js:18:7
at FSReqWrap.oncomplete (fs.js:123:15)
npm ERR! Test failed. See above for more details.
To learn how to spawn a node today, you can check the README here: https://github.com/ipfs/js-ipfs#create-a-ipfs-node-instance
From: ipfs-inactive/dynamic-data-and-capabilities#50
Have a way to explicitly create a merge node when we are merging other replicas' nodes. One use case for that is when a certain number of heads is reached, we want to reduce them to just one (the merge node), so that the log occupies less space inside a CRDT.
Hello, like last time I need this for my current use case:
Everybody can write to my log through orbit-db, and I see that in the code there is an async validation function for each Entry.
How about exposing a way to add custom validation to each Entry? I need a way to make sure that my nodes don't end up replicating entries with malformed information: even though anyone can write to my db, I have a few rules about what kind of data is valid. This could also be used, if anyone wants to do so, to implement for example a proof-of-work requirement for writing to the DB.
Thoughts?
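A hypothetical shape for such a hook, just to make the idea concrete (none of this API exists yet):
const validateEntry = async (entry) => {
  const data = entry.payload
  return data && typeof data.title === 'string' && data.title.length < 256
}
// e.g. passed as an option when loading or joining:
// Log.fromMultihash(ipfs, hash, { validateEntry })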
Hi there,
First of all thanks a bunch for the great work you're doing.
I wanted to suggest some additions to the documentation:
- add the inputs of "new Log()", which are (ipfs, id, entries, heads, clock, key, keys = []) if I understand correctly
- emphasize the use of
var multihash = await log.toMultihash(ipfs, log)
var logs1fromIPFS = await Log.fromMultihash(ipfs, multihash)
as a way to communicate a log to another peer
Cheers
The tests for the log have become huge and are hard to change and use in development.
We should split the tests in log.spec.js into several test files. This could be done, for example, with one file per 'describe' block in the original file.
Currently, when a log is loaded, all the data is kept in memory in log._entryIndex. Also, constructing a log requires passing in all the entries, which in the case of OrbitDB Stores means reading all the data from disk (or just the N latest entries).
This is not ideal, as the load operation is O(n) and the memory footprint is an issue in mobile environments.
What are your thoughts on making the following changes:
- replace Entries and/or Heads with a nextsIndex (aka an entryHashIndex)
The following methods would also change:
- log.values would read from disk when entryIndex does not exist
- log.get(hash) would read from disk when entryIndex does not exist
- log.has(hash) would use nextsIndex when entryIndex does not exist
- log.toJSON / log.toSnapshot would include nextsIndex
The following methods would be added:
- log.audit (or similar) to go through and rebuild/validate the nextsIndex
This would allow using just a cache of the head and a nextsIndex to create a functional instance of Log. Also, using the nextsIndex one could navigate and load any portion of the log.
Happy to take this on and explore better ways to accomplish this but would like some feedback to make sure I'm not missing anything obvious.
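To make the idea concrete, a hypothetical traversal using only heads and a nextsIndex might look like this (loadEntry stands for a disk/IPFS read; nothing here is existing API):
const loadRange = async (heads, nextsIndex, amount, loadEntry) => {
  const result = []
  let stack = heads.slice()
  while (stack.length > 0 && result.length < amount) {
    const hash = stack.shift()
    result.push(await loadEntry(hash))
    stack = stack.concat(nextsIndex[hash] || [])
  }
  return result
}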
static findChildren(entry, values) {
var stack = []
var parent = values.find((e) => Entry.isParent(entry, e))
var prev = entry
while (parent) {
stack.push(parent)
prev = parent
parent = values.find((e) => Entry.isParent(prev, e))
}
stack = stack.sort((a, b) => a.clock.time > a.clock.time)
return stack
}
maybe "a.clock.time > a.clock.time" should be "a.clock.time > b.clock.time"?
It appears that with the identity provider now it would be easier to support/implement payload encryption by adding optional encrypt/decrypt to the identity provider as well. Any more thoughts on this? And/or is this part of a larger access control conversation?
Would be interested in implementing this. We currently implement an orbitdb store that wraps 'orbit-db-kvstore' and encrypts/decrypts values, but it would be nice to encrypt the entire payload at the log layer.
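A hypothetical sketch of optional encrypt/decrypt hooks on the identity provider (the hook names and the base64 'cipher' are placeholders, not a real scheme or existing API):
const identityProvider = {
  // ...existing identity provider fields (id, sign, verify, ...)...
  encrypt: async (payload) => Buffer.from(JSON.stringify(payload)).toString('base64'),
  decrypt: async (ciphertext) => JSON.parse(Buffer.from(ciphertext, 'base64').toString())
}
// the log would call encrypt before writing an entry's payload to IPFS and decrypt after fetching it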
.data inside an IPFS object is a Buffer, so doing a straight JSON.parse will fail (or should):
https://github.com/haadcode/ipfs-log/blob/master/src/log.js#L120-L128
Since this JSON.parse is wrapped in a promise and there is no catch, the error is silently swallowed.
Another issue is that since logData might be undefined, logData.items will throw an error too.
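A sketch of a safer parse ('node' stands for the object returned by ipfs.object.get):
const parseLogData = (node) => {
  let logData
  try {
    logData = JSON.parse(node.data.toString())
  } catch (e) {
    // surface the error instead of letting it disappear inside the promise chain
    throw new Error('Invalid log data: ' + e.message)
  }
  return (logData && logData.items) || []
}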