Layr

A decentralized (p2p) file storage system built atop a Kademlia DHT that enforces data integrity, privacy, and availability through sharding, proofs of retrievability, redundancy, and encryption, with a smart-contract-powered incentive scheme.

Getting Started

Seed node and BatNode Generation

Layr is alpha-stage software that aims to implement a transaction-based p2p distributed file storage system.

Each Layr peer has two nodes: a Kademlia node that is responsible for managing contact information between nodes as well as addressing the locations of files in the network, and a BatNode which is responsible for handling file data transfer, retrieval, and auditing.

We are currently working on a NAT traversal strategy in which a Layr peer's Kademlia node brokers connections between two peers' BatNodes using TCP hole-punching. Our case study (coming soon) will detail our approach to NAT traversal.

We define a Layr node as a BatNode-KademliaNode pair running on a device.

A Layr network needs at least one seed node running so that other nodes can join the network. So, before anything else, you should set up a seed node. A seed node is not a full Layr node because it does not include a BatNode: it is a standalone Kademlia node.

To get a Layr node up and running on a server, ssh into the server and clone this repo. Then cd into the repo's root directory and run yarn install, followed by yarn link. This will allow you to use the CLI.
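For example (the repo URL is assumed from the project's GitHub path; substitute your fork's URL if needed):

  git clone https://github.com/layr-team/layr.git
  cd layr
  yarn install
  yarn link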

To set up a seed node specifically, update the constants.js file to match your server's host information, then cd into the seednode directory and run node seed.js. Other nodes that wish to join your network will need this updated version of constants.js.
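For example, with placeholder host values (203.0.113.5 stands in for your server's public IP):

  // constants.js, pointed at your seed node's server
  exports.SEED_NODE = ['a678ed17938527be1383388004dbf84246505dbd', { hostname: '203.0.113.5', port: 80 }];
  exports.CLI_SERVER = { host: 'localhost', port: 1800 };
  exports.BATNODE_SERVER_PORT = 1900;
  exports.KADNODE_PORT = 80;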

For Layr nodes that will participate as data hosts and/or data owners, ssh into a new server, cd into the root directory of the project, and run yarn install and then yarn link. After that, run node start.js. In a second terminal window, ssh into the same server, cd into the repo's root directory, and run batchain -h for a list of commands you can use.

Data Owners

Chances are that you will upload files to the network. The question is: if you want to upload a file from one machine and retrieve it from another, what do you do?

To retrieve a file, you need the file name, the ids of the shard copies on the network, and the secret key used to encrypt (and decrypt) the file's contents.

Therefore, you can retrieve your file from any device as long as:

  1. The Layr node on that device has the manifest file corresponding to the file you wish to retrieve.
  2. The Layr node's .env file contains the private key you used to encrypt the file's data.

In other words, what defines you as the owner of the data is possession of the manifest file that was generated when you uploaded the file to the network as well as the private key you used to encrypt that file's data.
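As a rough illustration, a manifest maps each original shard's ID to the IDs of its copies on the network (the field names below are a sketch based on this README and the issues that follow, not a specification):

  {
    "fileName": "example.txt",
    "fileSize": 12345,
    "chunks": {
      "<shaIdOfShard1>": ["<copyId1>", "<copyId2>", "<copyId3>"]
    }
  }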

If you simply run node start.js without manually creating a .env file and without including a PRIVATE_KEY in that file, then a private key will be generated for you automatically.

Stellar

Layr uses the Stellar network to allow peer nodes to pay for space on other peers' devices. In its current state, Layr is a proof-of-concept project and therefore uses Stellar's test network, which provides test currency for transactions (10,000 lumens per account).

When a node is launched with node start.js, a secret .env file is created for you. This file contains the private key for encrypting and decrypting file data that you upload to the network, as well as your Stellar account information. If you already have a Stellar account, you should create the .env file manually and include your Stellar public ID (STELLAR_ACCOUNT_ID=xxx) as well as your Stellar secret key (STELLAR_SECRET=xxx).

Both are required for transactions to work properly.
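A manually created .env therefore looks something like this (placeholder values):

  PRIVATE_KEY=xxx
  STELLAR_ACCOUNT_ID=xxx
  STELLAR_SECRET=xxx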

Demos

Note:

For npm:

  1. Run npm install -g before running any batchain option or command
  2. Run npm install -g again whenever you make changes to bin
  3. If chalk is not working for you, run npm install chalk --save to make the command line more colorful

For yarn:

  1. Run yarn link to create a symbolic link between the project directory and the executable command
  2. Open another terminal window and run batchain; you should see:
 Usage: batchain [options] [command]


  Commands:

    sample      see the sample nodes running
    help [cmd]  display help for [cmd]

  Options:

    -h, --help  output usage information
    -l, --list  view your list of uploaded files in BatChain network

Local CLI demo 2 - upload and audit a file

The first step is to make some temporary changes so the code can run locally.

In constants.js, uncomment the local seed node information and comment out the remote seed node info. The file should end up looking like this:

// For network testing:
// exports.SEED_NODE = ['a678ed17938527be1383388004dbf84246505dbd', { hostname: '167.99.2.1', port: 80 }];
// exports.CLI_SERVER = {host: 'localhost', port: 1800};
// exports.BATNODE_SERVER_PORT = 1900;
// exports.KADNODE_PORT = 80;

// For local testing
exports.SEED_NODE = ['a678ed17938527be1383388004dbf84246505dbd', { hostname: 'localhost', port: 1338 }];
exports.BASELINE_REDUNDANCY = 3;

Next, change the while condition in getClosestBatNodeToShard from this:

getClosestBatNodeToShard(shardId, callback){
  this.kadenceNode.iterativeFindNode(shardId, (err, res) => {
    let i = 0
    let targetKadNode = res[0]; // res is an array of these tuples: [id, {hostname, port}]
    while (targetKadNode[1].hostname === this.kadenceNode.contact.hostname &&
           targetKadNode[1].port === this.kadenceNode.contact.port) {

to this:

    // while (targetKadNode[1].hostname === this.kadenceNode.contact.hostname &&
    while (targetKadNode[1].port === this.kadenceNode.contact.port) {

Now we can proceed with the demo.

  1. cd into the /audit directory
  2. If you haven't already, run yarn link to create a symbolic link between the project directory and the executable command. This only needs to be done once.
  3. Open 3 additional terminal windows or tabs, also in the /audit directory
  4. In the first terminal, cd into the server directory. Delete the local db directory first (e.g., rm -r db), then run node node.js
  5. In the second terminal, cd into the server2 directory. Delete the db directory first, then run node node.js
  6. In the third terminal, cd into the client directory. Delete the db directory first, then run node node.js. This boots up the CLI server, which will listen for CLI commands. Wait for a message saying the CLI is ready before issuing any commands.
  7. In the fourth terminal, cd into client as well. Here we can issue batchain CLI commands.
  8. There should be an example file in the personal directory, so run batchain -u ./personal/example.txt. Wait a few seconds for the 24 or so shard files to be written to the server and server2 /host directories.
  9. Kill the process manually (Ctrl-C) and run batchain -a ./manifest/$MANIFESTNAME.batchain, replacing $MANIFESTNAME with the name of the manifest file generated in the client/manifest directory.

Layr's People

Contributors

dylankb, floalex


Layr's Issues

Daemonize CLI interface

This is more of a nice-to-have, but since our primary interface is a CLI, it would make the UX for issuing commands much nicer. Here are some links I looked up on the subject.

Daemonizing a process (not running it forever)
https://www.npmjs.com/package/daemon - This is what Kadence uses https://github.com/kadence/kadence/blob/master/bin/kadence.js#L20
https://stackoverflow.com/a/12214993/3950092 - node-daemonize2
https://github.com/niegowski/node-daemonize2
https://stackoverflow.com/questions/10428684/how-to-implement-console-commands-while-server-is-running-in-node-js - using process.stdin or prompt library

Running a process forever (related, but probably not something we want to do)
https://www.digitalocean.com/community/tutorials/how-to-set-up-a-node-js-application-for-production-on-ubuntu-16-04#install-pm2
https://stackoverflow.com/a/4988180/3950092 - simplest way to send process to background
https://stackoverflow.com/questions/4018154/how-do-i-run-a-node-js-app-as-a-background-service
https://github.com/Storj/storjshare-daemon

https://github.com/kadence/kadence/blob/master/bin/kadence.js#L126-L135 - Kadence code around stopping a process. PM2 also has docs around graceful stops

Set the payment amount based on file size instead of a fixed amount

Currently, we use a fixed amount of 10 lumens for each transaction when the user downloads/uploads shards, no matter how big or small the piece of the file is. For example, an owner pays the same number of lumens for a 5MB piece of data as for a 1KB piece of data:

https://github.com/layr-team/batnode_proto/blob/20f947dea5a25ccbf43e36114979721b61044968/batnode.js#L51

While this works in our alpha phase, in a real-world situation we will need to calculate the amount for each shard/file based on its size to ensure fair usage.
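A minimal sketch of size-based pricing (the rate constant is hypothetical, not something the project defines):

  // price a shard by its size rather than a flat 10 lumens
  const LUMENS_PER_MB = 2; // assumed rate, not defined by the project

  function paymentForShard(shardSizeInBytes) {
    const sizeInMB = shardSizeInBytes / (1024 * 1024);
    return Math.max(1, Math.ceil(sizeInMB * LUMENS_PER_MB)); // pay at least 1 lumen
  }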

Export sharing constants for unchanged values to a module

Example: our default port and host values for the 2nd BatNode server communicating with the command line are the same and never change, so it would be convenient to define such constants once in a module. Extracting them into a module lets us look up these unchanged values across the project, and defining them in one place also prevents typos.
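For example, following the style constants.js already uses:

  // constants.js
  exports.CLI_SERVER = { host: 'localhost', port: 1800 };
  exports.BATNODE_SERVER_PORT = 1900;

  // elsewhere, instead of hardcoding the values:
  const constants = require('./constants');
  const port = constants.BATNODE_SERVER_PORT;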

Use separate method for preparing audit data

Using something like the method below will clean up auditFile

prepareAuditData(shards, shaIds) {
  // build { shaId: { shardCopyId: false, ... }, ... } with every audit
  // result initialized to false (not yet verified)
  return shaIds.reduce((acc, shaId) => {
    acc[shaId] = {};

    shards[shaId].forEach((shardId) => {
      acc[shaId][shardId] = false;
    });

    return acc;
  }, {});
}

Able to upload larger files with JSONStream but unable to download correctly

If we add the JSONStream library, we can upload large files without setting a very small size for each shard.

Branch I have been working on: https://github.com/WilfredTA/batnode_proto/tree/jsonstream

Previously, we couldn't download large files to the client because we experience data loss when writing large data into the client's shards folder with the current method:

issueRetrieveShardRequest(shardId, hostBatNode, options, finishCallback){
  // ..
  client.on('data', (data) => {
    fs.writeFileSync(`./shards/${saveShardAs}`, data, 'utf8')
  // ....

The servers can read the content quickly, but during the download the client cannot write to the folder at the same speed as the servers.

For example, when the servers are ready to read the 2nd shard and send its content back, the client is still trying to finish writing the 1st shard. What happens is that the client stops writing the previous shard and tries to write the next shard.

If we compare a downloaded shard in the client's shards folder with the same uploaded shard in the hosting server's folder, we can see the downloaded shard is smaller. This tells us that during the download, the client prematurely stops writing a shard to disk in order to accept the next shard request.

We currently fix this with a write stream and a setTimeout:

let writeStream = fs.createWriteStream(fileDestination);
const completeFileSize = manifest.fileSize;
// set the divisor slightly below 16kb ~ 16384 (the default high watermark for read/write streams)
const waitTime = Math.floor(completeFileSize / 16000);

// use a "once" listener here instead of "on" in order to pipe only once
client.once('data', (data) => {
  writeStream.write(data);
  client.pipe(writeStream);

  if (distinctIdx < distinctShards.length - 1) {
    finishCallback();
  } else {
    setTimeout(function() { fileUtils.assembleShards(fileName, distinctShards); }, waitTime);
  }
});

In the future, it may be better to use async/await instead of calculating an estimated waiting time here.

storage duration/agreement between hosts and users

To further incentivize hosts to share their storage with the network, agreeing on a storage duration before uploading would be fairer to the hosts.

For example, our system could suggest a default storage duration of 3 months for each file. Before the storage duration expires, the system notifies the user, who then decides whether or not to extend. If the user wants to extend, we need to verify the user's wallet and deduct the payment on the first day of the extension.

Auditing should report the shard copy ID that failed and patching should remove it from the manifest

The only case in which an audit fails but the shard copy ID is still accessible is when the host node was offline at the time of the audit. In all other cases, it is in the data owner's best interest to completely forget about that shard copy ID forever.

Therefore, auditing should report the copy ID that failed and patching should remove that copy ID from the manifest.

In the future, we can extend audits to three states rather than two (true or false). The third state handles the case in which the audited node is simply offline. That's relatively easy to do once the above change is made, since the node-alive test is a simple ping with an event handler: if the ping fails, set the result for the shard copy ID to this third state.
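A sketch of the three-state result (the names here are hypothetical):

  // a third state distinguishes "host offline" from a genuine audit failure
  const AuditResult = Object.freeze({
    PASSED: 'passed',
    FAILED: 'failed',
    HOST_OFFLINE: 'offline', // ping failed; the shard copy may still be intact
  });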

Automatically clean up shards after uploading/downloading

Since we use the same shards folder to temporarily store file pieces when a client uploads and downloads files, it would be more convenient if our system automatically cleaned up for clients once the uploading/downloading process finishes.

Optimize file streaming

Currently, space complexity for transferring data from one node to another scales linearly with the amount of data in the shard being transferred: peak memory usage (space complexity) = O(bytesInShard)

We can push peak memory usage down to constant space complexity by using something like Node.js's pipe function.

Instead of fs.readFile, we can pipe the data from a read stream into a TCP stream.
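A minimal sketch, assuming a connected net.Socket client and a shard stored in the hosted folder:

  const fs = require('fs');

  // stream the shard in fixed-size chunks instead of buffering the whole file
  fs.createReadStream(`./hosted/${shardId}`).pipe(client);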

The JSONStream library we are using solves the problem of parsing larger JSON objects that get split in two when transferred over streams. It seems to do this by holding the JSON in memory and delaying the JSONStream's data event until it has received a full JSON object. That means JSONStream's peak memory usage also scales linearly with the size of the JSON object being sent to it. I need to verify this suspicion against their source code, though.

The problem with piping smaller JSON objects that each contain a portion of the total shard data is that the shard data needs to be written in the order it was received, which is hard to manage when the data is written via event handlers.

Essentially what we need to do is write multiple chunks that are not received in order without storing all chunks in memory.

Stellar Smart Contract to Ensure Payment and File Storage Between Untrusted Parties

Our current shard transfer algorithm goes like this:

  1. A node with an ID close to the shard Id is found
  2. That node is pinged to make sure it's still alive
  3. If it's alive, initiate shard payment and transfer
  4. If it's not alive, remove node from contacts and re-search

The "shard payment and transfer" subroutine goes like this:

  1. Given a target node's address, ask for its Stellar account id
  2. Send a payment to that Stellar account id
  3. If the payment is successful, send the shard to the target node for storage

The problem with this is that the data owner cannot trust the target node to host their file. Nor can the data owner trust the inherent volatility of network connections. It is possible that a host node, upon receiving a payment, disconnects from the network. Finally, it is possible that the host node simply deletes the file right after receiving it, keeping the payment but freeing up storage.

We therefore need to "batch" file storage and payment for file storage such that the failure of one entails the failure of the other. To further prevent deletion of file storage immediately after receiving the file, the two nodes must agree on a duration for which the host will store the data owner's shard.

To ensure that this agreement is honored, the host node must be able to prove that it still has the file at the end of the agreed-upon duration. It must therefore pass a data availability and integrity audit immediately prior to receiving payment.

To ensure that the host node is actually paid by the data owner, an escrow account is set up with the funds to pay the host node at the end of the agreed-upon duration.

Edge cases:

  1. What if the host node is generally online and available, but happens to be offline at the time of the final audit that verifies that it is storing the data it agreed to store?
  2. What if the host node is offline for the entire duration, but comes online immediately before the audit in order to get paid? They haven't really satisfied their end of the bargain in this case.

We can redefine the agreement in order to account for these edge cases. We can say that the host node agrees to host the data owner's file and also be available a given percentage of the time. Every time the host node is audited between the time of initial data storage and agreed-upon duration to host the file, the result of that audit is stored. At the end of the agreed-upon duration, the ratio of passed audits/total audits is calculated, and if that ratio is >= the agreed upon ratio of availability, then the host node is paid, otherwise, they are not paid.
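The payout rule described above as a sketch (function and parameter names are hypothetical):

  // auditResults: array of booleans recorded over the agreement duration
  function shouldPayHost(auditResults, agreedAvailabilityRatio) {
    const passed = auditResults.filter(Boolean).length;
    return passed / auditResults.length >= agreedAvailabilityRatio;
  }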

Edge cases of the new agreement:

  1. The host node cannot trust the data owner (who is also the auditor) to keep an honest record of the results of the audit. Depending on where these records are stored, it may be possible for the data owner to manipulate these records so that the data owner doesn't have to pay the host at the end of the agreement.

Use import statements to selectively include functionality

The idea is that you export a larger object, and then only import the functions you need: for example, import a single function from fileUtils instead of importing the whole fileUtils object of functions as we currently do. Modern libraries support this, so you can do things like import { Component } from 'react', etc.
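For example (the export name and path here are illustrative):

  // current pattern: import the whole object of functions
  const fileUtils = require('./utils/file');
  fileUtils.processUpload(filePath, callback);

  // proposed pattern: import only what's needed (ES modules)
  import { processUpload } from './utils/file';
  processUpload(filePath, callback);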

Retrieving a file from a secondary machine

Great project; I really learned a lot from your YouTube video (https://www.youtube.com/watch?v=oCS05QSQ-1k). What isn't clear to me is the following: the benefit of a centralized cloud service is that when I upload on machine A, I can come online on machine B, independent of whether A is online or not, and get what A uploaded previously. What I don't get with Layr is that it works with a manifest file. As you show in your demo, you need to hand the CLI a path to a manifest. If I upload something on machine A, how do I get the necessary manifest onto machine B to access the uploaded file?

Users can set the amount of storage they want to offer to the network

  1. Users can set the amount of storage their BatNode offers
  2. The BatNode tracks max storage (which is set by the user)
  3. On BatNode initialization, current storage is set as a property on the BatNode object
  4. When a user tries to store a file on a host candidate, the candidate's remaining capacity is calculated and compared: shard size <= max storage - used storage
  5. Optimization challenge: checking used storage without reading each file and adding up the data it uses
  6. Optimization challenge: if two nodes contact a host node at the same time, asking if it has enough storage for their shards, the host will say yes to both, because it has stored neither; but by saying yes to both, it agrees to store more data than it has the capacity to store. There needs to be an in-memory record of available storage that is updated the moment a node agrees to store a shard, even before the shard is actually stored (see the sketch after this list).
  • An edge case, though: this record may become inaccurate if a shard the host agreed to store never makes it over the wire!
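A sketch of such an in-memory structure (names are hypothetical):

  // reserve capacity at agreement time, before the shard arrives over the wire
  class StorageLedger {
    constructor(maxBytes) {
      this.maxBytes = maxBytes;
      this.committedBytes = 0;
    }

    tryReserve(shardBytes) {
      if (this.committedBytes + shardBytes > this.maxBytes) return false;
      this.committedBytes += shardBytes; // counted the moment the host agrees
      return true;
    }

    release(shardBytes) {
      this.committedBytes -= shardBytes; // e.g., if the shard never made it over the wire
    }
  }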

DRY up auditing code

auditShard should be able to make use of getHostNode to DRY up some of the code. This didn't work as planned initially, but we should revisit it later. See #14 (comment).

Have hosted folders created automatically if they don't already exist

If a user tries to upload a file via the command line or manually and there is no existing hosted directory on the server node(s), then no files will be written to that node.

Additionally, the error isn't handled, which causes the server without the hosted folder to crash. Creating the folder when it doesn't exist means we won't have to handle any errors, though.
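A minimal fix, assuming the folder is named hosted and lives in the working directory:

  const fs = require('fs');

  // create the hosted directory on startup if it doesn't already exist
  if (!fs.existsSync('./hosted')) {
    fs.mkdirSync('./hosted');
  }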

Add guard cases before generating .env for users

Currently there are edge cases we need to handle when generating a new .env file:

  • The user already HAS a Stellar account but no PRIVATE_KEY.
  • The user created an empty .env.

With our current code in master, if the user self-created the .env file and already added their Stellar account, the system will also skip generating a PRIVATE_KEY, so we should handle these cases differently.

I made the changes on a new branch, test-env, to fix this bug.

Random Challenges and Their Results are Stored and Sent to Host

This makes the PoR answer unpredictable and variable. Challenges cannot be reused.

Benefits of this method:

  1. Higher degree of confidence in audit accuracy: if the host passes, we can be more confident that they have the file.

Cons of this method:

  1. Introduces an O(number of audits) space complexity for the data owner
  2. Places an upper bound on the number of audits a data owner can do (they will have to download and re-upload the file if they run out of audits, which is more costly)
  3. It is computationally costly for the data host, who has to process the entirety of the file data for each audit
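A sketch of the host's side of such an audit (the hashing scheme here is an assumption, not the project's actual PoR implementation):

  const crypto = require('crypto');

  // answering a challenge requires processing the entire shard
  function answerChallenge(challengeSalt, shardData) {
    return crypto.createHash('sha256')
      .update(challengeSalt)
      .update(shardData)
      .digest('hex');
  }

The data owner pre-computes and stores one (salt, expected answer) pair per future audit, which is the O(number of audits) space cost noted above.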

async/await pattern in audit

We were hoping to use the async/await pattern in audit to solve a problem where we need to asynchronously return data from the auditFile method in order to pass the correct data object into the data event handler for the CLI to access. However, after a bit of research I'm a little skeptical of getting async/await to work quickly in the way we were thinking. The problem is that auditFile executes a series of synchronous actions; those synchronous actions trigger events that have handlers, and these events (mostly the 'data' event) and their handlers then execute asynchronously. You'd have to do something different to have all methods called in the audit operation execute asynchronously and be tied to their event handlers.

The basic issue is that async works with functions that return promises. Therefore, I think what you would have to do to make this strategy work is make all audit-related methods promise-based, but I'm not sure I'd be able to pull that off soon. The way you'd start is to wrap the entire auditShardData method body in a returned promise (i.e., return new Promise((resolve, reject) => …)) that resolves in the data event handler. Link 2 below shows a decent example of this. The problem then is that we have 3-4 other methods between there and the auditFile method we need to return data from, which would also have to be promisified.

Options to explore are:

  • promisifying audit-related methods
  • promise-based net.Socket wrapper libraries
  • custom events using an event emitter

Some related links:

  1. https://github.com/mkloubert/node-simple-socket - socket library that’s promise based. not popular at all, though
  2. https://techbrij.com/node-js-tcp-server-client-promisify - example of promisifying a client send method
  3. https://www.ibm.com/developerworks/community/blogs/binhn/entry/creating_a_tls_tunnel_with_node_js_and_promise?lang=en - just another example of net.connect w/ promises
  4. https://stackoverflow.com/questions/40352682/promisify-event-handlers-and-tiemout-in-nodejs - simple example of event based promise
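In the spirit of links 2 and 4, a sketch of a promisified one-shot socket request (names are hypothetical):

  const net = require('net');

  // wraps a single request/response exchange in a Promise
  function requestOnce(port, host, payload) {
    return new Promise((resolve, reject) => {
      const client = net.connect(port, host, () => client.write(payload));
      client.once('data', (data) => {
        client.end();
        resolve(data);
      });
      client.once('error', reject);
    });
  }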

Refactor the processUpload function in fileUtils

We pass the callback parameter in processUpload down through about four additional methods. It would be nice if we could make this part of fileUtils a bit easier to read.

Remove main-thread-blocking I/O Operations

Some of our I/O operations are achieved with synchronous (thread-blocking) methods. This is a known anti-pattern in node (see here).

The reason we use synchronous actions in some places is to prevent another action from executing until the synchronous action has completed. This matches one of the use cases of the Async library, so this issue may be resolved when we refactor to Async-based asynchronous code rather than callback-based asynchronous code.
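For illustration, the two forms side by side (the file path is a placeholder):

  const fs = require('fs');

  // blocking: nothing else runs until the read completes
  const data = fs.readFileSync('./manifest/example.batchain');

  // non-blocking: the event loop stays free while the read is in flight
  fs.readFile('./manifest/example.batchain', (err, asyncData) => {
    if (err) throw err;
    // use asyncData here
  });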

Remove readFile method from BatNode

Reading a file is Node file system (fs module) responsibility, and we aren't adding any functionality by adding the code to batnode.js.

Similar methods that could live in other modules:

  • connect
  • writeFile

Use constants for CLI messages

Messages like 'You can audit file to make sure file integrity' currently live in two files, so importing them as constants seems like an easy way to edit them in both places at once. Leaders in the JS community (React/Redux) also seem to like using constants in general, so we should probably consider them in other situations too, such as for numbers.

Improve on flat file storage

Data hosts currently store files in a one-dimensional folder. As the number of files a data host stores increases, lookup takes longer and longer, since lookup time grows linearly with the number of items in the hosted folder.
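One common approach (as git does with its object store) is to bucket files by an ID prefix; a sketch:

  // hosted/ab/cdef123... instead of hosted/abcdef123...
  function shardPath(shardId) {
    return `./hosted/${shardId.slice(0, 2)}/${shardId.slice(2)}`;
  }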

Improve upload process with async just-in-time process

Right now our upload process works, but it will be inefficient for large data.

Below is our current process:

  1. Encrypt the file
  2. Once encryption finishes, divide it into k shards
  3. Once all the shards are written, distribute them to the network

This slows down the upload of large files, since sharding takes longer for big files.

A more efficient way would be asynchronous. Steps 1 and 2 remain unchanged:

  3. Once the first shard finishes, distribute it to the network while the client is still writing the next shard
  4. There is a very small chance that uploading the first shard will be faster than the client writing the 2nd shard; if that happens, we can check whether the 2nd shard is complete before distributing it

Retrieve File Should Test if a Target Node is Online Before Sending it a Request

Currently, retrieve file only works with the happy path: it assumes that the host node of the first shard copy it locates is online.

We should implement an alive-test using a simple ping, and only send a request to that node if the ping is successful.

Further, the check that retrieve file currently uses to make sure a BatNode is online is not truly testing that it is online: it only tests that the BatNode server was created at some point in the past.
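A sketch of the alive-test, assuming the Kademlia layer exposes a ping RPC (the exact signature here is assumed):

  // contact: an [id, { hostname, port }] tuple, as elsewhere in this README
  function withOnlineHost(kadenceNode, contact, onAlive, onDead) {
    kadenceNode.ping(contact, (err) => {
      if (err) {
        onDead(contact);  // remove from contacts and re-search
      } else {
        onAlive(contact); // safe to send the retrieval request
      }
    });
  }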
