storjold / downstream-node
Verification node for the Storj network.
Home Page: https://live.driveshare.org
License: MIT License
Below is a rough outline of how to approach scaling the file test size from 100-byte files up to maximum disk capacity, using our existing algorithm and libraries. This will allow us to stress test our network capacity (in terms of drive space), providing meaningful data for later.
For simplicity and scale we will use deterministic hash challenges, as described in the whitepaper, but without Merkle trees. Generating a pool of hash challenges for each file on the Master List will take a very long time and some compute power. This list must remain private.
Verification nodes must store these pools of challenges in a DB, and use excess CPU capacity to refresh the challenges file by file over time. Farmers will generate and store as much of the Master List as they can. All farmers will hold the same files, but the verification node will issue a random challenge from the pool for each file a farmer stores, so that each farmer provides a unique response.
Farmers simply need to be able to use the updated heartbeat to respond to challenges.
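A minimal sketch of this challenge-pool scheme, assuming salted SHA-256 digests (the actual whitepaper construction and the heartbeat library may differ):

```python
import hashlib
import os

def make_challenge_pool(file_bytes, pool_size=4):
    """Node side: precompute a private pool of (nonce, expected_digest)
    pairs for one file. The digest can only be reproduced by a farmer
    who actually holds the file bytes."""
    pool = []
    for _ in range(pool_size):
        nonce = os.urandom(16)  # random salt per challenge
        digest = hashlib.sha256(nonce + file_bytes).hexdigest()
        pool.append((nonce, digest))
    return pool

def answer_challenge(nonce, file_bytes):
    """Farmer side: hash the salt together with the stored file."""
    return hashlib.sha256(nonce + file_bytes).hexdigest()

def verify(pool, index, response):
    """Node side: compare the farmer's response to the stored digest."""
    _, expected = pool[index]
    return response == expected
```

Because each challenge carries its own random salt, refreshing a file's pool is just a matter of regenerating these pairs during idle CPU time.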
Will we allow unencrypted content onto the Storj network, especially if it is below the smallest chunking size? I think we should consider running a statistical randomness test on chunks.
You really aren't supposed to put large files in a GitHub repo (it's pretty slow, too). Perhaps instead we can have the user wget it from a web server, or have it downloaded automatically during the install/setup process.
This is the API spec for a node. It is required to produce the following information for each farmer:
The list of all farmer ids can be retrieved with
GET /api/downstream/status/list
which returns all the farmer IDs:
{
"farmers":
[
"fa1e4944e48ed7bd3739",
"997e717ba92078118cce",
"0f297828e2a687943fc4",
"81b6a0d841a3184028e6",
"49eb47ea315d53399f69",
"b2ca01ff2113559b231d",
"68ff46d440255ac29a3c",
"479935fca9ce02f62788",
"33d63de99f0aad6279ba",
"8088d59b6adf8faf9974",
"ddbeb08b93b1d06e9939",
"e7a5558e62d315a54058",
"1de67bc29901db705ea1",
"e94902cda505de115027",
"c9c1f91a6362af0babad",
"e87f8117d0d10a8d6479",
"5f652023fb6b8034fb5c",
"b05bd26f9f28035a3006",
"e78d8f0edd1a50bce83b",
"dc4ce32c7c8a7d0a3cb9"
]
}
Optionally, one may sort in ascending order by id, address, uptime, heartbeats, iphash, contracts, size, or online by using
GET /api/downstream/status/list/by/<sortby>
or in descending order
GET /api/downstream/status/list/by/d/<sortby>
It is also possible to limit the number of responses
GET /api/downstream/status/list/by/<sortby>/<limit>
and specify a page number
GET /api/downstream/status/list/by/<sortby>/<limit>/<page>
Some examples:
GET /api/downstream/status/list/by/d/uptime/25
will return the 25 farmers with the highest uptime percentage
GET /api/downstream/status/list/by/d/contracts/15/2
will return the third page (rows 30-44; pages are zero-indexed) of the farmers with the most contracts.
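To illustrate how the sort, limit, and page segments compose, a small URL-builder helper (hypothetical; not part of the node's codebase) could assemble these endpoints:

```python
def status_list_url(base, sortby=None, descending=False, limit=None, page=None):
    """Build a status-list URL for the endpoints above.
    `base` is the node's root URL (an assumption; the doc does not fix one)."""
    parts = [base.rstrip("/"), "api", "downstream", "status", "list"]
    if sortby is not None:
        parts.append("by")
        if descending:
            parts.append("d")          # descending-order variant
        parts.append(sortby)
        if limit is not None:
            parts.append(str(limit))   # cap the number of rows
            if page is not None:
                parts.append(str(page))  # zero-indexed page
    return "/".join(parts)

# e.g. the 25 farmers with the highest uptime:
url = status_list_url("http://localhost:5000", "uptime", descending=True, limit=25)
```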
And then the individual farmer information can be retrieved with:
GET /api/downstream/status/show/<token_hash>
{
"id": "45bd945fa10e3f059834",
"address": "18d6KhnTg9dM9jtb1MWXdbibu3Pwt1QHQt",
"location": {"name": "West Jerusalem", "country": "Israel", "lon": "35.21961", "zipcode": "", "state": "Jerusalem District", "lat": "31.78199"},
"uptime": 0.96015,
"heartbeats": 241,
"iphash": "d55529c83953e218cc58",
"contracts": 2,
"size": 200,
"online": true
}
I'm planning to make the id the first 20 characters of the hex representation of the token hash.
The farmer list will probably be cached on the server side to improve performance.
Currently planning to use geodis or GeoIP for geographic resolution.
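The server-side cache for the farmer list could be as simple as a time-based wrapper; a minimal sketch (the real node might instead reach for an existing caching extension):

```python
import time

class TTLCache:
    """Minimal time-based cache for an expensive value such as the
    farmer list. `loader` is whatever callable rebuilds it from the DB."""
    def __init__(self, ttl_seconds, loader):
        self.ttl = ttl_seconds
        self.loader = loader
        self._value = None
        self._stamp = None

    def get(self):
        now = time.time()
        if self._stamp is None or now - self._stamp > self.ttl:
            self._value = self.loader()   # refresh from the backing store
            self._stamp = now
        return self._value                # otherwise serve the cached copy
```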
This modification will add the following methods to the node library:
create_token(sjcx_address)
"""Creates a token for the given address. For now, addresses will not be enforced, and anyone
can acquire a token.
:param sjcx_address: address to use for token creation. for now, just allow any address.
:returns: the token
"""
delete_token(token)
"""Deletes the given token.
:param token: token to delete
"""
get_chunk_contract(token)
"""In the final version, this function should analyze currently available file chunks and
disburse contracts for files that need higher redundancy counts.
In this prototype, this function should generate a random file with a seed. The seed
can then be passed to a prototype farmer who can generate the file for themselves.
The contract will include the next heartbeat challenge, and the current heartbeat state
for the encoded file.
:param token: the token to associate this contract with
:returns: the chunk
"""
verify_proof(token, file_hash, proof)
"""This queries the DB to retrieve the heartbeat, state and challenge for the contract id, and
then checks the given proof. Returns true if the proof is valid.
:param token: the token for the farmer that this proof corresponds to
:param file_hash: the file hash for this proof
:param proof: a heartbeat proof object that has been returned by the farmer
:returns: boolean true if the proof is valid, false otherwise
"""
To write these functions the database models have to be modified to include contracts and tokens tables.
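The new contracts and tokens tables might be sketched with SQLAlchemy roughly as follows. Column names and types here are illustrative assumptions, not the actual schema, and the in-memory SQLite engine stands in for the real database config:

```python
import datetime
from sqlalchemy import (Column, DateTime, ForeignKey, Integer, String,
                        create_engine)
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Token(Base):
    __tablename__ = 'tokens'
    id = Column(Integer, primary_key=True)
    token = Column(String(32), unique=True, nullable=False)
    sjcx_address = Column(String(128), nullable=False)
    contracts = relationship('Contract', backref='token_row')

class Contract(Base):
    __tablename__ = 'contracts'
    id = Column(Integer, primary_key=True)
    token_id = Column(Integer, ForeignKey('tokens.id'), nullable=False)
    file_hash = Column(String(64), nullable=False)
    seed = Column(String(32), nullable=False)
    state = Column(String(1024), nullable=False)      # serialized heartbeat state
    challenge = Column(String(1024), nullable=False)  # current challenge
    expiration = Column(DateTime, nullable=False)

engine = create_engine('sqlite:///:memory:')  # stand-in for the real DB URL
Base.metadata.create_all(engine)
```

With these models, `verify_proof` becomes a lookup of the contract row by token and file hash followed by a check of the stored state and challenge against the submitted proof.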
Additionally the following prototype routes should be exposed for the public API:
Get a new token for a given address. For now, don't check address, just return a token.
GET /api/downstream/new/<sjcx_address>
Response:
{
"token": "ceb722d954ef9d1af3eed2bbe0aeb954",
"heartbeat": "...heartbeat object string representation..."
}
Get a new chunk contract for a token. Only allow one contract per token for now. Returns the first challenge and expiration, the file hash, a seed for generation of the prototype file, and the file heartbeat tag.
GET /api/downstream/chunk/<token>
Response:
{
"challenge": "...challenge object string representation...",
"expiration": "2014-10-03 17:29:01",
"file_hash": "012fb25d2f14bb31bcbad5b8d99703114ed970601b21142c93b50421e8ddb0d7",
"seed": "70aacdc6a2f7ef0e7c1effde27299eda",
"tag": "...tag object string representation..."
}
Gets the currently due challenge for this token and file hash.
GET /api/downstream/challenge/<token>/<file_hash>
Response:
{
"challenge": "...challenge object string representation...",
"expiration": "2014-10-03 17:29:01",
}
Posts an answer for the current challenge on token and file hash.
POST /api/downstream/answer/<token>/<file_hash>
Parameters:
{
"proof": "...proof object string representation..."
}
Response:
{
"status": "ok"
}
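Putting the four routes together, one pass of a prototype farmer's contract cycle might look like the sketch below. The HTTP helpers and the proof function are injected parameters (not part of the real codebase), so the flow can be shown without a live node:

```python
def run_contract_cycle(get, post, sjcx_address, make_proof):
    """Walk the four prototype routes once. `get(path)` and
    `post(path, params)` are injected HTTP helpers returning parsed
    JSON; `make_proof(seed, challenge)` stands in for real proof
    generation, which this sketch elides."""
    # 1. obtain a token for our payout address
    token = get("/api/downstream/new/%s" % sjcx_address)["token"]
    # 2. take out a chunk contract (one per token in the prototype)
    contract = get("/api/downstream/chunk/%s" % token)
    file_hash = contract["file_hash"]
    # 3. fetch the currently due challenge for that file
    challenge = get("/api/downstream/challenge/%s/%s" % (token, file_hash))
    # 4. compute and submit the proof
    proof = make_proof(contract["seed"], challenge["challenge"])
    result = post("/api/downstream/answer/%s/%s" % (token, file_hash),
                  {"proof": proof})
    return result["status"]
```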
Downstream farmer will also need to be modified to interface with this new prototype node.
In terms of allowing people to farm, these are the things I was thinking about for limiting access.
Even with this limiting factor I think we might have some scaling issues. We may or may not see DDoS attacks, but what I expect more is someone spinning up a bunch of virtual nodes to farm. With limiting factors based on total currency supply we can have a maximum of around 5k farmers; I estimate around 1k farmers at the peak.
So I think some testing needs to be done by spinning up some virtual farmers locally, before we start adding limiting factors.
Regardless, I think that since the dashboard is a separate interface, we can spin up multiple verify nodes as long as we have some kind of pattern, and the dashboard will automatically detect and add their stats. My suggestion is that once the codebase is solid we craft a Digital Ocean image, so I can launch more nodes with a couple of clicks.
On Downstream-Farmer:
> downstream --verify-ownership tests/thirty-two_meg.testfile 'http://localhost:5000'
Fetching challenges...
Received 1000 challenge(s).
Verifying ownership...
Verifying local file tests/thirty-two_meg.testfile.
Error: tests/thirty-two_meg.testfile is not a valid file
On Downstream-Node:
DEBUG in routes [/home/super3/Code/downstream-node/downstream_node/routes.py:51]:
No entry in database for file thirty-two_meg.testfile; generating challenes
then
DEBUG in routes [/home/super3/Code/downstream-node/downstream_node/routes.py:45]:
Fetching challenges for thirty-two_meg.testfile
Get downstream-node:
$ git clone https://github.com/Storj/downstream-node.git
$ cd downstream-node
$ pip install -r requirements.txt .
requirements.txt not found
cp downstream_node/config.py.template downstream_node/config.py
wrong
Result:
python runapp.py --initdb
Traceback (most recent call last):
File "runapp.py", line 11, in
import base58
ImportError: No module named 'base58'
Can you update the README?
Having a GVN become a bootstrap node may speed up the peer-finding process for seeding uploaders.