open-science-org / idea-hub
Decentralized repo to publish, review, store, track, and fund scientific publications
Tasks Performed on June 5, 2020
Smart Contracts
The contract is drawn up after an idea has been validated by the validators
The contract shall keep a record of the idea using its identifier and the validated infohash of the torrent.
Tokens are generated and distributed among the authors, validators, and the OSO foundation
Reference issues
Pin any files that are included in the messages received in that channel.
@abinashk is working on this
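The pinning task above can be sketched with the js-ipfs pubsub and pin APIs. This is a sketch under assumptions: `node` is an initialized js-ipfs instance, and each message payload is taken to be a bare file CID (the same payload format publishFileHash() sends elsewhere in this document).

```javascript
// Sketch: subscribe to the channel and pin every CID that arrives.
// Assumes `node` is an initialized js-ipfs instance and that the message
// payload is the bare file CID.
async function pinIncomingFiles(node, topic) {
  await node.pubsub.subscribe(topic, async msg => {
    const cid = msg.data.toString(); // payload carries the file CID
    await node.pin.add(cid);         // keep the file available locally
    console.log(`pinned ${cid}`);
  });
}
```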
There are some outdated dependencies that should be removed. The details are available to the maintainers of the project.
Briefing
We need to associate the file CID with the Ethereum address in order to provide ownership for an uploaded idea. We already have access to the Ethereum address within the app (via the react-web3 lib). The file CID is returned from the function uploadIdeaToIPFS() inside the IdeaForm.jsx component.
Objective
Integrate OrbitDB into our current POC app in order to store the following <k,v> pair in a decentralized way:
<[ethereum_address], [file_CID]>
Let's break it down
Implement saveToOrbit(eth_address, file_hash)
Invoke it as a callback inside publishFileHash()
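A minimal sketch of saveToOrbit. The OrbitDB keyvalue database is abstracted behind a `store` argument so the logic is runnable without IPFS running; in OrbitDB the two calls would be `await db.put(key, value)` and `db.get(key)`. The address and CID values shown are placeholders.

```javascript
// Sketch: persist the <ethereum_address, file_CID> pair. `store` stands in
// for an OrbitDB keyvalue database; a Map is used here so the control flow
// can be exercised without IPFS/OrbitDB running.
async function saveToOrbit(store, ethAddress, fileCid) {
  await store.set(ethAddress, fileCid); // OrbitDB: await db.put(ethAddress, fileCid)
  return store.get(ethAddress);         // OrbitDB: db.get(ethAddress)
}

// Hypothetical usage after uploadIdeaToIPFS() returns the CID:
const ownership = new Map();
saveToOrbit(ownership, "0xYourEthereumAddress", "QmYourFileCid")
  .then(cid => console.log(`stored CID ${cid}`));
```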
Set up libtorrent and look into the Java bindings of the libtorrent library.
Idea Hub client-side setup with Electron for the desktop end client, with React as the UX tool and Ant Design as the UI tool.
The client side will include:
Assuming the documents are stored in IPFS, each document will have an IPFS hash. A document cannot be retrieved unless this unique hash is known. This leads to the necessity for a metadata store which can facilitate a distributed search using human-readable queries.
Metadata structure
A basic metadata entry could be represented as a tuple consisting of the document (IPFS) hash, an attribute name, and an attribute value.
(document_hash, attribute_name, attribute_value)
Metadata file
Assuming the metadata for a document largely remains the same, all the metadata can be grouped inside a metadata file, again using IPFS for storage. The metadata could simply be a JSON file so that parsing it at a later stage is easier.
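As an illustration, a metadata file of this shape can be serialized with JSON.stringify for storage on IPFS and parsed back later. The attribute names follow the examples in the text; the values are placeholders, not a fixed schema.

```javascript
// Illustrative metadata.json content; values are made up for the example.
const metadata = {
  document_hash: "<IPFS hash of the document>",
  attributes: {
    title: "An Example Paper",
    authors: ["A. Author", "B. Author"],
    keywords: ["open science", "decentralized publishing"]
  }
};

// Serialize for storage, parse again at a later stage.
const serialized = JSON.stringify(metadata, null, 2);
const parsed = JSON.parse(serialized);
console.log(parsed.attributes.title); // "An Example Paper"
```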
Document to Metadata mapping
With separate document and metadata files, it is now necessary to map them so that we know which metadata belongs to which document. One option for keeping this map is a DHT, which allows the mapping to be updated in case the metadata file (and hence its hash) changes.
document_hash -> metadata_hash
For this, IPFS-DHT or any other DHT could be used.
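The mapping behaves like a mutable key-value table. The in-memory stand-in below only illustrates the update semantics; a real deployment would use IPFS-DHT put/get instead of a Map, and the hash values here are placeholders.

```javascript
// Stand-in for the DHT: document_hash -> metadata_hash, where the mapping is
// overwritten when the metadata file (and hence its hash) changes.
const dht = new Map();

function updateMetadataMapping(documentHash, metadataHash) {
  dht.set(documentHash, metadataHash);
}

updateMetadataMapping("docHashA", "metaHashV1");
updateMetadataMapping("docHashA", "metaHashV2"); // metadata file changed
console.log(dht.get("docHashA")); // "metaHashV2"
```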
Searcher Nodes
A searcher node is responsible for handling search queries and providing appropriate results, or the next set of nodes that could provide the results. This is basically a DHT node which fetches the metadata from DHT records and indexes it to support search. For indexing and searching, we could use Apache Lucene.
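A toy inverted index illustrates what the searcher node does with fetched metadata; Apache Lucene would replace this in practice, and the sample documents below are made up.

```javascript
// Build a term -> document_hash index over fetched metadata files so that
// human-readable queries can be answered.
function buildIndex(metadataFiles) {
  const index = new Map(); // term -> Set of document hashes
  for (const { document_hash, attributes } of metadataFiles) {
    const text = Object.values(attributes).join(" ").toLowerCase();
    for (const term of text.split(/\W+/).filter(Boolean)) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(document_hash);
    }
  }
  return index;
}

const index = buildIndex([
  { document_hash: "docA", attributes: { title: "Open peer review" } },
  { document_hash: "docB", attributes: { title: "Review incentives" } }
]);
console.log([...index.get("review")]); // ["docA", "docB"]
```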
Incentive to search
There should be an incentive mechanism to keep the searcher nodes motivated to store and index more metadata files and to respond to search queries. One example is a reputation system based on the successful serving of queries: the more results a node serves, the higher its reputation. To successfully serve more queries, a node needs to store and index more documents, which in turn maintains the availability of the documents.
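The reputation idea can be stated very simply; the one-point-per-served-query rule below is an assumption for illustration, not a decided policy.

```javascript
// Each successfully served query bumps the serving node's reputation score.
const reputation = new Map();

function recordServedQuery(nodeId) {
  reputation.set(nodeId, (reputation.get(nodeId) || 0) + 1);
}

recordServedQuery("searcher-1");
recordServedQuery("searcher-1");
recordServedQuery("searcher-2");
console.log(reputation.get("searcher-1")); // 2
```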
Purpose
long-term: to encourage discussion on adoption of TCRs
short-term: to create the first version of validation layer 1 of Idea Hub (i.e. spam filters)
Decentralized publishing (Idea-Hub) can be modeled as a cascade of token curated registries (TCRs). See OIP-7 open-science-org/OIPs#9. Design discussions should go to OIP-7
The TCRs will be similar but unique, serving different purposes. This means it would be good to have a general TCR implementation and to create different instances of it for specific use cases. We can consider using existing code, for example https://github.com/skmgoldin/tcr
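To make the registry idea concrete, here is a toy TCR with only listing deposits and a stubbed challenge outcome; the real design (see OIP-7 and skmgoldin/tcr) adds token-weighted voting, application periods, and reward splits. All names and numbers here are illustrative.

```javascript
// Toy token-curated registry: listing requires a minimum deposit, and a
// successful challenge removes the listing. Voting is deliberately stubbed.
class ToyTCR {
  constructor(minDeposit) {
    this.minDeposit = minDeposit;
    this.listings = new Map(); // ideaId -> { owner, deposit }
  }
  apply(ideaId, owner, deposit) {
    if (deposit < this.minDeposit) throw new Error("deposit too low");
    this.listings.set(ideaId, { owner, deposit });
  }
  resolveChallenge(ideaId, challengeSucceeded) {
    // In the full protocol this outcome comes from a token-weighted vote.
    if (challengeSucceeded) this.listings.delete(ideaId);
  }
  isListed(ideaId) {
    return this.listings.has(ideaId);
  }
}

const tcr = new ToyTCR(10);
tcr.apply("idea-1", "0xAuthor", 10);
tcr.resolveChallenge("idea-1", false); // challenge fails, listing survives
console.log(tcr.isListed("idea-1")); // true
```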
long-term: What is our game plan?
short-term: Whatever the game plan, how can we make the first version of the validation layer 1 of Idea Hub?
References
See Section 2 of https://github.com/open-science-org/wiki/blob/master/Proof_of_Idea.pdf
API design specification and implementation by the node.
Endpoints will include:
Endpoints relating to citations and token distribution are not considered in this issue.
IdeaHub stack
1. Storage
Each idea is stored as a torrent. All the files (e.g. research papers, data, multimedia files) are inside the torrent. Apart from the files representing the idea, a metadata.json file is created and updated by the OSO client, which includes the metadata of the idea. Metadata can include fields like title, authors, abstract, keywords, etc. When the files or metadata change, the infohash of the torrent also changes. Therefore, a mechanism is needed to link an idea to its up-to-date torrent infohash. This is done on the network layer using a Distributed Hash Table (DHT).
2. Network - DHT/TCP
All full nodes in the network make up the DHT network. A major function of the DHT is to keep an up-to-date reference from the idea identifier to the latest torrent infohash, so that all clients (other than the author) can get up-to-date information about the idea. A signature check is done before updating the DHT at each node to maintain the integrity and ownership of the idea.
3. Smart Contracts - Token generation & Distribution
When an idea is validated by the validators (introduced later), an immutable record of the idea is stored in the blockchain. A record of the idea identifier and the validated infohash of the torrent is written. After that, OSO tokens are generated and distributed among the stakeholders: the authors, the validators, and the OSO foundation.
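A sketch of the distribution step. The stakeholder names come from the smart-contract notes earlier in this document (authors, validators, OSO foundation), but the percentages are placeholders, not a decided policy.

```javascript
// Split newly generated OSO tokens among stakeholders by fixed shares.
function distributeTokens(total, shares) {
  return Object.fromEntries(
    Object.entries(shares).map(([who, s]) => [who, total * s])
  );
}

const payout = distributeTokens(1000, {
  authors: 0.6,     // hypothetical share
  validators: 0.3,  // hypothetical share
  foundation: 0.1   // hypothetical share
});
console.log(payout.authors); // 600
```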
4. REST APIs
Each full node keeps track of all the ideas submitted to the network by reading the idea identifier and the infohash from the blockchain. By downloading the torrent, the node indexes the metadata and makes the content available for search via REST APIs. The APIs are consumed by the end clients.
5. End client
The end client is what is shipped to the users. The users can create/update/delete their ideas and search for other ideas in the OSO network. Since every idea is represented as a torrent file, the client will also be a torrent client, downloading and seeding files. This will make the ideas more accessible to the rest of the community. Since all the ideas in the platform are validated by the validators, we can safely assume that being a torrent client will not have negative implications. We're embracing the technology and not its bad use cases.
Validators
Validators are the nodes that validate ideas before they can be published to the blockchain and rewards can be distributed. This is a way to maintain the quality of the ideas in the network. Validators check that the idea identifier is unique and that the content of the torrent is original. While validators may not be able to do deep validation, basic educated human validation should filter out most spam ideas. More sophisticated checks can be done later as we move past the PoC.
Description
The publish file hash (using pubsub) logic has to be called twice for the recipient node to receive the message. Basically, this means we have to click the "submit" button twice.
Preparation steps
Navigate to idea-hub\projects\idea-hub-client
Set const addr (inside the publishFileHash function) to the websocket address you have configured on your machine
Run ipfs daemon --enable-pubsub-experiment
On a separate terminal window, type ipfs pubsub sub testing123
Objective
Find out why the recipient node only gets the message after a second publish (using the js-ipfs pubsub API).
Briefly explain the root cause by posting a comment.
Propose a solution and open a pull request against the branch poc-front-overhaul
DEADLINE
27/03/2019 ⏳
Objective
Overview of the Problem
We are trying to connect 2 nodes and get the Pubsub functionality working between them.
But, despite having tried several different configurations, we did not get the Pubsub feature working in this scenario.
Why is this necessary?
This behavior is required since it is part of our current strategy to pin the submitted ideas to IPFS, and make them available to the public in a decentralized way. Put simply, if we manage to achieve Pubsub between Node A and Node B, that means we can move forward with the POC app that we've been developing.
NODE A (browser) Current Config
First you need to require wrtc. This is necessary to set up a p2p connection between the 2 nodes:
const wrtc = require("wrtc"); // or require('electron-webrtc')
const WStar = require("libp2p-webrtc-star");
const wstar = new WStar({ wrtc });
Then you can instantiate the node by:
this.ipfsNode = new IPFS({
EXPERIMENTAL: { pubsub: true },
relay: { enabled: true, hop: { enabled: true } },
config: {
Addresses: {
Swarm: [
"/ip4/127.0.0.1/tcp/4001/ipfs/QmPMaDyK2ee95BpjnMyWYiVi46EcZFrY8AUmSzieNSLbEa",
"/dns4/wrtc-star.discovery.libp2p.io/tcp/443/wss/p2p-webrtc-star"
]
},
libp2p: {
modules: {
transport: [wstar],
peerDiscovery: [wstar.discovery]
}
}
}
});
NODE B (daemon) Current Config
Once you install IPFS on your machine and run ipfs init from the terminal, a folder named .ipfs will be created in your $HOME directory. Inside that folder, you will find a config file. The current state of my file is:
{
"API": {
"HTTPHeaders": {}
},
"Addresses": {
"API": "/ip4/127.0.0.1/tcp/5001",
"Announce": [],
"Gateway": "/ip4/127.0.0.1/tcp/8080",
"NoAnnounce": [],
"Swarm": [
"/ip4/0.0.0.0/tcp/4001",
"/ip6/::/tcp/4001"
]
},
"Bootstrap": [
"/dnsaddr/bootstrap.libp2p.io/ipfs/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN",
"/dnsaddr/bootstrap.libp2p.io/ipfs/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa",
"/dnsaddr/bootstrap.libp2p.io/ipfs/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb",
"/dnsaddr/bootstrap.libp2p.io/ipfs/QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt",
"/ip4/104.131.131.82/tcp/4001/ipfs/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ",
"/ip4/104.236.179.241/tcp/4001/ipfs/QmSoLPppuBtQSGwKDZT2M73ULpjvfd3aZ6ha4oFGL1KrGM",
"/ip4/128.199.219.111/tcp/4001/ipfs/QmSoLSafTMBsPKadTEgaXctDQVcqN88CNLHXMkTNwMKPnu",
"/ip4/104.236.76.40/tcp/4001/ipfs/QmSoLV4Bbm51jM9C4gDYZQ9Cy3U6aXMJDAbzgu2fzaDs64",
"/ip4/178.62.158.247/tcp/4001/ipfs/QmSoLer265NRgSp2LA3dPaeykiS1J6DifTC88f5uVQKNAd",
"/ip6/2604:a880:1:20::203:d001/tcp/4001/ipfs/QmSoLPppuBtQSGwKDZT2M73ULpjvfd3aZ6ha4oFGL1KrGM",
"/ip6/2400:6180:0:d0::151:6001/tcp/4001/ipfs/QmSoLSafTMBsPKadTEgaXctDQVcqN88CNLHXMkTNwMKPnu",
"/ip6/2604:a880:800:10::4a:5001/tcp/4001/ipfs/QmSoLV4Bbm51jM9C4gDYZQ9Cy3U6aXMJDAbzgu2fzaDs64",
"/ip6/2a03:b0c0:0:1010::23:1001/tcp/4001/ipfs/QmSoLer265NRgSp2LA3dPaeykiS1J6DifTC88f5uVQKNAd"
],
"Datastore": {
"BloomFilterSize": 0,
"GCPeriod": "1h",
"HashOnRead": false,
"Spec": {
"mounts": [
{
"child": {
"path": "blocks",
"shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
"sync": true,
"type": "flatfs"
},
"mountpoint": "/blocks",
"prefix": "flatfs.datastore",
"type": "measure"
},
{
"child": {
"compression": "none",
"path": "datastore",
"type": "levelds"
},
"mountpoint": "/",
"prefix": "leveldb.datastore",
"type": "measure"
}
],
"type": "mount"
},
"StorageGCWatermark": 90,
"StorageMax": "10GB"
},
"Discovery": {
"MDNS": {
"Enabled": true,
"Interval": 10
}
},
"Experimental": {
"FilestoreEnabled": false,
"Libp2pStreamMounting": false,
"P2pHttpProxy": false,
"QUIC": false,
"ShardingEnabled": false,
"UrlstoreEnabled": false
},
"Gateway": {
"APICommands": [],
"HTTPHeaders": {
"Access-Control-Allow-Headers": [
"X-Requested-With",
"Range",
"User-Agent"
],
"Access-Control-Allow-Methods": [
"GET"
],
"Access-Control-Allow-Origin": [
"*"
]
},
"NoFetch": false,
"PathPrefixes": [],
"RootRedirect": "",
"Writable": false
},
"Identity": {
"PeerID": "QmPMaDyK2ee95BpjnMyWYiVi46EcZFrY8AUmSzieNSLbEa",
"PrivKey": "CAASqAkwggSkAgEAAoIBAQD1er6xc6EPjYtGjMV4W/uEs4uM17Zsm85S0nl/IgkcgJBxPPK82EGiIy3Z2BXqi9k018cmlR35WWLoyqPmqgRvQ4nBuWPTCmak/FoqBfJ7AxRovsb1NGPBhycHKSI+d0KaynpLZ+GLVgGU13cbu+viLEfIWa53GndVkpvYPFzDVBkvZRT6nkPsVemqfHiOPBgtjYDtij48fJU9Bl1cTfJ3EcuTpaEA+QhlqPHUsWUh4cmE/Ln5LSxR47zEIT2eYa9LIPPx328ZiSq6Lh1AGO4fe7j5B8L3ZPQYnLz+cBMftS2tlM4V3J6fqyqIl1sRKdKPHMUj8epB0EcKjyRaHWkjAgMBAAECggEAGgN8688mFUDZrotCbePJfqGMO0uswEuujKZTS76umn+hTu63hn2gTu9Nb5VvlSBmzyvCpfsNZxwq2CKJRetkduoAUjA0POwQPpGjeGqS7KhB5Gu7J8b6f0q0PxUD1PzMaRzl4tHKW/qsRjqjG6RJdfldTgT68RIz7TSRIVQcPHKa6E3t/QVzgQJcK6xQehGtOJ7PlSo8cHQRaxcNesUQyG4N0x4UfF+MqVme52HXZOn6OhzDdhjboHm0zzfhLil1r2AbgdGMWEzRp68OVtSAjjk5CN6EPuOmsck+22c1lV7wFfLctbv7sXuDDYtG1+24sx69AyQldgBXnsKf3lsX8QKBgQD1s+qliofUZIdN9bPaxEfKSfi607jQ3qJCnKlOvXtzO+v01yKDZ+0Qa5EeFOoK05V2xaw9mFstnefUPfid8B84+YOtCHcGyZmI8DJRmjHkxQHVHJB8lI82gz++k7jn4fUh7Ajb40S+RC/rw0XPcNR8HrzFcG+Tye70+IjswUAnWwKBgQD/xG6qdEYKcMYfoXNZ6d7NtlCJWs+P19gdprkdcfE3/0qd4+uOvptbW6dkPo2i9870HydKeTXmB94Tw81EcrSW8mjwSn9wibD4SInIjlE4EQUtsMPno2Hm0Jnmw9fUCx7o0OH+C364rX1dCBk+itKRfweQdYDEdMqEfheEyO632QKBgQDehqsecH+iWcW9UqkomhoW2LXfpv88lFZKlA421R+odv21yt5kOsyW0YUlxHVPht9YKaFcS89QWjHrpJC1ohL1C+442XDLgex+/GPmSgukENUfCPbHDdlC2s3xsWKHCLt1lItVctkApUrtcPaZ8KtRGpmHC9TR+dJkpW+FVWTf/wKBgQCf6SHD4uSzvGSy/A+R5N4PwfBCoItrhOkzSL0ugsHtX+k4JHtviQ67JOfYjh+iB8vV5/B56KThSIP52Y7qP8lXIwKnUfyx0PTblwbGZOy04DdbpMwndIhOdpfypvm3MqjFqWvSmT9Gmfnqg5i8+LDElSaWlFDJA7hm9CsiMzrFqQKBgFU4PH7KiraRaTtCNvXai4Lv7XXgnjhnUtE/enzvylCngmdF7sjxq1IQV8pNdKWqAx2yputUfOZGQgi/w2CC52svyEjrWv/NghACSCO5mMbMZc/cHPP54vt9lR2NqbnPnYZPcWym5kj3kCtCqpvDKnbDx4VhFhdxw8/yHM7N1r5+"
},
"Ipns": {
"RecordLifetime": "",
"RepublishPeriod": "",
"ResolveCacheSize": 128
},
"Mounts": {
"FuseAllowOther": false,
"IPFS": "/ipfs",
"IPNS": "/ipns"
},
"Pubsub": {
"DisableSigning": false,
"Router": "",
"StrictSignatureVerification": false
},
"Reprovider": {
"Interval": "12h",
"Strategy": "all"
},
"Routing": {
"Type": "dht"
},
"Swarm": {
"AddrFilters": null,
"ConnMgr": {
"GracePeriod": "20s",
"HighWater": 900,
"LowWater": 600,
"Type": "basic"
},
"DisableBandwidthMetrics": false,
"DisableNatPortMap": false,
"DisableRelay": false,
"EnableAutoNATService": true,
"EnableAutoRelay": true,
"EnableRelayHop": false
}
}
The Problem
I start the daemon on the terminal using:
ipfs daemon --enable-pubsub-experiment
and subscribe to the topic using ipfs pubsub sub testing123
I navigate to localhost:3000 and the console logs:
Swarm listening on /p2p-circuit/ip4/127.0.0.1/tcp/4001/ipfs/QmPMaDyK2ee95BpjnMyWYiVi46EcZFrY8AUmSzieNSLbEa/ipfs/QmcCnVBpYLvcRLbutwRMkzeJwHfjbFaq8y9JQ2sLcBAX1L
Cool. So it looks like the p2p circuit is online! Next we upload a file to IPFS and save the CID to the state (this.state.added_file_hash).
Then on the browser, we try to communicate with the daemon using
publishFileHash() {
const topic = "testing123";
const msg = Buffer.from(this.state.added_file_hash);
console.log("msg is");
console.log(this.state.added_file_hash);
this.ipfsNode.pubsub.publish(topic, msg, err => {
if (err) {
return console.error(`failed to publish to ${topic}`, err);
}
console.log(`published to ${topic}`);
});
}
After this, browser console logs: published to testing123
but the daemon does not get the message. 😢
Why P2P circuit? Did you try a direct connection?
So maybe Pubsub does not work with P2P circuits. However, I could not find a way to get a direct connection between the browser node and the daemon node.
If you use:
publishFileHash() {
const addr =
"/ip4/127.0.0.1/tcp/4001/ipfs/QmPMaDyK2ee95BpjnMyWYiVi46EcZFrY8AUmSzieNSLbEa";
this.ipfsNode.swarm.connect(addr, err => { // arrow function so this.state stays accessible
if (err) {
throw err;
}
const topic = "testing123";
const msg = Buffer.from(this.state.added_file_hash);
console.log("msg is");
console.log(this.state.added_file_hash);
this.ipfsNode.pubsub.publish(topic, msg, err => {
if (err) {
return console.error(`failed to publish to ${topic}`, err);
}
console.log(`published to ${topic}`);
});
});
}
The browser console logs an error that says:
I want to contribute but I'm not sure where to start...
If you would like to know more about why solving this task is critical, feel free to reach out to me or @abinashk. Right now this is our priority. If you're up for the challenge, I can also help you set up Node A and Node B so you can help us reach a conclusion 🔥
Get started
Set up the environment and work out the necessary attributes for the contract to begin with