nopfs's Introduction

NOpfs!

NOPfs helps IPFS to say No!

NOpfs is an implementation of IPIP-383, which adds support for content blocking to the go-ipfs stack, and particularly to Kubo.

Content-blocking in Kubo

  1. Grab a plugin release from the releases section matching your Kubo version and install the plugin file in ~/.ipfs/plugins.
  2. Write a custom denylist file (see syntax below) or simply download one of the supported denylists from Denyli.st and place it in ~/.config/ipfs/denylists/ (ensure it has the .deny extension).
  3. Start Kubo (ipfs daemon). The plugin should be loaded automatically, and existing denylists are tracked for updates from that point on (no restarts required).

Denylist syntax

Denylist files must have the .deny extension. The content consists of an optional header and a body made of blocking rules as follows:

version: 1
name: IPFSorp blocking list
description: A collection of bad things we have found in the universe
author: [email protected]
hints:
  gateway_status: 410
  double_hash_fn: sha256
  double_hash_enc: hex
---
# Blocking by CID - blocks wrapped multihash.
# Does not block subpaths.
/ipfs/bafybeihvvulpp4evxj7x7armbqcyg6uezzuig6jp3lktpbovlqfkuqeuoq

# Block all subpaths
/ipfs/QmdWFA9FL52hx3j9EJZPQP1ZUH8Ygi5tLCX2cRDs6knSf8/*

# Block some subpaths (equivalent rules)
/ipfs/Qmah2YDTfrox4watLCr3YgKyBwvjq8FJZEFdWY6WtJ3Xt2/test*
/ipfs/QmTuvSQbEDR3sarFAN9kAeXBpiBCyYYNxdxciazBba11eC/test/*

# Block some subpaths with exceptions
/ipfs/QmUboz9UsQBDeS6Tug1U8jgoFkgYxyYood9NDyVURAY9pK/blocked*
+/ipfs/QmUboz9UsQBDeS6Tug1U8jgoFkgYxyYood9NDyVURAY9pK/blockednot
+/ipfs/QmUboz9UsQBDeS6Tug1U8jgoFkgYxyYood9NDyVURAY9pK/blocked/not
+/ipfs/QmUboz9UsQBDeS6Tug1U8jgoFkgYxyYood9NDyVURAY9pK/blocked/exceptions*

# Block IPNS domain name
/ipns/domain.example

# Block IPNS domain name and path
/ipns/domain2.example/path

# Block IPNS key - blocks wrapped multihash.
/ipns/k51qzi5uqu5dhmzyv3zac033i7rl9hkgczxyl81lwoukda2htteop7d3x0y1mf

# Legacy CID double-hash block
# sha256(bafybeiefwqslmf6zyyrxodaxx4vwqircuxpza5ri45ws3y5a62ypxti42e/)
# blocks only this CID
//d9d295bde21f422d471a90f2a37ec53049fdf3e5fa3ee2e8f20e10003da429e7

# Legacy Path double-hash block
# Blocks bafybeiefwqslmf6zyyrxodaxx4vwqircuxpza5ri45ws3y5a62ypxti42e/path
# but not any other paths.
//3f8b9febd851873b3774b937cce126910699ceac56e72e64b866f8e258d09572

# Double hash CID block
# base58btc-sha256-multihash(QmVTF1yEejXd9iMgoRTFDxBv7HAz9kuZcQNBzHrceuK9HR)
# Blocks bafybeidjwik6im54nrpfg7osdvmx7zojl5oaxqel5cmsz46iuelwf5acja
# and QmVTF1yEejXd9iMgoRTFDxBv7HAz9kuZcQNBzHrceuK9HR etc. by multihash
//QmX9dhRcQcKUw3Ws8485T5a9dtjrSCQaUAHnG4iK9i4ceM

# Double hash Path block using blake3 hashing
# base58btc-blake3-multihash(gW7Nhu4HrfDtphEivm3Z9NNE7gpdh5Tga8g6JNZc1S8E47/path)
# Blocks /ipfs/bafyb4ieqht3b2rssdmc7sjv2cy2gfdilxkfh7623nvndziyqnawkmo266a/path
# /ipfs/bafyb4ieqht3b2rssdmc7sjv2cy2gfdilxkfh7623nvndziyqnawkmo266a/path
# /ipfs/f01701e20903cf61d46521b05f926ba1634628d0bba8a7ffb5b6d5a3ca310682ca63b5ef0/path etc...
# But not /path2
//QmbK7LDv5NNBvYQzNfm2eED17SNLt1yNMapcUhSuNLgkqz
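
As a rough illustration of how the rules above break down, the following Python sketch skips the optional header and classifies each body line. It is not the NOpfs parser: `Rule`, `parse_rule` and `parse_denylist` are hypothetical names, header fields and hints are ignored, and CIDs are not validated.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    kind: str             # "ipfs", "ipns" or "double-hash"
    key: str              # CID, IPNS name/key, or the double-hash itself
    path: str = ""        # sub-path, without the trailing "*"
    prefix: bool = False  # True when the rule ended in "*"
    allow: bool = False   # True for "+" (exception) rules


def parse_rule(line: str) -> Optional[Rule]:
    """Classify one body line; returns None for blank lines and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    allow = line.startswith("+")
    if allow:
        line = line[1:]
    if line.startswith("//"):
        return Rule("double-hash", line[2:], allow=allow)
    for kind in ("ipfs", "ipns"):
        marker = f"/{kind}/"
        if line.startswith(marker):
            key, _, path = line[len(marker):].partition("/")
            prefix = path.endswith("*")
            if prefix:
                path = path[:-1]
            return Rule(kind, key, path, prefix, allow)
    raise ValueError(f"unrecognized rule: {line!r}")


def parse_denylist(text: str) -> List[Rule]:
    """Drop everything up to the first '---' line (if any) and parse the rest."""
    lines = text.splitlines()
    if "---" in lines:
        lines = lines[lines.index("---") + 1:]
    return [rule for rule in map(parse_rule, lines) if rule is not None]
```

For example, `parse_rule("+/ipfs/<cid>/blocked/exceptions*")` yields an allow rule with `path="blocked/exceptions"` and `prefix=True`.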

You can create double-hashes by hand with the following command:

printf "QmecDgNqCRirkc3Cjz9eoRBNwXGckJ9WvTdmY16HP88768/my/path" \
  | ipfs add --raw-leaves --only-hash --quiet \
  | ipfs cid format -f '%M' -b base58btc

where:

  • QmecDgNqCRirkc3Cjz9eoRBNwXGckJ9WvTdmY16HP88768 must always be a CIDv0. If you have a CIDv1 you need to convert it to CIDv0 first, e.g. ipfs cid format -v0 bafybeihrw75yfhdx5qsqgesdnxejtjybscwuclpusvxkuttep6h7pkgmze
  • /my/path is optional depending on whether you want to block a specific path. No wildcards supported here!
  • The command above should give QmSju6XPmYLG611rmK7rEeCMFVuL6EHpqyvmEU6oGx3GR8. Use it as //QmSju6XPmYLG611rmK7rEeCMFVuL6EHpqyvmEU6oGx3GR8 on the denylist.
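
For readers without a local ipfs binary, the same value can in principle be computed directly. The sketch below assumes that the pipeline above amounts to taking the SHA2-256 of the input string, wrapping it as a multihash (prefix bytes 0x12 0x20) and encoding it as base58btc, which should hold as long as `ipfs add --raw-leaves` stores the short input as a single raw block. The helper names are illustrative, and the result has not been cross-checked here against the command's output.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Plain base58btc encoding (no multibase prefix)."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = B58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by a leading "1".
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def double_hash(cid_v0: str, path: str = "") -> str:
    """Base58btc sha2-256 multihash of '<cidv0>' or '<cidv0>/<path>'."""
    target = cid_v0 + ("/" + path.strip("/") if path else "")
    digest = hashlib.sha256(target.encode("utf-8")).digest()
    multihash = bytes([0x12, 0x20]) + digest  # 0x12 = sha2-256, 0x20 = 32 bytes
    return b58encode(multihash)
```

If the assumption holds, `"//" + double_hash("QmecDgNqCRirkc3Cjz9eoRBNwXGckJ9WvTdmY16HP88768", "my/path")` is the line to put on the denylist.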

Kubo plugin

NOpfs Kubo plugin pre-built binary releases are available in the releases section.

Simply grab the binary for your system and drop it in the ~/.ipfs/plugins folder.

From that point, starting Kubo should load the plugin and automatically work with denylists (files with extension .deny) found in /etc/ipfs/denylists and $XDG_CONFIG_HOME/ipfs/denylists (usually ~/.config/ipfs/denylists). The plugin will log some lines as the ipfs daemon starts:

$ ipfs daemon --offline
Initializing daemon...
Kubo version: 0.21.0-rc1
Repo version: 14
System version: amd64/linux
Golang version: go1.19.10
2023-06-13T21:26:56.951+0200	INFO	nopfs	nopfs-kubo-plugin/plugin.go:59	Loading Nopfs plugin: content blocking
2023-06-13T21:26:56.952+0200	INFO	nopfs	[email protected]/denylist.go:165	Processing /home/user/.config/ipfs/denylists/badbits.deny: badbits (2023-03-27) by @Protocol Labs
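
To check ahead of time which files the plugin would be expected to pick up, the two directories mentioned above can be scanned for .deny files. This is only a sketch of the lookup as described in this README, not the plugin's own code; `denylist_dirs` and `find_denylists` are hypothetical helpers.

```python
import os
from pathlib import Path


def denylist_dirs(env=os.environ):
    """Directories named in the README: /etc/ipfs/denylists and the XDG config dir."""
    xdg = env.get("XDG_CONFIG_HOME") or str(Path.home() / ".config")
    return [Path("/etc/ipfs/denylists"), Path(xdg) / "ipfs" / "denylists"]


def find_denylists(dirs):
    """Return the *.deny files found in the given directories, sorted per directory."""
    found = []
    for directory in dirs:
        if directory.is_dir():
            found.extend(sorted(
                p for p in directory.iterdir()
                if p.is_file() and p.suffix == ".deny"
            ))
    return found


if __name__ == "__main__":
    for path in find_denylists(denylist_dirs()):
        print(path)
```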

The plugin can be manually built and installed for different versions of Kubo with:

git checkout nopfs-kubo-plugin/v<kubo-version>
make plugin
make install-plugin

Project layout

NOpfs contains three separate Go modules (versioned separately):

  • The main module (github.com/ipfs-shipyard/nopfs) provides the implementation of a Blocker that works with IPIP-383 denylists (can parse, track and answer whether CIDs/paths are blocked)
  • The ipfs submodule (github.com/ipfs-shipyard/nopfs/ipfs) provides blocking-wrappers for types in the Boxo stack (Resolver, BlockService etc.). Its versioning tracks Boxo tags, e.g. v0.10.0 is compatible with Boxo v0.10.0.
  • The nopfs-kubo-plugin submodule (github.com/ipfs-shipyard/nopfs/nopfs-kubo-plugin) contains only the code of the Kubo plugin, which injects blocking-wrappers into Kubo. It is tagged tracking Kubo releases.

This allows using the Blocker on its own, or relying on the blocking-wrappers separately, so that it is easy to identify and select versions whose dependencies align with your project, without pulling in more dependencies than needed.

Project status

  • Support for blocking CIDs
  • Support for blocking IPFS Paths
  • Support for paths with wildcards (prefix paths)
  • Support for blocking legacy badbits anchors
  • Support for blocking double-hashed CIDs, IPFS paths, IPNS paths.
  • Support for blocking prefix and non-prefix sub-path
  • Support for denylist headers
  • Support for denylist rule hints
  • Support for allow rules (undo or create exceptions to earlier rules)
  • Live processing of appended rules to denylists
  • Content-blocking-enabled IPFS BlockService implementation
  • Content-blocking-enabled IPFS NameSystem implementation
  • Content-blocking-enabled IPFS Path resolver implementation
  • Kubo plugin
  • Automatic, comprehensive testing of all rule types and edge cases
  • Work with a stable release of Kubo
  • Prebuilt plugin binaries

nopfs's People

Contributors

dependabot[bot], hsanjuan, lidel, web-flow

nopfs's Issues

Having IPFS default empty folder in denylist prevents ipfs+nopfs-kubo-plugin from starting

Hi!
After adding the default IPFS empty folder to a denylist, the ipfs daemon is no longer able to start:

ipfs  | 2023-07-12T11:05:55.613Z        INFO    nopfs   nopfs-kubo-plugin/plugin.go:59  Loading Nopfs plugin: content blocking
ipfs  | 2023-07-12T11:05:56.599Z        INFO    nopfs   [email protected]/denylist.go:165    Processing /data/ipfs/.config/ipfs/denylists/test-list.deny: test-list (test list for trying nopfs) by None
ipfs  | 2023-07-12T11:05:56.602Z        ERROR   core    core/builder.go:158     constructing the node: could not build arguments for function "reflect".makeFuncStub (reflect/asm_amd64.s:28): failed to build *mfs.Root: received non-nil error from function "github.com/ipfs/kubo/core/node".Files (github.com/ipfs/[email protected]/core/node/core.go:136): failure writing to dagstore: QmUNLLsPACCz1vLxQVkXqqLX5R1X345qqfHbsf67hvA3Nn is blocked and cannot be provided
ipfs  |
ipfs  | Error: constructing the node (see log for full detail): failure writing to dagstore: QmUNLLsPACCz1vLxQVkXqqLX5R1X345qqfHbsf67hvA3Nn is blocked and cannot be provided

configuration for the denylist:

name: "test-list"
---
/ipfs/QmUNLLsPACCz1vLxQVkXqqLX5R1X345qqfHbsf67hvA3Nn

As this CID is present in the badbits lists, we cannot just download and use those lists as-is; we need to remove the line with the hash "//6fe4a9f9ee915120a76ac47bf396198b02a1457dd1234ab75b2c394ca8ef5779" before using them. Not sure if it's a bug, or whether a note about this in the documentation would be enough.

How to create a denylist like badbits

Hi there!
Sorry if this is not the right place to ask about this.

I have a simple Python-based HTTP proxy that blocks incoming requests if the CID included in the request is in the badbits denylist (https://badbits.dwebops.pub/denylist.json).

The problem is that I don't know how to calculate the hash/multihash.
The instructions in Badbits for calculating the hash are not enough for me.

The Bad Bits Denylist is a list of hashed CIDs that have been flagged for various reasons (copyright violation, malware, etc). Each entry is the hex-encoded result of applying SHA2_256 to a /<optional_path> string. This format is legacy and we are moving to support double-hashed multihashes instead.

Please, could you help me with that?
Regards.
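
A minimal sketch of the legacy format described in the quote, for use in a proxy like the one mentioned: the hex-encoded SHA2-256 of the CID string followed by "/" and an optional path. It assumes, following the example comment in the README's denylist syntax section (sha256(bafy.../)), that the CID is written as a CIDv1 and that the slash is kept even when there is no path. `legacy_badbits_hash` and `is_denied` are hypothetical names, and converting other CID forms to that representation is not handled here.

```python
import hashlib


def legacy_badbits_hash(cid: str, path: str = "") -> str:
    """Hex-encoded sha256 of '<cid>/<optional_path>'."""
    target = f"{cid}/{path.lstrip('/')}"
    return hashlib.sha256(target.encode("utf-8")).hexdigest()


def is_denied(cid: str, path: str, denied_hashes: set) -> bool:
    """True when either the exact CID/path or the bare CID is on the list."""
    return (
        legacy_badbits_hash(cid, path) in denied_hashes
        or legacy_badbits_hash(cid) in denied_hashes
    )
```

The proxy would load the hashes from the denylist into a set once and call `is_denied` for each incoming request.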

Add denylist writing functionality

As it stands, the NoPFS library only supports reading the denylist format, but doesn't have any support for writing the denylist to a file. While most applications will only be reading the denylist, there are some that will need to write. It would be best to have both functionalities in the same library, to ensure correct round-tripping.
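
To make the request concrete, a writer could be as small as the sketch below. It is written in Python for brevity rather than as a proposal for the Go API; `format_denylist` is a hypothetical name, only header fields shown in the README are emitted, and values are written verbatim with no quoting or escaping.

```python
def format_denylist(name, rules, description="", author=""):
    """Serialize a header and a list of rule strings into denylist text."""
    header = ["version: 1", f"name: {name}"]
    if description:
        header.append(f"description: {description}")
    if author:
        header.append(f"author: {author}")
    # The header and the body are separated by a line containing only '---'.
    return "\n".join(header + ["---"] + list(rules)) + "\n"
```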

how to use nopfs?

Hi, nopfs looks perfect for integrating bad bits. However, it's unclear to me how to install and use nopfs.

Thanks!

ipfs blockservice is incompatible with sessions

// Blockstore returns the underlying Blockstore.
func (nbs *BlockService) Blockstore() blockstore.Blockstore {
	return nbs.bs.Blockstore()
}

// Exchange returns the underlying Exchange.
func (nbs *BlockService) Exchange() exchange.Interface {
	return nbs.bs.Exchange()
}

Note that this code does not apply any blocking to sessions.
These methods are used by boxo/blockservice.NewSession when a session is used.

This also causes issues with the new ContextWithSession feature, because it's not possible to unwrap a nopfs blockservice and access the real one underneath (so the context value key becomes the nopfs blockservice, which isn't what we want because nopfs doesn't do session redirection based on context).

We should do blocking on the blockstore.Blockstore and exchange.Exchange API arguments before they are passed to the blockservice.

Exclude well-known empty entities

Adding /ipfs/QmUNLLsPACCz1vLxQVkXqqLX5R1X345qqfHbsf67hvA3Nn (empty unixfs dir) bricks Kubo; the daemon is unable to start:

2023-10-19T19:51:49.062+0200	WARN	nopfs	[email protected]/denylist.go:169	Opening /home/lidel/tmp/ttl-ipns-repo/denylists/text.deny: empty header
2023-10-19T19:51:49.063+0200	INFO	nopfs	[email protected]/denylist.go:170	Processing /home/lidel/tmp/ttl-ipns-repo/denylists/text.deny: text.deny (No header found) by unknown
2023-10-19T19:51:49.063+0200	DEBUG	nopfs	[email protected]/denylist.go:431	text.deny:1: IPFS rule. Key: QmUNLLsPACCz1vLxQVkXqqLX5R1X345qqfHbsf67hvA3Nn. Entry: Path: (empty). Prefix: false. AllowRule: false.
2023-10-19T19:51:49.063+0200	DEBUG	nopfs	[email protected]/denylist.go:783	IsCidBlocked load: QmUNLLsPACCz1vLxQVkXqqLX5R1X345qqfHbsf67hvA3Nn
2023-10-19T19:51:49.063+0200	DEBUG	nopfs	[email protected]/entry.go:38	check-path:  matches
2023-10-19T19:51:49.063+0200	ERROR	core	core/builder.go:158	constructing the node: could not build arguments for function "reflect".makeFuncStub (reflect/asm_amd64.s:28): failed to build *mfs.Root: received non-nil error from function "github.com/ipfs/kubo/core/node".Files (github.com/ipfs/kubo/core/node/core.go:136): failure writing to dagstore: QmUNLLsPACCz1vLxQVkXqqLX5R1X345qqfHbsf67hvA3Nn is blocked and cannot be provided

Error: constructing the node (see log for full detail): failure writing to dagstore: QmUNLLsPACCz1vLxQVkXqqLX5R1X345qqfHbsf67hvA3Nn is blocked and cannot be provided

In practice, there is a short set of well-known empty CIDs that should never be blocked:

  • empty unixfs directory
    • QmUNLLsPACCz1vLxQVkXqqLX5R1X345qqfHbsf67hvA3Nn
    • bafyaabakaieac (inlined)
  • empty block
    • raw: bafkreihdwdcefgh4dqkjv67uzcmw7ojee6xedzdetojuzjevtenxquvyku
    • inlined raw: bafkqaaa
    • dag-pb: QmbFMke1KXqnYyBBWxB74N4c5SBnJMVAiMNRcGu6x1AwQH
    • dag-cbor: bafyreigbtj4x7ip5legnfznufuopl4sg4knzc2cof6duas4b3q2fy6swua
    • dag-json: baguqeeraiqjw7i2vwntyuekgvulpp2det2kpwt6cd7tx5ayqybqpmhfk76fa

@hsanjuan thoughts? would it be feasible to keep a safelist that overrides denylists, or are we ok with the footgun? (feature, not a bug)
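
The safelist idea could look roughly like the sketch below, where the safelist is consulted before the denylist. It is an illustration in Python, not NOpfs code: `is_blocked` is a hypothetical function, and it compares CID strings only, so other encodings of the same multihash would need normalizing first.

```python
# CIDs copied from the list above.
WELL_KNOWN_EMPTY_CIDS = frozenset({
    "QmUNLLsPACCz1vLxQVkXqqLX5R1X345qqfHbsf67hvA3Nn",  # empty unixfs directory
    "bafyaabakaieac",  # empty unixfs directory (inlined)
    "bafkreihdwdcefgh4dqkjv67uzcmw7ojee6xedzdetojuzjevtenxquvyku",  # empty raw block
    "bafkqaaa",  # empty raw block (inlined)
    "QmbFMke1KXqnYyBBWxB74N4c5SBnJMVAiMNRcGu6x1AwQH",  # empty dag-pb
    "bafyreigbtj4x7ip5legnfznufuopl4sg4knzc2cof6duas4b3q2fy6swua",  # empty dag-cbor
    "baguqeeraiqjw7i2vwntyuekgvulpp2det2kpwt6cd7tx5ayqybqpmhfk76fa",  # empty dag-json
})


def is_blocked(cid: str, denylisted: set) -> bool:
    """Safelisted CIDs are never reported as blocked, whatever the denylist says."""
    if cid in WELL_KNOWN_EMPTY_CIDS:
        return False
    return cid in denylisted
```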

Inconsistent file watcher

Tried to adjust a deny list with a python package through the open / write functions. Sometimes the plugin takes notice of the change, sometimes it does not. My code looks something like this (not proficient in Python by any means):

with open(os.path.expanduser(ipfsFilePath), "wt", encoding="utf-8") as ipfs_cids_file:
    ipfs_cids_file.write('\n'.join('//' + x for x in cids) + '\n')

It has inconsistent behaviour both in deletion and in addition of cids.

How I'm testing the addition of new cids:

  1. Upload the file to the GUI
  2. Hash the file's CID with sha-256
  3. Take hash and add it to the .deny file through the code above that runs in a python package
  4. Check in GUI if I can pin the file
  5. Sometimes it works, but most of the time it does not

I'm testing the deletion of cids in the same way, the only difference being that I run the ipfs daemon already having the hash inside the deny list (and it obviously does not allow me to pin the file), then I delete the line through the python code from the .deny list and I try again. Most of the time it still blocks it, even though the line was deleted.
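
One thing that may be worth trying: mode "wt" truncates and rewrites the whole file on every call, while the README's status list only mentions "live processing of appended rules". Under the assumption (not confirmed in this thread) that the plugin only follows data appended to an existing denylist, a writer that appends new rules instead of rewriting the file could look like this; `append_rules` is a hypothetical helper.

```python
import os


def append_rules(denylist_path, hashes):
    """Append '//<hash>' rules to the end of a denylist without rewriting it."""
    path = os.path.expanduser(denylist_path)
    lines = [f"//{h}\n" for h in hashes]
    # Make sure the new rules start on their own line.
    needs_newline = False
    if os.path.exists(path) and os.path.getsize(path) > 0:
        with open(path, "rb") as existing:
            existing.seek(-1, os.SEEK_END)
            needs_newline = existing.read(1) != b"\n"
    with open(path, "a", encoding="utf-8") as out:
        if needs_newline:
            out.write("\n")
        out.writelines(lines)
    return len(lines)
```

Under the same assumption, deleting a line would not be noticed by a running daemon. The README lists allow rules (prefixed with +) as a way to undo earlier rules, so appending one may be an alternative to deletion.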

Remote denylists and watching system (proposal)

The following are my thoughts on how to provide denylists so that they can be subscribed-to.

Server

  • Using HTTP (poll), essentially an IPFS-gateway served file:
    • Lists are made available over an http endpoint.
    • Range requests are supported and accepted.
    • ETag (set to the CID) and caching headers.
    • I'd like to expand this to use Server Push / Notifications, but it is actually simpler and OK if clients poll regularly and check the ETag to see if the content changed.
  • Using IPFS:
    • Denylist IPFS-host server publishes to pubsub topic denylist/<name>. Must be a signed message.
    • The message includes the CID of the latest version of the list. This is published every minute, or when it is updated.
    • The CID is the CID of the denylist which is a normal unixfs file (balanced chunking).
    • UnixFS files have support for seeking out of the box, and only the necessary blocks are downloaded when looking for specific bytes.

Client

  • HTTP: Client polls for the file every minute using a ranged request starting at the last byte read. A HEAD request can be done in advance to check the ETag and decide if a GET request is needed. New bytes are appended to the file on disk.
  • IPFS: Client subscribes to pubsub topic. If a new CID comes in, we use unixfs to seek to the last byte read and then it is appended to the file on disk. The pubsub message can include more than the CID, for example a field to indicate if redownloading the full file and processing from the beginning is necessary.
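
The HTTP client side of this proposal can be sketched with the Python standard library as follows. This only illustrates the polling idea and is not part of NOpfs; `head_request`, `range_request` and `poll_once` are hypothetical names, and the local file size is used as "the last byte read".

```python
import os
import urllib.error
import urllib.request


def head_request(url: str) -> urllib.request.Request:
    """HEAD request used to look at the ETag before downloading anything."""
    return urllib.request.Request(url, method="HEAD")


def range_request(url: str, offset: int) -> urllib.request.Request:
    """GET request asking only for the bytes from `offset` onwards."""
    return urllib.request.Request(url, headers={"Range": f"bytes={offset}-"})


def poll_once(url: str, local_path: str, last_etag=None):
    """Run one polling round and return the ETag seen (or None)."""
    with urllib.request.urlopen(head_request(url)) as response:
        etag = response.headers.get("ETag")
    if etag is not None and etag == last_etag:
        return etag  # unchanged since the previous round
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    try:
        with urllib.request.urlopen(range_request(url, offset)) as response:
            # 206: only the new bytes were sent; 200: the server ignored Range.
            mode = "ab" if response.status == 206 else "wb"
            data = response.read()
    except urllib.error.HTTPError as err:
        if err.code == 416:  # nothing available beyond `offset` yet
            return etag
        raise
    with open(local_path, mode) as out:
        out.write(data)
    return etag
```

A caller would invoke `poll_once` in a loop, sleeping a minute between rounds and passing back the ETag returned by the previous call.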
