scuttlego

A Go implementation of the Secure Scuttlebutt protocol. This implementation was designed to be used by the Planetary client and attempts to be efficient and stable while keeping a relatively low memory footprint. As scuttlego is under active development, the exposed interfaces may change while its API stabilizes.

Features

Supported

  • Transport (handshake, box stream, RPC layer)
  • Support for the default feed format
  • Tracking the social graph
  • Connection manager (local peers, predefined pubs)
  • Replicating messages using createHistoryStream and Epidemic Broadcast Trees
  • Replication scheduler (prioritise closer feeds, avoid replicating the same messages simultaneously from various peers etc.)
  • Replicating and creating blobs
  • Pushing blobs
  • Tunneling via rooms
  • Some commands and queries for managing room aliases

Planned

  • Connection manager (dynamic discovery of pubs from feeds)
  • Handling blob wants received from remote peers
  • Cleaning up old blobs and messages
  • Private messages
  • Private groups
  • Support for other feed formats
  • Metafeeds

Community

If you want to talk about scuttlego feel free to post on Secure Scuttlebutt using the #scuttlego channel.

Also check out Matrix channels such as #golang-ssb-general:autonomic.zone and #planetary:matrix.org.

Protocol

To get an overview of the technical aspects of the Secure Scuttlebutt protocol check out the following resources:

Contributing

Check out our contributing documentation.

If you find an issue, please report it on the issue tracker.

Acknowledgements

This implementation depends on go-ssb and associated libraries under the hood. Thanks to that, the following elements didn't have to be reimplemented from scratch:

  • the handshake mechanism
  • the box stream protocol
  • the verification and signing of messages
  • broadcasting and receiving local UDP advertisements

We are ever grateful for the work done by the authors and contributors of go-ssb and associated libraries as without them scuttlego most likely wouldn't have been completed.

scuttlego's People

Contributors

boreq, omahs, rabble

scuttlego's Issues

The distinction between refs.Identity and refs.Feed should probably go away

Right now we have two types: refs.Identity and refs.Feed. This is quite confusing and at this development stage it is unclear what the difference between those types is. If we treat the Protocol Guide as our domain then there should probably be no distinction between those types. The existence of identity.Public makes it even more confusing. Until there is a clear reason to distinguish between those two types (e.g. maybe after adding metafeeds/other feed formats) it may be best to just stick to refs.Feed.

Panic when cleaning up live history streams

panic: runtime error: index out of range [3] with length 3

goroutine 47 [running]:
github.com/planetary-social/scuttlego/service/app/queries.(*LiveHistoryStreams).CleanupClosedStreams(0xc0000165a0)
	/home/filip/repositories/planetary/scuttlego/service/app/queries/create_history_stream.go:239 +0x46a
github.com/planetary-social/scuttlego/service/app/queries.(*CreateHistoryStreamHandler).cleanupWorker(0xc000080480, {0x752820, 0xc000080500})
	/home/filip/repositories/planetary/scuttlego/service/app/queries/create_history_stream.go:145 +0x3d
created by github.com/planetary-social/scuttlego/service/app/queries.(*CreateHistoryStreamHandler).startWorkers
	/home/filip/repositories/planetary/scuttlego/service/app/queries/create_history_stream.go:113 +0x195
exit status 2

Blob replication manager

The idea is to prefetch some blobs, fetch others on demand etc. The data for this is already stored in the database but not used.

Publishing binaries for latest dev builds

Then folks can start using the builds as they come out... might be useful to get more feedback and testing reports... no commitments need to be made on the scuttlego side, i'd say. just make the builds available and people can try to use it. e.g. https://git.coopcloud.tech/PeachCloud/peach-workspace/issues/134#issuecomment-14032

some chat logs:

b: Are we talking about a specific use case like "I want to run a pub"?
b: Eg. an executable binary?
d: yep

could make a stab at implementing the CI side of this if you like?

A go-releaser config might be nice https://goreleaser.com/

Messages received through message pubsub may be out of order

Messages received from the message pubsub in the create history stream query are filtered by feeds and sent to peers. Those messages may be out of order. While the documentation doesn't clearly state this, I assume that sending live messages out of order violates the spec.

Better handling of errors when processing messages

Errors are common when processing message content. Currently, if a message is rejected we may end up with a lot of messages with higher sequence numbers stuck in the message buffer. Those messages will also keep being replicated over and over again via various means even though we can't consume them.

Overall it is a bit unclear what to do when we know we can't handle a specific message. Perhaps some kind of back-off mechanism is needed.

Blob related data cleanup

We would like to:

  • cleanup wants processes after incoming and outgoing streams are disconnected
  • cleanup remote wants after all of our streams disconnect?

Add support for blobs

Incoming messages should be automatically scanned for references to blobs and that information should be saved in the database. Blobs referenced in messages created by feeds up to n hops away should be automatically retrieved (probably just friends).

The following message content types should be scanned for blobs:

  • post
  • about

It is unclear whether other message types reference blobs. An alternative approach would be to parse the raw content bytes for anything that matches the &<hash>.sha256 format.

We also need a mechanism to request blob retrieval on demand. This will be used for blobs further away than n hops.

Blobs have a wants mechanism which allows peers to forward wants from other nodes. It is unclear if this is mandatory to implement. I think this could be skipped for now.

When it comes to creating blobs: right now creating blobs is largely decoupled from creating messages that reference them. A blob is added which results in a ref and then that ref is included in a message. This means that if you give up on creating that message you are going to have an orphaned blob in your program. Maybe it makes sense to develop some new approach which forces the user to call some code to get a ref to a blob but that blob will not become live (or will get cleaned up after some time) unless the user actually embeds that blob in a message somehow.

Progress tracker (incomplete):

  • persist information about blobs
  • scan messages for blobs
  • replication manager? (prefetch some blobs, fetch others on demand)
  • persist blobs
  • forward remote wants (let's not do this now)
  • handle incoming blobs.get
  • handle incoming blobs.getSlice
  • handle incoming blobs.createWants
  • reply with "has" when "want" is received
  • cleanup wants processes after incoming and outgoing streams are disconnected
  • after replicating a blob check if someone would like to know about it (wants)
  • cleanup remote wants after all of our streams disconnect?
  • persist an "on demand" want list
  • revamp the want list so that the blobs that we retrieve are removed from it (probably just to the on-demand want list for now)
  • after UI asks for a blob on demand persist it in the want list for a specific amount of time
  • refresh want list when some events are emitted
  • redo on-demand want list cleanups (consult @czeslavo to figure out how cron+command?)
  • create blobs

Negotiation mechanism doesn't work

Session runner isn't correctly tested. If an error is returned in IncomingStreamAdapter (?) it won't be correctly propagated to the negotiator. The negotiation mechanism has to be tested in general.

Global banlist of feeds

This task is about ingesting some kind of a global ban list of feeds. Banning a feed should mean that their messages will no longer be replicated. All messages from this feed should be removed from the local database.

Use scuttlego as a pub server

It is unclear how that would work: different DI configs, different runtime configs, a separate repo, two separate repos with a common core...

Add support for rooms

Rooms need to be supported before using scuttlego in Planetary as we just added basic room support to the app.

The scope of the required work needs to be further investigated and documented.

Alias management:

  • connect to a room and list aliases using an RPC command
  • connect to a room and revoke an alias using an RPC command
  • connect to a room and register an alias using an RPC command
  • return a predefined error when an alias is already taken when registering it (requires changes to the RPC layer, related to #62)
  • test commands and queries, confirm how EndOrErr is set for async requests

Connection management:

  • accept tunnel.connect
  • when connecting to the room we must be able to dial people using that room
  • persist information about rooms advertised on feeds
  • stay connected to our own rooms so that people can talk to us?
  • use rooms in connection management (randomly dial known rooms and talk to the online peers should be fine for now)

Decouple domain and application by using pubsub

A while ago we identified a less-than-perfect piece of code: the domain calls an application command. We want to rework this and make sure that raw messages are passed via a pubsub.

This was already attempted in 43d855b but it wasn't a great idea as we need the changes from #44.

Removing messages breaks the receive log

If a message was removed then we just have to skip that message in the receive log. For now I didn't touch the receive log as it is unclear what that new contract will look like. Maybe we should also communicate message removals to Swift somehow. Or we just need to redesign that integration completely.

Point to matrix channel?

Would it be an idea to point to the "golang ssb general" channel in the README or somewhere handy here? Or to make your own channel, that's cool too. But to increase the chances of interested people dropping in for a chat? Can send a little PR if you like.

Blocked feeds should be excluded from the social graph

Right now this is not actually entirely supported. The idea is that we should look at the blocking field in contact messages. If blocking was set to true at any point then this edge of the social graph shouldn't be followed. One caveat is that our messages should take priority, for example: if we blocked someone then they should never end up in the social graph even if someone we follow follows them.

Message buffer rework

The message buffer is far from perfect right now.

  • it should ensure that a failure to update one feed doesn't mean that other feeds can't be updated
  • if some messages are missing and therefore can't be appended to a feed it should keep messages for a specific amount of time as messages can arrive out of order and try to reorder them (needed for #45)
  • it should try to delay validating signatures on messages and just peek their sequence numbers in case we get the same message from multiple sources (probably should replace the mechanism which lets the replication code get the sequence number from the message buffer, this is not really reliable if messages arrive out of order)

All commands and queries should have constructors

Right now commands and queries don't have constructors. This is a somewhat odd approach that we stuck with on my old team even though we all agreed they should have them, and I kept doing the same thing here for no particular reason.

As a part of this task we should:

  • make sure that commands and queries have private fields
  • make sure that commands and queries have strong constructors
  • make sure that command and query handlers check for zero value of command/query

Cleanup feeds which drop out of our social graph

We need to develop a mechanism for removing old messages. Messages should be removed when a feed drops out of our social graph.

When messages are removed we should also remove associated blobs. Blobs should also be most likely removed after they haven't been used for a while unless they are a blob that we created. Alternatively we should simply always remove blobs that were not created by us after a specific amount of time.

Migrating the existing data from go-ssb

  • migration mechanism
  • migration mechanism tests
  • message migration
  • message migration tests
  • test on other people's feeds
  • blob migration - created a separate ticket so that this isn't a huge ticket that remains open for weeks #123

Rename the project to avoid confusion

Currently the name of the project is highly confusing even when talking internally about it. We need to come up with a unique name and rename this project. We can close this once we finalize the internal discussions about this.

Replicating messages using EBT

Currently only "legacy" replication using createHistoryStream is supported. It would be good to also support the newer replication mechanism referred to as EBT (Epidemic Broadcast Trees).

One issue that I can see right away is that we wanted to prioritize feeds that are closer to us in the social graph. This appears impossible when using EBT replication.

Implementing this may be challenging as EBT replication isn't documented.

Another advantage of implementing this mechanism is that, as I understand it, Manyverse developers have expressed a desire to no longer support createHistoryStream replication. This could mean that in the future Planetary wouldn't be able to replicate with Manyverse clients unless EBT is implemented.

Add a way to remove specific feeds with all associated data

We need a mechanism for removing a specific feed with all associated data. This may for example include:

  • blobs (if no other feeds reference them/the temporary on-demand want list doesn't reference them)
  • contacts
  • pubs
  • etc

This is needed for #16 and more importantly #20. I am creating this issue to distinguish between adding a way to drop a feed and implementing the slightly more complicated case of cleaning up feeds described in #16.

createHistoryStream requests spawn a lot of goroutines

When connecting to a pub it is likely that one createHistoryStream request will be received for each feed followed by that pub. If a pub follows 10000 feeds and we connect to 5 such pubs, this may mean that 5 * 10000 * number_of_goroutines_created_per_request goroutines will be created. This clogs up the application and puts a lot of pressure on the runtime as well as the garbage collector. We need a mechanism to hand off those requests to a small number of workers.

Following people doesn't change the button correctly

When following people the follow button changes for a couple hundred milliseconds, suggesting that the feed has been followed, but then switches back to "follow this person". In reality the follow message is published correctly. This seems to be some kind of regression as this used to work correctly.

I think this only happens after we also replicate some external messages?

Error processing mentions for post messages

time="2022-09-26T12:22:56+02:00" level=error msg="mapping returned an error" content="{\"type\":\"post\",\"text\":\"a new photo from #dweb-camp 2022! ![photo.bmp](&O0h21NiGLLmjCF1kD2xWllvPExwe6t5P+F7YK3HAX4g=.sha256)\",\"mentions\":{\"0\":{\"name\":\"photo.bmp\",\"type\":\"image/bmp\"}}}" error="json unmarshal failed: json: cannot unmarshal object into Go struct field transportPost.mentions of type []json.RawMessage" name=main typ=post

time="2022-09-26T12:22:56+02:00" level=debug msg="error processing incoming message" error="error handling a message: failed to identify the raw message: unknown message: 1 error occurred:\n\t* could not unmarshal message content: mapping 'post' returned an error: json unmarshal failed: json: cannot unmarshal object into Go struct field transportPost.mentions of type []json.RawMessage\n\n" name=main.session
