Comments (7)

mtalexan commented on May 18, 2024

@cryptix
You make a good point about changing to ignore an ID you previously followed. Because the IGNORED value is only allowed in the vector clock response, there seems to be no way to avoid being told about the state of IDs you once requested but no longer care about. There's no way to indicate a negative cache request for an ID in the advertisement, or a cache flush in the request. So if you ever asked a remote about an ID in your advertisement, it's in the remote's cache of your state, and any later omission of that ID is assumed to be for efficiency, not a negative cache request.
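A minimal sketch of the caching behaviour described above, assuming the remote simply merges each advertisement into its cached view of you. The `PeerClock` type and `mergeAdvertisement` function are hypothetical names for illustration and are not taken from any client.

```typescript
// Hypothetical model: a clock maps feed ID -> latest known sequence number.
type PeerClock = Record<string, number>;

// Merge a newly received advertisement into the cached view of the peer.
// IDs missing from the advertisement are kept unchanged, because omission
// is read as "nothing new to say", not as "I no longer want this ID".
function mergeAdvertisement(cached: PeerClock, advert: PeerClock): PeerClock {
  const next: PeerClock = { ...cached };
  for (const id of Object.keys(advert)) {
    const seq = advert[id];
    if (seq !== undefined) {
      next[id] = seq;
    }
  }
  return next;
}

const cachedView: PeerClock = { "@alice": 10, "@bob": 4 };
const laterAdvert: PeerClock = { "@alice": 12 }; // @bob left out on purpose
const mergedView = mergeAdvertisement(cachedView, laterAdvert);
console.log(mergedView); // @bob is still present in the remote's cache
```

Under this model, leaving `@bob` out of a later advertisement cannot remove it from the remote's cache, which is the gap being discussed.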

Caveat: I haven't examined the code for any clients to know if there's some deviation from the documentation, or if there's some other message/message flag to request a full cache replacement.

from epidemic-broadcast-trees.

cryptix commented on May 18, 2024

Very nice writeup, thanks @mtalexan! It actually prompted me to double check and find an issue in my go implementation.

Both sides can then follow up with requests for the messages they want in order to update their internal state to the latest of the IDs they both care about.

A small "well, actually", since you were asking about differences between the papers and the implementation: in ssb-ebt, there are no additional requests. Even though the system could fall back to its single-feed createHistoryStream calls, when in EBT mode it eagerly pushes all of a feed's messages by sending rx:true on the note of that feed in the vector clock updates.
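For readers unfamiliar with the notes mentioned here, the following sketch decodes a single integer note. It assumes the encoding as described in the SSB protocol guide: -1 means "don't replicate", otherwise the lowest bit is the inverted receive (rx) flag and the remaining bits are the sequence number. The `Note` interface and both function names are hypothetical, and the layout should be checked against the implementation before relying on it.

```typescript
// Hypothetical decoded form of one entry in the vector clock.
interface Note {
  replicate: boolean;
  receive: boolean; // the "rx" flag
  sequence: number;
}

// Assumed encoding: -1 => do not replicate; otherwise
// value = (sequence << 1) | (receive ? 0 : 1).
// Bitwise operators limit this sketch to 32-bit sequence values.
function decodeNote(value: number): Note {
  if (value < 0) {
    return { replicate: false, receive: false, sequence: 0 };
  }
  return { replicate: true, receive: (value & 1) === 0, sequence: value >> 1 };
}

function encodeNote(note: Note): number {
  if (!note.replicate) {
    return -1;
  }
  return (note.sequence << 1) | (note.receive ? 0 : 1);
}

console.log(decodeNote(450)); // replicate, rx:true, sequence 225
console.log(decodeNote(1));   // replicate, rx:false, sequence 0
console.log(decodeNote(-1));  // do not replicate
```

With rx:true set on a feed's note, the peer is expected to push that feed's messages without a separate request, which matches the eager behaviour described above.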

cryptix commented on May 18, 2024

+1. Another reason would be unblocking a feed, and it would be nice not to break the protocol too much.

Maybe we could just pick a special integer value? -1 is "don't replicate", so there is already a precedent.

arj03 commented on May 18, 2024

I think the way to go about this would be, as you say, to just define a value, write tests, and fix the implementation for that.

mtalexan commented on May 18, 2024

@arj03

I found another paper on the same topic to be more explanatory with respect to the vector clock (2 sections from the same paper https://github.com/dominictarr/scalable-secure-scuttlebutt/blob/master/paper.md#append-only-gossip-scuttlebutt and https://github.com/dominictarr/scalable-secure-scuttlebutt/blob/master/paper.md#append-only-gossip-with-request-skipping).

The purpose of the vector clock is to advertise what you already have, and then get a response from the remote about what it has from the list of IDs you advertised. If you cache what they responded with, and they cache what you advertised, both sides can assume the list of IDs is unchanged if parts of the list are later left out of future advertisements.

So if you lose your database, you would no longer have a cached state for the remote. When you look at your current internal vector clock (your absolute state), you would have a list of IDs that all show sequence number 0, and no cached state for the remote. Diffing against the empty remote cache would result in you sending your entire list of IDs in the advertised vector clock.
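The database-loss case can be sketched as follows, assuming the advertisement is built by comparing your own clock against what you last sent to this remote. `FeedClock`, `buildAdvertisement`, and the `lastSent` cache are hypothetical names chosen for illustration.

```typescript
// Hypothetical model: a clock maps feed ID -> sequence number.
type FeedClock = Record<string, number>;

// Advertise only the entries that differ from what this remote was last told.
// lastSent is the local cache of previous advertisements to that remote.
function buildAdvertisement(local: FeedClock, lastSent: FeedClock): FeedClock {
  const advert: FeedClock = {};
  for (const id of Object.keys(local)) {
    const seq = local[id];
    if (seq !== undefined && lastSent[id] !== seq) {
      advert[id] = seq;
    }
  }
  return advert;
}

// After a database wipe: every tracked ID is at sequence 0 and the cache is empty.
const wipedLocal: FeedClock = { "@me": 0, "@alice": 0, "@bob": 0 };
const emptyCache: FeedClock = {};
const fullAdvert = buildAdvertisement(wipedLocal, emptyCache);
console.log(fullAdvert); // all three IDs are advertised

// With an intact cache, unchanged entries are skipped.
const partialAdvert = buildAdvertisement({ "@me": 7, "@alice": 3 }, { "@me": 7, "@alice": 2 });
console.log(partialAdvert); // only @alice is advertised
```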

When the remote receives your list of IDs set to sequence number 0, it updates its cached state about you: it filters the list of requested IDs down to those it also tracks, then updates the sequence numbers for those IDs in its cached state of you. Notably, if you somehow dropped an ID during the database reset, the remote will believe your current state for that ID is whatever you last reported before the reset [1].

The remote responds to your request with: the IGNORED value for every ID you requested that it ignores (and therefore knows nothing about), the current state it has for all other IDs in your request, and the current state of all IDs in its cached state of you that weren't in your request but that it has newer info about.
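The three parts of that response can be sketched as below. This is an illustration of the description in this comment, not code from any client: `ReplyClock`, `buildResponse`, and the choice of -1 for `IGNORED_VALUE` are assumptions made for the example.

```typescript
// Hypothetical model: a clock maps feed ID -> sequence number.
type ReplyClock = Record<string, number>;

// Assumption: reuse -1 as the marker for "I ignore this ID".
const IGNORED_VALUE = -1;

// request:    the clock the peer just advertised
// tracked:    the remote's own state for the IDs it follows
// cachedPeer: what the remote previously cached about the peer
function buildResponse(request: ReplyClock, tracked: ReplyClock, cachedPeer: ReplyClock): ReplyClock {
  const response: ReplyClock = {};
  // Parts 1 and 2: every requested ID gets either IGNORED or the current state.
  for (const id of Object.keys(request)) {
    const have = tracked[id];
    response[id] = have === undefined ? IGNORED_VALUE : have;
  }
  // Part 3: IDs the peer asked about earlier but omitted now,
  // included only when the remote has newer info than the cached value.
  for (const id of Object.keys(cachedPeer)) {
    if (id in request) continue;
    const have = tracked[id];
    const theirs = cachedPeer[id];
    if (have !== undefined && theirs !== undefined && have > theirs) {
      response[id] = have;
    }
  }
  return response;
}

const reply = buildResponse(
  { "@alice": 0, "@bob": 0, "@unknown": 0 },   // request after a wipe
  { "@alice": 5, "@bob": 0, "@carol": 9 },     // what the remote tracks
  { "@alice": 0, "@bob": 0, "@carol": 3 }      // cached state of the peer
);
console.log(reply);
```

Here `@unknown` comes back as ignored, `@alice` and `@bob` come back with the remote's current state, and `@carol` is included even though it was not requested, because the remote has newer info than it believes the peer holds.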

When you receive this state info from the remote, you now know what not to ask it about again (IGNORED), and what it has newer (or any) info about. Both sides can then follow up with requests for the messages they want in order to update their internal state to the latest of the IDs they both care about. For you it would be all the messages for all the IDs you requested that the remote has any info about, plus any IDs you didn't request this time but have previously requested, and that were updated since your pre-database-wipe request to the same remote.

Am I misunderstanding your question here, or do you think I'm misunderstanding the protocol?

[1] The vector clock mechanism is not how you recover your list of tracked IDs. Those are rebuilt from the history of your own messages, which include follow and ignore/block events for everyone you ever followed or ignored. Normally you therefore wouldn't have a case where you've lost some IDs from your vector clock, except for the corner case where you're attempting to rebuild your own message history from your peers.
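As a rough illustration of the footnote, the tracked IDs could be rebuilt by replaying your own follow and block events in order. The `ContactEvent` shape below is a simplified, hypothetical stand-in for the real message format, and `rebuildTrackedIds` is a name invented for this sketch.

```typescript
// Simplified, hypothetical shape of a follow/block event in your own feed.
interface ContactEvent {
  target: string;     // the feed ID the event is about
  following: boolean;
  blocking: boolean;
}

// Replay the events in feed order; later events override earlier ones.
function rebuildTrackedIds(history: ContactEvent[]): string[] {
  const tracked: Record<string, boolean> = {};
  for (const ev of history) {
    if (ev.following && !ev.blocking) {
      tracked[ev.target] = true;
    } else {
      delete tracked[ev.target];
    }
  }
  return Object.keys(tracked);
}

const ownHistory: ContactEvent[] = [
  { target: "@alice", following: true, blocking: false },
  { target: "@bob", following: true, blocking: false },
  { target: "@bob", following: false, blocking: true }, // later blocked
];
const trackedIds = rebuildTrackedIds(ownHistory);
console.log(trackedIds); // only @alice remains tracked
```

This is why your own feed has to be recovered first: without it there is no history to replay.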

arj03 commented on May 18, 2024

@mtalexan thanks for digging up that paper. Yes, it probably should work that way, but I don't think the current implementation does :) We need to add a test for that first of all.

arj03 commented on May 18, 2024

This now works. The idea is that you make sure you request your own feed; the remote peer will then send you all your messages, and from there you can request the other feeds. It also works in the case where you start from a backup instead of from an empty database.
