Giter Site home page Giter Site logo

Comments (9)

broofa avatar broofa commented on May 14, 2024 2

@bradleypeabody Thanks for following up. I appreciate the diligence you've shown in shepherding this spec.

So the extra orders of magnitude between microseconds and nanoseconds would probably end up causing more implementation hassle than utility

Yes, exactly! The hassle of having to figure out how to split the timestamp into msecs and nsecs, and manipulate those fields in a way that didn't inadvertently lose timestamp precision in the process, is why I shudder every time I see "nanosecond precision", or even "microsecond".

from uuid6-ietf-draft.

edo1 avatar edo1 commented on May 14, 2024

If unixts is unchanged (i.e. in the same clock cycle), clock_seq MUST be incrementied by one (1)

  1. It will suffer from the same problem as the original ULID: the next/previous value is likely to be predicted from the known identifier.
    To avoid this, there is always a random part in my proposal #34;
  2. The one-second resolution is not enough for ordering identifiers in the case of lock-free concurrent generation;
  3. Most significant bit of clock_seq will almost always be zero, so only 85 bits of true randomness. IMHO this is too little for global uniqueness.

from uuid6-ietf-draft.

bradleypeabody avatar bradleypeabody commented on May 14, 2024

I think this boils down to the same thing, the arguments brought up are essentially application-specific points:

Where UUIDs are concerned subsecond timestamp information is arguably indistinguishable from a simple counter

Totally depends on the environment. If you're on an IoT device without a real-time clock, what you state is probably totally right (in fact it's probably even less accurate). But if you work in aerospace and are dealing with timestamps from GPS satellites and are taking into account transmission time and relativistic effects, we cannot make the assumptions you're stating. It depends on the application.

the next/previous value is likely to be predicted from the known identifier.

Some applications care about how predictable generated values are, others couldn't care less.

unix epoch, 1-second resolution

There is very little difference between this and and using a 1-second regular unix timestamp, multiplying it by 1 billion to convert to nanoseconds, and add a random number between 1 and a billion to it. And I think this is a totally fine and correct idea for some applications and should definitely be allowed. The difference is that if an application does in fact need nanosecond precision, or anywhere close to it, it is available. Totally agreed that the least significant parts can be filled in with random or clock sequence if it's not needed for a particular case.


These are all tradeoffs that I don't see a single answer to. Which is why I lean toward making each of these things possible to do, so applications can still "implement UUID correctly" and cater to application needs.

@broofa What you're describing, if you're willing to use the other UUIDv7 layout, is accomplish-able and a correct implementation. Essentially there is nothing that stops you from using the "random" part as a big clock sequence like you describe above - IF that's what you need for your application.

from uuid6-ietf-draft.

broofa avatar broofa commented on May 14, 2024

Where UUIDs are concerned subsecond timestamp information is arguably indistinguishable from a simple counter

Totally depends on the environment. If you're on an IoT device without a real-time clock, what you state is probably totally right (in fact it's probably even less accurate). But if you work in aerospace and are dealing with timestamps from GPS satellites and are taking into account transmission time and relativistic effects, we cannot make the assumptions you're stating. It depends on the application.

My concern is that the provisions in the spec that allow for filling timestamps with random or sequence bits to "simulate" high precision necessarily undermine the utility of the timestamp information. Upon receiving an id, you have no way of knowing how accurate it actually is. Your choices are to assume its low precision (i.e. don't depend on it being accurate), or bake the assumption that it's accurate into your system. But in the latter case, the stricter your assumption is, the more likely it is to be wrong.

Does that make sense? It's like... well... JavaScripts Math.random(), where there's no specification about the quality of the underlying RNG. You can use it, just don't expect it to work.

Basically, if you're going to argue that it's not necessary to specify cryptographic quality RNGs in UUIDs, you shouldn't be simultaneously arguing for high-precision timestamps.

from uuid6-ietf-draft.

edo1 avatar edo1 commented on May 14, 2024

Upon receiving an id, you have no way of knowing how accurate it actually is

Is there a reason why a UUID receiver might need to know the accuracy of the clock?

from uuid6-ietf-draft.

nerg4l avatar nerg4l commented on May 14, 2024

This proposal is probably based on ULID or Push ID. I really like how it provides monotonicity in case of overflow but I also have concerns on the time precision.


@edo1

  1. It will suffer from the same problem as the original ULID: the next/previous value is likely to be predicted from the known identifier.

Maybe I understand this part incorrectly of RFC 4122 but it says in Section 6 "Do not assume that UUIDs are hard to guess;".

  1. Most significant bit of clock_seq will almost always be zero, so only 85 bits of true randomness. IMHO this is too little for global uniqueness.

How many bits of randomness is enough for "global uniqueness"? I could be wrong but according to my logic that 85 bits are there to make the UUID unique in a second and not there to make it "globally unique" because the time and random value together will make it "globally unique".

from uuid6-ietf-draft.

edo1 avatar edo1 commented on May 14, 2024

Maybe I understand this part incorrectly of RFC 4122 but it says in Section 6 "Do not assume that UUIDs are hard to guess;"

If it is possible to create "hard to guess" UUID, we should.
This does not make things worse if unpredictable UUIDs are not required.

How many bits of randomness is enough for "global uniqueness"?

Just added this proposal to my spreadsheet
As a reminder, the annual collision probability is calculated when a new UUID is generated every nanosecond 10⁹ new UUIDs are generated every second.
Probablilty is 33.5% for this proposal, 0.27%/2.11% for #34

from uuid6-ietf-draft.

bradleypeabody avatar bradleypeabody commented on May 14, 2024

@broofa Circling back around to this:

It's like... well... JavaScripts Math.random(), where there's no specification about the quality of the underlying RNG. You can use it, just don't expect it to work.

I think that argument has the most merit.

I don't think it particularly matters how accurate timestamps are from one system to another, simply because this is always going to be uncontrollable - clock skew, systems without or broken NTP, varying clock precisions - it's just not something that can realistically be controlled, at least not for most implementations. So no matter what the spec says about accuracy, it doesn't mean much if the physical system hosting the implementation doesn't match or comply with it.

That said, after looking through this more carefully, I think the argument that basically goes: "Don't pick a format that will usually have bad/wrong information on most systems" holds a lot of weight.

In doing some testing and research, it looks like at least some modern systems are only microsecond-precise (Mac laptop). And even when more precise numbers are available, unless the code implementing the timestamp uses certain hardware features very carefully it's easy to get numbers that are "nanosecond-precise" but not accurate. So the extra orders of magnitude between microseconds and nanoseconds would probably end up causing more implementation hassle than utility (implementations would normally be filling this in with random data, and almost never be using it communicate a more accurate time).

And so then it raises the question of how much precision is a good combination of useful and will normally be accurate on most systems. Which then raises another question about the exact format. I'm going to bridge that over to #27

from uuid6-ietf-draft.

kyzer-davis avatar kyzer-davis commented on May 14, 2024

@broofa , since we moved this to #27 is this one okay to close?

from uuid6-ietf-draft.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.