Comments (8)
@safinaskar, I have actually reviewed your errata for the original RFC4122 regarding big endian and little endian, and I am fairly certain the current Draft 04 has no ambiguity on that topic as it pertains to the items in this specific draft.
Please let me know if you feel otherwise but I tried to go out of my way and provide concise verbiage around this topic.
My proposal: textual UUIDs should always be generated in lower case
This is a pretty good problem, and the topic of "text encodings: change them or provide alternatives" is being worked on under
https://github.com/uuid6/new-uuid-encoding-techniques-ietf-draft
Also CCing @ben221199, who has been reviewing the original RFC4122 for a fresh coat of paint on the v1 through v5 topics from that RFC.
from uuid6-ietf-draft.
I'm mentioned I see. The standard of this repository will add UUIDv6 and only that (as far as I know). This standard will update RFC 4122, because it adds version 6 to RFC 4122.
I'm also working on another standard. That standard will deprecate RFC 4122 (and also this standard, I think), because it will redefine what was already in RFC 4122.
The idea is that the standard redefines all existing UUID variants (Apollo, NCS, Microsoft, Full-UUID, etc.) and versions (v1-v6) and directly registers the variants and versions in an IANA registry, so that there will be no doubt about it. If you (@safinaskar) think that RFC 4122 (and its errata) aren't fully correct, that standard is the place to fix it, as I think I want to register different serialization types too.
Also, I want to post a timeline here, because I think this could be the order in which to release these standards (but the second and third could also be swapped):
[RFC 4122]
-> [RFC xxxx: UUIDv6] (this repository)
-> [RFC xxxx: UUID + IANA] (my standard)
-> [RFC xxxx: UUIDv7 and UUIDv8]
Hope this explanation helps a bit.
I'm also working on another standard
Cool, thanks!
There lies the risk. By default, most strings in most languages have comparison operators that are case sensitive. This means that these 3 UUIDs, despite being identical in their authentic binary representation, wouldn't compare as equal as strings.
Well, of course not? The dashed-hex form of a UUID is one of infinitely many encodings of the actual UUID, which is a specific sequence of 16 bytes. The UUID is the bytes, not the string — all encoded forms of a UUID must be decoded before they can be compared.
In short, big endian / little endian are not used consistently in original RFC, so (in my opinion) it is impossible to create working translator from binary UUID form to textual and vice-versa based on RFC text alone.
Is a sequence of bytes represented as ASCII hexadecimal not invariant to endianness? Given a dashed-hex string representation of a UUID e.g. 210fc7bb-8186-39ac-48a4-c6afa2f1581a
is the actual value not unambiguously
0x21 0x0f 0xc7 0xbb
0x81 0x86 0x39 0xac
0x48 0xa4 0xc6 0xaf
0xa2 0xf1 0x58 0x1a
?
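To make the point above concrete, here is a minimal sketch using Python's standard `uuid` module. It assumes the big-endian (network order) reading of the dashed-hex string, and the variable names are only illustrative. It decodes three textual spellings of the same UUID and compares both the strings and the decoded values:

```python
import uuid

text = "210fc7bb-8186-39ac-48a4-c6afa2f1581a"

# Three textual encodings of the same 16 bytes: lower case, upper case,
# and the brace-wrapped form (all accepted by uuid.UUID).
variants = [text, text.upper(), "{" + text + "}"]

# Decode every encoded form before comparing.
decoded = [uuid.UUID(v) for v in variants]

# The 16 raw bytes, in the order the hex digits appear in the string.
raw = decoded[0].bytes

# Comparing the strings is case sensitive, so this is False.
strings_equal = variants[0] == variants[1]

# Comparing the decoded values is True for all three spellings.
uuids_equal = all(u == decoded[0] for u in decoded)

print(" ".join(f"0x{b:02x}" for b in raw))
print(strings_equal, uuids_equal)
```

The hex string itself has no endianness; the question only arises when the fields are treated as integers and written back out in a platform-specific byte order.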
@peterbourgon, in 2013 I tried to write a C program that converts a UUID from its binary representation to ASCII (and vice versa) based on RFC 4122, and I failed because of endianness problems. So the text is ambiguous. Unfortunately I don't remember the exact details, i.e. which parts of the standard caused problems. But you can still see my erratum and one editor's response to it in the RFC errata tracker (https://www.rfc-editor.org/errata/eid3546): he acknowledges that problems exist.
I can imagine that GUID can do some strange things. Everything is normal, except when you are Microsoft:
Bytes:
GUID in Data Inspector:
This is HxD, which only uses Microsoft's GUID encoding, not the normal UUID encoding, so you don't get the expected 00112233-4455-6677-8899-AABBCCDDEEFF. In this case, all the fields in the first half of the GUID are little endian and all the fields in the second half are big endian.
Wikipedia mentions it too:
The binary encoding of UUIDs varies between systems. Variant 1 UUIDs, nowadays the most common variant, are encoded in a big-endian format. For example, 00112233-4455-6677-8899-aabbccddeeff is encoded as the bytes 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff.[9][10]
Variant 2 UUIDs, historically used in Microsoft's COM/OLE libraries, use a mixed-endian format, whereby the first three components of the UUID are little-endian, and the last two are big-endian. For example, 00112233-4455-6677-c899-aabbccddeeff is encoded as the bytes 33 22 11 00 55 44 77 66 c8 99 aa bb cc dd ee ff.[11][12] See the section on Variants for details on why the '88' byte becomes 'c8' in Variant 2.
Note that this quote mentions variants in combination with encoding and COM/OLE. I think this text is partly written by an idiot.
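For anyone who wants to see the two byte layouts side by side, here is a small sketch with Python's standard `uuid` module. `UUID.bytes` gives the big-endian layout and `UUID.bytes_le` gives the mixed-endian layout with the first three fields byte-swapped. The `manual` reordering is only my own illustration of that swap, not something taken from the RFC text:

```python
import uuid

u = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")

# Big-endian (network order): bytes appear in the same order as the hex digits.
big_endian = u.bytes

# Mixed-endian: the 4-byte, 2-byte and 2-byte leading fields are reversed,
# the remaining 8 bytes are left untouched.
mixed_endian = u.bytes_le

# The same swap written out by hand on the big-endian bytes.
manual = big_endian[3::-1] + big_endian[5:3:-1] + big_endian[7:5:-1] + big_endian[8:]


def spaced_hex(data: bytes) -> str:
    """Format bytes as space-separated lower-case hex pairs."""
    return " ".join(f"{b:02x}" for b in data)


print(spaced_hex(big_endian))
print(spaced_hex(mixed_endian))

# Decoding the mixed-endian bytes with the matching constructor argument
# recovers the original UUID.
round_trip = uuid.UUID(bytes_le=mixed_endian)
```

Reading mixed-endian bytes with `uuid.UUID(bytes=...)` instead of `bytes_le=...` would silently yield a different UUID, which is the kind of mismatch the HxD screenshot shows.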
Wow! You learn something new, and frequently terrible, every day.
Topic moved; it is covered by the errata work in https://github.com/ietf-wg-uuidrev/rfc4122bis
Archiving this to clean up the issue tracker.