newid's Introduction

What is it

NewId is an embeddable unique ID generator that produces 128-bit (16-byte) sequential IDs. It is inspired by Snowflake and flake. Read on to learn more.

The Problem

Many applications use unique identifiers to identify a data record. Apps that use a relational database (RDBMS) commonly delegate the generation of these IDs to the database, by means of an Identity column (MS-SQL) or similar. This approach is fine for a small app, but quickly becomes a bottleneck at web scale. See this post from the blokes at Twitter: https://blog.twitter.com/2010/announcing-snowflake

Another use case is apps that use messaging to communicate among themselves, as is the case with a microservices-based architecture. These apps may require sequential unique IDs for messages.

An attempt at solutions

A trivial approach is to use GUIDs/UUIDs generated in applications. While that works, in most frameworks GUIDs are not sequential. This takes away the ability to sort records based on their unique ids.

The Solution

The Erlang library flake (https://github.com/boundary/flake) adopted an approach of generating 128-bit, k-ordered ids (that is, lexically time-ordered) using the machine's MAC address, a timestamp and a per-thread sequence number. These IDs are sequential and won't collide in a cluster of nodes running applications that use them as UUIDs.
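
To make the idea concrete, here is a minimal sketch of how such an id could be packed. It assumes a layout of a 64-bit millisecond timestamp, a 48-bit worker id (the MAC address) and a 16-bit sequence number, written big-endian so that byte-wise comparison follows time. The names `FlakeSketch`, `Compose` and `Compare` are hypothetical, and this is not NewId's actual byte layout.

```csharp
using System;

public static class FlakeSketch
{
    // Packs timestamp (8 bytes), worker id (6 bytes) and sequence (2 bytes)
    // into 16 big-endian bytes. The field widths are an assumption of this sketch.
    public static byte[] Compose(ulong timestampMs, ulong workerId48, ushort sequence)
    {
        var bytes = new byte[16];
        for (int i = 0; i < 8; i++)
            bytes[i] = (byte)(timestampMs >> (56 - 8 * i));
        for (int i = 0; i < 6; i++)
            bytes[8 + i] = (byte)(workerId48 >> (40 - 8 * i));
        bytes[14] = (byte)(sequence >> 8);
        bytes[15] = (byte)sequence;
        return bytes;
    }

    // Byte-wise (lexical) comparison: negative if x sorts before y.
    public static int Compare(byte[] x, byte[] y)
    {
        for (int i = 0; i < 16; i++)
        {
            int diff = x[i].CompareTo(y[i]);
            if (diff != 0) return diff;
        }
        return 0;
    }
}
```

With this layout, two ids from the same worker in the same millisecond are ordered by sequence number, and any id from a later millisecond sorts after both.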

Sample Code

NewId id = NewId.Next(); // produces an id like {11790000-cf25-b808-dc58-08d367322210}

// Supports operations similar to GUID; ToString returns a string, not a NewId
string text = NewId.Next().ToString("D").ToUpperInvariant();
// Produces a string like 11790000-CF25-B808-2365-08D36732603A

// Start from an id
NewId parsed = new NewId("11790000-cf25-b808-dc58-08d367322210");

// Start with a byte-array
var bytes = new byte[] { 16, 23, 54, 74, 21, 14, 75, 32, 44, 41, 31, 10, 11, 12, 86, 42 };
NewId theId = new NewId(bytes);

When NOT to use sequential IDs

(Adapted from the flake readme) The generated ids are predictable by design. They should not be used in scenarios where unpredictability is a desired feature. These IDs should NOT be used for:

  • Generating passwords
  • Security tokens
  • Anything else you wouldn't want someone to be able to guess.

NewId generated ids expose the identity of the machine which generated the id (by way of its MAC address) and the time at which it did so. This could be a problem for some security-sensitive applications.

Don't do modulo 2 arithmetic on flake ids with the expectation of random distribution.
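
The reason is that these ids are built from a timestamp, a machine identity and a sequence number rather than from random data, so their low-order bits are not uniformly distributed. If you need to spread ids across buckets, one possible approach (a sketch, not something NewId provides) is to hash the id bytes first. `Bucketing` and `BucketOf` are hypothetical names.

```csharp
using System;
using System.Security.Cryptography;

public static class Bucketing
{
    // Maps a 16-byte id to a bucket in [0, bucketCount) by hashing it first,
    // instead of taking a modulo of the raw id bits.
    public static int BucketOf(byte[] idBytes, int bucketCount)
    {
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(idBytes);
            uint value = BitConverter.ToUInt32(hash, 0);
            return (int)(value % (uint)bucketCount);
        }
    }
}
```

The same id always lands in the same bucket, while consecutive ids are scattered by the hash.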

newid's People

Contributors

highlyunavailable, maldworth, mishrsud, oidatiftla, phatboyg, timothymakkison, wallymathieu

newid's Issues

Sort guid

There is a time column, but when I order by id the latest time is not at the top. Why is the GUID not generated in time order?

Bug when parsing a SequentialGuid.

FromSequentialGuid incorrectly swaps the bytes of the sequence counter, which means that parsing a SequentialGuid creates incorrect NewIds. Tests won't detect this because the first NewId generated has the sequence number 0x0000.
This bug was introduced in #20, affecting FromSequentialGuid and ToNewIdFromSequential.

Expected: 9bfb0100-e3f8-6a00-06a1-08daed27ef21
But was:  9bfb0001-e3f8-6a00-06a1-08daed27ef21

Example failing test

// Round trip, to SequentialGuid and back with FromSequentialGuid
[Test]
public void Should_parse_sequential_guid_2_as_newid()
{
    NewId n = NewId.Next(2)[1];

    var nn = n.ToGuid();
    var g = n.ToSequentialGuid();

    var ng = NewId.FromSequentialGuid(g);

    Assert.AreEqual(n, ng);

    // Also checks to see if this would throw
    Assert.IsTrue(ng.Timestamp != default);
}

Fix

static void FromSequentialByteArray(in byte[] bytes, out Int32 a, out Int32 b, out Int32 c, out Int32 d)
{
    a = bytes[3] << 24 | bytes[2] << 16 | bytes[1] << 8 | bytes[0];
    b = bytes[5] << 24 | bytes[4] << 16 | bytes[7] << 8 | bytes[6];
    c = bytes[8] << 24 | bytes[9] << 16 | bytes[10] << 8 | bytes[11];
    d = bytes[12] << 24 | bytes[13] << 16 | bytes[14] << 8 | bytes[15];
}

Which type of sequential guid

Hey, I am just curious about which type of sequential guid you are using?

Because if you want to increase performance in MSSQL, the sequential part needs to be at the end, and for MySQL it needs to be at the beginning.
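
For context on "sequential at end": SQL Server orders uniqueidentifier values by giving the most weight to the last six bytes, whereas .NET's `Guid.CompareTo` starts from the first group. The sketch below contrasts the two using `System.Data.SqlTypes.SqlGuid`. The class and member names are hypothetical, and it says nothing about which layout NewId itself emits.

```csharp
using System;
using System.Data.SqlTypes;

public static class GuidOrdering
{
    // Non-zero only in the first group.
    public static readonly Guid HighFirst = new Guid("01000000-0000-0000-0000-000000000000");

    // Non-zero only in the very last byte.
    public static readonly Guid HighLast = new Guid("00000000-0000-0000-0000-000000000001");

    // .NET ordering: the first group is compared first.
    public static int CompareAsDotNet(Guid x, Guid y) => x.CompareTo(y);

    // SQL Server ordering: SqlGuid compares the last six bytes first.
    public static int CompareAsSqlServer(Guid x, Guid y) =>
        new SqlGuid(x).CompareTo(new SqlGuid(y));
}
```

`CompareAsDotNet(HighFirst, HighLast)` is positive while `CompareAsSqlServer(HighFirst, HighLast)` is negative, which is why the position of the time-ordered bytes matters for index locality.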

Unique Id across all microservices

Hi,
Thanks for this useful package.
Could we create a unique ID across all services? Should we configure each service instance separately to ensure uniqueness?

Add greater and less than operators to NewId

Hi,
I really like your library as it solves the problem of sequential id generation for me. However, I'd really like to have greater/less than operators available to NewId so I don't need to use CompareTo methods.
I've just created a PR ( #11 ) that adds it. Could you please consider merging it into your library?

Thanks!
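
As an illustration of the request, comparison operators are typically thin wrappers over `CompareTo`. The sketch below uses a hypothetical `SketchId` struct as a stand-in; it is not the code from the PR and not the library's NewId type.

```csharp
using System;

// Hypothetical stand-in for an id type that already implements CompareTo.
public readonly struct SketchId : IComparable<SketchId>
{
    readonly ulong _high;
    readonly ulong _low;

    public SketchId(ulong high, ulong low)
    {
        _high = high;
        _low = low;
    }

    // Orders by the high 64 bits first, then by the low 64 bits.
    public int CompareTo(SketchId other)
    {
        int result = _high.CompareTo(other._high);
        return result != 0 ? result : _low.CompareTo(other._low);
    }

    // Each operator delegates to CompareTo so the ordering stays in one place.
    public static bool operator <(SketchId left, SketchId right) => left.CompareTo(right) < 0;
    public static bool operator >(SketchId left, SketchId right) => left.CompareTo(right) > 0;
    public static bool operator <=(SketchId left, SketchId right) => left.CompareTo(right) <= 0;
    public static bool operator >=(SketchId left, SketchId right) => left.CompareTo(right) >= 0;
}
```

With these in place, `a < b` can be written instead of `a.CompareTo(b) < 0`.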

Ambiguous reference in projects that have transitive dependencies on both MassTransit.Abstractions and NewId

In projects that have dependencies on both MassTransit.Abstractions and NewId, the compiler has no way to tell the difference between usages because the class names and namespaces are identical.
@phatboyg have you considered making a breaking change and choosing a new namespace for this standalone library/package?

Edit: I'm aware that having both dependencies is a no-no if you can avoid it, but given that they could be transitive dependencies via NuGet, there are scenarios for which it cannot easily be avoided.

Collisions of Ids

Hi guys,

I know the Flake algo produces nice-looking sequential GUIDs. However, this implementation of the library has the following fatal flaws.

When generating 5M GUIDs using parallel tasks here are the issues:

  • It takes roughly 5 seconds to generate 5M GUIDs. This is rather slow considering other libraries can produce 5M GUIDs in less than 300ms

  • The issue is collisions. After generating 5M GUIDs, we detected that, on average, the same GUID was repeated between 70 and 300 times per run. This is a big concern. I suggest not using the Flake algo for GUIDs but rather just replacing the last 8 bytes of the GUID with a Unix Epoch Timestamp in milliseconds. Granted, the timestamp portion of the GUID can have collisions, but the remaining part of a natural GUID (minus the last 8 bytes) provides much better quasi-uniqueness than the Flake algo.

Hope this helps,

codematrix
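
The report above does not include its test code. A minimal sketch of how such a parallel duplicate check could be written follows; `CollisionCheck` and `CountDuplicates` are hypothetical names, and this is not the reporter's harness.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class CollisionCheck
{
    // Calls generate() count times in parallel and returns how many of the
    // produced values had already been seen.
    public static int CountDuplicates<T>(Func<T> generate, int count) where T : notnull
    {
        var seen = new ConcurrentDictionary<T, byte>();
        int duplicates = 0;

        Parallel.For(0, count, _ =>
        {
            // TryAdd returns false when the key is already present.
            if (!seen.TryAdd(generate(), 0))
                Interlocked.Increment(ref duplicates);
        });

        return duplicates;
    }
}
```

Assuming the NewId package is referenced, `CollisionCheck.CountDuplicates(() => NewId.Next(), 5_000_000)` would repeat the experiment described in the report.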
