crockford's Introduction

crockford

Base32 encoding for 64-bit values.

Crockford Base32 Encoding is most commonly used to make numeric identifiers slightly more user-resistant. Similar to Hashids, the purpose here is to make the identifiers shorter and less confusing. Unlike Hashids, Crockford Base32 does nothing to conceal the real value of the number (beyond the actual encoding, anyway) and the fact that they are sequential is still pretty obvious when you see consecutive identifiers side by side.

This library does not support encoding and decoding of arbitrary data; there is another library for that. Additionally, the spec supports the idea of check digits, but this library currently does not.

The primary purpose of this library is to provide high performance, user-resistant encoding of numeric identifiers. To that end, both encoding and decoding are, in fact, pretty darn fast. How fast? According to my testing, crockford decodes fifty times faster and encodes twenty-seven times faster than harsh.

Usage

Encoding

Encoding is a one-step process.

let x = crockford::encode(5111);
assert_eq!("4ZQ", &*x);
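For the curious, here is a minimal sketch of what numeric Crockford Base32 encoding boils down to. This is not the crate's actual implementation; `ALPHABET` and `encode_sketch` are hypothetical names made up for illustration. The idea is simply to peel off five bits at a time and map each group to a symbol from the 32-character alphabet (digits plus letters, minus I, L, O, and U).

```rust
/// Crockford Base32 symbols: 0-9, then A-Z without I, L, O, U.
const ALPHABET: &[u8; 32] = b"0123456789ABCDEFGHJKMNPQRSTVWXYZ";

/// Hypothetical helper for illustration only; not the crate's code.
fn encode_sketch(mut n: u64) -> String {
    if n == 0 {
        return "0".to_string();
    }
    // A u64 needs at most 13 base-32 digits (13 * 5 = 65 bits >= 64).
    let mut digits = [0u8; 13];
    let mut start = digits.len();
    while n > 0 {
        start -= 1;
        // The low five bits select the next symbol, least significant first.
        digits[start] = ALPHABET[(n & 0x1f) as usize];
        n >>= 5;
    }
    // Every byte comes from ALPHABET, so each one is plain ASCII.
    digits[start..].iter().map(|&b| b as char).collect()
}
```

With this sketch, 5111 splits into the digit values 4, 31, and 23, which map to "4", "Z", and "Q".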

If you want lowercase, then... Well, tough. However, we do now support encoding to a buffer of your choice rather than a new one created in the function. Read on to learn about plan B...

Plan B (faster encoding)

Because this is Rust, particular focus is given to runtime efficiency--or, at least, allowing the user to achieve runtime efficiency. As a result, we provide a second, more complicated encoding option.

// The longest possible representation of u64 is 13 digits.
let mut buf = Vec::with_capacity(13);
crockford::encode_into(5111, &mut buf);

let result = std::str::from_utf8(&buf)?;
assert_eq!("4ZQ", result);

The encode_into function also accepts &mut String, if you prefer.

Decoding

Decoding is a two-step process. This is because you can feed any string to the decoder, and the decoder will return an error if you try to convince it that "Hello, world!" is a number. (Hint: it isn't.)

let x = crockford::decode("4zq");
let y = crockford::decode("4ZQ");

assert_eq!(5111, x?);
assert_eq!(5111, y?);

So, step one is to call the decode function. Step two is to match/verify/unwrap/throw away the output.
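To make the "two-step" idea concrete, here is a rough sketch of a decoder that returns a Result. Everything in it (`SYMBOLS`, `DecodeError`, `digit_value`, `decode_sketch`) is a hypothetical stand-in, not the crate's real types or code. It assumes the behavior described by the Crockford spec: decoding is case-insensitive, O reads as 0, and I and L read as 1. Whether this crate handles every one of those cases the same way is an assumption.

```rust
const SYMBOLS: &str = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum DecodeError {
    Empty,
    InvalidDigit(char),
    Overflow,
}

/// Maps one character to its 5-bit value, if it has one.
fn digit_value(c: char) -> Option<u64> {
    match c.to_ascii_uppercase() {
        'O' => Some(0),
        'I' | 'L' => Some(1),
        // SYMBOLS is ASCII, so the byte index equals the digit value.
        upper => SYMBOLS.find(upper).map(|i| i as u64),
    }
}

/// Hypothetical helper for illustration only; not the crate's code.
fn decode_sketch(input: &str) -> Result<u64, DecodeError> {
    if input.is_empty() {
        return Err(DecodeError::Empty);
    }
    let mut n: u64 = 0;
    for c in input.chars() {
        let d = digit_value(c).ok_or(DecodeError::InvalidDigit(c))?;
        // If any of the top five bits are set, shifting would lose them.
        if n >> 59 != 0 {
            return Err(DecodeError::Overflow);
        }
        n = (n << 5) | d;
    }
    Ok(n)
}
```

Step two is then whatever you usually do with a Result: match on it, propagate it with ?, or unwrap it if you are feeling lucky.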

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

crockford's People

Contributors

archer884, nvzqz

crockford's Issues

Stop wasting people's time

Preliminary research suggests that removing spurious values from the decoding lookup array will improve performance by almost 50%.

`crockford::encoding::Write` implementation for String is unsafe

The implementation of crockford::encoding::Write for String assumes the passed u8 is an ASCII byte. However, this is not guaranteed, since the trait is public. Therefore, calling the trait method with a non-ASCII u8 breaks the String's UTF-8 invariant even though the caller never writes any unsafe code.

#[test]
fn unsafe_write() {
    let mut danger = String::new();
    danger.write(0xf0);
    // This panics on my system with the "does not support writing non-UTF-8 byte sequences"
    println!("{}", danger);
}

I suggest changing the implementation of the trait to perform an additional bitwise AND with 0x7f.
This does not incur much additional cost: running the benches does not show significant regressions.

impl Write for String {
    fn write(&mut self, u: u8) {
        // The highest bit gets cleared by the bitwise AND, so the byte is always ASCII.
        unsafe {
            self.as_mut_vec().push(u & 0x7f);
        }
    }
}
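An alternative that avoids the unsafe block altogether could look like the sketch below. `ByteSink` is a hypothetical stand-in for the trait discussed in this issue, since the real trait's exact definition is not shown here. Unlike the masking approach, this keeps the byte's value instead of silently changing it: bytes above 0x7f become the matching two-byte Latin-1 character. Whether that is cheaper or more expensive than the mask has not been measured.

```rust
/// Hypothetical stand-in for the trait from the issue.
trait ByteSink {
    fn write(&mut self, u: u8);
}

impl ByteSink for String {
    fn write(&mut self, u: u8) {
        // char::from(u8) maps the byte to U+0000..=U+00FF, so the String
        // stays valid UTF-8 for every possible input, with no unsafe code.
        self.push(char::from(u));
    }
}
```

For ASCII input the result is byte-for-byte identical to pushing onto the underlying Vec; for something like 0xf0 the String ends up holding "ð" rather than an invalid byte sequence.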

Incorrect encoding/decoding

This assertion fails:

let input = "ZJ75K085CMJ1A";
assert_eq!(input, crockford::encode(crockford::decode(input).unwrap()));
assertion `left == right` failed
  left: "ZJ75K085CMJ1A"
 right: "FJ75K085CMJ1A"

Left:  ZJ75K085CMJ1A
Right: FJ75K085CMJ1A

In this case, it appears the first byte encodes/decodes incorrectly.
ZJ75K085CMJ1A -> 252 142 89 129 5 101 36 21
FJ75K085CMJ1A -> 124 142 89 129 5 101 36 21

Both these values also decode to the same u64:

let correct = crockford::decode("zj75k085cmj1a").unwrap();
let incorrect = crockford::decode("fj75k085cmj1a").unwrap();
assert_ne!(correct, incorrect);
assertion `left != right` failed
  left: 17950419036144289834
 right: 17950419036144289834

Decoding ZJ75K085CMJ1A with decode.fr results in a different integer altogether - 18198581554926920725. This can be verified as correct (using separate websites/services) because re-encoding this (as hexadecimal) using cryptii returns the original encoding - ZJ75K085CMJ1A.
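One possible explanation, offered as a guess rather than a confirmed diagnosis: thirteen Base32 digits carry 65 bits, so when the string is read as a number, the leading digit can be at most F (01111) for the value to fit in a u64. Z is 11111, which sets bit 64; dropping that bit leaves exactly F. The byte listing above (252 vs. 124 in the first byte) looks like the other interpretation, where the string is treated as left-aligned arbitrary data instead of a number. The sketch below uses a hypothetical helper, `decode_wide`, that decodes into a u128 to show the numeric overflow; it is not part of the crate.

```rust
const DIGITS: &str = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

/// Hypothetical helper: decodes into u128 so that a 65-bit value is not
/// truncated. Only meant for inputs of up to 25 digits (125 bits).
fn decode_wide(input: &str) -> Option<u128> {
    input.chars().try_fold(0u128, |acc, c| {
        // DIGITS is ASCII, so the byte index equals the digit value.
        let d = DIGITS.find(c.to_ascii_uppercase())? as u128;
        Some((acc << 5) | d)
    })
}

/// Returns true if the numeric value of `input` fits in 64 bits.
fn fits_in_u64(input: &str) -> Option<bool> {
    decode_wide(input).map(|v| v <= u64::MAX as u128)
}
```

Under this reading, the two strings differ by exactly 2^64, which would explain why a decoder that silently wraps returns the same u64 for both. Returning an overflow error for the Z-prefixed input would be one way to surface the problem.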
