Giter Site home page Giter Site logo

Comments (18)

newpavlov avatar newpavlov commented on August 15, 2024 1

I've updated crates to a new version of digest crate, which now includes traits Input, VariableOutput, FixedOutput and convenience wrapper Digest = Input + FixedOutput. You can see results in the digest-rework branch.

It works mostly fine, but does not fit for Groestl, which requires output size to be passed at hasher initialization. I think I'll just create new(n: usize) method for it and will implement VariableOutput which will accept only slices with length n.

from hashes.

newpavlov avatar newpavlov commented on August 15, 2024

You are right to bring this issue. I've known about it and planned to fix it later, but I am not completely sure what approach to use.

One way is to create three traits, which I imagined would look somewhat like this:

pub trait Digest {
    type BlockSize: ArrayLength<u8>;
    fn input(&mut self, input: &[u8]);
}

pub trait FixedOutput : Digest {
    type OutputSize: ArrayLength<u8>;
    fn result(self) -> GenericArray<u8, Self::OutputSize>;
}

pub trait VariableOutput : Digest {
    fn result(self, &mut [u8]);
}

Here length of the passed slice will be length of the output. Disadvantage here is need to import two traits instead of one for all users.

Alternative approach would be to create VariableDigest and FixedDigest traits:

pub trait FixedDigest : Digest {
    type OutputSize: ArrayLength<u8>;

    fn input(&mut self, input: &[u8]) {
       Digest::input(self, input);
    }

    fn result(self) -> GenericArray<u8, Self::OutputSize>;
}

pub trait VariableDigest : Digest {
    fn input(&mut self, input: &[u8]) {
       Digest::input(self, input);
    }

    fn result(self, &mut [u8]);
}

It would be a bit more ergonomic to use for most of the cases, but solution is a less elegant. Also it can cause problems and be unergonomic in case if VariableDigest and FixedDigest will be needed at the same time. It could be solved by renaming methods to variable_input, variable_result, but I like this solution even less.

Currently I am leaning towards the second solution without method renaming.

I wanted to experiment with it after fixing #5 based on blake2-rfc. But of course pull requests are also welcomed. ;)

from hashes.

burdges avatar burdges commented on August 15, 2024

Is there some restriction of the generic_array crate causing this? I forget if you can write Self::Output == GenericArray<u8,N> in a where clause. If not, you could make a trait reflect GenericArray somehow :

pub trait Digest {
    type Output;
    type BlockSize: ArrayLength<u8>;

    /// Digest input data. This method can be called repeatedly
    /// for use with streaming messages.
    fn input(&mut self, input: &[u8]);

    /// Retrieve the digest result. This method consumes digest instance.
    fn result(self) -> Self::Output;

    /// Get the block size in bytes.
    fn block_bytes(&self) -> usize { Self::BlockSize::to_usize() }

    /// Get the block size in bits.
    fn block_bits(&self) -> usize { 8 * Self::BlockSize::to_usize() }

    /// Get the output size in bytes.
    default fn output_bytes(&self) -> usize;

    /// Get the output size in bits.
    default fn output_bits(&self) -> usize { 8 * self.output_bytes() }
}

pub trait IsAGenericArray { 
    type Element;
    type Length;
}
impl<T,N> IsAGenericArray for GenericArray<T,N> {
    type Element = T;
    type Length = N;
}

impl<T> Digest for T where T: Digest, Self::Output: IsAGenericArray {
    default fn output_bytes(&self) -> usize {
        Self::Output::Length::to_usize() * ::std::mem::size_of::<Self::Output::Element>() 
    }
}

Also, these four methods for the block or output size in bytes and bits only return a runtime representation anyways, so maybe you want them to be type level numbers in a separate DigetFixed trait and drop them from the Digest trait entirely. At present, rust-crypto uses them as runtime parameters in hmac, hkdf, and pbkdf2.rs, and hmac is itself used by hkdf. I could do the same in lioness probably, but would anyone make say an hmac construction that needs runtime output size though?

from hashes.

newpavlov avatar newpavlov commented on August 15, 2024

AFAIK when using generic_array (and typenum in general) all related expressions must be evaluated at compile time, so size of output (GenericArray<u8, Self::OutputSize>) can not be changed at runtime. With generic-array compiler creates monomorphized functions for each used size in case of generic parameter, as usual with types. So when using current Digest you get stack allocated array, size of which must be known at compile time. (at least until we'll get alloca support) Thus I don't think your code can be compiled. (if I understood it correctly)

As for dropping type level numbers from Digest, in my opinion the default mode for hashes is a fixed one, variable length functions more an exception than rule. (especially variable at runtime)

Regarding methods block_* and output_* they are excessive and I thought about removing them altogether. But BlockSize used later in HMAC for creating internal GenericArray buffer, this is why it must be in the form of type parameter. (I think the same will be done for hkdf and pbkdf2 when I get to them) As for runtime variable output size for HMAC, I honestly don't know if it will be usefull for someone, but if we can easily do it generically, why impose unnecessary restrictions?

from hashes.

burdges avatar burdges commented on August 15, 2024

There was a type-o in my comment, oops. I'd meant that the Output type should not be restricted in the trait.

I'd also ignored that result() -> Output though so you cannot take Output to be [u8] anyways. You could make it work with result(&mut Output) like in rust-crypto, hence your multi-trait ideas.

Another approach might be :

pub trait Digest {
    type Output;
    type BlockSize: ArrayLength<u8>;
    fn input(&mut self, input: &[u8]);
    fn finalize(self) -> Output;
}

pub trait DigestOutput {
    fn result(&mut self, &mut [u8]);
}

Any conventional mode would simply consume the hasher with finalize and return the fixed length Output. Any mode offering variable length output would transform itself like

pub struct Shake<..> { _sha3: Sha3<..> }
pub struct ShakeOutput<..>{ _sha3: Sha3<..> }

impl Digest for Shake<..> {
    type Output = ShakeOutput<..>;
    ...
    fn finalize(self) { ShakeOutput { _sha3: self._sha3 } }
}

In this, there is no allowance for modes that intermix input and output operations however, not sure if that's allowed by any common modes. It's possible one should always drop down to the underlying permutation function for that sort of thing.

from hashes.

newpavlov avatar newpavlov commented on August 15, 2024

With DigestOutput you force user to copy data from result, instead of immediately using stack allocated array as result in a fixed case scenario. So you'll have to first create an array, get result, copy data from result to this preallocated array, that gives three operations instead of one, and this is not mentioning potential runtime errors which before that were enforced at compile time by the type system. Also it adds a bit of boilerplate to each crate. In my opinion this approach introduces unnecessary complexities and has important disadvantages.

I will try to play around this problem a little more. Also it possible that someone else will come up with a good solution to it later.

from hashes.

burdges avatar burdges commented on August 15, 2024

Just to be clear, any normal mode would only satisfy Digest<Ouput=[u8; n]>, while only variable length modes would offer an output that satisfies DigestOutput. Anyways it's true there are various RFCs being iterated upon that'd simplify all this.

from hashes.

burdges avatar burdges commented on August 15, 2024

Are there any hash functions with modes that allow intermixed input after output?

from hashes.

newpavlov avatar newpavlov commented on August 15, 2024

Are there any hash functions with modes that allow intermixed input after output?

If I understood your question correctly, this use-case is not covered explicitly and as far as I remember all hashes in this repository have some kind of finalization, but you can take advantage of Copy/Clone trait by copying state before calling result().

from hashes.

burdges avatar burdges commented on August 15, 2024

Right. I realized OS PRNGs do this without cloning, but they pull randomness from various outside sources as they cannot know how much they'll be expected to produce at any given time. You could avoid that with a better permutation, but afaik cryptographers do not do so currently, if only because their constructions mirror the underlying mathematics. Anyways if anybody wants to do this in the future then they can make a special API. No need to worry about it here.

from hashes.

burdges avatar burdges commented on August 15, 2024

Just fyi, a degree of support for defining arrays whose length is given by an associated constant just landed rust-lang/rust#40008 It currently depends upon not using parameters rust-lang/rfcs#1915 (comment) but that's probably fine.

from hashes.

newpavlov avatar newpavlov commented on August 15, 2024

But in this PR it's clearly written that:

don't get too excited, array lengths still can't depend on type parameters

Or am I missing something?

Also note that for example in case of groestl we use conditions on constants, so unfortunately until something like this will land we'll have to use typenum.

from hashes.

burdges avatar burdges commented on August 15, 2024

I've no idea how long before array lengths can be controlled by type parameters, but it's seemingly on the way, while const dependent types sound distant future if ever.

from hashes.

newpavlov avatar newpavlov commented on August 15, 2024

I've merged digest-rework into master and published new versions of the crates.

from hashes.

burdges avatar burdges commented on August 15, 2024

As an aside, I'm nervous about HMAC using a key the same size as BlockSize. It's clear that HMAC must know how much input older hashes need to fully replace their internal state, but what that means depends upon the hash, and may not have much to do with the actual BlockSize.

A prioir, I think BlockSize should be removed from the digest traits and a new trait HMACable should specify how much input the hash needs, maybe by just implementing the HMAC code directly. These newer hash functions like SHA3 should never need HMAC operation, so anyone who really wants it can do wrappers themselves. At the extreme, you'd never feel like Argon2 needed a gigabytes of input. ;)

from hashes.

newpavlov avatar newpavlov commented on August 15, 2024

Hmac implementation follows the standard definition of the HMAC (e.g. RFC 2104) introduced for digest functions which operate on the blocks of data, and size of the block is used to define algorithm. So in my opinion generic implementation for digest function which have fixed block size and output size is a right approach.

Though you are right that it's not always sensible to use HMAC construct (e.g. for BLAKE2), in those cases MAC trait will be implemented directly on the digest struct. It will be done for BLAKE2 and probably for SHA-3 (using simple prepending of the key). I should not forget to add to documentation a discouragement of using HMAC in case if digest already implements MAC. But I don't think we need to artificially restrict ability to use HMAC on such functions.

A more philosophical note: as I see it, users of "low-level" RustCrypto crates should have a bare minimum understanding of that they are doing. It's a bit risky, but it's an intentional decision to open a path for a more flexible ecosystem. As a result, for example, it allows inclusion of compromised or experimental algorithms.

from hashes.

burdges avatar burdges commented on August 15, 2024

Yes, it makes sense to implement RFC 2104 where you have older hash functions, but one could still place BlockSize elsewhere besides in the Digest traits. Anyone who wants an HMAC-Blake2b can always make a wrapper and pick their own BlockSize.

I'm not suggesting this influences usability, safety, etc. It's just pure aesthetics for hash functions without any sensible BlockSize. :)

from hashes.

newpavlov avatar newpavlov commented on August 15, 2024

Lets move this discussion into #20.

from hashes.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.