
Comments (4)

yael333 commented on June 11, 2024

Thank you so much for the quick and thorough response~
Detecting and working with polyglot files is still quite arcane and esoteric, which is why I started this Rust project~

These files usually have overlapping sections because they are not regular archive files: if you take the same slice of the file and check it for signatures, it will pass for multiple formats. The definition of these files is also vague, and usually depends on validation by an external parser or program (for example, most PDF polyglots don't follow the official standard but still open fine in most PDF readers).
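To make the "passes for multiple formats" point concrete, here is a minimal sketch that searches a buffer for a handful of well-known magic numbers at every offset. It uses only the standard library, not the file-format crate; the `SIGNATURES` table and the `scan_signatures` function are hypothetical names made up for this illustration, and a real detector would need far more signatures plus actual parsing to rule out false positives.

```rust
/// A tiny, illustrative subset of magic numbers (not the crate's table).
const SIGNATURES: &[(&str, &[u8])] = &[
    ("PDF", b"%PDF-"),
    ("ZIP", b"PK\x03\x04"),
    ("PNG", b"\x89PNG\r\n\x1a\n"),
    ("GIF", b"GIF8"),
];

/// Returns every (offset, format name) pair where a known signature
/// occurs in `data`, ordered by offset.
fn scan_signatures(data: &[u8]) -> Vec<(usize, &'static str)> {
    let mut hits = Vec::new();
    for offset in 0..data.len() {
        for &(name, magic) in SIGNATURES {
            // Compare the bytes starting at `offset` against the signature.
            if data[offset..].starts_with(magic) {
                hits.push((offset, name));
            }
        }
    }
    hits
}
```

Running this over a buffer that starts with a PDF header and later contains a ZIP local file header would report both formats, each with its own offset. A signature hit only says the bytes look like the start of a format, so whether each part is actually valid still depends on a parser for that format.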

Whether you'd wish to support parsing for these files depends on the scope of your program, but if needed I can contribute as well. I'll post an update on how integrating this awesome module into my project goes <3

from file-format.

mmalecot commented on June 11, 2024

Hi,

Thanks for the compliment. I hope you'll manage to understand how these macros work; I'll add more comments to the code in the next version to make everything as clear as possible :).

I didn't know about polyglot files, that's very interesting, thanks for sharing!

Currently, FileFormat::from_file returns only one file format; for a polyglot file, that is the first one.

On the other hand, with FileFormat::from_reader or FileFormat::from_bytes, it should be possible to identify all the formats contained in a polyglot file, if we can determine the beginning of each of them.

Thanks for asking!

mmalecot commented on June 11, 2024

In fact, it might be necessary to change FileFormat::from_file so that it does not return a specific format for such files (perhaps returning an error, or the generic FileFormat::ArbitraryBinaryData format). Otherwise, the crate could be fooled.

If you think it's possible and useful, we could also imagine a polyglot feature that enables a FileFormat::from_polyglot_file method, which would return several file formats.
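As a rough sketch of what such a split could look like, the snippet below separates "list every format found" from "decide whether the input is ambiguous". Everything here is hypothetical and standalone: `Detection`, `detect_all`, `classify`, and the small `SIGNATURES` table are invented for illustration and are not part of the file-format API. It also assumes that a signature appearing anywhere in the data counts as a match, which is a deliberate simplification.

```rust
/// Illustrative subset of magic numbers; not the crate's signature table.
const SIGNATURES: &[(&str, &[u8])] = &[
    ("PDF", b"%PDF-"),
    ("ZIP", b"PK\x03\x04"),
    ("GIF", b"GIF8"),
];

/// Hypothetical result type for a polyglot-aware detector.
#[derive(Debug, PartialEq)]
enum Detection {
    /// No known signature was found.
    Unknown,
    /// Exactly one format matched.
    Single(&'static str),
    /// Several formats matched, so a single answer would be misleading.
    Polyglot(Vec<&'static str>),
}

/// Lists each format whose signature appears anywhere in `data`,
/// in the order of the signature table.
fn detect_all(data: &[u8]) -> Vec<&'static str> {
    SIGNATURES
        .iter()
        .filter(|(_, magic)| data.windows(magic.len()).any(|w| w == *magic))
        .map(|(name, _)| *name)
        .collect()
}

/// Collapses the list into a single verdict instead of silently
/// returning the first match.
fn classify(data: &[u8]) -> Detection {
    let mut formats = detect_all(data);
    match formats.len() {
        0 => Detection::Unknown,
        1 => Detection::Single(formats.remove(0)),
        _ => Detection::Polyglot(formats),
    }
}
```

With this shape, a strict single-format entry point could map `Polyglot` to an error or to a generic binary format, while an opt-in method could expose the full list. Matching signatures anywhere will over-report (an uncompressed PDF stored inside a ZIP would look like a polyglot), so a real implementation would need stricter rules.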

In any case, I don't think it's easy to delimit sub-files.

If you have any ideas, I'd love to hear them!

mmalecot commented on June 11, 2024

Yes, please keep me posted!
Feel free to open a PR, I'll follow your project!

