Comments (3)
Hi,
Thanks for your feedback!
It's quite complex to differentiate between text and binary because there is no infallible solution.
The approach used by file-format is fairly restrictive: content is considered text only if it contains no control characters (whitespace excepted). I could exclude other control characters as requested, but that would only solve half the problem.
Perhaps we should go for a simpler, but also more permissive, heuristic. A solution used by many programs is to simply check for NULL bytes in the stream. With such an approach, it should be OK for your files. I'm going to analyse how other popular tools such as grep, strings and diff work to decide.
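To make the trade-off concrete, here is a minimal sketch of the two heuristics side by side. The function names are hypothetical and this is not file-format's actual code; it works on raw bytes and assumes that "whitespace" means what `u8::is_ascii_whitespace` accepts (space, tab, line feed, form feed, carriage return).

```rust
/// Permissive heuristic: treat the data as binary if it contains a NUL byte.
/// (Hypothetical helper, not part of file-format.)
fn looks_binary_nul(data: &[u8]) -> bool {
    data.contains(&0)
}

/// Restrictive heuristic: treat the data as text only if it contains no
/// ASCII control characters other than ASCII whitespace.
/// Bytes >= 0x80 are let through; this sketch does not validate UTF-8.
/// (Hypothetical helper, not part of file-format.)
fn looks_text_strict(data: &[u8]) -> bool {
    data.iter()
        .all(|b| !b.is_ascii_control() || b.is_ascii_whitespace())
}
```

Under these assumptions, a file containing an ESC byte but no NUL byte is rejected by the strict check and accepted by the NUL-byte check, which is the difference discussed above.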
A small remark on your match code: I see the replacement character rather than BEL and ESC. Maybe you should use '\x07' and '\x1b' in your code instead.
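As an illustration of the escapes mentioned above, a match on those two bytes could look roughly like this. The function and its return values are hypothetical, since the original match code is not shown here; it assumes the match operates on `u8` values, and with `char` values the literals would be `'\x07'` and `'\x1b'`.

```rust
/// Hypothetical helper: names the two control bytes discussed above,
/// written as escapes instead of literal control characters in the source.
fn control_name(byte: u8) -> Option<&'static str> {
    match byte {
        b'\x07' => Some("BEL"), // bell
        b'\x1b' => Some("ESC"), // escape
        _ => None,
    }
}
```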
from file-format.
You're right that this solution is more of a hack; changing my code would be the better option in this instance.
Using the ASCII characters was a habit I picked up from writing portable shell scripts: some older or more limited shells (ash, used by busybox, or dash, for example) don't interpret \e, \x1b, etc.
This isn't an issue with Rust and most other languages, and shell scripts are identified by the shebang anyway.
Thanks for the response!
from file-format.
For the moment I prefer to keep the current heuristic because it seems to be more relevant and reliable.
To be seen later.
Thanks for your time!
from file-format.