Comments (20)
just don't use ruzzian.
from the_silver_searcher.
It's about percentage. If something like 10% of the bytes encountered are "suspicious," then it's flagged as binary.
I believe Perl's -B operates the same way, but with a threshold of around 30% instead.
from the_silver_searcher.
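The percentage heuristic described in the comment above can be sketched roughly as follows. This is only an illustration, not ag's actual is_binary() code. The function names, the choice of which bytes count as "suspicious", and the NUL-byte shortcut are all assumptions made for the example. The 10% and 30% thresholds are the approximate figures quoted in the comment.

```python
def suspicious_ratio(sample: bytes) -> float:
    """Fraction of bytes that are non-ASCII or unusual control characters.

    Hypothetical definition of "suspicious": any byte >= 0x80, or any
    control byte below 0x20 other than tab, LF and CR.
    """
    if not sample:
        return 0.0
    allowed_controls = {0x09, 0x0A, 0x0D}  # tab, line feed, carriage return
    suspicious = sum(
        1 for b in sample
        if b >= 0x80 or (b < 0x20 and b not in allowed_controls)
    )
    return suspicious / len(sample)


def looks_binary(sample: bytes, threshold: float = 0.10) -> bool:
    """Flag the sample as binary when too many bytes are suspicious."""
    # Assumption: a NUL byte is treated as an immediate sign of binary data.
    if b"\x00" in sample:
        return True
    return suspicious_ratio(sample) > threshold


# Every Cyrillic letter is two bytes >= 0x80 in UTF-8, so Russian-only
# text lands far above either threshold.
russian = "тест\n".encode("utf-8")
# One Cyrillic letter among many Latin ones stays well under the threshold.
mostly_latin = ("test\n" * 20 + "т\n").encode("utf-8")

print(looks_binary(russian))
print(looks_binary(mostly_latin))
```

Under these assumptions, a purely byte-counting check cannot tell UTF-8 Cyrillic text from binary data, which is consistent with the behavior reported in this thread.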
I'd like to search in Russian-only files too. Isn't that a good idea for ag?
from the_silver_searcher.
I'm not saying it's not a good idea--I'm not even a maintainer for the project!
Perhaps I am mistaken, though. ack is able to find the text, using both test and тест. I always thought it stopped at a percentage threshold.
In any event, ag's problem is still about a percentage threshold. The following file, for example, does produce matches:
test
т
test
test
testtest
test
testtest
test
testtest
test
testtest
test
testtest
test
testtest
test
testtest
test
testtest
test
testtest
test
testtest
test
testtest
test
testtest
test
test
(In other words, one Russian character amongst other "Latin" ones.)
from the_silver_searcher.
I suggest checking whether a file is binary by looking at its filename extension. As I understand it, that should be even faster than the current approach.
from the_silver_searcher.
Eh, that's not really efficient. What about filenames without extensions, like ag, find, or Makefile? Which of these are binary?
from the_silver_searcher.
Yep, checking the filename extension is not a good idea. What about the Linux file program? It does a good job of determining file types. I think ag could use it or borrow some logic from it.
from the_silver_searcher.
I tweaked is_binary() to be a little more forgiving. Try it out now. If my change doesn't do the trick, there are other ideas we can try. For example, if the file is UTF-8 encoded, there shouldn't be very many bytes above 0b10111111.
from the_silver_searcher.
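One way to act on the UTF-8 idea from the comment above is to check whether the sampled bytes decode as valid UTF-8 at all. Note that this is a different technique from counting bytes above 0b10111111, and it is not taken from ag's source. The function name is_valid_utf8 and the NUL-byte rule are assumptions made for this sketch.

```python
import codecs


def is_valid_utf8(sample: bytes) -> bool:
    """Return True if the sample decodes as UTF-8 and contains no NUL byte."""
    # Assumption: NUL is valid UTF-8 but rarely appears in text files,
    # so it is treated here as a sign of binary data.
    if b"\x00" in sample:
        return False
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    try:
        # final=False tolerates a multi-byte character that was cut off
        # at the end of the sampled buffer.
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8("Привет, мир!".encode("utf-8")))
print(is_valid_utf8("тест".encode("cp1251")))
```

A check like this would accept Russian-only UTF-8 files regardless of how many high bytes they contain, but it would still reject text in legacy encodings such as Windows-1251, so it could only complement a percentage threshold, not replace it.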
A version compiled from git master treats my LaTeX files (in Russian) as binary :(
% ag usepackage Руководство_оператора.tex -a
Binary file Руководство_оператора.tex matches.
from the_silver_searcher.
@krigstask can you give me some files that fail?
from the_silver_searcher.
Here you are: https://gist.github.com/4029832
It's a sample rST file; I can't publish those LaTeX files, sorry.
from the_silver_searcher.
A sample is fine. Thanks.
from the_silver_searcher.
Try master now.
from the_silver_searcher.
Great, seems to work now!
from the_silver_searcher.
@ggreer Could you please update the Homebrew formula?
from the_silver_searcher.
Oh yeah I should tag a release at some point in the near future.
from the_silver_searcher.
New release tagged. The homebrew pull request is here: Homebrew/legacy-homebrew#15911
from the_silver_searcher.
Megacool. Thanks. 🍰
from the_silver_searcher.
Windows 7, ag 2.2.0 (downloaded here)
File hello.txt inside folder Test, it contains 'Привет, мир!'
ag мир
-> nothing 👎
rg мир
-> hello.txt 1:Привет, мир! 👍
from the_silver_searcher.