bellecp / fast-p

License: MIT License

Go 100.00%

fast-p's Introduction

fast-p

Quickly find and open a pdf among a collection of thousands of unsorted pdfs through fzf (fuzzy finder)

Installation on Linux
Installation on OSX
Usage
How to clear the cache?
Launch with keyboard shortcut in Ubuntu
See it in action
Is the historical bash code still available?

Installation on Unix or Linux based systems

Requirements. Make sure the following requirements are satisfied:
- install pdftotext. It comes with the texlive distribution on Linux; on Ubuntu, run sudo apt-get install poppler-utils.
- install fzf: https://github.com/junegunn/fzf
- install GNU grep, ag (silver searcher).
Install binary. Do either one of the two steps below:
- Compile from source with go install. With a working golang installation, run go install github.com/bellecp/fast-p@latest. It will fetch the code and its dependencies, compile them, and create an executable fast-p in the bin folder of your go installation, typically ~/go/bin. Make sure the command fast-p can be found (for instance, add ~/go/bin to your $PATH).
- Or: Use the precompiled binary for your architecture. Download the binary that corresponds to your architecture at https://github.com/bellecp/fast-p/releases and make sure that the command fast-p can be found. For instance, put the binary file fast-p in ~/custom/bin and add export PATH=~/custom/bin:$PATH to your .bashrc.
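Before moving on, it can help to confirm that every requirement is actually on your $PATH. The helper below is a minimal sketch and not part of fast-p: the function name check_fast_p_deps is hypothetical, and it relies only on the standard command -v builtin.

```shell
# Hypothetical helper (not shipped with fast-p): print one line per command
# that cannot be found on $PATH, and return non-zero if any is missing.
check_fast_p_deps () {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" > /dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_fast_p_deps pdftotext fzf ag grep fast-p prints nothing and returns 0 when all five commands are found.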
Tweak your .bashrc. Add the following code to your .bashrc

p () {
    open=xdg-open   # this will open the PDF file with the default PDF viewer on KDE, xfce, LXDE and perhaps on other desktops.

    ag -U -g ".pdf$" \
    | fast-p \
    | fzf --read0 --reverse -e -d $'\t'  \
        --preview-window down:80% --preview '
            v=$(echo {q} | tr " " "|"); 
            echo -e {1}"\n"{2} | grep -E "^|$v" -i --color=always;
        ' \
    | cut -z -f 1 -d $'\t' | tr -d '\n' | xargs -r --null $open > /dev/null 2> /dev/null
}

You may replace ag -U -g ".pdf$" with another command that returns a list of pdf files.
You may replace open=... by your favorite PDF viewer, for instance open=evince or open=okular.
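As an illustration of the first point, here is one possible replacement for ag that uses find. This is a sketch under assumptions: the function name list_pdfs is hypothetical, and -iname (case-insensitive match) is supported by GNU and BSD find but is not required by POSIX. It also assumes that fast-p accepts any list of paths with one path per line, as it does with the ag output in the function above.

```shell
# Hypothetical stand-in for `ag -U -g ".pdf$"`: print every PDF file found
# under the given directory (default: the current directory), one per line.
list_pdfs () {
    find "${1:-.}" -type f -iname '*.pdf'
}
```

In the p function, the line ag -U -g ".pdf$" would then become list_pdfs. Note that find prefixes each path with the starting directory (for instance ./sub/a.pdf), whereas ag prints paths without the leading ./ .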

Installation on OSX with homebrew

Install homebrew and run

brew install bellecp/fast-p/fast-pdf-finder

The above brew formula is experimental. Please report any issues/suggestions/feedback at #11

Tweak your .bashrc. Add the following code to your .bashrc

p () {
    local open
    open=open   # on OSX, "open" opens a pdf in preview
    ag -U -g ".pdf$" \
    | fast-p \
    | fzf --read0 --reverse -e -d $'\t'  \
        --preview-window down:80% --preview '
            v=$(echo {q} | gtr " " "|"); 
            echo -e {1}"\n"{2} | ggrep -E "^|$v" -i --color=always;
        ' \
    | gcut -z -f 1 -d $'\t' | gtr -d '\n' | gxargs -r --null $open > /dev/null 2> /dev/null
}

You may replace ag -U -g ".pdf$" with another command that returns a list of pdf files.
You may replace open=... by your favorite PDF viewer, for instance open=evince or open=okular.

Remark: On OSX, we use the command line tools gcut, gxargs, ggrep, gtr which are the GNU versions of the tools cut, xargs, grep, tr. This way, we avoid the specifics of the versions of these tools pre-installed on OSX, and the same .bashrc code can be used for both OSX and GNU Linux systems.
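One way to keep a single function for both systems is to choose the tool name at run time. The sketch below is not from the fast-p repository: pick_gnu is a hypothetical helper that prefers the g-prefixed GNU tool when it is installed and otherwise falls back to the plain name.

```shell
# Hypothetical helper: print "gNAME" if that command exists on $PATH,
# otherwise print "NAME" unchanged.
pick_gnu () {
    if command -v "g$1" > /dev/null 2>&1; then
        echo "g$1"
    else
        echo "$1"
    fi
}

# Resolve the names once, then call "$cut_cmd", "$tr_cmd", ... in the pipeline.
cut_cmd=$(pick_gnu cut)
tr_cmd=$(pick_gnu tr)
```

With Homebrew, gcut and gtr are provided by the coreutils formula, gxargs by findutils, and ggrep by grep.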

Usage

Use the command p to browse among the PDF files in the current directory and its subdirectories.

The first run of the command will take some time to cache the text extracted from each pdf. Further runs of the command will be much faster since the text extraction will only apply to new pdfs.
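The caching idea can be sketched in a few lines of shell. This is only an illustration of the principle, not fast-p's actual implementation (fast-p is written in Go and keeps its cache in a single .db file). The names cached_extract, CACHE_DIR and EXTRACT are hypothetical, and cksum is used as a simple content key where a real tool would likely prefer a stronger hash.

```shell
# Hypothetical knobs for this sketch: where entries live and which extractor runs.
CACHE_DIR="${CACHE_DIR:-${TMPDIR:-/tmp}/pdf-text-cache-demo}"
EXTRACT="${EXTRACT:-pdftotext}"

# cached_extract FILE: print the text of FILE, running the extractor only
# when no cache entry exists yet for this file content.
cached_extract () {
    local file="$1" key entry
    mkdir -p "$CACHE_DIR"
    # Key the entry on the file content, so a renamed PDF still hits the cache.
    key=$(cksum < "$file" | cut -d ' ' -f 1)
    entry="$CACHE_DIR/$key.txt"
    if [ ! -f "$entry" ]; then
        # "-" as the output name makes pdftotext write to standard output.
        "$EXTRACT" "$file" - > "$entry" || { rm -f "$entry"; return 1; }
    fi
    cat "$entry"
}
```

The first call for a given file pays the extraction cost; later calls only read the stored text, which is why only new PDFs slow down a run.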

How to clear the cache?

To clear the cache (which contains text extracted from PDF), you can run 'fast-p --clear-cache'. This will safely remove the file located at: ~/.cache/fast-p-pdftotext-output/fast-p_cached_pdftotext_output.db

For older versions, please manually delete the cache file found at ~/.cache/fast-p_cached_pdftotext_output.db

Launch with keyboard shortcut in Ubuntu

On Ubuntu desktop (tested in 18.04), one may add a keyboard shortcut to launch a new terminal running the p command right away. With the following script, the new terminal window will automatically close after choosing a PDF.

Create a file ~/.fast-p-rc with

source .bashrc
p;
sleep 0.15; exit;

and in Ubuntu Settings/Keyboard, add a custom shortcut that runs the command gnome-terminal -- sh -c "bash --rcfile .fast-p-rc".

See it in action

Is the historical bash code still available?

Yes, see https://github.com/bellecp/fast-p/blob/master/p but using the go binary as explained above is recommended for speed and interoperability.

fast-p's Issues

No such file or directory error

When I tried running the p command I got the error like this:

[qoqosz:~/Downloads]$ p   
Loading _of_Combinatorial_Topology.pdf...  
Loading son,_Ronald_L.pdf...  
Loading toip-depo.pdf...  .pdf             
p:48: no such file or directory: /Users/qoqosz/.cache/pdftotext/00ce81e5cab6f38ftebooks-tmp/Foundations_of_Combinatorial_Topology.pdf

I'm on a Mac and have installed all the requirements. Not sure what steps should I list to make this issue somehow reproducible. Is it possible that space in filenames could be a problem?

When exiting without opening; terminal is not clean

Hi, when I exit from fast-p using Ctrl+C or Esc, the terminal leaves debug/help message from the open command. I am using the x86_64 MacOS binary.

How to reproduce:

Run p
Exit using Ctrl+C

macos install [zsh]

When trying to install via homebrew, the fan kicked in really hard.
I checked the CPU and my computer almost exploded. I haven't experienced this before when installing programs.

Initially I got a warning that I'm on Catalina and I might run into problems (e.g. slow processing or sth like that)

On my screenshot, it says ./make.bash, that was the part where it got really loud and kept hanging there.
Anyways. I have the Zshell and not bash.

So I'm not sure what exactly I'm doing wrong.
Would appreciate any hint. Thanks

Preview window does not highlight all matches

On OSX the match isn't highlighted in my preview window for some returned items, but it is in others. (I'm not sure if this is an issue here, or with fzf). For example, if I search "test" the first few results might hate [email protected] directly in the preview window, it just isn't highlighted. If I keep scrolling down, other results show them and are indeed highlighted.

Script doesn't show any pdfs

Finished all the steps, trying it in a folder with numerous pdf files, but the script returns:

1/1

grep: empty (sub)expression

Page limit

The line here makes pdftotext stop scanning on page 2 of each pdf. Is there a reason for that? I have some really long PDFs I would like to parse

Can not open pdf file

Script is passing the line content and not the file name/address to open command. Using Ubuntu 18.04

mac os x - cut binary doesn't support -z option.

I tried to solve it by brew install coreutils; and then using gcut, instead of cut in the function p definition.

However, the pdf reader wasn't launched. Maybe something in the last line is wrong:
| gcut -z -f 1 -d $'\t' | tr -d '\n' | xargs -r --null $open > /dev/null 2> /dev/null

Thank you for this fascinating tool!

Windows support [enhancement]

Hi there,

I've been watching your project since it came up on Hacker News. All the components are now available on Windows through chocolatey. Is it technically possible to implement the same functionality under PowerShell?

pdf-to-text only extracts first tiny part of document

hi^^ i tried today to use fast-p to make some math papers more searchable, but it just stops after a tiny fraction of the document.
using pdftotext on the same pdf works fine, then the whole thing is turned into a txt.
i've attached the cache-file when using fast-p in a folder that contains just a trial file (i had to change the ending bc github didn't let me upload a .db file but the content is the same...)
fast-p_cached_pdftotext_output.txt

PDF preview in both preview window and the search window

While using the p() function to search for pdf, the search window displays only one result at a time and displays the whole document instead of appearing as one liner.

fzf version: 0.53.0 (c4a9ccd6)
fast-p version: 0.2.5

code used in .bashrc

p () {
    open=xdg-open   # this will open pdf file withthe default PDF viewer on KDE, xfce, LXDE and perhaps on other desktops.

    ag -U -g ".pdf$" \
    | fast-p \
    | fzf --read0 --reverse -e -d $'\t'  \
        --preview-window down:80% --preview '
            v=$(echo {q} | tr " " "|"); 
            echo -e {1}"\n"{2} | grep -E "^|$v" -i --color=always;
        ' \
    | cut -z -f 1 -d $'\t' | tr -d '\n' | xargs -r --null $open > /dev/null 2> /dev/null
}

Using the old historical bash code as seen here, correct behaviour is still retained. List of files is presented as one line for each possible file.

How to install it in modern go?

go get is deprecated, and go install did not install the file into the path

bash only?

I put the mentioned code snippet into my .bashrc
but in fact my default shell is zsh [...]
In order to make the p command work, i have to instantiate a bash session.

how can I get all this going in Zsh as well?

Thanks a lot

Add licence?

This is a really interesting approach to dealing with masses of PDFs!

Sorry to be that guy, but would you consider adding whatever is your favourite open source license? (If not, then please don't worry!)

Both the idea and the actual script/bash function most probably cross the threshold for copyrightability.

Thanks and sorry again!

Possible extension by rga

Is it possible to integrate rga so:

improve the speed (even without cache) and
extend the functionality beyond PDF?

[Warning] dpkg: warning missing maintainer

I have installed fast-p from the precompiled binary fast-p_0.1.7_linux_amd64.deb for Ubuntu 18.04. Everything works fine but when I run:
$ sudo apt update
or
$ sudo apt upgrade
I get a warning saying:

dpkg: warning: parsing file '/var/lib/dpkg/status' near line <line-number> package 'fast-p':
 missing maintainer

I think this warning would go away if the package maintainer is added in the precompiled binary.

Output of $ uname -a:

Linux shubham-Z510 4.15.0-33-generic #36-Ubuntu SMP Wed Aug 15 16:00:05 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Output of $ lsb_release -a:

No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.1 LTS
Release:	18.04
Codename:	bionic

Feature request: Open pdf in a detached process

Hello,

unfortunately my command line fu is not advanced enough to modify the startup script myself.
What I would like is that after pressing return the selected pdf is opened but the fast-p terminal stays unchanged. So I could change my search and open another pdf from the same terminal.

"go: finding module ... go: found " trouble installing on arch

I recently installed your super fun code.
Now I'm trying to get it on my other computer.
Alas, I have no experience with installing packages with go.
I get weird behaviour.

After the command go install github.com/bellecp/fast-p@latest I get "finding .. found" but I don't think anything really happens. Could you perhaps provide a hint about what I could do differently?

I checked both directories but there aren't any fast-p binaries ...

I actually tried also doing it manually but I'm just an illiterate and couldnt get it running that way either ...

Thanks

Need tester/feedback for homebrew formula

I am considering a simpler installation for OSX users via a homebrew formula.
Please report any issues/suggestions here -- all feedback welcome.