
fast-p's Introduction

fast-p

Quickly find and open a pdf among a collection of thousands of unsorted pdfs through fzf (fuzzy finder)

Installation on Unix or Linux based systems

  1. Requirements. Make sure the following requirements are satisfied:

    • install pdftotext. This comes with the texlive distribution on Linux. On Ubuntu, run sudo apt-get install poppler-utils.
    • install fzf: https://github.com/junegunn/fzf
    • install GNU grep, ag (silver searcher).
  2. Install binary. Do either one of the two steps below:

    • Compile from source with go install. With a working golang installation, run go install github.com/bellecp/fast-p@latest. It will fetch the code and its dependencies, compile and create an executable fast-p in the /bin folder of your go installation, typically ~/go/bin. Make sure the command fast-p can be found (for instance, add ~/go/bin to your $PATH).
    • Or: Use the precompiled binary for your architecture. Download the binary that corresponds to your architecture at https://github.com/bellecp/fast-p/releases and make sure that the command fast-p can be found. For instance, put the binary file fast-p in ~/custom/bin and add export PATH=~/custom/bin:$PATH to your .bashrc.
  3. Tweak your .bashrc. Add the following code to your .bashrc

p () {
    open=xdg-open   # this will open the pdf file with the default PDF viewer on KDE, xfce, LXDE and perhaps on other desktops.

    ag -U -g ".pdf$" \
    | fast-p \
    | fzf --read0 --reverse -e -d $'\t'  \
        --preview-window down:80% --preview '
            v=$(echo {q} | tr " " "|"); 
            echo -e {1}"\n"{2} | grep -E "^|$v" -i --color=always;
        ' \
    | cut -z -f 1 -d $'\t' | tr -d '\n' | xargs -r --null $open > /dev/null 2> /dev/null
}

  • You may replace ag -U -g ".pdf$" with another command that returns a list of pdf files.
  • You may replace open=... by your favorite PDF viewer, for instance open=evince or open=okular.
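Before adding p to your .bashrc, it can help to confirm that the requirements from step 1 are actually on your PATH. The following is a minimal sketch; the function name check_deps is hypothetical and not part of fast-p.

```shell
# Report which of the given commands are missing from PATH.
# Returns 0 when every command is found, 1 otherwise.
# (check_deps is an illustrative helper, not part of fast-p.)
check_deps () {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" > /dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Example: check_deps pdftotext fzf ag grep fast-p
```

If the example call prints nothing and returns 0, all listed tools were found.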

Installation on OSX with homebrew

  1. Install homebrew and run
brew install bellecp/fast-p/fast-pdf-finder

The above brew formula is experimental. Please report any issues/suggestions/feedback at #11

  2. Tweak your .bashrc. Add the following code to your .bashrc
p () {
    local open
    open=open   # on OSX, "open" opens a pdf in preview
    ag -U -g ".pdf$" \
    | fast-p \
    | fzf --read0 --reverse -e -d $'\t'  \
        --preview-window down:80% --preview '
            v=$(echo {q} | gtr " " "|"); 
            echo -e {1}"\n"{2} | ggrep -E "^|$v" -i --color=always;
        ' \
    | gcut -z -f 1 -d $'\t' | gtr -d '\n' | gxargs -r --null $open > /dev/null 2> /dev/null
}

  • You may replace ag -U -g ".pdf$" with another command that returns a list of pdf files.
  • You may replace open=... by your favorite PDF viewer, for instance open=evince or open=okular.
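As one example of "another command that returns a list of pdf files", find can produce the list instead of ag. This is a sketch under the assumption that fast-p only needs one file path per line on standard input, as it receives from ag; the function name list_pdfs is hypothetical.

```shell
# List PDF files below the given directory (default: current directory),
# one path per line, matching the extension case-insensitively.
# (list_pdfs is an illustrative helper, not part of fast-p.)
list_pdfs () {
    find "${1:-.}" -type f -iname '*.pdf'
}

# In the p function, the first pipeline stage would then read:
#     list_pdfs \
#     | fast-p \
#     ...
```

The -iname test is available in both GNU find and the find shipped with OSX.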

Remark: On OSX, we use the command line tools gcut, gxargs, ggrep, gtr which are the GNU versions of the tools cut, xargs, grep, tr. This way, we avoid the specifics of the versions of these tools pre-installed on OSX, and the same .bashrc code can be used for both OSX and GNU Linux systems.
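If you want a single snippet that picks the g-prefixed tool where it exists and the plain tool otherwise, a small helper can do the lookup. This is only a sketch; gnu_tool is a hypothetical name. Note that the fallback is only useful on systems where the plain tool is already the GNU version: the cut pre-installed on OSX does not support -z.

```shell
# Print the g-prefixed name of a tool (e.g. gcut) if it is installed,
# otherwise fall back to the plain name (e.g. cut).
# (gnu_tool is an illustrative helper, not part of fast-p.)
gnu_tool () {
    if command -v "g$1" > /dev/null 2>&1; then
        echo "g$1"
    else
        echo "$1"
    fi
}

# Example:
#     cut_cmd=$(gnu_tool cut)
#     xargs_cmd=$(gnu_tool xargs)
```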

Usage

Use the command p to browse among the PDF files in the current directory and its subdirectories.

The first run of the command will take some time to cache the text extracted from each pdf. Further runs of the command will be much faster since the text extraction will only apply to new pdfs.
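To illustrate why later runs are faster, here is a minimal sketch of the general idea of caching extracted text. It is not fast-p's actual implementation: the function name cached_extract, the one-file-per-entry cache layout and the checksum key are all assumptions made for the example.

```shell
# Print the text for FILE, running the extractor only on a cache miss.
# Usage: cached_extract CACHE_DIR FILE EXTRACTOR [ARGS...]
# The extractor is called with FILE appended as its last argument and
# must write the extracted text to standard output.
# (Illustrative only; fast-p stores its cache differently.)
cached_extract () {
    cache_dir=$1
    file=$2
    shift 2
    # Key the cache on a checksum of the file contents.
    key=$(cksum < "$file" | tr ' ' '_')
    mkdir -p "$cache_dir"
    if [ ! -f "$cache_dir/$key" ]; then
        # Write to a temporary file first so a failed extraction
        # does not leave an empty cache entry behind.
        "$@" "$file" > "$cache_dir/$key.tmp" || return 1
        mv "$cache_dir/$key.tmp" "$cache_dir/$key"
    fi
    cat "$cache_dir/$key"
}

# Example with pdftotext, which writes to stdout when the output file is "-":
#     cached_extract /tmp/demo-cache paper.pdf sh -c 'pdftotext "$1" -' _
```

The second call for the same file skips the extractor and only reads the cached entry.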

How to clear the cache?

To clear the cache (which contains text extracted from PDF), you can run 'fast-p --clear-cache'. This will safely remove the file located at: ~/.cache/fast-p-pdftotext-output/fast-p_cached_pdftotext_output.db

For older versions, please manually delete the cache file found at ~/.cache/fast-p_cached_pdftotext_output.db
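For the manual deletion on older versions, a small function along these lines could be used. It is a sketch; clear_old_cache is a hypothetical name, and the optional argument exists only so the function can be tried against a directory other than your real home.

```shell
# Remove the cache file used by older versions, if it exists.
# The optional argument overrides the home directory.
# (clear_old_cache is an illustrative helper, not part of fast-p.)
clear_old_cache () {
    old_cache="${1:-$HOME}/.cache/fast-p_cached_pdftotext_output.db"
    if [ -f "$old_cache" ]; then
        rm -- "$old_cache" && echo "removed $old_cache"
    else
        echo "no old cache file at $old_cache"
    fi
}
```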

Launch with keyboard shortcut in Ubuntu

On Ubuntu desktop (tested in 18.04), one may add a keyboard shortcut to launch a new terminal running the p command right away. With the following script, the new terminal window will automatically close after choosing a PDF.

Create a file ~/.fast-p-rc with

source .bashrc
p;
sleep 0.15; exit;

and in Ubuntu Settings/Keyboard, add a custom shortcut that runs the command gnome-terminal -- sh -c "bash --rcfile .fast-p-rc".

See it in action

illustration of the p command

Is the historical bash code still available?

Yes, see https://github.com/bellecp/fast-p/blob/master/p but using the go binary as explained above is recommended for speed and interoperability.


fast-p's Issues

No such file or directory error

When I tried running the p command I got the error like this:

[qoqosz:~/Downloads]$ p   
Loading _of_Combinatorial_Topology.pdf...  
Loading son,_Ronald_L.pdf...  
Loading toip-depo.pdf...  .pdf             
p:48: no such file or directory: /Users/qoqosz/.cache/pdftotext/00ce81e5cab6f38ftebooks-tmp/Foundations_of_Combinatorial_Topology.pdf

I'm on a Mac and have installed all the requirements. Not sure what steps should I list to make this issue somehow reproducible. Is it possible that space in filenames could be a problem?

macos install [zsh]

When trying to install via homebrew, the fan kicked in real hard.
I checked the CPU and my computer almost exploded. I haven't experienced this before when installing programs.

Screenshot 2023-07-11 at 20 37 00

Initially I got a warning that I'm on Catalina and I might run into problems (e.g. slow processing or sth like that)

On my screenshot, it says ./make.bash, that was the part where it got really loud and kept hanging there.
Anyways. I have the Zshell and not bash.

So I'm not sure what exactly I'm doing wrong.
Would appreciate any hint. Thanks

Preview window does not highlight all matches

On OSX the match isn't highlighted in my preview window for some returned items, but it is in others. (I'm not sure if this is an issue here, or with fzf). For example, if I search "test" the first few results might have [email protected] directly in the preview window, it just isn't highlighted. If I keep scrolling down, other results show them and are indeed highlighted.

Script doesn't show any pdfs

Finished all the steps, trying it in a folder with numerous pdf files, but the script returns:

1/1

grep: empty (sub)expression

Page limit

The line here makes pdftotext stop scanning on page 2 of each pdf. Is there a reason for that? I have some really long PDFs I would like to parse

Can not open pdf file

Script is passing the line content and not the file name/address to open command. Using Ubuntu 18.04

mac os x - cut binary doesn't support -z option.

I tried to solve it by brew install coreutils; and then using gcut, instead of cut in the function p definition.

However, the pdf reader wasn't launched. Maybe something in the last line is wrong:
| gcut -z -f 1 -d $'\t' | tr -d '\n' | xargs -r --null $open > /dev/null 2> /dev/null

Thank you for this fascinating tool!

Windows support [enhancement]

Hi there,

I've been watching your project since it came up on Hacker News. All the components are now available on Windows through chocolatey. Is it technically possible to implement the same functionality under PowerShell?

pdf-to-text only extracts first tiny part of document

hi^^ i tried today to use fast-p to make some math papers more searchable, but it just stops after a tiny fraction of the document.
using pdftotext on the same pdf works fine, then the whole thing is turned into a txt.
i've attached the cache-file when using fast-p in a folder that contains just a trial file (i had to change the ending bc github didn't let me upload a .db file but the content is the same...)
fast-p_cached_pdftotext_output.txt

PDF preview in both preview window and the search window

While using the p() function to search for pdf, the search window displays only one result at a time and displays the whole document instead of appearing as one liner.

fzf version: 0.53.0 (c4a9ccd6)
fast-p version: 0.2.5

code used in .bashrc

p () {
    open=xdg-open   # this will open the pdf file with the default PDF viewer on KDE, xfce, LXDE and perhaps on other desktops.

    ag -U -g ".pdf$" \
    | fast-p \
    | fzf --read0 --reverse -e -d $'\t'  \
        --preview-window down:80% --preview '
            v=$(echo {q} | tr " " "|"); 
            echo -e {1}"\n"{2} | grep -E "^|$v" -i --color=always;
        ' \
    | cut -z -f 1 -d $'\t' | tr -d '\n' | xargs -r --null $open > /dev/null 2> /dev/null
}

Screenshot-2024-06-11_17 25 52

Using the old historical bash code as seen here, correct behaviour is still retained. List of files is presented as one line for each possible file.

Screenshot-2024-06-11_17 22 02

bash only?

I put the mentioned code snippet into my .bashrc
but in fact my default shell is zsh [...]
In order to make the p command work, i have to instantiate a bash session.

how can I get all this going in Zsh as well?

Thanks a lot

Add licence?

This is a really interesting approach to dealing with masses of PDFs!

Sorry to be that guy, but would you consider adding whatever is your favourite open source license? (If not, then please don't worry!)

Both the idea and the actual script/bash function most probably cross the threshold for copyrightability.

Thanks and sorry again!

Possible extension by rga

Is it possible to integrate rga so:

  1. improve the speed (even without cache) and
  2. extend the functionality beyond PDF?

[Warning] dpkg: warning missing maintainer

I have installed fast-p from the precompiled binary fast-p_0.1.7_linux_amd64.deb for Ubuntu 18.04. Everything works fine but when I run:
$ sudo apt update
or
$ sudo apt upgrade
I get a warning saying:

dpkg: warning: parsing file '/var/lib/dpkg/status' near line <line-number> package 'fast-p':
 missing maintainer

I think this warning would go away if the package maintainer is added in the precompiled binary.

Output of $ uname -a:

Linux shubham-Z510 4.15.0-33-generic #36-Ubuntu SMP Wed Aug 15 16:00:05 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Output of $ lsb_release -a:

No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.1 LTS
Release:	18.04
Codename:	bionic

Feature request: Open pdf in a detached process

Hello,

unfortunately my command line fu is not advanced enough to modify the startup script myself.
What I would like is that after pressing return the selected pdf is opened but the fast-p terminal stays unchanged. So I could change my search and open another pdf from the same terminal.

"go: finding module ... go: found " trouble installing on arch

I recently installed your super fun code.
Now I'm trying to get it on my other computer.
Alas, I have no experience with installing packages with go.
I get weird behaviour.

After the command go install github.com/bellecp/fast-p@latest I get " finding .. found" but I don't think anything happens really. Could you perhaps provide a hint what I could do differently?

pic-selected-230715-0632-34
pic-selected-230715-0632-14

I checked both directories but there aren't any fast-p binaries ...

I actually also tried doing it manually but I'm just an illiterate and couldn't get it running that way either ...

[feature request] image preview

It would be great to have an image preview integrated with fast-p and the fuzzy finder. The idea is to see the cover image of the selected pdf file. Does anyone have an idea how to implement that?

Is there a way to jump to the line in preview window?

I am using this to search for text in long pdfs, so the searched term might not be visible in the preview window without scrolling a lot.

From what I can tell the preview window always starts on the first line, is there any way to have it start on the line of the first match?

Thanks
