mrusme / reader

reader is for your command line what the “readability” view is for modern browsers: A lightweight tool offering better readability of web pages on the CLI.

Home Page: https://xn--gckvb8fzb.com/reader-web-page-readability-on-the-cli/

License: GNU General Public License v3.0

Go 100.00%
cli tui command-line command-line-tool html markdown html-to-markdown ascii ascii-art readability

reader's Introduction

reader parses a web page for its actual content and displays it as nicely highlighted text on the command line. It also renders embedded images from that page as colored block renders in the terminal.

Usage

reader https://xn--gckvb8fzb.com/superhighway84/

Don't render images:

reader --image-mode none https://xn--gckvb8fzb.com/superhighway84/

Output raw markdown, don't pretty print:

reader -o https://xn--gckvb8fzb.com/superhighway84/

Read from file:

reader ${HOME}/downloads/example.com.html

Read from stdin:

curl -o - https://superhighway84.com | reader -

Render images using the SIXEL graphics encoder:

reader --image-mode sixel https://xn--gckvb8fzb.com/travel-aruba/

More options:

reader -h

Examples

Using reader from within w3m

While on a web page in w3m, press ! and enter the following:

reader $W3M_URL

This will open the current URL with reader. w3m will wait for you to press any key before resuming browsing.

If you want to navigate through the page:

reader $W3M_URL | less -R

Using reader from within vim/neovim

Add the following function/mapping to your init.vim:

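" Yank the WORD under the cursor (e.g. a URL) into register u,
" then show the output of reader for it in a vertical terminal split.
function s:vertopen_url()
  normal! "uyiW
  let mycommand = "reader " . @u
  execute "vertical terminal " . mycommand
endfunction
" Expose the function as a <Plug> mapping and bind it to gx in normal mode.
noremap <Plug>vertopen_url : call <SID>vertopen_url()<CR>
nmap gx <Plug>vertopen_url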

Open a document and place the cursor on a link, then press g followed by x. Vim will open a new terminal and show you the output of reader.

reader's People

Contributors

abeestrada, dependabot[bot], dnalor, mrusme

reader's Issues

Improve visibility on demonstration GIF

Hi,

Can you please fix your demonstration GIF so that it is accessible and makes the use case of your project easy to understand?

The current gif is entirely unusable to the point where I think it's a joke, a trolling attempt.

Crop the rest of your very beautiful wallpaper and present a more visible section of the actual demo / your terminal.

Thank you for considering this change in advance.

[BUGS] 1. Bad output in an edge case, 2. Bad output with extra backslash

  1. Three asterisks alone (edge case) rendered badly
$ reader https://readhive.org/series/38553/0/ | sed -n 25p
\\\\\\*

$ reader -o https://readhive.org/series/38553/0/ | sed -n 21p
\\*\\*\\*

$ reader -o https://readhive.org/series/38553/0/ | sed -n 21p | lowdown | w3m -T text/html -dump
\\\*

$ reader -o https://readhive.org/series/38553/0/ | sed -n 21p | glow
\\\\\\*

$ ### firefox (default, output centered):
***

$ ### firefox (reader view):
***

$ ### reader -o (output to github "GFM"):
\\*\\*\\*
  • reader (default pretty output) renders it badly [BUG].
  • reader --markdown-output (raw markdown output) renders it badly [BUG].
  • glow, lowdown, and GitHub render the raw markdown output from reader, so their result is bad as well.
  • Both Firefox and the Firefox reader view render it correctly.
  • I don't know why, but the default reader output appears to add some lines compared to the raw markdown output, which is why the line number passed to sed differs.

When the web page has a line consisting only of three consecutive asterisks (***), unescaped Markdown displays a horizontal line. Because the page contains literal asterisks here, the output needs at least one escape character \ (backslash) before them to be displayed correctly. Below, the first three forms are correct and display literal asterisks; the last one is incorrect here (it produces a horizontal line, which could equally be written as ---):

$ echo '\***' | lowdown
<p>***</p>
$ echo '\*\**' | lowdown
<p>***</p>
$ echo '\*\*\*' | lowdown
<p>***</p>
$ echo '***' | lowdown
<hr/>
  2. Extra unnecessary (and problematic) backslash escaping square brackets
$ reader https://readhive.org/series/38553/0/ | sed -n 48p
\[… There is no end in sight.\]

$ reader -o https://readhive.org/series/38553/0/ | sed -n 41p
\[… There is no end in sight.\]

$ reader -o https://readhive.org/series/38553/0/ | sed -n 41p | lowdown | w3m -T text/html -dump
[… There is no end in sight.]

$ reader -o https://readhive.org/series/38553/0/ | sed -n 41p | glow
\[… There is no end in sight.\]

$ ### firefox (default, output centered):
[… There is no end in sight.]

$ ### firefox (reader view):
[… There is no end in sight.]

$ ### reader -o (output to github):
[… There is no end in sight.]
  • reader (default pretty output) shouldn't show the \ (backslash) [BUG]
  • glow output shouldn't show the \ (backslash) [glow BUG]
  • reader --markdown-output (raw markdown output) may show the \ (backslash), because every renderer except glow displays the text without it, but it is unnecessary. Also tested with dingus and commonmark, which are the reference for the specification.
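The escaping rule described in the first bug can be sketched as follows. This is only an illustration of the idea in Python, not reader's actual conversion code: the function name `escape_literal_rule` and the regular expression are assumptions made for this example, based on the CommonMark definition of a thematic break (three or more matching `*`, `-`, or `_` characters, optionally separated by spaces or tabs, with at most three spaces of indentation).

```python
import re

# A line that Markdown would parse as a thematic break (horizontal rule).
# Hypothetical helper pattern for this sketch, following the CommonMark rule.
THEMATIC_BREAK = re.compile(r"^( {0,3})([-_*])((?:[ \t]*\2){2,}[ \t]*)$")


def escape_literal_rule(line: str) -> str:
    """Escape the first marker of a line that should stay literal text.

    `line` is a single line without its trailing newline. Lines that would
    not be parsed as a thematic break are returned unchanged, so no
    unnecessary backslashes are added.
    """
    match = THEMATIC_BREAK.match(line)
    if match is None:
        return line
    indent, marker, rest = match.groups()
    # One backslash before the first marker is enough to break the rule.
    return f"{indent}\\{marker}{rest}"
```

With this sketch, a literal `***` line becomes `\***`, which matches the first of the correct forms listed in the issue, while ordinary text such as `**bold**` passes through untouched.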

panic: runtime error: invalid memory address or nil pointer dereference

Segfault on first run.

$ reader https://example.com
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xc2a4a5]

goroutine 1 [running]:
github.com/mrusme/reader/cmd.MakeReadable(0xc0006ffd00?)
        /home/runner/work/reader/reader/cmd/root.go:61 +0x265
github.com/mrusme/reader/cmd.glob..func1(0x1534840?, {0xc0001c35f0?, 0x1?, 0x1?})
        /home/runner/work/reader/reader/cmd/root.go:214 +0x52
github.com/spf13/cobra.(*Command).execute(0x1534840, {0xc0000c4010, 0x1, 0x1})
        /home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:876 +0x67b
github.com/spf13/cobra.(*Command).ExecuteC(0x1534840)
        /home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:990 +0x3b4
github.com/spf13/cobra.(*Command).Execute(...)
        /home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:918
github.com/mrusme/reader/cmd.Execute()
        /home/runner/work/reader/reader/cmd/root.go:267 +0xf9
main.main()
        /home/runner/work/reader/reader/reader.go:6 +0x17

External image viewer

Useful in many scenarios:

  • GUI viewers (X11 or Wayland): nsxiv, imv, feh, ImageMagick's display, etc.
  • Terminal emulators (ANSI colored blocks, ASCII): aalib, libcaca, viu, chafa, timg, etc. Some work in a TTY. Some use terminal features such as Sixel, or terminal-specific protocols such as those of Kitty and iTerm. It could also be possible to open the image in a new multiplexer (tmux, screen) tab, window, or pane.
  • TTY: framebuffer viewers (close to GUI viewers, but in a pure terminal): fim, fbi, fbvis, jfbview, etc.

The viewer could be configured with:

  • A shell variable, as nnn does.
  • A CLI option parameter.
  • A config or ini file inside the ~/.config folder.
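One way the three configuration sources above might be combined is a simple precedence chain: CLI option first, then environment variable, then config file. The sketch below is written in Python purely for illustration and does not describe anything reader implements. The variable name `READER_IMAGE_VIEWER`, the config path `~/.config/reader/config.ini`, and the `[images]` section with a `viewer` key are all hypothetical.

```python
import configparser
import os
from pathlib import Path
from typing import Mapping, Optional, Union


def resolve_image_viewer(
    cli_value: Optional[str],
    env: Optional[Mapping[str, str]] = None,
    config_path: Optional[Union[str, Path]] = None,
) -> Optional[str]:
    """Return the external viewer command, or None if none is configured."""
    # 1. An explicit CLI option wins.
    if cli_value:
        return cli_value

    # 2. Hypothetical environment variable, similar in spirit to nnn.
    env = os.environ if env is None else env
    from_env = env.get("READER_IMAGE_VIEWER")
    if from_env:
        return from_env

    # 3. Hypothetical ini file; a missing file is silently ignored.
    path = config_path or Path.home() / ".config" / "reader" / "config.ini"
    parser = configparser.ConfigParser()
    parser.read(path)
    return parser.get("images", "viewer", fallback=None)
```

For example, `resolve_image_viewer(None, env={"READER_IMAGE_VIEWER": "imv"})` would pick `imv`, while passing `"feh"` as the first argument would override both the environment and the file.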

Fail to parse email's Html containing french punctuation and a quote.

I use reader as the first step in a script that produces output for the Neomutt email client's pager.
The script receives the raw HTML, then pipes it as markdown to pandoc, elinks, and finally less (to add references and colors).

That's the best solution I found to get something clean, formatted, and highlighted for Neomutt's HTML display.

Issue

A few days ago, however, I noticed a message for which reader did not display the sender's reply, only the quoted part.

It may be related to the Gmail HTML formatting or to the text itself.

Example

The message is a reply to my previous message and was sent from Gmail. (I replaced private text with X's.)

  • HTML
<div dir="auto">Hello,<div dir="auto"><br></div><div dir="auto">Merci d&#39;y avoir pensé. 🙂</div><div dir="auto">X&#39;xxx xxxxxxxxx. X&#39;xx xxxxx x&#39;xxxxxxxx.</div><div dir="auto"><br></div><div dir="auto">X&#39;xx xxxxxxxxxxx.</div><div dir="auto">Xxx xxxxxxxx 🙂</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Le lun. 30 oct. 2023 à 17:28, Tomasz Kapias &lt;<a href="mailto:[email protected]">[email protected]</a>&gt; a écrit :<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><u></u><div><div>Xxxx xxxxxxx,</div><div><br></div><div>Xxxxx xx xxxxxx. Xxxxxxxx xxxx x&#39;xxx xxxxxx, xxxxx x xxxxxxx.</div><div><br></div><div>Xx xxxx x&#39;xx xxxxxx x&#39;xxx xxxxx xxx xx xxx x&#39;xxx xxxx, xx x&#39;xx xxxx xx xxx. xxxx x xxx x&#39;xxx xxxx x&#39;xx xxxx x xxx, xxxxxx, xx x&#39;xxx xxx x&#39;xxxx xxxx. xxx xxx x&#39;xxx xxxx.</div><div><br></div><div><br></div><div>Xx x&#39;xxxxxx, bonne soirée.</div><div><br></div><div>Tomasz<br></div></div></blockquote></div>
  • reader output for reader --image-mode none --markdown-output --verbose message.html:
Le lun. 30 oct. 2023 à 17:28, Tomasz Kapias < [[email protected]](mailto:[email protected]) \> a écrit :

> Xxxx xxxxxxx,
>
> Xxxxx xx xxxxxx. Xxxxxxxx xxxx x'xxx xxxxxx, xxxxx x xxxxxxx.
>
> Xx xxxx x'xx xxxxxx x'xxx xxxxx xxx xx xxx x'xxx xxxx, xx x'xx xxxx xx xxx. xxxx x xxx x'xxx xxxx x'xx xxxx x xxx, xxxxxx, xx x'xxx xxx x'xxxx xxxx. xxx xxx x'xxx xxxx.
>
> Xx x'xxxxxx, bonne soirée.
>
> Tomasz
  • Display in Firefox: (screenshot)

The part above the quote is not parsed by reader.

sixel

Hi there :)

Would a sixel integration (maybe as an option) be possible?
Would love to have high quality pics, especially for webpages with artwork

maybe with https://github.com/mattn/go-sixel

Enable cookies

It appears that any website that uses certain Cloudflare security checks returns:

Please enable cookies.

You are unable to access example.com

Why have I been blocked?
This website is using a security service to protect itself from online attacks. The action you just performed triggered the security solution. There are several actions that could trigger this block including submitting a certain word or phrase, a SQL command or malformed data.

What can I do to resolve this?
You can email the site owner to let them know you were blocked. Please include what you were doing when this page came up and the Cloudflare Ray ID found at the bottom of this page.

Is it possible to enable cookies?

Thank You
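Until reader handles cookies itself, a possible workaround is to fetch the page with a cookie-aware HTTP client and pipe the HTML to `reader -`, which reads from stdin as shown in the usage section. The Python sketch below assumes nothing about reader's internals. The helper names `make_cookie_opener` and `read_with_cookies` are made up for this example, and a cookie jar alone may not be enough for checks that require JavaScript.

```python
import http.cookiejar
import subprocess
import urllib.request


def make_cookie_opener():
    """Build a urllib opener that stores and resends cookies."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def read_with_cookies(url: str) -> None:
    """Fetch `url` with cookies enabled and hand the HTML to `reader -`.

    Raises urllib.error.HTTPError if the server still rejects the request,
    and subprocess.CalledProcessError if reader exits with an error.
    """
    opener, _jar = make_cookie_opener()
    with opener.open(url, timeout=30) as response:
        html = response.read()
    # `reader -` reads the page from standard input.
    subprocess.run(["reader", "-"], input=html, check=True)
```

Calling `read_with_cookies("https://example.com")` would then behave like `curl ... | reader -`, but with cookies set during redirects sent back on later requests.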

Any differences between -i and -o?

E.g., this comparison:

diff <(reader -o https://www.bbc.com/sport/football/60548685) <(reader -i https://www.bbc.com/sport/football/60548685)

reports no differences for me :)

With -i, wouldn't it be more logical to output just the usual output simply minus images? -o, on the other hand, looks just fine :)

(Also, an additional line break at the end of -o would be nice, don't you think?)

FR: Ability to get title in raw markdown mode

In the normal mode, the title is present at the top of the article as one would expect.

But in raw markdown mode (-o), there is only the body text, with no title.

Consider including the title in markdown mode as well, perhaps in the typical markdown manner, like so:

# Title here

Body text here...
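As a stopgap, the requested layout can be produced outside of reader by taking the page's `<title>` and prepending it to the `reader -o` output. The following Python sketch only illustrates that workaround: `TitleParser`, `extract_title`, and `with_title` are hypothetical names, and using the `<title>` element is an assumption, since reader may derive the title it shows in normal mode differently.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text of the first <title> element in a document."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title>, e.g. inside SVG

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)


def extract_title(html: str) -> str:
    parser = TitleParser()
    parser.feed(html)
    return "".join(parser.title_parts).strip()


def with_title(title: str, markdown_body: str) -> str:
    """Prepend the title as a level-one heading, if there is one."""
    if not title:
        return markdown_body
    return f"# {title}\n\n{markdown_body}"
```

Here `markdown_body` would be the text captured from `reader -o`, and the result follows the `# Title here` layout suggested above.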

CloudFront error?

When I try reader https://thecyberwire.com/newsletters/daily-briefing/11/27, I get the following error:

    The request could not be satisfied

  --------

  Request blocked. We can't connect to the server for this app or website at this time. There might be too much traffic or a configuration error. Try again later, or contact the app or website owner.

  If you provide content to customers through CloudFront, you can find steps to troubleshoot and help prevent this error by reviewing the CloudFront documentation.

  --------

    Generated by cloudfront (CloudFront)
    Request ID: i4xQcPMNFlmTno3nrQbJTLYqCSqGzgRnvJaD4CxQk6_p4OsoJwd8CA==

At the same time, I can open the very same page in w3m without any problems.
