skimr's Introduction

skimr

A recursive web scraper written in Rust.

Prints the text of every element matched by a nested list of tag selectors.

Installation:

cargo install --git https://github.com/timepigeon/skimr

Requires the Rust toolchain, available at https://rustup.rs/.
Note that ~/.cargo/bin/ must be in your $PATH.

Usage:

skimr [website] [selectors 1..n]

Each selector is either a tag only, tag#id or tag.class.

Example:

skimr news.ycombinator.com table#hnmain td.title a.storylink

Output:
Enigma, the Bombe, and Typex
Mathigon – an interactive, personalized mathematics textbook
Apple iPhone SE Available on Apple Store Again
Integer multiplication in time O(n log n) [pdf]
The cortex is a neural network of neural networks
Local Variables · Crafting Interpreters
Fyne: Cross-Platform GUI in Go Based on Material Design
A Meta Lesson
The Elaborate, Dying Art of Hustling for Money at Dave and Buster's
Credder Wants to Create an Equivalent to “Rotten Tomatoes” for News
A Short History of Chaosnet (2018)
M-16: A Bureaucratic Horror Story (1981)
Prince Of Persia Code Review (2013)
Rotating Black Holes May Serve as Gentle Portals for Hyperspace Travel
Ask HN: Best way to test accessibility of a website?
[..]
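
The selector forms listed under Usage (tag only, tag#id, tag.class) can be sketched as a small parser. This is a minimal illustration, not skimr's actual implementation: the names Selector, parse_selector, to_css and chain_to_css are hypothetical, and joining the selectors with spaces (the CSS descendant combinator) is only one plausible reading of how a nested selector list could be applied.

```rust
/// One step in a selector chain: a tag with an optional id or class.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
struct Selector {
    tag: String,
    id: Option<String>,
    class: Option<String>,
}

/// Parse `tag`, `tag#id` or `tag.class`. Returns None if any part is empty.
/// Only these three forms are handled; combined forms are not validated.
fn parse_selector(s: &str) -> Option<Selector> {
    let (tag, id, class) = if let Some((tag, id)) = s.split_once('#') {
        (tag, Some(id), None)
    } else if let Some((tag, class)) = s.split_once('.') {
        (tag, None, Some(class))
    } else {
        (s, None, None)
    };
    // Reject inputs such as "", "#id" or "td."
    if tag.is_empty()
        || id.map_or(false, str::is_empty)
        || class.map_or(false, str::is_empty)
    {
        return None;
    }
    Some(Selector {
        tag: tag.to_string(),
        id: id.map(String::from),
        class: class.map(String::from),
    })
}

/// Render a parsed selector back into CSS syntax.
fn to_css(sel: &Selector) -> String {
    match (&sel.id, &sel.class) {
        (Some(id), _) => format!("{}#{}", sel.tag, id),
        (None, Some(class)) => format!("{}.{}", sel.tag, class),
        (None, None) => sel.tag.clone(),
    }
}

/// Assumption: each selector is searched inside the matches of the previous
/// one, which corresponds to joining them with the descendant combinator.
fn chain_to_css(args: &[&str]) -> Option<String> {
    let parts: Option<Vec<String>> = args
        .iter()
        .map(|a| parse_selector(a).map(|s| to_css(&s)))
        .collect();
    parts.map(|p| p.join(" "))
}
```

Under these assumptions, the three selectors from the example above would combine into the single descendant selector "table#hnmain td.title a.storylink", and a malformed argument such as "td." would make the whole chain return None.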

skimr's People

Contributors

thepigeonoftime

skimr's Issues

Compilation issue with nightly

Compilation with the nightly compiler 1.43.0 on Ubuntu 18.04 aborts with this error message:

error[E0506]: cannot assign to `self.input.cached_token` because it is borrowed
   --> /home/vincent/.cargo/registry/src/github.com-1ecc6299db9ec823/cssparser-0.24.1/src/parser.rs:572:17
    |
547 |     pub fn next_including_whitespace_and_comments(&mut self) -> Result<&Token<'i>, BasicParseError<'i>> {
    |                                                   - let's call the lifetime of this reference `'1`
...
560 |             Some(ref cached_token)
    |                  ---------------- borrow of `self.input.cached_token` occurs here
...
572 |                 self.input.cached_token = Some(CachedToken {
    |                 ^^^^^^^^^^^^^^^^^^^^^^^ assignment to borrowed `self.input.cached_token` occurs here
...
584 |         Ok(token)
    |         --------- returning this value requires that `self.input.cached_token.0` is borrowed for `'1`

Just to let you know that the code doesn't compile at this point in time.
