marcosinger / ruby-readability
This project forked from cantino/ruby-readability
Port of arc90's readability project to ruby
Ruby Readability

Command line:

    (sudo) gem install ruby-readability

Bundler:

    gem "ruby-readability", :require => 'readability'

Example:

    require 'rubygems'
    require 'readability'
    require 'open-uri'

    source = open('http://lab.arc90.com/experiments/readability/').read
    puts Readability::Document.new(source).content

Options:

You may provide additional options to Readability::Document.new, including:

    :tags - the base whitelist of tags to sanitize, defaults to %w[div p]
    :remove_empty_nodes - remove <p> tags that have no text content; also removes p tags that contain only images
    :attributes - whitelist of allowed attributes
    :debug - provide debugging output, defaults false
    :encoding - if this page is of a known encoding, you can specify it; if left unspecified, the encoding will be guessed (only in Ruby 1.9.x)
    :html_headers - in Ruby 1.9.x these will be passed to the guess_html_encoding gem to aid with guessing the HTML encoding

Readability comes with a command-line tool for experimentation in bin/readability.

    Usage: readability [options] URL
        -d, --debug     Show debug output
        -i, --images    Keep images and links
        -h, --help      Show this message

Potential issues:

* If you're on a Mac and are getting segmentation faults, see this discussion https://github.com/tenderlove/nokogiri/issues/404 and consider updating your version of libxml2. Building nokogiri against version 2.7.8 of libxml2 with the following command worked for me:

    gem install nokogiri -- --with-xml2-include=/usr/local/Cellar/libxml2/2.7.8/include/libxml2 --with-xml2-lib=/usr/local/Cellar/libxml2/2.7.8/lib --with-xslt-dir=/usr/local/Cellar/libxslt/1.1.26

===

This code is under the Apache License 2.0. http://www.apache.org/licenses/LICENSE-2.0

This is a Ruby port of arc90's readability project

    http://lab.arc90.com/experiments/readability/

Given an HTML document, it pulls out the main body text and cleans it up.

Ruby port by starrhorne, libc, and iterationlabs. Original gemification by fizx.
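
As a rough sketch of how the options described under "Options" above might be combined in one call: the constant KEEP_MEDIA_OPTIONS and the helpers build_options and extract_content are hypothetical names introduced here for illustration and are not part of the gem. The particular values are an assumption about a reasonable "keep images and links" setup, using only option names listed above.

```ruby
# Hypothetical option set (illustrative, not taken from the gem): widen the
# tag and attribute whitelists so images and links survive sanitizing, and
# keep <p> tags that contain only images.
KEEP_MEDIA_OPTIONS = {
  :tags               => %w[div p img a],
  :attributes         => %w[src href],
  :remove_empty_nodes => false
}.freeze

# Hypothetical helper: merge per-call overrides (for example :debug or
# :encoding) into the base options without modifying the base hash.
def build_options(overrides = {})
  KEEP_MEDIA_OPTIONS.merge(overrides)
end

# Hypothetical helper: return the extracted main content of an HTML string.
# The gem is required lazily so build_options can be tried without it.
def extract_content(html, overrides = {})
  require 'readability'
  Readability::Document.new(html, build_options(overrides)).content
end
```

With the gem installed, something like puts extract_content(source, :debug => true) would then print the extracted content with debugging output enabled.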