
image_scraper's Issues

Fix build warnings

I noticed the following build warnings on CI.

From CI

/home/travis/build/charlotte-ruby/image_scraper/lib/image_scraper/client.rb:10: warning: URI.escape is obsolete
/home/travis/build/charlotte-ruby/image_scraper/lib/image_scraper/client.rb:18: warning: calling URI.open via Kernel#open is deprecated, call URI.open directly or use URI#open
/home/travis/build/charlotte-ruby/image_scraper/lib/image_scraper/client.rb:38: warning: URI.escape is obsolete
/home/travis/build/charlotte-ruby/image_scraper/lib/image_scraper/client.rb:39: warning: URI.escape is obsolete
/home/travis/build/charlotte-ruby/image_scraper/lib/image_scraper/client.rb:52: warning: calling URI.open via Kernel#open is deprecated, call URI.open directly or use URI#open
/home/travis/build/charlotte-ruby/image_scraper/lib/image_scraper/client.rb:67: warning: URI.escape is obsolete
/home/travis/build/charlotte-ruby/image_scraper/lib/image_scraper/client.rb:84: warning: URI.escape is obsolete

I will try to submit a PR for this shortly.
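For reference, here is a minimal sketch of how both kinds of warning are usually silenced. The helper names `escape_url` and `fetch` are hypothetical and not taken from client.rb; the sketch assumes the existing calls only need the old URI.escape behavior and a plain read of the response body.

```ruby
require 'uri'
require 'open-uri'

# Hypothetical helper: percent-encodes unsafe characters (such as spaces)
# using the RFC 2396 parser, which is what the obsolete URI.escape used.
def escape_url(url)
  URI::RFC2396_Parser.new.escape(url)
end

# Hypothetical helper: reads a remote document with URI.open instead of
# Kernel#open, which is what the second warning asks for.
def fetch(url)
  URI.open(escape_url(url)) { |io| io.read }
end
```

Calling `escape_url("http://example.com/a b.jpg")` should give back the URL with the space encoded as `%20`.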

Image with a blank space

This looks like a basic HTML issue, but it's causing a bad URI error:

url

http://www.amazon.com/Planet-Two-Disc-Digital-Combo-Blu-ray/dp/B004LWZW4W/ref=sr_1_1?s=movies-tv&ie=UTF8&qid=1324771542&sr=1-1

error

 (bad URI(is not URI?): %20http://g-ecx.images-amazon.com/images/G/01/SIMON/IsaacsonWalter._V164348457_.jpg):

faulty html sample

 <img height="300" src=" http://g-ecx.images-amazon.com/images/G/01/SIMON/IsaacsonWalter._V164348457_.jpg" style="float: right;" width="450"> 

It looks like the scraper throws the error when an image's src attribute has a space as its first character, which is apparently a common mistake on amazon.com.

I can give you the full trace if that would help, but it seems pretty straightforward. I played with a few lines trying to get it to just ignore the image if the first character was blank, to no avail (I'm still new to Rails). Let me know what other information I can get you to help out.
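One possible way to handle this, sketched under assumptions: trim whitespace from the src value before it is escaped or parsed, and skip the image if nothing usable is left. The helper name `parse_image_src` is hypothetical and is not part of the gem.

```ruby
require 'uri'

# Hypothetical helper: returns a parsed URI for an img src, or nil when
# the attribute is missing, blank, or still invalid after trimming.
def parse_image_src(src)
  cleaned = src.to_s.strip
  return nil if cleaned.empty?

  URI.parse(cleaned)
rescue URI::InvalidURIError
  nil
end
```

A caller could then drop the nil results (for example with `compact`) so that one malformed tag does not abort the whole scrape.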

undefined method `gsub' for nil:NilClass

I'm a bit new to Rails. I was excited to find your scraper, as I believe it will do exactly what I want it to. However, I keep getting this nil error when scraping any amazon.com URL.

Example: "http://www.amazon.com/OtterBox-Universal-Defender-Silicone-Plastic/dp/B004N7EY5S"

It appears that the strip_quotes function is the only thing using gsub, and it has issues when it's given an empty URL.

My thought was that I could just set :include_css_images=>false, as that function only seems to be called when handling stylesheet URLs, but that did not fix the issue.

Again, I'm new to Rails, so I wish I could give more information that might help. If I'm just clueless and missing something obvious, then I do apologize. My hope is only to help make this gem better.
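To illustrate the kind of guard that would avoid this error, here is a hypothetical nil-safe version of strip_quotes. The gem's real implementation is not shown in this issue, so the body below is an assumption about what the function is meant to do (remove surrounding quote characters).

```ruby
# Hypothetical nil-safe variant; the gem's actual strip_quotes may differ.
def strip_quotes(value)
  # Guard against a missing URL instead of calling gsub on nil.
  return '' if value.nil?

  # Remove single or double quotes at the start and end of the string.
  value.gsub(/\A['"]+|['"]+\z/, '')
end
```

With a guard like this, a missing URL yields an empty string that the caller can skip, rather than raising NoMethodError.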
