image_scraper's People

Contributors

dependabot[bot], github-actions[bot], invalidusrname, jaymcaliley, johnmcaliley, syoder

image_scraper's Issues

Fix build warnings

I noticed the following build warnings on CI.

From CI

/home/travis/build/charlotte-ruby/image_scraper/lib/image_scraper/client.rb:10: warning: URI.escape is obsolete
/home/travis/build/charlotte-ruby/image_scraper/lib/image_scraper/client.rb:18: warning: calling URI.open via Kernel#open is deprecated, call URI.open directly or use URI#open
/home/travis/build/charlotte-ruby/image_scraper/lib/image_scraper/client.rb:38: warning: URI.escape is obsolete
/home/travis/build/charlotte-ruby/image_scraper/lib/image_scraper/client.rb:39: warning: URI.escape is obsolete
/home/travis/build/charlotte-ruby/image_scraper/lib/image_scraper/client.rb:52: warning: calling URI.open via Kernel#open is deprecated, call URI.open directly or use URI#open
/home/travis/build/charlotte-ruby/image_scraper/lib/image_scraper/client.rb:67: warning: URI.escape is obsolete
/home/travis/build/charlotte-ruby/image_scraper/lib/image_scraper/client.rb:84: warning: URI.escape is obsolete

I will try to submit a PR for this shortly.
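One possible way to silence both kinds of warning is sketched below. It is not the gem's actual code: `escape_url` and `fetch_body` are hypothetical helper names, and the sketch assumes that `URI::RFC2396_Parser#escape` is an acceptable stand-in for the obsolete `URI.escape` (which was removed in Ruby 3.0) and that `URI.open` from `open-uri` replaces the implicit `Kernel#open` call.

```ruby
require 'uri'
require 'open-uri'

# Hypothetical helper: percent-encodes unsafe characters such as spaces,
# standing in for the obsolete URI.escape.
def escape_url(url)
  URI::RFC2396_Parser.new.escape(url)
end

# Hypothetical helper: calls URI.open directly instead of relying on
# Kernel#open, which is what the deprecation warning asks for.
def fetch_body(url)
  URI.open(escape_url(url), &:read)
end
```

For example, `escape_url("http://example.com/a b.jpg")` returns `"http://example.com/a%20b.jpg"`. Whether this escaping matches what client.rb needs at each warning site would have to be checked against the real code.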

Image with a blank space

This looks like a basic HTML issue, but it's causing a bad URI error:

url

http://www.amazon.com/Planet-Two-Disc-Digital-Combo-Blu-ray/dp/B004LWZW4W/ref=sr_1_1?s=movies-tv&ie=UTF8&qid=1324771542&sr=1-1

error

 (bad URI(is not URI?): %20http://g-ecx.images-amazon.com/images/G/01/SIMON/IsaacsonWalter._V164348457_.jpg):

faulty html sample

 <img height="300" src=" http://g-ecx.images-amazon.com/images/G/01/SIMON/IsaacsonWalter._V164348457_.jpg" style="float: right;" width="450"> 

It looks like the scraper throws the error when an image has a space as the first character of its src, which is apparently a common mistake amazon.com makes.

I can give you the full trace if that would help, but it seems pretty straightforward. I played with a few lines trying to get it to just ignore the image if the first character was blank, to no avail (I'm still new to Rails). Let me know what other information I can get you to help out.
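A minimal sketch of one way to handle this, assuming the scraper has each src value as a string before it escapes or parses it. `normalize_src` is a hypothetical name, not a method from the gem. It trims surrounding whitespace and returns nil for values that are blank or still fail to parse, so the caller can skip them instead of raising.

```ruby
require 'uri'

# Hypothetical helper: trims whitespace around an <img> src value.
# Returns nil when the value is blank or is still not a valid URI.
def normalize_src(raw)
  src = raw.to_s.strip
  return nil if src.empty?

  URI.parse(src) # validation only; raises URI::InvalidURIError on bad input
  src
rescue URI::InvalidURIError
  nil
end
```

With the sample above, `normalize_src(" http://g-ecx.images-amazon.com/images/G/01/SIMON/IsaacsonWalter._V164348457_.jpg")` returns the URL without the leading space. The trimming has to happen before any escaping, since escaping first is what turns the leading space into the `%20` seen in the error.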

undefined method `gsub' for nil:NilClass

I'm a bit new to Rails. I was excited to find your scraper, as I believe it will do exactly what I want. However, I keep getting this nil error when scraping any amazon.com URL.

example "http://www.amazon.com/OtterBox-Universal-Defender-Silicone-Plastic/dp/B004N7EY5S"

It appears that strip_quotes is the only function using gsub, and it has issues when it is given an empty URL.

My thought was that I could just set :include_css_images=>false, as that function only seems to be called when handling stylesheet URLs, but that did not fix the issue.

Again, I'm new to Rails, so I wish I could give more info that might help. If I'm just clueless and missing something obvious, I apologize. My hope is only to help make this gem better.
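A nil-safe version of such a function might look like the sketch below. The gem's real strip_quotes is not shown in this issue, so the body here is an assumption about what it does (removing one leading and one trailing quote character). The only point being illustrated is coercing nil to a string before calling gsub.

```ruby
# Hypothetical reimplementation: the real strip_quotes may differ.
# to_s turns nil into "", so gsub is never called on nil.
def strip_quotes(value)
  value.to_s.strip.gsub(/\A['"]|['"]\z/, "")
end
```

With this guard, `strip_quotes(nil)` returns an empty string rather than raising NoMethodError, and the caller can then decide whether to skip empty URLs.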
