Comments (10)

arrrlo commented on July 18, 2024

The problem here is that you simply cannot get more than 200 different images from a single search query.
Once start + num > 200, the game is over. Use a different query term.
That's not my rule, it's Google's.

And I don't think a silent failure in that case is a good idea.
Everyone should be aware of this limit and handle it themselves.

The problem is that if the last query has parameters like start=193 and num=5, which goes beyond the 200 limit, it will fail before getting any images.
So my idea is, when that happens, to correct the num parameter so that it does not go beyond 200 when summed with the start parameter, and to throw an exception afterwards.
That way you are aware of the limitation and have your images as well.

And your code should look like this:

from google_images_search.exception import GoogleLimit

try:
    gis.search(...)
except GoogleLimit:
    pass

for image in gis.results():
    pass
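
A minimal sketch of the clamp-then-raise idea described above, not the library's actual code. `LimitReached`, `clamp_num`, `search_with_clamp`, and the `fetch` callable are hypothetical names used only for illustration, and the 200 ceiling is the value reported in this thread:

```python
class LimitReached(Exception):
    """Hypothetical stand-in for the library's limit exception."""


LIMIT = 200  # ceiling reported in this thread (assumption)


def clamp_num(start, num, limit=LIMIT):
    """Return (num_to_request, truncated) so that start + num stays within limit."""
    if start + num <= limit:
        return num, False
    return max(limit - start, 0), True


def search_with_clamp(fetch, start, num, results, limit=LIMIT):
    """Fetch what is still allowed, store it in `results`, then raise if num was cut."""
    allowed, truncated = clamp_num(start, num, limit)
    if allowed > 0:
        # `fetch(start, num)` is a hypothetical callable returning a list of items.
        results.extend(fetch(start, allowed))
    if truncated:
        raise LimitReached(
            f"start + num exceeds {limit}; num was reduced to {allowed}"
        )
```

Because the results are stored before the exception is raised, the caller can catch the exception and still read whatever was collected.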

from google-images-search.

DragonflyRobotics commented on July 18, 2024

I tried a different search term and it exited after downloading 117 images.

arrrlo commented on July 18, 2024

Hi @DragonflyRobotics

This is a well-known limitation of the Google search API when the sum of the start and num query parameters is bigger than 100:
https://developers.google.com/custom-search/v1/reference/rest/v1/cse/list

Frankly, I don't know how to tackle this except by raising a friendly exception or warning or something similar.

DragonflyRobotics commented on July 18, 2024

Somehow, I was able to download 100 images easily. It choked after 110 or 120 images.

arrrlo commented on July 18, 2024

Yes, that limit is a pain.
Will investigate this further.

DragonflyRobotics commented on July 18, 2024

I will also try researching and assisting with this issue. I found your repo incredibly useful in my project.

DragonflyRobotics commented on July 18, 2024

I have been messing around with GIS some more. I found that it doesn't stop at exactly 100. Furthermore, it downloads more images for some keywords and fewer for others. I think it might not have to do with the Google download cap.

arrrlo commented on July 18, 2024

Hi @DragonflyRobotics

Not all images out there are valid and good to download. A lot of them are plainly unreachable, producing 4xx errors and higher.
That is why some keywords download more images and some fewer: this lib validates each image's availability prior to downloading.
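
As a rough illustration of that kind of availability check (a sketch under assumptions, not the lib's actual implementation), the helper below keeps only URLs that answer with a status below 400. `http_status` and `reachable` are hypothetical names, and using a HEAD request is an assumption:

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status of a HEAD request, or None if the URL is unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 503.
        return err.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return None


def reachable(urls, get_status=http_status):
    """Keep only the URLs whose status is known and below 400."""
    kept = []
    for url in urls:
        status = get_status(url)
        if status is not None and status < 400:
            kept.append(url)
    return kept
```

Since the number of dead links differs per keyword, the number of images that survive this filter differs too.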

There is nothing more to this lib.
If it wasn't for this Google API limit, this lib would download thousands of images without stopping.

And it stops with a "Request contains an invalid argument." error from Google, using the same arguments as before the error.

I've tested it again now with num=200, and it looks like the start + num > 100 limit doesn't apply at all.
The API goes beyond the 100 mark just fine.
But once the start argument surpasses 200, you get "Request contains an invalid argument.", and the invalid part is the start argument being bigger than 200.

There is no other explanation. Nothing else changes from request to request.

karencfisher commented on July 18, 2024

If there is a hard limit in the Google API requiring the start argument to be <= 200, maybe simply return when that limit is exceeded, before making the new request? It is kind of a downer, of course, when you just get what you can, I know. It's still better than throwing an exception.

I am looking to search through batches of different images, so I would rather not have the process crash out (though I do plan to handle the exception in my code and move on to the next query in the queue I guess.)
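
A minimal sketch of that "handle it and move on" loop, assuming the limit surfaces as an exception. `LimitReached` is a hypothetical stand-in for whatever the library raises, and `search` is a hypothetical callable that takes a query and returns a list of results:

```python
class LimitReached(Exception):
    """Hypothetical stand-in for the exception raised at the API limit."""


def run_queries(queries, search):
    """Run every query in order; a query that hits the limit does not stop the batch."""
    results = {}
    limited = []
    for query in queries:
        try:
            results[query] = search(query)
        except LimitReached:
            # Record the query and continue with the next one in the queue.
            limited.append(query)
    return results, limited
```

The second return value lists the queries that ran into the limit, so they can be retried later with a different search term.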

DragonflyRobotics commented on July 18, 2024

I think that is a good idea. We can simply run until the 200 limit is reached. Then we can stop the search instance, make a new one, and continue downloading.
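
One way to plan the requests for a single query so that they stop on their own at the limit is sketched below. `page_windows` is a hypothetical helper, and the page size of 10, the 1-based first index, and the 200 ceiling are assumptions based on what is reported in this thread:

```python
def page_windows(total, page_size=10, limit=200, first=1):
    """Yield (start, num) pairs for up to `total` results, keeping start + num <= limit."""
    start = first
    remaining = total
    while remaining > 0 and start < limit:
        # Shrink the last window so the sum never passes the limit.
        num = min(page_size, remaining, limit - start)
        yield start, num
        start += num
        remaining -= num
```

For example, `list(page_windows(25))` gives `[(1, 10), (11, 10), (21, 5)]`. When the generator is exhausted before `total` is reached, the caller can start a new search with a different query term and continue from there.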
