A useful package for collecting labeled images for machine learning use cases.
This package is available on the PyPI repository:
pip install ggd
A browser must be installed, together with its WebDriver (Geckodriver or Chromedriver), and the driver executable must be on the PATH.
The following command can be used on a linux64 platform:
wget https://github.com/mozilla/geckodriver/releases/download/v0.30.0/geckodriver-v0.30.0-linux64.tar.gz && tar -zxvf geckodriver-v0.30.0-linux64.tar.gz && rm geckodriver-v0.30.0-linux64.tar.gz && mv geckodriver /usr/local/bin/
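To confirm that a driver executable is visible on the PATH before using the package, a quick check like the one below can help. This is a minimal sketch using only the Python standard library; the helper name `find_drivers` is hypothetical and not part of ggd.

```python
import shutil


def find_drivers(names=("geckodriver", "chromedriver")):
    """Map each driver name to its full path, or None if it is not on the PATH.

    Hypothetical helper for illustration; it is not part of the ggd package.
    """
    return {name: shutil.which(name) for name in names}


found = find_drivers()
for name, location in found.items():
    print(f"{name}: {location or 'not found on PATH'}")
```

At least one of the two drivers should resolve to a path; if both are reported as not found, the install step above probably did not place the executable in a directory listed in PATH.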
from ggd import GoogleImage
gg = GoogleImage()
gg.download(request='Alakazam', n_images=200)
- All images are downloaded into a new folder.
- The all_files attribute contains the paths of all downloaded images.
print(gg.all_files)
>>> ['Alakazam/Alakazam_000.png',
'Alakazam/Alakazam_001.png',
'Alakazam/Alakazam_002.png',
'Alakazam/Alakazam_003.png',
'Alakazam/Alakazam_004.png',
...]
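Since the paths shown above have the form folder/file, one way to turn them into (path, label) pairs for training is to use the folder name as the label. The sketch below assumes that layout holds for every entry; `label_from_path` and `pair_with_labels` are hypothetical helpers, not part of ggd, and the example list stands in for `gg.all_files`.

```python
from pathlib import PurePosixPath


def label_from_path(path):
    """Return the name of the folder that directly contains the image."""
    return PurePosixPath(path).parent.name


def pair_with_labels(paths):
    """Build (path, label) pairs, assuming the parent folder is the label."""
    return [(p, label_from_path(p)) for p in paths]


# Stand-in for gg.all_files, copied from the output shown above.
example_files = ['Alakazam/Alakazam_000.png',
                 'Alakazam/Alakazam_001.png']

pairs = pair_with_labels(example_files)
for path, label in pairs:
    print(label, path)
```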
To use this package with several labeled requests and watch the scraping as it runs:
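from ggd import GoogleImage

# NOTE: `driver` is not defined in this snippet; it must be created beforehand.
google_dl = GoogleImage(driver=driver,
                        verbose=True,
                        close_after_download=False,
                        headless=False)

n = 500  # number of images per request
# Each pair is (search request, label name)
for rq, name_im in [("bulbasaur --cards", 'Bulbizarre'),
                    ('ivysaur --cards', 'Herbizarre')]:
    google_dl.download(request=rq,
                       n_images=n,
                       directory='Data',
                       name=name_im)

# Close explicitly (close_after_download=False)
google_dl.close()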
To increase the number and variety of images, use Google filtering* in requests, as well as synonyms, other languages, and so on.
*article in French
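One way to organize such variations is to keep, for each label, a list of search terms (synonyms or names in other languages) and expand them into (request, name) pairs for the download loop. The sketch below is only an illustration: `build_requests` is a hypothetical helper, not part of ggd, and the extra terms are example values.

```python
def build_requests(terms_by_label, suffix=""):
    """Expand {label: [terms]} into a list of (request, label) pairs.

    `suffix` is appended to every term, for example a search filter.
    """
    pairs = []
    for label, terms in terms_by_label.items():
        for term in terms:
            request = f"{term} {suffix}".strip()
            pairs.append((request, label))
    return pairs


# Example values only: English and French names for the same label.
terms_by_label = {
    "Bulbizarre": ["bulbasaur", "bulbizarre"],
    "Herbizarre": ["ivysaur", "herbizarre"],
}

request_pairs = build_requests(terms_by_label, suffix="--cards")
for rq, name_im in request_pairs:
    print(rq, "->", name_im)
```

The resulting pairs can be fed to the same loop used above. Whether repeated downloads under the same name overwrite earlier files is not stated here, so it is worth checking before relying on this pattern.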