
shub's Introduction

Scrapinghub command line client

Badges: PyPI Version | Python Versions | Tests | Coverage report

shub is the Scrapinghub command line client. It allows you to deploy projects or dependencies, schedule spiders, and retrieve scraped data or logs without leaving the command line.

Requirements

  • Python >= 3.6

Installation

If you have pip installed on your system, you can install shub from the Python Package Index:

pip install shub

Please note that if you are using Python < 3.6, you should pin shub to 2.13.0 or lower.

We also supply stand-alone binaries. You can find them in our latest GitHub release.

Documentation

Documentation is available online via Read the Docs: https://shub.readthedocs.io/, or in the docs directory.

shub's People

Contributors

apalala, bbotella, bertinatto, carlosp420, chekunkov, dangra, dasoran, dharmeshpandav, elacuesta, eliasdorneles, gallaecio, hcoura, hermit-crab, immerrr, jdemaeyer, josericardo, jsargiot, krotkiewicz, lucywang000, mabelvj, nestortoledo, nramirezuy, pablohoffman, pawelmhm, rafaelcapucho, rdowinton, redapple, starrify, stummjr, vshlapakov

shub's Issues

Add --from-directory option to deploy-egg

Currently, shub deploy-egg (without --from-pypi or --from-url specified) will always try to find a setup.py in the current directory. This makes it hard to maintain libraries that are used across multiple projects. I think we should add a --from-directory option to shub deploy-egg.

Note: Changing to the library directory first is inconvenient because it requires explicitly specifying the numeric project ID (as the correct scrapinghub.yml isn't guaranteed to be found from the library directory). I.e., while cd ~/dev/my-cool-lib/ && shub deploy-egg PROJECTID && cd - works, cd ~/dev/my-cool-lib/ && shub deploy-egg && cd - (no project ID specified) does not.
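A rough sketch of what the proposed option could look like as a click option (the flag name comes from the suggestion above; the wiring into the real deploy-egg command is illustrative, not shub's actual code):

import os
import click

@click.command()
@click.option('--from-directory', 'from_directory', default=None,
              help='Build the egg from this directory instead of the current one.')
def deploy_egg(from_directory):
    if from_directory:
        os.chdir(from_directory)
    # ... the existing "find setup.py and build the egg" logic would continue here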

cannot find apikey for project

I just pulled the recent master - it looks totally great and I really enjoy all those cool new features.

There is one thing that doesn't work for me though. When I try to deploy to a specific target I get either

Error: Could not find API key for endpoint ad-hoc.

or

Error: "realtime-test" is not a valid Scrapinghub project ID. Please check your scrapinghub.yml

Am I missing something here, or is this a potential bug?

EDIT:

I see this is changed behavior. Looking at the line here: https://github.com/scrapinghub/shub/blob/master/shub/config.py#L134, it seems I must define API keys for all endpoints. In the past, if I had a default API key it was used for all endpoints. Is this change expected and intended?

Deploy onboarding

It would be nice to add a small wizard when people use shub deploy for the first time, to ease the onboarding process. Then all we need to say in Dash is: use shub deploy and answer the questions.

$ shub deploy
Please login first with: shub login

$ shub login
...

$ shub deploy
Project ID you would like to deploy to: NNNNN
(...maybe validate access to the project? like shub login validates api key...)
Save project ID into scrapinghub.yml [Y/n]?

The questions will be asked if no scrapinghub.yml is found in the same directory as scrapy.cfg.
If you answer Y to the last question, a scrapinghub.yml file will be created with its default project set to the specified ID (NNNNN), and you won't be asked again.
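A rough sketch of the wizard using click prompts (names and the exact file contents are illustrative, not shub's actual implementation):

import click

def deploy_wizard():
    project_id = click.prompt('Project ID you would like to deploy to', type=int)
    if click.confirm('Save project ID into scrapinghub.yml?', default=True):
        with open('scrapinghub.yml', 'w') as f:
            f.write('projects:\n  default: %d\n' % project_id)
    return project_id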

The command `fetch-eggs` doesn't fail on auth error

The example below shows how shub fetch-eggs creates a zip file without the actual zip content, due to a missing auth key.

$ shub fetch-eggs 1261X
Downloading eggs to eggs-1261X.zip
$ file eggs-1261X.zip 
eggs-1261X.zip: ASCII text, with no line terminators
$ export SHUB_APIKEY=xxxxxxxb31
$ shub fetch-eggs 1261X
Downloading eggs to eggs-1261X.zip
$ file eggs-1261X.zip
eggs-12616.zip: Zip archive data, at least v2.0 to extract

Test test_parses_project_information_correctly fails on Windows

Output:

tests/test_deploy_egg.py::TestDeployEgg::test_parses_project_information_correctly FAILED

================================== FAILURES ===================================
___________ TestDeployEgg.test_parses_project_information_correctly ___________

self = <tests.test_deploy_egg.TestDeployEgg testMethod=test_parses_project_information_correctly>

    def tearDown(self):
        os.chdir(self.curdir)
        if os.path.exists(self.tmp_dir):
>           shutil.rmtree(self.tmp_dir)

tests\test_deploy_egg.py:36: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
c:\python27\Lib\shutil.py:247: in rmtree
    rmtree(fullname, ignore_errors, onerror)
c:\python27\Lib\shutil.py:252: in rmtree
    onerror(os.remove, fullname, sys.exc_info())
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

path = 'c:\\users\\appveyor\\appdata\\local\\temp\\1\\shub-test-deploy-eggscayihm\\dist'
ignore_errors = False, onerror = <function onerror at 0x035D2770>

    def rmtree(path, ignore_errors=False, onerror=None):
...
                try:
>                   os.remove(fullname)
E                   WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'c:\\users\\appveyor\\appdata\\local\\temp\\1\\shub-test-deploy-eggscayihm\\dist\\test_project-1.2.0-py2.7.egg'

c:\python27\Lib\shutil.py:250: WindowsError
---------------------------- Captured stdout call -----------------------------
Building egg in: c:\users\appveyor\appdata\local\temp\1\shub-test-deploy-eggscayihm
Deploying dependency to Scrapy Cloud project "0"
Deployed eggs list at: https://dash.scrapinghub.com/p/0/eggs
===================== 1 failed, 59 passed in 7.22 seconds =====================

The test creates the egg file in a temporary directory that is deleted in tearDown; the problem is that the file is never closed.

The problem exists only on Windows, because attempting to remove a file that is open raises an exception there. On Linux, the OS just unlinks the file and it is removed once all file handles are closed. [1]

[1] https://docs.python.org/2/library/os.html#os.remove
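A sketch of the likely fix direction: open the egg with a context manager so the file handle is guaranteed to be closed before tearDown deletes the temp dir. The function name and the `make_deploy_request` callable are stand-ins for shub's actual helpers:

def deploy_egg_file(egg_name, egg_path, make_deploy_request):
    # The handle is released as soon as the `with` block exits, which is
    # what lets Windows remove the temporary directory afterwards.
    with open(egg_path, 'rb') as egg_file:
        files = {'egg': (egg_name, egg_file)}
        return make_deploy_request(files)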

Add `shub update-config` command

As we're moving our configuration format to scrapinghub.yml, should we make the switch easy by adding a command that generates a config in the new format?

Limit data fetched from hubstorage

Some jobs have huge logs / many items / requests. We could limit the number of entries downloaded from hubstorage by grabbing the number of entries first, e.g.

total_lines = job.logs.stats()['totals']['input_values']
# then construct correct `start_after` argument for iter.json() from this

Maybe download only the last 250 items if -f is set, with a possibility to override, and download all items if -f is not set.
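A rough sketch of how the start_after key could be derived. This assumes hubstorage log/item keys have the form "<jobkey>/<index>" and that the iterator (written as iter.json() above, presumably iter_json()) accepts a startafter argument; names are illustrative, not shub's code:

TAIL_LIMIT = 250  # only fetch the last 250 entries when -f is set

def tail_startafter(job, limit=TAIL_LIMIT):
    """Return a startafter key that skips all but the last `limit` entries."""
    total_lines = job.logs.stats()['totals']['input_values']
    if total_lines <= limit:
        return None  # small job, just fetch everything
    # entries are assumed to be keyed "<jobkey>/<index>", zero-based
    return '%s/%d' % (job.key, total_lines - limit - 1)

# usage (assumed signature):
# for entry in job.logs.iter_json(startafter=tail_startafter(job)):
#     ...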

Include dependencies when deploying

deploy should be "run and forget": shub should make sure that all requirements the project has (specified in requirements.txt or setup.py's install_requires) are available on Scrapy Cloud and upload them if necessary.

See also #56.

Avoid dependencies on external tools

shub opens new Python sessions through the subprocess module. In particular, shub deploy depends on an available python executable (to build an egg through python setup.py), and shub deploy-egg and shub deploy-reqs additionally depend on pip.

While PyInstaller does bundle the Python interpreter, it is not an executable file (for Windows builds it bundles python27.dll and some additional libraries, which it then loads in the bootloader I guess). I have sent a question (yet to be released by a moderator) to the PyInstaller mailing list regarding how to call a new interpreter instance.

For pip, for now we have settled (in a Slack discussion) on deactivating the commands when pip is not available (for deploy-egg we should look into only deactivating the --from-pypi switch). There is a mailing list thread on bundling it here.
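A minimal sketch of the "deactivate when pip is unavailable" idea; the helper name is made up, find_executable is from the stdlib:

from distutils.spawn import find_executable

import click

def require_pip():
    """Abort with a friendly message instead of failing later in a subprocess."""
    if find_executable('pip') is None:
        raise click.ClickException(
            'pip was not found on your PATH; it is required for this command '
            '(for deploy-egg, only when --from-pypi is used).')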

Release v2

Pre-release:

  • Settle open issues
  • Update migration banners to say 'v2' and not 'v1.6'
  • Test binaries
  • Update README if API endpoint change is merged
  • Update release log

Post-release:

can't schedule if argument contains an equals sign

shub schedule pet-supermarket.co.uk_keywords -a keywords="id1=dog"

fails with

Traceback (most recent call last):
  File "/usr/local/bin/shub", line 9, in <module>
    load_entry_point('shub==1.4.0', 'console_scripts', 'shub')()
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 664, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 644, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 991, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 837, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 464, in invoke
    return callback(*args, **kwargs)
  File "/home/pawel/scrapinghub/shub_clone/shub/schedule.py", line 18, in cli
    job_key = schedule_spider(apikey, project_id, spider, argument)
  File "/home/pawel/scrapinghub/shub_clone/shub/schedule.py", line 28, in schedule_spider
    return conn[project_id].schedule(spider, **dict(x.split('=') for x in arguments))
ValueError: dictionary update sequence element #0 has length 3; 2 is required
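The crash comes from splitting each -a argument on every '='. A possible fix (a sketch, not necessarily the change that was shipped) is to split only on the first '=' so values may themselves contain '=':

def args_to_dict(arguments):
    # split on the first '=' only, so '=' inside the value is preserved
    return dict(x.split('=', 1) for x in arguments)

assert args_to_dict(['keywords=id1=dog']) == {'keywords': 'id1=dog'}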

Check for entry points in setup.py

Deploying fails with a quite cryptic (for users) message when setup.py does not contain a scrapy entry point pointing to the settings module:
https://paste.scrapinghub.com/show/8940/

If there's no setup.py, shub writes our own that has all the necessary information. But if the user wrote one, we probably should check whether the entry point exists and tell them to add it if necessary (or add it ourselves).
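For reference, the entry point in question looks roughly like this in setup.py ("myproject" is a placeholder for the actual settings package):

from setuptools import setup, find_packages

setup(
    name='project',
    version='1.0',
    packages=find_packages(),
    entry_points={'scrapy': ['settings = myproject.settings']},
)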

Use more meaningful exit codes

How about replacing the generic exit code 1 we use everywhere right now with more meaningful exit codes (e.g. following the sysexits.h convention)?

This would be super easy after #97 and would make it easier for shell scripts to do error diagnostics when automating shub usage. The downside is that it's slightly backwards incompatible with scripts that check exit_code == 1 instead of the more sane exit_code != 0.
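A sketch of what a sysexits.h-style mapping could look like; the os.EX_* constants are the stdlib's POSIX values, while the mapping and helper are illustrative:

import os
import sys

EXIT_CODES = {
    'usage_error': os.EX_USAGE,         # 64: bad command line
    'remote_error': os.EX_UNAVAILABLE,  # 69: endpoint unreachable
    'auth_error': os.EX_NOPERM,         # 77: invalid or missing API key
}

def fail(kind, message):
    sys.stderr.write(message + '\n')
    sys.exit(EXIT_CODES.get(kind, 1))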

login while already logged-in shows error

If one is already logged in, shub login throws an error, while it could say something less "dramatic".

Current behavior:

$ shub login
Usage: shub login [OPTIONS]

Error: Already logged in. To logout use: shub logout
  • no need to show "Usage", as the command was used correctly
  • "Error" is misleading; it could say "You're already logged in. Nothing to do. If you want to log in with another API key, use shub logout first", or something along those lines

Let `shub items` output in JSON-lines format

At this very moment shub items outputs results as newline-separated stringified Python dicts:

{u'name': u'item_0', u'foo': u'bar'}
{u'name': u'item_1', u'foo': u'bar'}

Personally I think it's better to make it JSON-lines:

{"name": "item_0", "foo": "bar"}
{"name": "item_1", "foo": "bar"}

Or, if we need to keep backward compatibility, I'd suggest we at least introduce something like shub items --format=jsonl.
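The change itself would be roughly the following (a sketch, not shub's code):

import json

def print_items(items):
    for item in items:
        # one JSON object per line == JSON-lines
        print(json.dumps(item))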

Different hubstorage endpoints

The job resource commands (log, items, requests) currently always use the production Hubstorage endpoint (http://storage.scrapinghub.com). This is because the storage endpoint cannot easily be derived from the scrapyd endpoint.

Should we add support for supplying different storage endpoints? That would allow using the job resource commands while working on devbox.

[feature request] shub restart command

The command would take a job_id as an argument and restart the job with that ID, reusing the same combination of arguments.

For example

> shub restart 1887/496/3724

would restart the job with this ID in the default project.

It would be very useful for all of us who have jobs with a complex list of arguments and need to enter them manually, either on the command line or in Dash.

The syntax could also include spider name, project name and job ID, e.g.

shub restart amazon.com_products 1886 latest

but I would prefer the job ID, as it is much shorter.

'shub login' doesn't work for 'shub deploy'

shub deploy only looks for credentials in the project's scrapy.cfg and ~/.scrapy.cfg files.

Does it make sense to make it work with .netrc (shub login) and $SHUB_APIKEY credentials?
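A sketch of the proposed lookup order; the environment variable name is from the issue, while the .netrc machine name is an assumption:

import netrc
import os

def find_api_key():
    key = os.environ.get('SHUB_APIKEY')
    if key:
        return key
    try:
        # `shub login` stores the key in ~/.netrc; the machine name is assumed
        auth = netrc.netrc().authenticators('scrapinghub.com')
    except (IOError, netrc.NetrcParseError):
        return None
    return auth[0] if auth else None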

Don't import non-integer project names

scrapy startproject myprojectname puts the following section into scrapy.cfg:

[deploy]
project = myprojectname

When users first run shub deploy, this will be transferred to scrapinghub.yml and an error message will be printed.

We should not import non-integer project names from scrapy.cfg and instead guide users through the deploy wizard.
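The import step could simply skip anything that does not parse as an integer, e.g. (a sketch, not shub's code):

def importable_project_id(value):
    """Return the scrapy.cfg project value only if it is a numeric ID;
    otherwise return None so the deploy wizard takes over."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None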

Save last scheduled job ID

Job IDs are hard to remember. If we saved the last scheduled job ID (say in .scrapinghub.yml), instead of this:

shub schedule prod/myspider
shub items prod/2/204

users could do this:

shub schedule prod/myspider
shub items
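A sketch of persisting the last job ID; the .scrapinghub.yml location comes from the suggestion above, and the 'last_job' key is an assumption:

import os
import yaml

LOCAL_CONFIG = '.scrapinghub.yml'

def save_last_job_id(job_id):
    data = {}
    if os.path.exists(LOCAL_CONFIG):
        with open(LOCAL_CONFIG) as f:
            data = yaml.safe_load(f) or {}
    data['last_job'] = job_id
    with open(LOCAL_CONFIG, 'w') as f:
        yaml.safe_dump(data, f, default_flow_style=False)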

Fail gracefully when API key could not be validated

When the API key validation request fails, it throws an unhandled exception and prints a traceback to the command line.

We should catch failed requests and either

  1. save the API key without validation, or
  2. tell the user that login is not possible right now.

I would opt for the first option. However, that request failing means that either the user has no working internet connection or there is a server outage on our side. In both cases shub is pretty much useless (at that moment), so I guess option 2 is viable as well.

On a broader scale, we should introduce a new exception for failed connections with helpful error messages.
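For option 2, wrapping the validation call would already be enough to avoid the traceback. A sketch, where make_validation_request stands in for whatever request shub currently issues:

import click
import requests

def validate_apikey_safely(make_validation_request):
    try:
        return make_validation_request()
    except requests.exceptions.RequestException:
        raise click.ClickException(
            'Could not reach Scrapinghub to validate the API key right now. '
            'Please check your internet connection and try again.')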

shub disregards Scrapy configuration environment variables

Scrapy allows storing the name of the project in the SCRAPY_PROJECT environment variable, and the path of the project settings module in SCRAPY_CONFIG_MODULE. These take precedence when finding the settings module or reading its path from scrapy.cfg.

A scrapy.cfg file that has no [settings], or one that looks like this:

[settings]
custom_project_name = path.to.settings

therefore works perfectly fine with Scrapy as long as the appropriate environment variables are set.

shub, however, does not read these environment variables, and instead requires a scrapy.cfg with a [settings] section and default as project name.
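A sketch of honouring those variables when resolving the settings module; the variable names are the ones mentioned above, and the config-parser wiring is illustrative:

import os

def get_settings_module(cfg_parser):
    """cfg_parser is assumed to be a ConfigParser loaded from scrapy.cfg."""
    env_settings = os.environ.get('SCRAPY_CONFIG_MODULE')
    if env_settings:
        return env_settings
    project = os.environ.get('SCRAPY_PROJECT', 'default')
    return cfg_parser.get('settings', project)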

Exit value does not reflect an authentication failure

I think if any API call fails (e.g. an auth error), the command should exit with a non-zero status. The example below shows how shub exits with zero status regardless of the auth failure.

$ shub deploy-egg 1462X       
Building egg in: /home/rolando/foo
zip_safe flag not set; analyzing archive contents...
Deploying dependency to Scrapy Cloud project "1462X"
Deploy failed (403):
{"status": "error", "message": "Authentication failed"}
Deployed eggs list at: https://dash.scrapinghub.com/p/1462X/eggs
$ echo $?
0
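A sketch of the missing check, assuming the deploy helpers get back a requests.Response:

import sys

def exit_on_api_failure(response):
    if not response.ok:
        sys.stderr.write('Deploy failed (%d):\n%s\n'
                         % (response.status_code, response.text))
        sys.exit(1)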

Fix tests

They're currently failing.

$ pytest
[...]
Ran 21 test cases in 8.23s (0.29s CPU), 5 errors, 3 failures, 5 skipped

`shub deploy-egg --from-pypi pytz` does not deploy the egg

This is all the output I get:

$ shub deploy-egg --from-pypi pytz 1462X                           
Fetching pytz from pypi
Collecting pytz
  Using cached pytz-2015.4.tar.bz2
  Saved /tmp/shub-deploy-egg-from-pypiYfDnBe/pytz-2015.4.tar.bz2
Successfully downloaded pytz
Package fetched successfully

Other packages work just fine:

$ shub deploy-egg --from-pypi scrapy-inline-requests 1462X
Fetching scrapy-inline-requests from pypi
Collecting scrapy-inline-requests
  Using cached scrapy-inline-requests-0.1.2.tar.gz
  Saved /tmp/shub-deploy-egg-from-pypig6fcOt/scrapy-inline-requests-0.1.2.tar.gz
Successfully downloaded scrapy-inline-requests
Package fetched successfully
Uncompressing: scrapy-inline-requests-0.1.2.tar.gz
Building egg in: /tmp/shub-deploy-egg-from-pypig6fcOt/scrapy-inline-requests-0.1.2
zip_safe flag not set; analyzing archive contents...
Deploying dependency to Scrapy Cloud project "1462X"
{"status": "ok", "egg": {"version": "scrapy-inline-requests-0.1.2", "name": "scrapy-inline-requests"}}
Deployed eggs list at: https://dash.scrapinghub.com/p/1462X/eggs

Check for updates

We will soon start distributing shub in binary form, outside standard channels (like PyPI or apt), so shub needs a way to check whether a new update is available for it.

I propose using the GitHub releases API and making sure we keep doing releases here. The check should be cached (maybe in ~/.scrapinghub.yml) so that it doesn't run on every invocation, but only once a day (or something like that). It should also never fail (if there's no internet connection, for example) and should cache failures as well.
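A rough sketch using the GitHub releases API with a once-a-day cache. For simplicity the cache goes to a dedicated file rather than ~/.scrapinghub.yml; the file name and cache keys are assumptions:

import json
import os
import time

import requests

RELEASES_URL = 'https://api.github.com/repos/scrapinghub/shub/releases/latest'
CACHE_FILE = os.path.expanduser('~/.shub-update-check.json')  # assumed location
CHECK_INTERVAL = 24 * 60 * 60  # at most one check per day

def latest_release_tag():
    """Return the newest release tag, hitting GitHub at most once a day and
    never raising (failed checks are cached too)."""
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE) as f:
            cached = json.load(f)
        if time.time() - cached.get('checked_at', 0) < CHECK_INTERVAL:
            return cached.get('tag')
    tag = None
    try:
        resp = requests.get(RELEASES_URL, timeout=5)
        if resp.ok:
            tag = resp.json().get('tag_name')
    except requests.exceptions.RequestException:
        pass  # no connection -- remember we tried so we don't retry immediately
    with open(CACHE_FILE, 'w') as f:
        json.dump({'checked_at': time.time(), 'tag': tag}, f)
    return tag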

Fix broken auto-versioning

The "get version control commit" commands throw an unhandled exception when their version control tool is not available.

This breaks deploying (shub exits with a traceback) when

  • The user uses mercurial for version control and has not installed git
  • The user uses bazaar for version control and has not installed both git and mercurial
  • The user uses no version control and also has not installed all of the above tools

The exceptions should be caught either in the pwd_git_version() (and similar) util functions or in ShubConfig.get_version().

See: https://stackoverflow.com/questions/377017/test-if-executable-exists-in-python
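A sketch of the fix in a pwd_git_version()-style helper, assuming it shells out to git via subprocess (the exact command shub runs may differ):

import subprocess

def pwd_git_version():
    try:
        output = subprocess.check_output(['git', 'describe', '--always'])
    except (OSError, subprocess.CalledProcessError):
        # git is not installed (OSError) or we are not inside a git
        # repository (CalledProcessError) -- fall back to other methods
        return None
    return output.strip()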

Crash in deploy-reqs for lxml.

There's a repeatable failure if requirements.txt has lxml in it. When I remove lxml, it builds and deploys the rest of the eggs (which is awesome). MacBook Pro, OS X Yosemite.

Minimum reproducer:
$ virtualenv --version
13.1.0
$ virtualenv --python=python2.7 venv
Running virtualenv with interpreter ...
Using real prefix '...'
New python executable in venv/bin/python2.7
Also creating executable in venv/bin/python
Installing setuptools, pip, wheel...done.
$ source venv/bin/activate
$ pip install shub
Collecting shub
...
Successfully installed click-4.1 requests-2.7.0 shub-1.3.1 six-1.9.0
$ pip list
click (4.1)
pip (7.1.0)
requests (2.7.0)
setuptools (18.0.1)
shub (1.3.1)
six (1.9.0)
wheel (0.24.0)

$ echo 'lxml==3.4.4' > requirements.txt
$ shub deploy-reqs requirements.txt
Downloading eggs...
.../venv/lib/python2.7/site-packages/pip/vendor/requests/packages/urllib3/util/ssl.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
InsecurePlatformWarning
.../venv/lib/python2.7/site-packages/pip/vendor/requests/packages/urllib3/util/ssl.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
InsecurePlatformWarning
Collecting lxml==3.4.4 (from -r .../requirements.txt (line 1))
Using cached lxml-3.4.4.tar.gz
Saved /var/folders/zw/v2b8vtsj6cn58j55371phlf80000gp/T/eggshv9fpJ/eggs/lxml-3.4.4.tar.gz
Successfully downloaded lxml
Uncompressing: lxml-3.4.4.tar.gz
Building egg in: /private/var/folders/zw/v2b8vtsj6cn58j55371phlf80000gp/T/eggshv9fpJ/eggs/lxml-3.4.4
.../2.7/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
warnings.warn(msg)
.../2.7/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
warnings.warn(msg)
.../2.7/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
warnings.warn(msg)
Traceback (most recent call last):
  File ".../venv/bin/shub", line 11, in <module>
    sys.exit(cli())
  File ".../venv/lib/python2.7/site-packages/click/core.py", line 664, in __call__
    return self.main(*args, **kwargs)
  File ".../venv/lib/python2.7/site-packages/click/core.py", line 644, in main
    rv = self.invoke(ctx)
  File ".../venv/lib/python2.7/site-packages/click/core.py", line 991, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File ".../venv/lib/python2.7/site-packages/click/core.py", line 837, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File ".../venv/lib/python2.7/site-packages/click/core.py", line 464, in invoke
    return callback(*args, **kwargs)
  File ".../venv//lib/python2.7/site-packages/shub/deploy_reqs.py", line 18, in cli
    main(project_id, requirements_file)
  File ".../venv/lib/python2.7/site-packages/shub/deploy_reqs.py", line 27, in main
    utils.build_and_deploy_eggs(project_id, apikey)
  File ".../venv/lib/python2.7/site-packages/shub/utils.py", line 105, in build_and_deploy_eggs
    build_and_deploy_egg(project_id, apikey)
  File ".../venv/lib/python2.7/site-packages/shub/utils.py", line 132, in build_and_deploy_egg
    _deploy_dependency_egg(apikey, project_id)
  File ".../venv/lib/python2.7/site-packages/shub/utils.py", line 138, in _deploy_dependency_egg
    egg_name, egg_path = _get_egg_info(name)
  File ".../venv/lib/python2.7/site-packages/shub/utils.py", line 169, in _get_egg_info
    egg_path = glob(egg_path_glob)[0]
IndexError: list index out of range

Document using multiple API keys

It is possible to provide multiple API keys without having to touch the endpoints setting, e.g.

# scrapinghub.yml
projects:
  default: 123
  otheruser: someoneelse/123
apikeys:
  default: abc
  someoneelse: def

This won't see much use but nevertheless should be documented in the advanced section of the readme.

Error on `deploy-reqs`

Just tried to deploy-reqs using shub master and got

Traceback (most recent call last):
  File "/Users/zehzinho/.virtualenv/ds/bin/shub", line 9, in <module>
    load_entry_point('shub==1.5.0', 'console_scripts', 'shub')()
  File "/Users/zehzinho/.virtualenv/ds/lib/python2.7/site-packages/click/core.py", line 700, in __call__
    return self.main(*args, **kwargs)
  File "/Users/zehzinho/.virtualenv/ds/lib/python2.7/site-packages/click/core.py", line 680, in main
    rv = self.invoke(ctx)
  File "/Users/zehzinho/.virtualenv/ds/lib/python2.7/site-packages/click/core.py", line 1027, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/zehzinho/.virtualenv/ds/lib/python2.7/site-packages/click/core.py", line 873, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/zehzinho/.virtualenv/ds/lib/python2.7/site-packages/click/core.py", line 508, in invoke
    return callback(*args, **kwargs)
  File "/Users/zehzinho/Sources/scraping-hub/shub/shub/deploy_reqs.py", line 34, in cli
    main(target, requirements_file)
  File "/Users/zehzinho/Sources/scraping-hub/shub/shub/deploy_reqs.py", line 43, in main
    utils.build_and_deploy_eggs(project, endpoint, apikey)
  File "/Users/zehzinho/Sources/scraping-hub/shub/shub/utils.py", line 141, in build_and_deploy_eggs
    build_and_deploy_egg(project, endpoint, apikey)
  File "/Users/zehzinho/Sources/scraping-hub/shub/shub/utils.py", line 168, in build_and_deploy_egg
    _deploy_dependency_egg(project, endpoint, apikey)
  File "/Users/zehzinho/Sources/scraping-hub/shub/shub/utils.py", line 183, in _deploy_dependency_egg
    make_deploy_request(url, data, files, auth)
TypeError: make_deploy_request() takes exactly 6 arguments (4 given)

deploy-reqs should handle dependencies already available on Scrapy Cloud more sanely

I've noticed a couple of times that people often just try to deploy from their requirements.txt file.

It's sort of a reasonable expectation; we even named the feature deploy-reqs, after all.
So I thought we could handle those cases more sanely, because they will show up again, and it seems a bit silly to try to educate users to maintain separate requirements files.

Maybe we could maintain a list of runtime dependencies that should be skipped by default (while informing the user) and offer an option to force deploying them.

What do you think?

Deploy standalone spiders

It would be nice to be able to do:

shub deploy-spider -p $PROJ_ID spiderfile.py

This would wrap the spiderfile into a temporary project and deploy that project to Scrapy Cloud.

I think it would be a good complement to the scrapy runspider spiderfile.py command; we could even show it off on http://scrapy.org :)

shub deploy could ignore files from .gitignore

I have some directories listed in .gitignore, e.g. a /dev directory where I keep some private, dirty dev scripts. I noticed that the contents of /dev get deployed: I have some imports in my dev scripts that are not supported in my deploy target, and the deploy fails because of the missing import. Is there any way to deploy only the actual code and not the contents of .gitignore?

Upgrade to pip 8

pip 8.0.0 dropped pip install --download in favour of pip download; we should update our code.

Forcing pip>=8 might be too strong, though; maybe we can check the pip version in our code, or just stick to install --download and live with the deprecation warning for a while (or swallow it if possible).
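A sketch of such a version check, so we can use pip download on pip >= 8 and fall back to pip install --download on older versions (the helper name is illustrative):

from pkg_resources import parse_version
import pip

def pip_download_args(requirements_file, dest_dir):
    if parse_version(pip.__version__) >= parse_version('8.0.0'):
        return ['pip', 'download', '-r', requirements_file, '-d', dest_dir]
    return ['pip', 'install', '--download', dest_dir, '-r', requirements_file]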
