
xsscrapy's Introduction

xsscrapy

Fast, thorough XSS/SQLi spider. Give it a URL and it'll test every link it finds for cross-site scripting and some SQL injection vulnerabilities. See the FAQ for more details about SQLi detection.

From within the main folder run:

./xsscrapy.py -u http://example.com

If you wish to login then crawl:

./xsscrapy.py -u http://example.com/login_page -l loginname

If you wish to login with HTTP Basic Auth then crawl:

./xsscrapy.py -u http://example.com/login_page -l loginname --basic

If you wish to use cookies:

./xsscrapy.py -u http://example.com/login_page --cookie "SessionID=abcdef1234567890"

If you wish to limit simultaneous connections to 20:

./xsscrapy.py -u http://example.com -c 20

If you want to rate limit to 60 requests per minute:

./xsscrapy.py -u http://example.com/ -r 60

XSS vulnerabilities are reported in xsscrapy-vulns.txt

Dependencies

wget https://bootstrap.pypa.io/get-pip.py
python get-pip.py
pip install -r requirements.txt

Additional system libraries may be needed depending on your OS: libxml2, libxslt, zlib, libffi, and openssl (sometimes libssl-dev).
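On Debian/Ubuntu, for example, the development headers can usually be installed with something like this (package names vary by distribution):

sudo apt-get install libxml2-dev libxslt1-dev zlib1g-dev libffi-dev libssl-dev python-dev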

Tests

  • Cookies
  • User-Agent
  • Referer
  • URL variables
  • End of URL
  • URL path
  • Forms both hidden and explicit

FAQ

  • If it gives the error ImportError: cannot import name LinkExtractor, you don't have a recent enough version of scrapy. Upgrade it with: sudo pip install --upgrade scrapy.
  • It's called XSScrapy, so why SQL injection detection too? There is overlap between dangerous XSS characters and dangerous SQL injection characters, namely single and double quotes. Detecting SQL injection errors in a response is also simple and not CPU-intensive. So although 99% of this script is geared toward high-accuracy detection of XSS, adding simple SQL injection detection through error-message discovery is a cheap and effective addition. This script will not test for blind SQL injection. The error messages it looks for come straight from w3af's sqli audit plugin; a minimal sketch of this style of detection follows below.
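To illustrate the approach, here is a minimal, hypothetical sketch of error-message-based SQLi detection (the strings below are common database errors chosen for illustration; the actual list xsscrapy uses comes from w3af):

# Hypothetical sketch of error-message-based SQLi detection.
# The real error-string list in xsscrapy is taken from w3af's sqli audit plugin.
SQL_ERRORS = [
    'you have an error in your sql syntax',                # MySQL
    'unclosed quotation mark after the character string',  # MSSQL
    'syntax error at or near',                             # PostgreSQL
]

def find_sqli_error(response_body):
    body = response_body.lower()
    for err in SQL_ERRORS:
        if err in body:
            return err  # report the first known DB error echoed in the response
    return None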

License

Copyright (c) 2014, Dan McInerney. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
  • Neither the name of Dan McInerney nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

xsscrapy's People

Contributors

chadillac, danmcinerney, ddworken, deadpool2000, dmytrokyrychuk, fardeen-ahmed, hslatman, shelld3v, tomberek, yasoob


xsscrapy's Issues

# pip install -r requirements.txt

┌──(root㉿kali)-[~/tool/xsscrapy]
└─# pip install -r requirements.txt
Collecting Scrapy==1.1.0rc3
Using cached Scrapy-1.1.0rc3-py2.py3-none-any.whl (292 kB)
Collecting pybloom==1.1
Using cached pybloom-1.1.tar.gz (10 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [9 lines of output]
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<string>", line 34, in <module>
File "/tmp/pip-install-f8yq2lyy/pybloom_e5f1b1c1801841a59af9b380ba752a48/setup.py", line 2, in <module>
from ez_setup import use_setuptools
File "/tmp/pip-install-f8yq2lyy/pybloom_e5f1b1c1801841a59af9b380ba752a48/ez_setup.py", line 98
except pkg_resources.VersionConflict, e:
^
SyntaxError: invalid syntax
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
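The root cause is that pybloom 1.1 uses Python 2-only syntax (except pkg_resources.VersionConflict, e:), so it can never build under Python 3. A commonly suggested workaround, untested here, is to install the Python 3-compatible fork and adjust the pybloom import in the code to match:

pip install pybloom-live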

support for proxies?

I was wondering if it was planned to have support for proxies like Burp. For websites with complex authentication behavior, Burp macros can be a reliable way to ensure a full scan is authenticated, and they make it easier to do follow-up tests once XSS has been found.

If you'd be interested I'd be happy to try to add an optional command line argument specifying the proxy to go through.
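For reference, a minimal sketch of how per-request proxying could work: Scrapy's built-in HttpProxyMiddleware honors the proxy key in request.meta, so a new command-line option would mostly just need to set it (the helper below is hypothetical, not existing xsscrapy code):

from scrapy import Request

def make_proxied_request(url, proxy='http://127.0.0.1:8080'):
    # HttpProxyMiddleware routes this request through the given proxy,
    # e.g. a local Burp listener on port 8080.
    return Request(url, meta={'proxy': proxy})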

AssertionError: version mismatch, 0.8.6 != 1.8.3

python get-pip.py
Traceback (most recent call last):
File "get-pip.py", line 19177, in <module>
main()
File "get-pip.py", line 194, in main
bootstrap(tmpdir=tmpdir)
File "get-pip.py", line 82, in bootstrap
import pip
File "/tmp/tmpR1SH6k/pip.zip/pip/__init__.py", line 16, in <module>
File "/tmp/tmpR1SH6k/pip.zip/pip/vcs/mercurial.py", line 9, in <module>
File "/tmp/tmpR1SH6k/pip.zip/pip/download.py", line 39, in <module>
File "/tmp/tmpR1SH6k/pip.zip/pip/_vendor/requests/__init__.py", line 53, in <module>
File "/tmp/tmpR1SH6k/pip.zip/pip/_vendor/requests/packages/urllib3/contrib/pyopenssl.py", line 54, in <module>
File "/usr/lib/python2.7/dist-packages/OpenSSL/__init__.py", line 8, in <module>
from OpenSSL import rand, crypto, SSL
File "/usr/lib/python2.7/dist-packages/OpenSSL/rand.py", line 11, in <module>
from OpenSSL._util import (
File "/usr/lib/python2.7/dist-packages/OpenSSL/_util.py", line 6, in <module>
from cryptography.hazmat.bindings.openssl.binding import Binding
File "/usr/lib/python2.7/dist-packages/cryptography/hazmat/bindings/openssl/binding.py", line 60, in <module>
class Binding(object):
File "/usr/lib/python2.7/dist-packages/cryptography/hazmat/bindings/openssl/binding.py", line 109, in Binding
libraries=_get_libraries(sys.platform)
File "/usr/lib/python2.7/dist-packages/cryptography/hazmat/bindings/utils.py", line 97, in build_ffi_for_binding
extra_link_args=extra_link_args,
File "/usr/lib/python2.7/dist-packages/cryptography/hazmat/bindings/utils.py", line 105, in build_ffi
ffi = FFI()
File "/usr/local/lib/python2.7/dist-packages/cffi/api.py", line 59, in __init__
"version mismatch, %s != %s" % (backend.version, version)
AssertionError: version mismatch, 0.8.6 != 1.8.3
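The mismatch is between an old compiled cffi backend (0.8.6 under /usr/lib) and a newer cffi package (1.8.3 under /usr/local/lib). A common generic fix, not specific to xsscrapy, is to reinstall both cffi and cryptography so the versions agree:

sudo pip install --upgrade --force-reinstall cffi cryptography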

output file missing

I just ran git pull with no errors. Now when I run a scan I don't get an output file of results anymore. Previously I got xsscrapy-vulns.txt

The terminal output shows there are findings like this:

2014-12-01 12:45:35-0500 [xsscrapy] DEBUG: Crawled (200) <POST http://demo.testfire.net/comment.aspx> (referer: http://demo.testfire.net/feedback.aspx)
2014-12-01 12:45:35-0500 [xsscrapy] NOLEVEL: URL: http://demo.testfire.net/feedback.aspx
2014-12-01 12:45:35-0500 [xsscrapy] NOLEVEL: response URL: http://demo.testfire.net/comment.aspx
2014-12-01 12:45:35-0500 [xsscrapy] NOLEVEL: POST url: http://demo.testfire.net/comment.aspx
2014-12-01 12:45:35-0500 [xsscrapy] NOLEVEL: Unfiltered: '"(){}:/;
2014-12-01 12:45:35-0500 [xsscrapy] NOLEVEL: Payload: 9zqjxaw'"(){}:/9zqjxaw;9
2014-12-01 12:45:35-0500 [xsscrapy] NOLEVEL: Type: form
2014-12-01 12:45:35-0500 [xsscrapy] NOLEVEL: Injection point: name
2014-12-01 12:45:35-0500 [xsscrapy] NOLEVEL: Possible payloads:
2014-12-01 12:45:35-0500 [xsscrapy] NOLEVEL: Line:

thank you for your comments, 9zqjxaw'"(){}:/9zqjxaw;9
2014-12-01 12:45:35-0500 [xsscrapy] DEBUG: Scraped from <200 http://demo.testfire.net/comment.aspx>

Should I just reinstall from scratch, or is there an issue with the current xsscrapy code?

Include IPython in the requirements file.

From a clean Python install (via virtualenv for example), when installing requirements from the given file and then running the tool, we get the following:

/usr/local/lib/python2.7/dist-packages/twisted/internet/_sslverify.py:184: UserWarning: You do not have the service_identity module installed. Please install it from <https://pypi.python.org/pypi/service_identity>. Without the service_identity module and a recent enough pyOpenSSL to support it, Twisted can perform only rudimentary TLS client hostname verification. Many valid certificate/hostname mappings may be rejected.
  verifyHostname, VerificationError = _selectVerifyImplementation()
Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 9, in <module>
    load_entry_point('Scrapy==0.24.4', 'console_scripts', 'scrapy')()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/commands/crawl.py", line 57, in run
    crawler = self.crawler_process.create_crawler()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 87, in create_crawler
    self.crawlers[name] = Crawler(self.settings)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 25, in __init__
    self.spiders = spman_cls.from_crawler(self)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/spidermanager.py", line 35, in from_crawler
    sm = cls.from_settings(crawler.settings)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/spidermanager.py", line 31, in from_settings
    return cls(settings.getlist('SPIDER_MODULES'))
  File "/usr/local/lib/python2.7/dist-packages/scrapy/spidermanager.py", line 22, in __init__
    for module in walk_modules(name):
  File "/usr/local/lib/python2.7/dist-packages/scrapy/utils/misc.py", line 68, in walk_modules
    submod = import_module(fullpath)
  File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/home/cthomas/.sources/xsscrapy/xsscrapy/spiders/xss_spider.py", line 22, in <module>
    from IPython import embed
ImportError: No module named IPython

From there a simple pip install ipython fixed the whole thing.

Requirements.txt

There is an error when installing xsscrapy: a SyntaxError (invalid syntax) raised from setup.py at line 98.

scrapy not working

I think there is a problem with the scrapy code: it doesn't load webpages properly and doesn't work on Python 3.9.0. Could you provide a version updated for the latest scrapy?

does not work after a fresh install

$ cat requirements.txt
Scrapy==1.1.0rc3
pybloom==1.1
requests
beautifulsoup
Twisted==16.6.0
w3lib
lxml==3.6.4
six
cssselect
pyopenssl
cryptography
queuelib

xsscrapy does not work:
$ python ./xsscrapy.py -u "http://www.****day.com/" -u 30
2018-01-07 10:53:18 [scrapy] INFO: Scrapy 1.1.0rc3 started (bot: xsscrapy)
2018-01-07 10:53:18 [scrapy] INFO: Overridden settings: {'COOKIES_DEBUG': True, 'NEWSPIDER_MODULE': 'xsscrapy.spiders', 'SPIDER_MODULES': ['xsscrapy.spiders'], 'DUPEFILTER_CLASS': 'xsscrapy.bloomfilters.BloomURLDupeFilter', 'CONCURRENT_REQUESTS': '30', 'BOT_NAME': 'xsscrapy', 'DOWNLOAD_DELAY': '0'}
2018-01-07 10:53:19 [scrapy] INFO: Enabled extensions:
['scrapy.extensions.logstats.LogStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.corestats.CoreStats']
2018-01-07 10:53:19 [scrapy] INFO: Enabled downloader middlewares:
['xsscrapy.middlewares.InjectedDupeFilter',
'xsscrapy.middlewares.RandomUserAgentMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2018-01-07 10:53:19 [scrapy] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2018-01-07 10:53:19 [scrapy] INFO: Enabled item pipelines:
['xsscrapy.pipelines.XSSCharFinder']
2018-01-07 10:53:19 [scrapy] INFO: Spider opened
2018-01-07 10:53:19 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-01-07 10:53:19 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2018-01-07 10:53:19 [scrapy] ERROR: Error while obtaining start requests
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/scrapy/core/engine.py", line 126, in _next_request
request = next(slot.start_requests)
File "/home/fakessh/Bureau/xsscrapy/xsscrapy/spiders/xss_spider.py", line 115, in start_requests
yield Request(url=self.start_urls[0])
File "/usr/local/lib/python2.7/dist-packages/scrapy/http/request/init.py", line 25, in init
self._set_url(url)
File "/usr/local/lib/python2.7/dist-packages/scrapy/http/request/init.py", line 57, in _set_url
raise ValueError('Missing scheme in request url: %s' % self._url)
ValueError: Missing scheme in request url: 30
2018-01-07 10:53:19 [scrapy] INFO: Closing spider (finished)
2018-01-07 10:53:19 [scrapy] INFO: Dumping Scrapy stats:
{'finish_reason': 'finished',
'finish_time': datetime.datetime(2018, 1, 7, 9, 53, 19, 364388),
'log_count/DEBUG': 1,
'log_count/ERROR': 1,
'log_count/INFO': 7,
'start_time': datetime.datetime(2018, 1, 7, 9, 53, 19, 308870)}
2018-01-07 10:53:19 [scrapy] INFO: Spider closed (finished)
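The traceback points at the command line rather than the tool: the second -u 30 replaced the URL, so Scrapy received 30 as its start URL, hence "Missing scheme in request url: 30". The flag for simultaneous connections is -c, so the intended invocation was presumably:

python ./xsscrapy.py -u "http://www.****day.com/" -c 30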

dependent install error

Running sudo pip install -r requirements.txt fails with this error message:

writing manifest file 'Twisted.egg-info/SOURCES.txt'

warning: manifest_maker: standard file '-c' not found



reading manifest file 'Twisted.egg-info/SOURCES.txt'

writing manifest file 'Twisted.egg-info/SOURCES.txt'

copying twisted/runner/portmap.c -> build/lib.linux-x86_64-2.7/twisted/runner

creating build/lib.linux-x86_64-2.7/twisted/internet/iocpreactor/iocpsupport

copying twisted/internet/iocpreactor/iocpsupport/iocpsupport.c -> build/lib.linux-x86_64-2.7/twisted/internet/iocpreactor/iocpsupport

copying twisted/internet/iocpreactor/iocpsupport/winsock_pointers.c -> build/lib.linux-x86_64-2.7/twisted/internet/iocpreactor/iocpsupport

copying twisted/python/sendmsg.c -> build/lib.linux-x86_64-2.7/twisted/python

copying twisted/test/raiser.c -> build/lib.linux-x86_64-2.7/twisted/test

running build_ext

x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/python2.7 -c conftest.c -o conftest.o

building 'twisted.runner.portmap' extension

creating build/temp.linux-x86_64-2.7

creating build/temp.linux-x86_64-2.7/twisted

creating build/temp.linux-x86_64-2.7/twisted/runner

x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/python2.7 -c twisted/runner/portmap.c -o build/temp.linux-x86_64-2.7/twisted/runner/portmap.o

twisted/runner/portmap.c:10:20: fatal error: Python.h: No such file or directory

 #include <Python.h>

                    ^

compilation terminated.

error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

----------------------------------------
Cleaning up...
Command /usr/bin/python -c "import setuptools;__file__='/tmp/pip_build_root/Twisted/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-1LsoSA-record/install-record.txt --single-version-externally-managed failed with error code 1 in /tmp/pip_build_root/Twisted
Storing complete log in /home/geekzhu/.pip/pip.log
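The decisive line is twisted/runner/portmap.c:10:20: fatal error: Python.h: No such file or directory: the Python development headers are missing. On Debian/Ubuntu they can typically be installed with:

sudo apt-get install python-dev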

Import Error of Suppress in XSScrapy

Hello Sir,
I have successfully installed all the requirements without any errors.
But when I try to run the command ./xsscrapy.py -u https://target.com
it gives me an import error: "cannot import name suppress"
Please guide!

Regards

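contextlib.suppress only exists on Python 3.4+, so this error usually means the code importing it is running under Python 2. A hedged compatibility shim (contextlib2 is a separate backport package on PyPI):

try:
    from contextlib import suppress  # available on Python 3.4+
except ImportError:
    # Python 2 backport: pip install contextlib2
    from contextlib2 import suppress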

How to fix this issue

Traceback (most recent call last):
  File "xsscrapy.py", line 45, in <module>
    main()
  File "xsscrapy.py", line 41, in main
    '-s', 'DOWNLOAD_DELAY=%s' % rate])
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapy/cmdline.py", line 150, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapy/cmdline.py", line 90, in _run_print_help
    func(*a, **kw)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapy/cmdline.py", line 157, in _run_command
    cmd.run(args, opts)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapy/commands/crawl.py", line 57, in run
    self.crawler_process.crawl(spname, **opts.spargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapy/crawler.py", line 170, in crawl
    crawler = self.create_crawler(crawler_or_spidercls)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapy/crawler.py", line 198, in create_crawler
    return self._create_crawler(crawler_or_spidercls)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapy/crawler.py", line 203, in _create_crawler
    return Crawler(spidercls, self.settings)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapy/crawler.py", line 55, in __init__
    self.extensions = ExtensionManager.from_crawler(self)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapy/middleware.py", line 58, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapy/middleware.py", line 34, in from_settings
    mwcls = load_object(clspath)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapy/utils/misc.py", line 44, in load_object
    mod = import_module(module)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapy/extensions/memusage.py", line 16, in <module>
    from scrapy.mail import MailSender
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapy/mail.py", line 15, in <module>
    from six.moves.email_mime_multipart import MIMEMultipart
ImportError: No module named email_mime_multipart
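six.moves.email_mime_multipart is provided by the six library itself, so this failure usually points to an old or shadowed six installation; upgrading it may resolve the import (an assumption based on the traceback, not a confirmed fix):

sudo pip install --upgrade six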

ERROR: Error processing

Hey, I got this error on Mac:

2017-03-26 10:08:40 [scrapy.core.scraper] ERROR: Error processing
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/twisted/internet/defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/Users/d34dr00t/pentest/xsscrapy/xsscrapy/pipelines.py", line 61, in process_item
    unclaimedURL = self.unclaimedURL_check(body)
  File "/Users/d34dr00t/pentest/xsscrapy/xsscrapy/pipelines.py", line 218, in unclaimedURL_check
    tree = fromstring(body)
  File "/Library/Python/2.7/site-packages/lxml/html/__init__.py", line 876, in fromstring
    doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
  File "/Library/Python/2.7/site-packages/lxml/html/__init__.py", line 762, in document_fromstring
    value = etree.fromstring(html, parser, **kw)
  File "src/lxml/lxml.etree.pyx", line 3213, in lxml.etree.fromstring (src/lxml/lxml.etree.c:79010)
  File "src/lxml/parser.pxi", line 1848, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:118341)
  File "src/lxml/parser.pxi", line 1736, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:117021)
  File "src/lxml/parser.pxi", line 1102, in lxml.etree._BaseParser._parseDoc (src/lxml/lxml.etree.c:111265)
  File "src/lxml/parser.pxi", line 595, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:105109)
  File "src/lxml/parser.pxi", line 706, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:106817)
  File "src/lxml/parser.pxi", line 644, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:105874)
XMLSyntaxError: line 444: ID  already defined (line 444)

Can't pass a cookie instead of login data?

I have reviewed the code in the loginform.py file and see that the script is looking for a submit button in order to submit the login information.

In my application, there is no input of type submit; instead it is implemented differently.

Sign in

Because of this, the scanner cannot sign in and be redirected into the application. Can we have an enhancement so the scanner accepts an authenticated cookie? That way you wouldn't have to perform the login at all.

I'm not very familiar with the scrapy library, so I couldn't tell where to pass this cookie value. FormRequest didn't appear to have any input to accept a cookie.

Thanks!
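For what it's worth, Scrapy requests accept cookies directly, so an authenticated-cookie mode could be sketched like this (hypothetical helper; newer xsscrapy versions expose this as the --cookie flag documented above):

from scrapy import Request

def authenticated_request(url, session_cookie):
    # Reuse an already-authenticated session cookie instead of logging in;
    # Scrapy's CookiesMiddleware sends it with the request.
    return Request(url, cookies={'SessionID': session_cookie})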

scrapy.cmdline not found

So I installed python 2.7.13, ran the pip install -r requirements.txt and then on my first test of xsscrapy I get the following error:

root@debian:/opt/xsscrapy# ./xsscrapy.py -u http://192.168.1.12/mutillidae/index.php?page=dns-lookup.php
Traceback (most recent call last):
File "./xsscrapy.py", line 4, in
from scrapy.cmdline import execute
ImportError: No module named scrapy.cmdline

This makes little sense to me considering scrapy is in my path.

root@debian:/opt/xsscrapy# echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

root@debian:/opt/xsscrapy# whereis scrapy
scrapy: /usr/local/bin/scrapy

It's weird: this works flawlessly on Kali 2.0, which is based on Debian. I'm doing this install on Debian 8 and it's broken... I don't know how that works, but the world is a strange place.
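A generic way to check which interpreter and scrapy installation are actually being picked up (diagnostic only, not specific to xsscrapy):

python -c "import scrapy; print(scrapy.__file__)"
head -1 /usr/local/bin/scrapy    # the shebang shows which Python the scrapy script runs under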

fatal problem after up-to-date Kali install, new Python install, and xsscrapy from scratch

Hey,

I encounter a fatal error after installing an up-to-date Kali, a new Python, and xsscrapy from scratch.

fakessh@fakessh:/opt/xssscrapy/xsscrapy$ sudo pip install --upgrade -r requirements.txt
[sudo] password for fakessh:
Requirement already up-to-date: Scrapy==1.1.0rc3 in /usr/local/lib/python2.7/dist-packages (from -r requirements.txt (line 2))
Requirement already up-to-date: pybloom==1.1 in /usr/local/lib/python2.7/dist-packages (from -r requirements.txt (line 3))
Requirement already up-to-date: requests in /usr/local/lib/python2.7/dist-packages (from -r requirements.txt (line 4))
Requirement already up-to-date: beautifulsoup in /usr/lib/python2.7/dist-packages (from -r requirements.txt (line 5))
Requirement already up-to-date: Twisted in /usr/local/lib/python2.7/dist-packages (from -r requirements.txt (line 6))
Requirement already up-to-date: w3lib in /usr/local/lib/python2.7/dist-packages (from -r requirements.txt (line 7))
Requirement already up-to-date: lxml in /usr/local/lib/python2.7/dist-packages (from -r requirements.txt (line 8))
Requirement already up-to-date: six in /usr/lib/python2.7/dist-packages (from -r requirements.txt (line 9))
Requirement already up-to-date: cssselect in /usr/lib/python2.7/dist-packages (from -r requirements.txt (line 10))
Requirement already up-to-date: pyopenssl in /usr/local/lib/python2.7/dist-packages (from -r requirements.txt (line 11))
Requirement already up-to-date: cryptography in /usr/local/lib/python2.7/dist-packages (from -r requirements.txt (line 12))
Requirement already up-to-date: queuelib in /usr/lib/python2.7/dist-packages (from -r requirements.txt (line 13))
Requirement already up-to-date: service-identity in /usr/lib/python2.7/dist-packages (from Scrapy==1.1.0rc3->-r requirements.txt (line 2))
Requirement already up-to-date: parsel>=0.9.3 in /usr/local/lib/python2.7/dist-packages (from Scrapy==1.1.0rc3->-r requirements.txt (line 2))
Requirement already up-to-date: PyDispatcher>=2.0.5 in /usr/local/lib/python2.7/dist-packages (from Scrapy==1.1.0rc3->-r requirements.txt (line 2))
Requirement already up-to-date: bitarray>=0.3.4 in /usr/local/lib/python2.7/dist-packages (from pybloom==1.1->-r requirements.txt (line 3))
Requirement already up-to-date: Automat>=0.3.0 in /usr/local/lib/python2.7/dist-packages (from Twisted->-r requirements.txt (line 6))
Requirement already up-to-date: constantly>=15.1 in /usr/lib/python2.7/dist-packages (from Twisted->-r requirements.txt (line 6))
Requirement already up-to-date: incremental>=16.10.1 in /usr/lib/python2.7/dist-packages (from Twisted->-r requirements.txt (line 6))
Requirement already up-to-date: zope.interface>=3.6.0 in /usr/local/lib/python2.7/dist-packages (from Twisted->-r requirements.txt (line 6))
Requirement already up-to-date: setuptools>=11.3 in /usr/local/lib/python2.7/dist-packages (from cryptography->-r requirements.txt (line 12))
Requirement already up-to-date: ipaddress in /usr/local/lib/python2.7/dist-packages (from cryptography->-r requirements.txt (line 12))
Requirement already up-to-date: idna>=2.1 in /usr/local/lib/python2.7/dist-packages (from cryptography->-r requirements.txt (line 12))
Requirement already up-to-date: asn1crypto>=0.21.0 in /usr/local/lib/python2.7/dist-packages (from cryptography->-r requirements.txt (line 12))
Requirement already up-to-date: enum34 in /usr/lib/python2.7/dist-packages (from cryptography->-r requirements.txt (line 12))
Requirement already up-to-date: cffi>=1.4.1 in /usr/local/lib/python2.7/dist-packages (from cryptography->-r requirements.txt (line 12))
Requirement already up-to-date: packaging in /usr/local/lib/python2.7/dist-packages (from cryptography->-r requirements.txt (line 12))
Requirement already up-to-date: attrs in /usr/lib/python2.7/dist-packages (from Automat>=0.3.0->Twisted->-r requirements.txt (line 6))
Requirement already up-to-date: appdirs>=1.4.0 in /usr/local/lib/python2.7/dist-packages (from setuptools>=11.3->cryptography->-r requirements.txt (line 12))
Requirement already up-to-date: pycparser in /usr/local/lib/python2.7/dist-packages (from cffi>=1.4.1->cryptography->-r requirements.txt (line 12))
Requirement already up-to-date: pyparsing in /usr/local/lib/python2.7/dist-packages (from packaging->cryptography->-r requirements.txt (line 12))
fakessh@fakessh:/opt/xssscrapy/xsscrapy$ sudo python ./xsscrapy.py -u http://www.airarabia.com/en -r 120
2017-03-10 20:13:13 [scrapy] INFO: Scrapy 1.1.0rc3 started (bot: xsscrapy)
2017-03-10 20:13:13 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'xsscrapy.spiders', 'DUPEFILTER_CLASS': 'xsscrapy.bloomfilters.BloomURLDupeFilter', 'SPIDER_MODULES': ['xsscrapy.spiders'], 'CONCURRENT_REQUESTS': '30', 'BOT_NAME': 'xsscrapy', 'DOWNLOAD_DELAY': '0.5'}
2017-03-10 20:13:13 [scrapy] INFO: Enabled extensions:
['scrapy.extensions.logstats.LogStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.corestats.CoreStats']
2017-03-10 20:13:13 [scrapy] INFO: Enabled downloader middlewares:
['xsscrapy.middlewares.InjectedDupeFilter',
'xsscrapy.middlewares.RandomUserAgentMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2017-03-10 20:13:13 [scrapy] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2017-03-10 20:13:13 [scrapy] INFO: Enabled item pipelines:
['xsscrapy.pipelines.XSSCharFinder']
2017-03-10 20:13:13 [scrapy] INFO: Spider opened
2017-03-10 20:13:13 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2017-03-10 20:13:13 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-03-10 20:13:13 [scrapy] ERROR: Error downloading <GET http://www.airarabia.com/en>
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1299, in _inlineCallbacks
result = result.throwExceptionIntoGenerator(g)
File "/usr/local/lib/python2.7/dist-packages/twisted/python/failure.py", line 393, in throwExceptionIntoGenerator
return g.throw(self.type, self.value, self.tb)
File "/usr/local/lib/python2.7/dist-packages/scrapy/core/downloader/middleware.py", line 43, in process_request
defer.returnValue((yield download_func(request=request,spider=spider)))
File "/usr/local/lib/python2.7/dist-packages/scrapy/utils/defer.py", line 45, in mustbe_deferred
result = f(*args, **kw)
File "/usr/local/lib/python2.7/dist-packages/scrapy/core/downloader/handlers/init.py", line 65, in download_request
return handler.download_request(request, spider)
File "/usr/local/lib/python2.7/dist-packages/scrapy/core/downloader/handlers/http11.py", line 60, in download_request
return agent.download_request(request)
File "/usr/local/lib/python2.7/dist-packages/scrapy/core/downloader/handlers/http11.py", line 264, in download_request
method, to_bytes(url, encoding='ascii'), headers, bodyproducer)
File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 1631, in request
parsedURI.originForm)
File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 1408, in _requestWithEndpoint
d = self._pool.getConnection(key, endpoint)
File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 1294, in getConnection
return self._newConnection(key, endpoint)
File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 1306, in _newConnection
return endpoint.connect(factory)
File "/usr/local/lib/python2.7/dist-packages/twisted/internet/endpoints.py", line 788, in connect
EndpointReceiver, self._hostText, portNumber=self._port
File "/usr/local/lib/python2.7/dist-packages/twisted/internet/_resolver.py", line 174, in resolveHostName
onAddress = self._simpleResolver.getHostByName(hostName)
File "/usr/local/lib/python2.7/dist-packages/scrapy/resolver.py", line 21, in getHostByName
d = super(CachingThreadedResolver, self).getHostByName(name, timeout)
File "/usr/local/lib/python2.7/dist-packages/twisted/internet/base.py", line 276, in getHostByName
timeoutDelay = sum(timeout)
TypeError: 'float' object is not iterable
2017-03-10 20:13:14 [scrapy] INFO: Closing spider (finished)
2017-03-10 20:13:14 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 1,
'downloader/exception_type_count/exceptions.TypeError': 1,
'downloader/request_bytes': 304,
'downloader/request_count': 1,
'downloader/request_method_count/GET': 1,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2017, 3, 10, 19, 13, 14, 138599),
'log_count/DEBUG': 1,
'log_count/ERROR': 1,
'log_count/INFO': 7,
'scheduler/dequeued': 1,
'scheduler/dequeued/memory': 1,
'scheduler/enqueued': 1,
'scheduler/enqueued/memory': 1,
'start_time': datetime.datetime(2017, 3, 10, 19, 13, 13, 720985)}
2017-03-10 20:13:14 [scrapy] INFO: Spider closed (finished)
fakessh@fakessh:/opt/xssscrapy/xsscrapy$
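This TypeError is a known incompatibility between this scrapy version and newer Twisted releases (the timeout argument to getHostByName changed shape). Pinning Twisted to the version listed in requirements.txt may help:

sudo pip install Twisted==16.6.0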
$ sudo python ./xsscrapy.py -h
usage: xsscrapy.py [-h] [-u URL] [-l LOGIN] [-p PASSWORD] [-c CONNECTIONS]
[-r RATELIMIT] [--basic] [-k COOKIE]

optional arguments:
-h, --help show this help message and exit
-u URL, --url URL URL to scan; -u http://example.com
-l LOGIN, --login LOGIN
Login name; -l danmcinerney
-p PASSWORD, --password PASSWORD
Password; -p pa$$w0rd
-c CONNECTIONS, --connections CONNECTIONS
Set the max number of simultaneous connections
allowed, default=30
-r RATELIMIT, --ratelimit RATELIMIT
Rate in requests per minute, default=0
--basic Use HTTP Basic Auth to login
-k COOKIE, --cookie COOKIE
Cookie key; --cookie
SessionID=afgh3193e9103bca9318031bcdf
fakessh@fakessh:/opt/xssscrapy/xsscrapy$

File "parser.pxi", line 631, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:95065)

ERROR: Error processing
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 651, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/root/mytools/xsscrapy/xsscrapy/pipelines.py", line 61, in process_item
    unclaimedURL = self.unclaimedURL_check(body)
  File "/root/mytools/xsscrapy/xsscrapy/pipelines.py", line 218, in unclaimedURL_check
    tree = fromstring(body)
  File "/usr/local/lib/python2.7/dist-packages/lxml/html/__init__.py", line 726, in fromstring
    doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
  File "/usr/local/lib/python2.7/dist-packages/lxml/html/__init__.py", line 614, in document_fromstring
    value = etree.fromstring(html, parser, **kw)
  File "lxml.etree.pyx", line 3103, in lxml.etree.fromstring (src/lxml/lxml.etree.c:70569)
  File "parser.pxi", line 1828, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:106403)
  File "parser.pxi", line 1716, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:105194)
  File "parser.pxi", line 1086, in lxml.etree._BaseParser._parseDoc (src/lxml/lxml.etree.c:99876)
  File "parser.pxi", line 580, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:94350)
  File "parser.pxi", line 690, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:95786)
  File "parser.pxi", line 631, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:95065)
XMLSyntaxError: None

XMLSyntaxError

What version of the lxml library is needed? On 3.8.0-2 I get the following errors, though they don't stop the script from running.

[scrapy] ERROR: Error processing
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 651, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/opt/xsscrapy/xsscrapy/pipelines.py", line 61, in process_item
    unclaimedURL = self.unclaimedURL_check(body)
  File "/opt/xsscrapy/xsscrapy/pipelines.py", line 218, in unclaimedURL_check
    tree = fromstring(body)
  File "/usr/local/lib/python2.7/dist-packages/lxml/html/__init__.py", line 876, in fromstring
    doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
  File "/usr/local/lib/python2.7/dist-packages/lxml/html/__init__.py", line 762, in document_fromstring
    value = etree.fromstring(html, parser, **kw)
  File "src/lxml/lxml.etree.pyx", line 3228, in lxml.etree.fromstring (src/lxml/lxml.etree.c:79609)
  File "src/lxml/parser.pxi", line 1848, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:119128)
  File "src/lxml/parser.pxi", line 1736, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:117808)
  File "src/lxml/parser.pxi", line 1102, in lxml.etree._BaseParser._parseDoc (src/lxml/lxml.etree.c:112052)
  File "src/lxml/parser.pxi", line 595, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:105896)
  File "src/lxml/parser.pxi", line 706, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:107604)
  File "src/lxml/parser.pxi", line 644, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:106661)
XMLSyntaxError: line 3661: Tag footer invalid (line 3661)
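Both of these XMLSyntaxError reports come from lxml.html.fromstring choking on malformed markup inside unclaimedURL_check. A defensive sketch, assuming it is acceptable to skip unparseable responses for that check:

from lxml.html import fromstring
from lxml.etree import XMLSyntaxError, ParserError

def safe_fromstring(body):
    # Return None for pages lxml cannot parse instead of letting
    # the exception abort pipeline processing.
    try:
        return fromstring(body)
    except (XMLSyntaxError, ParserError):
        return None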

Error of execution

Hello,

When I run this script I have this error:

Traceback (most recent call last):
  File "/opt/XSS/xsscrapy/xsscrapy.py", line 4, in <module>
    from scrapy.cmdline import execute
ModuleNotFoundError: No module named 'scrapy'

My version of python --> Python 3.9.7
My OS --> Linux 5.14.0-kali2-amd64 #1 SMP Debian 5.14.9-2kali1 (2021-10-04) x86_64 GNU/Linux

Regards, and thanks
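xsscrapy and its pinned dependencies target Python 2.7, so under Python 3.9 the run fails at the very first import. One workaround, assuming python2.7 is still available on the system, is a Python 2 virtualenv:

virtualenv -p python2.7 venv
. venv/bin/activate
pip install -r requirements.txt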

the new version does not work

After the new version and an install from scratch, the dependencies were surprisingly difficult to install, and xsscrapy.py doesn't start; it fails with this error:

$ sudo python ./xsscrapy.py -h
Traceback (most recent call last):
File "./xsscrapy.py", line 5, in
from xsscrapy.spiders.xss_spider import XSSspider
File "/opt/xssscrapy/xsscrapy/xsscrapy/spiders/xss_spider.py", line 3, in
from scrapy.linkextractors import LinkExtractor
ImportError: No module named linkextractors

Adding custom headers (sessions)?

XSScrapy is really a great tool; however, I am struggling to crawl login areas. I provide the session via headers and cookies, but it only crawls the areas outside the login (the publicly accessible pages).
Is there an example of how to run the tool with custom headers that include the user session?
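For reference, a minimal sketch of attaching session headers to a Scrapy request (hypothetical wiring; xsscrapy does not currently expose a headers option, and the header name below is a placeholder):

from scrapy import Request

def session_request(url, token):
    # 'X-Session-Token' is a placeholder; substitute whatever header
    # your application uses to carry the authenticated session.
    return Request(url, headers={'X-Session-Token': token})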

Facing this issue while installing requirements.txt

ERROR: Command errored out with exit status 1:
command: /usr/local/bin/python3.9 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-jzd70atp/pybloom_18079700ea8147b8921576e7f2418b13/setup.py'"'"'; __file__='"'"'/tmp/pip-install-jzd70atp/pybloom_18079700ea8147b8921576e7f2418b13/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-4mh1s55h
cwd: /tmp/pip-install-jzd70atp/pybloom_18079700ea8147b8921576e7f2418b13/
Complete output (8 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-jzd70atp/pybloom_18079700ea8147b8921576e7f2418b13/setup.py", line 2, in <module>
from ez_setup import use_setuptools
File "/tmp/pip-install-jzd70atp/pybloom_18079700ea8147b8921576e7f2418b13/ez_setup.py", line 98
except pkg_resources.VersionConflict, e:
^
SyntaxError: invalid syntax
----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/1a/82/a1ad015bdc19bd7e10aa97070329b84b5e01c0c6b5de88df664a98413eed/pybloom-1.1.tar.gz#sha256=f90903f2229135833a3ce115709662a4f4ea49bdfffc7d62a3061bd9021ac485 (from https://pypi.org/simple/pybloom/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Could not find a version that satisfies the requirement pybloom==1.1
ERROR: No matching distribution found for pybloom==1.1

Difficulty installing dependencies

CentOS 6.5 32-bit

pip.log

Requirement already satisfied (use --upgrade to upgrade): Scrapy==0.24.4 in /usr/lib/python2.6/site-packages (from -r requirements.txt (line 1))

Requirement already satisfied (use --upgrade to upgrade): pybloom==1.1 in /usr/lib/python2.6/site-packages (from -r requirements.txt (line 2))

Requirement already satisfied (use --upgrade to upgrade): requests in /usr/lib/python2.6/site-packages (from -r requirements.txt (line 3))

  skipping extra security
  skipping extra security
  skipping extra security
Requirement already satisfied (use --upgrade to upgrade): beautifulsoup in /usr/lib/python2.6/site-packages (from -r requirements.txt (line 4))

Downloading/unpacking Twisted>=10.0.0 (from Scrapy==0.24.4->-r requirements.txt (line 1))

  Running setup.py egg_info for package Twisted

    running egg_info
    writing requirements to pip-egg-info/Twisted.egg-info/requires.txt
    writing pip-egg-info/Twisted.egg-info/PKG-INFO
    writing top-level names to pip-egg-info/Twisted.egg-info/top_level.txt
    writing dependency_links to pip-egg-info/Twisted.egg-info/dependency_links.txt
    warning: manifest_maker: standard file '-c' not found
    reading manifest file 'pip-egg-info/Twisted.egg-info/SOURCES.txt'
    writing manifest file 'pip-egg-info/Twisted.egg-info/SOURCES.txt'
  Source in /tmp/pip-build-root/Twisted has version 15.0.0, which satisfies requirement Twisted>=10.0.0 (from Scrapy==0.24.4->-r requirements.txt (line 1))
Downloading/unpacking w3lib>=1.8.0 (from Scrapy==0.24.4->-r requirements.txt (line 1))

  Running setup.py egg_info for package w3lib

    /usr/lib/python2.6/distutils/dist.py:266: UserWarning: Unknown distribution option: 'zip_zafe'

      warnings.warn(msg)

    running egg_info
    writing requirements to pip-egg-info/w3lib.egg-info/requires.txt
    writing pip-egg-info/w3lib.egg-info/PKG-INFO
    writing top-level names to pip-egg-info/w3lib.egg-info/top_level.txt
    writing dependency_links to pip-egg-info/w3lib.egg-info/dependency_links.txt
    warning: manifest_maker: standard file '-c' not found
    reading manifest file 'pip-egg-info/w3lib.egg-info/SOURCES.txt'
    writing manifest file 'pip-egg-info/w3lib.egg-info/SOURCES.txt'
  Source in /tmp/pip-build-root/w3lib has version 1.11.0, which satisfies requirement w3lib>=1.8.0 (from Scrapy==0.24.4->-r requirements.txt (line 1))
Downloading/unpacking queuelib (from Scrapy==0.24.4->-r requirements.txt (line 1))

  Running setup.py egg_info for package queuelib

    running egg_info
    writing pip-egg-info/queuelib.egg-info/PKG-INFO
    writing top-level names to pip-egg-info/queuelib.egg-info/top_level.txt
    writing dependency_links to pip-egg-info/queuelib.egg-info/dependency_links.txt
    warning: manifest_maker: standard file '-c' not found
    reading manifest file 'pip-egg-info/queuelib.egg-info/SOURCES.txt'
    reading manifest template 'MANIFEST.in'
    writing manifest file 'pip-egg-info/queuelib.egg-info/SOURCES.txt'
  Source in /tmp/pip-build-root/queuelib has version 1.2.2, which satisfies requirement queuelib (from Scrapy==0.24.4->-r requirements.txt (line 1))
Downloading/unpacking lxml (from Scrapy==0.24.4->-r requirements.txt (line 1))

  Running setup.py egg_info for package lxml

    /usr/lib/python2.6/distutils/dist.py:266: UserWarning: Unknown distribution option: 'bugtrack_url'

      warnings.warn(msg)

    Building lxml version 3.4.1.

    Building without Cython.

    Using build configuration of libxslt 1.1.26

    Building against libxml2/libxslt in the following directory: /usr/lib

    running egg_info
    writing requirements to pip-egg-info/lxml.egg-info/requires.txt
    writing pip-egg-info/lxml.egg-info/PKG-INFO
    writing top-level names to pip-egg-info/lxml.egg-info/top_level.txt
    writing dependency_links to pip-egg-info/lxml.egg-info/dependency_links.txt
    warning: manifest_maker: standard file '-c' not found
    reading manifest file 'pip-egg-info/lxml.egg-info/SOURCES.txt'
    reading manifest template 'MANIFEST.in'
    writing manifest file 'pip-egg-info/lxml.egg-info/SOURCES.txt'
  Source in /tmp/pip-build-root/lxml has version 3.4.1, which satisfies requirement lxml (from Scrapy==0.24.4->-r requirements.txt (line 1))
  skipping extra source
  skipping extra cssselect
  skipping extra html5
  skipping extra htmlsoup
Requirement already satisfied (use --upgrade to upgrade): pyOpenSSL in /root/.local/lib/python2.6/site-packages/pyOpenSSL-0.14-py2.6.egg (from Scrapy==0.24.4->-r requirements.txt (line 1))

Downloading/unpacking cssselect>=0.9 (from Scrapy==0.24.4->-r requirements.txt (line 1))

  Running setup.py egg_info for package cssselect

    running egg_info
    writing pip-egg-info/cssselect.egg-info/PKG-INFO
    writing top-level names to pip-egg-info/cssselect.egg-info/top_level.txt
    writing dependency_links to pip-egg-info/cssselect.egg-info/dependency_links.txt
    warning: manifest_maker: standard file '-c' not found
    reading manifest file 'pip-egg-info/cssselect.egg-info/SOURCES.txt'
    reading manifest template 'MANIFEST.in'
    no previously-included directories found matching 'docs/_build'

    writing manifest file 'pip-egg-info/cssselect.egg-info/SOURCES.txt'
  Source in /tmp/pip-build-root/cssselect has version 0.9.1, which satisfies requirement cssselect>=0.9 (from Scrapy==0.24.4->-r requirements.txt (line 1))
Requirement already satisfied (use --upgrade to upgrade): six>=1.5.2 in /root/.local/lib/python2.6/site-packages/six-1.8.0-py2.6.egg (from Scrapy==0.24.4->-r requirements.txt (line 1))

Downloading/unpacking bitarray>=0.3.4 (from pybloom==1.1->-r requirements.txt (line 2))

  Running setup.py egg_info for package bitarray

    running egg_info
    writing pip-egg-info/bitarray.egg-info/PKG-INFO
    writing top-level names to pip-egg-info/bitarray.egg-info/top_level.txt
    writing dependency_links to pip-egg-info/bitarray.egg-info/dependency_links.txt
    warning: manifest_maker: standard file '-c' not found
    reading manifest file 'pip-egg-info/bitarray.egg-info/SOURCES.txt'
    writing manifest file 'pip-egg-info/bitarray.egg-info/SOURCES.txt'
  Source in /tmp/pip-build-root/bitarray has version 0.8.1, which satisfies requirement bitarray>=0.3.4 (from pybloom==1.1->-r requirements.txt (line 2))
Downloading/unpacking zope.interface>=3.6.0 (from Twisted>=10.0.0->Scrapy==0.24.4->-r requirements.txt (line 1))

  Running setup.py egg_info for package zope.interface

    running egg_info
    writing requirements to pip-egg-info/zope.interface.egg-info/requires.txt
    writing pip-egg-info/zope.interface.egg-info/PKG-INFO
    writing namespace_packages to pip-egg-info/zope.interface.egg-info/namespace_packages.txt
    writing top-level names to pip-egg-info/zope.interface.egg-info/top_level.txt
    writing dependency_links to pip-egg-info/zope.interface.egg-info/dependency_links.txt
    warning: manifest_maker: standard file '-c' not found
    reading manifest file 'pip-egg-info/zope.interface.egg-info/SOURCES.txt'
    reading manifest template 'MANIFEST.in'
    warning: no previously-included files matching '*.dll' found anywhere in distribution

    warning: no previously-included files matching '*.pyc' found anywhere in distribution

    warning: no previously-included files matching '*.pyo' found anywhere in distribution

    warning: no previously-included files matching '*.so' found anywhere in distribution

    writing manifest file 'pip-egg-info/zope.interface.egg-info/SOURCES.txt'
  Source in /tmp/pip-build-root/zope.interface has version 4.1.2, which satisfies requirement zope.interface>=3.6.0 (from Twisted>=10.0.0->Scrapy==0.24.4->-r requirements.txt (line 1))
  skipping extra test
  skipping extra docs
  skipping extra docs
  skipping extra testing
  skipping extra testing
  skipping extra testing
Downloading/unpacking cryptography>=0.2.1 (from pyOpenSSL->Scrapy==0.24.4->-r requirements.txt (line 1))

  Running setup.py egg_info for package cryptography

    running egg_info
    writing requirements to pip-egg-info/cryptography.egg-info/requires.txt
    writing pip-egg-info/cryptography.egg-info/PKG-INFO
    writing top-level names to pip-egg-info/cryptography.egg-info/top_level.txt
    writing dependency_links to pip-egg-info/cryptography.egg-info/dependency_links.txt
    writing entry points to pip-egg-info/cryptography.egg-info/entry_points.txt
    warning: manifest_maker: standard file '-c' not found
    reading manifest file 'pip-egg-info/cryptography.egg-info/SOURCES.txt'
    reading manifest template 'MANIFEST.in'
    no previously-included directories found matching 'docs/_build'

    warning: no previously-included files matching '*' found under directory 'vectors'

    writing manifest file 'pip-egg-info/cryptography.egg-info/SOURCES.txt'
  Source in /tmp/pip-build-root/cryptography has version 0.7.2, which satisfies requirement cryptography>=0.2.1 (from pyOpenSSL->Scrapy==0.24.4->-r requirements.txt (line 1))
Requirement already satisfied (use --upgrade to upgrade): distribute in /usr/lib/python2.6/site-packages (from zope.interface>=3.6.0->Twisted>=10.0.0->Scrapy==0.24.4->-r requirements.txt (line 1))

Downloading/unpacking cffi>=0.8 (from cryptography>=0.2.1->pyOpenSSL->Scrapy==0.24.4->-r requirements.txt (line 1))

  Running setup.py egg_info for package cffi

    compiling '_configtest.c':

    __thread int some_threadlocal_variable_42;

    running egg_info
    writing requirements to pip-egg-info/cffi.egg-info/requires.txt
    writing pip-egg-info/cffi.egg-info/PKG-INFO
    writing top-level names to pip-egg-info/cffi.egg-info/top_level.txt
    writing dependency_links to pip-egg-info/cffi.egg-info/dependency_links.txt
    warning: manifest_maker: standard file '-c' not found
    reading manifest file 'pip-egg-info/cffi.egg-info/SOURCES.txt'
    reading manifest template 'MANIFEST.in'
    writing manifest file 'pip-egg-info/cffi.egg-info/SOURCES.txt'
  Source in /tmp/pip-build-root/cffi has version 0.8.6, which satisfies requirement cffi>=0.8 (from cryptography>=0.2.1->pyOpenSSL->Scrapy==0.24.4->-r requirements.txt (line 1))
Downloading/unpacking pyasn1 (from cryptography>=0.2.1->pyOpenSSL->Scrapy==0.24.4->-r requirements.txt (line 1))

  Running setup.py egg_info for package pyasn1

    running egg_info
    writing pip-egg-info/pyasn1.egg-info/PKG-INFO
    writing top-level names to pip-egg-info/pyasn1.egg-info/top_level.txt
    writing dependency_links to pip-egg-info/pyasn1.egg-info/dependency_links.txt
    warning: manifest_maker: standard file '-c' not found
    reading manifest file 'pip-egg-info/pyasn1.egg-info/SOURCES.txt'
    reading manifest template 'MANIFEST.in'
    writing manifest file 'pip-egg-info/pyasn1.egg-info/SOURCES.txt'
  Source in /tmp/pip-build-root/pyasn1 has version 0.1.7, which satisfies requirement pyasn1 (from cryptography>=0.2.1->pyOpenSSL->Scrapy==0.24.4->-r requirements.txt (line 1))
Downloading/unpacking enum34 (from cryptography>=0.2.1->pyOpenSSL->Scrapy==0.24.4->-r requirements.txt (line 1))

  Running setup.py egg_info for package enum34

    running egg_info
    writing requirements to pip-egg-info/enum34.egg-info/requires.txt
    writing pip-egg-info/enum34.egg-info/PKG-INFO
    writing top-level names to pip-egg-info/enum34.egg-info/top_level.txt
    writing dependency_links to pip-egg-info/enum34.egg-info/dependency_links.txt
    warning: manifest_maker: standard file '-c' not found
    reading manifest file 'pip-egg-info/enum34.egg-info/SOURCES.txt'
    writing manifest file 'pip-egg-info/enum34.egg-info/SOURCES.txt'
  Source in /tmp/pip-build-root/enum34 has version 1.0.4, which satisfies requirement enum34 (from cryptography>=0.2.1->pyOpenSSL->Scrapy==0.24.4->-r requirements.txt (line 1))
Downloading/unpacking pycparser (from cffi>=0.8->cryptography>=0.2.1->pyOpenSSL->Scrapy==0.24.4->-r requirements.txt (line 1))

  Running setup.py egg_info for package pycparser

    running egg_info
    writing pip-egg-info/pycparser.egg-info/PKG-INFO
    writing top-level names to pip-egg-info/pycparser.egg-info/top_level.txt
    writing dependency_links to pip-egg-info/pycparser.egg-info/dependency_links.txt
    warning: manifest_maker: standard file '-c' not found
    reading manifest file 'pip-egg-info/pycparser.egg-info/SOURCES.txt'
    writing manifest file 'pip-egg-info/pycparser.egg-info/SOURCES.txt'
  Source in /tmp/pip-build-root/pycparser has version 2.10, which satisfies requirement pycparser (from cffi>=0.8->cryptography>=0.2.1->pyOpenSSL->Scrapy==0.24.4->-r requirements.txt (line 1))
Downloading/unpacking ordereddict (from enum34->cryptography>=0.2.1->pyOpenSSL->Scrapy==0.24.4->-r requirements.txt (line 1))

  Running setup.py egg_info for package ordereddict

    running egg_info
    writing pip-egg-info/ordereddict.egg-info/PKG-INFO
    writing top-level names to pip-egg-info/ordereddict.egg-info/top_level.txt
    writing dependency_links to pip-egg-info/ordereddict.egg-info/dependency_links.txt
    warning: manifest_maker: standard file '-c' not found
    reading manifest file 'pip-egg-info/ordereddict.egg-info/SOURCES.txt'
    writing manifest file 'pip-egg-info/ordereddict.egg-info/SOURCES.txt'
  Source in /tmp/pip-build-root/ordereddict has version 1.1, which satisfies requirement ordereddict (from enum34->cryptography>=0.2.1->pyOpenSSL->Scrapy==0.24.4->-r requirements.txt (line 1))
Installing collected packages: Twisted, w3lib, queuelib, lxml, cssselect, bitarray, zope.interface, cryptography, cffi, pyasn1, enum34, pycparser, ordereddict

  Running setup.py install for Twisted

    Running command /usr/bin/python -c "import setuptools;__file__='/tmp/pip-build-root/Twisted/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-6YgujV-record/install-record.txt --single-version-externally-managed
    running install
    running build
    running build_py
    running egg_info
    writing requirements to Twisted.egg-info/requires.txt
    writing Twisted.egg-info/PKG-INFO
    writing top-level names to Twisted.egg-info/top_level.txt
    writing dependency_links to Twisted.egg-info/dependency_links.txt
    warning: manifest_maker: standard file '-c' not found
    reading manifest file 'Twisted.egg-info/SOURCES.txt'
    writing manifest file 'Twisted.egg-info/SOURCES.txt'
    running build_ext
    gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python2.6 -c conftest.c -o conftest.o

    building 'twisted.runner.portmap' extension

    gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python2.6 -c twisted/runner/portmap.c -o build/temp.linux-i686-2.6/twisted/runner/portmap.o

    twisted/runner/portmap.c:10:20: error: Python.h: No such file or directory

    twisted/runner/portmap.c:14: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘*’ token

    twisted/runner/portmap.c:31: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘*’ token

    twisted/runner/portmap.c:45: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘PortmapMethods’

    twisted/runner/portmap.c: In function ‘initportmap’:

    twisted/runner/portmap.c:55: warning: implicit declaration of function ‘Py_InitModule’

    twisted/runner/portmap.c:55: error: ‘PortmapMethods’ undeclared (first use in this function)

    twisted/runner/portmap.c:55: error: (Each undeclared identifier is reported only once

    twisted/runner/portmap.c:55: error: for each function it appears in.)

    error: command 'gcc' failed with exit status 1

    Complete output from command /usr/bin/python -c "import setuptools;__file__='/tmp/pip-build-root/Twisted/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-6YgujV-record/install-record.txt --single-version-externally-managed:

    [build output repeated verbatim, ending in: error: command 'gcc' failed with exit status 1]

----------------------------------------

Command /usr/bin/python -c "import setuptools;__file__='/tmp/pip-build-root/Twisted/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-6YgujV-record/install-record.txt --single-version-externally-managed failed with error code 1 in /tmp/pip-build-root/Twisted

Exception information:
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/pip/basecommand.py", line 139, in main
    status = self.run(options, args)
  File "/usr/lib/python2.6/site-packages/pip/commands/install.py", line 271, in run
    requirement_set.install(install_options, global_options, root=options.root_path)
  File "/usr/lib/python2.6/site-packages/pip/req.py", line 1185, in install
    requirement.install(install_options, global_options, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/pip/req.py", line 592, in install
    cwd=self.source_dir, filter_stdout=self._filter_install, show_stdout=False)
  File "/usr/lib/python2.6/site-packages/pip/util.py", line 662, in call_subprocess
    % (command_desc, proc.returncode, cwd))
InstallationError: Command /usr/bin/python -c "import setuptools;__file__='/tmp/pip-build-root/Twisted/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-6YgujV-record/install-record.txt --single-version-externally-managed failed with error code 1 in /tmp/pip-build-root/Twisted
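
For what it's worth, the root cause in the log above is the missing CPython header (Python.h), which gcc needs in order to compile Twisted's C extensions. A minimal diagnostic sketch, assuming a stock CPython install (this helper is illustrative, not part of xsscrapy):

    # Check whether the CPython development headers are present; their absence
    # is exactly what makes gcc fail on twisted/runner/portmap.c above.
    import os
    from distutils import sysconfig

    include_dir = sysconfig.get_python_inc()       # e.g. /usr/include/python2.6
    header = os.path.join(include_dir, "Python.h")
    if os.path.exists(header):
        print("found %s" % header)
    else:
        print("%s missing - install your distro's Python development package" % header)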

Major changes coming

Tested this script against wavsep, the web application vulnerability scanner benchmarking tool. It fails multiple of wavsep's XSS tests, but most of the problems seem to lie in the logic for determining whether the single or the double quote is the delimiting quote. In the coming update, xsscrapy should cut the number of requests it makes in half and have significantly better detection rates. This may take a few weeks or more to accomplish.
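
For readers wondering what the delimiting-quote logic refers to: the idea is to inject a probe containing both quote characters and check which one comes back unencoded. A minimal illustrative sketch (the probe and response strings here are hypothetical, not xsscrapy's actual payloads):

    # Whichever quote survives output encoding is the one that can break out of
    # the attribute the payload landed in; picking the wrong one causes the
    # missed detections described above.
    probe = "zqx'\"zqx"
    reflected = "zqx'&quot;zqx"  # hypothetical response: double quote was HTML-encoded

    usable = [q for q in ("'", '"') if q in reflected]
    print(usable)  # ["'"] -> only the single quote is a viable delimiter here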

Twisted Error when I run any URL

When I run xsscrapy against any URL, like "python xsscrapy.py -u https://example.com",
I get this error.

2022-02-09 22:19:55 [scrapy] INFO: Enabled extensions:
['scrapy.extensions.logstats.LogStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.corestats.CoreStats']
Unhandled error in Deferred:
2022-02-09 22:19:55 [twisted] CRITICAL: Unhandled error in Deferred:

2022-02-09 22:19:55 [twisted] CRITICAL:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1299, in _inlineCallbacks
    result = g.send(result)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 90, in crawl
    six.reraise(*exc_info)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 72, in crawl
    self.engine = self._create_engine()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 97, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/engine.py", line 66, in __init__
    self.scheduler_cls = load_object(self.settings['SCHEDULER'])
  File "/usr/local/lib/python2.7/dist-packages/scrapy/utils/misc.py", line 44, in load_object
    mod = import_module(module)
  File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/scheduler.py", line 6, in <module>
    from queuelib import PriorityQueue
  File "/usr/local/lib/python2.7/dist-packages/queuelib/__init__.py", line 1, in <module>
    from queuelib.queue import FifoDiskQueue, LifoDiskQueue
  File "/usr/local/lib/python2.7/dist-packages/queuelib/queue.py", line 7, in <module>
    from contextlib import suppress
ImportError: cannot import name suppress

Please help me fix this issue.
Thank you.
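
The ImportError above is a Python 2 vs 3 mismatch rather than an xsscrapy bug: contextlib.suppress only exists on Python 3.4+, and recent queuelib releases import it, so running the Python 2.7 codebase against a current queuelib fails at startup. A quick way to see this, as a sketch:

    import sys

    try:
        from contextlib import suppress  # added in Python 3.4
    except ImportError:
        # On Python 2.7 this branch is taken, matching the traceback above;
        # pinning an older queuelib (or running under Python 3) avoids it.
        print("contextlib.suppress unavailable on Python %s" % sys.version.split()[0])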
XSScrapy error

versions?

What versions of Python and pip are required to avoid errors?

Request: Additional install documentation

With Mac OS X and Python 2.7.8 installed via brew, this doesn't work out of the box (for me).

I got errors that the following modules were missing:
pip install requests <- Worked fine
pip install IPython <- Worked fine
pip install pybloomfiltermmap <- Not compatible. Any ideas?

I attempted to change the module imports from pybloom to pybloomfilter but ran into errors with list types when checking the bloom filter.

Any additional documentation regarding the Python version, other dependencies, and whether I need a specific bloom filter module installed would be appreciated. Thanks. Maybe allow installation via setuptools?
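
For reference, pybloom (the package pinned in requirements.txt) exposes roughly the API below; pybloomfiltermmap's constructor differs, which is likely why swapping the import broke the membership checks. A sketch of the pybloom side only (capacity and error rate are made-up values):

    from pybloom import BloomFilter

    # xsscrapy-style dedup: remember URLs probabilistically instead of in a set.
    seen = BloomFilter(capacity=100000, error_rate=0.001)
    seen.add("http://example.com/page")
    print("http://example.com/page" in seen)   # True
    print("http://example.com/other" in seen)  # False (with high probability)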

Urlparse Module not found

root@kali:~/xsscrapy# ./xsscrapy.py -h
Traceback (most recent call last):
  File "./xsscrapy.py", line 5, in <module>
    from xsscrapy.spiders.xss_spider import XSSspider
  File "/root/xsscrapy/xsscrapy/spiders/xss_spider.py", line 9, in <module>
    from urlparse import urlparse, parse_qsl, urljoin, urlunparse, urlunsplit
ModuleNotFoundError: No module named 'urlparse'

root@kali:~/xsscrapy# pip install urlparse
Collecting urlparse
  Could not find a version that satisfies the requirement urlparse (from versions: )
No matching distribution found for urlparse

urlparse error

from urlparse import urlparse, parse_qsl, urljoin, urlunparse, urlunsplit
ModuleNotFoundError: No module named 'urlparse'
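
Both reports above hit the same Python 2/3 rename: the urlparse module became urllib.parse in Python 3, and there is no installable "urlparse" package on PyPI. A hedged compatibility sketch (not the project's actual code) that imports the same names on either interpreter:

    try:
        # Python 2 location, which xss_spider.py uses
        from urlparse import urlparse, parse_qsl, urljoin, urlunparse, urlunsplit
    except ImportError:
        # Python 3 location of the identical functions
        from urllib.parse import urlparse, parse_qsl, urljoin, urlunparse, urlunsplit

    print(urlparse("http://example.com/a?b=c").hostname)  # example.com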

Auth failing - 401 - HTTP status code is not handled or not allowed

I pulled the latest xsscrapy today. I tried to run it against a site that needs Basic auth. I fed it the credentials via options -l and -p. However, xsscrapy will not spider.

2014-10-22 10:35:05-0400 [scrapy] INFO: Scrapy 0.24.4 started (bot: xsscrapy)
...
2014-10-22 10:35:06-0400 [scrapy] DEBUG: Web service listening on 127.0.0.1:6080
2014-10-22 10:35:07-0400 [xsscrapy] DEBUG: Crawled (401) <GET http://target/page> (referer: None)
2014-10-22 10:35:07-0400 [xsscrapy] DEBUG: Ignoring response <401 http://target/page>: HTTP status code is not handled or not allowed
2014-10-22 10:35:07-0400 [xsscrapy] INFO: Closing spider (finished)
...
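
The "HTTP status code is not handled or not allowed" message is scrapy's HttpErrorMiddleware dropping the non-2xx response, which suggests the credentials never made it onto the request. In stock scrapy, HTTP Basic auth is enabled by setting spider attributes like the following; this is a sketch of the plain-scrapy mechanism, not of xsscrapy's option handling:

    import scrapy

    class AuthDemoSpider(scrapy.Spider):
        name = "auth_demo"
        http_user = "loginname"          # consumed by scrapy's HttpAuthMiddleware
        http_pass = "secret"
        handle_httpstatus_list = [401]   # let 401 responses reach parse() instead of being dropped
        start_urls = ["http://target/page"]

        def parse(self, response):
            self.logger.info("status %s for %s", response.status, response.url)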

scope creep - crawling beyond the target site into other sites on the domain

It seems like this is a feature: give it -u http://a.example.com and, if there is a link to http://b.example.com, xsscrapy follows and tests it. But IMO that is a big mistake (as a default setting). I want to test QA, not production, and sometimes (often) a QA site has links to production. So I try to scan JUST http://qa.example.com and xsscrapy ends up going to http://www.example.com. Now I've just sent traffic to production. Not an ideal situation.

My fix (sketched below):
Edit xsscrapy/spiders/xss_spider.py to modify self.allowed_domains to be:
self.allowed_domains = [hostname]
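
A sketch of that fix in context, assuming the spider's constructor receives the start URL (the attribute name comes from the report; the surrounding class is illustrative, not xsscrapy's actual code):

    from urlparse import urlparse  # urllib.parse on Python 3

    class ScopedSpider(object):
        def __init__(self, url):
            hostname = urlparse(url).hostname  # e.g. 'qa.example.com'
            # Exact-host scoping: links to www.example.com are no longer followed,
            # since scrapy's offsite filtering only matches the listed host(s).
            self.allowed_domains = [hostname]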

ERROR: Error downloading

2017-03-27 13:08:59 [scrapy] ERROR: Error downloading <GET http://urlhere/bla/bla/bla>
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/twisted/internet/defer.py", line 1299, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/Library/Python/2.7/site-packages/twisted/python/failure.py", line 393, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/Library/Python/2.7/site-packages/scrapy/core/downloader/middleware.py", line 43, in process_request
    defer.returnValue((yield download_func(request=request,spider=spider)))
  File "/Library/Python/2.7/site-packages/scrapy/utils/defer.py", line 45, in mustbe_deferred
    result = f(*args, **kw)
  File "/Library/Python/2.7/site-packages/scrapy/core/downloader/handlers/__init__.py", line 65, in download_request
    return handler.download_request(request, spider)
  File "/Library/Python/2.7/site-packages/scrapy/core/downloader/handlers/http11.py", line 60, in download_request
    return agent.download_request(request)
  File "/Library/Python/2.7/site-packages/scrapy/core/downloader/handlers/http11.py", line 264, in download_request
    method, to_bytes(url, encoding='ascii'), headers, bodyproducer)
  File "/Library/Python/2.7/site-packages/twisted/web/client.py", line 1631, in request
    parsedURI.originForm)
  File "/Library/Python/2.7/site-packages/twisted/web/client.py", line 1408, in _requestWithEndpoint
    d = self._pool.getConnection(key, endpoint)
  File "/Library/Python/2.7/site-packages/twisted/web/client.py", line 1294, in getConnection
    return self._newConnection(key, endpoint)
  File "/Library/Python/2.7/site-packages/twisted/web/client.py", line 1306, in _newConnection
    return endpoint.connect(factory)
  File "/Library/Python/2.7/site-packages/twisted/internet/endpoints.py", line 788, in connect
    EndpointReceiver, self._hostText, portNumber=self._port
  File "/Library/Python/2.7/site-packages/twisted/internet/_resolver.py", line 174, in resolveHostName
    onAddress = self._simpleResolver.getHostByName(hostName)
  File "/Library/Python/2.7/site-packages/scrapy/resolver.py", line 21, in getHostByName
    d = super(CachingThreadedResolver, self).getHostByName(name, timeout)
  File "/Library/Python/2.7/site-packages/twisted/internet/base.py", line 276, in getHostByName
    timeoutDelay = sum(timeout)
TypeError: 'float' object is not iterable
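
The final frame explains the crash: twisted's base getHostByName() treats its timeout argument as a sequence of retry delays (it calls sum(timeout)), while this scrapy release hands it a single float. A toy reproduction of the mismatch, not a patch:

    # Mimics how twisted.internet.base uses the `timeout` parameter.
    def resolve(name, timeout=(1, 3, 11, 45)):
        delay = sum(timeout)  # fine for a tuple of retry delays
        return name, delay

    resolve("example.com", timeout=(60.0,))   # OK: a one-element sequence
    try:
        resolve("example.com", timeout=60.0)  # what scrapy passed here
    except TypeError as e:
        print(e)  # 'float' object is not iterable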

Please fix this: lots of false positives

example: tesla.txt

basically your script injected this string:
1zqjoz'"(){}:1zqjoz;9

And in the response it found:
1zqjar'%22()%7b%7d%3cx%3e:1zqjar;9

And it reports this as a valid bug?!

THE INPUT IS PROPERLY HANDLED.
There is no vulnerability.

I am tired of getting this, could you please fix it?
I have a tool of my own and it has far fewer false positives than yours, so this should be easy to fix (but I don't know).

Could you prioritise this?

ps: I love your tool, it's just that this false-positive thing is annoying.
all the best!
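
The report's point, reduced to code: most of the injected metacharacters come back URL/HTML-encoded, so matching on the marker string alone overstates the finding. A simplified check over the strings quoted above (an illustration of the complaint, not xsscrapy's detection logic):

    injected  = "1zqjoz'\"(){}:1zqjoz;9"
    reflected = "1zqjar'%22()%7b%7d%3cx%3e:1zqjar;9"  # response quoted in the report

    dangerous = "'\"<>{}"
    survived = [c for c in dangerous if c in reflected]
    print(survived)  # ["'"] -> only the single quote survived; the rest were encoded away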

requirements.txt problem

root@kali:~/xsscrapy# pip install -r requirements.txt
Collecting Scrapy==1.1.0rc3 (from -r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/c4/33/a87d324a3c25b6e6e8018b9161987e185910bd6e611ebb75ce169a7f1312/Scrapy-1.1.0rc3-py2.py3-none-any.whl
Collecting pybloom==1.1 (from -r requirements.txt (line 2))
Using cached https://files.pythonhosted.org/packages/1a/82/a1ad015bdc19bd7e10aa97070329b84b5e01c0c6b5de88df664a98413eed/pybloom-1.1.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-install-kysj5guf/pybloom/setup.py", line 2, in <module>
    from ez_setup import use_setuptools
  File "/tmp/pip-install-kysj5guf/pybloom/ez_setup.py", line 98
    except pkg_resources.VersionConflict, e:
                                        ^
SyntaxError: invalid syntax

----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-kysj5guf/pybloom/

Frames support

I haven't had much of a look at the code to tell whether it's this or scrapy, but it seems to skip over frames/iframes when spidering. It would be good if it could look at the URL and load the frame if it is on the same domain.

NameError: name 'execute' is not defined

./xsscrapy.py
sys.version_info(major=2, minor=7, micro=8, releaselevel='final', serial=0)
Traceback (most recent call last):
  File "./xsscrapy.py", line 28, in <module>
    execute(['scrapy', 'crawl', 'xsscrapy', '-a', 'url=%s' % url, '-a', 'user=%s' % user, '-a', 'pw=%s' % password])
NameError: name 'execute' is not defined
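
execute normally comes from scrapy itself, so a NameError at that call usually means an earlier "from scrapy.cmdline import execute" failed (e.g. scrapy isn't installed for the interpreter being used) and the failure was swallowed. A hedged sketch of a louder import guard, not xsscrapy's actual code:

    try:
        from scrapy.cmdline import execute  # the real scrapy entry point xsscrapy.py calls
    except ImportError as e:
        raise SystemExit("scrapy is not importable (%s); try: pip install scrapy" % e)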

-bash: ./xsscrapy.py: No such file or directory

Hi, I've been having issues trying to install xsscrapy on Ubuntu. Every time I try running ./xsscrapy.py -u http://example.com
I get -bash: ./xsscrapy.py: No such file or directory. I have Python 2.7 installed, and I downloaded the files and dependencies for xsscrapy. Am I doing something wrong? Sorry, I'm new to this.

Thanks

requirements.txt

While installing xsscrapy on Kali Linux I am facing an issue; I have tried with both python2 and python3.
[screenshot: Screenshot 2020-12-17 21:17:45]

Xsscrapy pybloom installing error

Hello,
I am not able to use xsscrapy on Kali Linux 2019.3.
I also checked an older version, but the same error occurs.

/xsscrapy# pip install -r requirements.txt
Collecting Scrapy==1.1.0rc3
Using cached Scrapy-1.1.0rc3-py2.py3-none-any.whl (292 kB)
Collecting pybloom==1.1
Using cached pybloom-1.1.tar.gz (10 kB)
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-_k89_9hy/pybloom/setup.py'"'"'; __file__='"'"'/tmp/pip-install-_k89_9hy/pybloom/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-_k89_9hy/pybloom/pip-egg-info
cwd: /tmp/pip-install-_k89_9hy/pybloom/
Complete output (8 lines):
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-install-_k89_9hy/pybloom/setup.py", line 2, in <module>
    from ez_setup import use_setuptools
  File "/tmp/pip-install-_k89_9hy/pybloom/ez_setup.py", line 98
    except pkg_resources.VersionConflict, e:
                                        ^
SyntaxError: invalid syntax
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

How can I help you?

I work in software development. If there is a problem with any software, I help fix it.
