kangvcar / infospider

INFO-SPIDER is a crawler toolbox 🧰 that brings many data sources together in one place, designed to help users take back their own data safely and quickly; the code is open source and the process is transparent. Supported data sources include GitHub, QQ Mail, NetEase Mail, Alibaba Mail, Sina Mail, Hotmail, Outlook, JD.com, Taobao, Alipay, China Mobile, China Unicom, China Telecom, Zhihu, Bilibili, NetEase Cloud Music, QQ friends, QQ groups, WeChat Moments album generation, browser history, 12306, Cnblogs, CSDN blog, OSChina blog, and Jianshu.

Home Page: https://infospider.vercel.app

License: GNU General Public License v3.0

Python 66.67% Shell 0.06% Jupyter Notebook 2.73% HTML 14.20% CSS 0.22% JavaScript 16.12%
python3 crawl spider selenium wxpython tkinter automation hotmail chrome csdn

infospider's Introduction

InfoSpider logo


GitHub stars | GitHub repo size

A magic toolbox to take back your personal information.

👉⚡ Usage Guide ⚡ | Video Demo | English | Get the latest maintained version | TG group

🗣️ TG group: join the group

Developer's Memoir

Click to expand 👉 Developer's Memoir

Scenario 1

Xiao Ming is browsing forums in Chrome as usual when he accidentally clicks an ad and is redirected to JD.com. As he reaches to close the window, he pauses (inner voice: huh? How does JD know exactly which item I've been longing for? It's just what I need!). Since the page is open anyway, he checks the product details (inner voice: not bad at all) and decides to place an order.

Scenario 2

Xiao Bai is hooked on NetEase Cloud Music's daily recommendation playlist (inner voice: wow, how is every track in this playlist exactly my style? NetEase Cloud Music is amazing, it really gets me, I have to get the Vinyl VIP membership!), then browses Zhihu questions like "How to elegantly do XXX?", "What is it like to XXX?" and "How would you evaluate XXX?" (inner voice: hey, that's exactly the question I wanted to ask, and someone already asked it! What?! Thousands of answers?! Let's dive in!).

Scenario 3

Xiao Da keeps learning on the job, browsing the big tech communities such as Cnblogs, CSDN, OSChina, Jianshu and Juejin, and finds the homepage recommendations spot on (inner voice: these posts are great, and I didn't even have to search for them). Opening his own blog, he realizes he has now been writing for three years and his tech stack keeps growing (inner voice: why doesn't the blog backend provide any data analysis? I want to see how many posts I've written over these years and when, which posts are popular, which technologies I've spent the most time on, and whether my past creative peaks were in the evening or in the small hours. I wish the system gave me more guiding data so I could write better!).

Looking at these scenarios, you might marvel at how advancing technology has greatly improved the way we live.

But think a little deeper: every website you browse and every website you register on is recording your information and your footprints.

The chilling part is that your personal data is laid bare on the internet, and many companies profit enormously from it, for example by collecting and analyzing user data to push targeted ads and charge high advertising fees, while you, the producer of that data, get no share of the proceeds.

The Idea

Imagine a tool that could help you take back your personal information, aggregate the personal data scattered across all kinds of sites, analyze that data and give you advice, and visualize it so you can understand yourself more clearly.

Would you need such a tool? Would you like such a tool?

With that in mind, I started building INFO-SPIDER 👇👇👇

What is INFO-SPIDER

INFO-SPIDER is a crawler toolbox that brings many data sources together in one place, designed to help users take back their own data safely and quickly; the code is open source and the process is transparent. It also provides data analysis, generating chart files from your data so you can understand your information more intuitively and deeply. Currently supported data sources include GitHub, QQ Mail, NetEase Mail, Alibaba Mail, Sina Mail, Hotmail, Outlook, JD.com, Taobao, Alipay, China Mobile, China Unicom, China Telecom, Zhihu, Bilibili, NetEase Cloud Music, QQ friends, QQ groups, WeChat Moments album generation, browser history, 12306, Cnblogs, CSDN blog, OSChina blog, and Jianshu.

For detailed instructions, see the usage documentation and the video tutorial.

You can chat and learn with us on Gitter.

Features

  • Safe and reliable: the project is open source with concise code; all source is visible and it runs locally.
  • Easy to use: a GUI is provided; just click the data source you want and follow the prompts.
  • Clear structure: every data source is independent of the others and highly portable; all spider scripts live in the project's Spiders directory.
  • Rich data sources: 24+ data sources are currently supported, with more on the way.
  • Unified data format: all crawled data is stored as JSON, which makes later analysis easy.
  • Rich personal data: the project crawls as much of your personal data as it can; you can prune it later as needed.
  • Data analysis: visual analysis of your personal data is provided, currently only for some sources.
  • Rich documentation: complete usage documentation and a video tutorial are included.

Screenshot

screenshot.png

QuickStart

Install the dependencies

  1. Install Python 3 and the Chrome browser

  2. Install a ChromeDriver that matches your Chrome version (a quick sanity check is sketched below)

  3. Install the Python dependencies: pip install -r requirements.txt

If you run into trouble at this step, you can get the installation-free build of InfoSpider instead.
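A simple way to verify step 2 is to start a Selenium session and print both versions. This is only an illustrative check, assuming chromedriver is on your PATH; it is not part of InfoSpider itself.

    # Illustrative sanity check: Selenium can start Chrome, and the driver
    # version matches the browser version (chromedriver must be on PATH).
    from selenium import webdriver

    driver = webdriver.Chrome()
    caps = driver.capabilities
    print("Chrome version:      ", caps.get("browserVersion"))
    print("ChromeDriver version:", caps.get("chrome", {}).get("chromedriverVersion"))
    driver.quit()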

Run the tool

  1. Enter the tools directory

  2. Run python3 main.py

  3. In the window that opens, click the button for a data source and choose a folder to save the data when prompted

  4. Enter your username and password in the browser that pops up; crawling starts automatically and the browser closes when it finishes

  5. The downloaded data (xxx.json) and the analysis charts (xxx.html) can be found in the folder you chose (see the sketch below)
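Because everything is saved as JSON, the results are easy to inspect with a few lines of Python. A minimal sketch, assuming the save folder chosen in step 3 (the actual file names depend on the data sources you selected):

    import json
    from pathlib import Path

    # Folder chosen in step 3; replace with your own save path.
    data_dir = Path(r"C:\path\to\your\save\folder")

    for json_file in data_dir.glob("*.json"):
        with json_file.open(encoding="utf-8") as f:
            records = json.load(f)
        # Show the file name and how many top-level items it holds.
        count = len(records) if isinstance(records, (list, dict)) else 1
        print(json_file.name, count)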

Paid Service

Limited availability... take a look

  1. The latest maintained version of InfoSpider
  2. More comprehensive personal data analysis
  3. No dependencies to install; convenient and beginner-friendly
  4. A pre-packaged program that runs with a double click
  5. A hands-on guide to packaging InfoSpider yourself
  6. One-on-one technical support from the developer
  7. A free upgrade to the upcoming 2.0 release after purchase

wechat
Purchase link

Data Sources

  • GitHub
  • QQ Mail
  • NetEase Mail
  • Alibaba Mail
  • Sina Mail
  • Hotmail
  • Outlook
  • JD.com
  • Taobao
  • Alipay
  • China Mobile
  • China Unicom
  • China Telecom
  • Zhihu
  • Bilibili
  • NetEase Cloud Music
  • QQ friends (cjh0613)
  • QQ groups (cjh0613)
  • WeChat Moments album generator
  • Browser history
  • 12306
  • Cnblogs
  • CSDN blog
  • OSChina blog
  • Jianshu

Data Analysis

  • Cnblogs
  • CSDN blog
  • OSChina blog
  • Jianshu
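According to the changelog, the analysis charts are drawn with pyecharts and written as HTML files into the data directory. A minimal illustrative sketch of that kind of output; the chart type, data and file name here are made up, not taken from the project:

    # Illustrative only: render a made-up "posts per year" bar chart as HTML,
    # the same kind of artifact the analysis step drops next to the data.
    from pyecharts import options as opts
    from pyecharts.charts import Bar

    chart = (
        Bar()
        .add_xaxis(["2018", "2019", "2020"])   # example years
        .add_yaxis("posts", [12, 34, 56])      # example counts
        .set_global_opts(title_opts=opts.TitleOpts(title="Posts per year"))
    )
    chart.render("cnblogs_analysis.html")      # hypothetical output file name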

Roadmap

  • Provide a web UI so the tool works across platforms
  • Run statistical analysis on the crawled personal data
  • Apply machine learning and natural language processing for deeper analysis
  • Plot the analysis results as charts for an intuitive view
  • Add more data sources...

Visitors

Developers want to say

  1. The project tackles the pain point that personal data is scattered across many different companies, forming data silos that keep multidimensional data from being combined.
  2. The author believes the project's greatest potential lies in fusing those multiple dimensions and analyzing the personal data, maximizing the value users get from their own data.
  3. Because the project gathers data by crawling, it is time-sensitive: it needs continuous maintenance and must be updated whenever the target sites change.
  4. The project has a clear structure; every data source is independent and highly portable, and all spider scripts live in the project's Spiders directory, so they can be ported into your own programs.
  5. The current v1.0 release has only been tested on Windows with Python 3.7 and has not been adapted to other platforms.
  6. A v2.0 rewrite is planned, providing a web UI and data visualization to support multiple platforms.
  7. The INFO-SPIDER code is open source; stars are welcome.

Contributors

Sponsors

Thank you to JetBrains for providing an Open Source License for PyCharm!

This repository is updated from time to time; to get the latest maintained version, please purchase to support the project. Thank you!

Changelog

Click to expand the Changelog
  • July 10, 2020

    1. Updated the GUI layout
    2. Added the GitHub, QQ friends and QQ groups data sources
  • July 12, 2020

    1. Fixed the QQ Mail, NetEase Mail, Alibaba Mail, Sina Mail, Hotmail and Outlook data sources
    2. Added the WeChat Moments album generator
  • July 14, 2020

    1. Fixed the JD.com, Taobao, Alipay and 12306 data sources
    2. Added Chrome browsing-history support
  • July 17, 2020

    1. Fixed the China Mobile and China Unicom data sources
    2. Added the Zhihu, Bilibili and NetEase Cloud Music data sources
  • July 19, 2020

    1. Added the Cnblogs, CSDN, OSChina and Jianshu data sources
    2. Wrote the usage documentation
    3. Recorded the video tutorial
  • July 30, 2020

    1. Added data analysis for Cnblogs
    2. Charts are drawn with pyecharts and saved as HTML files in the data directory
  • August 18, 2020

    1. Fixed some bugs
    2. Updated README.md
  • September 12, 2020

    1. Replaced the project logo
  • October 20, 2020

    1. Updated all spider scripts
    2. Built the Python-embed edition of InfoSpider
    3. Updated the logo
  • November 29, 2020

    1. Updated spider scripts

License

GPL-3.0

Star History

Star History Chart

infospider's People

Contributors

0ctl0, charleshua666, dependabot[bot], hzherrr, hzqmwne, kangvcar, mydatahomes, wuzhisheng, yarnauy, zhouhaocheng


infospider's Issues

Pretty nice, but why use Tk for the GUI? I'm curious.

I just saw this project in a WeChat public-account post. It's really nice, quite complete and good-looking. Speaking as a developer, the tool itself isn't something I'd use often, but for your stated goal of collecting your own personal information it makes perfect sense. Also, there's no real need to build the GUI with tkinter; Tk sometimes crashes and isn't great to work with. My suggestion would be to build a web front end instead. Good!

Taobao and Alipay are not well supported; an exception is thrown

Bug Report

Traceback (most recent call last):
  File "main.py", line 520, in OnClick
    t = TaobaoSpider(cookie_list)
  File "E:\my_work_spaces\pycharm\Self_learn_projs\Crawler_projs\InfoSpider-master./Spiders\taobao\spider.py", line 65, in __init__
    self.path = askdirectory(title='选择信息保存文件夹')
  File "G:\py37\lib\tkinter\filedialog.py", line 428, in askdirectory
    return Directory(**options).show()
  File "G:\py37\lib\tkinter\commondialog.py", line 39, in show
    w = Frame(self.master)
  File "G:\py37\lib\tkinter\__init__.py", line 2744, in __init__
    Widget.__init__(self, master, 'frame', cnf, {}, extra)
  File "G:\py37\lib\tkinter\__init__.py", line 2299, in __init__
    (widgetName, self._w) + extra + self._options(cnf))
RuntimeError: main thread is not in main loop
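"RuntimeError: main thread is not in main loop" is a general tkinter constraint rather than anything Taobao-specific: Tk widgets, including the askdirectory dialog, must be created on the main thread, while the spider here calls it from a worker thread. A minimal sketch of the usual workaround, not the project's actual fix: ask for the folder on the main thread first, then hand the path to the worker.

    # Sketch of the usual workaround: keep all tkinter calls on the main thread,
    # then pass the chosen path to the background thread that runs the spider.
    import threading
    import tkinter as tk
    from tkinter.filedialog import askdirectory

    def crawl(save_path: str) -> None:
        ...  # placeholder for the spider work; safe to run off the main thread

    root = tk.Tk()
    root.withdraw()                                    # only the dialog is needed
    path = askdirectory(title="Choose a save folder")  # main-thread tkinter call
    root.destroy()
    threading.Thread(target=crawl, args=(path,)).start()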

Are there plans to support Pinduoduo?


GITHUB

Hello, I have other projects to consult about. Please add my QQ 3374835496 or Skype live:.cid.b409052f6258136f

Is there no Weibo data source?


No module named 'Spiders'

Running main.py throws:
Traceback (most recent call last):
  File "main.py", line 32, in <module>
    from Spiders.A12306 import main12306
ModuleNotFoundError: No module named 'Spiders'

Adding a line sys.path.append(BASE_PATH) right after print(BASE_PATH) makes it run.
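For reference, the workaround in context; the BASE_PATH definition below is an assumption for illustration, and only the appended line and the failing import come from this issue.

    # In tools/main.py (sketch): put the project root on sys.path so that
    # `from Spiders... import ...` resolves when main.py is run from tools/.
    import os
    import sys

    BASE_PATH = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))  # assumed definition
    print(BASE_PATH)
    sys.path.append(BASE_PATH)  # the one-line workaround suggested above

    from Spiders.A12306 import main12306  # the import from the traceback now resolves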

The taobao spider doesn't seem to work; does taobao_cookies.json need to be replaced?


I ran into this error while using it

Traceback (most recent call last):
  File "D:\Working\Codes\InfoSpider\tools\main.py", line 34, in <module>
    from alipay.main import ASpider
ModuleNotFoundError: No module named 'alipay.main'

Error while installing dependencies, with a temporary workaround

Installing the first dependency fails with: UnicodeDecodeError: 'gbk' codec can't decode byte 0x93 in position 2621: illegal multibyte sequence
Workaround: change the pinned version
matplotlib==3.2.0 to matplotlib==3.6.0

There is also a prompt, apparently while installing numpy, asking for Microsoft C++ 14 or later.
Workaround: install conda first, then run
conda install libpython m2w64-toolchain -c msys2

Is there no Weibo data source?


Zhihu responds with "please upgrade the client and retry"

Bug Report

{"id":"c9b28ce4b50bf0444d17d010224cb06f","url_token":"houziliaorenwu","name":"猴子","use_default_avatar":false,"avatar_url":"https://pic1.zhimg.com/v2-12ef91a3f1e91e70bd3480d755e058b1_l.jpg?source=32738c0c","avatar_url_template":"https://picx.zhimg.com/v2-12ef91a3f1e91e70bd3480d755e058b1.jpg?source=32738c0c","is_org":false,"type":"people","url":"https://www.zhihu.com/api/v4/people/houziliaorenwu","user_type":"people","headline":"公中号(猴子数据分析)著有畅销书《数据分析思维》 科普**专家","headline_render":"公中号(猴子数据分析)著有畅销书《数据分析思维》科普**专家","gender":1,"is_advertiser":false,"ip_info":"IP 属地北京","vip_info":{"is_vip":true,"vip_type":1,"rename_days":"60","widget":{"id":"13017","url":"https://pic1.zhimg.com/v2-06ff79935442c7b0b2de8bde3529de2a.jpg?source=88ceefae","night_mode_url":"https://pic1.zhimg.com/v2-7cb817a30db30272a00bc17450a2ea79.jpg?source=88ceefae"},"entrance_v2":null,"rename_frequency":3,"rename_await_days":0},"available_medals_count":0,"is_realname":true,"has_applying_column":false}

{
    "error": {
        "code": 10002,
        "message": "10002:\u8bf7\u6c42\u53c2\u6570\u5f02\u5e38\uff0c\u8bf7\u5347\u7ea7\u5ba2\u6237\u7aef\u540e\u91cd\u8bd5"
    }
}

{
    "error": {
        "code": 10002,
        "message": "10002:\u8bf7\u6c42\u53c2\u6570\u5f02\u5e38\uff0c\u8bf7\u5347\u7ea7\u5ba2\u6237\u7aef\u540e\u91cd\u8bd5"
    }
}

{
    "error": {
        "code": 10002,
        "message": "10002:\u8bf7\u6c42\u53c2\u6570\u5f02\u5e38\uff0c\u8bf7\u5347\u7ea7\u5ba2\u6237\u7aef\u540e\u91cd\u8bd5"
    }
}

<html><title>404: Not Found</title><body>404: Not Found</body></html>
{"error":{"message":"请求参数异常,请升级客户端后重试","code":10003}}

{"data": []}

Business promotion cooperation request

Hello author, we are a professional IP proxy service provider, 极速HTTP. Registering and verifying an account comes with 10,000 free IPs (which your users could use for a free trial :). We would like to discuss a possible business promotion cooperation. If you are interested, please contact me on WeChat: 13982004324. Thanks (and if not, sorry for the interruption).

Is macOS not supported?

2020-08-27 11:04:51.534 Python[1657:25291] -[wxNSApplication _setup:]: unrecognized selector sent to instance 0x7fae66c372a0

Isn't this against the law?


[Feature request] Could Renren (人人网) be supported?


how to get all user facebook id


I get this error when installing with pip

value:InfoSpider:% pip install -r requirements.txt                     <master>
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
ERROR: Could not find a version that satisfies the requirement matplotlib==3.2.0 (from -r requirements.txt (line 1)) (from versions: 0.86, 0.86.1, 0.86.2, 0.91.0, 0.91.1, 1.0.1, 1.1.0, 1.1.1, 1.2.0, 1.2.1, 1.3.0, 1.3.1, 1.4.0, 1.4.1rc1, 1.4.1, 1.4.2, 1.4.3, 1.5.0, 1.5.1, 1.5.2, 1.5.3, 2.0.0b1, 2.0.0b2, 2.0.0b3, 2.0.0b4, 2.0.0rc1, 2.0.0rc2, 2.0.0, 2.0.1, 2.0.2, 2.1.0rc1, 2.1.0, 2.1.1, 2.1.2, 2.2.0rc1, 2.2.0, 2.2.2, 2.2.3, 2.2.4, 2.2.5, 3.0.0rc2, 3.0.0, 3.0.1, 3.0.2, 3.0.3)
ERROR: No matching distribution found for matplotlib==3.2.0 (from -r requirements.txt (line 1))
value:InfoSpider:%                                                       <master>

Got an error while installing dependencies

lib-3.2.0-cp38-cp38-win_amd64.whl
Downloading matplotlib-3.2.0-cp38-cp38-win_amd64.whl (9.2 MB)
|██████████▌ | 3.0 MB 4.7 kB/s eta 0:22:02
ERROR: Exception:
Traceback (most recent call last):
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\urllib3\response.py", line 437, in _error_catcher
    yield
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\urllib3\response.py", line 519, in read
    data = self._fp.read(amt) if not fp_closed else b""
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\cachecontrol\filewrapper.py", line 62, in read
    data = self.__fp.read(amt)
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\http\client.py", line 454, in read
    n = self.readinto(b)
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\http\client.py", line 498, in readinto
    n = self.fp.readinto(b)
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\socket.py", line 669, in readinto
    return self._sock.recv_into(b)
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\cli\base_command.py", line 228, in _main
    status = self.run(options, args)
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\cli\req_command.py", line 182, in wrapper
    return func(self, options, args)
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\commands\install.py", line 323, in run
    requirement_set = resolver.resolve(
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\resolution\legacy\resolver.py", line 183, in resolve
    discovered_reqs.extend(self._resolve_one(requirement_set, req))
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\resolution\legacy\resolver.py", line 388, in _resolve_one
    abstract_dist = self._get_abstract_dist_for(req_to_install)
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\resolution\legacy\resolver.py", line 340, in _get_abstract_dist_for
    abstract_dist = self.preparer.prepare_linked_requirement(req)
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\operations\prepare.py", line 467, in prepare_linked_requirement
    local_file = unpack_url(
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\operations\prepare.py", line 255, in unpack_url
    file = get_http_url(
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\operations\prepare.py", line 129, in get_http_url
    from_path, content_type = _download_http_url(
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\operations\prepare.py", line 282, in _download_http_url
    for chunk in download.chunks:
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\cli\progress_bars.py", line 168, in __iter__
    for x in it:
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\network\utils.py", line 64, in response_chunks
    for chunk in response.raw.stream(
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\urllib3\response.py", line 576, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\urllib3\response.py", line 541, in read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\urllib3\response.py", line 442, in _error_catcher
    raise ReadTimeoutError(self._pool, None, "Read timed out.")
pip._vendor.urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out.

Installing dependencies

Error:

Using legacy 'setup.py install' for lxml, since package 'wheel' is not installed.
Installing collected packages: lxml, pyquery, certifi, chardet, idna, requests, Pillow, wxPython, pytz, pandas, future, pypng, pyqrcode, itchat, wxpy, soupsieve, beautifulsoup4
Running setup.py install for lxml ... error
ERROR: Command errored out with exit status 1:
command: 'd:\soft\python\python38\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Administrator\AppData\Local\Temp\pip-install-grl1th1g\lxml\setup.py'"'"'; __file__='"'"'C:\Users\Administrator\AppData\Local\Temp\pip-install-grl1th1g\lxml\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\Administrator\AppData\Local\Temp\pip-record-ohs8ihq1\install-record.txt' --single-version-externally-managed --compile --install-headers 'd:\soft\python\python38\Include\lxml'
cwd: C:\Users\Administrator\AppData\Local\Temp\pip-install-grl1th1g\lxml
Complete output (77 lines):
Building lxml version 4.3.3.
Building without Cython.
ERROR: b"'xslt-config' \xb2\xbb\xca\xc7\xc4\xda\xb2\xbf\xbb\xf2\xcd\xe2\xb2\xbf\xc3\xfc\xc1\xee\xa3\xac\xd2\xb2\xb2\xbb\xca\xc7\xbf\xc9\xd4\xcb\xd0\xd0\xb5\xc4\xb3\xcc\xd0\xf2\r\n\xbb\xf2\xc5\xfa\xb4\xa6\xc0\xed\xce\xc4\xbc\xfe\xa1\xa3\r\n"
** make sure the development packages of libxml2 and libxslt are installed **

Using build configuration of libxslt
running install
running build
running build_py
creating build
creating build\lib.win-amd64-3.8
creating build\lib.win-amd64-3.8\lxml
copying src\lxml\builder.py -> build\lib.win-amd64-3.8\lxml
copying src\lxml\cssselect.py -> build\lib.win-amd64-3.8\lxml
copying src\lxml\doctestcompare.py -> build\lib.win-amd64-3.8\lxml
copying src\lxml\ElementInclude.py -> build\lib.win-amd64-3.8\lxml
copying src\lxml\pyclasslookup.py -> build\lib.win-amd64-3.8\lxml
copying src\lxml\sax.py -> build\lib.win-amd64-3.8\lxml
copying src\lxml\usedoctest.py -> build\lib.win-amd64-3.8\lxml
copying src\lxml\_elementpath.py -> build\lib.win-amd64-3.8\lxml
copying src\lxml\__init__.py -> build\lib.win-amd64-3.8\lxml
creating build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\__init__.py -> build\lib.win-amd64-3.8\lxml\includes
creating build\lib.win-amd64-3.8\lxml\html
copying src\lxml\html\builder.py -> build\lib.win-amd64-3.8\lxml\html
copying src\lxml\html\clean.py -> build\lib.win-amd64-3.8\lxml\html
copying src\lxml\html\defs.py -> build\lib.win-amd64-3.8\lxml\html
copying src\lxml\html\diff.py -> build\lib.win-amd64-3.8\lxml\html
copying src\lxml\html\ElementSoup.py -> build\lib.win-amd64-3.8\lxml\html
copying src\lxml\html\formfill.py -> build\lib.win-amd64-3.8\lxml\html
copying src\lxml\html\html5parser.py -> build\lib.win-amd64-3.8\lxml\html
copying src\lxml\html\soupparser.py -> build\lib.win-amd64-3.8\lxml\html
copying src\lxml\html\usedoctest.py -> build\lib.win-amd64-3.8\lxml\html
copying src\lxml\html\_diffcommand.py -> build\lib.win-amd64-3.8\lxml\html
copying src\lxml\html\_html5builder.py -> build\lib.win-amd64-3.8\lxml\html
copying src\lxml\html\_setmixin.py -> build\lib.win-amd64-3.8\lxml\html
copying src\lxml\html\__init__.py -> build\lib.win-amd64-3.8\lxml\html
creating build\lib.win-amd64-3.8\lxml\isoschematron
copying src\lxml\isoschematron\__init__.py -> build\lib.win-amd64-3.8\lxml\isoschematron
copying src\lxml\etree.h -> build\lib.win-amd64-3.8\lxml
copying src\lxml\etree_api.h -> build\lib.win-amd64-3.8\lxml
copying src\lxml\lxml.etree.h -> build\lib.win-amd64-3.8\lxml
copying src\lxml\lxml.etree_api.h -> build\lib.win-amd64-3.8\lxml
copying src\lxml\includes\c14n.pxd -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\config.pxd -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\dtdvalid.pxd -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\etreepublic.pxd -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\htmlparser.pxd -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\relaxng.pxd -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\schematron.pxd -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\tree.pxd -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\uri.pxd -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\xinclude.pxd -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\xmlerror.pxd -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\xmlparser.pxd -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\xmlschema.pxd -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\xpath.pxd -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\xslt.pxd -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\__init__.pxd -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\etree_defs.h -> build\lib.win-amd64-3.8\lxml\includes
copying src\lxml\includes\lxml-version.h -> build\lib.win-amd64-3.8\lxml\includes
creating build\lib.win-amd64-3.8\lxml\isoschematron\resources
creating build\lib.win-amd64-3.8\lxml\isoschematron\resources\rng
copying src\lxml\isoschematron\resources\rng\iso-schematron.rng -> build\lib.win-amd64-3.8\lxml\isoschematron\resources\rng
creating build\lib.win-amd64-3.8\lxml\isoschematron\resources\xsl
copying src\lxml\isoschematron\resources\xsl\RNG2Schtrn.xsl -> build\lib.win-amd64-3.8\lxml\isoschematron\resources\xsl
copying src\lxml\isoschematron\resources\xsl\XSD2Schtrn.xsl -> build\lib.win-amd64-3.8\lxml\isoschematron\resources\xsl
creating build\lib.win-amd64-3.8\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_abstract_expand.xsl -> build\lib.win-amd64-3.8\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_dsdl_include.xsl -> build\lib.win-amd64-3.8\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_schematron_message.xsl -> build\lib.win-amd64-3.8\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_schematron_skeleton_for_xslt1.xsl -> build\lib.win-amd64-3.8\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_svrl_for_xslt1.xsl -> build\lib.win-amd64-3.8\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\readme.txt -> build\lib.win-amd64-3.8\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
running build_ext
building 'lxml.etree' extension
error: Microsoft Visual C++ 14.0 is required. Get it with "Build Tools for Visual Studio": https://visualstudio.microsoft.com/downloads/
----------------------------------------

ERROR: Command errored out with exit status 1: 'd:\soft\python\python38\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Administrator\AppData\Local\Temp\pip-install-grl1th1g\lxml\setup.py'"'"'; __file__='"'"'C:\Users\Administrator\AppData\Local\Temp\pip-install-grl1th1g\lxml\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\Administrator\AppData\Local\Temp\pip-record-ohs8ihq1\install-record.txt' --single-version-externally-managed --compile --install-headers 'd:\soft\python\python38\Include\lxml' Check the logs for full command output.

kobicoin.com


GITHUB

Brother, could you look into whether there's a way to route the mail server through a proxy? Mainly I need a way to make Linux route all traffic through a proxy, and then switch IPs from the command line, or perhaps some other approach where the mail server switches IPs. A working solution will be paid for. If you have one, please contact QQ 3374835496, email [email protected], or Skype live:.cid.b409052f6258136f.

Looking forward to a macOS version

This is a really nice idea. Could it be extended with keyword-based information search and summarization?
I hope macOS support arrives soon; I'm sure plenty of people are waiting for it as eagerly as I am.

About the Jianshu spider

If the author added a way to fetch data for a specific article, it might improve efficiency.

Looking at the current spider code, the data comes from the user's profile page, but getting it from the article page seems harder: I can't find the corresponding network request in the dev tools.

The fields to crawl are mainly these:

  • Jianshu diamonds (简书钻)
  • Views
  • Publish time
  • Likes
  • Comments

The last two are already solvable. The first three can be found in the HTML, but a plain GET doesn't return them and they don't show up in the network panel, so they are presumably filled in by requests issued from JS. I don't have JS skills, so I can't work through that code.

I've tentatively traced the request to the _app.js file, but I don't know exactly how it is issued; somehow it manages to hide the network request.

Finally, I have my own Jianshu crawler library, JianshuResearchTools on my profile, which also uses Requests and BeautifulSoup4; feel free to take a look, and a few PRs back would be even better.

Thanks to the developer.
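For the two fields that are already obtainable from the HTML, a generic Requests + BeautifulSoup4 sketch; the URL and CSS selectors below are hypothetical placeholders that would have to be read off Jianshu's current markup, and the JS-filled fields (Jianshu diamonds, views, publish time) would instead need a headless browser or the hidden API mentioned above.

    # Generic fetch-and-parse sketch; the selectors are hypothetical placeholders.
    import requests
    from bs4 import BeautifulSoup

    ARTICLE_URL = "https://www.jianshu.com/p/<article-id>"  # placeholder article URL

    resp = requests.get(ARTICLE_URL, headers={"User-Agent": "Mozilla/5.0"}, timeout=10)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")

    likes = soup.select_one(".like-count")        # hypothetical selector
    comments = soup.select_one(".comment-count")  # hypothetical selector
    print("likes:", likes.get_text(strip=True) if likes else "not found")
    print("comments:", comments.get_text(strip=True) if comments else "not found")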

ERROR: Command errored out with exit status 1

pip install -r requirements.txt
Output:

ERROR: Command errored out with exit status 1: /usr/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-pf5_kd92/wxpython/setup.py'"'"'; __file__='"'"'/tmp/pip-install-pf5_kd92/wxpython/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-gjw9u541/install-record.txt --single-version-externally-managed --user --prefix= --compile --install-headers /home/tz/.local/include/python3.8/wxPython Check the logs for full command output.

Crawling failed

systeminfo:
image

C:\Users\stsg0>python -V
Python 3.7.9

C:\Users\stsg0>pip -V
pip 20.2.3 from c:\users\stsg0\appdata\local\programs\python\python37\lib\site-packages\pip (python 3.7)

1. Clicking QQ Mail: no input dialog appears, and the bottom-right corner immediately reports that crawling failed
image
2. Clicking NetEase Mail: the console reports an error
image

ChromeDriver has already been started
image
