k1995 / baiduyunspider
Baidu Cloud netdisk search engine, including the crawler and the website
I modified the code to use a proxy IP, but it still reports an error:
uk:2518160999 error to fetch files,try again later
getShareLists errno:-55
The code is as follows:
import sys
import time
import urllib2

def getHtml(url, ref=None, reget=5):
    try:
        proxies = {'http': '222.194.14.130:808'}
        proxy_support = urllib2.ProxyHandler(proxies)
        # Define the opener
        opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler)
        # urllib2.install_opener(opener)
        request = urllib2.Request(url)
        request.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36')
        if ref:
            request.add_header('Referer', ref)
        # Open through the proxy-aware opener; urllib2.urlopen bypasses
        # the proxy unless install_opener(opener) has been called
        page = opener.open(request, timeout=10)
        html = page.read()
    except Exception:
        if reget >= 1:
            # If getHtml fails, retry up to 5 times
            print 'getHtml error,reget...%d' % (6 - reget)
            time.sleep(2)
            return getHtml(url, ref, reget - 1)
        else:
            print 'request url:' + url
            print 'failed to fetch html'
            sys.exit()
    else:
        return html
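What causes errno=-55? My crawler now gets this error code back every time. Could you share a roughly commented crawler script that I can modify myself? I'd like to avoid some detours, as I'm a Python beginner. Thanks.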
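The meaning of errno -55 is not stated anywhere in this thread. If it turns out to be a throttling response (an assumption), one common approach is to retry with a growing delay instead of a fixed 2 seconds. The sketch below is illustrative only: `fetch_with_backoff` and the `fetch` callable are hypothetical names, and the only detail taken from the log above is that the response carries an `errno` field.

```python
import time


def fetch_with_backoff(fetch, retries=5, base_delay=2.0, sleep=time.sleep):
    """Call fetch() until it reports errno 0 or the retries run out.

    fetch is a hypothetical callable that returns a dict with an
    'errno' key; it is not part of the project's code.
    """
    result = None
    for attempt in range(retries):
        result = fetch()
        if result.get('errno') == 0:
            return result
        # Wait longer after each failure: base_delay, 2x, 4x, ...
        sleep(base_delay * (2 ** attempt))
    return result
```

Passing `sleep` as a parameter keeps the helper easy to try out without actually waiting.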
Hello, I'd like to crawl data by a specified keyword. How can I do that?
bug
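I also wrote a Baidu Cloud search site, www.81ad.cn. It has no ads, calls Baidu's internal API, and already holds more than 30 million records.
Could you add more site pages, or a related-keyword search, to improve the linking between inner pages?
I'd like to understand how this search engine works.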
success to fetched hot users: 24
Traceback (most recent call last):
  File "spider.py", line 475, in <module>
    spider.seedUsers()
  File "spider.py", line 328, in seedUsers
    self.db.commit()
  File "spider.py", line 101, in commit
    self.dbconn.commit()
AttributeError: 'NoneType' object has no attribute 'commit'
Is there any way to fix this?
The operating system is CentOS 7 x64
Python version: 2.7.5
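The traceback shows that `self.dbconn` was still `None` when `commit()` ran, which usually means the database connection was never opened or failed to open. A minimal sketch of a guard that reports this directly is shown here. The `Db` class is a hypothetical stand-in, not the project's wrapper, and `sqlite3` is used only so the example is self-contained; the driver the spider actually uses is not shown in this thread.

```python
import sqlite3


class Db(object):
    """Hypothetical stand-in for the spider's database wrapper."""

    def __init__(self, path):
        self.path = path
        self.dbconn = None  # stays None until connect() succeeds

    def connect(self):
        self.dbconn = sqlite3.connect(self.path)

    def commit(self):
        # Fail with a clear message instead of an AttributeError on None
        if self.dbconn is None:
            raise RuntimeError('database is not connected; call connect() first')
        self.dbconn.commit()
```

With a guard like this, a missing or failed connection shows up as an explicit message, which points toward checking the database host, credentials, and whether the server is running.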
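How can I contact you?
You could already write a crawler in your senior year of college. Respect~
I followed your steps. Is something missing?
Running the command scrapy crawl baidupan keeps producing this error
When I submit a search request, the request URL shown is: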
http://mydomain/s/57un55S15L%2Bd5oqk?from=sf&type=all
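But nginx returns a 404 error every time; the page cannot be found.
This problem has bothered me for several days and is still unresolved.
Could someone please help?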
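Judging from the string itself, the path segment after `/s/` looks like the search keyword encoded as Base64 and then percent-encoded (`%2B` is `+`). This is inferred from the URL, not from the project's code. The sketch below decodes it, which can help confirm what the application is expected to receive once nginx forwards `/s/...` requests to it; `decode_search_segment` is a hypothetical helper name.

```python
import base64

try:
    from urllib.parse import unquote  # Python 3
except ImportError:
    from urllib import unquote  # Python 2


def decode_search_segment(segment):
    """Undo the percent-encoding, then the Base64, and return the text."""
    raw = unquote(segment)
    return base64.b64decode(raw).decode('utf-8')


keyword = decode_search_segment('57un55S15L%2Bd5oqk')
```

If the decoded keyword is what was typed into the search box, the link is well formed, and the 404 more likely comes from how nginx routes the `/s/` path than from the link itself (an assumption, since the nginx configuration is not shown).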
File "c:\users\administrator.win-a3unjobi233\appdata\local\programs\python\python38\lib\site-packages\scrapy\crawler.py", line 89, in crawl
yield self.engine.open_spider(self.spider, start_requests)
redis.exceptions.ConnectionError: Error 10061 connecting to 127.0.0.1:6379. 由于目标计算机积极拒绝,无法连接。.
2021-02-01 10:12:28 [twisted] CRITICAL:
Traceback (most recent call last):
File "c:\users\administrator.win-a3unjobi233\appdata\local\programs\python\python38\lib\site-packages\redis\connection.py", line 559, in connect
sock = self._connect()
File "c:\users\administrator.win-a3unjobi233\appdata\local\programs\python\python38\lib\site-packages\redis\connection.py", line 615, in _connect
raise err
File "c:\users\administrator.win-a3unjobi233\appdata\local\programs\python\python38\lib\site-packages\redis\connection.py", line 603, in _connect
sock.connect(socket_address)
ConnectionRefusedError: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\users\administrator.win-a3unjobi233\appdata\local\programs\python\python38\lib\site-packages\twisted\internet\defer.py", line 1418, in _inlineCallbacks
result = g.send(result)
File "c:\users\administrator.win-a3unjobi233\appdata\local\programs\python\python38\lib\site-packages\scrapy\crawler.py", line 89, in crawl
yield self.engine.open_spider(self.spider, start_requests)
redis.exceptions.ConnectionError: Error 10061 connecting to 127.0.0.1:6379. 由于目标计算机积极拒绝,无法连接。.
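The log shows the crawler being refused on 127.0.0.1:6379, which means nothing was accepting connections on that port when the spider started. A minimal sketch, using only the standard library, of checking the port before running `scrapy crawl`; `port_is_open` is a hypothetical helper, not part of Scrapy or redis-py.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True
```

Calling `port_is_open('127.0.0.1', 6379)` before the crawl shows whether a Redis server is listening; if it returns False, the server needs to be started or the host and port in the settings need to be corrected.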