Comments (8)
The problem is probably this statement:
if server_data['showpin']:
Try replacing it with the following and see:
if server_data.get('showpin', None):
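For context on why this helps: subscripting a dict raises KeyError when the key is missing, while .get() returns a default value instead. A minimal stand-alone illustration (server_data here is just a stand-in dict, not the spider's actual response object):

```python
server_data = {}  # stand-in for a parsed response that lacks the 'showpin' field

# server_data['showpin'] would raise KeyError here;
# .get('showpin', None) returns the default None instead, which is falsy,
# so the branch is simply skipped rather than crashing the task
if server_data.get('showpin', None):
    print("pin check required")
else:
    print("no pin field, continuing")
```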
OK, thanks! But after running a search there is another error...
[2018-04-18 16:20:27,986: INFO/MainProcess] Received task: tasks.search.search_keyword[27f9ef57-4963-4641-883b-f27f866f0f47]
2018-04-18 16:20:27 - crawler - INFO - We are searching keyword "快手"
[2018-04-18 16:20:27,989: INFO/ForkPoolWorker-1] We are searching keyword "快手"
2018-04-18 16:20:27 - crawler - INFO - the crawling url is http://s.weibo.com/weibo/%E5%BF%AB%E6%89%8B&scope=ori&suball=1&page=1
[2018-04-18 16:20:27,990: INFO/ForkPoolWorker-1] the crawling url is http://s.weibo.com/weibo/%E5%BF%AB%E6%89%8B&scope=ori&suball=1&page=1
2018-04-18 16:20:27 - crawler - ERROR - failed to crawl http://s.weibo.com/weibo/%E5%BF%AB%E6%89%8B&scope=ori&suball=1&page=1,here are details:'NoneType' object is not subscriptable, stack is File "/homen_gu/Desktop/weibospider-master/decorators/decorators.py", line 17, in time_limit
return func(*args, **kargs)
[2018-04-18 16:20:27,996: ERROR/ForkPoolWorker-1] failed to crawl http://s.weibo.com/weibo/%E5%BF%AB%E6%89%8B&scope=ori&suball=1&page=1,here are details:'NoneType' object is not subscriptable, stack is File "/homen_gu/Desktop/weibospider-master/decorators/decorators.py", line 17, in time_limit
return func(*args, **kargs)
2018-04-18 16:20:27 - crawler - WARNING - No search result for keyword 快手, the source page is
[2018-04-18 16:20:27,998: WARNING/ForkPoolWorker-1] No search result for keyword 快手, the source page is
[2018-04-18 16:20:27,998: INFO/ForkPoolWorker-1] Task tasks.search.search_keyword[27f9ef57-4963-4641-883b-f27f866f0f47] succeeded in 0.009885783000072479s: None
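A note on the error in the log above: "'NoneType' object is not subscriptable" typically means a parsing step returned None (for example, because the page had no results or the request was not logged in) and the caller then indexed into it. A minimal reproduction of that failure mode, with illustrative names that are not taken from the spider's source:

```python
# stands in for an HTML-parsing helper; returns None when nothing matches
def find_search_results(page_html):
    if 'feed_list' not in page_html:
        return None
    return {'feed_list': []}

# a page served to a logged-out session has no result container
node = find_search_results('<html>login required</html>')
try:
    items = node['feed_list']  # subscripting None raises TypeError
except TypeError as e:
    print(e)  # 'NoneType' object is not subscriptable
```

This is why the later comments focus on whether cookies made it into Redis: without a valid session, the search page comes back without results and the parse step returns None.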
Check whether there are cookies in your Redis, then test manually to confirm that your account can actually be used for search.
There are no cookies in Redis:
(WeiboSpider)lin_gu@ww:~/Desktop/weibospider-master$ ./redis-3.2.9/src/redis-cli
127.0.0.1:6379> auth weibospider
OK
127.0.0.1:6379> keys *
(empty list or set)
127.0.0.1:6379>
The account does work for advanced search.
My configuration is as follows:
redis:
  host: 127.0.0.1
  port: 6379
  password: 'weibospider'
  cookies: 1 # store and fetch cookies
  # store fetched urls and results, so you can decide whether to retry crawling the urls or not
  urls: 2
  broker: 5 # broker for celery
  backend: 6 # backend for celery
  id_name: 8 # user ids and names, for repost info analysis. Can be safely deleted after repost tasks
  # expire_time (hours) for redis db2; if they are useless to you, you can set the value smaller
  expire_time: 48
  # redis sentinel for HA. If you need it, just add sentinel host and port below the sentinel args, like this:
  ###############################
  #sentinel:                    #
  #  - host: 2.2.2.2            #
  #    port: 26379              #
  #  - host: 3.3.3.3            #
  #    port: 26379              #
  #                             #
  ###############################
  sentinel: ''
  master: '' # redis sentinel master name; if you don't need it, just set master: ''
  socket_timeout: 5 # socket timeout for redis sentinel
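One detail worth checking about the redis-cli session above, not raised in the thread: redis-cli connects to db 0 by default, while this config stores cookies in db 1 (`cookies: 1`), so `keys *` in db 0 can come back empty even when cookies were written successfully. Selecting the cookies db explicitly rules that out (password and db number taken from the config above):

```shell
# redis-cli talks to db 0 unless told otherwise;
# -n selects the db, -a supplies the password
./redis-3.2.9/src/redis-cli -a weibospider -n 1 keys '*'
```

The same applies inside an interactive session: run `select 1` before `keys *`.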
Are you sure you are using 1.7.2? The 1.7.2 default config doesn't seem to use this comment style:
###############################
#sentinel:                    #
#  - host: 2.2.2.2            #
#    port: 26379              #
#  - host: 3.3.3.3            #
#    port: 26379              #
#                             #
###############################
In login.py, in the get_session function, try printing the current cookies just before return session:
Cookies.store_cookies(name, session.cookies.get_dict())
print(session.cookies.get_dict())  # add this line, then check whether cookies get printed out at login
return session
After making the change from that issue, I can log in and the cookies are printed, but there are still no cookies in Redis.
The comment style is just a copy-paste problem... I copied the config over from another computer via QQ.
The 1.7.2 config file doesn't look like this. How about downloading the stable version from the releases page and running that instead?
Or, if you know Python, debug redis_db.py and see whether something is going wrong there.
OK, I'll download a fresh copy and give it a try. Thanks!