Comments (2)
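Some additional details I've observed:
- The web version only lets you view the most recent 20,000 posts. When fetching a large number of posts at once, use the time-range filter to crawl in segments. Import each segment into the preview page, then click export data to get all the posts merged together, including the corresponding image link file.
- From experience, fetching 2,000 posts with comments takes about half an hour; without comments the rate is about 100 posts per minute. The speed is kept this low to avoid triggering Weibo's rate limiting with overly frequent requests.
- When crawling repeatedly over a long period, Weibo temporarily bans the IP. In that case, either wait a few dozen minutes before crawling again, switch between Wi-Fi and mobile data, or use a global network proxy to get around the ban. Remember to check "Continue from last record" before running again.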
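To plan a segmented crawl like the one described above, the date range can be split into fixed-size chunks ahead of time, and the rough duration of each run estimated from the observed rates. The following is only a minimal sketch: `splitDateRange`, `estimateMinutes`, and the `DateRange` type are hypothetical names used for illustration and are not part of weibo-archiver, and the rates are just the approximate figures quoted in this comment.

```typescript
// Hypothetical helpers for planning a segmented crawl.
// None of these names come from weibo-archiver itself.
interface DateRange {
  start: Date;
  end: Date;
}

// Split [start, end) into consecutive chunks of at most `daysPerChunk` days.
function splitDateRange(start: Date, end: Date, daysPerChunk: number): DateRange[] {
  if (daysPerChunk <= 0) {
    throw new Error("daysPerChunk must be positive");
  }
  const msPerDay = 24 * 60 * 60 * 1000;
  const stop = end.getTime();
  const ranges: DateRange[] = [];
  let cursor = start.getTime();
  while (cursor < stop) {
    // Clamp the last chunk so it never runs past the requested end.
    const next = Math.min(cursor + daysPerChunk * msPerDay, stop);
    ranges.push({ start: new Date(cursor), end: new Date(next) });
    cursor = next;
  }
  return ranges;
}

// Rough estimate based on the figures quoted in the comment (assumed, not measured here):
// about 2,000 posts per 30 minutes with comments, about 100 posts per minute without.
function estimateMinutes(postCount: number, withComments: boolean): number {
  return withComments ? (postCount * 30) / 2000 : postCount / 100;
}

// Example: one year of posts, crawled in 90-day segments.
const chunks = splitDateRange(
  new Date("2020-01-01T00:00:00Z"),
  new Date("2021-01-01T00:00:00Z"),
  90,
);
for (const chunk of chunks) {
  console.log(chunk.start.toISOString(), "to", chunk.end.toISOString());
}
console.log("Estimated minutes for 2000 posts with comments:", estimateMinutes(2000, true));
```

Each resulting range could then be entered into the time-range filter by hand, one run per segment, with the exports imported into the preview page one after another.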
from weibo-archiver.
I've put together a rough fix for this issue and slowed the crawl speed down a bit: weibo-archiver.user.js.zip
from weibo-archiver.
Related Issues (18)
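- Is there any hope for a complete beginner who knows nothing about programming? HOT 3
- It would be great to be able to import into Fediverse platforms such as GTS or Mastodon HOT 1
- About the new version (v0.1.11): changes to the preview method and the script HOT 5
- A question I'd like to ask HOT 3
- Project Todo
- Unable to fetch comments on posts from around 2015 HOT 2
- Image link regex is too broad and matches the wrong image URLs HOT 2
- Failed to fetch hot comments HOT 1
- Pagination switching logic is wrong
- Text parsing is wrong; comments with images are detected incorrectly
- [web]: Odd UI overflow
- [monkey]: Collapse the menu by default
- Freezes when the data volume is too large HOT 2
- monkey: Wrong detection of when a batch run has finished
- Change the export format and also export the following list HOT 1
- After modifying and importing a JSON file saved by another project, multi-image posts do not display correctly HOT 2
- After additionally exporting the last few days of posts and importing them, the import reports success but the posts do not show up. HOT 8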