stupeter / autohome_spider
A crawler for Autohome (汽车之家) that works around the site's font-based anti-scraping.
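Hello! Were the site APIs used in the original project found in the website's source code? I would like to update your project and build on it, but I don't know where to start. Thank you!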
KeyError Traceback (most recent call last)
in
50 #updateCarUrl()
51 # Main crawler program
---> 52 main(pageIndex=1, BBSId=3411)
in main(pageIndex, BBSId)
23 auto = AutoHomeSpider(nlp=True)
24 # Select the forum page; pageindex is the page number and bbsid is the car model code
---> 25 topic = auto.analysis_forumPost(pageindex=pageIndex, bbsid=BBSId)
26 # Loop over and crawl every post on this page
27 for postUrl in topic['url']:
~\爬虫用\Autohome\AutoHomeSpiderClass.py in analysis_forumPost(self, pageindex, bbsid)
125 }
126 res = requests.get(self.forumApi, headers=self.headers, params=params, timeout=5)
--> 127 postList = res.json()['result']['list']
128 topic = dict()
129 topic['url'] = list() # List of post URLs
KeyError: 'result'
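The traceback shows only that the JSON returned by the forum API had no top-level 'result' key; it does not show what the API returned instead. A minimal sketch of a defensive lookup follows. The helper name `extract_post_list` is hypothetical and not part of the project, and the only schema assumed is the `['result']['list']` path visible at line 127 of the traceback.

```python
from typing import Any


def extract_post_list(payload: Any) -> list:
    """Return payload['result']['list'], or raise an error that shows what came back.

    Hypothetical helper: only the 'result' -> 'list' path is taken from the
    traceback; everything else about the response is unknown.
    """
    if not isinstance(payload, dict):
        raise ValueError(f"expected a JSON object, got {type(payload).__name__}")
    result = payload.get("result")
    if not isinstance(result, dict) or "list" not in result:
        # Report the keys that did arrive; they may carry an error code or message.
        raise ValueError(
            f"no 'result' -> 'list' in response; top-level keys: {sorted(payload)}"
        )
    return result["list"]
```

In `analysis_forumPost`, the failing line could then read `postList = extract_post_list(res.json())`, so that a changed or rejected response reports its actual keys rather than a bare KeyError.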
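Thanks to the author for sharing. The project is fairly full-featured and the comments are easy to follow.
While using the comment sentiment scoring feature I found a small bug: in the analysis_Post function of the AutoHomeSpider class, the initialization of post['postSentiments'] should sit outside the inner loop; otherwise an error is raised.
I also have a small suggestion about the natural language processing: some of the scores seem counterintuitive. The main cause of the bias is probably irrelevant information interfering with the judgment (e.g. an owner praises the weather, which raises the score even though it is not a direct assessment of the car). I don't know whether the snownlp package has a function for separating out this kind of information, but it may be worth filtering the text first.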
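The fix described here can be sketched as follows. The loop structure is an assumption, since the body of analysis_Post is not shown in this thread: `replies`, `sentences` and the `score` callback are hypothetical names, and only the key `post['postSentiments']` comes from the report.

```python
from typing import Callable, Dict, List


def score_replies(
    replies: List[List[str]], score: Callable[[str], float]
) -> Dict[str, list]:
    """Collect one sentiment score per sentence across all replies.

    Hypothetical stand-in for the relevant part of analysis_Post.
    """
    post: Dict[str, list] = {}
    # Initialise once, before any loop, so the key exists even when a reply
    # has no sentences and is never reset by a later iteration.
    post["postSentiments"] = []
    for sentences in replies:        # outer loop: one reply at a time
        for sentence in sentences:   # inner loop: sentences in that reply
            post["postSentiments"].append(score(sentence))
    return post
```

With the initialization inside the inner loop, a reply with no sentences would leave the key undefined and each iteration would overwrite the scores collected so far.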
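One way to try the pre-filtering idea is to split each comment into sentences and score only those that mention the car. The sketch below is an illustration under assumptions, not something the project or snownlp provides: the keyword list `CAR_KEYWORDS`, the regex sentence splitter and the function names are all hypothetical, and `score` is whatever scoring function is already in use.

```python
import re
from typing import Callable, List, Optional, Sequence

# Hypothetical keyword list (car, fuel consumption, power, space, interior, chassis);
# a real list would need tuning against actual forum posts.
CAR_KEYWORDS = ("车", "油耗", "动力", "空间", "内饰", "底盘")


def split_sentences(text: str) -> List[str]:
    """Split on common Chinese and ASCII sentence-ending punctuation and newlines."""
    parts = re.split(r"[。！？!?；;\n]+", text)
    return [p.strip() for p in parts if p.strip()]


def filtered_score(
    text: str,
    score: Callable[[str], float],
    keywords: Sequence[str] = CAR_KEYWORDS,
) -> Optional[float]:
    """Average the score of sentences containing a keyword.

    Returns None when no sentence mentions any keyword, so the caller can
    skip the comment rather than score unrelated chatter.
    """
    relevant = [s for s in split_sentences(text) if any(k in s for k in keywords)]
    if not relevant:
        return None
    return sum(score(s) for s in relevant) / len(relevant)
```

If snownlp is installed, `score` could be `lambda s: SnowNLP(s).sentiments`. Keyword matching is crude and will miss indirect remarks about the car, so this is a starting point rather than a full relevance filter.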
The run reports success, but nothing has been written to the CSV file.
Could the anti-scraping strategy have been updated?