bilibili-user's Introduction

Hi there, I'm Airing

🐻 A Web developer🎯 from China.

  • 🌱 I’m currently working on C++, iOS, Cocos, React and Flutter
  • 📫 How to reach me: You may follow me on my blog (ursb.me) or Zhihu
  • 📢 Personal Telegram Channel: t.me/airingchannel
  • 📝 Resume

bilibili-user's People

Contributors

airingursb, applenice, cloudsere, yuancao1996

bilibili-user's Issues

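Problem with the code at line 51

If urls = [] is placed inside the for loop, crawling stops after 100 items; placed outside the for loop, it works fine.

HTTP status code is 200, but status is false

I'm new to web scraping. I copied your code to run a test, but I can't find where the error is yet. Any pointers would be appreciated.

Found the problem, haha.

Output: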
"F:\Program Files\python\python.exe" "F:/Program Files/py3/text/bilibiliAPI_1.py"
b'{"status":false,"data":"\u53c2\u6570\u9519\u8bef"}'

Process finished with exit code 0
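
The data field in the output above is a JSON string written with Unicode escapes, so the message is not readable until the body is parsed. A minimal sketch of decoding it, using only the bytes shown in the output (the variable names are illustrative):

```python
import json

# Raw response body copied from the output above.
raw = rb'{"status":false,"data":"\u53c2\u6570\u9519\u8bef"}'

body = json.loads(raw.decode("utf-8"))
print(body["status"])  # False
print(body["data"])    # the Chinese phrase for "parameter error"
```

The decoded message means "parameter error": the server accepted the connection, which is why the HTTP status is 200, but rejected the parameters it was sent.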

Source code:

import datetime
import time

import requests


def datetime_to_timestamp_in_milliseconds(d):
    # The argument d is ignored; the current time in milliseconds is returned.
    return int(round(time.time() * 1000))


# The timestamp is appended as the "_" query parameter.
url = ('http://space.bilibili.com/ajax/member/GetInfo?mid=5025594&_='
       + str(datetime_to_timestamp_in_milliseconds(datetime.datetime.now())))

head = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36',
    'X-Requested-With': 'XMLHttpRequest',
    'Referer': 'http://space.bilibili.com/5025594/',
    'Origin': 'http://space.bilibili.com',
    'Host': 'space.bilibili.com',
    'AlexaToolbar-ALX_NS_PH': 'AlexaToolbar/alx-4.0',
    'Accept-Language': 'zh-CN,zh;q=0.8,en;q=0.6,ja;q=0.4',
    'Accept': 'application/json, text/javascript, */*; q=0.01',
}
jscontent = requests.post(url, headers=head).content
print(jscontent)

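Follower count, following count, and article view count cannot be retrieved

After changing the database settings I ran the crawler and found that the follower count, following count, article count, and so on were all 0. I printed the exception caught by the try statement, and it shows Expecting value: line 1 column 1 (char 0).
However, when I enter the request URL in a browser, I get a normal JSON document. I'm not sure what is going on.
Error output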
Succeed get user info: 0 1.0389645099639893
Expecting value: line 1 column 1 (char 0)
Succeed get user info: 1 3.0973610877990723
Expecting value: line 1 column 1 (char 0)
Succeed get user info: 2 4.11681866645813
Expecting value: line 1 column 1 (char 0)
Succeed get user info: 3 5.138977527618408
Expecting value: line 1 column 1 (char 0)

Source code with error printing added

                try:
                    res = requests.get(
                        'https://api.bilibili.com/x/relation/stat?vmid=' + str(mid) + '&jsonp=jsonp').text
                    viewinfo = requests.get(
                        'https://api.bilibili.com/x/space/upstat?mid=' + str(mid) + '&jsonp=jsonp').text
                    js_fans_data = json.loads(res)
                    js_viewdata = json.loads(viewinfo)
                    following = js_fans_data['data']['following']
                    fans = js_fans_data['data']['follower']
                    archiveview = js_viewdata['data']['archive']['view']
                    article = js_viewdata['data']['article']['view']
                except Exception as e:
                    print(e)
                    following = 0
                    fans = 0
                    archiveview = 0
                    article = 0
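
The message Expecting value: line 1 column 1 (char 0) is what json.loads raises when the text it receives is empty or is not JSON at all, for example an HTML error page. A minimal sketch of a parser that makes this case visible instead of silently writing zeros; the function name parse_stat and the sample payloads are hypothetical, and the field names are taken from the code above:

```python
import json


def parse_stat(text):
    """Return (following, follower) from a response body, or None if it is not usable JSON."""
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as exc:
        # An empty body or an HTML page ends up here.
        print("Not JSON:", exc, "| body starts with:", repr(text[:40]))
        return None
    if not isinstance(payload, dict) or not isinstance(payload.get("data"), dict):
        return None
    data = payload["data"]
    return data.get("following", 0), data.get("follower", 0)


# Made-up sample bodies: one valid, one empty, one HTML page.
good = parse_stat('{"code": 0, "data": {"following": 12, "follower": 345}}')
bad = parse_stat("")
html = parse_stat("<html>blocked</html>")
print(good, bad, html)
```

Printing the start of the body, ideally together with the HTTP status code, shows whether the server sent an error page or an empty response. A browser sends its own User-Agent and cookies, which the bare requests.get calls above do not, so that is one possible but unconfirmed reason for the difference.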

Crawl interrupted

After starting, the crawler stops after fetching a few dozen records. Running it again gives the same result.

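I really like your crawler data analysis

I really like the charts you drew; they are excellent, simply perfect.
I'd like to do this kind of data analysis too. Do you take apprentices?
My WeChat: ZFL420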

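Syntax error when writing to the database

Has Bilibili's API stopped working? I get a syntax error when writing to the database, and the only thing I changed was the host information.
The error output is below: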
Succeed get user info: 0 1.3978490829467773
(1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'rank, face, regtime, spacesta, birthday, sign, l' at line 1")
Succeed get user info: 1 3.202674627304077
(1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'rank, face, regtime, spacesta, birthday, sign, l' at line 1")
Succeed get user info: 2 4.626889705657959
(1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'rank, face, regtime, spacesta, birthday, sign, l' at line 1")
Succeed get user info: 3 6.0370707511901855
(1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'rank, face, regtime, spacesta, birthday, sign, l' at line 1")
Succeed get user info: 4 7.459824562072754
(1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'rank, face, regtime, spacesta, birthday, sign, l' at line 1")
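
The error text points at the column list beginning with rank. RANK became a reserved word in MySQL 8.0.2, so an unquoted column named rank is a syntax error on MySQL 8.0 and later. That is one plausible cause here, though the thread does not state the server version. A minimal sketch of building the INSERT with backtick-quoted identifiers; the table name and the helper names are assumptions for illustration, and the columns are the ones visible in the error message:

```python
def quote_identifier(name):
    """Wrap a MySQL identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def build_insert(table, columns):
    """Build a parameterized INSERT statement with every identifier quoted."""
    column_list = ", ".join(quote_identifier(c) for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return "INSERT INTO {} ({}) VALUES ({})".format(
        quote_identifier(table), column_list, placeholders)


# Hypothetical table name; columns copied from the error message above.
sql = build_insert(
    "bilibili_user_info",
    ["rank", "face", "regtime", "spacesta", "birthday", "sign"])
print(sql)
```

With PyMySQL the resulting string can be passed as cursor.execute(sql, values), where values is a tuple in the same column order.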

What is causing this connection failure?

HTTPConnectionPool(host='218.85.133.62', port=80): Max retries exceeded with url: http://space.bilibili.com/ajax/member/GetInfo (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPConnection object at 0x03939230>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond',)))
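
The ProxyError in this message says the proxy at 218.85.133.62:80 could not be reached, so the request never got as far as Bilibili. A minimal sketch of trying several proxies in turn and moving on when one is dead; fetch_with_fallback, fake_fetch, and the example.com proxy addresses are hypothetical stand-ins, not part of the project:

```python
def fetch_with_fallback(fetch, url, proxies):
    """Try each proxy in order and return the first successful result.

    fetch is any callable taking (url, proxy) that raises OSError
    (or a subclass) when the connection fails.
    """
    if not proxies:
        raise ValueError("no proxies given")
    last_error = None
    for proxy in proxies:
        try:
            return fetch(url, proxy)
        except OSError as exc:
            last_error = exc
            print("Proxy failed:", proxy, "->", exc)
    raise last_error


def fake_fetch(url, proxy):
    # Stand-in for a real HTTP call: only one proxy "works".
    if proxy != "http://proxy-b.example.com:80":
        raise ConnectionError("Cannot connect to proxy.")
    return "ok via " + proxy


result = fetch_with_fallback(
    fake_fetch,
    "http://space.bilibili.com/ajax/member/GetInfo",
    ["http://proxy-a.example.com:80", "http://proxy-b.example.com:80"])
print(result)
```

With the requests library, fetch could wrap a call that passes proxies={"http": proxy} and a timeout; the exceptions requests raises derive from OSError, so the same loop would apply.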

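Project Weekly Report (31 August 2019 - 7 September 2019)

ISSUES

Last week there was 1 new issue.
This issue has been closed.

CLOSED ISSUES

❤️ #50 Ran it once and hit a database 1064 error, then searched for ages without finding the cause. Not sure whether it's an encoding problem, by a709046532

PULL REQUESTS

Last week pull requests were created, updated, or merged.

COMMITS

There were no commits last week.

CONTRIBUTORS

There were no contributors last week.

STARGAZERS

Last week the project gained 12 stars, from:

⭐ jujube-framework | ⭐ SpencerRaw | ⭐ layneChan | ⭐ hujiexin77 | ⭐ Saki9 | ⭐ jet31 | ⭐ zuolining | ⭐ LOGHORIZION | ⭐ kk233333 | ⭐ feihua813 | ⭐ yistudent0112 | ⭐ ZZemptypoint |
You all are the stars! 🌟

That's all for this week's project report. You can click weekly-digest to view past reports.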

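Meaning of the database fields, and a data problem

I crawled more than five thousand records. Apart from fields such as id, name, sex, face, registration time, birthday, signature, and level, everything else is identical across records, and the follower, following, and article fields are all 0. What is going on? Also, could you explain what each field means? Thank you very much!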

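Project Weekly Report (24 August 2019 - 31 August 2019)

ISSUES

There were no new issues last week.

PULL REQUESTS

Last week pull requests were created, updated, or merged.

COMMITS

There were no commits last week.

CONTRIBUTORS

There were no contributors last week.

STARGAZERS

Last week the project gained 8 stars, from:

⭐ yuchent | ⭐ BB-Code | ⭐ hackstoic | ⭐ FangnaF | ⭐ i0Ek3 | ⭐ IRendy | ⭐ OKdongge | ⭐ wh1994 |
You all are the stars! 🌟

That's all for this week's project report. You can click weekly-digest to view past reports.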

Changes to the get_face code for Python 3

The author's earlier Python 2 version

import urllib
import re

f = open("/Users/airing/Documents/work/Data/bilibili_user_face.txt")
line = f.readline()
for i in range(1, 1000):
    print line,
    if re.match('http://static.*', line):
        line = f.readline()
        print 'noface:' + str(i)
    else:
        path = r"/Users/airing/Documents/work/Data/face/" + str(i) + ".jpg"
        data = urllib.urlretrieve(line, path)
        line = f.readline()
        print 'succeed:' + str(i)

f.close()

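The modified Python 3 version

# -*- coding: utf-8 -*-

import re
import urllib.request

# One avatar URL per line; the paths are from the poster's machine.
with open("/Users/11320/Documents/work/Data/bilibili_user_face.txt") as f:
    for i, line in enumerate(f, start=1):
        if i >= 1000:
            break  # same limit as range(1, 1000) in the Python 2 version
        url = line.strip()  # drop the trailing newline before using the URL
        print(url)
        if re.match('http://static.*', url):
            # URLs on the static host are reported as "noface" and skipped.
            print('noface:' + str(i))
        else:
            path = r"/Users/11320/Documents/work/Data/face/" + str(i) + ".jpg"
            urllib.request.urlretrieve(url, path)
            print('succeed:' + str(i))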

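The crawl stops after 99 records each time, and rerunning the script produces the same result set

On the first run of bilibili_user.py, the first line of output was Succeed get user info: 521401 0.5652303695678711 and the last was Error: https://space.bilibili.com/521499, after which the script exited on its own. Checking the database, I found 86 records, meaning 13 records were not crawled (which I don't mind).
On the second run it again started from ID 521401 and behaved exactly as before (apart from the timings). The database grew to 172 records, and on inspection every ID appears twice.
My questions:
1. How can I crawl an arbitrary number of records in one run, rather than exiting after 99?
2. How can a second run continue from where the previous one stopped instead of starting over? (Getting the same result set every time is pointless.)
3. If question 2 isn't possible, how can I avoid crawling the same records again when the script is rerun?
I'm a beginner, so please bear with me. Thanks for your guidance!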
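
One way to approach questions 2 and 3 is to record the next ID to process after every record and start from it on the following run. A minimal sketch under stated assumptions: the checkpoint file name, the function names, and the crawl_one callback are all hypothetical and are not part of bilibili_user.py.

```python
import os
import tempfile


def load_checkpoint(path, default):
    """Return the next ID to crawl, or default if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return default


def save_checkpoint(path, next_id):
    """Persist the next ID so a later run can resume from it."""
    with open(path, "w") as f:
        f.write(str(next_id))


def crawl_range(path, first_id, count, crawl_one):
    """Crawl count IDs, resuming from the checkpoint when one exists."""
    start = load_checkpoint(path, first_id)
    for mid in range(start, start + count):
        crawl_one(mid)
        save_checkpoint(path, mid + 1)
    return start


# Demonstration with a stand-in for the real request-and-insert step.
checkpoint = os.path.join(tempfile.mkdtemp(), "last_mid.txt")
seen = []
crawl_range(checkpoint, 521401, 3, seen.append)  # first run
crawl_range(checkpoint, 521401, 3, seen.append)  # second run resumes
print(seen)
```

The count argument also addresses question 1, since the number of IDs per run becomes a parameter. For question 3, a UNIQUE index on the user ID column would additionally make the database reject duplicate rows; whether the existing schema has one is not stated in the thread.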

Some IPs no longer work, and some field values are wrong

Some IPs no longer work (2018-4-18)

IPs that no longer work
http://116.199.115.79:80
http://116.199.115.79:80
http://116.199.115.79:80

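Wrong field values

I crawled the first 200 registered Bilibili users. The article, following, fans, and coins fields are all 0, and the birth field does not display the birth year correctly.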

bilibili

Project Weekly Report (14 September 2019 - 21 September 2019)

ISSUES

There were no new issues last week.

PULL REQUESTS

Last week pull requests were created, updated, or merged.

COMMITS

There were no commits last week.

CONTRIBUTORS

There were no contributors last week.

STARGAZERS

Last week the project gained 11 stars, from:

⭐ ShenGuanghong | ⭐ lovecn | ⭐ wdghcsh | ⭐ chaojunke | ⭐ luneshao | ⭐ chrislynn83 | ⭐ hxyShawn | ⭐ poison501 | ⭐ banfucai | ⭐ llj19970422 | ⭐ linqinloong |
You all are the stars! 🌟

That's all for this week's project report. You can click weekly-digest to view past reports.

Error connecting to the database

1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'rank, face, regtime, spacesta, birthday, sign, l' at line 1"

JSON parsing failed

Bilibili may have changed its API. Could you share a dump of the database?

Project Weekly Report (21 September 2019 - 28 September 2019)

ISSUES

There were no new issues last week.

PULL REQUESTS

Last week pull requests were created, updated, or merged.

COMMITS

There were no commits last week.

CONTRIBUTORS

There were no contributors last week.

STARGAZERS

Last week the project gained 11 stars, from:

⭐ agcharon | ⭐ Konano | ⭐ jiangyongitcast | ⭐ LLLLBD | ⭐ macjab | ⭐ tiackman | ⭐ flying7993 | ⭐ Loohaze | ⭐ David56038 | ⭐ lishiyuwhu | ⭐ akisekinoko |
You all are the stars! 🌟

That's all for this week's project report. You can click weekly-digest to view past reports.

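Hello, I have a question about getting the member registration time.

After capturing the JSON with Fiddler, is the registration time the regtime=1406807890 parameter? And how is that number converted into a year, month, and day? I hope you can clear up my confusion. Thank you.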
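
A value such as 1406807890 has the shape of a Unix timestamp, that is, the number of seconds since 1970-01-01 00:00:00 UTC. Assuming regtime follows that convention (the thread does not confirm it), a minimal sketch of the conversion with Python's standard library:

```python
from datetime import datetime, timedelta, timezone

regtime = 1406807890  # value quoted in the question above

utc_time = datetime.fromtimestamp(regtime, tz=timezone.utc)
# China Standard Time is UTC+8 with no daylight saving time.
cst_time = utc_time.astimezone(timezone(timedelta(hours=8)))

print(utc_time.isoformat())                    # 2014-07-31T11:58:10+00:00
print(cst_time.strftime("%Y-%m-%d %H:%M:%S"))  # 2014-07-31 19:58:10
```

Passing an explicit time zone avoids depending on the local settings of the machine that runs the script.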

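A beginner's question

When I run the script on Windows 8.1, I get a MySQL Error message. How can I fix this?
PS: Python 3.6.4; I've already run pip install requests and pip install pymysql.
Also, how long did it take you to run this script?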

Project Weekly Report (5 October 2019 - 12 October 2019)

ISSUES

There were no new issues last week.

PULL REQUESTS

Last week pull requests were created, updated, or merged.

COMMITS

There were no commits last week.

CONTRIBUTORS

There were no contributors last week.

STARGAZERS

Last week the project gained 10 stars, from:

⭐ beating-1224 | ⭐ caiquan-github | ⭐ WMXNLFD | ⭐ smallhouse111 | ⭐ hcxiong | ⭐ qimomo | ⭐ teze | ⭐ corlin | ⭐ amuko | ⭐ Zhaoxiaolistudy |
You all are the stars! 🌟

That's all for this week's project report. You can click weekly-digest to view past reports.
