pythondataanalysis's Issues

A question about the code

Hello, I would like to know what mydata = soup.select('#display')[0].get_text() is meant to scrape. I looked through the page source and could not find any element with id="display".

Multithreaded crawler: Myfutures reports an execution time of 0s

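This problem was reported by reader 卡若米; it is a mistake I introduced while adjusting the code. The cause is that running Mymultithread(10) pops the urls list until it is empty, so the subsequent Myfutures(10) has no pages left to download and its elapsed time comes out as 0. The fix is simply to swap the order of the two calls: run Myfutures(10) first, then Mymultithread(10).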
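As an alternative to reordering the calls, each timing run can work on its own copy of the URL list, so that neither run can drain the other's input. The following is only a minimal sketch of that idea, not the book's code: `URLS`, `fake_download`, `run_with_threads`, and `run_with_futures` are hypothetical names, and the download is simulated with a short sleep instead of a real request.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Placeholder URLs (assumption); the book uses its own list.
URLS = [f"https://example.com/page/{i}" for i in range(20)]


def fake_download(url):
    """Stand-in for a real download: sleeps briefly and returns the URL."""
    time.sleep(0.01)
    return url


def run_with_threads(urls, workers):
    """Pop URLs from a private copy of the list, so the caller's list is untouched."""
    pending = list(urls)          # copy: popping here cannot empty `urls`
    lock = threading.Lock()
    done = []

    def worker():
        while True:
            with lock:
                if not pending:
                    return
                url = pending.pop()
            result = fake_download(url)
            with lock:
                done.append(result)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(done), time.perf_counter() - start


def run_with_futures(urls, workers):
    """Download every URL through a thread pool; the input list is only read."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        done = list(pool.map(fake_download, urls))
    return len(done), time.perf_counter() - start


threads_count, threads_time = run_with_threads(URLS, 10)
futures_count, futures_time = run_with_futures(URLS, 10)
print(f"threads: {threads_count} pages in {threads_time:.3f}s")
print(f"futures: {futures_count} pages in {futures_time:.3f}s")
```

With this arrangement the order of the two calls no longer matters, because both runs see all 20 URLs.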

I sincerely apologize for my oversight, and I thank readers for their feedback.

Extracting Baidu home page keywords with regular expressions

Chapter: 2.3.6 Introduction to Regular Expressions
Source code from the book: “
import re
import requests
from fake_useragent import UserAgent

ua = UserAgent()
headers = {'User-Agent': ua.random}

# headers = {}

html = requests.get('https://www.baidu.com/', headers=headers)
html.encoding = 'utf-8'
html = html.text

print(html)

titles = re.findall(r'(\w{2})', html)
print(titles)”
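The page content fetched with this source code can no longer be used to extract the keywords correctly. The printed output now consists of items such as 贴吧, so the regular expression needs to be rewritten.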
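One reason a pattern like `(\w{2})` is fragile is that it matches any two consecutive word characters anywhere in the page, including tag and attribute names. A rough sketch of the difference an anchored pattern makes is shown here; the `sample_html` string is hypothetical markup invented for illustration, not Baidu's real page source, so the pattern would have to be adapted to whatever the live page returns.

```python
import re

# Hypothetical markup for illustration only (assumption, not Baidu's real HTML).
sample_html = (
    '<div id="nav">'
    '<a href="#">新闻</a>'
    '<a href="#">贴吧</a>'
    '<a href="#">hao123</a>'
    '</div>'
)

# Unanchored: picks up fragments of tag names, attributes and link text alike.
loose = re.findall(r'(\w{2})', sample_html)

# Anchored: only two-character link texts that fill an entire <a> element.
anchored = re.findall(r'<a[^>]*>(\w{2})</a>', sample_html)

print(len(loose), "loose matches")
print(anchored)  # → ['新闻', '贴吧']
```

For anything beyond a quick experiment, an HTML parser is usually more robust than a regular expression when the page layout changes.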

Switch to OpenJDK early and life will be better

So I suggest that on page 8 the author teach readers to install OpenJDK rather than Oracle JDK.
Oracle Technology Network License Agreement for Oracle Java SE >>
Further, You may not:

  • use the Programs for any data processing or any commercial, production, or internal business purposes other than developing, testing, prototyping, and demonstrating your Application;
  • remove or modify any Program markings or any notice of Oracle’s or a licensor’s proprietary rights;
  • make the Programs available in any manner to any third party (other than Contractors acting on Your behalf as set forth in this Agreement);
  • assign this Agreement or distribute, give, or transfer the Programs or an interest in them to any third party, except as expressly permitted in this Agreement for Contractors (the foregoing shall not be construed to limit the rights You may otherwise have with respect to Separately Licensed Third Party Technology);
  • cause or permit reverse engineering (unless required by law for interoperability), disassembly or decompilation of the Programs; and
  • create, modify, or change the behavior of, classes, interfaces, or subpackages that are in any way identified as "java", "javax", "sun", “oracle” or similar convention as specified by Oracle in any naming convention designation.

Always on OpenJDK, never changed >>
AdoptOpenJDK

Running MxlsxClass.py fails: the file pandas_simple.xlsx cannot be found

Traceback (most recent call last):
File "E:/Python/2019.7.15/19_面向对象.py", line 143, in <module>
Demo.get_fileinfo()
File "E:/Python/2019.7.15/19_面向对象.py", line 21, in get_fileinfo
self.wb = load_workbook(filename=self.filename)
File "C:\Users\admin\AppData\Roaming\Python\Python37\site-packages\openpyxl\reader\excel.py", line 311, in load_workbook
data_only, keep_links)
File "C:\Users\admin\AppData\Roaming\Python\Python37\site-packages\openpyxl\reader\excel.py", line 126, in __init__
self.archive = _validate_archive(fn)
File "C:\Users\admin\AppData\Roaming\Python\Python37\site-packages\openpyxl\reader\excel.py", line 98, in _validate_archive
============================== FILE INFO ==============================
archive = ZipFile(filename, 'r')
File "D:\Python_Edition\Python37\lib\zipfile.py", line 1204, in __init__
self.fp = io.open(file, filemode)
FileNotFoundError: [Errno 2] No such file or directory: 'pandas_simple.xlsx'
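The traceback shows a bare file name, 'pandas_simple.xlsx', which Python resolves against the current working directory rather than the script's folder. A small sketch of one way to make the lookup explicit follows; `find_workbook` is a hypothetical helper, not part of the book's code or of openpyxl, and it only locates the file without opening it.

```python
from pathlib import Path


def find_workbook(filename, search_dirs):
    """Return the absolute path of `filename` in the first directory that has it.

    Hypothetical helper: raises FileNotFoundError listing every directory
    that was searched, which makes a wrong working directory easy to spot.
    """
    for directory in search_dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate.resolve()
    searched = ", ".join(str(Path(d).resolve()) for d in search_dirs)
    raise FileNotFoundError(f"{filename!r} not found in: {searched}")
```

In a script this could be called as `find_workbook('pandas_simple.xlsx', [Path.cwd(), Path(__file__).parent])`, and the returned path passed to `load_workbook(filename=...)`. If the file is in neither place, it most likely has not been generated yet by the earlier example that writes it.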

Cannot register on Douban with an email address; only phone verification-code login is available

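This concerns section 2.3.7 on simulated login, page 70, which requires submitting form data to log in. The book logs in with an email account and password, but Douban no longer supports email registration, so I can only log in with a phone verification code. I tried submitting the form data, but the login never succeeded. The page source has also changed considerably from what the book shows. I would appreciate your advice.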

A question about the Douban login

mydata = soup.select('#display')[0].get_text()

IndexError: list index out of range
This line raised the error.
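An IndexError on this line means that `soup.select('#display')` returned an empty list, that is, the fetched page contains no element with id="display" (for example because the login failed or the page layout changed). A minimal sketch of a guard is given below; `first_text` is a hypothetical helper, and `FakeTag` exists only so the example runs without BeautifulSoup installed.

```python
def first_text(matches, default=None):
    """Return the text of the first match, or `default` if nothing matched.

    `matches` is expected to be a list-like result such as the one returned
    by soup.select(...); each item must offer a get_text() method.
    """
    if not matches:
        return default
    return matches[0].get_text()


class FakeTag:
    """Tiny stand-in for a BeautifulSoup Tag (illustration only)."""

    def __init__(self, text):
        self._text = text

    def get_text(self):
        return self._text


print(first_text([FakeTag("hello")]))      # → hello
print(first_text([], default="no match"))  # → no match
```

With a real page this would be used as `mydata = first_text(soup.select('#display'))`, followed by a check for `None` before the value is used.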
