spider-2

Example

import requests
response = requests.get('https://www.baidu.com/')
print(type(response))
print(response.status_code)
print(type(response.text))
print(response.text)
print(response.cookies)

Passing parameters

import requests
data = {
  'name':'germey',
  'age':'22'
}
response = requests.get('http://httpbin.org/get',params=data)
print(response.text)
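
To check how `params` ends up in the URL without sending anything over the network, one option is to build the request and inspect it before it is sent. This is a minimal sketch using `requests.Request(...).prepare()`; the variable name `prepared` is only illustrative.

```python
import requests

data = {
    'name': 'germey',
    'age': '22',
}

# Build the request but do not send it; prepare() applies the same
# URL encoding that requests.get() would use.
prepared = requests.Request('GET', 'http://httpbin.org/get', params=data).prepare()
print(prepared.url)  # http://httpbin.org/get?name=germey&age=22
```

The dictionary is appended as a query string, so there is no need to concatenate `?name=...&age=...` by hand.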

Parsing JSON

import requests
import json
response = requests.get('http://httpbin.org/get')
print(type(response.text))
print(response.json())
print(json.loads(response.text))
print(type(response.json()))

Fetching binary data

import requests
response = requests.get("https://github.com/favicon.ico")
print(type(response.text),type(response.content))
print(response.text)
print(response.content)

Saving binary data

import requests
response = requests.get("https://github.com/favicon.ico")
# The with block closes the file automatically
with open('favicon.ico','wb') as f:
  f.write(response.content)
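
For larger files, holding the whole body in `response.content` may not be desirable. The sketch below shows one possible alternative using `stream=True` and `iter_content`; the helper names `save_streamed` and `download`, the chunk size, and the 10-second timeout are assumptions made for illustration.

```python
import requests

def save_streamed(response, path, chunk_size=8192):
    """Write a response body to `path` chunk by chunk; return bytes written."""
    written = 0
    with open(path, 'wb') as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            f.write(chunk)
            written += len(chunk)
    return written

def download(url, path):
    """Fetch `url` with stream=True so the body is not loaded into memory at once."""
    with requests.get(url, stream=True, timeout=10) as response:
        response.raise_for_status()
        return save_streamed(response, path)

# Usage (performs a real network request):
# download('https://github.com/favicon.ico', 'favicon.ico')
```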

Adding headers

# Without a browser-like User-Agent, the site may reject the request
import requests
response = requests.get('https://www.zhihu.com/explore')
print(response.text)

# The same request with a User-Agent header
import requests
headers = {
  'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:63.0) Gecko/20100101 Firefox/63.0'
}
response = requests.get("https://www.zhihu.com/explore",headers=headers)
print(response.text)

POST requests

import requests
data = {'name':'germey','age':'22'}
response = requests.post('http://httpbin.org/post',data=data)
print(response.text)

# The same POST with a custom User-Agent header
import requests
data = {'name':'germey','age':'22'}
headers = {
  'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:63.0) Gecko/20100101 Firefox/63.0'
}
response = requests.post('http://httpbin.org/post',data=data,headers=headers)
print(response.text)

Response attributes

import requests
response = requests.get('http://www.jianshu.com')
print(type(response.status_code),response.status_code)
print(type(response.headers),response.headers)
print(type(response.cookies),response.cookies)
print(type(response.url),response.url)
print(type(response.history),response.history)

# Checking the status code against a named constant
import requests
response = requests.get('http://www.jianshu.com')
exit() if not response.status_code == requests.codes.not_found else print('404 Not Found')

# Checking the status code against a literal value
import requests
response = requests.get('http://www.jianshu.com')
exit() if not response.status_code == 200 else print('Request Successful')
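
The named constants used in the status checks can be explored offline. This is a small sketch; the `Response` object is built by hand purely to avoid a network call and is not how responses are normally obtained.

```python
import requests

# requests.codes maps readable names to numeric HTTP status codes.
print(requests.codes.ok)         # 200
print(requests.codes.not_found)  # 404

# Response.ok is True for any status below 400. The Response object here
# is constructed manually only to keep the example offline.
fake = requests.Response()
fake.status_code = requests.codes.not_found
print(fake.ok)  # False
```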

Advanced usage

File upload

import requests
# Open the file in a with block so the handle is closed after the upload
with open('favicon.ico','rb') as f:
  response = requests.post('http://httpbin.org/post',files={'file':f})
print(response.text)

Getting cookies

import requests
response = requests.get('https://www.baidu.com')
print(response.cookies)
for key,value in response.cookies.items():
  print(key + '=' + value)

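Session persistence

Approach to simulating a login

import requests
# Two independent requests do not share cookies,
# so the second response shows no cookie set
requests.get('http://httpbin.org/cookies/set/number/123456789')
response = requests.get('http://httpbin.org/cookies')
print(response.text)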

Example

import requests
# A Session keeps cookies across the requests made through it
s = requests.Session()
s.get('http://httpbin.org/cookies/set/number/12345678')
response = s.get('http://httpbin.org/cookies')
print(response.text)
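
The difference between a standalone request and a session request can also be seen without contacting httpbin. The sketch below prepares requests without sending them; setting the cookie by hand with `s.cookies.set(...)` stands in for the `Set-Cookie` header a server would normally return, and is an assumption made for illustration.

```python
import requests

url = 'http://httpbin.org/cookies'

# A standalone request starts with an empty cookie jar.
plain = requests.Request('GET', url).prepare()
print('Cookie' in plain.headers)  # False

# A Session owns a cookie jar and applies it to every request it prepares.
s = requests.Session()
s.cookies.set('number', '123456789', domain='httpbin.org', path='/')
prepared = s.prepare_request(requests.Request('GET', url))
print(prepared.headers['Cookie'])  # number=123456789
```

This is why a login flow is usually run through one `Session` object: cookies received at login are sent again on later requests.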

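Handling certificate verification (using the urllib3 library)

import requests
from requests.packages import urllib3
# Suppress the warning that verify=False would otherwise trigger
urllib3.disable_warnings()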
response = requests.get('https://www.12306.cn',verify=False)
print(response.status_code)

Supplying a client certificate

import requests
response = requests.get('https://www.12306.cn',cert=('/path/server.crt','/path/key'))
print(response.status_code)

Proxy settings (SSL proxy)

import requests
proxies = {
  "http":"http://127.0.0.1:9743",
  "https":"https://127.0.0.1:9743",
}
response = requests.get("https://www.taobao.com",proxies=proxies)
print(response.status_code)

# Proxy that requires a username and password
import requests
proxies = {
  "http":"http://user:[email protected]:9743/",
}
response = requests.get("https://www.taobao.com",proxies=proxies)
print(response.status_code)

SOCKS5 proxy: install requests[socks] first

pip3 install 'requests[socks]'

import requests
proxies = {
  'http':'socks5://127.0.0.1:9742',
  'https':'socks5://127.0.0.1:9742'
}
response = requests.get('https://www.taobao.com',proxies=proxies)
print(response.status_code)

Timeout settings

import requests
from requests.exceptions import ReadTimeout
try:
    response = requests.get('http://httpbin.org/get',timeout = 1)
    print(response.status_code)
except ReadTimeout:
    print('Timeout')
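
A single number applies to both the connect and the read phase. As a possible refinement, `requests` also accepts a `(connect, read)` tuple, and catching the base class `Timeout` covers both phases. The helper name `fetch_status` and the default values here are assumptions for illustration.

```python
import requests
from requests.exceptions import ConnectTimeout, ReadTimeout, Timeout

def fetch_status(url, connect_timeout=3.05, read_timeout=10):
    """Return the status code, or None if either timeout is exceeded."""
    try:
        # A (connect, read) tuple sets the two limits separately.
        response = requests.get(url, timeout=(connect_timeout, read_timeout))
    except Timeout:
        # Timeout is the common base class of ConnectTimeout and ReadTimeout.
        return None
    return response.status_code

print(issubclass(ConnectTimeout, Timeout), issubclass(ReadTimeout, Timeout))  # True True

# Usage (performs a real network request):
# fetch_status('http://httpbin.org/get', read_timeout=1)
```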

Authentication settings

import requests
from requests.auth import HTTPBasicAuth
r = requests.get('http://120.27.34.24:9001',auth=HTTPBasicAuth('user','123'))
print(r.status_code)

# The tuple shorthand below is equivalent
import requests
r = requests.get('http://120.27.34.24:9001',auth=('user','123'))
print(r.status_code)

Exception handling

import requests
from requests.exceptions import ReadTimeout,ConnectionError,HTTPError,RequestException
try:
    response = requests.get('http://httpbin.org/get',timeout=1)
    print(response.status_code)
except ReadTimeout:  # subclass
    print('timeout')
except ConnectionError:  # subclass
    print('connect error')
except HTTPError:  # subclass
    print('http error')
except RequestException:  # parent class
    print('Error')
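
Note that `requests.get()` does not raise `HTTPError` by itself when the server answers with a 4xx or 5xx status; that exception comes from calling `response.raise_for_status()`. The sketch below illustrates this with a hand-built `Response` so it runs offline; the helper name `check_status` is hypothetical.

```python
import requests
from requests.exceptions import HTTPError

def check_status(response):
    """Return 'ok' for statuses below 400, or a short label for 4xx/5xx ones."""
    try:
        # Raises HTTPError only when 400 <= status_code < 600.
        response.raise_for_status()
    except HTTPError as exc:
        return 'http error: {}'.format(exc.response.status_code)
    return 'ok'

# Hand-built Response objects, used only to keep the example offline.
fake = requests.Response()
fake.status_code = 404
print(check_status(fake))  # http error: 404

fake.status_code = 200
print(check_status(fake))  # ok
```

In a real request, calling `response.raise_for_status()` inside the `try` block is what allows the `except HTTPError` branch to be reached.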
