edisonqkj / spidercrackdemo

This project forked from wkunzhi/python3-spider

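Crawlers, data decryption, content parsing, automatic login, and anti-crawling handling for various sites: Dianping | Taobao | JD.com | Meituan | Tianyancha | 51Job | GitHub | token decryption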

Python 96.50% JavaScript 3.50%

spidercrackdemo's Introduction

SpiderCrackDemo

Demos for getting past the anti-crawling measures of various websites. We hope to keep updating it together.

Author Zok
Email [email protected]
BLOG www.zhangkunzhi.com
Introduction Data decryption, anti-crawling handling, simulated login, POST login

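I have recently been crawling the whole of the MT (Meituan) and DP (Dianping) sites, so I often add small demos split out from that work.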

Demo


Directory tree

├── DianPing                            // -----Dianping-----
│   ├── parse_address_poi.py            // Coordinate encryption
│   └── parse_font_css.py               // CSS font decryption
├── GitHub                              // ------GitHub-----
│   └── login.py                        // GitHub automatic login
├── JingDong                            // -------JD.com-------
├── BaiDu                               // -------Baidu-------
│   └── translation.py                  // Baidu Translate
├── MeiTuan                             // -------Meituan-------
│   ├── parse_comments.py               // Fetch user review data
│   ├── create_food_token.py            // Token generator for restaurant pages
│   ├── parse_play_areas.py             // Three-level area parser (leisure section)
│   ├── parse_play_info.py              // Parse shop data for leisure venues
│   ├── get_login_cookies.py            // Log in with pyppeteer and fetch cookies
│   └── parse_restaurant_info.py        // Parse restaurant data
├── TaoBao                              // -------Taobao-------
│   ├── login_for_sina.py               // Taobao automatic login via the Sina entry
│   ├── auto_login_pyppeteer.py         // Taobao automatic login with a Taobao account
│   ├── login_for_pyppeteer.py          // Bypass webdriver detection with pyppeteer
│   └── login_for_mitmproxy.py          // Bypass webdriver detection with mitmproxy
├── TianYanCha                          // -------Tianyancha-------
│   └── login.py                        // Automatic login and company information retrieval
├── BiliBili                            // -------BiliBili-------
│   └── login.py                        // Video downloader
├── MeiTuanArea                         // -------Nationwide area collector based on Meituan-------
└── 51Job                               // -------51job-------
    └── select_job.py                   // Encoding conversion, job search


Sample screenshots

  • Meituan three-level area parser

image image


  • Meituan leisure and entertainment shop information

image


  • Taobao (TB) login that passes detection

image


  • Meituan restaurant data parsing

image


  • 51job job search

image


  • Meituan review parsing

image

