simple-cookieApi

Background

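Why does this project exist? I needed to pull data from a website and found its anti-scraping measures simple in one sense and hard to pin down in another, because some of its request parameters are not generated on the client at all. For example, to fetch the content of /data/detial/conent, the request is /data/detial/conent?paramid=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, and the parameter is obtained by requesting /doc/xxx/yyy/bbb?aaid=xxxxxxxxxx/aaa/xxx/yyy/bbb?aaid=xxxxxxxxxx ... Each of those later endpoints answers with a 302 redirect: the response headers carry the address of the next endpoint and also update the cookies, so the final request headers and cookies are hard to track. How many 302 hops there are is, I assume, up to the site's developers. After a long time trying I gave up on that approach and started this project instead.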
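
To make the difficulty concrete, a redirect chain like the one described above can be traced one hop at a time. The following is a minimal sketch using only the Python standard library, not code from this project. `fetch_once`, `trace_redirects`, and `max_hops` are names invented for the illustration, and cookie scoping by domain and path is deliberately ignored.

```python
import http.client
from http.cookies import SimpleCookie
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_once(url, cookies):
    """Send one GET without following redirects.

    Returns (status, Location header or None, list of raw Set-Cookie headers).
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    target = (parts.path or "/") + ("?" + parts.query if parts.query else "")
    headers = {}
    if cookies:
        headers["Cookie"] = "; ".join(f"{k}={v}" for k, v in cookies.items())
    try:
        conn.request("GET", target, headers=headers)
        resp = conn.getresponse()
        resp.read()  # drain the body before closing the connection
        set_cookies = resp.headers.get_all("Set-Cookie") or []
        return resp.status, resp.getheader("Location"), set_cookies
    finally:
        conn.close()


def trace_redirects(url, fetch=fetch_once, max_hops=10):
    """Follow a redirect chain manually, accumulating cookies along the way.

    Returns (hops, cookies): hops is a list of (status, url) per request,
    cookies is a name -> value dict after the last response.
    """
    cookies = {}
    hops = []
    for _ in range(max_hops):
        status, location, set_cookies = fetch(url, cookies)
        hops.append((status, url))
        for raw in set_cookies:
            parsed = SimpleCookie()
            parsed.load(raw)
            for name, morsel in parsed.items():
                cookies[name] = morsel.value  # later hops overwrite earlier values
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # Location may be relative
    return hops, cookies
```

Calling `trace_redirects(start_url)` returns every hop plus the cookies collected so far. This works when the chain is plain HTTP redirects, but it cannot reproduce cookies or parameters that only a real browser would produce, which is the gap this project addresses.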

Project overview

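Selenium is well known for being slow, so experienced scraper authors usually reverse-engineer the target site's JavaScript to recover its encryption algorithm and get past the anti-scraping measures. Other people's JavaScript encryption is not easy to reverse, though, whereas Selenium can pass for a real browser and obtain the final HTML. This project uses Selenium to read the cookies from the browser and exposes them through a web API built on Quart, so that requests-based code can fetch them. With this thin wrapper you can obtain a site's cookies without reversing its JavaScript. As for speed: fetch the data with requests, and call the API for fresh cookies only when the current ones expire.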
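
The "fast requests, slow refresh" idea can be sketched as a small wrapper. This is an illustration under assumptions rather than part of the project: `CookieReuser` and its three callables are hypothetical names, with `load_cookies` standing in for a call to this project's API, `send` for a requests call, and `is_expired` for whatever signal the target site gives when cookies stop working.

```python
from typing import Callable, Dict, Optional, Tuple

Cookies = Dict[str, str]
Response = Tuple[int, str]  # (status code, body); a stand-in for a real response object


class CookieReuser:
    """Reuse cached cookies and reload them only when a response looks expired."""

    def __init__(self,
                 load_cookies: Callable[[], Cookies],
                 send: Callable[[str, Cookies], Response],
                 is_expired: Callable[[Response], bool]) -> None:
        self._load_cookies = load_cookies  # slow path: browser-backed cookie API
        self._send = send                  # fast path: plain HTTP request
        self._is_expired = is_expired      # site-specific expiry check (assumption)
        self._cookies: Optional[Cookies] = None

    def get(self, url: str) -> Response:
        if self._cookies is None:
            self._cookies = self._load_cookies()
        response = self._send(url, self._cookies)
        if self._is_expired(response):
            # Cookies went stale: refresh once, then retry the same request.
            self._cookies = self._load_cookies()
            response = self._send(url, self._cookies)
        return response
```

The browser is only involved when `load_cookies` runs, so its cost is paid once per cookie lifetime instead of once per page.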

  • Dependencies
0. python3.8.x
1. selenium
2. undetected_chromedriver # hides the chromedriver browser fingerprint
3. quart # async implementation of Flask, from the same team
4. docker # containerized for easy deployment
  • Project structure
├── Dockerfile
├── requirements.txt
├── simple-cookieApi.conf
└── simple-cookieApi.py
  • Running the project
root@7197f225bd7d:/# git clone https://github.com/shojinto/simple-cookieApi.git
root@7197f225bd7d:/# cd simple-cookieApi
root@7197f225bd7d:/simple-cookieApi# docker build -t <yourdockerimagename> .
root@7197f225bd7d:/simple-cookieApi# docker run -d -p 8080:8080 -p 5900:59000 <yourdockerimagename>
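  • Features

Only two endpoints are implemented so far: manual login and cookie retrieval.

Steps:

1 Start the project and connect to the container with a VNC client. The default password is secret. Once you reach the container's GUI, the browser shows the https://bot.sannysoft.com/ page, which lists the browser fingerprints that a site could detect. You can also enter https://nowsecure.nl in the address bar to check whether the browser passes fingerprint detection.

2 Call the API, or enter http://127.0.0.1:8000 in the browser's address bar to see how the API is used.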
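
For orientation, here is a rough sketch of what the server side of a cookie request might look like. It is not taken from simple-cookieApi.py: `handle_job` is a hypothetical name, the `"fail"` status and error messages are assumptions, and only the `url` and `operate` fields and the `"suc"` status come from the request and response shown in this README. `driver` is any object with Selenium's `get` and `get_cookies` methods.

```python
def handle_job(driver, payload):
    """Load the requested page in the browser and return its cookies."""
    # Only the getCookies operation is sketched here.
    if payload.get("operate") != "getCookies":
        return {"status": "fail", "result": "unsupported operate"}  # assumed error shape
    url = payload.get("url")
    if not url:
        return {"status": "fail", "result": "missing url"}  # assumed error shape
    driver.get(url)  # let the real browser run the site's JavaScript and redirects
    # Selenium returns a list of dicts with keys such as name, value, domain, expiry.
    return {"status": "suc", "result": driver.get_cookies()}
```

In a Quart route this function would typically receive the body from `await request.get_json()`. Because the Selenium calls block, a real service would probably run them off the event loop.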

  • Demo
root@7197f225bd7d:/# echo '{
>     "url": "https://domain.com",
>     "operate":"getCookies"
> }' |  \
>   http POST http://127.0.0.1:8000/jbos \
>   Content-Type:application/json \
>   Postman-Token:c870ddc9-e512-42a9-9364-d9efd2ab02d8 \
>   cache-control:no-cache
HTTP/1.1 200
content-length: 3071
content-type: application/json
date: Wed, 07 Dec 2022 06:08:02 GMT
server: hypercorn-h11

{
    "result": [
        {
            "domain": ".domain.com",
            "httpOnly": false,
            "name": "Hm_lpvt_6bcd52f51e9b3dce32bec4a3997715ac",
            "path": "/",
            "secure": false,
            "value": "1670393282"
        },
        {
            "domain": "www.domain.com",
            "expiry": 1670395077,
            "httpOnly": true,
            "name": "acw_tc",
            "path": "/",
            "secure": false,
            "value": "276077ca16703932768538537e1a4e16c84b962fe713e14cb36c005c5d9dde"
        },
        {
            "domain": ".domain.com",
            "expiry": 1685945278,
            "httpOnly": false,
            "name": "ssxmod_itna",
            "path": "/",
            "secure": false,
            "value": "YqUx0DBDnD2034Qq0L9mxjgexgedYKNHHDl2DWueiODUxn4iaDT=OQ=7QGIiWRThreQKtELYqq3KW4x5so=th0eDQxY6FDfqDzDDhd4QD/4w+oYIDYYDtxBAfD3RdDWCqB6MDtqDkLD0+HD7pFlx08DeFdz2DDUerwKWyQDCyaD7KDn=qDAhYDm64DRgPDe64D91PDw6Rm8xG7DAyAyxi3cODHK0LxDQmxpKv69K7p7UEvhbSv+zPfw8koB6vszxib6j8wcr1sL7dt3WRDv1u=ACbeQBm4KQnO4=GerGRDq02PAAiqqBDVmBC9iDD==="
        }
    ],
    "status": "suc"
}
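
On the client side, the `result` list above has to be turned into something an HTTP client can send. The sketch below builds a `Cookie` header value from such a list. `domain_matches` and `build_cookie_header` are hypothetical helpers, and the matching rules are a simplification of real browser behaviour (no path or `secure` handling).

```python
import time


def domain_matches(cookie_domain, host):
    """Leading-dot domains match the domain and its subdomains; others match exactly."""
    cookie_domain = cookie_domain.lower()
    host = host.lower()
    if cookie_domain.startswith("."):
        return host == cookie_domain[1:] or host.endswith(cookie_domain)
    return host == cookie_domain


def build_cookie_header(cookies, host, now=None):
    """Join unexpired cookies that apply to `host` into a Cookie header value."""
    now = time.time() if now is None else now
    pairs = []
    for cookie in cookies:
        # Cookies without an "expiry" key are session cookies and are always kept.
        if "expiry" in cookie and cookie["expiry"] <= now:
            continue
        if not domain_matches(cookie.get("domain", ""), host):
            continue
        pairs.append(f'{cookie["name"]}={cookie["value"]}')
    return "; ".join(pairs)
```

The returned string can be passed as the `Cookie` header of a later request to the same host. An empty string means no cookie applies and it is time to ask the API again.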

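Disclaimer

This project is for learning purposes only. Do not use it commercially; you bear any legal risk that arises from its use.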

Contributors

shojinto