qidouhai / crawlerhot

This project forked from pangxiaobin/crawlerhot

Today's Hot List (今日热榜): crawls the trending lists of several websites and displays them on a front-end page.

Home Page: https://www.panglb.top/hot/

License: Apache License 2.0

Python 14.03% HTML 3.37% CSS 82.60%

crawlerhot's Introduction

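Overview

  • Hot list demo on my blog: https://www.panglb.top/hot/
  • The front end and back end are separate: the back end uses the lightweight web.py framework, the front end uses layui, and the data is stored in a local JSON file.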
├── crawler.py  # main crawler code
├── helper.py  # helper functions
├── html    # front-end page
│   ├── hot.html
│   └── layui  # front-end dependency
├── image
│   └── hot.png
├── LICENSE
├── README.md
├── requments.txt  # Python dependencies
├── result  # saved crawl data
│   └── result.json
├── run.py  # entry point for scheduled crawling
├── server.py  # back-end service
├── settings.py
└── uwsgi.ini  # uWSGI server configuration
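
Since the crawler and the back end share data only through result/result.json, the hand-off can be sketched as below. This is a minimal illustration using the standard library, not the project's actual code: the function names `save_result` and `load_result` and the shape of the data (site name mapped to a list of title/url entries) are assumptions.

```python
import json
import os
import tempfile


def save_result(path, data):
    """Write crawl results to `path` as UTF-8 JSON, replacing the old file in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # Write to a temporary file first so a reader never sees a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            # ensure_ascii=False keeps Chinese titles readable in the file.
            json.dump(data, handle, ensure_ascii=False, indent=2)
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise


def load_result(path):
    """Read the results back; return an empty dict if nothing has been crawled yet."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {}
```

Writing to a temporary file and then renaming it is one way to keep the server from reading a partially written file while a crawl is in progress; whether the project needs this depends on how often it crawls.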
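  • Hot lists currently crawled:

    • Zhihu Hot List (知乎热榜)
    • V2EX
    • GitHub
    • Sina Weibo (新浪微博)
    • Tianya (天涯)
    • Baidu Tieba (贴吧)
    • Douban (豆瓣)
    • NetEase Cloud Music (云音乐)
  • Environment

    • Python 3.6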
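
Each site in the list above needs its own extraction step that turns a page into title/link pairs. The sketch below shows one way to do that with the standard library only. It is an illustration under stated assumptions: the `hot-item` class name, the `HotLinkParser` class, and `parse_hot_list` are all hypothetical, real pages use different markup, and the project's crawler.py may rely on a different parser entirely.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class HotLinkParser(HTMLParser):
    """Collect title/url pairs from <a class="hot-item"> elements.

    The class name "hot-item" is a made-up example, not real site markup.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.items = []
        self._href = None  # href of the link currently being read, if any
        self._text = []    # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "hot-item" in classes and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            if title:
                # Resolve relative links against the page URL.
                url = urljoin(self.base_url, self._href)
                self.items.append({"title": title, "url": url})
            self._href = None


def parse_hot_list(html, base_url):
    """Return a list of {"title", "url"} dicts found in `html`."""
    parser = HotLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.items
```

Returning plain dicts keeps the output directly serializable to JSON, which fits the local-file storage the project describes.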

Running

  • Download

     git clone https://github.com/pangxiaobin/CrawlerHot.git
     cd CrawlerHot
  • Install dependencies

    # Create a virtual environment (requires virtualenv and virtualenvwrapper)
    mkvirtualenv hot
    pip install -r requments.txt
    # Note: pip install uwsgi fails on Windows; to try the project there, comment out uwsgi in requments.txt first
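  • Run locally

    • Crawl the data
    python run.py
    # To see the result of a single crawl only, comment out run()
    # if __name__ == '__main__':
    #    run_crawler()  # run the crawler once
    #    run()  # run the crawler on a schedule
    • Start the local server
    python server.py
    • View the front-end page
    Open html/hot.html in a browser to see the result.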
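
The split between a single crawl (`run_crawler()`) and a scheduled crawl (`run()`) can be sketched as a simple loop. This is a rough illustration, not the contents of run.py: the 600-second interval, the `max_runs` and `sleep` parameters, and the placeholder body of `run_crawler` are assumptions added so the loop can be tried without waiting.

```python
import time


def run_crawler():
    """Placeholder for one crawl pass; the real one lives in the project's code."""
    print("crawling...")


def run(interval_seconds=600, max_runs=None, crawl=run_crawler, sleep=time.sleep):
    """Call `crawl` repeatedly, pausing `interval_seconds` between passes.

    max_runs=None loops forever; a number stops after that many passes.
    Returns the number of passes attempted.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            crawl()
        except Exception as exc:  # a failed pass should not stop the schedule
            print(f"crawl failed: {exc!r}")
        runs += 1
        # Only sleep if another pass will follow.
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs
```

For a quick check, `run(interval_seconds=1, max_runs=2)` performs two passes and returns; calling `run()` with no arguments loops until interrupted.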
    
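  • Server deployment with uWSGI + nginx

    • Because the front end and back end are separate, the back end can run on its own as a uWSGI service while nginx serves the front end.
    • Serve HTTP with uWSGI
    Edit chdir in uwsgi.ini
    # the port your server exposes
    http=0.0.0.0:8080
    # project directory: the absolute path of the project
    chdir=yourpath/CrawlerHot
    
    • Start uWSGI
    uwsgi --ini uwsgi.ini
    • Update the API URL requested by the front end
        # /html/hot.html
        # replace 127.0.0.1 with your server's IP
        http://127.0.0.1:8080/hot => http://server_ip:8080/hot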
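
The back end only has to answer GET /hot with the saved JSON. The project does this with web.py; the sketch below swaps in a bare WSGI callable from the standard library so it runs without extra packages. The names `make_app` and `application`, the 404 body, and the wildcard CORS header are assumptions. The CORS header is included on the guess that the page and the API are served from different origins (nginx on port 80, uWSGI on 8080).

```python
import json


def make_app(result_path):
    """Build a WSGI callable that serves the JSON file at `result_path` on GET /hot."""

    def application(environ, start_response):
        headers = [
            ("Content-Type", "application/json; charset=utf-8"),
            # Assumption: the front end is on another origin, so the browser
            # needs this header before scripts may read the response.
            ("Access-Control-Allow-Origin", "*"),
        ]
        if environ.get("PATH_INFO") != "/hot" or environ.get("REQUEST_METHOD") != "GET":
            start_response("404 Not Found", headers)
            return [b'{"error": "not found"}']
        try:
            with open(result_path, encoding="utf-8") as handle:
                data = json.load(handle)
        except FileNotFoundError:
            data = {}  # nothing crawled yet
        body = json.dumps(data, ensure_ascii=False).encode("utf-8")
        start_response("200 OK", headers + [("Content-Length", str(len(body)))])
        return [body]

    return application


# Path taken from the project layout; adjust it if your checkout differs.
application = make_app("result/result.json")
```

uWSGI looks for a callable named `application` by default, so a file like this (say, a hypothetical app.py) could be started with `uwsgi --http 0.0.0.0:8080 --wsgi-file app.py` for a quick test.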
  • Configure nginx to serve the front end

    # Add a location block to /etc/nginx/conf.d/default.conf
    server {
        listen       80;
        # change this to your server's IP
        server_name  your_server_ip;
        
        location /hot {
            # absolute path
            alias /yourpath/CrawlerHot/html;
            index hot.html;
        }
    }
    • Run the scheduled crawler script
    nohup python -u run.py &
    • Result

    [Image: hot]

crawlerhot's People

Contributors

pangxiaobin
