
x's Introduction

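Project Introduction

This project is a distributed, general-purpose web crawler. Its main features are:

  1. Task creation: 4 fixed templates plus custom templates for creating tasks quickly
  2. Task management: start and pause tasks
  3. Crawl results: search, view, and download
  4. Node management: view the status of crawl nodes and switch between clusters
  5. Automatic locator generation: CSS selectors are generated by clicking on page elements, so there is no need to write XPath by hand
  6. Body text extraction: a text-density algorithm automatically extracts the title, publication time, and body of news articles
  7. Subscription monitoring: watches graduate admissions sites of 985 universities, major bidding and tendering sites, user-defined sites, and so on; when a site updates and the new content matches a keyword, an email notification is sent

Tech stack: Vue + SpringBoot + WebMagic
Middleware: Zookeeper + Redis
Database: MongoDB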
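The body text extraction feature can be illustrated with a minimal sketch. This is not the project's implementation: it is a simplified line-based variant of the text-density idea, written with the Java standard library only. The class name `TextDensityExtractor`, its methods, and the threshold parameter are all hypothetical. The sketch scores each source line by the share of its characters that are visible text rather than markup, then returns the contiguous run of high-scoring lines that contains the most text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the project's code base. */
final class TextDensityExtractor {
    // Removes <script> and <style> elements together with their contents.
    private static final Pattern SCRIPT_STYLE =
            Pattern.compile("(?is)<(script|style)\\b[^>]*>.*?</\\1\\s*>");
    // Matches any remaining HTML tag.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]*>");

    private TextDensityExtractor() {}

    /** Visible text of one source line, with tags stripped. */
    static String visibleText(String line) {
        return TAG.matcher(line).replaceAll("").trim();
    }

    /** Ratio of visible-text characters to all characters on the line. */
    static double density(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return 0.0;
        }
        return (double) visibleText(trimmed).length() / trimmed.length();
    }

    /**
     * Returns the text of the contiguous run of dense lines (density at or
     * above the threshold) that holds the most visible characters.
     */
    static String extract(String html, double threshold) {
        String[] lines = SCRIPT_STYLE.matcher(html).replaceAll("").split("\\R");
        List<String> best = new ArrayList<>();
        int bestChars = 0;
        List<String> run = new ArrayList<>();
        int runChars = 0;
        // Iterate one step past the end so the final run is also closed.
        for (int i = 0; i <= lines.length; i++) {
            boolean dense = i < lines.length && density(lines[i]) >= threshold;
            if (dense) {
                String text = visibleText(lines[i]);
                run.add(text);
                runChars += text.length();
            } else {
                if (runChars > bestChars) {
                    best = run;
                    bestChars = runChars;
                }
                run = new ArrayList<>();
                runChars = 0;
            }
        }
        return String.join("\n", best);
    }
}
```

Calling `TextDensityExtractor.extract(html, 0.5)` on a page whose navigation links and article paragraphs sit on separate lines keeps the paragraphs and drops the link-heavy lines. The sketch does not decode HTML entities or handle pages delivered as a single line; a production extractor would work on the parsed DOM instead.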

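Usage

Required environment: JDK 11, Redis, Zookeeper, and MongoDB must be installed.

  1. The front end and back end are separate: the front end uses Vue (to be open-sourced later) and the back end uses SpringBoot (already open-sourced)
  2. Build: run mvn install in the root directory; this produces an executable jar in each of X-Dispatcher and X-Spider
  3. Deployment: X-Dispatcher is the management node (deploy exactly one); X-Spider is the crawl node (multiple instances can be deployed)
  4. The core features of this project are complete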

Web Pages

Dashboard

Task management page

Task editing page

Visual locator helper

Crawl results management page

Node management

Subscription group management

Email notification
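The subscription monitoring feature (send an email when a watched site updates and the new content matches a keyword) can be sketched as follows. This is only an illustration under stated assumptions, not the project's code: the class `SubscriptionMonitor` and its methods are hypothetical, an in-memory map stands in for whatever store the project actually uses, change detection is done by comparing SHA-256 fingerprints of the page content, and the first crawl of a URL is treated as an update. Sending the email itself is left out.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the project's code base. */
final class SubscriptionMonitor {
    // Last seen content fingerprint per URL (assumption: kept in memory).
    private final Map<String, String> lastFingerprint = new HashMap<>();
    // Keywords stored in lower case for case-insensitive matching.
    private final List<String> keywords = new ArrayList<>();

    SubscriptionMonitor(List<String> keywords) {
        for (String keyword : keywords) {
            this.keywords.add(keyword.toLowerCase(Locale.ROOT));
        }
    }

    /**
     * Records the latest content for the URL and returns the keywords that
     * should trigger a notification. The list is empty when the content is
     * unchanged since the previous check or when no keyword matches.
     */
    List<String> check(String url, String content) {
        String fingerprint = sha256Hex(content);
        String previous = lastFingerprint.put(url, fingerprint);
        List<String> matched = new ArrayList<>();
        if (fingerprint.equals(previous)) {
            return matched; // No update, so nothing to report.
        }
        String lower = content.toLowerCase(Locale.ROOT);
        for (String keyword : keywords) {
            if (lower.contains(keyword)) {
                matched.add(keyword);
            }
        }
        return matched;
    }

    /** Hex-encoded SHA-256 digest of the UTF-8 bytes of the text. */
    static String sha256Hex(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }
}
```

A caller would invoke `check(url, content)` after each crawl and send a notification only when the returned list is non-empty. Fingerprinting the whole page is a deliberate simplification: pages with rotating ads or timestamps would need the fingerprint computed over the extracted body text instead.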

Contributors

whitefly


