马开东's Projects

nlp-lang

A base package that wraps the utilities most commonly used in NLP projects.

nlp-ml

Natural language processing: theory and practice

nyspider

Assorted crawlers: 大众点评, 安居客, 58, 人人贷, 拍拍贷, IT桔子, 拉勾网, 豆瓣, 搜房网, ASO100, weather data, 猫眼电影, 链家, PM25.in...

otter

Alibaba's distributed database synchronization system (built to sync between geographically separate data centers in China and the US)

paddle

PArallel Distributed Deep LEarning

python-progress

Advanced Python: higher-order functions, understanding of the internals, and some distributed computing

python-spider

A few Python crawlers for 58同城, 智联招聘, hao123, 网易云课堂, **大学排名 (university rankings), and other sites

qqzeng-ip

Latest IP address database, with parsers in multiple languages and scripts for importing it into a database

questionansweringsystem

QuestionAnsweringSystem is a question answering system implemented in Java that automatically analyzes a question and returns candidate answers.

scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

selenium-phantomjs

Crawls data from 拉勾网 using the test automation tool selenium and the headless browser phantomjs

simplehbase

Simplehbase is a lightweight ORM framework between Java applications and HBase.

snowboy

DNN-based hotword and wake word detection toolkit

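spider

A multithreaded, distributed crawler built with java + httpclient + httpcleaner that scrapes product information from e-commerce sites. Data is stored in hbase, products are indexed with solr, a redis queue holds the shared URL repository, and zookeeper monitors the lifecycle of the crawler nodes, among other things.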

spider-1

A web novel crawler built on HttpClient4+; popular novel sites can be added dynamically

spider-2

Collected crawler project source code. Uses redis for URL caching and hbase for storing detailed information, with zookeeper monitoring the state of the crawler threads.

spider01

Crawler exercise 1: scraping static website content with Python
