The girlcrawler from ninjarooter

#girlCrawler

###A crawler for the images on http://www.girl13.com, with the following features:

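Crawls the image lists under every topic on the site
Creates a local folder for each topic
Downloads the crawled images into the local folder for their topic
On repeated runs, checks whether each image file already exists; existing files are skipped and only new images are downloaded, which saves bandwidth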
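
The folder-per-topic and skip-if-exists behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `girl.js`: the helper names `ensureTopicDir` and `planDownload`, and the idea of deriving the file name from the image URL, are assumptions made for the example.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: create the folder for one topic (if needed) and return its path.
function ensureTopicDir(rootDir, topic) {
  const dir = path.join(rootDir, topic);
  fs.mkdirSync(dir, { recursive: true }); // does not throw if the folder already exists
  return dir;
}

// Hypothetical helper: work out where an image would be saved and
// whether it can be skipped because the file is already on disk.
function planDownload(rootDir, topic, imageUrl) {
  // Assumption: the last path segment of the URL is used as the file name.
  const fileName = path.basename(new URL(imageUrl).pathname);
  const target = path.join(ensureTopicDir(rootDir, topic), fileName);
  return { target, skip: fs.existsSync(target) };
}
```

A caller would download only when `skip` is `false`, so a second run fetches just the images that are not yet present.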

###girlCrawler is built mainly on the following dependencies:

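Node.js - the JavaScript runtime the crawler runs on
cheerio - a fast, flexible, and lean implementation of core jQuery designed specifically for the server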

###Installation and startup

Install Node.js.

Clone the whole project to your local machine.

 >git clone https://github.com/xuelangcxy/girlCrawler.git

Start the main file from the project's root directory:
```
 >node girl.js
```

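###Known issues

The download may stop partway through a run; press Ctrl+C to end the process and try starting the project again.
After downloading, some images may fail to open because their file size is 0; delete those files and run the project again.
Rerunning the project does not re-download files that already exist.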
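
Deleting the zero-size images by hand can be tedious when there are many topic folders. The sketch below shows one way to automate that cleanup before rerunning the crawler. It is not part of the project: `removeEmptyFiles` is a hypothetical helper, and the directory you pass in is whatever folder the images were saved to on your machine.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: recursively delete zero-byte files under rootDir
// and return the list of paths that were removed.
function removeEmptyFiles(rootDir) {
  const removed = [];
  for (const entry of fs.readdirSync(rootDir, { withFileTypes: true })) {
    const full = path.join(rootDir, entry.name);
    if (entry.isDirectory()) {
      // Descend into topic folders.
      removed.push(...removeEmptyFiles(full));
    } else if (entry.isFile() && fs.statSync(full).size === 0) {
      fs.unlinkSync(full);
      removed.push(full);
    }
  }
  return removed;
}
```

Because existing files are skipped on later runs, removing the empty ones first lets the next run download them again.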

###Note:

The image set is large, roughly 350-400 MB in testing, so take that into account before downloading.
