Comments (6)
Here are a few things to try:
1. Disable PhantomJS
If the --driver option is set to anything other than "phantomjs" (the default value), PhantomJS is disabled and the Go http library is used to request the qidian URL instead.
$ ./FictionDown --url xxx d --driver 1
...
...
...
2. Build the latest commit
$ go build -v github.com/ma6254/FictionDown/cmd/FictionDown
$ ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/11 17:32:37 URL: "https://book.qidian.com/info/3249362"
2019/03/11 17:32:37 Init PhantomJS
2019/03/11 17:33:00 Loading....
书名: "一世之尊"
作者: "爱潜水的乌贼"
封面: https://bookcover.yuewen.com/qdbimg/349573/3249362/180
简介:
我这一生,不问前尘,不求来世,只轰轰烈烈,快意恩仇,败尽各族英杰,傲笑六道神魔!
万年之后,大劫再启,如来金身,元始道体,孰强孰弱,如来神掌,截天七式,谁领风*?
轮回之中,孟奇自少林寺开始了自己“纵横一生,谁能相抗”的历程。
章节数:
作品相关卷(免费) 17章
第一卷 少年侠气卷(免费) 84章
第二卷 平沙茫茫黄入天卷(免费) 12章
第二卷 平沙茫茫黄入天卷(VIP) 54章
第三卷 满堂花醉三千客卷(VIP) 355章
第四卷 二十年纵横间卷(VIP) 403章
第五卷 人有病,天知否?卷(VIP) 26章
第六卷 东风夜放花千树卷(VIP) 240章
第七卷 天意自古高难问卷(VIP) 188章
第八卷 苍茫大地谁主沉浮卷(VIP) 65章
2019/03/11 17:33:01 Working...
2019/03/11 17:33:01 routine: 10
...
...
...
3. Upgrade PhantomJS to the latest version
$ phantomjs --version # the PhantomJS version I use on my macOS laptop
2.1.1
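Chinese version (translated)
Please excuse my poor English.
1. Disable PhantomJS
The d subcommand has a --driver option whose default value is "phantomjs", which means the page is crawled with PhantomJS. Setting it to any other value disables PhantomJS; requests are then built with Go's official http library.
The reason this option exists:
Qidian applies a different anti-crawling strategy to each book's info page.
- Sometimes the volume information is loaded dynamically (this requires PhantomJS), and sometimes it is served statically.
- Sometimes the mobile page is returned, and sometimes the desktop page.
In most cases both approaches (PhantomJS and plain HTTP) can crawl the page.
In a small number of cases only one of them works (PhantomJS works but HTTP does not, or sometimes the reverse).
The paragraphs above will be added to README.md; leaving them out was my oversight.
The following two cases are less likely, so they are not discussed for now.
2. Build the latest commit
Omitted.
3. Upgrade PhantomJS
Omitted.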
from fictiondown.
Upgrading PhantomJS and then using the release binary does not work:
$ ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/11 19:22:11 Init PhantomJS
2019/03/11 19:22:12 URL: "https://book.qidian.com/info/3249362"
2019/03/11 19:22:14 Close PhantomJS
2019/03/11 19:22:14 not match volumes
$ phantomjs -v
2.1.1
Disabling the driver works.
Building from the original source still does not work:
$ go build -v github.com/ma6254/FictionDown/cmd/FictionDown
github.com/ma6254/FictionDown/cmd/FictionDown
$ ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/11 19:25:24 URL: "https://book.qidian.com/info/3249362"
2019/03/11 19:25:24 Init PhantomJS
2019/03/11 19:25:27 Close PhantomJS
2019/03/11 19:25:27 not match volumes
With the content below, the download fails:
bookurl: https://book.qidian.com/info/1004608738
bookname: 圣墟
author: 辰东
coverurl: https://bookcover.yuewen.com/qdbimg/349573/1004608738/180
description: |-
在破败中崛起,在寂灭中复苏。
沧海成尘,雷电枯竭,那一缕幽雾又一次临近大地,世间的枷锁被打开了,一个全新的世界就此揭开神秘的一角……
tmap:
- https://www.biqiuge.com/book/4772 <===
- https://www.biquge5200.cc/52_52542 <===
volumes: []
I only added the marked lines to the generated file.
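If the volume information cannot be matched, the download cannot proceed. For now, retry a few times; one of the attempts will eventually get the volume information. Only then is the official content crawled, and only after that can the pirated sources be added and their content crawled.
I will fix this by adding a retry mechanism in the next few commits.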
I tried it 9 times and finally got the volume information.
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:55:43 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:55:43 Init PhantomJS
2019/03/12 04:55:50 Close PhantomJS
2019/03/12 04:55:50 not match volumes
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:55:52 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:55:52 Init PhantomJS
2019/03/12 04:55:59 Close PhantomJS
2019/03/12 04:55:59 not match volumes
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:56:00 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:56:00 Init PhantomJS
2019/03/12 04:56:06 Close PhantomJS
2019/03/12 04:56:06 not match volumes
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:56:08 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:56:08 Init PhantomJS
2019/03/12 04:56:15 Close PhantomJS
2019/03/12 04:56:15 not match volumes
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:56:17 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:56:17 Init PhantomJS
2019/03/12 04:56:21 Close PhantomJS
2019/03/12 04:56:21 not match volumes
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:56:23 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:56:23 Init PhantomJS
2019/03/12 04:56:29 Close PhantomJS
2019/03/12 04:56:29 not match volumes
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:56:31 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:56:31 Init PhantomJS
2019/03/12 04:56:36 Close PhantomJS
2019/03/12 04:56:36 not match volumes
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:56:38 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:56:38 Init PhantomJS
2019/03/12 04:56:44 Close PhantomJS
2019/03/12 04:56:44 not match volumes
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:56:46 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:56:46 Init PhantomJS
书名: "一世之尊"
作者: "爱潜水的乌贼"
封面: https://bookcover.yuewen.com/qdbimg/349573/3249362/180
简介:
我这一生,不问前尘,不求来世,只轰轰烈烈,快意恩仇,败尽各族英杰,傲笑六道神魔!
万年之后,大劫再启,如来金身,元始道体,孰强孰弱,如来神掌,截天七式,谁领风*?
轮回之中,孟奇自少林寺开始了自己“纵横一生,谁能相抗”的历程。
章节数:
作品相关卷(免费) 17章
第一卷 少年侠气卷(免费) 84章
第二卷 平沙茫茫黄入天卷(免费) 12章
第二卷 平沙茫茫黄入天卷(VIP) 54章
第三卷 满堂花醉三千客卷(VIP) 355章
第四卷 二十年纵横间卷(VIP) 403章
第五卷 人有病,天知否?卷(VIP) 26章
第六卷 东风夜放花千树卷(VIP) 240章
第七卷 天意自古高难问卷(VIP) 188章
第八卷 苍茫大地谁主沉浮卷(VIP) 65章
2019/03/12 04:56:56 Working...
2019/03/12 04:56:56 routine: 10
231 / 1444 [====================>-----------------------------------------------------------------------------------------------------------] 16.00% 03m44s^C
2019/03/12 04:57:39 进程信号: interrupt
2019/03/12 04:57:39 [爬取结束] 已缓存:113 样本:119 完成样本:0
2019/03/12 04:57:39 Close PhantomJS
maqinfen@mqf ~/m/s/g/m/F/release>
Added support for Chromedp.
If Chrome is installed, this opens a new Chrome window, which closes once loading is complete.
./FictionDown --url https://book.qidian.com/info/3249362 d --driver chromedp