
example failed about fictiondown (CLOSED)

ma6254 commented on August 12, 2024
example failed

from fictiondown.

Comments (6)

ma6254 commented on August 12, 2024

Maybe try the following:

1. Disable PhantomJS

Set the --driver option to anything other than "phantomjs" (the default value) to disable PhantomJS. FictionDown then uses Go's http library to request the qidian URL.

 $ ./FictionDown --url xxx d --driver 1 
...
...
...

2. Build the latest commit

$ go build -v  github.com/ma6254/FictionDown/cmd/FictionDown
$ ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/11 17:32:37 URL: "https://book.qidian.com/info/3249362"
2019/03/11 17:32:37 Init PhantomJS
2019/03/11 17:33:00 Loading....
书名: "一世之尊"
作者: "爱潜水的乌贼"
封面: https://bookcover.yuewen.com/qdbimg/349573/3249362/180
简介:
	我这一生,不问前尘,不求来世,只轰轰烈烈,快意恩仇,败尽各族英杰,傲笑六道神魔!
	万年之后,大劫再启,如来金身,元始道体,孰强孰弱,如来神掌,截天七式,谁领风*?
	轮回之中,孟奇自少林寺开始了自己“纵横一生,谁能相抗”的历程。
章节数:
	作品相关卷(免费) 17章
	第一卷 少年侠气卷(免费) 84章
	第二卷 平沙茫茫黄入天卷(免费) 12章
	第二卷 平沙茫茫黄入天卷(VIP) 54章
	第三卷 满堂花醉三千客卷(VIP) 355章
	第四卷 二十年纵横间卷(VIP) 403章
	第五卷 人有病,天知否?卷(VIP) 26章
	第六卷 东风夜放花千树卷(VIP) 240章
	第七卷 天意自古高难问卷(VIP) 188章
	第八卷 苍茫大地谁主沉浮卷(VIP) 65章
2019/03/11 17:33:01 Working...
2019/03/11 17:33:01 routine: 10
...
...
...

3. Upgrade PhantomJS to the latest version

$ phantomjs --version # the PhantomJS version I use on my macOS laptop
2.1.1

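In more detail

Please excuse my poor English.

1. Disable PhantomJS

The d subcommand has a --driver option whose default value is "phantomjs", meaning the page is crawled with PhantomJS. Setting it to any other value disables PhantomJS; requests are then built with Go's official http library.

The reason this option exists:

Qidian applies a different anti-crawling strategy to each book's info page.

  1. Sometimes the volume information is loaded dynamically (which requires PhantomJS), and sometimes it is served statically.
  2. Sometimes the mobile page is returned, and sometimes the PC page.

In most cases both methods (PhantomJS and plain HTTP) can crawl the page.
In a few cases only one of them works (PhantomJS works but HTTP does not, or sometimes the reverse).

I will add the paragraphs above to README.md; leaving them out was my oversight.
The following two cases are less likely, so I will not discuss them for now.

2. Build the latest commit

3. Upgrade PhantomJS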


daixiang0 commented on August 12, 2024

Upgrading PhantomJS and then using the release binary does not work:

$ ./FictionDown --url https://book.qidian.com/info/3249362 d 
2019/03/11 19:22:11 Init PhantomJS
2019/03/11 19:22:12 URL: "https://book.qidian.com/info/3249362"
2019/03/11 19:22:14 Close PhantomJS
2019/03/11 19:22:14 not match volumes
$ phantomjs -v
2.1.1

Disabling the driver works.


daixiang0 commented on August 12, 2024

Building from the original source still does not work:

$ go build -v  github.com/ma6254/FictionDown/cmd/FictionDown
github.com/ma6254/FictionDown/cmd/FictionDown
$  ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/11 19:25:24 URL: "https://book.qidian.com/info/3249362"
2019/03/11 19:25:24 Init PhantomJS
2019/03/11 19:25:27 Close PhantomJS
2019/03/11 19:25:27 not match volumes


daixiang0 commented on August 12, 2024

With the content below, the download fails:

bookurl: https://book.qidian.com/info/1004608738
bookname: 圣墟
author: 辰东
coverurl: https://bookcover.yuewen.com/qdbimg/349573/1004608738/180
description: |-
  在破败中崛起,在寂灭中复苏。
  沧海成尘,雷电枯竭,那一缕幽雾又一次临近大地,世间的枷锁被打开了,一个全新的世界就此揭开神秘的一角……
tmap:
- https://www.biqiuge.com/book/4772       <===
- https://www.biquge5200.cc/52_52542    <===
volumes: []

I only added the marked lines to the generated file.


ma6254 commented on August 12, 2024

The volume information could not be matched, so of course nothing can be downloaded. For now the only workaround is to try several times; one of the attempts will eventually get the volume information.

I will fix it.

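In more detail

Keep retrying; eventually one attempt gets the volume information. Only then is the licensed content crawled, and only after that can you add the pirate-site sources and crawl their content.

I will add a retry mechanism in the next few commits.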

Bash

I tried it 9 times and finally got it.

maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:55:43 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:55:43 Init PhantomJS
2019/03/12 04:55:50 Close PhantomJS
2019/03/12 04:55:50 not match volumes
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:55:52 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:55:52 Init PhantomJS
2019/03/12 04:55:59 Close PhantomJS
2019/03/12 04:55:59 not match volumes
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:56:00 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:56:00 Init PhantomJS
2019/03/12 04:56:06 Close PhantomJS
2019/03/12 04:56:06 not match volumes
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:56:08 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:56:08 Init PhantomJS
2019/03/12 04:56:15 Close PhantomJS
2019/03/12 04:56:15 not match volumes
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:56:17 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:56:17 Init PhantomJS
2019/03/12 04:56:21 Close PhantomJS
2019/03/12 04:56:21 not match volumes
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:56:23 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:56:23 Init PhantomJS
2019/03/12 04:56:29 Close PhantomJS
2019/03/12 04:56:29 not match volumes
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:56:31 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:56:31 Init PhantomJS
2019/03/12 04:56:36 Close PhantomJS
2019/03/12 04:56:36 not match volumes
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:56:38 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:56:38 Init PhantomJS
2019/03/12 04:56:44 Close PhantomJS
2019/03/12 04:56:44 not match volumes
maqinfen@mqf ~/m/s/g/m/F/release> ./FictionDown --url https://book.qidian.com/info/3249362 d
2019/03/12 04:56:46 URL: "https://book.qidian.com/info/3249362"
2019/03/12 04:56:46 Init PhantomJS
书名: "一世之尊"
作者: "爱潜水的乌贼"
封面: https://bookcover.yuewen.com/qdbimg/349573/3249362/180
简介:
	我这一生,不问前尘,不求来世,只轰轰烈烈,快意恩仇,败尽各族英杰,傲笑六道神魔!
	万年之后,大劫再启,如来金身,元始道体,孰强孰弱,如来神掌,截天七式,谁领风*?
	轮回之中,孟奇自少林寺开始了自己“纵横一生,谁能相抗”的历程。
章节数:
	作品相关卷(免费) 17章
	第一卷 少年侠气卷(免费) 84章
	第二卷 平沙茫茫黄入天卷(免费) 12章
	第二卷 平沙茫茫黄入天卷(VIP) 54章
	第三卷 满堂花醉三千客卷(VIP) 355章
	第四卷 二十年纵横间卷(VIP) 403章
	第五卷 人有病,天知否?卷(VIP) 26章
	第六卷 东风夜放花千树卷(VIP) 240章
	第七卷 天意自古高难问卷(VIP) 188章
	第八卷 苍茫大地谁主沉浮卷(VIP) 65章
2019/03/12 04:56:56 Working...
2019/03/12 04:56:56 routine: 10
 231 / 1444 [====================>-----------------------------------------------------------------------------------------------------------]  16.00% 03m44s^C2019/03/12 04:57:39 进程信号: interrupt
2019/03/12 04:57:39 [爬取结束] 已缓存:113 样本:119 完成样本:0
2019/03/12 04:57:39 Close PhantomJS
maqinfen@mqf ~/m/s/g/m/F/release>


ma6254 commented on August 12, 2024

I added support for chromedp.
If Chrome is installed, this opens a new Chrome window, which closes once loading is complete.

./FictionDown --url https://book.qidian.com/info/3249362 d --driver chromedp

