book118-downloader's Introduction

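1. Overview

book118-downloader Download

This is a downloader for documents that can be previewed on book118 (PPT files and files that require payment to preview are not supported yet).

The project is written in Java. It uses httpclient for downloading and iText to generate the PDF.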

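2. User Guide

  1. Download and unzip the release.

  2. Double-click run.bat to start the program. If it does not start, check that JRE 8+ (Java Runtime Environment) is installed.

  3. The document ID is the number at the end of the preview page link. For example, for https://max.book118.com/html/2017/0611/113657916.shtm the document ID is 113657916.

  4. After you enter the ID, the program has to fetch the download links. The more pages the file has, the longer this takes, so please be patient. Progress is shown once the download starts.

  5. Downloaded files are saved in the out folder.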
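
The rule in step 3 (the document ID is the trailing number in the preview link) can be sketched as a small parser. This is only an illustration: `DocumentIdParser` and its `parse` method are hypothetical names, not taken from the project's source, and the pattern assumes links always end in `<digits>.shtm` as in the example above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of book118-downloader.
class DocumentIdParser {
    // Assumption: preview links end in "/<digits>.shtm".
    private static final Pattern URL_ID = Pattern.compile("/(\\d+)\\.shtm$");
    private static final Pattern BARE_ID = Pattern.compile("\\d+");

    /** Accepts either a bare ID or a full preview link and returns the ID if one is found. */
    static Optional<String> parse(String input) {
        if (input == null) {
            return Optional.empty();
        }
        String s = input.trim();
        if (BARE_ID.matcher(s).matches()) {
            return Optional.of(s);
        }
        Matcher m = URL_ID.matcher(s);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(parse("https://max.book118.com/html/2017/0611/113657916.shtm"));
        System.out.println(parse("113657916"));
        System.out.println(parse("not a link"));
    }
}
```

Returning an `Optional` rather than throwing lets a caller re-prompt when the input is neither an ID nor a recognizable link.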

If you run into problems, see Bug Fix on GitHub or submit an Issue.

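3. Implementation

Not long after finishing this, I saw someone else's implementation on freebuf. It was not as good as mine, so I wrote an article of my own; the editor agreed to fix the formatting for me, so I waived the author's fee. 《另一种绕过限制下载论文的思路》 ("Another approach to bypassing restrictions when downloading papers")

The downloader works by simulating a preview in the web page to obtain all of the document's preview images, then converting those images into a PDF.

The implementation revolves around two of the site's JS functions, both of which are in resources/temp.js. openFull fetches the first preview page, and getNextPage fetches the pages after it. Together, these two functions yield the URLs of all preview images of a document.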
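
The traversal described above can be sketched as a simple loop. This is a minimal sketch under stated assumptions, not the project's actual code: the `PageSource` interface, its Java method signatures, `fakeSource`, and the `maxPages` guard are all hypothetical, and the real openFull and getNextPage are JS functions whose parameters are not shown here. The `Over` end marker is the one mentioned in the 2018/6/26 Bug Fix entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of the "first page, then next page until the end marker" loop.
class PreviewCrawler {
    /** Stand-in for the site's two JS functions; the signatures are assumptions. */
    interface PageSource {
        String openFull();                   // URL of the first preview image
        String getNextPage(String current);  // URL of the following image, or the end marker
    }

    /** Collects image URLs until the end marker, a null/empty reply, or maxPages is reached. */
    static List<String> collectImageUrls(PageSource source, String endMarker, int maxPages) {
        List<String> urls = new ArrayList<>();
        String current = source.openFull();
        while (current != null && !current.isEmpty()
                && !endMarker.equals(current) && urls.size() < maxPages) {
            urls.add(current);
            current = source.getNextPage(current);
        }
        return urls;
    }

    /** In-memory fake that replays canned responses, so the loop can be run offline. */
    static PageSource fakeSource(List<String> responses) {
        Iterator<String> it = responses.iterator();
        return new PageSource() {
            @Override public String openFull() { return it.hasNext() ? it.next() : null; }
            @Override public String getNextPage(String current) { return it.hasNext() ? it.next() : null; }
        };
    }

    public static void main(String[] args) {
        PageSource source = fakeSource(Arrays.asList("p1.gif", "p2.gif", "p3.gif", "Over"));
        System.out.println(collectImageUrls(source, "Over", 1000));
    }
}
```

The `maxPages` cap is a defensive choice: if the site changes its end marker again, the loop still terminates instead of requesting pages forever.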

4. Bug Fix

  • todo

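Downloading ppt files

  • 2019/03/13

Fixed issue#3 Download

  • 2018/11/17

Fixed download failures caused by occasionally incorrect URL concatenation (document 156906400 failed to download)

  • 2018/11/11

Switched to asynchronous downloading; there is no longer a need to wait for all page links to be fetched

  • 2018/9/14

Improved the display of status messages. Netdisk download

  • 2018/9/12

Changed viewHost to be taken from the response. Netdisk download

  • 2018/9/7

Refactored the code on top of hutool

  • 2018/6/26

Fixed a download failure: after the end marker changed to Over, page fetching never stopped.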

book118-downloader's People

Contributors

dependabot[bot], jtydhr88, mrbcy, winterxmq, wxynihao

book118-downloader's Issues

Exception While Downloading A PDF

Exception in thread "main" cn.hutool.json.JSONException: A JSONObject text must begin with '{' at 1 [character 2 line 1]
at cn.hutool.json.JSONTokener.syntaxError(JSONTokener.java:373)
at cn.hutool.json.JSONObject.init(JSONObject.java:781)
at cn.hutool.json.JSONObject.<init>(JSONObject.java:137)
at cn.hutool.json.JSONObject.<init>(JSONObject.java:124)
at cn.hutool.json.JSONObject.<init>(JSONObject.java:214)
at cn.hutool.json.JSONUtil.parseObj(JSONUtil.java:57)
at cn.hutool.json.JSONUtil.toBean(JSONUtil.java:305)
at me.rainking.DocumentBrowser.getNextPage(DocumentBrowser.java:200)
at me.rainking.DocumentBrowser.moveToNextPage(DocumentBrowser.java:76)
at me.rainking.DocumentBrowser.downloadWholeDocument(DocumentBrowser.java:109)
at me.rainking.BookDownloader.main(BookDownloader.java:79)

Was Downloading This File: https://max.book118.com/html/2019/0210/7160201143002005.shtm

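Pages are not sorted in order

image

Environment: MacOS 10.14.5
Version: 1.8.0_192
Runtime: IDEA 2019 2.3
Downloaded book ID: 134120889 马来西亚史

Problem:
The gif files downloaded successfully, but the pages in the generated PDF were out of order. Inspecting the stack showed that the files were sorted incorrectly. The stack is shown in the screenshot.

My own fix: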
List<File> files = Arrays.asList(picFiles);
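
The line above is all the report shows of the fix. One way to restore page order, sketched here under the assumption that each image file name starts with its page number (for example `12.gif`), is to sort by that number rather than by the name as text. `PageOrder` and its methods are hypothetical names; the project's real file naming is not shown in the report.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: order page images numerically, so that 2.gif precedes 10.gif.
class PageOrder {
    /** Parses the leading digits of the file name; names without digits sort last. */
    static long pageNumber(File f) {
        String name = f.getName();
        int end = 0;
        // Cap at 18 digits so Long.parseLong cannot overflow.
        while (end < name.length() && end < 18
                && name.charAt(end) >= '0' && name.charAt(end) <= '9') {
            end++;
        }
        return end == 0 ? Long.MAX_VALUE : Long.parseLong(name.substring(0, end));
    }

    /** Returns a new list sorted by page number; the input array is left untouched. */
    static List<File> sortByPage(File[] picFiles) {
        List<File> files = new ArrayList<>(Arrays.asList(picFiles));
        files.sort(Comparator.comparingLong(PageOrder::pageNumber));
        return files;
    }

    public static void main(String[] args) {
        File[] pics = { new File("10.gif"), new File("2.gif"), new File("1.gif") };
        System.out.println(sortByPage(pics));
    }
}
```

Plain string ordering would place `10.gif` before `2.gif`, and `File.listFiles()` makes no ordering guarantee at all, which is consistent with the symptom described above.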

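Worked a few months ago, but fails when I tried it today

After starting a task the program suddenly crashes. A screen capture caught: [DEBUG] cn.hutool.log.LogFactory: Use [Hutool Console Logging] Logger As Default.

It crashes right after this line appears. On reopening, it asks whether to run; after entering Y the same thing happens. Is this a problem with my machine, or has max118 deployed countermeasures?

Crash

Every time I use it the program crashes and nothing is downloaded. Has anyone else run into this?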

win10 64bit, with both 32-bit and 64-bit Java 8 installed: String index out of range: -1

Downloading https://max.book118.com/html/2016/1215/72601794.shtm, the program prints
Parsing...
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: -1
at java.lang.String.substring(Unknown Source)
at me.rainking.DocumentBrowser.getJson(DocumentBrowser.java:170)
at me.rainking.DocumentBrowser.getPicUrl(DocumentBrowser.java:136)
at me.rainking.DocumentBrowser.downloadWholeDocument(DocumentBrowser.java:100)
at me.rainking.BookDownloader.main(BookDownloader.java:79)
and then stops.
The run path contains Chinese characters; I am not sure whether that matters. Thanks to the author!
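
An index of -1 passed to `String.substring` commonly means that an earlier `indexOf` search did not find the text it expected, for instance when the server returns an error page instead of the usual payload. The sketch below shows one defensive pattern. It is an assumption about the kind of check that could help, not a description of what `DocumentBrowser.getJson` does; `SafeExtract` and `jsonBody` are hypothetical names.

```java
import java.util.Optional;

// Hypothetical sketch: check indexOf results before calling substring.
class SafeExtract {
    /** Returns the text from the first '{' to the last '}' inclusive, or empty if either is missing. */
    static Optional<String> jsonBody(String response) {
        if (response == null) {
            return Optional.empty();
        }
        int start = response.indexOf('{');
        int end = response.lastIndexOf('}');
        if (start < 0 || end < start) {
            // Nothing that looks like a JSON object; let the caller report the raw response.
            return Optional.empty();
        }
        return Optional.of(response.substring(start, end + 1));
    }

    public static void main(String[] args) {
        System.out.println(jsonBody("callback({\"a\":1});"));
        System.out.println(jsonBody("<html>error</html>"));
    }
}
```

With a check like this, an unexpected response can be logged and reported as a readable error instead of surfacing as a `StringIndexOutOfBoundsException` or as a JSON parser complaint that the text does not begin with '{'.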

Failed to select a proxy

Exception in thread "main" cn.hutool.http.HttpException: Failed to select a proxy
at cn.hutool.http.HttpRequest.send(HttpRequest.java:919)
at cn.hutool.http.HttpRequest.execute(HttpRequest.java:803)
at cn.hutool.http.HttpRequest.executeAsync(HttpRequest.java:787)
at cn.hutool.http.HttpUtil.download(HttpUtil.java:458)
at cn.hutool.http.HttpUtil.download(HttpUtil.java:438)
at me.rainking.DocumentBrowser.downloadFile(DocumentBrowser.java:271)
at me.rainking.DocumentBrowser.downloadWholeDocument(DocumentBrowser.java:107)
at me.rainking.BookDownloader.main(BookDownloader.java:79)
Caused by: java.io.IOException: Failed to select a proxy
at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1186)
at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1082)
at java.base/sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:1016)
at cn.hutool.http.HttpConnection.connect(HttpConnection.java:447)
at cn.hutool.http.HttpRequest.send(HttpRequest.java:916)
... 7 more
Caused by: java.lang.IllegalArgumentException: protocol = http host = null
at java.base/sun.net.spi.DefaultProxySelector.select(DefaultProxySelector.java:184)
at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1184)
... 11 more
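
The innermost cause above, `protocol = http host = null`, indicates that the URL handed to the connection had no host part. A download URL could be validated before the request is made, as in the following sketch. `UrlCheck` and `hasHttpHost` are hypothetical names, and how the project builds its image URLs is not shown in this report.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch: reject URLs that lack an http(s) scheme or a host before downloading.
class UrlCheck {
    static boolean hasHttpHost(String url) {
        if (url == null) {
            return false;
        }
        try {
            URI uri = new URI(url.trim());
            String scheme = uri.getScheme();
            boolean http = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
            return http && uri.getHost() != null;
        } catch (URISyntaxException e) {
            // Malformed input is treated the same as a missing host.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasHttpHost("https://example.com/img/1.gif"));
        System.out.println(hasHttpHost("http:///img/1.gif"));
    }
}
```

Failing early with the offending URL in the message makes a problem such as an empty host prefix much easier to diagnose than a proxy-selection error deep inside the JDK.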

Has it stopped working? QAQ

Ver.20190629 latest: https://github.com/wxynihao/book118-downloader
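Add a download task

  1. Enter a document ID, e.g. 3045568
  2. Enter the full document URL, e.g. https://max.book118.com/html/2012/1017/3045568.shtm
  3. Enter start to begin the download tasks
  4. Enter # to exit the program

Enter command: 167962114
Download task 167962114 added
Add a download task

  1. Enter a document ID, e.g. 3045568
  2. Enter the full document URL, e.g. https://max.book118.com/html/2012/1017/3045568.shtm
  3. Enter start to begin the download tasks
  4. Enter # to exit the program

Enter command: start
Documents to download:
167962114
Downloading document: 167962114
[2019-11-25 21:40:09] [DEBUG] cn.hutool.log.LogFactory: Use [Hutool Console Logging] Logger As Default.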

After entering the ID, it reports that the download task was added and then nothing happens

C:\Users\bao\Downloads\book118Downloader\book118Downloader>java -jar -Dfile.encoding=utf-8 book118Downloader-V2020.jar
Ver.20201018 latest: https://github.com/wxynihao/book118-downloader
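Add a download task

  1. Enter a document ID, e.g. 3045568
  2. Enter the full document URL, e.g. https://max.book118.com/html/2012/1017/3045568.shtm
  3. Enter start to begin the download tasks
  4. Enter # to exit the program

Enter command: https://m.book118.com/html/2021/1116/7001131112004043.shtm
Download task 7001131112004043 added
Add a download task

  1. Enter a document ID, e.g. 3045568
  2. Enter the full document URL, e.g. https://max.book118.com/html/2012/1017/3045568.shtm
  3. Enter start to begin the download tasks
  4. Enter # to exit the program

Enter command: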
