jiangjm20 / reserves-lib-tsinghua-downloader

This project forked from i207m/reserves-lib-tsinghua-downloader

Download pages from http://reserves.lib.tsinghua.edu.cn/

License: GNU General Public License v3.0

Python 100.00%

reserves-lib-tsinghua-downloader's Introduction

Tsinghua University Course Reserves Platform (清华大学教参服务平台) Downloader

Download pages from http://reserves.lib.tsinghua.edu.cn/

Automatically downloads the original image of every page of a book.

Usage

(Screenshot: the list of links under "Read full text" (阅读全文), with the first link highlighted in yellow.)

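Run downloader.py (or call the function claw) and pass in the first link under "Read full text" (阅读全文), which is the position highlighted in yellow in the screenshot.

The program automatically crawls every page of every chapter and saves the pages under ./clawed.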

Q&A

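Q: How do I generate a PDF of a course reserve book?

A: The original solution is to use the licensed Foxit editor provided by the university to merge the images into a PDF, which can also run OCR and compress the images. Now, after the download finishes, the program asks whether to merge the pages into a PDF file automatically; this requires the img2pdf library to be installed.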
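For anyone who wants to merge already-downloaded pages by hand, here is a minimal sketch using img2pdf. It is not the code in downloader.py. The helper names (`natural_key`, `collect_pages`, `merge_to_pdf`) are hypothetical, and the sketch assumes the page images sit directly in one folder with numbered file names.

```python
import re
from pathlib import Path


def natural_key(path):
    """Split a file name into text and integer chunks so '2.jpg' sorts before '10.jpg'."""
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", path.name)]


def collect_pages(folder, suffixes=(".jpg", ".jpeg", ".png")):
    """Return the image files directly inside `folder`, in natural page order."""
    files = [p for p in Path(folder).iterdir()
             if p.is_file() and p.suffix.lower() in suffixes]
    return sorted(files, key=natural_key)


def merge_to_pdf(folder, output):
    """Merge every page image in `folder` into one PDF written to `output`."""
    import img2pdf  # third-party; install with: pip install img2pdf

    pages = collect_pages(folder)
    if not pages:
        raise FileNotFoundError(f"no page images found in {folder}")
    with open(output, "wb") as fh:
        # img2pdf.convert accepts a list of image file names and returns PDF bytes.
        fh.write(img2pdf.convert([str(p) for p in pages]))
```

A call such as `merge_to_pdf("path/to/one/chapter", "chapter.pdf")` would then produce the PDF. The natural sort matters because a plain string sort would place page 10 before page 2.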

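Q: The program fails with ModuleNotFoundError: No module named 'requests'. What should I do?

A: Run pip install requests on the command line to install the library.

Q: The program fails with No cookie data. What should I do?

A: In testing, the vast majority of course reserve books can be accessed without a cookie. A few require a cookie for authentication. For those, take the values of .ASPXAUTH and ASP.NET_SessionId from the site's cookies and write them into cookie.txt in that order, one per line. (I will improve the tutorial on how to obtain the site's cookies. If you need it urgently, please email me.)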
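The cookie.txt layout described above (two values, one per line, in the order .ASPXAUTH then ASP.NET_SessionId) can be read into a dictionary as in the sketch below. This is an illustration under that assumption, not the parsing code in downloader.py; `load_cookies` and `COOKIE_NAMES` are hypothetical names.

```python
from pathlib import Path

# Cookie names in the order their values are expected in cookie.txt.
COOKIE_NAMES = (".ASPXAUTH", "ASP.NET_SessionId")


def load_cookies(path="cookie.txt"):
    """Read one cookie value per line and pair each value with its cookie name."""
    text = Path(path).read_text(encoding="utf-8")
    # Ignore blank lines and surrounding whitespace.
    values = [line.strip() for line in text.splitlines() if line.strip()]
    if len(values) < len(COOKIE_NAMES):
        raise ValueError(
            f"expected {len(COOKIE_NAMES)} cookie values in {path}, found {len(values)}"
        )
    return dict(zip(COOKIE_NAMES, values))
```

The resulting dictionary can be passed to requests through its `cookies` parameter, for example `requests.get(url, cookies=load_cookies())`.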

TODO

  • CI/CD
  • Async

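Stars, issues, and PRs are welcome.

For learning programming only. Do not use this for illegal purposes!

For a roundup of more commonly used Tsinghua information and services, see here.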

reserves-lib-tsinghua-downloader's People

Contributors

i207m, jiangjm20
