
yanhekt_downloader's Introduction

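Video downloader for YanHeKT (延河课堂), Beijing Institute of Technology

Stars 🌟 and issues are welcome!

This project won the Grand Prize 🎉 in the 网协 2023 "Ten Lines of Code" (十行代码) competition. See it on 👉 Github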

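Features

Download recorded lectures from YanHeKT

  • Download courses from classes you are not enrolled in
  • Multi-threaded batch downloading
  • Download the computer screen recording or the classroom recording
  • Save files in one folder per course name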

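Changelog

  • 2023-4-10 Synced with changes to the YanHeKT API
  • 2023-4-20 Changed how the js is executed; installing nodejs is no longer required
  • 2023-11-12 Optimized signing efficiency and download speed
    • In theory it can saturate a gigabit wired connection; adjust max_workers to match your machine's performance
  • 2024-4-2 (🌟) Changed the signature implementation
    • Dropped js execution and js2py to improve compatibility (issue#5)
    • The timestamp sign and the url suffix are now both generated in native Python
  • 2024-4-2 (🌟) Changed the interaction model and added full command-line arguments
    • Download all lessons at once; thanks to @ZJC-GH for the suggestion and PR
    • Download VGA and Video separately or together
    • Incremental download: files that are already downloaded are skipped automatically
    • Temporary files are now stored in temp
    • The output folder location can be customized
    • See the usage sections below
    • Cleaner ffmpeg output
  • 2024-4-3 (🌟🌟) Added a GUI
    • Built on PySimpleGUI4, so it runs cross-platform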

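Before You Start

Method 0: Windows release (recommended)

  • Download the exe file from Releases
    • The build bundled with ffmpeg is recommended, e.g. yanhekt-x.x.x-gui-ffmpeg.exe
  • Run it directly and skip the rest of this section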

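Method 1: Install from PyPI

https://pypi.org/project/yanhekt/

  • Install yanhekt

    pip install yanhekt
  • Make sure ffmpeg is available in your command-line environment

    • Search online for installation instructions
      • On Windows, download it and add it to your environment variables (PATH)
    • If the final video is not merged, the ffmpeg setup is broken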

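Method 2: Run from source

  1. Download/clone this repository, or download a release from Releases

  2. Install the Python dependencies

    pip install -r requirements.txt
    # (it is really just requests)
  3. Make sure ffmpeg is available in your command-line environment (or in the source folder)

    • The releases of this repository ship with ffmpeg (exe only)

    • Search online for installation instructions

      • The easy way on Windows: download it and copy it into the source folder
    • If the final video is not merged, the ffmpeg setup is broken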

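Usage (GUI)

  1. Launching

    1. With the release exe, just open it

    2. If installed with pip

      yanhekt-gui
      yanhekt gui
    3. If running from source

      python main.py gui
  2. Works out of the box

    • Drop in a link or courseID (Ctrl-C / Ctrl-V works)
    • Fetch the course information
    • Select any lessons you like (Ctrl, Shift, and mouse drag all multi-select)
    • Set a few options, such as which videos to download
    • Start downloading!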
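The GUI accepts either a link or a bare courseID. The sketch below shows one way such input could be normalized to a numeric ID. It is not the project's actual code: `parse_course_id` is a hypothetical helper, and it assumes the course page URL has the form https://www.yanhekt.cn/course/11111 used elsewhere in this README.

```python
import re
from urllib.parse import urlparse


def parse_course_id(text: str) -> int:
    """Return the numeric course ID from a bare ID or a course-page URL.

    Hypothetical helper for illustration; not part of the yanhekt package.
    """
    text = text.strip()
    if re.fullmatch(r"\d+", text):
        return int(text)
    # Look for ".../course/<digits>" in the URL path.
    match = re.search(r"/course/(\d+)", urlparse(text).path)
    if match is None:
        raise ValueError(f"no course ID found in {text!r}")
    return int(match.group(1))


print(parse_course_id("https://www.yanhekt.cn/course/11111"))  # 11111
print(parse_course_id(" 11111 "))                              # 11111
```

Raising `ValueError` on unrecognized input keeps a mistyped link from silently turning into a request for the wrong course.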

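Usage (command line)

Note: if you run from local source code, replace yanhekt / yanhekt-cli in this section with python main.py

  1. Get the course ID

    Open the course detail page (not the video player page), e.g. https://www.yanhekt.cn/course/11111

    Take the course ID from the url, e.g. 11111

  2. Command-line arguments

    • Specify the course ID

      • <courseID>, given directly

        # Example: show the course information and the lesson list
        yanhekt 11111
    • Select which lessons to download (indices start at 0)

      • --all, download all lessons
      • --list 0 2 4, download the listed lessons
      • --range 3 5, download a range of lessons (the end index is excluded)
        # Example: download lessons 3 to 8
        yanhekt 11111 --range 3 9
        yanhekt 11111 -R 3 9
    • Select which video type to download

      • --dual, download both the computer screen recording and the classroom video (default)
      • --vga, download the computer screen recording only
      • --video, download the classroom video only
        # Example: download lessons 3 to 8, computer screen recording only
        yanhekt 11111 --range 3 9 --vga
    • Incremental download

      • --skip, skip files that are already downloaded and fetch only newly uploaded videos
        # Example: periodically update all videos of a course
        yanhekt 11111 --all --skip
  3. For more advanced usage, see the command-line help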

    yanhekt --help
    
    # usage: main.py [-h] [-A | -L i [i ...] | -R i i] [-D | -G | -V] [-S] [--dir DIR] [--max-workers num] courseID
    
    # GDDG08/YanHeKT_Downloader
    
    # positional arguments:
    # courseID              Course ID of YanHeKT
    
    # options:
    # -h, --help            show this help message and exit
    
    # Lesson Selection:
    # IF NONE, PRINT LESSON LIST AND EXIT.
    
    # -A, --all             Download all lessons
    # -L i [i ...], --list i [i ...]
    #                         Select of lesson index (e.g., --list 1 2 4)
    # -R i i, --range i i   Select range of lessons (e.g., --range 3 5 for [3,5))
    
    # Video Type:
    # -D, --dual            Download both VGA(PC) and Video (default)
    # -G, --vga             Download VGA(PC) only
    # -V, --video           Download Video only
    
    # Configurations:
    # -S, --skip            Skip existing files
    # --dir DIR             Output directory (e.g., --dir ./output)
    # --max-workers num     Max workers for downloading (default: 32)
    
  4. ENJOY !
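The three lesson-selection flags above resolve to a list of 0-based lesson indices, and --range is half-open (`--range 3 5` means [3,5) according to the help text). The sketch below illustrates that resolution. `select_lessons` and its parameter names are hypothetical and only mirror the documented behavior; they are not the package's own function.

```python
from typing import List, Optional, Sequence


def select_lessons(total: int,
                   all_: bool = False,
                   indices: Optional[Sequence[int]] = None,
                   span: Optional[Sequence[int]] = None) -> List[int]:
    """Resolve --all / --list / --range into 0-based lesson indices.

    Hypothetical helper for illustration only.
    """
    if all_:
        return list(range(total))
    if indices:
        return list(indices)
    if span:
        start, stop = span
        return list(range(start, stop))  # half-open: stop is excluded
    return []  # nothing selected: the CLI prints the lesson list and exits


print(select_lessons(10, span=(3, 9)))        # [3, 4, 5, 6, 7, 8]
print(select_lessons(10, indices=[0, 2, 4]))  # [0, 2, 4]
```

This is why the example for lessons 3 to 8 passes `--range 3 9`: the end index itself is not downloaded.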

Using it as a Python package

Still in early development; feature requests and PRs are welcome

from yanhekt import YanHeKT

yanhekt = YanHeKT(25555, _all=True, _dual=True, _skip=True, _dir='./')
yanhekt.download()
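`_skip=True` corresponds to the --skip flag: files that already exist in the output folder are not downloaded again. The sketch below shows what such a check could look like. `files_to_download` is a hypothetical helper, and the `<title>-VGA.mp4` / `<title>-Video.mp4` naming is an assumption taken from the code posted in the issue further down this page.

```python
import os
import tempfile
from typing import List


def files_to_download(out_dir: str, title: str, skip: bool = True) -> List[str]:
    """List the target files for one lesson, omitting existing ones when skip is set.

    Hypothetical helper; the file naming is an assumption, see the text above.
    """
    safe_title = title.replace("/", "-")  # "/" in a title would otherwise split the path
    targets = [os.path.join(out_dir, f"{safe_title}-{kind}.mp4")
               for kind in ("VGA", "Video")]
    if not skip:
        return targets
    return [path for path in targets if not os.path.exists(path)]


with tempfile.TemporaryDirectory() as tmp:
    # Pretend the VGA recording was downloaded on an earlier run.
    open(os.path.join(tmp, "Week 1-VGA.mp4"), "w").close()
    print([os.path.basename(p) for p in files_to_download(tmp, "Week 1")])
    # ['Week 1-Video.mp4']
```

A check based only on file existence cannot tell a finished file from a truncated one, so a download that was interrupted after the output file was created may need to be deleted by hand before re-running.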

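Todo (big promises)

  • @ZJC-GH added a batch download feature
    • If you need it, download it from the release of this repository
    • It has now been merged into the dev branch
  • Plan to use argparse to flesh out the command-line arguments and improve the interaction (done in 2.2.0)
  • (A very big promise) Put together a simple GUI once the arguments are done

Acknowledgements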

yanhekt_downloader's People

Contributors

gddg08, ydx-2147483647, zjc-gh


yanhekt_downloader's Issues

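An invalid url seems to cause an error

An error was raised when the vga video did not exist; is the url invalid?

Example that triggers the error (the YanHeKT website has no vga video for this lesson, yet the program still seems to read a link):
python main.py 44666 -L 4 --skip

Solution

Edit yanhekt.py and replace the existing download code in the class to handle invalid URLs and missing paths (it seems a single .py file cannot be attached, so the whole file is pasted here):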

r'''
Project      :
FilePath     : \OPENSOURCE\yanhekt.py
Descripttion :
Author       : GDDG08
Date         : 2022-11-08 02:07:44
LastEditors  : GDDG08
LastEditTime : 2024-04-03 16:03:17
'''
import os
import requests

from m3u8dl import M3u8Download

headers = {
    'Origin': 'https://www.yanhekt.cn',
    "xdomain-client": "web_user",
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.26'
}


class YanHeKT():
    def __init__(self, _courseID, _all=False, _list=None, _range=None, _dual=True, _vga=False, _video=False, _skip=False, _dir='./',  _max_workers=32) -> None:
        self.courseID = _courseID
        self.lessonList = None
        self.courseInfo = None

        self.updateArgs(_all, _list, _range, _dual, _vga, _video, _skip, _dir, _max_workers)

    def updateArgs(self, _all=False, _list=None, _range=None, _dual=True, _vga=False, _video=False, _skip=False, _dir='./',  _max_workers=32):
        self.all = _all
        self.list = _list
        self.range = _range
        self.dual = _dual
        self.vga = _vga
        self.video = _video
        self.skip = _skip
        self.dir = _dir
        self.max_workers = _max_workers

    def getCourseInfo(self):
        print("----Getting Course Information----")

        rqt_course = requests.get(f'https://cbiz.yanhekt.cn/v1/course?id={self.courseID}&with_professor_badges=true', headers=headers)
        courseInfo = rqt_course.json()['data']

        print(courseInfo['name_zh'])

        print("-----Getting Lesson List-----")
        rqt_list = requests.get(f'https://cbiz.yanhekt.cn/v2/course/session/list?course_id={self.courseID}', headers=headers)
        lessonList = rqt_list.json()['data']

        for i, lesson in enumerate(lessonList):
            print(f"[{i}] ", lesson['title'])

        self.courseInfo = courseInfo
        self.lessonList = lessonList
        return courseInfo, lessonList

    def download(self, callback=None, callback_prog=None):
        if not self.lessonList or not self.courseInfo:
            self.getCourseInfo()

        print("------Start Downloading------")

        selectList = []
        if self.all:
            selectList = list(range(len(self.lessonList)))
        elif self.list:
            selectList = self.list
        elif self.range:
            selectList += list(range(self.range[0][0], self.range[0][1]))
        else:
            print("[Error] No lesson selected in args.")
            print("Please use -A/--all, -L/--list, or -R/--range to select lessons.")
            print("Example:")
            print(f"\tpython main.py {self.courseID} --all")
            print(f"\tpython main.py {self.courseID} --list 0 2 4")
            print(f"\tpython main.py {self.courseID} --range 3 5")

            if callback:
                callback(False)
            return

        courseFullName = '-'.join([str(self.courseID), self.courseInfo['name_zh'], self.courseInfo['professors'][0]['name']])
        dirName = os.path.join(self.dir, courseFullName)

        if not os.path.exists(dirName):
            os.makedirs(dirName)

        for i in selectList:
            video = self.lessonList[i]
            fileName = video['title'].replace("/", "-")  # avoid path errors caused by "/" in the file name

            print(f"Downloading {fileName} --->")

            videos = video['videos'][0]
            # Download the projector/screen recording (VGA)
            if self.vga or self.dual:
                vga_url = videos.get('vga')
                if vga_url:
                    vga_url = self.validate_url(vga_url)
                    vga_path = f"{dirName}/{fileName}-VGA.mp4"
                    self.download_video(vga_url, vga_path, "VGA", callback_prog)

            # Download the classroom video
            if self.video or self.dual:
                video_url = videos.get('main')
                if video_url:
                    video_url = self.validate_url(video_url)
                    video_path = f"{dirName}/{fileName}-Video.mp4"
                    self.download_video(video_url, video_path, "Video", callback_prog)

        if callback:
            callback(True)

    def validate_url(self, url):
        if not url.startswith(('http://', 'https://')):
            return f'https://{url.lstrip("/")}'  # make sure the URL starts with a proper scheme
        return url

    def download_video(self, url, path, video_type, callback_prog):
        """Helper that downloads a single video and reports failures."""
        if self.skip and os.path.exists(path):
            print(f"{video_type} seems already done. Skipping...")
            return
        print(f"{video_type} -->")
        try:
            M3u8Download(url, os.path.dirname(path), os.path.basename(path), max_workers=self.max_workers,
                         callback_progress=callback_prog)
        except Exception as e:
            print(f"Failed to download {video_type} video. Error: {e}")

        return


if __name__ == '__main__':
    # main()
    # yanhekt = YanHeKT(12345, _all=True, _dir='./')
    # yanhekt.download(callback_prog=progressPrint)
    pass

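Summary of code changes

URL validation: added a validate_url method to ensure the URL starts with "http://" or "https://". It also handles protocol-relative URLs that start with "//" by stripping the leading slashes and prepending "https://".

Path check: download_video now checks whether the target file already exists and, when skip is enabled, skips it.

Error handling: the M3u8Download call in download_video is wrapped in a try-except block that catches any exception raised during the download and prints an error message specific to the video type (VGA or Video).

Modularity: moving the download logic into download_video makes the code more modular and easier to maintain and debug.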
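The URL handling described above can be tried in isolation. The sketch below is a standalone version of the same idea as validate_url in the posted code; the name `normalize_url` and the example.com addresses are placeholders for illustration, not real YanHeKT endpoints.

```python
def normalize_url(url: str) -> str:
    """Return url with an explicit scheme, defaulting to https.

    Standalone illustration of the idea behind validate_url above.
    """
    if url.startswith(("http://", "https://")):
        return url
    # "//host/path" (protocol-relative) and "host/path" both become "https://host/path".
    return "https://" + url.lstrip("/")


print(normalize_url("//example.com/a/index.m3u8"))        # https://example.com/a/index.m3u8
print(normalize_url("https://example.com/a/index.m3u8"))  # https://example.com/a/index.m3u8
```

This only repairs the scheme; it does not check that the resource exists. An empty value still has to be filtered out beforehand, as the `if vga_url:` guard in the posted code does, and a link that points to a missing video is caught by the try-except around the download.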

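Which version of Python should I use?

I get syntax errors or dependency-module errors with both python3 and python2. Which version of Python should I use?
Of course, it may have nothing to do with the Python version, and I may simply have done something wrong.

Cannot run: KeyError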

C:\Users\vioch\Desktop\PuBeta-1.2-with.FFMPEG\main.py:1: SyntaxWarning: invalid escape sequence '\F'
  '''
C:\Users\vioch\Desktop\PuBeta-1.2-with.FFMPEG\m3u8dl.py:1: SyntaxWarning: invalid escape sequence '\O'
  '''
Traceback (most recent call last):
  File "C:\Users\vioch\Desktop\PuBeta-1.2-with.FFMPEG\main.py", line 11, in <module>
    import m3u8dl
  File "C:\Users\vioch\Desktop\PuBeta-1.2-with.FFMPEG\m3u8dl.py", line 18, in <module>
    import js2py
  File "D:\Program Files\Python\Python312\Lib\site-packages\js2py\__init__.py", line 72, in <module>
    from .base import PyJsException
  File "D:\Program Files\Python\Python312\Lib\site-packages\js2py\base.py", line 2965, in <module>
    @Js
     ^^
  File "D:\Program Files\Python\Python312\Lib\site-packages\js2py\base.py", line 165, in Js
    return PyJsFunction(val, FunctionPrototype)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Program Files\Python\Python312\Lib\site-packages\js2py\base.py", line 1377, in __init__
    cand = fix_js_args(func)
           ^^^^^^^^^^^^^^^^^
  File "D:\Program Files\Python\Python312\Lib\site-packages\js2py\utils\injector.py", line 27, in fix_js_args
    code = append_arguments(six.get_function_code(func), ('this', 'arguments'))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Program Files\Python\Python312\Lib\site-packages\js2py\utils\injector.py", line 121, in append_arguments
    arg = name_translations[inst.arg]
          ~~~~~~~~~~~~~~~~~^^^^^^^^^^
KeyError: 3
