gaowanliang / onedrivesharelinkpusharia2

Extract download URLs from OneDrive or SharePoint share links and push them to aria2, even on systems without a GUI.

License: Apache License 2.0

Python 100.00%
onedrive download aria2

onedrivesharelinkpusharia2's Introduction

OneDriveShareLinkPushAria2

Extract download URLs from OneDrive or SharePoint share links and push them to aria2, even on systems without a GUI (such as Linux).

Dependencies

requests==2.25.1

pyppeteer==0.2.5

Features

This program currently supports the following download methods:

  • Share links on xxx-my.sharepoint.com
    • Downloading multiple files from share links without a password
    • Downloading multiple files from share links with a password
    • Downloading files in nested folders
    • Downloading any files of your choice
    • Traversing and downloading share links that contain many files (more than 30)
  • Share links on xxx.sharepoint.com
  • Share links on xxx-my.sharepoint.cn (supported in theory)

Note: aria2 itself does not support downloads that require an HTTP POST request, and OneDrive's packaged folder download uses HTTP POST, so this program does not support packaged folder downloads.

Output file list

Run this command to write the file list to list.txt:

python main.py > list.txt

The output may be garbled in PowerShell. To fix this, run the following command first:

[System.Console]::OutputEncoding = [System.Text.Encoding]::UTF8
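
If changing the console encoding is not an option, another approach is to write the list to a file with an explicit UTF-8 encoding instead of relying on shell redirection. The following is only a sketch: save_file_list is a hypothetical helper that is not part of main.py, and it assumes the file names are already available as a list of strings.

```python
def save_file_list(names, path="list.txt"):
    """Write one file name per line, always encoded as UTF-8.

    Hypothetical helper (not part of main.py). Writing the file directly
    avoids depending on the console's output encoding.
    """
    # newline="\n" keeps line endings identical on every platform
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for name in names:
            handle.write(name + "\n")
```

Non-ASCII file names are then stored correctly regardless of the terminal in use.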

Without password for shared links

Take this download link as an example:

https://gitaccuacnz2-my.sharepoint.com/:f:/g/personal/mail_finderacg_com/EheQwACFhe9JuGUn4hlg9esBsKyk5jp9-Iz69kqzLLF5Xw?e=FG7SHh

In this case, use the script for links without a password, main.py. Open the file and you will see some global variables, including downloadNum, which selects the files to download:

To download the second file, set downloadNum="2".

To download the second and third files, set downloadNum="2-3".

To download the second, third, fourth, and seventh files, set downloadNum="2-4,7".

And so on.
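
The selection syntax can be illustrated with a small parser. This is a minimal sketch of how a string such as "2-4,7" expands into file numbers. parse_download_num is a hypothetical name, and main.py may implement the parsing differently (for example, it may give some values a special meaning that this sketch does not handle).

```python
from typing import List


def parse_download_num(spec: str) -> List[int]:
    """Expand a selection such as "2-4,7" into [2, 3, 4, 7].

    Hypothetical helper for illustration; not taken from main.py.
    """
    selected: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            selected.extend(range(start, end + 1))
        else:
            selected.append(int(part))
    # Remove duplicates while keeping first-seen order
    return list(dict.fromkeys(selected))
```

For example, parse_download_num("2-4,7") yields the numbers 2, 3, 4, and 7.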

After making the changes, make sure the target aria2 instance is running, then execute python3 main.py.
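
For orientation, pushing a URL to aria2 is done through its JSON-RPC interface with the aria2.addUri method. The sketch below only builds the request body and does not send it. build_add_uri_request is a hypothetical helper, and whether main.py passes the same options (such as out or header) is an assumption.

```python
import json


def build_add_uri_request(url, secret=None, out=None, headers=None, request_id="1"):
    """Build the JSON-RPC body for aria2.addUri.

    Hypothetical helper: the option choices are assumptions, not a copy
    of what main.py sends.
    """
    params = []
    if secret:
        # aria2 expects the RPC secret as the first parameter, prefixed with "token:"
        params.append(f"token:{secret}")
    params.append([url])  # list of URIs that point to the same resource
    options = {}
    if out:
        options["out"] = out  # file name to save as
    if headers:
        options["header"] = list(headers)  # extra HTTP headers, e.g. a Cookie header
    if options:
        params.append(options)
    return {"jsonrpc": "2.0", "id": request_id, "method": "aria2.addUri", "params": params}


if __name__ == "__main__":
    body = build_add_uri_request("https://example.com/file.bin", secret="my-secret", out="file.bin")
    print(json.dumps(body, indent=2))
```

The resulting body would be sent with an HTTP POST to the aria2 RPC endpoint, which is http://localhost:6800/jsonrpc when aria2 runs locally with its default RPC port.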

With password for shared links

Take this download link as an example:

https://jia666-my.sharepoint.com/:f:/g/personal/1025_xkx_me/EsqNMFlDoyZKt-RGcsI1F2EB6AiQMBIpQM4Ka247KkyOQw?e=oC1y7r

In this case, use the script for links with a password, havepassword.py. Open the file and you will see some global variables (those already described above are not repeated):

  • OneDriveSharePwd: Password for the OneDrive link

Usage is similar to the above.

Note

Before using the program, clone the whole project with git clone https://github.com/gaowanliang/OneDriveShareLinkPushAria2.git. havepassword.py depends on main.py. If you want to use the version that requires a password, you also need to run pip install pyppeteer.

The basic functions of this program are complete. From now on it will not be maintained unless it becomes unusable. If you run into a problem, please include a download link when opening an issue. Bug reports that do not provide a download link will not be addressed.

onedrivesharelinkpusharia2's People

Contributors

gaowanliang, mengzonefire, mikubill, yanshibin, yinaoxiong

onedrivesharelinkpusharia2's Issues

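Problems with original addresses, directory structure, etc.

Downloading a link without a password
1. The original address is not recognized
Take the link given in the help file: https://gitaccuacnz2-my.sharepoint.com/:f:/g/personal/mail_finderacg_com/EheQwACFhe9JuGUn4hlg9esBsKyk5jp9-Iz69kqzLLF5Xw?e=FG7SHh
After it is converted to the original address shown below, running the program reports "这个文件夹没有文件" ("This folder has no files"):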
https://gitaccuacnz2-my.sharepoint.com/personal/mail_finderacg_com/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Fmail%5Ffinderacg%5Fcom%2FDocuments%2F%E6%89%91%E6%A3%B1%E8%9B%BE%E5%AD%90&originalPath=aHR0cHM6Ly9naXRhY2N1YWNuejItbXkuc2hhcmVwb2ludC5jb20vOmY6L2cvcGVyc29uYWwvbWFpbF9maW5kZXJhY2dfY29tL0VoZVF3QUNGaGU5SnVHVW40aGxnOWVzQnNLeWs1anA5LUl6NjlrcXpMTEY1WHc%5FcnRpbWU9MndseTVpSl8yVWc

Links with a password also have problems, as in #9
https://yantaimedshow-my.sharepoint.com/:f:/g/personal/lidongsheng2007_yantaimedshow_onmicrosoft_com/Eg3L-Vk3_E9EpFgBP0NBsVwB85nS-alkb0v4Ju5EinJ5ww?e=HKKLlL
The password is @teamfreeshare
Output:

[root@onstance-c Script]# python3 /home/Script/OneDriveShareLinkPushAria2/havepasword.py
Starting headless browser to simulate password entry
Traceback (most recent call last):
File "/home/Script/OneDriveShareLinkPushAria2/havepassword.py", line 76, in
havePwdGetFiles(OneDriveShareURL, OneDriveSharePwd)
File "/home/Script/OneDriveShareLinkPushAria2/havepassword.py", line 55, in havPwdGetFiles
asyncio.get_event_loop().run_until_complete(main(iurl, password))
File "/usr/lib64/python3.6/asyncio/base_events.py", line 484, in run_until_compete
return future.result()
File "/home/Script/OneDriveShareLinkPushAria2/havepassword.py", line 28, in mai
browser = await launch()
File "/usr/local/lib/python3.6/site-packages/pyppeteer/launcher.py", line 307, n launch
return await Launcher(options, **kwargs).launch()
File "/usr/local/lib/python3.6/site-packages/pyppeteer/launcher.py", line 168, n launch
self.browserWSEndpoint = get_ws_endpoint(self.url)
File "/usr/local/lib/python3.6/site-packages/pyppeteer/launcher.py", line 227, n get_ws_endpoint
raise BrowserError('Browser closed unexpectedly:\n')
pyppeteer.errors.BrowserError: Browser closed unexpectedly:

Running in Docker with error log: "这个文件夹没有文件" ("This folder has no files")

First, I built a Docker image with the following Dockerfile:
[screenshot: Dockerfile]
Next, I deployed it with Docker Compose:
[screenshot: Docker Compose file]
Then I changed the download link and the aria2 server address to match the Docker Compose file:
[screenshot: edited settings]
Finally, I deployed the stack, and the log in Portainer showed this error (maybe from a function?):
[screenshot: error log]

Each time the script runs, a new error log is generated.
I also found two places with the same text in main.py, but as an operator I do not know any Python programming.

Could you tell me something about this? Thank you very much.
I can understand both English and Simplified Chinese, so feel free to reply in either language.
Have a good day!

Error with SharePoint links

Traceback (most recent call last):
File "main.py", line 388, in
getFiles(OneDriveShareURL, None, 0)
File "main.py", line 92, in getFiles
if "NextHref" in graphqlReq["data"]["legacy"]["renderListDataAsStream"]["ListData"]:
TypeError: 'NoneType' object is not subscriptable

The test OneDrive link lists files normally, but both the SharePoint link from issue 1 and my own SharePoint link produce the error above:
https://fulanlan.sharepoint.com/:f:/s/fulan/EsJeZSf6QpBIritMqRLVR8YBomuxEiVoRsnRMOUIjGjFBQ?e=SwoIxT
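
The TypeError above arises because one level of the nested response is None, so indexing into it fails. A defensive lookup can turn the crash into a clear "no data" result. This is a sketch only: extract_list_data is a hypothetical helper, the key names are taken from the traceback, and it does not address why SharePoint returns an empty response for these links.

```python
def extract_list_data(graphql_response):
    """Return the nested "ListData" dict, or None if any level is missing.

    Hypothetical helper; key names are copied from the traceback above.
    """
    node = graphql_response
    for key in ("data", "legacy", "renderListDataAsStream", "ListData"):
        if not isinstance(node, dict):
            return None  # a parent level was None or not a mapping
        node = node.get(key)
    return node if isinstance(node, dict) else None
```

A caller can then check the result for None and report an unsupported or empty response instead of raising.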

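Something went wrong

It used to work fine, but this problem suddenly appeared and I do not know the cause.

Starting headless browser to simulate password entry
Password entered, redirecting
Fetching cookies
Headless browser closed, fetching file list
This folder has no files

In fact, the folder does contain files.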

xxx.sharepoint.com is no longer supported

SharePoint's default share links now use xxx.sharepoint.com.
Downloading from such a link reports this error:
Traceback (most recent call last):
File "/storage/emulated/0/Download/OneDriveShareLinkPushAria2/main.py", line 415, in <module>
getFiles(OneDriveShareURL, None, 0)
File "/storage/emulated/0/Download/OneDriveShareLinkPushAria2/main.py", line 107, in getFiles
if "NextHref" in graphqlReq["data"]["legacy"]["renderListDataAsStream"]["ListData"]:
TypeError: 'NoneType' object is not subscriptable
I tested the bundled demo link and it works without errors. Only this kind of link fails.

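Downloads fail with the default aria2 configuration

After downloading many files, I found that many of them are only 540 B.
Running cat on such a file shows: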

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=10"/>
<script src="throttleerror.js" type="text/javascript" ></script>
<title id="errorpagetitle"></title>
</head>
<body class="ms-core-needIEFilter" id="ms-error-body">
<div class="ms-pr" id="ms-error-header">
<h1 class="ms-core-pageTitle" id="errortitle">
</h1>
<p id="errortext"></p>
</div>
</body>
</html>

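I set the connections per server to 32, the connections per task to 64, and the maximum number of concurrent downloads to 5.

I also tested downloading one file at a time, and this file still appears. A check needs to be added so that such a file is downloaded again.
Is there any other good solution?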
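
One way to implement the check requested above is to inspect each finished download and flag files that are actually the small throttle error page. This is a sketch under stated assumptions: looks_like_throttle_page is a hypothetical helper, the marker string comes from the HTML shown above, and the 4096-byte size limit is an arbitrary choice.

```python
import os

# Marker taken from the <script src="throttleerror.js"> tag in the error page above
THROTTLE_MARKER = b"throttleerror.js"
# Assumed upper bound for the error page size; the reported file was 540 B
MAX_ERROR_PAGE_BYTES = 4096


def looks_like_throttle_page(path):
    """Return True if the file at `path` appears to be the throttle error page.

    Hypothetical helper: real downloads larger than the size limit are
    never read, so the check stays cheap.
    """
    if os.path.getsize(path) > MAX_ERROR_PAGE_BYTES:
        return False
    with open(path, "rb") as handle:
        return THROTTLE_MARKER in handle.read()
```

Files flagged this way could be deleted and pushed to aria2 again, ideally after a delay and with fewer connections, since the page suggests the server is throttling requests.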

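Will this still be updated?

No matter which folder I push, the download says the folder has no files...

This folder has no files

Starting headless browser to simulate password entry
Password entered, redirecting
Fetching cookies
Headless browser closed, fetching file list
This folder has no files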

Process finished with exit code 0

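I am sure that I entered a share link, but it cannot be downloaded.

Links with a password seem to have some problems?

I got the following error when running it: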

Traceback (most recent call last):
File "/root/od/havepassword.py", line 73, in
havePwdDownloadFiles(OneDriveShareURL, OneDriveSharePwd, aria2Link,
File "/root/od/havepassword.py", line 65, in havePwdDownloadFiles
asyncio.get_event_loop().run_until_complete(main(iurl, password))
File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
return future.result()
File "/root/od/havepassword.py", line 28, in main
browser = await launch()
File "/usr/local/lib/python3.9/dist-packages/pyppeteer/launcher.py", line 307, in launch
return await Launcher(options, **kwargs).launch()
File "/usr/local/lib/python3.9/dist-packages/pyppeteer/launcher.py", line 148, in launch
self.proc = subprocess.Popen( # type: ignore
File "/usr/lib/python3.9/subprocess.py", line 951, in init
self._execute_child(args, executable, preexec_fn, close_fds,
File "/usr/lib/python3.9/subprocess.py", line 1823, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
OSError: [Errno 8] Exec format error: '/root/od/local-chromium/588429/chrome-linux/chrome'
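
An "Exec format error" usually means that the Chromium binary downloaded by pyppeteer was built for a different CPU architecture than the machine it runs on. One possible workaround is to point pyppeteer at a browser installed by the system. The sketch below only builds the options for pyppeteer's launch(). build_launch_options is a hypothetical helper, the browser path in the usage note is an assumption, and adding --no-sandbox when running as root is a common convention rather than something confirmed for this project.

```python
import os


def build_launch_options(executable_path=None, no_sandbox=None):
    """Build keyword options for pyppeteer.launch().

    Hypothetical helper. `executable_path` should point to a Chromium or
    Chrome binary that matches the machine's architecture.
    """
    if no_sandbox is None:
        # Chromium normally refuses to start as root unless the sandbox is disabled
        no_sandbox = hasattr(os, "geteuid") and os.geteuid() == 0
    options = {"headless": True, "args": []}
    if no_sandbox:
        options["args"].append("--no-sandbox")
    if executable_path:
        options["executablePath"] = executable_path
    return options
```

In havepassword.py the call would then look like browser = await launch(**build_launch_options("/usr/bin/chromium")), where /usr/bin/chromium stands in for wherever the system browser is installed.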

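After the update, share links containing no more than 30 items fail to download

An error is reported (screenshot not preserved).

It works again after adding a check that falls back to the old version of the code:

if filesData is None:
    pat = re.search(
        'g_listData = {"wpq":"","Templates":{},"ListData":{ "Row" : ([\s\S]*?),"FirstRow"', reqf.text)
    filesData = json.loads(pat.group(1))

My guess is that the new detection rule does not apply to share links containing no more than 30 items.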

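Problem downloading a subdirectory of a share link

Suppose the share link points to a folder: share/

It contains several subfolders, for example share/music/, share/video/, and share/photo/

If share/music/ is entered in the script, the result is "这个文件夹没有文件" ("This folder has no files").

The same thing happens in a browser: opening share/ first and then share/music/ works, but opening share/music/ directly prompts for a login. That seems to be the cause.

Could the script do what the browser does, opening share/ first and then share/music/? The file list could then be fetched normally.

Could a login feature be added?

Some share links can only be viewed by accounts in the same domain. I have an account, but it has no OneDrive, so I can only view the files and cannot save them to my own drive.
Could a login feature be added so that the account credentials are used for downloading?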
