p-ranav / saveddit

Bulk Downloader for Reddit

License: MIT License

Python 100.00%

reddit reddit-api scraper bulk-downloader bulk-download imgur gfycat redgifs youtube youtube-dl

saveddit's Issues

Can't find comment karma

Traceback (most recent call last):
  File "/home/wallmenis/.local/bin/saveddit", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/wallmenis/.local/lib/python3.11/site-packages/saveddit/saveddit.py", line 360, in main
    downloader.download_user_meta(args)
  File "/home/wallmenis/.local/lib/python3.11/site-packages/saveddit/user_downloader.py", line 93, in download_user_meta
    user_dict["comment_karma"] = user.comment_karma
                                 ^^^^^^^^^^^^^^^^^^
  File "/home/wallmenis/.local/lib/python3.11/site-packages/praw/models/reddit/base.py", line 35, in __getattr__
    return getattr(self, attribute)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wallmenis/.local/lib/python3.11/site-packages/praw/models/reddit/base.py", line 36, in __getattr__
    raise AttributeError(
AttributeError: 'Redditor' object has no attribute 'comment_karma'

after running with

saveddit user wallmenis saved gilded submitted multireddits upvoted -o .
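
One possible guard is sketched below. It assumes the attribute is simply absent on some `Redditor` objects; the report does not say why. `collect_user_meta` and the field list are hypothetical names for illustration, not saveddit's actual code, and a `SimpleNamespace` stands in for the PRAW object.

```python
from types import SimpleNamespace

# Hypothetical field list; the real user_downloader.py may read other attributes.
USER_FIELDS = ("name", "comment_karma", "link_karma")


def collect_user_meta(user, fields=USER_FIELDS):
    """Copy each field from `user`, recording None when the attribute is missing."""
    return {field: getattr(user, field, None) for field in fields}


# Stand-in for a Redditor object that lacks comment_karma.
partial_user = SimpleNamespace(name="example", link_karma=10)
meta = collect_user_meta(partial_user)
print(meta)
```

Recording `None` rather than skipping the key keeps the metadata file's shape stable across users, which may or may not be what the project wants.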

Need error handling or processing of non media posts.

Getting the following error occasionally:

     * This is a redgif link
       - Looking for submission.preview.reddit_video_preview.fallback_url
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/christopher/saveddit/saveddit/saveddit.py", line 65, in <module>
    main(args)
  File "/home/christopher/saveddit/saveddit/saveddit.py", line 31, in main
    downloader.download(args.o,
  File "/home/christopher/saveddit/saveddit/subreddit_downloader.py", line 141, in download
    self.download_gfycat_or_redgif(submission, files_dir)
  File "/home/christopher/saveddit/saveddit/subreddit_downloader.py", line 371, in download_gfycat_or_redgif
    if "reddit_video_preview" in submission.preview:
  File "/home/christopher/.local/lib/python3.8/site-packages/praw/models/reddit/base.py", line 35, in __getattr__
    return getattr(self, attribute)
  File "/home/christopher/.local/lib/python3.8/site-packages/praw/models/reddit/base.py", line 36, in __getattr__
    raise AttributeError(
AttributeError: 'Submission' object has no attribute 'preview'
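
A minimal sketch of the kind of error handling the title asks for, assuming `submission.preview` is dict-like when present (the failing line uses `in` on it). `preview_fallback_url` is a hypothetical helper, and the objects below are stand-ins rather than real PRAW submissions.

```python
from types import SimpleNamespace


def preview_fallback_url(submission):
    """Return preview.reddit_video_preview.fallback_url, or None if any level is missing."""
    preview = getattr(submission, "preview", None) or {}
    video = preview.get("reddit_video_preview") or {}
    return video.get("fallback_url")


# Stand-ins: one submission without a preview, one with a video preview.
no_preview = SimpleNamespace(url="https://example.com/post")
with_preview = SimpleNamespace(
    preview={"reddit_video_preview": {"fallback_url": "https://example.com/v.mp4"}}
)
print(preview_fallback_url(no_preview))
print(preview_fallback_url(with_preview))
```

A caller could treat `None` as "nothing to download here" and move on to the next submission instead of crashing.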

"..." in directory names doesn't work for Windows users.

Line 41 in submission_downloader.py causes issues for Windows users because directories can't have "..." at the end of their names. For Windows users that line should be commented out.

Permission Error

I am using Linux Mint 20.1. The following error occurred.

Traceback (most recent call last):
  File "/home/mobi/.local/bin/saveddit", line 8, in <module>
    sys.exit(main())
  File "/home/mobi/.local/lib/python3.8/site-packages/saveddit/saveddit.py", line 68, in main
    downloader.download(args.o,
  File "/home/mobi/.local/lib/python3.8/site-packages/saveddit/subreddit_downloader.py", line 79, in download
    os.makedirs(category_dir)
  File "/usr/lib/python3.8/os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/usr/lib/python3.8/os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/usr/lib/python3.8/os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  [Previous line repeated 2 more times]
  File "/usr/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/Downloads'
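
The path in the error is `/Downloads` at the filesystem root, which suggests (as an assumption; the command used is not shown) that the output directory was given as `/Downloads` rather than `~/Downloads`. A sketch of resolving the output path up front and failing with a readable message follows; `prepare_output_dir` is a hypothetical helper, not part of saveddit.

```python
import os
import tempfile


def prepare_output_dir(path):
    """Expand `~`, make the path absolute, create it, and report a readable error."""
    resolved = os.path.abspath(os.path.expanduser(path))
    try:
        os.makedirs(resolved, exist_ok=True)
    except PermissionError as exc:
        raise SystemExit(f"Cannot create output directory {resolved!r}: {exc.strerror}")
    return resolved


# Demonstration inside a temporary directory, so nothing is written elsewhere.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = prepare_output_dir(os.path.join(tmp, "Downloads", "saveddit"))
    created = os.path.isdir(out_dir)
```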

[Help] Comments Limits & Possible no-duplicate

Hi!
I just downloaded this and was wondering how I could remove the limit on how many comments it downloads. (Currently it only downloads the top comments.)

Also, how can I prevent it from re-downloading posts I have already downloaded?

Thanks!
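
For the duplicate question, one approach is to record the ID of every submission that has been downloaded and skip IDs already on record. The sketch below assumes a plain text file with one ID per line; the file name and both helper functions are hypothetical, and saveddit itself may handle this differently.

```python
import os
import tempfile


def load_seen_ids(path):
    """Read previously recorded submission IDs, one per line."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def mark_seen(path, submission_id):
    """Append one submission ID to the record file."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(submission_id + "\n")


with tempfile.TemporaryDirectory() as tmp:
    record = os.path.join(tmp, "downloaded_ids.txt")
    downloaded = []
    for submission_id in ["haucpf", "yydfai", "haucpf"]:
        if submission_id in load_seen_ids(record):
            continue  # already downloaded in an earlier pass
        downloaded.append(submission_id)
        mark_seen(record, submission_id)
```

Re-reading the file on every iteration keeps the sketch short; a real run would load the set once and update it in memory.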

After merging audio and video, audio and video stay around

When downloading a post whose audio and video come as two separate files, saveddit merges them into one file, but the individual audio and video files stay around afterwards. Is this intended? If so, could there be an option to keep only the merged file?
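
A sketch of what such an option could do, assuming the merge step leaves three files on disk. `remove_merge_inputs` and the file names are hypothetical; the separate streams are deleted only when the merged file exists and is non-empty, so a failed merge does not lose data.

```python
import os
import tempfile


def remove_merge_inputs(merged_path, input_paths):
    """Delete the separate streams only if the merged file exists and is non-empty."""
    if not (os.path.isfile(merged_path) and os.path.getsize(merged_path) > 0):
        return []
    removed = []
    for path in input_paths:
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed


with tempfile.TemporaryDirectory() as tmp:
    names = ("video.mp4", "audio.mp4", "merged.mp4")
    paths = {name: os.path.join(tmp, name) for name in names}
    for path in paths.values():
        with open(path, "wb") as handle:
            handle.write(b"stub")  # placeholder bytes, not real media
    removed = remove_merge_inputs(
        paths["merged.mp4"], [paths["video.mp4"], paths["audio.mp4"]]
    )
    remaining = sorted(os.listdir(tmp))
```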

Issue with file names

Just a small problem: the " character is illegal in Windows file names, so the script crashes when it encounters one.
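
A minimal sketch of one fix: replace every character that Windows rejects in file names before creating the file. `sanitize_filename` and the `_` replacement are assumptions for illustration, not saveddit's current behavior.

```python
import re

# Characters Windows rejects in file and directory names, plus ASCII control codes.
_WINDOWS_ILLEGAL = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_filename(name, replacement="_"):
    """Replace characters that Windows does not allow in a file name."""
    return _WINDOWS_ILLEGAL.sub(replacement, name)


print(sanitize_filename('He said "hello": part 1/2'))  # He said _hello__ part 1_2
```

Applying the same rule on every platform would keep downloaded archives portable between Linux and Windows.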

HTTP 401 error

Hello, I have the following error:
python -m saveddit.saveddit -r "Ebony" -f "new" -l 2000 -o "E:\E\D\saveddit\test"

[saveddit ASCII-art banner]

Downloader for Reddit
version : v1.0.0
URL : https://github.com/p-ranav/saveddit

E:\E\D\saveddit\test
Downloading from /r/Ebony/new/
Traceback (most recent call last):
  File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\theo\Downloads\saveddit-master\saveddit-master\saveddit\saveddit.py", line 73, in <module>
    main(args)
  File "C:\Users\theo\Downloads\saveddit-master\saveddit-master\saveddit\saveddit.py", line 32, in main
    categories=args.f, post_limit=args.l, skip_videos=args.skip_videos, skip_meta=args.skip_meta, skip_comments=args.skip_comments)
  File "C:\Users\theo\Downloads\saveddit-master\saveddit-master\saveddit\subreddit_downloader.py", line 74, in download
    for i, submission in enumerate(category_function(limit=post_limit)):
  File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\praw\models\listing\generator.py", line 63, in __next__
    self._next_batch()
  File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\praw\models\listing\generator.py", line 73, in _next_batch
    self._listing = self._reddit.get(self.url, params=self.params)
  File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\praw\reddit.py", line 566, in get
    return self._objectify_request(method="GET", params=params, path=path)
  File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\praw\reddit.py", line 672, in _objectify_request
    path=path,
  File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\praw\reddit.py", line 855, in request
    json=json,
  File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\prawcore\sessions.py", line 331, in request
    url=url,
  File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\prawcore\sessions.py", line 257, in _request_with_retries
    url,
  File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\prawcore\sessions.py", line 164, in _do_retry
    retry_strategy_state=retry_strategy_state.consume_available_retry(),  # noqa: E501
  File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\prawcore\sessions.py", line 257, in _request_with_retries
    url,
  File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\prawcore\sessions.py", line 164, in _do_retry
    retry_strategy_state=retry_strategy_state.consume_available_retry(),  # noqa: E501
  File "C:\Users\theo\AppData\Local\Programs\Python\Python37\lib\site-packages\prawcore\sessions.py", line 260, in _request_with_retries
    raise self.STATUS_EXCEPTIONS[response.status_code](response)
prawcore.exceptions.InvalidToken: received 401 HTTP response

This first happened at the 514th file and happened again when I retried.

Add support for the XDG Base Directory Specification

This is a feature request for supporting the XDG Base Directory Specification.

The specification works around a bug from the early UNIX v2 rewrite that caused files whose names begin with a '.' to be omitted from the output of ls.
While this "bug" has become a feature for some, it has also become a headache for users when developers continue to assume HOME is a great place to dump configuration files and local caches.

To address this, the XDG Base Directory Specification was created to give developers a standard location for these files and to give users control over where they are placed within their HOME.

If you were to support the XDG specification the following locations would change:

Change ~/.saveddit/ to $XDG_CONFIG_HOME/saveddit and fall back to $HOME/.config/saveddit if XDG_CONFIG_HOME is not defined.
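
The lookup described above can be sketched as follows. `saveddit_config_dir` is a hypothetical helper name; the rule that empty or relative values are ignored comes from the XDG specification itself rather than from this request.

```python
import os


def saveddit_config_dir(environ=None):
    """Return $XDG_CONFIG_HOME/saveddit, falling back to $HOME/.config/saveddit."""
    environ = os.environ if environ is None else environ
    base = environ.get("XDG_CONFIG_HOME", "")
    # The spec requires absolute paths; empty or relative values are ignored.
    if not base or not os.path.isabs(base):
        home = environ.get("HOME") or os.path.expanduser("~")
        base = os.path.join(home, ".config")
    return os.path.join(base, "saveddit")


print(saveddit_config_dir())
```

Passing the environment as a parameter keeps the function easy to test; a migration step that moves an existing `~/.saveddit/` directory would be a separate concern.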

FileNotFoundError when downloading a post with a title that is truncated on Windows

System:
Windows 10 64 bit.
Python 3.9.5

Steps to reproduce:
Run this on windows: saveddit subreddit pics -f top -l 5 -o .

As of today, the top post is this: https://old.reddit.com/r/pics/comments/haucpf/ive_found_a_few_funny_memories_during_lockdown/
Trying to download it gives this output:

#000 "I’ve found a few funny memories during lockdown. This is from my 1st tour in 89, backstage in Vegas."
     * Processing `https://i.redd.it/f58v4g8mwh551.jpg`
Traceback (most recent call last):
  File "c:\users\bad_g\appdata\local\programs\python\python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\bad_g\appdata\local\programs\python\python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\bad_g\AppData\Local\Programs\Python\Python39\Scripts\saveddit.exe\__main__.py", line 7, in <module>
  File "c:\users\bad_g\appdata\local\programs\python\python39\lib\site-packages\saveddit\saveddit.py", line 346, in main
    downloader.download(args.o,
  File "c:\users\bad_g\appdata\local\programs\python\python39\lib\site-packages\saveddit\subreddit_downloader.py", line 67, in download
    SubmissionDownloader(submission, i, self.logger, category_dir,
  File "c:\users\bad_g\appdata\local\programs\python\python39\lib\site-packages\saveddit\submission_downloader.py", line 68, in __init__
    files_dir = create_files_dir(submission_dir)
  File "c:\users\bad_g\appdata\local\programs\python\python39\lib\site-packages\saveddit\submission_downloader.py", line 62, in create_files_dir
    os.makedirs(files_dir)
  File "c:\users\bad_g\appdata\local\programs\python\python39\lib\os.py", line 225, in makedirs
    mkdir(name, mode)
FileNotFoundError: [WinError 3] The system cannot find the path specified: '.\\www.reddit.com\\r\\pics\\top\\000_I_ve_found_a_few_funny_memories_...\\files'
PS C:\Users\bad_g\Downloads\Saveddit>

This is probably because Windows automatically removes the ellipsis at the end of the directory name. Maybe add an option to disable the truncation, and/or simply drop the "..." appended to the directory name on Windows.
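
A sketch of the second suggestion, assuming the directory name is built by truncating the title and appending a suffix. `truncate_dir_name`, its default length, and the `untitled` fallback are hypothetical; the point is only that trailing dots and spaces are stripped so the name the program remembers matches the name Windows actually creates.

```python
def truncate_dir_name(title, max_length=32, suffix="..."):
    """Truncate `title`, then strip trailing dots and spaces from the result."""
    if len(title) <= max_length:
        shortened = title
    else:
        shortened = title[:max_length] + suffix
    # Windows silently drops trailing dots and spaces from directory names,
    # so remove them here to keep the stored path and the real path identical.
    return shortened.rstrip(". ") or "untitled"


print(truncate_dir_name("abcdefghij", max_length=5))  # abcde
```

With this rule the "..." suffix never survives at the end of a name, so it could equally be removed from the truncation step altogether.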

filenames for large multireddits

Hi, I've just encountered a problem. When I try to download an anonymous multireddit with about 90 subreddits in it, the generated folder name throws this error:
[Errno 36] File name too long

Is there a way to bypass this?
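
One possible workaround, sketched under the assumption that the folder name is the joined list of subreddit names: keep a readable prefix and append a short hash so that different long names stay distinct. `shorten_component` is a hypothetical helper, and the 255-byte limit is the common per-component limit on Linux filesystems such as ext4, not something stated in the report.

```python
import hashlib

MAX_NAME_BYTES = 255  # common per-component limit on Linux filesystems such as ext4


def shorten_component(name, limit=MAX_NAME_BYTES):
    """Return `name` unchanged if it fits, else a prefix plus a 12-character hash."""
    encoded = name.encode("utf-8")
    if len(encoded) <= limit:
        return name
    digest = hashlib.sha1(encoded).hexdigest()[:12]
    keep = limit - len(digest) - 1  # leave room for "_" and the digest
    prefix = encoded[:keep].decode("utf-8", errors="ignore")
    return f"{prefix}_{digest}"


long_name = "+".join(f"subreddit{i}" for i in range(90))
short_name = shorten_component(long_name)
print(len(long_name), len(short_name.encode("utf-8")))
```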

Scraping comments in order

Does this library scrape comments of a given post in the order of their occurrence without messing with the hierarchy? The praw library helps in scraping all the comments but they are not in order. Please let me know if this library can do that and the command I should use.

I used the command below and got an error:

python3 -m bdfr download ./path/to/output --all-comments -l "https://www.reddit.com/r/germany/comments/yydfai/what_is_your_opinion_of_graffiti_all_over_walls/"

Error: No such option: --all-comments

Thank you
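
Note that the command in this report runs the `bdfr` module rather than saveddit. On the ordering question itself, a depth-first walk keeps every reply directly under its parent. The sketch below uses stand-in objects with `body` and `replies` attributes, which is how PRAW exposes comments; `walk_comments` is a hypothetical helper, and whether saveddit offers an equivalent command is not stated here.

```python
from types import SimpleNamespace


def walk_comments(comments, depth=0):
    """Yield (depth, body) pairs depth-first, so each reply follows its parent."""
    for comment in comments:
        yield depth, comment.body
        yield from walk_comments(comment.replies, depth + 1)


# Stand-in tree; with PRAW the top level would be `submission.comments`
# after calling `submission.comments.replace_more(limit=None)`.
thread = [
    SimpleNamespace(body="first", replies=[
        SimpleNamespace(body="reply to first", replies=[]),
    ]),
    SimpleNamespace(body="second", replies=[]),
]
ordered = list(walk_comments(thread))
for depth, body in ordered:
    print("  " * depth + body)
```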

Make saveddit a CL-callable module

Main issue:
Running saveddit with python3 -m saveddit.saveddit [args] is inconvenient for several reasons.

Reasons:

1. You need to be in the same directory as saveddit, which is an unnecessary step.
2. You need to call the module directly, which can be confusing and is avoidable.

Solution:
Make the module callable from anywhere by adding a setup.py script and assembling it into a Python package. That way saveddit can be made available for download from PyPI, the largest Python package repository, with a simple pip install saveddit. Users would then just run saveddit [args] without changing their working directory. Users could also update the package easily, and you could modify its contents with ease.

Move client IDs and secrets in a separate configuration file

Main issue:
The script files store your client IDs and secrets as constants. This can cause a number of problems.

Main problems:

1. Sensitive data exposure. Client IDs and secrets are considered fairly sensitive data. Storing them as constants is strongly discouraged because basically anyone can get hold of them.
2. Difficult configuration. Changing or updating the tokens is complicated for end users because they have to edit the script files themselves, which is discouraging.
3. Code redundancy. The credentials are defined twice, which makes them harder to change (every file has to be updated by hand) and leaves essentially duplicate variables in the code.

Solution:
Move all this data into a separate configuration file (.yaml or .json) and create a function to parse it. All the data then lives in one place and is retrieved with a simple function call, which makes updating it much simpler, makes the code a bit cleaner, and lets end users work with a configuration file instead of the raw codebase.
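
A minimal sketch of the proposed parsing function, using JSON because it needs only the standard library. The key names, the file name, and `load_credentials` are all hypothetical; the values written below are placeholders, not real credentials.

```python
import json
import os
import tempfile

REQUIRED_KEYS = ("reddit_client_id", "reddit_client_secret")  # hypothetical key names


def load_credentials(path):
    """Parse the JSON config at `path` and return only the required keys."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise KeyError(f"Missing keys in {path}: {', '.join(missing)}")
    return {key: config[key] for key in REQUIRED_KEYS}


with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "config.json")
    with open(config_path, "w", encoding="utf-8") as handle:
        json.dump(
            {"reddit_client_id": "id-placeholder",
             "reddit_client_secret": "secret-placeholder"},
            handle,
        )
    credentials = load_credentials(config_path)
```

Every module that needs the credentials could call the same function, which addresses the redundancy point above.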

If you do not mind, please assign this issue to me.

Thanks for this awesome project!
