dplocki / podcast-downloader

A Python script for downloading new MP3 files from given RSS channels

License: GNU General Public License v3.0

Python 99.71% Dockerfile 0.29%

python3 podcast script automation rss rss-feed-bot no-database json-configuration

podcast-downloader's Issues

pip3 install podcast_downloader does not install the latest version

Describe the bug
Wrong version of podcast_downloader is installed

To Reproduce
Steps to reproduce the behavior:

Run pip3 install podcast_downloader or python3 -m pip install podcast_downloader
Run pip3 show podcast_downloader
Observe version installed is 0.1.1

Expected behavior
The installed version should be 0.2.0 or the latest release

Desktop (please complete the following information):

OS: macOS
Python version: 3.9
Version: 0.1.1
Link to RSS feed: Not applicable

Additional context
Originally thought this was an issue with the app itself, but realized I didn't actually have the latest version of the package

Please, add new variables.

Hi, thanks for the excellent work.

1 - I came to ask if you could add new variables.

For example, for this RSS feed I would like to get the description and the author:
https://www.omnycontent.com/d/playlist/8c0a4104-a688-4e57-91fd-ad7b00d5dddd/a32cf512-c3ce-4057-8ec8-af3400c547e5/ac708daf-04da-4352-ae6d-af3400ca82ad/podcast.rss

2 - The same RSS feed gives this error:

[2023-05-08 16:42:55] Error: The podcast file "https://traffic.omny.fm/d/clips/8c0a4104-a688-4e57-91fd-ad7b00d5dddd/a32cf512-c3ce-4057-8ec8-af3400c547e5/f789c11e-447f-460d-a89c-af390172e0b3/audio.mp3?utm_source=Podcast&in_playlist=ac708daf-04da-4352-ae6d-af3400ca82ad" could not be saved to disk "C:\Users\Filipe Mota/Downloads/Podcast/A caminho do Catar[20221027] Portugueses a viver no Catar "É um país muito rico e compensa vir para cá trabalhar".mp3" due to the following error:
[Errno 22] Invalid argument: 'C:\Users\Filipe Mota/Downloads/Podcast/A caminho do Catar\[20221027] Portugueses a viver no Catar "É um país muito rico e compensa vir para cá trabalhar".mp3'

3 - The RSS feed below gives this error for episode 1 and for the trailer:

[2023-05-08 15:43:26] Error: The podcast file "https://traffic.omny.fm/d/clips/b04d3ae5-22c4-41b6-b20a-aa54000ba759/4093b241-20e0-4025-8a00-afba013b2218/29e80dd4-0527-4d7d-85e9-afc401721117/audio.mp3?utm_source=Podcast&in_playlist=b150e14d-4d2e-4c4e-9cf2-afba013f7a91" could not be saved to disk "C:\Users\Filipe Mota/Downloads/Podcast/O Sargento na Cela 7[20230314] Estreia. "O Sargento na Cela 7". Episódio 1 O Prisioneiro.mp3" due to the following error:
[Errno 22] Invalid argument: 'C:\Users\Filipe Mota/Downloads/Podcast/O Sargento na Cela 7\[20230314] Estreia. "O Sargento na Cela 7". Episódio 1 O Prisioneiro.mp3'

https://www.omnycontent.com/d/playlist/b04d3ae5-22c4-41b6-b20a-aa54000ba759/4093b241-20e0-4025-8a00-afba013b2218/b150e14d-4d2e-4c4e-9cf2-afba013f7a91/podcast.rss

4 - I would also like an alternative date format:

YEARMMDD and YEAR.MM.DD

Dots in the date would make it a lot easier to read.

5 - I have a question:

Is it possible to have more than one podcast in a JSON file? I tried and could not get it to work.

6 - An error caused by accented characters:

https://rss.podplaystudio.com/3240.xml

Thanks and keep up the great work.
Best regards,
BlackSpirits
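
Both failing titles in items 2 and 3 contain double quotes, a character Windows does not allow in file names, so that is a likely (though unconfirmed) cause of the `[Errno 22]` errors. A minimal sketch of a sanitizer that also replaces `"` follows; the function name `to_safe_file_name` and the constant are hypothetical, not part of the project.

```python
import re
import unicodedata

# Characters Windows rejects in file names, plus ASCII control characters.
# The double quote is included here (assumption: it caused the Errno 22 above).
FORBIDDEN_CHARACTERS = re.compile(r'[\u0000-\u001F\u007F"*/:<>?\\|]')


def to_safe_file_name(value: str) -> str:
    """Replace characters that are invalid in Windows file names with spaces."""
    value = unicodedata.normalize("NFKC", value)
    value = FORBIDDEN_CHARACTERS.sub(" ", value)
    # Collapse runs of spaces left behind by the substitution
    return re.sub(r" {2,}", " ", value).strip()


print(to_safe_file_name('Estreia. "O Sargento na Cela 7". Episódio 1 O Prisioneiro'))
```

Replacing with a space rather than deleting keeps words from running together; accented characters are left untouched.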

the download result is NONE from this xml https://feed.xyzfm.space/jve6gh9jt8vm

Describe the bug
The download result is NONE for this feed: https://feed.xyzfm.space/jve6gh9jt8vm

Screenshots
[2023-05-27 10:54:22] Loading configuration (from file: "D:\AudioProject\data_engineering\podcast-downloader-master\config\config.json")
[2023-05-27 10:54:22] Checking "北海怪兽"
[2023-05-27 10:54:28] Last downloaded file ""
[2023-05-27 10:54:28] 北海怪兽: Nothing new
[2023-05-27 10:54:28] ------------------------------
[2023-05-27 10:54:28] Finished

The script output is on stderr

Podcast Downloader cannot download podcast from bowuzhi.fm

Describe the bug
I tried to use Podcast Downloader to download a podcast from bowuzhi.fm, but got the following error:
urllib.error.HTTPError: HTTP Error 403: Forbidden

Desktop (please complete the following information):

Link to RSS feed
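
Some servers answer `403 Forbidden` to requests carrying the default Python `urllib` User-Agent. Whether that is the cause here is an assumption, but sending an explicit User-Agent header is a common first thing to try. A minimal sketch, with a hypothetical agent string and helper names:

```python
import urllib.request

# Hypothetical identifier; any descriptive User-Agent string could be used
USER_AGENT = "podcast-downloader-example/0.1"


def build_request(url: str) -> urllib.request.Request:
    """Create a request that sends an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def download(url: str, destination: str) -> None:
    """Stream the response body to a file in 64 KiB chunks."""
    with urllib.request.urlopen(build_request(url)) as response:
        with open(destination, "wb") as output:
            while chunk := response.read(64 * 1024):
                output.write(chunk)
```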

Final filename can exceed 255 chars

Describe the bug
The final filename can exceed 255 characters if the template string includes another pattern in addition to the title, causing the program to crash.

To Reproduce
Steps to reproduce the behavior:

Set file_name_template to "[%publish_date%] %title%.%file_extension%". Download an episode with title longer than 255 chars.

Expected behavior
The program should not crash. The expanded template needs to be truncated.

Desktop (please complete the following information):

Link to RSS feed: https://lexfridman.com/feed/podcast/

Additional context
Checked the code. Looks like the truncation only applies to the title, and not the expanded template.

def str_to_filename(value: str) -> str:
    value = unicodedata.normalize("NFKC", value)
    value = re.sub(r"[\u0000-\u001F\u007F\*/:<>\?\\\|]", " ", value)

    return value.strip()[:FILE_NAME_CHARACTER_LIMIT]


def file_template_to_file_name(name_template: str, entity: RSSEntity) -> str:
    return (
        name_template.replace("%file_name%", link_to_file_name(entity.link))
        .replace("%publish_date%", time.strftime("%Y%m%d", entity.published_date))
        .replace("%file_extension%", link_to_extension(entity.link))
        .replace("%title%", str_to_filename(entity.title))
    )
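
One possible fix, sketched under assumptions: apply the limit to the fully expanded name and keep the extension intact. The helper `truncate_file_name` is hypothetical, and the value of `FILE_NAME_CHARACTER_LIMIT` is assumed to be 255. Note that many file systems count bytes rather than characters, so non-ASCII titles may need a stricter limit.

```python
import os

FILE_NAME_CHARACTER_LIMIT = 255  # assumed value of the project's constant


def truncate_file_name(file_name: str, limit: int = FILE_NAME_CHARACTER_LIMIT) -> str:
    """Shorten the whole file name to `limit` characters, keeping the extension."""
    if len(file_name) <= limit:
        return file_name

    stem, extension = os.path.splitext(file_name)
    # Cut only the stem so that the extension survives the truncation
    return stem[: max(limit - len(extension), 0)].rstrip() + extension


long_name = "[20230101] " + "x" * 300 + ".mp3"
print(len(truncate_file_name(long_name)))
```

It would be called on the result of `file_template_to_file_name`, after all placeholders have been replaced.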

Deal with files which are not part of the feed

Default location of configuration file

Is your feature request related to a problem? Please describe.
Since the project became a Python module, the configuration file needs to be in the home directory.

Describe the solution you'd like
The configuration needs to be placed in the home path, so that it is independent of the directory the script is called from.

Describe alternatives you've considered
I think a script parameter would be nice.
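
Both ideas can be combined: default to a file in the home directory and let a command-line parameter override it. The sketch below is only an illustration; the `--config` flag name and the `resolve_config_path` helper are assumptions.

```python
import argparse
import os

DEFAULT_CONFIG_FILE = "~/.podcast_downloader_config.json"


def resolve_config_path(arguments=None) -> str:
    """Return the absolute configuration path, honoring an optional --config flag."""
    parser = argparse.ArgumentParser(description="Podcast downloader (sketch)")
    parser.add_argument(
        "--config",
        default=DEFAULT_CONFIG_FILE,
        help="path to the configuration file",
    )
    options = parser.parse_args(arguments)
    # expanduser resolves "~", so the result does not depend on the calling directory
    return os.path.abspath(os.path.expanduser(options.config))


print(resolve_config_path([]))
```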

The downloaded audio does not match the audio provided by rss.

To Reproduce
Steps to reproduce the behavior:

Enter configuration:
{
    "if_directory_empty": "download_all_from_feed",
    "podcasts": [
        {
            "name": "Thai PBS Podcast",
            "rss_link": "https://www.thaipbspodcast.com/program-rss.php?id=133",
            "path": "xxx",
            "podcast_extensions": {".mp3": "audio/x-m4a"}
        }
    ]
}
See error:
The file sizes are all 65 KB and there is a read error.

rss has no attribute 'href'

Describe the bug
This is not this project's fault but the podcast RSS feed's; I wonder if there is a solution.
For a feed like 'https://feeds.audiomeans.fr/feed/88cf4afb-075f-42e2-b94b-3f3d4ed98f69.xml', downloading fails with: "AttributeError: object has no attribute 'href' "

To Reproduce
Steps to reproduce the behavior:
{
    "if_directory_empty": "download_all_from_feed",
    "podcasts": [
        {
            "name": "test",
            "rss_link": "https://feeds.audiomeans.fr/feed/88cf4afb-075f-42e2-b94b-3f3d4ed98f69.xml",
            "path": "~/test"
        }
    ]
}
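
The cause in this particular feed has not been confirmed. One defensive option, assuming each feed entry exposes its links as dict-like objects, is to skip links that have no `href` instead of failing on attribute access. Plain dictionaries stand in for the parsed links here, and `audio_links` is a hypothetical helper.

```python
from typing import Iterable, List, Mapping


def audio_links(links: Iterable[Mapping[str, str]]) -> List[str]:
    """Return the URLs of audio links, skipping entries without an "href"."""
    return [
        link["href"]
        for link in links
        if link.get("href") and link.get("type", "").startswith("audio/")
    ]


example_links = [
    {"type": "audio/mpeg", "href": "https://example.com/episode.mp3"},
    {"type": "audio/mpeg"},  # malformed: no "href"
    {"type": "text/html", "href": "https://example.com/"},
]
print(audio_links(example_links))
```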

The limit for downloads at once should also be available in the configuration file

Describe the bug
There is no way to limit the number of downloaded files in the configuration file.

Expected behavior
I can enter the value for the limit in the configuration file.

Desktop (please complete the following information):

OS: general
Python version: general
Version: 0.1.1
Link to RSS feed: general

Include the episode title in the file name

Is your feature request related to a problem? Please describe.
For better organization, it would be useful if the file name could contain the episode title.

Describe the solution you'd like
A new flag in the configuration could be require_title

Add check-in test run

Is your feature request related to a problem? Please describe.
The check-in workflow is missing.

Describe the solution you'd like
Add a workflow that checks every new commit by running the tests.

The structure of configuration file needs to be redesigned

Is your feature request related to a problem? Please describe.
Currently, all options in the configuration file are podcast data; there is no room for general options.

Describe the solution you'd like
The config file should have a section for general options.

Support mp4 file downloads

Is your feature request related to a problem? Please describe.
Some podcasts like this one have both .mp3 and .m4a audio files.

Describe the solution you'd like
It would be cool if the script could download both kinds!

Describe alternatives you've considered
Doing it in a shell command instead 🤷🏻 😅 I prefer the way your script keeps track of files already downloaded though!

If the directory is empty, download all new episodes from the last n days

Is your feature request related to a problem? Please describe.
Currently, if the podcast directory is empty, the script downloads all MP3s from the RSS feed. That is not good for someone who is already up to date with the podcast.

Describe the solution you'd like
An option in the config file that specifies how often the script is run (e.g. as a number of days).
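
The requested behavior amounts to filtering feed entries by publication date. A minimal sketch, assuming each entry carries its publication date as a UTC `time.struct_time`; the `Episode` type and `only_from_last_days` function are hypothetical stand-ins, not the project's own classes.

```python
import calendar
import time
from typing import Iterable, List, NamedTuple, Optional


class Episode(NamedTuple):
    """Hypothetical stand-in for a parsed feed entry."""

    title: str
    published_date: time.struct_time  # assumed to be in UTC


def only_from_last_days(
    episodes: Iterable[Episode], days: int, now: Optional[float] = None
) -> List[Episode]:
    """Keep only the episodes published within the last `days` days."""
    now = time.time() if now is None else now
    threshold = now - days * 24 * 60 * 60
    return [
        episode
        for episode in episodes
        if calendar.timegm(episode.published_date) >= threshold
    ]
```

Passing `now` explicitly keeps the function easy to test; in normal use it defaults to the current time.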

Package deploy job reacts to every pull request

Describe the bug
Every opened pull request starts a deploy.

Can't download from this RSS

Describe the bug
Can't download from this RSS: "https://www.omnycontent.com/d/playlist/6dd8413b-ede6-483a-bf4e-ab80014939de/20f4bf02-d62f-40b2-b532-af10011ba71b/2bdbf0f4-e0ca-4343-9fb2-af10011ba729/podcast.rss"

To Reproduce
JSON file:

{
    "if_directory_empty": "download_from_4_days",
    "podcasts": [
        {
            "name": " Listening Time",
            "rss_link": "https://www.omnycontent.com/d/playlist/6dd8413b-ede6-483a-bf4e-ab80014939de/20f4bf02-d62f-40b2-b532-af10011ba71b/2bdbf0f4-e0ca-4343-9fb2-af10011ba729/podcast.rss",
            "path": "./ttt",
            "file_name_template": "[%publish_date%] %title%.%file_extension%"
        }
    ]
}

Command:
python3 -m podcast_downloader

Expected behavior
Download episodes.

Error message

[2023-02-05 14:48:21] Loading configuration (from file: "~/.podcast_downloader_config.json")
[2023-02-05 14:48:21] Checking " Listening Time"
[2023-02-05 14:48:22] Last downloaded file "<none>"
[2023-02-05 14:48:22]  Listening Time: Nothing new
[2023-02-05 14:48:22] ------------------------------
[2023-02-05 14:48:22] Finished

Desktop (please complete the following information):

OS: Ubuntu
Python version: 3.8.10
Version: ??? (I don't know which version this refers to.)
Link to RSS feed: https://www.omnycontent.com/d/playlist/6dd8413b-ede6-483a-bf4e-ab80014939de/20f4bf02-d62f-40b2-b532-af10011ba71b/2bdbf0f4-e0ca-4343-9fb2-af10011ba729/podcast.rss

Deal with "gaps" of episodes inside the podcast directory

Reorganize README.md

error

C:\Users\Filipe Mota>python -m podcast_downloader
[2023-10-15 16:20:41] Loading configuration (from file: "~/.podcast_downloader_config.json")
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\Filipe Mota\AppData\Roaming\Python\Python312\site-packages\podcast_downloader\__main__.py", line 159, in <module>
    load_configuration_file(os.path.expanduser(CONFIG_FILE)),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Filipe Mota\AppData\Roaming\Python\Python312\site-packages\podcast_downloader\parameters.py", line 21, in load_configuration_file
    return json.load(json_file)
    ^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
    ^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\json\decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
    ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Invalid \escape: line 7 column 23 (char 217)
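
The configuration file itself is not shown, so the following is an assumption: `Invalid \escape` typically appears when a Windows path with single backslashes is written inside a JSON string. The sketch below reproduces the error with a hypothetical path and shows two ways to write the path so that it parses.

```python
import json

# Hypothetical configuration fragments; the paths are examples only
broken = r'{"path": "C:\Users\Example\Podcasts"}'      # single backslashes
escaped = r'{"path": "C:\\Users\\Example\\Podcasts"}'  # backslashes doubled
forward = '{"path": "C:/Users/Example/Podcasts"}'      # forward slashes


def is_valid_json(text: str) -> bool:
    """Report whether the text parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


for name, text in [("broken", broken), ("escaped", escaped), ("forward", forward)]:
    print(name, is_valid_json(text))
```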

Set up a podcast
Make sure that its directory is empty
Run the script

Additional context

Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "__main__.py", line 79, in <module>
    last_downloaded_file = get_last_downloaded(rss_source_path)
  File "downloaded.py", line 23, in get_last_downloaded
    return next(get_downloaded_files(podcast_directory))
StopIteration
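
The traceback shows `next()` being called on an iterator that yields nothing when the directory is empty. Giving `next()` a default value avoids the `StopIteration`. This is a simplified sketch: unlike the function in the traceback, it takes the iterator of file names directly instead of a directory path.

```python
from typing import Iterator, Optional


def get_last_downloaded(downloaded_files: Iterator[str]) -> Optional[str]:
    """Return the first file name from the iterator, or None if it is empty."""
    return next(downloaded_files, None)


print(get_last_downloaded(iter([])))
print(get_last_downloaded(iter(["episode2.mp3", "episode1.mp3"])))
```

The caller then has to treat `None` as "nothing downloaded yet".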

There is a problem with reading configuration file

Describe the bug
The script cannot find the existing configuration file in the home directory: ~/.podcast_downloader_config.json

To Reproduce
Steps to reproduce the behavior:

Place the configuration file at ~/.podcast_downloader_config.json
Run the script
See error: "Cannot find configuration file"

Expected behavior
Run without problems
