dicer / auto-tatort

Small script that automatically downloads (via cron) the current Tatort episodes from the ARD Mediathek

License: GNU General Public License v3.0

Python 100.00%

auto-tatort's People

mgrachten nitschkecm x42 olreti chillichicken clustermaster99 operatorofhell doczkal schustercf

auto-tatort's Issues

MediathekViewWeb offers higher quality

In the API endpoint the script uses, the highest quality currently linked is 960 x 540. However, https://mediathekviewweb.de/ lists a download from MDR at 1280 x 720. The file size is also roughly double.
That video also has the MDR logo in the top right corner.

The task here is to find out how these links can be discovered. This issue is a reminder.

No more downloads

Hello,

I run the script on my Raspbian (Raspberry Pi) every day at 23:00. But for the last few weeks no Tatort has been downloaded anymore. Has something changed at ardmediathek?
Am I the only one, or is it the same for everyone?

Trouble getting started

First of all, many thanks for this great script! Unfortunately I can't get it to run :( I put it on the Fritzbox (with freetz) and get the following error output:

Traceback (most recent call last):
  File "autotatort.py", line 73, in <module>
    urlretrieve(mediaURL, TARGET_DIR + fileName + ".mp4")
  File "/usr/lib/python2.7/urllib.py", line 98, in urlretrieve
  File "/usr/lib/python2.7/urllib.py", line 249, in retrieve
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 53: ordinal not in range(128)

On my regular PC I get the following error message:

    print "Could not get item with title '" + title + "'. Got redirected to '" + response.geturl() + "'. This is probably because the item is still in the RSS feed, but not available anymore."
                                          ^
SyntaxError: invalid syntax

Any ideas?

Kind regards

Some streams don't have a valid protocol

Recently I noticed some streams don't contain a valid protocol, which breaks downloading.

Here's an example that shows the below debug output:

2020-12-06 02:04:08.350562 -- https://classic.ardmediathek.de/tv/Tatort/Sendung?documentId=602916&bcastId=602916&rss=true
2020-12-06 02:04:08.350764 -- 2
2020-12-06 02:04:08.350954 -- 0
2020-12-06 02:04:08.351123 -- /home/glitsj16/downloads/
2020-12-06 02:04:08.351279 -- Prepending item date: 0
2020-12-06 02:04:10.629736 -- http://www.ardmediathek.de/play/media/83756176?devicetype=pc&features=flash
2020-12-06 02:04:10.630264 -- http://www.ardmediathek.de/play/media/83949958?devicetype=pc&features=flash
2020-12-06 02:04:10.631036 -- Title 'Tatort - Duisburg Ruhrort' was NOT excluded by regexp '.*(AD).*'
2020-12-06 02:04:10.631687 -- Title 'Tatort - Duisburg Ruhrort' was NOT excluded by regexp '.*Die Klassiker:.*'
2020-12-06 02:04:10.632167 -- Title 'Tatort - Duisburg Ruhrort' was NOT excluded by regexp '.*Drehbericht zu.*'
2020-12-06 02:04:10.632694 -- Title 'Tatort - Duisburg Ruhrort' was NOT excluded by regexp '.*H.rfassung.*'
2020-12-06 02:04:10.633261 -- Title 'Tatort - Duisburg Ruhrort' was NOT excluded by regexp '.*Livestream.*'
2020-12-06 02:04:10.633983 -- Title 'Tatort - Duisburg Ruhrort' was NOT excluded by regexp '.*Making.[oO]f.*'
2020-12-06 02:04:10.635080 -- Title 'Tatort - Duisburg Ruhrort' was NOT excluded by regexp '.*Outtakes.*'
2020-12-06 02:04:10.635739 -- Title 'Tatort - Duisburg Ruhrort' was NOT excluded by regexp '.*Tatort - Extra:.*'
2020-12-06 02:04:10.636334 -- Title 'Tatort - Duisburg Ruhrort' was NOT excluded by regexp '.*TV-Trailer.*'
2020-12-06 02:04:10.636909 -- Title 'Tatort - Duisburg Ruhrort' was NOT excluded by regexp '.*XL-Vorschau.*'
2020-12-06 02:04:10.637204 -- Using replace filter ','
2020-12-06 02:04:10.637474 -- Using replace filter ' (Video tgl. ab 22 Uhr)'
2020-12-06 02:04:10.637771 -- Using replace filter ' (Video tgl. ab 20 Uhr)'
2020-12-06 02:04:10.638051 -- Using replace filter 'Tatort - '
2020-12-06 02:04:10.638403 -- Using replace filter 'Tatort: '
2020-12-06 02:04:10.638735 -- Filtered title to 'Duisburg Ruhrort'
2020-12-06 02:04:10.667322 -- Taking first _mediaArray
2020-12-06 02:04:10.667728 -- Selected quality 2
2020-12-06 02:04:10.668065 -- We have only one stream. Will download it
2020-12-06 02:04:10.668507 -- Downloading //wdrmedien-a.akamaihd.net/medp/ondemand/weltweit/fsk12/231/2310113/2310113_30565799.mp4
2020-12-06 02:04:10.771462 -- Adding docId '83949958' to seen-db
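
The last "Downloading" line in the log shows a protocol-relative URL (it starts with `//`). One possible workaround, sketched here in Python 3 and not taken from the script itself, is to add a scheme before the download. The helper name `ensure_scheme` and the default of `https` are assumptions made for illustration.

```python
def ensure_scheme(url, default_scheme="https"):
    """Return url unchanged unless it is protocol-relative ("//host/path").

    ensure_scheme and default_scheme are hypothetical names, not part of
    autoTatort.py.
    """
    if url.startswith("//"):
        # Protocol-relative URL: prepend the assumed default scheme.
        return default_scheme + ":" + url
    return url
```

Calling the helper on every stream URL before the download is harmless, since URLs that already carry a scheme pass through unchanged.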

Download and convert subtitles

Many (all?) Tatort episodes include subtitles in an XML file:
"_subtitleUrl":"/static/avportal/untertitel_mediathek_preview/23229170.xml","_subtitleOffset":0,"

-> http://www.ardmediathek.de/static/avportal/untertitel_mediathek_preview/23229170.xml

The timestamps start at 10:00:00.000, but apart from that it should be fairly easy to convert this into a format that ordinary players can handle, e.g. SRT.

First step: download the subtitles
Second step: convert them
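
The conversion step could look roughly like the Python 3 sketch below. It only covers the timestamp handling: it assumes the XML stores times as `HH:MM:SS.mmm` strings (the issue only states that they start at 10:00:00.000) and shifts them by ten hours into SRT's `HH:MM:SS,mmm` form. Both function names are hypothetical, and parsing the XML itself is left out because its structure is not shown here.

```python
def to_srt_timestamp(stamp, offset_hours=10):
    """Convert "10:00:05.120" to the SRT form "00:00:05,120".

    Assumes an HH:MM:SS.mmm input whose clock starts at offset_hours.
    """
    hours, minutes, rest = stamp.split(":")
    seconds, _, fraction = rest.partition(".")
    shifted_hours = int(hours) - offset_hours
    if shifted_hours < 0:
        raise ValueError("timestamp lies before the assumed offset: " + stamp)
    # Pad or cut the fraction to exactly three digits (milliseconds).
    millis = int((fraction + "000")[:3])
    return "%02d:%02d:%02d,%03d" % (shifted_hours, int(minutes), int(seconds), millis)


def srt_cue(index, begin, end, text):
    """Format one SRT cue: running number, time range, then the text."""
    return "%d\n%s --> %s\n%s\n" % (
        index, to_srt_timestamp(begin), to_srt_timestamp(end), text)
```

Cues would be written one after another, separated by a blank line, which is what SRT players expect.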

Escaping of the title seems to fail in some cases

Traceback (most recent call last):
  File "auto-tatort/autoTatort.py", line 52, in <module>
    print "Downloaded '" + title + "'"
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 40: ordinal not in range(128)
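
The traceback points at printing a title with an umlaut to an output that only accepts ASCII. A minimal Python 3 sketch of one way around this is to replace unencodable characters before printing. The helper `to_printable` is a hypothetical name, and falling back to `ascii` when the output encoding is unknown is an assumption.

```python
import sys


def to_printable(text, encoding=None):
    """Return text with characters the target encoding lacks replaced by "?".

    to_printable is a hypothetical helper, not part of autoTatort.py.
    """
    # Use the console encoding if known, otherwise assume plain ASCII.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return text.encode(encoding, errors="replace").decode(encoding)


print(to_printable("Downloaded 'Böse Überraschung'"))
```

On a UTF-8 terminal the text is printed unchanged; on an ASCII-only output the umlauts become question marks instead of raising an error.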

Has the Mediathek API changed?

Hi,
for about a week the URL from the sample config has no longer been working.

Status code: 410 "Gone"

Which API URLs do I need to use to get the script running again?

Many thanks for the cool script and for your support.

Error when downloading episodes with UTF-8 characters in their names

At least I think that's the problem:

DiskStation$ python autoTatort.py 
Traceback (most recent call last):
  File "autoTatort.py", line 218, in <module>
    if (os.path.isfile(fullFileName)) == True:
  File "/usr/local/lib/python2.7/genericpath.py", line 29, in isfile
    st = os.stat(path)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc4' in position 45: ordinal not in range(128)
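
The character in the traceback, `u'\xc4'`, is "Ä", so the failure happens when a non-ASCII title ends up in a file name on a system whose file system encoding is ASCII. One workaround, sketched below in Python 3 and not necessarily what the script does, is to transliterate titles to ASCII before building the path. The name `ascii_filename` and the umlaut table are assumptions for illustration.

```python
import unicodedata

# Common German transliterations; extend as needed.
UMLAUTS = {"ä": "ae", "ö": "oe", "ü": "ue",
           "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss"}


def ascii_filename(title):
    """Return an ASCII-only version of title (hypothetical helper)."""
    # Normalise first so umlauts are single code points.
    title = unicodedata.normalize("NFC", title)
    for char, replacement in UMLAUTS.items():
        title = title.replace(char, replacement)
    # Strip remaining accents, then drop anything still outside ASCII.
    decomposed = unicodedata.normalize("NFKD", title)
    return decomposed.encode("ascii", "ignore").decode("ascii")
```

The trade-off is that file names no longer match the original titles exactly; setting a UTF-8 locale for the cron job may be an alternative where the system supports it.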

auto-tatort currently not working - has the API changed?

Hello,

has the ARD Mediathek perhaps changed something? The last two times auto-tatort ran via cron job, I got the following error message:

Traceback (most recent call last):
  File "/home/pi/bin/autoTatort.py", line 101, in <module>
    urlretrieve(mediaURL, TARGET_DIR + fileName + ".mp4")
  File "/usr/lib/python2.7/urllib.py", line 98, in urlretrieve
    return opener.retrieve(url, filename, reporthook, data)
  File "/usr/lib/python2.7/urllib.py", line 289, in retrieve
    "of %i bytes" % (read, size), result)
urllib.ContentTooShortError: retrieval incomplete: got only 300621911 out of 1310005972 bytes

Can anyone confirm this? Does anyone have a solution?
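
`ContentTooShortError` means the connection ended before the announced number of bytes had arrived, which can also happen without any API change. A possible mitigation, sketched here for Python 3 (the traceback above comes from Python 2.7), is to retry the download a few times. The function name, the `attempts` and `wait_seconds` parameters, and the injectable `retrieve` argument are assumptions made for illustration.

```python
import time
from urllib.error import ContentTooShortError
from urllib.request import urlretrieve


def download_with_retries(url, target, attempts=3, wait_seconds=0,
                          retrieve=urlretrieve):
    """Try the download up to `attempts` times; return the attempt that worked.

    Each retry starts from scratch; this sketch does not resume partial files.
    """
    for attempt in range(1, attempts + 1):
        try:
            retrieve(url, target)
            return attempt
        except ContentTooShortError:
            if attempt == attempts:
                raise  # give up and surface the original error
            time.sleep(wait_seconds)
```

Passing `retrieve` as a parameter keeps the sketch testable without network access; in normal use the default `urlretrieve` is called.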

Detect geoblocking and other 403 responses

If you are affected by geoblocking, autoTatort gives you an .mp4 file with this HTML content:

<HTML><HEAD>
<TITLE>Access Denied</TITLE>
</HEAD><BODY>
<H1>Access Denied</H1>
 
You don't have permission to access "http&#58;&#47;&#47;pdvideosdaserste&#45;a&#46;akamaihd&#46;net&#47;de&#47;2017&#47;09&#47;14&#47;5f1c3905&#45;3224&#45;4af6&#45;b615&#45;01a5e5016ae9&#47;960&#45;1&#46;mp4" on this server.<P>
Reference&#32;&#35;18&#46;f5761602&#46;1506094259&#46;4c6fc5a
</BODY>
</HTML>

The corresponding error in the Mediathek web interface reads: "Unfortunately this video cannot be played. Important: geoblocked content can only be accessed from within Germany. We ask for your understanding."

It would be nice not to download the file at all in this situation. The server returns 403 in this case.
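
Since the server answers with 403, the status could be checked before anything is written to disk. The following Python 3 sketch shows one way to do that with `urllib`; it is an illustration under assumptions, not the script's actual behaviour. The name `is_forbidden` and the injectable `opener` parameter are hypothetical.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def is_forbidden(url, opener=urlopen):
    """Return True if opening url yields HTTP 403, False if it opens fine.

    Other HTTP errors are re-raised so they are not silently ignored.
    """
    try:
        # Opening the URL is enough to learn the status; the body is not read.
        with opener(url):
            return False
    except HTTPError as error:
        if error.code == 403:
            return True
        raise
```

A caller could skip the item and log a geoblocking hint whenever `is_forbidden(mediaURL)` is true, instead of saving the "Access Denied" page as an .mp4 file.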

Download of an RSS item fails with ValueError

Traceback (most recent call last):
  File "auto-tatort/autoTatort.py", line 40, in <module>
    media = json.loads(html)
  File "/usr/lib/python2.7/json/__init__.py", line 326, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 365, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 383, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

Unfortunately I have not yet been able to find out why. I have added tracing code and hope to find the cause soon.
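
Whatever the cause turns out to be, the script could avoid aborting the whole run by treating an undecodable response as "skip this item". A minimal Python 3 sketch, with `parse_media_json` as a hypothetical helper name:

```python
import json


def parse_media_json(text):
    """Return the decoded JSON object, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError in Python 3.
        return None
```

When `None` comes back, logging the first few hundred characters of the response would probably help with the tracing mentioned above, for example if the server sent an HTML error page instead of JSON.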

Feature: independence from the time of the query

autoTatort should not depend on running at a particular time. Instead it should remember when the feed was last fetched. All feed items before that point are ignored. All later ones are downloaded, and the date of the most recently fetched feed is stored.

Still to be clarified: do the items in the feed actually carry a real publication date, or are they pre- or backdated?

Alternative: all items are always parsed. Then the disk is checked for an existing download with that title. If there is none, the Tatort is downloaded. This behaviour should probably be optional, though, since it cannot be assumed that all users leave the episodes in the incoming folder.
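
A third option, which sidesteps both the date question and the incoming folder, is to remember the IDs of items that were already handled. The debug log quoted in another issue mentions a "seen-db" keyed by docId, so the script appears to have moved in this direction; the Python 3 sketch below is only an illustration of the idea. The file format (a JSON list of IDs), the function names, and the `docId` key on each item are assumptions.

```python
import json
import os


def load_seen(path):
    """Load the set of already handled IDs; empty if the file is missing."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as handle:
        return set(json.load(handle))


def save_seen(path, seen):
    """Store the IDs as a sorted JSON list (assumed format)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(sorted(seen), handle)


def unseen_items(items, seen):
    """Keep only feed items whose docId has not been recorded yet."""
    return [item for item in items if item["docId"] not in seen]
```

With this approach the script can run at any time and as often as needed, because an item is only downloaded the first time its ID shows up.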

Trying to Download "Lindenstraße"

Dear dicer,
I've found your wonderful py script, that helps a lot to download Tatort.
But additionally, I would like to download Lindenstraße with the help of the following config.json:

{
  "debug": 1,
  "debugFile": "debug.log",
  "feeds": [
    {
      "enabled": 1,
      "id": "Lindenstrasse",
      "quality": -1,
      "subtitles": 0,
      "targetFolder": "/mnt/seagate/gemeinsam/Filme_Serien/Lindenstrasse/",
      "url": "https://www.ardmediathek.de/tv/Lindenstra%C3%9Fe/Sendung?documentId=5280&bcastId=5280&rss=true",
      "exclude": [
        {
          "regexp": ".Vorschau."
        },
        {
          "regexp": ".Outtake."
        },
        {
          "regexp": ".So geht."
        }
      ],
      "titleFilters": [
        {
          "replace": " (Video tgl. ab 22 Uhr)"
        },
        {
          "replace": " (Video tgl. ab 20 Uhr)"
        }
      ],
      "titlePrependItemDate": 0
    }
  ],
  "downloadedFeedItemsDatabase": "downloadedItems.json",
  "version": 5
}

But I'm not successful.
The script successfully finds a file to download, but the download does not start, and after a while it tells me that the file with the requested quality could not be downloaded.
Downloading the given file manually works fine ...

Maybe you can help.

Best regards
Heiko

File size

It would be good if one could specify that, for example, no files under 100 megabytes are downloaded. Just an idea for improving this great code.
Many thanks!
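
One way such a limit might work is to compare the `Content-Length` header of the response against a threshold before downloading. The Python 3 sketch below shows only the comparison; the function name, the `min_megabytes` parameter, and the choice to accept files of unknown size are assumptions.

```python
def is_large_enough(content_length, min_megabytes=100):
    """Decide from a Content-Length value whether a file passes the size limit.

    content_length may be a header string, an int, or None if the server
    did not announce a size. Unknown sizes are accepted in this sketch.
    """
    if content_length is None:
        return True
    return int(content_length) >= min_megabytes * 1024 * 1024
```

In a real run the value would come from something like `response.headers.get("Content-Length")` on an opened URL.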

IndexError: list index out of range

Nice script. I'm currently testing v2 a bit.

With the Mitternachtsspitzen feed I get this at the fifth entry:

Traceback (most recent call last):
  File "/home/pi/bin/autotatort/auto-tatort/autoTatort.py", line 165, in <module>
    mediaLinks = media["_mediaArray"][1]["_mediaStreamArray"]
IndexError: list index out of range
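
The traceback shows the script indexing `_mediaArray` at position 1, which fails when the item has only one entry. A defensive variant, sketched in Python 3 with a hypothetical helper name, falls back to the first entry and returns an empty list when nothing is there:

```python
def pick_stream_array(media, preferred_index=1):
    """Return the _mediaStreamArray of the preferred entry, if it exists.

    Falls back to the first _mediaArray entry, or [] if there is none.
    """
    arrays = media.get("_mediaArray", [])
    if not arrays:
        return []
    index = preferred_index if preferred_index < len(arrays) else 0
    return arrays[index].get("_mediaStreamArray", [])
```

Whether the first entry is always a usable substitute for the second is an assumption; the key names are the ones visible in the traceback.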

Introduce a filter that excludes audio-description versions (Hörfassungen) and short features from the download

Hi,

I'm very happy with your script and got it running on a Raspberry Pi. However, sometimes I don't get the Tatort itself but only the Hörfassung (audio-description version). Is there a way to filter this better?
Here is the file list I have...

Any hint or tip is more than welcome!

Many thanks!
Carsten
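
A title-based exclusion filter could be sketched as follows in Python 3. The three patterns are copied from the debug log quoted in another issue on this page; the function name `is_excluded` and the module-level list are assumptions for illustration.

```python
import re

# Patterns as they appear in the debug log of a later version of the script.
EXCLUDE_PATTERNS = [
    r".*H.rfassung.*",
    r".*Making.[oO]f.*",
    r".*TV-Trailer.*",
]


def is_excluded(title, patterns=EXCLUDE_PATTERNS):
    """Return True if any exclusion pattern matches the title."""
    return any(re.match(pattern, title) for pattern in patterns)
```

The `.` in `H.rfassung` stands in for the umlaut, so the pattern matches regardless of how the "ö" is encoded in the pattern file.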

MP4 download contains streaming information

Instead of an MP4 video, the downloaded file contains streaming information:

#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=604000,RESOLUTION=512x288,CODECS="avc1.77.30, mp4a.40.2",CLOSED-CAPTIONS=NONE
https://dasersteuni-vh.akamaihd.net/i/de/2018/01/06/ce71a409-4775-4d71-a903-3c48104831ec/,512-1,640-1,320-1,480-1,960-1,.mp4.csmil/index_0_av.m3u8?null=0
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1212000,RESOLUTION=640x360,CODECS="avc1.77.30, mp4a.40.2",CLOSED-CAPTIONS=NONE
https://dasersteuni-vh.akamaihd.net/i/de/2018/01/06/ce71a409-4775-4d71-a903-3c48104831ec/,512-1,640-1,320-1,480-1,960-1,.mp4.csmil/index_1_av.m3u8?null=0
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=188000,RESOLUTION=320x180,CODECS="avc1.66.30, mp4a.40.2",CLOSED-CAPTIONS=NONE
https://dasersteuni-vh.akamaihd.net/i/de/2018/01/06/ce71a409-4775-4d71-a903-3c48104831ec/,512-1,640-1,320-1,480-1,960-1,.mp4.csmil/index_2_av.m3u8?null=0
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=316000,RESOLUTION=480x270,CODECS="avc1.66.30, mp4a.40.2",CLOSED-CAPTIONS=NONE
https://dasersteuni-vh.akamaihd.net/i/de/2018/01/06/ce71a409-4775-4d71-a903-3c48104831ec/,512-1,640-1,320-1,480-1,960-1,.mp4.csmil/index_3_av.m3u8?null=0
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1988000,RESOLUTION=960x540,CODECS="avc1.77.30, mp4a.40.2",CLOSED-CAPTIONS=NONE
https://dasersteuni-vh.akamaihd.net/i/de/2018/01/06/ce71a409-4775-4d71-a903-3c48104831ec/,512-1,640-1,320-1,480-1,960-1,.mp4.csmil/index_4_av.m3u8?null=0

Reproduce like so:

git clone git@github.com:dicer/auto-tatort.git
cd auto-tatort/
mv config.json.sample config.json
python autoTatort.py
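
The file content shown above is an HLS master playlist: each `#EXT-X-STREAM-INF` line describes a variant, and the line after it holds that variant's URL. A script could at least recognise this case and pick the variant with the highest bandwidth, as in the Python 3 sketch below. The function name `best_variant` is hypothetical, and this is not how the script currently behaves.

```python
import re


def best_variant(playlist_text):
    """Return the URL of the highest-BANDWIDTH variant, or None if none found."""
    best_bandwidth, best_url = -1, None
    lines = [line.strip() for line in playlist_text.splitlines() if line.strip()]
    # Walk over (info line, following line) pairs.
    for info, url in zip(lines, lines[1:]):
        if not info.startswith("#EXT-X-STREAM-INF:") or url.startswith("#"):
            continue
        # Require a separator before BANDWIDTH so AVERAGE-BANDWIDTH is not matched.
        match = re.search(r"(?:^|[:,])BANDWIDTH=(\d+)", info)
        if match and int(match.group(1)) > best_bandwidth:
            best_bandwidth, best_url = int(match.group(1)), url
    return best_url
```

Note that the selected URL points to another playlist listing the media segments, not to a single MP4 file, so saving the stream still needs an HLS-capable downloader.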

Is it broken now?

It seems something has changed.

The tool is only downloading the "Lammerts Leichen" clips and not the full episodes.

Doesn't work for me on the Pi.

I get the following error message:
https://pastee.org/uge4q

I'm stuck and don't know what to try next.