guessit-io / guessit

GuessIt is a python library that extracts as much information as possible from a video filename.

Home Page: https://guessit-io.github.io/guessit

License: GNU Lesser General Public License v3.0

Python 99.94% Dockerfile 0.06%
python parser media filename scene release

guessit's Introduction

GuessIt


GuessIt is a python library that extracts as much information as possible from a video filename.

It has a very powerful matcher that can guess properties of a video from its filename alone. This matcher works with both movies and TV show episodes.

For example, GuessIt can do the following:

$ guessit "Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi"
For: Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi
GuessIt found: {
    "title": "Treme",
    "season": 1,
    "episode": 3,
    "episode_title": "Right Place, Wrong Time",
    "source": "HDTV",
    "video_codec": "Xvid",
    "release_group": "NoTV",
    "container": "avi",
    "mimetype": "video/x-msvideo",
    "type": "episode"
}
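
The same guess can be made from Python. The following is a minimal sketch, assuming a GuessIt release (2.x or later) that exposes the `guessit()` function; `guess_properties` is a hypothetical helper name, and the older issues further down use a different, earlier API (`guess_movie_info`, `guess_episode_info`).

```python
try:
    from guessit import guessit  # third-party package: pip install guessit
except ImportError:  # keep the sketch importable when guessit is missing
    guessit = None


def guess_properties(filename):
    """Return the properties GuessIt finds for one filename as a plain dict."""
    if guessit is None:
        raise RuntimeError("guessit is not installed")
    return dict(guessit(filename))


if guessit is not None:
    info = guess_properties("Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi")
    # Print a few of the properties shown in the command-line output above.
    print(info.get("title"), info.get("season"), info.get("episode"))
```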

More information is available at guessit-io.github.io/guessit.

Support

This project is hosted on GitHub. Feel free to open an issue if you think you have found a bug or something is missing in guessit.

GuessIt relies on the Rebulk project for pattern and rule registration.

License

GuessIt is licensed under the LGPLv3 license.

guessit's People

Contributors

actions-user, aurieh, cheese1, codyopel, duramato, fernandog, hdvinnie, jonbenta, kannibalox, labrys, laser-yi, medariox, mgorny, milkers69, ratoaq2, sharkykh, toilal


guessit's Issues

MULTI tag

MULTI tag should be recognized as a special Language information (that means multiple, but unspecified, languages).

More subtitles guessing

VOSTFR tag should be recognized as French subtitle.

VO = Version Originale (original language)
ST = Subtitles
FR = French

I don't know whether similar tags exist in other countries, but this one is very common in French release names.

SUBFORCED tag should be recognized as subtitle presence.
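
The tag requests above (MULTI, VOSTFR, SUBFORCED) can be illustrated with a small lookup table. This is only a sketch of the requested behaviour, not GuessIt's implementation: `TAG_MEANINGS`, `find_language_tags` and the property names in the table are all hypothetical.

```python
import re

# Hypothetical table: release-name tag -> properties it would imply.
TAG_MEANINGS = {
    "MULTI": {"language": "multiple"},          # several, unspecified languages
    "VOSTFR": {"subtitle_language": "French"},  # original audio, French subtitles
    "SUBFORCED": {"subtitles": True},           # subtitles are present
}


def find_language_tags(release_name):
    """Collect the properties implied by known tags in a release name."""
    found = {}
    # Split on the separators commonly used in release names.
    for token in re.split(r"[.\s_-]+", release_name.upper()):
        found.update(TAG_MEANINGS.get(token, {}))
    return found
```

For example, `find_language_tags("Show.S01E01.VOSTFR.HDTV.XviD-GRP")` yields a French subtitle language and nothing else.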

No module named transfo.split_path_components

I discovered an error in GuessIt via Subliminal. The developer for Subliminal believes it might be related to the fact that I'm running this on OSX.

Cheers,
Rickard

$ python subliminal -l en The.Big.Bang.Theory.S05E18.HDTV.x264-LOL.mp4
Traceback (most recent call last):
  File "subliminal", line 79, in <module>
    main()
  File "subliminal", line 64, in main
    force=args.force, multi=args.multi)
  File "build/bdist.macosx-10.7-intel/egg/subliminal/async.py", line 133, in download_subtitles
  File "build/bdist.macosx-10.7-intel/egg/subliminal/async.py", line 119, in list_subtitles
  File "build/bdist.macosx-10.7-intel/egg/subliminal/core.py", line 54, in create_list_tasks
  File "build/bdist.macosx-10.7-intel/egg/subliminal/videos.py", line 217, in scan
  File "build/bdist.macosx-10.7-intel/egg/subliminal/videos.py", line 71, in from_path
  File "/Library/Python/2.7/site-packages/guessit-0.4_dev-py2.7.egg/guessit/__init__.py", line 62, in guess_file_info
    m = IterativeMatcher(filename, filetype=filetype)
  File "/Library/Python/2.7/site-packages/guessit-0.4_dev-py2.7.egg/guessit/matcher.py", line 89, in __init__
    apply_transfo('split_path_components')
  File "/Library/Python/2.7/site-packages/guessit-0.4_dev-py2.7.egg/guessit/matcher.py", line 85, in apply_transfo
    exec 'from transfo.%s import process' % transfo_name in globals(), locals()
  File "<string>", line 1, in <module>
ImportError: No module named transfo.split_path_components
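
The failing statement builds an implicit relative import (`from transfo.<name> import process`) and runs it through `exec`. As a hedged illustration only, a module can also be resolved by its fully qualified name with `importlib.import_module`. Whether this would have fixed the egg install reported here is not established, and `load_attr` is a hypothetical helper; the demonstration uses a standard-library module so that it runs anywhere.

```python
import importlib


def load_attr(package, module_name, attr="process"):
    """Import package.module_name by absolute name and return one attribute."""
    module = importlib.import_module("%s.%s" % (package, module_name))
    return getattr(module, attr)


# Demonstration with the standard library instead of guessit's transfo package.
decoder_cls = load_attr("json", "decoder", attr="JSONDecoder")
```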

Title not recognized correctly

Two interesting examples:

guessit.guess_movie_info('The.Kings.Speech.2010.1080p.BluRay.DTS.x264.D Z0N3.mkv')
{u'mimetype': 'video/x-matroska', u'videoCodec': u'h264', u'container': u'mkv', u'format': u'BluRay', u'title': u'The', u'releaseGroup': u'KiNGS', u'screenSize': u'1080p', u'year': 2010, u'type': u'movie', u'audioCodec': u'DTS'}

guessit.guess_movie_info('Street.Kings.2008.BluRay.1080p.DTS.x264.dxva EuReKA.mkv')
{u'mimetype': 'video/x-matroska', u'videoCodec': u'h264', u'container': u'mkv', u'format': u'BluRay', u'title': u'Street', u'other': [u'DXVA'], u'screenSize': u'1080p', u'year': 2008, u'type': u'movie', u'audioCodec': u'DTS'}

Is there any way around it?

Language code (abbreviation) in movie title (follow-up on issue #28)

This is a follow-up to issue #28, where a language was wrongly detected in a movie title. I have some cases where guessit detects a language code inside my title.

ex1

guessit.guess_movie_info('The.Rum.Diary.2011.1080p.BluRay.DTS.x264.D-Z0N3.mkv')

{u'mimetype': 'video/x-matroska', u'videoCodec': u'h264', u'container': u'mkv', u'language': [Language(Romanian)], u'format': u'BluRay', u'title': u'The', u'screenSize': u'1080p', u'year': 2011, u'type': u'movie', u'audioCodec': u'DTS'}

"Rum" is interpreted as a language code by guessit

ex2

guessit.guess_movie_info('Life.Of.Pi.2012.1080p.BluRay.DTS.x264.D-Z0N3')

{u'videoCodec': u'h264', u'extension': u'd-z0n3', u'language': [Language(Pali)], u'format': u'BluRay', u'title': u'Life Of', u'screenSize': u'1080p', u'year': 2012, u'type': u'movie', u'audioCodec': u'DTS'}

"Pi" is interpreted as a language code by guessit

If I replace "Pi" with "Pali" or "Rum" with "Romanian" in the file name, the title is detected correctly.
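
One possible heuristic for the two cases above is to look for language codes only after the year, so that short words in the title are never read as languages. This is a sketch under that assumption and not how guessit handles it; `LANGUAGE_CODES` is a tiny illustrative subset and `split_title_and_languages` is a hypothetical name.

```python
import re

# Tiny illustrative subset of language codes (lower case).
LANGUAGE_CODES = {"rum": "Romanian", "pi": "Pali", "fr": "French"}


def split_title_and_languages(name):
    """Treat everything before the year as title; scan for languages after it."""
    tokens = name.split(".")
    year_index = next(
        (i for i, token in enumerate(tokens)
         if re.fullmatch(r"(?:19|20)\d{2}", token)),
        len(tokens),  # no year found: the whole name is the title
    )
    title = " ".join(tokens[:year_index])
    languages = [
        LANGUAGE_CODES[token.lower()]
        for token in tokens[year_index + 1:]
        if token.lower() in LANGUAGE_CODES
    ]
    return title, languages
```

The obvious limitation is a release name without a year, where this sketch finds no language at all.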

Thank you

Quality properties enhancements

While working on guessit integration for Flexget, I realized that Flexget's existing quality system seems to be more complete than guessit's.

Can we discuss this point? Here is an excerpt of the code that shows the various qualities with their associated regexes (https://github.com/Flexget/Flexget/blob/master/flexget/utils/qualities.py):

_resolutions = [
    QualityComponent('resolution', 10, '360p'),
    QualityComponent('resolution', 20, '368p', '368p?'),
    QualityComponent('resolution', 30, '480p', '480p?'),
    QualityComponent('resolution', 40, '576p', '576p?'),
    QualityComponent('resolution', 45, 'hr'),
    QualityComponent('resolution', 50, '720i'),
    QualityComponent('resolution', 60, '720p', '(1280x)?720p?x?'),
    QualityComponent('resolution', 70, '1080i'),
    QualityComponent('resolution', 80, '1080p', '(1920x)?1080p?')
]
_sources = [
    QualityComponent('source', 10, 'workprint', modifier=-8),
    QualityComponent('source', 20, 'cam', '(?:hd)?cam', modifier=-7),
    QualityComponent('source', 30, 'ts', '(?:hd)?ts|telesync', modifier=-6),
    QualityComponent('source', 40, 'tc', 'tc|telecine', modifier=-5),
    QualityComponent('source', 50, 'r5', 'r[2-8c]', modifier=-4),
    QualityComponent('source', 60, 'hdrip', 'hd[\W_]?rip', modifier=-3),
    QualityComponent('source', 70, 'ppvrip', 'ppv[\W_]?rip', modifier=-2),
    QualityComponent('source', 80, 'preair', modifier=-1),
    QualityComponent('source', 90, 'tvrip', 'tv[\W_]?rip'),
    QualityComponent('source', 100, 'dsr', 'dsr|ds[\W_]?rip'),
    QualityComponent('source', 110, 'sdtv', '(?:[sp]dtv|dvb)(?:[\W_]?rip)?'),
    QualityComponent('source', 120, 'webrip', 'web[\W_]?rip'),
    QualityComponent('source', 130, 'dvdscr', '(?:(?:dvd|web)[\W_]?)?scr(?:eener)?', modifier=0),
    QualityComponent('source', 140, 'bdscr', 'bdscr(?:eener)?'),
    QualityComponent('source', 150, 'hdtv', 'a?hdtv(?:[\W_]?rip)?'),
    QualityComponent('source', 160, 'webdl', 'web(?:[\W_]?(dl|hd))?'),
    QualityComponent('source', 170, 'dvdrip', 'dvd(?:[\W_]?rip)?'),
    QualityComponent('source', 175, 'remux'),
    QualityComponent('source', 180, 'bluray', '(?:b[dr][\W_]?rip|blu[\W_]?ray(?:[\W_]?rip)?)')
]
_codecs = [
    QualityComponent('codec', 10, 'divx'),
    QualityComponent('codec', 20, 'xvid'),
    QualityComponent('codec', 30, 'h264', '[hx].?264'),
    QualityComponent('codec', 40, '10bit', '10.?bit|hi10p')
]
channels = '(?:(?:[\W_]?5[\W_]?1)|(?:[\W_]?2[\W_]?(?:0|ch)))'
_audios = [
    QualityComponent('audio', 10, 'mp3'),
    # TODO: No idea what order these should go in or if we need different regexps
    QualityComponent('audio', 20, 'dd5.1', 'dd%s' % channels),
    QualityComponent('audio', 30, 'aac', 'aac%s?' % channels),
    QualityComponent('audio', 40, 'ac3', 'ac3%s?' % channels),
    QualityComponent('audio', 50, 'flac', 'flac%s?' % channels),
    # The DTSs are a bit backwards, but the more specific one needs to be parsed first
    QualityComponent('audio', 60, 'dtshd', 'dts[\W_]?hd(?:[\W_]?ma)?'),
    QualityComponent('audio', 70, 'dts'),
    QualityComponent('audio', 80, 'truehd')
]

resolution is equivalent to screenSize
source is equivalent to format
codec is equivalent to videoCodec
audio is equivalent to audioCodec + audioChannels

The first point to note is that Flexget quality properties are ordered. It could be useful for guessit to apply the same logic to the properties it returns, so that one can tell that the screenSize of Vegas.S01E09.FRENCH.720p.WEB-DL.H264-MiND is lower than that of Vegas.S01E09.FRENCH.1080p.WEB-DL.H264-MiND, and that the videoCodec of Vegas.S01E11.FRENCH.LD.DVDRip.x264-MiND is better than that of Vegas.S01E11.FRENCH.LD.DVDRip.XviD-MiND.
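
The ordering idea can be sketched with plain rank tables. The numeric ranks are taken from the Flexget excerpt above; the function name `compare_quality` and the table names are hypothetical and not part of guessit or Flexget.

```python
# Ranks copied from the Flexget excerpt; a higher number means better quality.
RESOLUTION_RANK = {"360p": 10, "368p": 20, "480p": 30, "576p": 40,
                   "720p": 60, "1080p": 80}
VIDEO_CODEC_RANK = {"divx": 10, "xvid": 20, "h264": 30}


def compare_quality(ranks, left, right):
    """Return -1, 0 or 1 when left is worse than, equal to or better than right."""
    left_rank = ranks[left.lower()]
    right_rank = ranks[right.lower()]
    return (left_rank > right_rank) - (left_rank < right_rank)
```

With these tables, `compare_quality(RESOLUTION_RANK, "720p", "1080p")` is negative and `compare_quality(VIDEO_CODEC_RANK, "h264", "XviD")` is positive, which matches the two Vegas examples.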

The second point is that quality properties are more complete in Flexget. Should we add those property values to guessit?

If you agree, I'll open a pull request to do the work.

Wrong renaming in <dir><sub-dir><file> situation

Hi,

I'm using a post-processing script for nzbget called videosort, which uses the guessit library to determine movie and series names.

In the following scenario guessit seems to get it wrong:

Scenario:

Dead Man Down (2013) BRRiP XViD DD5_1 Custom NLSubs _=-_lt_ Q_o_Q _gt_-=_ XD607ebb-BRc59935-5155473f-1c5f49 XD607ebb-BRc59935-5155473f-1c5f49.avi

Output:
info Tue Jul 16 2013 21:09:43 VideoSort: Move satellites for /share/Download/movie/Dead Man Down (2013) BRRiP XViD DD5_1 Custom NLSubs =-lt Q_o_Q gt-=/XD607ebb-BRc59935-5155473f-1c5f49/XD607ebb-BRc59935-5155473f-1c5f49.avi
info Tue Jul 16 2013 21:09:43 VideoSort: Moved: /share/Download/test/Xd607Ebb-Brc59935-5155473F-1C5F49 (2013).avi
info Tue Jul 16 2013 21:09:43 VideoSort: destination path: /share/Download/test/Xd607Ebb-Brc59935-5155473F-1C5F49 (2013).avi
info Tue Jul 16 2013 21:09:43 VideoSort: path after cleanup: Xd607Ebb-Brc59935-5155473F-1C5F49 (2013).avi
info Tue Jul 16 2013 21:09:43 VideoSort: path after subst: Xd607Ebb-Brc59935-5155473F-1C5F49 (2013)..avi
info Tue Jul 16 2013 21:09:43 VideoSort: t (%y).%ext
info Tue Jul 16 2013 21:09:43 VideoSort: }
info Tue Jul 16 2013 21:09:43 VideoSort: [1.00] "type": "movie"
info Tue Jul 16 2013 21:09:43 VideoSort: [1.00] "year": 2013,
info Tue Jul 16 2013 21:09:43 VideoSort: [1.00] "format": "BluRay",
info Tue Jul 16 2013 21:09:43 VideoSort: [0.80] "title": "XD607ebb-BRc59935-5155473f-1c5f49",
info Tue Jul 16 2013 21:09:43 VideoSort: ],
info Tue Jul 16 2013 21:09:43 VideoSort: "Lithuanian"
info Tue Jul 16 2013 21:09:43 VideoSort: [0.80] "language": [
info Tue Jul 16 2013 21:09:43 VideoSort: [1.00] "container": "avi",
info Tue Jul 16 2013 21:09:43 VideoSort: [1.00] "videoCodec": "XviD",
info Tue Jul 16 2013 21:09:43 VideoSort: [1.00] "mimetype": "video/x-msvideo",
info Tue Jul 16 2013 21:09:43 VideoSort: {
info Tue Jul 16 2013 21:09:43 VideoSort: filename: /share/Download/movie/Dead Man Down (2013) BRRiP XViD DD5_1 Custom NLSubs =-lt Q_o_Q gt-=/XD607ebb-BRc59935-5155473f-1c5f49/XD607ebb-BRc59935-5155473f-1c5f49.avi

So the movie was renamed to "Xd607Ebb-Brc59935-5155473F-1C5F49 (2013).avi" instead of "Dead Man Down (2013)".

NZBget developer Hugbug noticed this:
I've done some testing.
It seems that the file "XD607ebb-BRc59935-5155473f-1c5f49.avi" in folder "XD607ebb-BRc59935-5155473f-1c5f49" makes guessit pretty much confident about filename/folder-name being the movie name.
After renaming of folder from "XD607ebb-BRc59935-5155473f-1c5f49" to "aaaaaa" the title was correctly recognized.

Complete thread on nzbget forum:
http://nzbget.sourceforge.net/forum/viewtopic.php?f=8&t=840

If you need additional info please let me know,

Regards,
NaaN

Series name not guessed correctly when using parentheses in series name

Using parentheses '(' and ')' in the series filename will exclude the part between parentheses from series name.
Some series need the addition of the year between parentheses to distinguish between spinoffs for the same series.

Example:
'Doctor who (2005)' will be detected as 'Doctor who'

Unit test code
<<<
? Series/Doctor Who (2005)/Season 06/Doctor Who (2005) - S06E01 - The Impossible Astronaut (1)
: series: Doctor Who (2005)
  season: 6
  episodeNumber: 1

Output
<<<
INFO guessittest:checkMinimumFieldsCorrect -- Guessing information for file: Series/Doctor Who (2005)/Season 06/Doctor Who (2005) - S06E01 - The Impossible Astronaut (1)
WARNING guessittest:error -- Wrong prop value [str] for 'series': expected = 'Doctor Who (2005)' - received = 'Doctor Who'
WARNING guessittest:checkMinimumFieldsCorrect -- Found additional info for prop = 'extension': ''
WARNING guessittest:checkMinimumFieldsCorrect -- Found additional info for prop = 'title': 'The Impossible Astronaut'

Request: Add more ReleaseGroups

More release groups in patterns.py means more accurate guessing.

This isn't an exhaustive list, but just the list of groups I have had over the past 6 months or so..

--- patterns.py_3dd412  2012-08-28 15:57:59.191642218 +0100
+++ patterns.py 2012-08-28 16:13:02.809753338 +0100
@@ -118,6 +118,9 @@
                                  'CHD', 'ViTE', 'TLF', 'DEiTY', 'FLAiTE',
                                  'MDX', 'GM4F', 'DVL', 'SVD', 'iLUMiNADOS', ' FiNaLe',
                                  'UnSeeN', 'aXXo', 'KLAXXON', 'NoTV', 'ZeaL', 'LOL',
+                                 'SiNNERS', 'DiRTY', 'REWARD', 'ECI', 'KiNGS', 'CLUE',
+                                 'CtrlHD', 'POD', 'WiKi', 'DIMENSION', 'IMMERSE', 'FQM',
+                                 '2HD', 'REPTiLE', 'CTU', 'HALCYON', 'EbP', 'SiTV', 'SAiNTS',
                                  'HDBRiSe' ],

                'episodeFormat': [ 'Minisode', 'Minisodes' ],

Pyinstaller with guessit not working

I used PyInstaller 2.0 to create an executable from a small piece of code. However, when I run the executable, I get the following error:

Traceback (most recent call last):
  File "<string>", line 3, in main
  File "C:\pyinstaller-2.0\PyInstaller\loader\iu.py", line 386, in importHook
    mod = _self_doimport(nm, ctx, fqname)
  File "C:\pyinstaller-2.0\PyInstaller\loader\iu.py", line 480, in doimport
    exec co in mod.__dict__
  File "C:\pyinstaller-2.0\guessittest\build\pyi.win32\guessittest\out00-PYZ.pyz\guessit", line 74, in <module>
  File "C:\pyinstaller-2.0\PyInstaller\loader\iu.py", line 386, in importHook
    mod = _self_doimport(nm, ctx, fqname)
  File "C:\pyinstaller-2.0\PyInstaller\loader\iu.py", line 480, in doimport
    exec co in mod.__dict__
  File "C:\pyinstaller-2.0\guessittest\build\pyi.win32\guessittest\out00-PYZ.pyz\guessit.guess", line 23, in <module>
  File "C:\pyinstaller-2.0\PyInstaller\loader\iu.py", line 386, in importHook
    mod = _self_doimport(nm, ctx, fqname)
  File "C:\pyinstaller-2.0\PyInstaller\loader\iu.py", line 480, in doimport
    exec co in mod.__dict__
  File "C:\pyinstaller-2.0\guessittest\build\pyi.win32\guessittest\out00-PYZ.pyz\guessit.language", line 24, in <module>
  File "C:\pyinstaller-2.0\PyInstaller\loader\iu.py", line 386, in importHook
    mod = _self_doimport(nm, ctx, fqname)
  File "C:\pyinstaller-2.0\PyInstaller\loader\iu.py", line 480, in doimport
    exec co in mod.__dict__
  File "C:\pyinstaller-2.0\guessittest\build\pyi.win32\guessittest\out00-PYZ.pyz\guessit.language", line 24, in <module>
  File "C:\pyinstaller-2.0\PyInstaller\loader\iu.py", line 386, in importHook
    mod = _self_doimport(nm, ctx, fqname)
  File "C:\pyinstaller-2.0\PyInstaller\loader\iu.py", line 480, in doimport
    exec co in mod.__dict__
  File "C:\pyinstaller-2.0\guessittest\build\pyi.win32\guessittest\out00-PYZ.pyz\guessit.country", line 37, in <module>
  File "C:\pyinstaller-2.0\guessittest\build\pyi.win32\guessittest\out00-PYZ.pyz\guessit.fileutils", line 87, in load_file_in_same_dir
IOError: [Errno 22] invalid mode ('r') or filename: u'C:\pyinstaller-2.0\guessittest\dist\guessittest.exe?199168\guessit\ISO-3166-1_utf8.txt'

The code I am using is the following:

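import os

def main():
    try:
        import guessit
    except Exception:
        import traceback
        # Assumed: the call that used traceback was lost from the post.
        traceback.print_exc()
    # Keep the Windows console open so the output can be read.
    os.system('PAUSE')

if __name__ == '__main__':
    main()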

So, what do you think is the problem?

Dash in the episode name

Hi,

I would like to report an issue that occurs when an episode title contains a dash.

For example:

The.Simpsons.S24E03.Adventures.in.Baby-Getting.720p.WEB-DL.DD5.1.H.264-CtrlHD.mkv

Outputs:

{u'mimetype': u'video/x-matroska', u'episodeNumber': 3, u'videoCodec': u'h264', u'container': u'mkv', u'title': u'Adventures in Baby', u'series': u'The Simpsons', u'format': u'WEB-DL', u'releaseGroup': u'CtrlHD', u'audioChannels': u'5.1', u'screenSize': u'720p', u'season': 24, u'type': u'episode'}

guessit ignores everything after the dash in "Adventures.in.Baby-Getting".

Thanks

Language detection is not reliable

For: French.Immersion.2011.STV.READNFO.QC.FRENCH.NTSC.DVDR.nfo
GuessIt found: {
    [1.00] "type": "movieinfo",
    [0.60] "title": "Immersion",
    [1.00] "container": "nfo",
    [0.30] "language": [
        "French"
    ],
    [1.00] "year": 2011
}

The title is wrong: the word "French" was removed from it.

For: Another.Immersion.2011.STV.READNFO.QC.FRENCH.NTSC.DVDR.nfo
GuessIt found: {
    [1.00] "type": "movieinfo",
    [0.60] "title": "Another Immersion",
    [1.00] "container": "nfo",
    [0.30] "language": [
        "French"
    ],
    [1.00] "year": 2011
}

This one is correct, though.

For: French.Immersion.2011.STV.READNFO.QC.NTSC.DVDR.nfo
GuessIt found: {
    [1.00] "type": "movieinfo",
    [0.60] "title": "Immersion",
    [1.00] "container": "nfo",
    [0.30] "language": [
        "French"
    ],
    [1.00] "year": 2011
}

I can understand why this case fails, but why does the following example work?

For: Immersion.French.2011.STV.READNFO.QC.NTSC.DVDR.nfo
GuessIt found: {
    [1.00] "type": "movieinfo",
    [0.60] "title": "Immersion French",
    [1.00] "container": "nfo",
    [1.00] "year": 2011
}

I'll have a look at this issue.

Dots in series

Marvels.Agents.of.S.H.I.E.L.D.S01E06.720p.HDTV.X264-DIMENSION.mkv gives:

For: Marvels.Agents.of.S.H.I.E.L.D.S01E06.720p.HDTV.X264-DIMENSION.mkv
GuessIt found: {
    [1.00] "mimetype": "video/x-matroska", 
    [1.00] "episodeNumber": 6, 
    [0.80] "videoCodec": "h264", 
    [1.00] "container": "mkv", 
    [1.00] "format": "HDTV", 
    [0.70] "series": "Marvels Agents of S H I E L D", 
    [0.80] "releaseGroup": "DIMENSION", 
    [1.00] "screenSize": "720p", 
    [1.00] "season": 1, 
    [1.00] "type": "episode"
}

Is there a way to keep the dots? Would that bring a lot of false positives?
One possible rule: if a dotted series name contains (([A-Z]\.){3,}), keep that part dotted.
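
The proposed rule can be tried out in a few lines. This is a sketch of the suggestion only, not guessit code: `ACRONYM` is the pattern from the issue with an added look-behind so that it cannot start in the middle of a word, and `dots_to_spaces` is a hypothetical helper name.

```python
import re

# Three or more single capital letters, each followed by a dot, not preceded
# by another letter or digit (so "MOVIE.A.B." does not start at the "E").
ACRONYM = re.compile(r"(?<![A-Za-z0-9])(?:[A-Z]\.){3,}")


def dots_to_spaces(name):
    """Replace separator dots with spaces but keep dotted acronyms intact."""
    pieces = []
    position = 0
    for match in ACRONYM.finditer(name):
        pieces.append(name[position:match.start()].replace(".", " "))
        pieces.append(match.group(0) + " ")  # keep the acronym dotted
        position = match.end()
    pieces.append(name[position:].replace(".", " "))
    return "".join(pieces).strip()
```

Applied to `Marvels.Agents.of.S.H.I.E.L.D.S01E06`, the acronym stays dotted while the other dots become spaces. Release names that contain three consecutive one-letter tokens for other reasons would be false positives.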

Infinite Loop in split_path

I have found an issue such that if a specific input is passed into split_path, it will get into an infinite loop.

The simplest test case is:

>>> from guessit import fileutils
>>> fileutils.split_path('//some/path/')

The above will run forever.

The patch I have for this is:

--- a/libs/guessit/fileutils.py       Sat Sep 14 23:15:27 2013
+++ b/libs/guessit/fileutils.py        Sat Sep 14 23:15:30 2013
@@ -43,6 +43,8 @@
     """
     result = []
     while True:
+        if '//' in path:
+            path = path.replace('//', '/')
         head, tail = os.path.split(path)

         # on Unix systems, the root folder is '/'

As this is inside a larger loop, it also handles cases such as "///some/path".

The patch works in the context where I found the problem (see https://github.com/RuudBurger/CouchPotatoServer/issues/2164 for more information).
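
An alternative way to guarantee termination, independent of the patch above, is to stop as soon as splitting no longer changes the path. The sketch below is not guessit's `split_path`; it uses `posixpath` so that it behaves the same on every platform, and it reports any run of leading slashes as a single `'/'` root component.

```python
import posixpath


def split_path(path):
    """Split a POSIX-style path into components, root first."""
    result = []
    while True:
        head, tail = posixpath.split(path)
        if head == path:
            # Fixed point reached: path is '', '/' or a run of slashes.
            if head:
                result.append('/')
            break
        if tail:
            result.append(tail)
        path = head
    result.reverse()
    return result
```

The fixed-point check works because `posixpath.split('//')` returns `('//', '')`, which is the state that a loop comparing the head only against `'/'` never leaves.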

Thanks!

range(...) is not JSON serializable

Hi,

When I try to use guessit on a file with several episodes (e.g. "Series - S01E01-02 - Title.avi"), it raises the following exception : TypeError: range(1, 3) is not JSON serializable

Replacing

        return range(l[0], l[1]+1)

by

        return list(range(l[0], l[1]+1))

at line 35 of transfo/guess_episodes_rexps.py seems to correct the issue.
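
The behaviour is easy to reproduce on Python 3, where `range` returns a lazy object that the `json` module does not know how to encode. A minimal sketch follows; `episode_numbers` is a hypothetical stand-in for the patched function, not the guessit source.

```python
import json


def episode_numbers(first, last):
    """Return the inclusive episode range as a list, which JSON can encode."""
    return list(range(first, last + 1))


try:
    json.dumps({"episodeNumber": range(1, 3)})
    range_was_serialized = True
except TypeError:
    # On Python 3 a bare range object is rejected by the JSON encoder.
    range_was_serialized = False

encoded = json.dumps({"episodeNumber": episode_numbers(1, 2)})
```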

Guessit season number misrecognized

guessit.guess_episode_info('CSI.S013E18.Sheltered.720p.WEB-DL.DD5.1.H.264.mkv')

{u'mimetype': 'video/x-matroska', u'episodeNumber': 18, u'videoCodec': u'h264', u'container': u'mkv', u'title': u'Sheltered', u'series': u'CSI', u'format': u'WEB-DL', u'audioChannels': u'5.1', u'screenSize': u'720p', u'season': 1, u'type': u'episode'}

S013 -> guess season 1
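
For illustration, a pattern that accepts a variable number of digits reads the season above as 13. This is a sketch, not guessit's matcher; `SEASON_EPISODE` and `parse_season_episode` are hypothetical names, and the four-digit limit is an arbitrary assumption.

```python
import re

# "S" + digits + "E" + digits, not glued to a preceding letter or digit.
SEASON_EPISODE = re.compile(r"(?<![A-Za-z0-9])S(\d{1,4})E(\d{1,4})", re.IGNORECASE)


def parse_season_episode(name):
    """Return (season, episode) as integers, or None when no marker is found."""
    match = SEASON_EPISODE.search(name)
    if match is None:
        return None
    # int() drops leading zeros, so "013" becomes 13.
    return int(match.group(1)), int(match.group(2))
```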

Thanks

test_autodetect_all.py fails 1 test, but exit code is 0

$ python2.7 tests/test_autodetect_all.py &> /tmp/log1
$ echo $?
0

log1:

testAutoMatcher (__main__.TestAutoDetectAll) ... INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Duckman/Duckman - 101 (01) - 20021107 - I, Duckman.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Neverwhere/Neverwhere.05.Down.Street.[tvu.org.ru].avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /movies/James_Bond-f21-Casino_Royale-x02-Stunts.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: OSS_117--Cairo,_Nest_of_Spies.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Breaking Bad/Minisodes/Breaking.Bad.(Minisodes).01.Good.Cop.Bad.Cop.WEBRip.XviD.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /media/Band_of_Brothers-x02-We_Stand_Alone_Together.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/The Doors (1991)/09.03.08.The.Doors.(1991).BDRip.720p.AC3.X264-HiS@SiLUHD-English.[sharethefiles.com].mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Homeland.S02E01.HDTV.x264-EVOLVE.mp4
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Fear and Loathing in Las Vegas (1998)/Fear.and.Loathing.in.Las.Vegas.720p.HDDVD.DTS.x264-ESiR.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/M.A.S.H. (1970)/MASH.(1970).[Divx.5.02][Dual-Subtitulos][DVDRip].ogm
INFO     [guessit.matcher:checkFields] -- Guessing information for file: the.simpsons.2401.hdtv-lol.mp4
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Leopard.dmg
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /TV Shows/new.girl.117.hdtv-lol.mp4
INFO     [guessit.matcher:checkFields] -- Guessing information for file: the.mentalist.501.hdtv-lol.mp4
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Kaamelott/Kaamelott - Livre V - Ep 23 - Le Forfait.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Rush.._Beyond_The_Lighted_Stage-x09-Between_Sun_and_Moon-2002_Hartford.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: The.Office.(US).1x03.Health.Care.HDTV.XviD-LOL.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: The_Insider-(1999)-x02-60_Minutes_Interview-1996.mp4
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /media/Band_of_Brothers-e01-Currahee.mkv
INFO     [guessit.matcher:checkFields] -- SUMMARY: Guessed correctly 19 out of 19 filenames
ok
testAutoMatcherEpisodes (__main__.TestAutoDetectAll) ... INFO     [guessit.matcher:checkFields] -- Guessing information for file: finale 
WARNING  [guessit.matcher:error] -- Prop 'releaseGroup' not found in: finale 
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Sons.of.Anarchy.S05E06.720p.WEB.DL.DD5.1.H.264-CtrlHD.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Duckman/Duckman - 101 (01) - 20021107 - I, Duckman.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Neverwhere/Neverwhere.05.Down.Street.[tvu.org.ru].avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/The Office/Season 6/The Office - S06xE01.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /media/Parks_and_Recreation-s03-x02-Gag_Reel.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Duckman/Duckman - S1E13 Joking The Chicken (unedited).avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Example S01E01E02.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /media/bdc64bfe-e36f-4af8-b550-e6fd2dfaa507/TV_Shows/Doctor Who (2005)/Saison 6/Doctor Who (2005) - S06E13 - The Wedding of River Song.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Example S01E01-02.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: The.Simpsons.S24E03.Adventures.in.Baby-Getting.720p.WEB-DL.DD5.1.H.264-CtrlHD.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: series/__ Incomplete __/Dr Slump (Catalan)/Dr._Slump_-_003_DVB-Rip_Catalan_by_kelf.avi
WARNING  [guessit.matcher:checkFields] -- Found additional info for prop = 'title': 'by kelf'
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /mnt/videos/tvshows/Doctor Who/Season 06/E13 - The Wedding of River Song.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Doctor Who (2005)/Season 06/Doctor Who (2005) - S06E01 - The Impossible Astronaut (1).avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: The.Mentalist.2x21.18-5-4.ENG.-.sub.FR.HDTV.XviD-AlFleNi-TeaM.[tvu.org.ru].avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Pure Laine/2x05 - Pure Laine - Je Me Souviens.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Pure Laine/Pure.Laine.1x01.Toutes.Couleurs.Unies.FR.(Québec).DVB-Kceb.[tvu.org.ru].avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Walt Disney/Donald.Duck.-.Good.Scouts.[www.bigernie.jump.to].avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /volume1/TV Series/Drawn Together/Season 1/Drawn Together 1x04 Requiem for a Reality Show.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Duckman/Duckman - 110 (10) - 20021218 - Cellar Beware.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Baccano!/Baccano!_-_T1_-_Trailer_-_[Ayu](dae8173e).mkv
WARNING  [guessit.matcher:checkFields] -- Found additional info for prop = 'title': 'T1'
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Californication/Season 2/Californication.2x05.Vaginatown.HDTV.XviD-0TV.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/dexter/Dexter.5x02.Hello,.Bandit.ENG.-.sub.FR.HDTV.XviD-AlFleNi-TeaM.[tvu.org.ru].avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /media/Parks_and_Recreation-s03-e02-Flu_Season.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: series/Psych/Psych S02 Season 2 Complete English DVD/Psych.S02E03.Psy.Vs.Psy.Français.srt
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Simpsons/Saison 12 Français/Simpsons,.The.12x08.A.Bas.Le.Sergent.Skinner.FR.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /media/Band_of_Brothers-e01-Currahee.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: series/Psych/Psych S02 Season 2 Complete English DVD/Psych.S02E02.65.Million.Years.Off.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /media/Parks_and_Recreation-s03-e01.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Kaamelott - 5x44x45x46x47x48x49x50.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/South Park/Season 4/South.Park.4x07.Cherokee.Hair.Tampons.DVDRip.[tvu.org.ru].avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Ren & Stimpy/Ren And Stimpy - Onward & Upward-Adult Party Cartoon.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /Volumes/data-1/Series/Futurama/Season 3/Futurama_-_S03_DVD_Bonus_-_Deleted_Scenes_Part_3.ogm
WARNING  [guessit.matcher:checkFields] -- Found additional info for prop = 'title': 'Bonus'
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Tout sur moi/Tout sur moi - S02E02 - Ménage à trois (14-01-2008) [Rip by Ampli].avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Treme/Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Breaking Bad/Minisodes/Breaking.Bad.(Minisodes).01.Good.Cop.Bad.Cop.WEBRip.XviD.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /media/Band_of_Brothers-x02-We_Stand_Alone_Together.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /TV Shows/Mad.M-5x9.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Kaamelott/Kaamelott - Livre V - Ep 23 - Le Forfait.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/My Name Is Earl/My.Name.Is.Earl.S01Extras.-.Bad.Karma.DVDRip.XviD.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: series/Ren and Stimpy - Black_hole_[DivX].avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /media/Parks_and_Recreation-s03-x01.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Ben.and.Kate.S01E02.720p.HDTV.X264-DIMENSION.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /TV Shows/new.girl.117.hdtv-lol.mp4
INFO     [guessit.matcher:checkFields] -- Guessing information for file: The.Office.(US).1x03.Health.Care.HDTV.XviD-LOL.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Mad Men Season 1 Complete/Mad.Men.S01E01.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Series/Futurama/Season 3 (mkv)/[™] Futurama - S03E22 - Le chef de fer à 30% ( 30 Percent Iron Chef ).mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: series/The Office/Season 4/The Office [401] Fun Run.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /home/disaster/Videos/TV/Merlin/merlin_2008.5x02.arthurs_bane_part_two.repack.720p_hdtv_x264-fov.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /mnt/series/The Big Bang Theory/S01/The.Big.Bang.Theory.S01E01.mkv
INFO     [guessit.matcher:checkFields] -- SUMMARY: Guessed correctly 49 out of 50 filenames
FAIL
testAutoMatcherMovies (__main__.TestAutoDetectAll) ... INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Fr - Paris 2054, Renaissance (2005) - De Christian Volckman - (Film Divx Science Fiction Fantastique Thriller Policier N&B).avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Somewhere.2010.DVDRip.XviD-iLG/i-smwhr.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Brazil (1985)/Brazil_Criterion_Edition_(1985).CD2.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: movies/American.The.Bill.Hicks.Story.2009.DVDRip.XviD-EPiSODE.[UsaBit.com]/UsaBit.com_esd-americanbh.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Alice in Wonderland DVDRip.XviD-DiAMOND/dmd-aw.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/001 __ A classer/Fantomas se déchaine - Louis de Funès.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: The.Director’s.Notebook.2006.Blu-Ray.x264.DXVA.720p.AC3-de[42].mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/El Dia de la Bestia (1995)/El.dia.de.la.bestia.DVDrip.Spanish.DivX.by.Artik[SEDG].avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: movies/Charlie.And.Boots.DVDRip.XviD-TheWretched/wthd-cab.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/The Doors (1991)/09.03.08.The.Doors.(1991).BDRip.720p.AC3.X264-HiS@SiLUHD-English.[sharethefiles.com].mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: [XCT].Le.Prestige.(The.Prestige).DVDRip.[x264.HP.He-Aac.{Fr-Eng}.St{Fr-Eng}.Chaps].mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Blade Runner (1982)/Blade.Runner.(1982).(Director's.Cut).CD1.DVDRip.XviD.AC3-WAF.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/21 (2008)/21.(2008).DVDRip.x264.AC3-FtS.[sharethefiles.com].mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /movies/James_Bond-f21-Casino_Royale.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: movies/Baraka_Edition_Collector.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/[阿维达].Avida.2006.FRENCH.DVDRiP.XViD-PROD.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Battle Royale (2000)/Battle.Royale.(Batoru.Rowaiaru).(2000).(Special.Edition).CD1of2.DVDRiP.XviD-[ZeaL].avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Dark City (1998)/Dark.City.(1998).DC.BDRip.720p.DTS.X264-CHD.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/M.A.S.H. (1970)/MASH.(1970).[Divx.5.02][Dual-Subtitulos][DVDRip].ogm
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Ne.Le.Dis.A.Personne.Fr 2 cd/personnea_mp.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: movies/La Science des Rêves (2006)/La.Science.Des.Reves.FRENCH.DVDRip.XviD-MP-AceBot.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Office Space (1999)/Office.Space.[Dual-DVDRip].[Spanish-English].[XviD-AC3-AC3].[by.Oswald].avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Borat (2006)/Borat.(2006).R5.PROPER.REPACK.DVDRip.XviD-PUKKA.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Mamma.Mia.2008.DVDRip.AC3.XviD-CrazyTeam/Mamma.Mia.2008.DVDRip.AC3.XviD-CrazyTeam.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: movies/Steig Larsson Millenium Trilogy (2009) BRrip 720 AAC x264/(1)The Girl With The Dragon Tattoo (2009) BRrip 720 AAC x264.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Fear and Loathing in Las Vegas (1998)/Fear.and.Loathing.in.Las.Vegas.720p.HDDVD.DTS.x264-ESiR.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Comme une Image (2004)/Comme.Une.Image.FRENCH.DVDRiP.XViD-NTK.par-www.divx-overnet.com.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Sin City (BluRay) (2005)/Sin.City.2005.BDRip.720p.x264.AC3-SEPTiC.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Moon_(2009)-x02-Making_Of.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Persepolis (2007)/[XCT] Persepolis [H264+Aac-128(Fr-Eng)+ST(Fr-Eng)+Ind].mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/9 (2009)/9.2009.Blu-ray.DTS.720p.x264.HDBRiSe.[sharethefiles.com].mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Moon_(2009).mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Foobar Part VI.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Moon_(2009)-x01.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Cosmopolis.2012.LiMiTED.720p.BluRay.x264-AN0NYM0US[bb]/ano-cosmo.720p.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Toy Story (1995)/Toy Story [HDTV 720p English-Spanish].mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: The_Insider-(1999)-x02-60_Minutes_Interview-1996.mp4
INFO     [guessit.matcher:checkFields] -- Guessing information for file: movies/James_Bond-f17-Goldeneye.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Fantastic Mr Fox/Fantastic.Mr.Fox.2009.DVDRip.{x264+LC-AAC.5.1}{Fr-Eng}{Sub.Fr-Eng}-™.[sharethefiles.com].mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Ratatouille/video_ts-ratatouille.srt
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Bunker Palace Hôtel (Enki Bilal) (1989)/Enki Bilal - Bunker Palace Hotel (Fr Vhs Rip).avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Movies/Wild Zero (2000)/Wild.Zero.DVDivX-EPiC.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: The Godfather Part III.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /movies/James_Bond-f21-Casino_Royale-x01-Becoming_Bond.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: OSS_117--Cairo,_Nest_of_Spies.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: The_Italian_Job.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: Rush.._Beyond_The_Lighted_Stage-x09-Between_Sun_and_Moon-2002_Hartford.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /public/uTorrent/Downloads Finished/Movies/Indiana.Jones.and.the.Temple.of.Doom.1984.HDTV.720p.x264.AC3.5.1-REDµX/Indiana.Jones.and.the.Temple.of.Doom.1984.HDTV.720p.x264.AC3.5.1-REDµX.mkv
INFO     [guessit.matcher:checkFields] -- Guessing information for file: movies/Greenberg.REPACK.LiMiTED.DVDRip.XviD-ARROW/arw-repack-greenberg.dvdrip.xvid.avi
INFO     [guessit.matcher:checkFields] -- Guessing information for file: /movies/James_Bond-f21-Casino_Royale-x02-Stunts.mkv
INFO     [guessit.matcher:checkFields] -- SUMMARY: Guessed correctly 50 out of 50 filenames
ok

======================================================================
FAIL: testAutoMatcherEpisodes (__main__.TestAutoDetectAll)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/test_autodetect_all.py", line 37, in testAutoMatcherEpisodes
    filename='episodes.yaml')
  File "/tmp/t/guessit/tests/guessittest.py", line 68, in checkMinimumFieldsCorrect
    return self.checkFields(groundTruth, guess_func, removeType)
  File "/tmp/t/guessit/tests/guessittest.py", line 141, in checkFields
    msg='Correct: %d < Total: %d' % (correct, total))
AssertionError: Correct: 49 < Total: 50

----------------------------------------------------------------------
Ran 3 tests in 0.714s

FAILED (failures=1)

Also, why is there no single file for running all the tests? For now I use 'for i in tests/test_*.py;do python2.7 "$i";done', but I would prefer something like 'python2.7 tests/run-tests.py'.

Performance (I/O)

Is it wise to rely on reading a file to fill a dict? Isn't that a heavy operation when just importing guessit?
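
One common way to keep the import itself cheap is to defer the read until the data is first needed and then cache the result. This is a rough sketch, not guessit's code: `load_table` is a hypothetical helper, and the tab-separated file layout is an assumption made for the example.

```python
import functools
import io


@functools.lru_cache(maxsize=None)
def load_table(path):
    """Parse a tab-separated file into a dict, at most once per path."""
    table = {}
    with io.open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and comments.
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("\t")
            table[key] = value
    return table
```

With this shape, importing the module costs nothing, the first lookup pays for the file read, and later calls reuse the cached dict.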

Episode identification issue

Hi, this is very specific, so I do apologize. Yes, I do have a wife who likes to watch House Hunters. The match tree seems to do a great job identifying most properties of the file name, but it struggles to pull out the season/episode information. I'm not sure how to tackle this one, or whether it is just a weird one-off.

guess = guessit.guess_video_info('House.Hunters.International.S56E06.720p.hdtv.x264.mp4')
print guess.nice_string()
{
[1.00] "mimetype": "video/mp4",
[1.00] "videoCodec": "h264",
[1.00] "container": "mp4",
[1.00] "screenSize": "720p",
[1.00] "format": "HDTV",
[0.40] "series": "House Hunters International S56E06",
[1.00] "type": "episode"
}

EDIT: I see that in guess_episodes_rexps.py, starting at line 47, the comments mention that seasons greater than 30 are most likely errors. Does this cause issues in confidence levels for guessing?
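
To illustrate the suspicion, here is a purely hypothetical sketch (not guessit's actual pattern) of how a cap on season numbers would make S56E06 fall through, while a plain SxxExx pattern picks it up. The `max_season` parameter and the function name are assumptions for the example.

```python
import re

# Plain SxxExx pattern, case-insensitive.
SXXEXX = re.compile(r"s(\d{1,2})e(\d{1,2})", re.IGNORECASE)


def find_season_episode(name, max_season=None):
    """Return (season, episode), or None if absent or above the cap."""
    match = SXXEXX.search(name)
    if match is None:
        return None
    season, episode = int(match.group(1)), int(match.group(2))
    # A cap like "seasons above 30 are probably errors" rejects the match.
    if max_season is not None and season > max_season:
        return None
    return season, episode


name = "House.Hunters.International.S56E06.720p.hdtv.x264.mp4"
print(find_season_episode(name))                 # (56, 6)
print(find_season_episode(name, max_season=30))  # None
```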

Multi episode -> S01E01-02

As stated in the title: is it possible to get the same behaviour as for multi-episode names of the type 01x01x02, which yield season 1 with episode list [1, 2]?
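
As a rough sketch of what is being asked for, the range form could be expanded like this. The pattern and the name `parse_episode_range` are assumptions for illustration; in particular, the lookahead is only a guess at how to avoid reading something like "-720p" as an episode number.

```python
import re

# SxxEyy optionally followed by "-zz" or "-Ezz"; the lookahead refuses
# digits that continue into a longer token such as "720p".
RANGE = re.compile(
    r"s(\d{1,2})e(\d{1,2})(?:-e?(\d{1,2})(?![\da-z]))?",
    re.IGNORECASE,
)


def parse_episode_range(name):
    """Return (season, [episodes]) or None when no SxxEyy is present."""
    match = RANGE.search(name)
    if match is None:
        return None
    season = int(match.group(1))
    first = int(match.group(2))
    last = int(match.group(3)) if match.group(3) else first
    if last < first:
        # Not a sensible range; keep only the first episode.
        last = first
    return season, list(range(first, last + 1))


print(parse_episode_range("Show.S01E01-02.avi"))  # (1, [1, 2])
```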

scene - rulesets / suggestions / comments

Personally, I'd like to see the rulesets for the 'type' be modular, to the point that someone who only wants to use the TV-related rules could do so without the movie rules bleeding over. This would also open the door to further 'types': ebook, music, anime, etc.

Anyway, this project looks to have some very nice potential.
I noticed that guessit/pattern.py has some scene-related info, but it looks like it could stand to be updated.

Visit these for an overview of the information that can be mined:
http://scenerules.irc.gs/
http://en.wikipedia.org/wiki/Standard_%28warez%29

Food for thought for possible features:

  • Detection of samples: automation tools sometimes end up replacing the actual movie with a sample, so it would be nice to detect samples by checking file size and length.
  • Comparing two similar files to detect which is better to keep: proper/repack/real/internal, file size, length, bitrate.
  • When extracting releases, the release dir and the filename sometimes have mismatching season/episode numbering (mainly due to mess-ups), where either the value itself is different or the 's' delimiter got left off. Detect which is most likely: compare other files in the dir to figure out the next logical episode, compare the release folder info against the release file info, and use the better of the two.
  • Internal vs external subs: how to handle that info.
  • Use the system locale when factoring in quality (region), subs (language), etc.
  • Figure out where the source came from: trust scene over p2p.
  • Certain things done in the anime world are not proper. For example, they compute a crc32 for the file and include it in [] within the filename. Also, anidb stores a hash of files which people can use to compare their sources.

Anyway, these are just some quick thoughts from my first look at this project. You can message me if you have further questions.

Comments are welcome.

python 3.3 unicode error

$ python3 setup.py build
Traceback (most recent call last):
  File "setup.py", line 24, in <module>
    import guessit
  File "/Users/tarragon/Programming/submanager/build/guessit/guessit/__init__.py", line 74, in <module>
    from guessit.guess import Guess, merge_all
  File "/Users/tarragon/Programming/submanager/build/guessit/guessit/guess.py", line 23, in <module>
    from guessit.language import Language
  File "/Users/tarragon/Programming/submanager/build/guessit/guessit/language.py", line 25, in <module>
    from guessit.country import Country
  File "/Users/tarragon/Programming/submanager/build/guessit/guessit/country.py", line 37, in <module>
    _iso3166_contents = load_file_in_same_dir(__file__, 'ISO-3166-1_utf8.txt')
  File "/Users/tarragon/Programming/submanager/build/guessit/guessit/fileutils.py", line 87, in load_file_in_same_dir
    return u(open(os.path.join(*path)).read())
  File "/Users/tarragon/Programming/submanager/bin/../lib/python3.3/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 37: ordinal not in range(128)
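
The traceback shows the file being opened without an explicit encoding, so Python 3 falls back to the locale's (here ASCII). A likely remedy is to open the file as UTF-8 explicitly. The helper below is a hypothetical stand-in for load_file_in_same_dir, written as a sketch rather than the project's actual fix.

```python
import io
import os


def read_text_next_to(ref_file, filename, encoding="utf-8"):
    """Read a text file located in the same directory as ref_file."""
    directory = os.path.dirname(os.path.abspath(ref_file))
    # io.open takes an encoding on both Python 2 and Python 3, so the
    # result does not depend on the user's locale.
    with io.open(os.path.join(directory, filename), encoding=encoding) as handle:
        return handle.read()
```

A module would typically call it as `read_text_next_to(__file__, 'ISO-3166-1_utf8.txt')`.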

UnicodeEncodeError with non english filename

Here is an example:

touch /tmp/15x15\ Les\ Simpson\ -\ Boire\ et\ déboires.avi
guessit /tmp/15x15\ Les\ Simpson\ -\ Boire\ et\ déboires.avi

Traceback (most recent call last):
  File "/usr/bin/guessit", line 9, in <module>
    load_entry_point('guessit==0.6', 'console_scripts', 'guessit')()
  File "/usr/lib/python2.6/site-packages/guessit/__main__.py", line 109, in main
    info = options.info.split(','))
  File "/usr/lib/python2.6/site-packages/guessit/__main__.py", line 32, in detect_filename
    print('For:', filename)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 63: ordinal not in range(128)
python --version
Python 2.6.6

I don't know how to resolve it :-(

Thanks
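
The failure happens when printing a non-ASCII filename to a terminal whose encoding is ASCII. One possible workaround, shown here as a sketch with a made-up helper name `printable`, is to replace whatever the output encoding cannot represent before printing.

```python
import sys


def printable(text, encoding=None):
    """Return text with characters the target encoding lacks replaced by '?'."""
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    # Round-trip through the target encoding; "replace" never raises.
    return text.encode(encoding, "replace").decode(encoding)


print(printable(u"15x15 Les Simpson - Boire et d\u00e9boires.avi", "ascii"))
```

This trades fidelity for robustness: the accented character is shown as "?" on an ASCII terminal instead of aborting the program.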

Fix for tests with python3.2 and non-unicode locale

Please apply this patch too:

diff -ur /var/tmp/portage/dev-python/guessit-0.5.4/work/guessit-0.5.4/tests/test_language.py guessit-0.5.4/tests/test_language.py
--- /var/tmp/portage/dev-python/guessit-0.5.4/work/guessit-0.5.4/tests/test_language.py 2013-02-11 03:49:53.000000000 +0400
+++ guessit-0.5.4/tests/test_language.py    2013-04-25 18:41:22.000000000 +0400
@@ -20,6 +20,7 @@

 from __future__ import unicode_literals
 from guessittest import *
+import io

 class TestLanguage(TestGuessit):

@@ -81,7 +82,7 @@

     def test_opensubtitles(self):
         opensubtitles_langfile = file_in_same_dir(__file__, 'opensubtitles_languages_2012_05_09.txt')
-        langs = [ u(l).strip().split('\t') for l in open(opensubtitles_langfile) ][1:]
+        langs = [ u(l).strip().split('\t') for l in io.open(opensubtitles_langfile, encoding="utf8") ][1:]
         for lang in langs:
             # check that we recognize the opensubtitles language code correctly
             # and that we are able to output this code from a language

Sorry for not noticing it in the past.

wrong serie name detection

For this file:
Dont.Trust.the.B----.in.Apartment.23.S02E02.720p.HDTV.X264-DIMENSION.mkv
guessit reports:
[0.70] "series": "Dont Trust the B",

PS: the original name was "Dont.Trust.the.Bitch.in.Apartment.23.S02E02.720p.HDTV.X264-DIMENSION", but on addicted.com the show name is "Don't Trust the B---- in Apartment 23", so I renamed the file in order to find a subtitle with subliminal.

Language in movie title

A movie like "The Italian Job" or "The Spanish Prisoner" returns a guess where the word is identified as the movie's language rather than part of the title:

$ guessit The_Italian_Job.mkv
GuessIt found: {
    [1.00] "type": "movie", 
    [1.00] "container": "mkv", 
    [0.30] "language": [
        "Italian"
    ], 
    [0.60] "title": "The"
}

Seems like a tough problem to solve. Any way around this, maybe a way to craft the filename so this doesn't happen?

Ship license

Hi,

I'm packaging your library for Debian so that it's one aptitude install away (including on Ubuntu & other derivatives - see this bug if you're interested). It looks suitable for inclusion in the official repositories except for a small detail. As you're licensing your library under the LGPL you're required to ship the license itself with the project, typically by including this file as LICENSE.txt.

After that I think that it'll be fine and I'll be happy to forward our users' contributions :)

Have a nice day

OpenSubtitles

I contacted the admin of OpenSubtitles and he told me he updated the database to remove non-ISO-639-2 languages from it. The service is currently unavailable but I think you can remove the specific stuff for OpenSubtitles from guessit.Language as well as from Subliminal (but I can do that ;) )

wrong serie name

I did a search for Ben.and.Kate.S01E02.720p.HDTV.X264-DIMENSION.mkv and got:
...
[0.85] "series": "and Kate",
...

global name 'basestring' is not defined

Hi,

I sometimes get the following exception with Python 3.3.2:

File "/usr/lib/python3.3/site-packages/guessit/patterns.py", line 245, in compute_canonical_form
    if isinstance(value, basestring):
NameError: global name 'basestring' is not defined

Line 245 of patterns.py should be replaced with

if isinstance(value, base_text_type):

(and base_text_type should be imported).
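
How guessit defines base_text_type is not shown in this report. A compatibility alias of this kind is often written roughly as follows; treat the definition below as an assumption, with `is_text` added only to demonstrate its use.

```python
import sys

if sys.version_info[0] >= 3:
    # Python 3 has a single text type.
    base_text_type = str
else:
    # Python 2: basestring covers both str and unicode.
    base_text_type = basestring  # noqa: F821


def is_text(value):
    """Version-independent replacement for isinstance(value, basestring)."""
    return isinstance(value, base_text_type)
```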

Show name is not correctly retrieved

Hi, I have this episode: Da Vinci's Demons - 1x04 - The Magician.mkv. I tried scanning it from both UNIX and Windows file systems, and the series name obtained is always Vinci's demons. At some point guessit removes the "Da", and since I use it with subliminal to find subtitles for the episode, I can't find them because the show name is not correct.

I'd appreciate it if you could take a look into this. Thanks!

Invalid matches of series title and type

If a series has a year as part of its title and also uses the weak season/episode format (205 = S02E05), the episode can end up being classified as a movie.
Example:
Castle.2009-205.some.additional.stuff

New series:
Da Vinci's Demons seems to lose the 'Da' during parsing, which results in a language setting of 'Danish'. The title therefore becomes Vincis Demons.

StopIteration not catched

Hi Wackou, I'm using your module guessit with Diaoul's Subliminal.
Some users are reporting problems when downloading subtitles, and it seems that a StopIteration exception is not caught.
The traceback is:

  File "/volume1/@appstore/sickbeard-custom/var/SickBeard/lib/subliminal/api.py", line 95, in download_subtitles
    subtitles_by_video = list_subtitles(paths, languages, services, force, multi, cache_dir, max_depth, scan_filter)
  File "/volume1/@appstore/sickbeard-custom/var/SickBeard/lib/subliminal/api.py", line 54, in list_subtitles
    tasks = create_list_tasks(paths, languages, services, force, multi, cache_dir, max_depth, scan_filter)
  File "/volume1/@appstore/sickbeard-custom/var/SickBeard/lib/subliminal/core.py", line 57, in create_list_tasks
    scan_result.extend(scan(p, max_depth, scan_filter))
  File "/volume1/@appstore/sickbeard-custom/var/SickBeard/lib/subliminal/videos.py", line 242, in scan
    video = Video.from_path(entry)
  File "/volume1/@appstore/sickbeard-custom/var/SickBeard/lib/subliminal/videos.py", line 75, in from_path
    guess = guessit.guess_file_info(path, 'autodetect')
  File "/usr/local/sickbeard-custom/var/SickBeard/lib/guessit/__init__.py", line 194, in guess_file_info
    result.append(_guess_filename(filename, filetype))
  File "/usr/local/sickbeard-custom/var/SickBeard/lib/guessit/__init__.py", line 120, in _guess_filename
    title = next(find_nodes(mtree.match_tree, 'title'))
StopIteration
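
The last frame calls next() on an iterator that turns out to be empty. Passing a default to next() avoids the exception. The helper name below is made up, and whether a missing title should map to None is a design choice for the maintainers; this only sketches the mechanism.

```python
def first_or_none(iterable):
    """Return the first item of iterable, or None when it is empty."""
    # next() with a default never raises StopIteration.
    return next(iter(iterable), None)


print(first_or_none([]))                # None
print(first_or_none(["Mad Men", "x"]))  # Mad Men
```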

Request: WEB-DL format

The WEB-DL format is quite popular and only a small step down from Blu-Ray 720p quality. It is a favorite format prior to the Blu-Ray release.

--- patterns.py_3dd412  2012-08-28 15:57:59.191642218 +0100
+++ patterns.py 2012-08-28 16:04:03.826704589 +0100
@@ -104,7 +104,7 @@

 properties = { 'format': [ 'DVDRip', 'HD-DVD', 'HDDVD', 'HDDVDRip', 'BluRay', 'Blu-ray', 'BDRip', 'BRRip',
                            'HDRip', 'DVD', 'DVDivX', 'HDTV', 'DVB', 'DVBRip', 'PDTV', 'WEBRip',
-                           'DVDSCR', 'Screener', 'VHS', 'VIDEO_TS' ],
+                           'DVDSCR', 'Screener', 'VHS', 'VIDEO_TS', 'WEB-DL', 'WEBDL' ],

                'screenSize': [ '720p', '720', '1080p', '1080' ],

@@ -151,6 +151,7 @@
 property_synonyms = { 'DVD': [ 'DVDRip', 'VIDEO_TS' ],
                       'HD-DVD': [ 'HDDVD', 'HDDVDRip' ],
                       'BluRay': [ 'BDRip', 'BRRip', 'Blu-ray' ],
+                      'WEB-DL': [ 'WEBDL' ],
                       'DVB': [ 'DVBRip', 'PDTV' ],
                       'Screener': [ 'DVDSCR' ],
                       'DivX': [ 'DVDivX' ],

ImportError: cannot import name u

Hello,
when I try to use guessit (through Subliminal) I get this error:

  File "/usr/local/lib/python2.7/dist-packages/guessit/__init__.py", line 74, in <module>
    from guessit.guess import Guess, merge_all
  File "/usr/local/lib/python2.7/dist-packages/guessit/guessit.py", line 23, in <module>
    from guessit import u
ImportError: cannot import name u

Any idea what the problem is?

Error during video scan

  File "/Library/Python/2.7/site-packages/guessit-0.6.1-py2.7.egg/guessit/__init__.py", line 205, in guess_file_info
    result.append(_guess_filename(filename, filetype))
  File "/Library/Python/2.7/site-packages/guessit-0.6.1-py2.7.egg/guessit/__init__.py", line 132, in _guess_filename
    title2 = next(find_nodes(mtree2.match_tree, 'title'))
StopIteration

Multi-episodes

What about multi-episodes?

e.g.: python guessit.py 'Kaamelott - 5x44x45x46x47x48x49x50.avi'
--> Season 5 Episodes 44 to 50

Right now there are warnings about duplicates, but it would be great if "episodeNumber" returned a list of episodes.
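
As an illustration of the requested behaviour, a chained NxMxM name could be split into a season and an episode list roughly like this. The pattern and the name `parse_multi_episode` are assumptions for the sketch; it requires a digit right before the first "x", so a token such as "x264" after a dot is not picked up.

```python
import re

# A season number followed by one or more "x<episode>" groups.
MULTI = re.compile(r"(?<!\d)(\d{1,2})((?:x\d{1,3})+)(?!\d)", re.IGNORECASE)


def parse_multi_episode(name):
    """Return (season, [episodes]) for names like 5x44x45, else None."""
    match = MULTI.search(name)
    if match is None:
        return None
    season = int(match.group(1))
    # group(2) looks like "x44x45x46"; drop the empty leading piece.
    episodes = [int(n) for n in match.group(2).lower().split("x") if n]
    return season, episodes


print(parse_multi_episode("Kaamelott - 5x44x45x46x47x48x49x50.avi"))
# (5, [44, 45, 46, 47, 48, 49, 50])
```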

TV caps incorrectly parsed

Hi,

I noticed that guessit fails to parse interlaced TV caps (mostly 1080i releases).
For instance, Game of Thrones S03E06 1080i HDTV DD5.1 MPEG2-TrollHD.ts:

  • 1080i is parsed as title instead of resolution
  • Codec MPEG2 was not found
  • Release group TrollHD was also not found

Thank you, I love this module!
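
For the resolution part of this report, a pattern that accepts both the progressive and the interlaced suffix might look like the sketch below. It is not guessit's pattern, the function name is hypothetical, and it does not address the MPEG2 codec or the release group.

```python
import re

# 3-4 digits followed by "p" or "i", not embedded in a longer token.
SCREEN_SIZE = re.compile(
    r"(?<![a-z0-9])(\d{3,4})([ip])(?![a-z0-9])", re.IGNORECASE
)


def find_screen_size(name):
    """Return the height and scan type found in name, or None."""
    match = SCREEN_SIZE.search(name)
    if match is None:
        return None
    return {
        "height": int(match.group(1)),
        "interlaced": match.group(2).lower() == "i",
    }


print(find_screen_size("Game of Thrones S03E06 1080i HDTV DD5.1 MPEG2-TrollHD.ts"))
```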

RuntimeError: maximum recursion depth exceeded while calling a Python object

Something about the word 'finale ' with a space at the end.

In [1]: import guessit

In [2]: guessit.guess_episode_info('finale ')

...

/usr/local/lib/python2.7/dist-packages/guessit/transfo/__init__.pyc in find_and_split_node(node, strategy, logger)
80 child.guess = guess
81 else:
---> 82 find_and_split_node(child, strategy, logger)
83 return
84

/usr/local/lib/python2.7/dist-packages/guessit/transfo/__init__.pyc in find_and_split_node(node, strategy, logger)
74 (logger or log).debug(msg)
75
---> 76 node.partition(span)
77 absolute_span = (span[0] + node.offset, span[1] + node.offset)
78 for child in node.children:

/usr/local/lib/python2.7/dist-packages/guessit/matchtree.pyc in partition(self, indices)
90 for start, end in zip(indices[:-1], indices[1:]):
91 self.add_child(span=(self.offset + start,
---> 92 self.offset + end))
93
94 def split_on_components(self, components):

/usr/local/lib/python2.7/dist-packages/guessit/matchtree.pyc in add_child(self, span)
78
79 def add_child(self, span):
---> 80 child = MatchTree(self.string, span=span, parent=self)
81 self.children.append(child)
82

/usr/local/lib/python2.7/dist-packages/guessit/matchtree.pyc in __init__(self, string, span, parent)
37 self.parent = parent
38 self.children = []
---> 39 self.guess = Guess()
40
41 @property

/usr/local/lib/python2.7/dist-packages/guessit/guess.pyc in __init__(self, *args, **kwargs)
43 confidence = 0
44
---> 45 dict.__init__(self, *args, **kwargs)
46
47 self._confidence = {}

RuntimeError: maximum recursion depth exceeded while calling a Python object

Part III & VI issue

Hello,

I have an issue with Part III and Part VI.
E.g.: The Godfather Part III
Guessit thinks that III means the 'Sichuan Yi' language, so the movie's name becomes 'The Godfather Part'.

The same thing happens with 'Foobar Part VI.mkv' and "Vietnamese".

See http://pastebin.com/9X7SezU7

Thanks for your awesome work :)
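
One conceivable way around this, sketched here with made-up names and not taken from guessit, is to recognise a Roman numeral that follows the word "Part" before any language-code matching runs, so that "III" and "VI" are consumed as part numbers.

```python
import re

# "Part" followed by a separator and a run of Roman-numeral letters.
PART = re.compile(r"\bpart[ ._-]+([ivxlcdm]+)\b", re.IGNORECASE)
ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Convert a Roman numeral using the usual subtractive rule."""
    total, previous = 0, 0
    for char in reversed(numeral.upper()):
        value = ROMAN[char]
        if value < previous:
            total -= value
        else:
            total += value
            previous = value
    return total


def find_part(name):
    """Return the part number written after 'Part', or None."""
    match = PART.search(name)
    return roman_to_int(match.group(1)) if match else None


print(find_part("The Godfather Part III.mkv"))  # 3
print(find_part("Foobar Part VI.mkv"))          # 6
```

The sketch does not check that the letters form a well-formed numeral, so a real rule would need stricter validation.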

Apostrophe in the movie title

There is an issue when the title contains an apostrophe:

guess=guessit.guess_video_info('The.Director’s.Notebook.2006.Blu-Ray.x264.DXVA.720p.AC3-de[42].mkv')

Traceback (most recent call last):
  File "", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/guessit/__init__.py", line 172, in guess_video_info
    return guess_file_info(filename, 'autodetect', info)
  File "/usr/local/lib/python2.7/dist-packages/guessit/__init__.py", line 109, in guess_file_info
    m = IterativeMatcher(filename, filetype=filetype)
  File "/usr/local/lib/python2.7/dist-packages/guessit/matcher.py", line 91, in __init__
    apply_transfo('guess_filetype', filetype)
  File "/usr/local/lib/python2.7/dist-packages/guessit/matcher.py", line 85, in apply_transfo
    transfo.process(mtree, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/guessit/transfo/guess_filetype.py", line 161, in process
    filetype, other = guess_filetype(mtree, filetype)
  File "/usr/local/lib/python2.7/dist-packages/guessit/transfo/guess_filetype.py", line 104, in guess_filetype
    fname = clean_string(filename).lower()
  File "/usr/local/lib/python2.7/dist-packages/guessit/textutils.py", line 44, in clean_string
    s = s.replace(c, ' ')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 12: ordinal not in range(128)

Changing Director's to Director yields a normal guess from guessit.

Cheers

help(guessit) doesn't work

import guessit
help(guessit)

gives

UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 8248: ordinal not in range(128)

Language codes split titles

I'm not Spanish, but this case should not detect English as a language. I'll try to have a look tomorrow; maybe 2nd_pass could be enhanced to support this.

For: Ejecutiva.En.Apuros(2009).BLURAY.SCR.Xvid.Spanish.LanzamientosD.nfo
GuessIt found: {
    [1.00] "isScreener": true,
    [1.00] "container": "nfo",
    [0.30] "language": [
        "English",
        "Spanish"
    ],
    [1.00] "format": "BluRay",
    [0.60] "title": "Ejecutiva",
    [1.00] "videoCodec": "XviD",
    [1.00] "year": 2009,
    [1.00] "type": "movieinfo"
}
