Comments (8)

gazpachoking commented on June 23, 2024

I did a bit of testing, and it looks like TMDb expects requests to be UTF-8 encoded. Here is a hack that got things working again:

# Before. Broken
>>> tmdb3.tmdb_api.searchMovie(u'Generation П')[0]
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "build\bdist.win32\egg\tmdb3\tmdb_api.py", line 128, in searchMovie
    return MovieSearchResult(Request('search/movie', **kwargs), locale=locale)
  File "build\bdist.win32\egg\tmdb3\request.py", line 70, in __init__
    kwargs[k] = locale.encode(v)
  File "build\bdist.win32\egg\tmdb3\locales.py", line 110, in encode
    return dat.encode(self.encoding)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\encodings\cp1252.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u041f' in position 11: character maps to <undefined>

# Hack to fix encoding
>>> tmdb3.locales.set_locale("en", "us", True)
>>> tmdb3.locales.syslocale.encoding = 'utf-8'

# After. Working.
>>> tmdb3.tmdb_api.searchMovie(u'Generation П')[0]
<Movie 'Generation P' (2011)>
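
The failure can be reproduced without tmdb3 at all. The following is a minimal sketch using only the standard library. It is written for Python 3 rather than the Python 2 used in this thread, and try_encode is a hypothetical helper name, not part of tmdb3.

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec cannot represent the text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# Same title as in the session above; U+041F is CYRILLIC CAPITAL LETTER PE.
title = "Generation \u041f"

cp1252_bytes = try_encode(title, "cp1252")  # the codec named in the traceback
utf8_bytes = try_encode(title, "utf-8")

print(cp1252_bytes)  # None: cp1252 has no mapping for U+041F
print(utf8_bytes)    # b'Generation \xd0\x9f'
```

This suggests the error comes from the choice of codec on the client side, before any request is sent.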

from pytmdb3.

wagnerrp commented on June 23, 2024

If the user is going to be accessing unicode content, such as movies with the character "П" in the title, the library expects the user to have configured their system to handle unicode content. Specifically, that means configuring a UTF-8 locale in their environment.

# unconfigured default
> locale
LANG=
LC_CTYPE="C"
LC_COLLATE="C"
LC_TIME="C"
LC_NUMERIC="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=
# Bourne users
> export LANG="en_US.UTF-8"
# C-shell users
> setenv LANG en_US.UTF-8
# confirmation
> locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_ALL=

The tmdb3 library will then pull that encoding from the environment using the locale library.

> projects/pytmdb3/scripts/pytmdb3.py
PyTMDB3 Interactive Shell. TAB completion available.
>>> import locale
>>> locale.getdefaultlocale()
('en_US', 'UTF-8')
>>> get_locale().encoding
'UTF-8'


gazpachoking commented on June 23, 2024

The problem is that we can't just pick an arbitrary encoding when sending requests to TMDb. They expect UTF-8.


gazpachoking commented on June 23, 2024

The encoding the API expects has nothing to do with the platform we are running on.
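
As a rough sketch of what a locale-independent request might look like (Python 3, standard library only): build_query is a hypothetical helper and the parameter name query is used for illustration. Neither is taken from the tmdb3 source.

```python
from urllib.parse import urlencode


def build_query(**params):
    """Percent-encode parameters as UTF-8, regardless of the process locale."""
    # The encoding is fixed here instead of being read from the environment.
    return urlencode(params, encoding="utf-8")


query_string = build_query(query="Generation \u041f")
print(query_string)  # query=Generation+%D0%9F
```

Because the codec is hard-coded, the same bytes go over the wire on a cp1252 Windows console and on a UTF-8 Linux shell.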


gazpachoking commented on June 23, 2024

Here is some more evidence that picking just any codec that covers all unicode code points still isn't correct. The request has to be in the encoding TMDb expects, otherwise TMDb cannot decode it again:


>>> tmdb3.locales.syslocale.encoding = 'utf-8'
>>> tmdb3.tmdb_api.searchMovie(u'Generation П')[0]
<Movie 'Generation P' (2011)>
>>> tmdb3.locales.syslocale.encoding = 'utf-16'
>>> tmdb3.tmdb_api.searchMovie(u'Generation П')[0]
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\tmdb_api.py", line 128, in searchMovie
    return MovieSearchResult(Request('search/movie', **kwargs), locale=locale)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\tmdb_api.py", line 157, in __init__
    lambda x: Movie(raw=x, locale=locale))
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\pager.py", line 106, in __init__
    super(PagedRequest, self).__init__(self._getpage(1), 20)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\pager.py", line 59, in __init__
    self._data = list(iterable)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\pager.py", line 110, in _getpage
    res = req.readJSON()
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\cache.py", line 118, in __call__
    data = self.func(*args, **kwargs)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\request.py", line 125, in readJSON
    raise e
TMDBHTTPError: HTTP Error 500: Internal Server Error
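
The mismatch can be illustrated locally. In this Python 3 sketch, server_decode is a hypothetical stand-in for a receiver that assumes UTF-8 input. What TMDb actually does internally before returning HTTP 500 is not known from this thread.

```python
def server_decode(raw):
    """Stand-in for a receiver that assumes request bytes are UTF-8 (an assumption)."""
    return raw.decode("utf-8")


title = "Generation \u041f"

# UTF-8 round-trips cleanly.
utf8_roundtrip = server_decode(title.encode("utf-8"))

# UTF-16 can represent every code point too, but its bytes are not valid UTF-8:
# the byte order mark alone (0xFF 0xFE or 0xFE 0xFF) is rejected by a UTF-8 decoder.
try:
    server_decode(title.encode("utf-16"))
    utf16_accepted = True
except UnicodeDecodeError:
    utf16_accepted = False

print(utf8_roundtrip == title)  # True
print(utf16_accepted)           # False
```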


wagnerrp commented on June 23, 2024

The environment does need to be configured for unicode to receive unicode responses from TMDb, due to the behavior of Python 2 itself. However, I'll need to look at this again to figure out how to handle non-bytecode encodings.


gazpachoking commented on June 23, 2024

This should be entirely independent of the environment. Unicode is unicode no matter what locale a user has set. TMDb declares what encoding they accept and send for byte strings, and the Python library should only expose and accept strings as unicode objects to the user. The only time an error should be raised is when the user queries the library with a byte string (str in Python 2) containing non-ASCII characters.
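
One possible shape for that policy, as a sketch only: to_text and encode_for_api are hypothetical names, not tmdb3 functions, and the code targets Python 3, where str is text and bytes is the byte string type.

```python
def to_text(value):
    """Accept text as-is; accept byte strings only if they are pure ASCII."""
    if isinstance(value, str):
        return value
    if isinstance(value, bytes):
        try:
            return value.decode("ascii")
        except UnicodeDecodeError:
            # The bytes' encoding is unknown, so refuse to guess.
            raise ValueError("non-ASCII byte string; pass text instead") from None
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


def encode_for_api(value):
    """Encode outgoing text as UTF-8, independent of the locale."""
    return to_text(value).encode("utf-8")


print(encode_for_api("\u00e4\u00fc\u00f6"))  # b'\xc3\xa4\xc3\xbc\xc3\xb6'
print(encode_for_api(b"Alien"))              # b'Alien'
```

Passing something like b"\xe4\xfc\xf6" raises ValueError, since the library has no way to know which codec produced those bytes.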


gregorvolkmann commented on June 23, 2024

Setting tmdb3.locales.syslocale.encoding = 'utf-8' also fixed the error "TMDbError Internal error - Something went wrong. Contact TMDb." that I was getting on tmdb3.MovieSearch('some string with äüö').
Thanks @gazpachoking !
