Comments (8)
Did a bit of testing, looks like tmdb is expecting utf-8 encoding. Did a bit of a hack to get things working again:
# Before. Broken
>>> tmdb3.tmdb_api.searchMovie(u'Generation П')[0]
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "build\bdist.win32\egg\tmdb3\tmdb_api.py", line 128, in searchMovie
    return MovieSearchResult(Request('search/movie', **kwargs), locale=locale)
  File "build\bdist.win32\egg\tmdb3\request.py", line 70, in __init__
    kwargs[k] = locale.encode(v)
  File "build\bdist.win32\egg\tmdb3\locales.py", line 110, in encode
    return dat.encode(self.encoding)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\encodings\cp1252.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u041f' in position 11: character maps to <undefined>
# Hack to fix encoding
>>> tmdb3.locales.set_locale("en", "us", True)
>>> tmdb3.locales.syslocale.encoding = 'utf-8'
# After. Working.
>>> tmdb3.tmdb_api.searchMovie(u'Generation П')[0]
<Movie 'Generation P' (2011)>
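The failure in the traceback can be reproduced without tmdb3 at all. The following is a minimal sketch, written for Python 3 (the sessions above are Python 2), that encodes the same title with cp1252 and with UTF-8. The variable names are illustrative only and are not part of the library.

```python
# -*- coding: utf-8 -*-
# Sketch only: reproduces the codec behaviour, not the tmdb3 call path.
title = u"Generation П"

# cp1252 has no mapping for U+041F (CYRILLIC CAPITAL LETTER PE),
# so encoding with the Windows default codec fails.
try:
    title.encode("cp1252")
    cp1252_error = None
except UnicodeEncodeError as exc:
    cp1252_error = exc

# UTF-8 can represent every Unicode code point, so this succeeds.
utf8_bytes = title.encode("utf-8")

print(cp1252_error)
print(utf8_bytes)
```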
from pytmdb3.
If the user is going to be accessing unicode content, such as movies with the character "П" in the title, the library expects the user to have configured their system to handle unicode content. Specifically, that means configuring a UTF-8 locale in their environment.
# unconfigured default
> locale
LANG=
LC_CTYPE="C"
LC_COLLATE="C"
LC_TIME="C"
LC_NUMERIC="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=
# Bourne users
> export LANG="en_US.UTF-8"
# C-shell users
> setenv LANG en_US.UTF-8
# confirmation
> locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_ALL=
The tmdb3 library will then pull that encoding from the environment using the locale library.
> projects/pytmdb3/scripts/pytmdb3.py
PyTMDB3 Interactive Shell. TAB completion available.
>>> import locale
>>> locale.getdefaultlocale()
('en_US', 'UTF-8')
>>> get_locale().encoding
'UTF-8'
The problem is, we can't just pick an arbitrary encoding when sending requests to tmdb. They are expecting utf-8.
The encoding the API expects has nothing to do with the platform we are running on.
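As a rough illustration of that point, a query string can be built with a fixed UTF-8 encoding instead of the system locale. This is a sketch in Python 3 using the standard library's urllib.parse.urlencode; the helper name build_query is hypothetical and is not how tmdb3 itself builds requests.

```python
# -*- coding: utf-8 -*-
from urllib.parse import urlencode


def build_query(**params):
    """Hypothetical helper: percent-encode parameters as UTF-8,
    ignoring whatever encoding the local environment reports."""
    return urlencode(params, encoding="utf-8")


query = build_query(query=u"Generation П")
print(query)
```

Because the encoding is passed explicitly, the result does not change between a cp1252 Windows console and a UTF-8 Linux shell.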
Here is some more evidence that just picking a codec that supports all unicode codepoints still isn't correct. It has to be in the encoding tmdb is expecting in order for it to be able to decode again:
>>> tmdb3.locales.syslocale.encoding = 'utf-8'
>>> tmdb3.tmdb_api.searchMovie(u'Generation П')[0]
<Movie 'Generation P' (2011)>
>>> tmdb3.locales.syslocale.encoding = 'utf-16'
>>> tmdb3.tmdb_api.searchMovie(u'Generation П')[0]
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\tmdb_api.py", line 128, in searchMovie
    return MovieSearchResult(Request('search/movie', **kwargs), locale=locale)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\tmdb_api.py", line 157, in __init__
    lambda x: Movie(raw=x, locale=locale))
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\pager.py", line 106, in __init__
    super(PagedRequest, self).__init__(self._getpage(1), 20)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\pager.py", line 59, in __init__
    self._data = list(iterable)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\pager.py", line 110, in _getpage
    res = req.readJSON()
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\cache.py", line 118, in __call__
    data = self.func(*args, **kwargs)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\request.py", line 125, in readJSON
    raise e
TMDBHTTPError: HTTP Error 500: Internal Server Error
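One plausible reading of the 500 response, assuming the receiving side decodes the bytes as UTF-8 (an assumption; nothing in this thread confirms what the server actually does), is that UTF-16 output is simply not valid UTF-8. A small Python 3 sketch of that mismatch, with illustrative names only:

```python
# -*- coding: utf-8 -*-
# Sketch only: models a receiver that decodes as UTF-8 (assumed behaviour).
title = u"Generation П"


def survives_utf8_decode(data):
    """Return True if the bytes decode as UTF-8 back to the original title."""
    try:
        return data.decode("utf-8") == title
    except UnicodeDecodeError:
        return False


# Python's "utf-16" codec prepends a byte order mark (0xFF 0xFE or 0xFE 0xFF);
# neither byte can appear in valid UTF-8, so the decode fails.
utf16_ok = survives_utf8_decode(title.encode("utf-16"))
utf8_ok = survives_utf8_decode(title.encode("utf-8"))

print(utf16_ok, utf8_ok)
```

Both codecs can represent "П", but only the one the receiver expects round-trips.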
The environment does need to be configured for unicode to receive unicode responses from TMDb, due to the behavior of Python 2 itself. However, I'll need to look at this again to figure out how to handle non-bytecode encodings.
This should be entirely independent of the environment. Unicode is unicode no matter what locale a user has set. Tmdb declares what encoding they accept and send for byte strings, and the python library should only expose and accept strings as unicode objects to the user. The only time an error should be raised is when the user queries the library with a bytestring (str, python 2) containing non-ascii characters.
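A minimal sketch of that boundary rule, written for Python 3 where bytes plays the role of the Python 2 str: text is always encoded as UTF-8, pure-ASCII byte strings are tolerated, and anything else is rejected. The function name to_request_bytes is hypothetical and does not exist in tmdb3.

```python
# -*- coding: utf-8 -*-
def to_request_bytes(value):
    """Hypothetical helper: turn a user-supplied string into UTF-8 bytes.

    Text is encoded as UTF-8 regardless of locale. Byte strings are
    accepted only if they are pure ASCII, since their encoding is
    otherwise unknown.
    """
    if isinstance(value, bytes):
        try:
            value = value.decode("ascii")
        except UnicodeDecodeError:
            raise ValueError("non-ASCII byte string; pass text instead")
    return value.encode("utf-8")


print(to_request_bytes(u"Generation П"))
print(to_request_bytes(b"Generation P"))
```

Rejecting ambiguous byte strings at the boundary keeps the guesswork about the caller's locale out of the library.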
tmdb3.locales.syslocale.encoding = 'utf-8'
This also fixed the error "TMDbError Internal error - Something went wrong. Contact TMDb." on tmdb3.MovieSearch('some string with äüö').
Thanks @gazpachoking !