
Comments (7)

manateelazycat commented on May 23, 2024

UnicodeDecodeError: 'gbk' codec can't decode byte 0xad in position 81: illegal multibyte sequence

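I don't have Windows, but I strongly recommend making UTF-8 your system's default encoding. I don't want to keep fighting Windows encodings.

You can try open(file, encoding='UTF-8') and see whether that solves your problem.

If it doesn't, I suggest switching to an operating system that supports UTF-8. My time is limited, and I don't want to spend it on compatibility work.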

from lsp-bridge.

zy9306 commented on May 23, 2024

Try open(filepath, encoding="utf-8"). My Windows machine reproduces your problem, and adding the encoding argument fixed it.
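A minimal sketch of the suggestion above, for readers who want to see the difference it makes. Without an `encoding` argument, `open()` in text mode uses a locale-dependent default, which is how a 'gbk' codec error can appear on a Chinese-locale Windows machine. The helper name `read_text_utf8` and the sample file are hypothetical and are not taken from lsp-bridge.

```python
import os
import tempfile


def read_text_utf8(path):
    """Read a text file as UTF-8 regardless of the platform's default encoding."""
    # Passing encoding= explicitly avoids the locale-dependent default
    # that open() would otherwise pick in text mode.
    with open(path, encoding="utf-8") as f:
        return f.read()


# Demo: write UTF-8 bytes containing non-ASCII text, then read them back.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.py")
    with open(sample_path, "wb") as f:
        f.write("# 注释\nprint('hi')\n".encode("utf-8"))
    text = read_text_utf8(sample_path)
    print(text)
```

This only helps when the file really is UTF-8; a file saved in another encoding will still fail to decode.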


lynnux commented on May 23, 2024

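I've submitted a new PR that can fix this problem. It is only a good-enough workaround: it recognizes UTF-8 (including a BOM). Supporting more encodings would probably need a third-party library such as chardet.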
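The PR's code is not shown in this thread, so the following is only a sketch of one standard-library way to accept UTF-8 with or without a BOM: the "utf-8-sig" codec. The function name `decode_utf8_maybe_bom` is hypothetical.

```python
import codecs


def decode_utf8_maybe_bom(data: bytes) -> str:
    """Decode UTF-8 bytes, dropping a leading BOM if one is present."""
    # "utf-8-sig" strips a leading BOM when decoding and otherwise
    # behaves like plain "utf-8".
    return data.decode("utf-8-sig")


with_bom = codecs.BOM_UTF8 + "héllo".encode("utf-8")
without_bom = "héllo".encode("utf-8")

print(decode_utf8_maybe_bom(with_bom) == decode_utf8_maybe_bom(without_bom))
# Plain "utf-8" keeps the BOM as a leading U+FEFF character instead.
print(with_bom.decode("utf-8").startswith("\ufeff"))
```

The same codec name can be passed to `open(path, encoding="utf-8-sig")` when reading files.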


manateelazycat commented on May 23, 2024

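Quoting zy9306: Try open(filepath, encoding="utf-8"). My Windows machine reproduces your problem, and adding the encoding argument fixed it.

@lynnux Does this approach not work on your system?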


lynnux commented on May 23, 2024

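It works, but it forces the file encoding to be UTF-8. On this Win7 machine, Python 3.8 apparently makes open default to gbk. On Windows, C++ development, especially with MSVC, does not default to UTF-8.
I just tried open(filepath, encoding="utf-8") on a gbk-encoded .py file and got: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd6 in position 54: invalid continuation byte
Then I tested tokenize.open, and it raised the same error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd6 in position 54: invalid continuation byte. So it looks like the PR does not solve the problem after all; it does nothing for gbk.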
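One likely explanation for the tokenize.open result, offered as an assumption about the file that was tested: `tokenize.open` picks its encoding with `tokenize.detect_encoding`, which looks only for a UTF-8 BOM or a PEP 263 coding declaration in the first two lines and otherwise falls back to UTF-8. It does not inspect the rest of the bytes. The sketch below uses made-up source snippets and a hypothetical helper name, `detected_source_encoding`.

```python
import io
import tokenize


def detected_source_encoding(data: bytes) -> str:
    """Return the encoding tokenize would choose for these Python source bytes."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(data).readline)
    return encoding


# A line of GBK-encoded source; the bytes for '中' start with 0xd6.
body = "text = '中文'\n".encode("gbk")

no_cookie = b"import os\n" + body
with_cookie = b"# -*- coding: gbk -*-\n" + body

print(detected_source_encoding(no_cookie))    # falls back to the UTF-8 default
print(detected_source_encoding(with_cookie))  # honors the declared encoding
```

So a gbk-encoded Python file opens cleanly through `tokenize.open` only when it declares its encoding; files in other languages, such as C++ sources, have no such declaration to read.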

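How about using the third-party chardet library to detect the encoding automatically? Symbol rename also has to open files, so it will probably run into encoding problems too.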
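As a standard-library alternative to automatic detection, one could try a short list of candidate encodings in order. This is only a sketch under stated assumptions: the function name `decode_with_fallback`, the candidate list, and the final latin-1 fallback are illustrative choices, not lsp-bridge behavior, and chardet is deliberately not used so the example stays self-contained.

```python
def decode_with_fallback(data: bytes, candidates=("utf-8-sig", "gbk")):
    """Return (text, encoding) using the first candidate that decodes cleanly."""
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a code point, so this never raises,
    # but the resulting text may be wrong for the file's real encoding.
    return data.decode("latin-1"), "latin-1"


print(decode_with_fallback("中文".encode("utf-8")))
print(decode_with_fallback("中文".encode("gbk")))
```

UTF-8 goes first because it is strict: bytes in other encodings usually fail to decode as UTF-8, whereas looser encodings accept many byte sequences that were never meant for them.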


manateelazycat commented on May 23, 2024


You're already using Emacs, so why not force UTF-8 encoding? The whole world uses UTF-8, so why struggle with GBK?


manateelazycat commented on May 23, 2024

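chardet only gives a probabilistic guess; don't expect it to solve every encoding problem.

My advice stands: edit files in UTF-8 by default. On Windows, Emacs can be forced to open files as UTF-8.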
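To illustrate why any detector can only guess, here is a small sketch showing that the same bytes decode without error under more than one encoding. The helper name `decodings` and the sample text are hypothetical.

```python
def decodings(data: bytes, encodings=("gbk", "latin-1", "utf-8")):
    """Map each encoding to the decoded text, or None if decoding fails."""
    results = {}
    for encoding in encodings:
        try:
            results[encoding] = data.decode(encoding)
        except UnicodeDecodeError:
            results[encoding] = None
    return results


raw = "中文".encode("gbk")  # the four bytes d6 d0 ce c4
for encoding, text in decodings(raw).items():
    print(encoding, repr(text))
```

Both gbk and latin-1 accept these bytes and produce different text, so the bytes alone cannot prove which encoding was intended. UTF-8 rejects them, which is part of why a known UTF-8 default is easier to work with.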

