
Suspected encoding problem at runtime: UnicodeEncodeError: 'gbk' codec can't encode character '\u026a' in position 26: illegal multibyte sequence (ai-vocabulary-builder, 6 comments, CLOSED)

piglei commented on May 23, 2024
Suspected encoding problem at runtime: UnicodeEncodeError: 'gbk' codec can't encode character '\u026a' in position 26: illegal multibyte sequence

from ai-vocabulary-builder.

Comments (6)

chuilishi commented on May 23, 2024

OK. Keep it up, author. I really like this approach.


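piglei commented on May 23, 2024

Hi, without more detailed trace information it is hard for me to work out where the problem lies.

I suggest first updating the tool to the latest version with the following command: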

pip install --upgrade ai-vocabulary-builder

Then run aivoc --log-level DEBUG to start in debug mode, and attach the detailed error output here. Thanks.


chuilishi commented on May 23, 2024

Enter text: This portable seat folds flat for easy storage.
⠋ Querying OpenAI API2023-03-06 11:56:38,527 - openai - [DEBUG]: message='Request to OpenAI API' method=post path=https://api.openai.com/v1/chat/completions
2023-03-06 11:56:38,528 - openai - [DEBUG]: api_version=None data='{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "\nI will give you a sentence and a list of words called \"known-words\" which is divided\nby \",\", please find out the most rarely used word in the sentence(the word must not in \"known-words\"),\nget the simplified Chinese meaning and the pronunciation of that word and translate\nthe whole sentence into simplified Chinese.\n\nYour answer should be separated into 4 different lines, each line's content is as below:\n\n- word: {word}\n- pronunciation: {pronunciation}\n- meaning: {chinese_meaning_of_word}\n- translated: {translated_sentence}\n\nThe answer should have no extra content.\n\nknown-words: \n\nThe sentence is:\n\nThis portable seat folds flat for easy storage.\n"}]}' message='Post details'
2023-03-06 11:56:38,528 - urllib3.util.retry - [DEBUG]: Converted retries value: 2 -> Retry(total=2, connect=None, read=None, redirect=None, status=None)
2023-03-06 11:56:38,530 - urllib3.connectionpool - [DEBUG]: Starting new HTTPS connection (1): api.openai.com:443
⠋ Querying OpenAI API2023-03-06 11:56:41,771 - urllib3.connectionpool - [DEBUG]: https://api.openai.com:443 "POST /v1/chat/completions HTTP/1.1" 200 None
2023-03-06 11:56:41,772 - openai - [DEBUG]: message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=2555 request_id=0d934461a8e68108442458be75a4006f response_code=200
2023-03-06 11:56:41,774 - root - [DEBUG]: Completion API returns: {
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"content": "\n\nword: portable\npronunciation: /\u02c8p\u0254\u02d0t\u0259bl/\nmeaning: \u624b\u63d0\u7684\uff0c\u4fbf\u643a\u7684\ntranslated: \u8fd9\u4e2a\u4fbf\u643a\u5f0f\u5ea7\u4f4d\u53ef\u4ee5\u6298\u53e0\u5e73\u653e\uff0c\u65b9\u4fbf\u5b58\u50a8\u3002\n\nNote: The known-words are not used in this sentence.",
"role": "assistant"
}
}
],
"created": 1678075018,
"id": "chatcmpl-6qwG2I8qtIlnzhqV9U6QbImo3ccdM",
"model": "gpt-3.5-turbo-0301",
"object": "chat.completion",
"usage": {
"completion_tokens": 77,
"prompt_tokens": 149,
"total_tokens": 226
}
}
⠋ Querying OpenAI API
翻译结果
┌──────────────────┬─────────────────────────────────────────────────┐
│ 原文 │ This portable seat folds flat for easy storage. │
│ 中文翻译 │ 这个便携式座位可以折叠平放,方便存储。 │
│ 生词(自动提取) │ portable │
│ 释义 │ 手提的,便携的 │
│ 发音 │ /ˈpɔːtəbl/ │
└──────────────────┴─────────────────────────────────────────────────┘
Traceback (most recent call last):
File "C:\Users\zh148.conda\envs\ai-vocabulary\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\zh148.conda\envs\ai-vocabulary\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\zh148.conda\envs\ai-vocabulary\Scripts\aivoc.exe\__main__.py", line 7, in <module>
File "C:\Users\zh148.conda\envs\ai-vocabulary\lib\site-packages\click\core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "C:\Users\zh148.conda\envs\ai-vocabulary\lib\site-packages\click\core.py", line 1055, in main
rv = self.invoke(ctx)
File "C:\Users\zh148.conda\envs\ai-vocabulary\lib\site-packages\click\core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "C:\Users\zh148.conda\envs\ai-vocabulary\lib\site-packages\click\core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "C:\Users\zh148.conda\envs\ai-vocabulary\lib\site-packages\voc_builder\main.py", line 39, in main
enter_interactive_mode()
File "C:\Users\zh148.conda\envs\ai-vocabulary\lib\site-packages\voc_builder\interactive.py", line 78, in enter_interactive_mode
LastActionResult.trans_result = handle_cmd_trans(text.strip())
File "C:\Users\zh148.conda\envs\ai-vocabulary\lib\site-packages\voc_builder\interactive.py", line 197, in handle_cmd_trans
builder.append_word(word)
File "C:\Users\zh148.conda\envs\ai-vocabulary\lib\site-packages\voc_builder\builder.py", line 37, in append_word
self._get_writer(fp).writerow(
UnicodeEncodeError: 'gbk' codec can't encode character '\u02c8' in position 27: illegal multibyte sequence


piglei commented on May 23, 2024

It looks like the encoding is wrong when writing the CSV. I have tried specifying UTF-8 explicitly; please upgrade to 0.1.1 and try again.
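
For readers hitting the same error: the traceback ends in a csv writerow call, and the message names the 'gbk' codec, which suggests the CSV file was opened without an explicit encoding and so picked up the Windows locale default. The following is a minimal sketch of the general technique (passing encoding="utf-8" to open), not the project's actual 0.1.1 patch. The names append_row, read_rows, and can_encode are hypothetical and exist only for illustration.

```python
import csv


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding` without error."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def append_row(csv_path, row):
    """Append one row to a CSV file, always writing UTF-8.

    Hypothetical helper: the explicit encoding avoids falling back to the
    locale encoding (gbk in the traceback above). newline="" is what the
    csv module documentation recommends for file objects.
    """
    with open(csv_path, "a", encoding="utf-8", newline="") as fp:
        csv.writer(fp).writerow(row)


def read_rows(csv_path):
    """Read every row back, using the same explicit encoding."""
    with open(csv_path, "r", encoding="utf-8", newline="") as fp:
        return list(csv.reader(fp))
```

The IPA stress mark '\u02c8' from the traceback is the kind of character can_encode(..., "gbk") rejects, while UTF-8 can represent it. A file written this way must also be read as UTF-8, otherwise the same mismatch shows up on the reading side.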


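chuilishi commented on May 23, 2024

It seems to have worked, but this time the new words it tried to add were all Chinese, and it showed this:
Unable to add "部分", reason: not in the original text
But what I entered was an English sentence, so the word to add should have been "segment"...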


piglei commented on May 23, 2024

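That happens because the API's responses are not very stable; it sometimes returns odd things, such as a Chinese word in the word field. Keep using it for a while; a different piece of text will probably work.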
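
To illustrate the kind of check behind the "not in the original text" message, here is a rough sketch of how a reply in the four-line format requested by the prompt in the debug log could be parsed and validated. This is an assumption about the general approach, not the tool's real code; parse_reply and word_in_text are hypothetical names.

```python
import re

# Field names taken from the prompt shown in the debug log above.
FIELD_PATTERN = re.compile(
    r"^\s*-?\s*(word|pronunciation|meaning|translated)\s*:\s*(.*)$"
)


def parse_reply(reply):
    """Collect "key: value" lines (with or without a leading "- ") into a dict.

    Lines that do not start with one of the four expected keys, such as a
    trailing "Note: ..." line, are ignored.
    """
    fields = {}
    for line in reply.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def word_in_text(word, text):
    """Return True if `word` occurs in `text` as a whole word, ignoring case."""
    if not word:
        return False
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None
```

With a check like this, a reply whose word field is "部分" fails validation against an English sentence, and the caller can retry the request or skip the word instead of saving it.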
