tanreinama / gptsan
General-purpose Switch Transformer based Japanese language model
License: MIT License
A Switch Transformer model created with the goal of being a general-purpose Japanese language model usable for anything.
Not "Swich Transformer" but "Switch Transformer".
```
PS C:\Users\asada\3D Objects\GPT-SAN> python run_generate.py --model GPTSAN-2.8B-spout_is_uniform/ --context "武田信玄は、戦国 時代ファンならぜひ押さえておきたい名将の一人。天下統一を目指し勢いに乗る織田信 長からも、一目置かれていたと" --beam_width 10
WARNING:tensorflow:From C:\Users\ozone\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\tensorflow\python\compat\v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
TPU node not foud. Using GPU device.
Traceback (most recent call last):
  File "C:\Users\ozone\3D Objects\GPT-SAN\run_generate.py", line 259, in <module>
    main()
  File "C:\Users\ozone\3D Objects\GPT-SAN\run_generate.py", line 204, in main
    for pos, result in enumerate(estimator.predict(input_fn=input_fn)):
  File "C:\Users\ozone\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 623, in predict
    estimator_spec = self._call_model_fn(features, None, ModeKeys.PREDICT,
  File "C:\Users\ozone\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1174, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "C:\Users\ozone\3D Objects\GPT-SAN\run_generate.py", line 118, in model_fn
    model, run = modeling.model(tpu, params, saved_params, False, False, False)
  File "C:\Users\ozone\3D Objects\GPT-SAN\modeling.py", line 57, in model
    assert num_repricates>=num_pallarelizm and num_repricates%num_pallarelizm==0, f"num_pallarelizm can not divid num_repricates {num_repricates}"
ZeroDivisionError: integer division or modulo by zero
```
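For context, the crash happens before the assert's message is ever formatted: the `%` in the condition divides by the number of detected accelerator devices, so when no GPU/TPU is visible that count is zero. A minimal reproduction of the failing check (the variable names come from modeling.py; the concrete values below are assumptions for illustration):

```python
num_repricates = 8   # number of model replicas (example value, an assumption)
num_pallarelizm = 0  # detected accelerator count; 0 when no GPU/TPU is visible

# This mirrors the check in modeling.py: the modulo raises ZeroDivisionError
# before the assert message is ever built, so the real fix is to make at
# least one accelerator visible to TensorFlow, not to change the assert.
try:
    assert num_repricates >= num_pallarelizm and num_repricates % num_pallarelizm == 0
    crashed = False
except ZeroDivisionError:
    crashed = True

print("crashed:", crashed)  # → crashed: True
```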
An example of a site introducing LoRA.
What LoRA makes possible:
Low-cost training.
I don't know beyond that.
Sorry about that. I was only asking what you think of LoRA, so please don't worry about it too much.
Dear authors,
I am very excited about the release of this model! Do you plan to integrate the model into Hugging Face transformers?
We already support Switch Transformers, so adding the model may be easier than expected. I can of course guide you through the exact steps of integrating the model.
Thank you in advance!
Thank you for your great work!
I'm trying to use your model Tanrei/GPTSAN-japanese
from Hugging Face (link) on Google Colaboratory, but I run into the error below. I'd appreciate it if you could elaborate on how to solve this issue. Thank you in advance!
Environment:
Code:
from transformers import AutoModel, AutoTokenizer
ckpt = "Tanrei/GPTSAN-japanese"
model = AutoModel.from_pretrained(ckpt)
tokenizer = AutoTokenizer.from_pretrained(ckpt)
Error:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-2-7855937935be> in <module>
2
3 ckpt = "Tanrei/GPTSAN-japanese"
----> 4 model = AutoModel.from_pretrained(ckpt)
5 tokenizer = AutoTokenizer.from_pretrained(ckpt)
/usr/local/lib/python3.8/dist-packages/transformers/models/auto/auto_factory.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
432 hub_kwargs = {name: kwargs.pop(name) for name in hub_kwargs_names if name in kwargs}
433 if not isinstance(config, PretrainedConfig):
--> 434 config, kwargs = AutoConfig.from_pretrained(
435 pretrained_model_name_or_path,
436 return_unused_kwargs=True,
/usr/local/lib/python3.8/dist-packages/transformers/models/auto/configuration_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
827 return config_class.from_pretrained(pretrained_model_name_or_path, **kwargs)
828 elif "model_type" in config_dict:
--> 829 config_class = CONFIG_MAPPING[config_dict["model_type"]]
830 return config_class.from_dict(config_dict, **unused_kwargs)
831 else:
/usr/local/lib/python3.8/dist-packages/transformers/models/auto/configuration_auto.py in __getitem__(self, key)
534 return self._extra_content[key]
535 if key not in self._mapping:
--> 536 raise KeyError(key)
537 value = self._mapping[key]
538 module_name = model_type_to_module_name(key)
KeyError: 'gptsan-japanese'
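For what it's worth, `KeyError: 'gptsan-japanese'` means the transformers release preinstalled on Colab does not have that model type registered in `CONFIG_MAPPING`. Assuming GPTSAN support was added in a later transformers release than the one installed, the likely fix is to upgrade before loading (an assumption, not a confirmed resolution):

```shell
# Upgrade transformers so the 'gptsan-japanese' model type is registered,
# then re-run the original AutoModel/AutoTokenizer snippet unchanged.
pip install --upgrade transformers
```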
I tried it right away with Docker.
The following error appears, and the file cannot be found.
```
PS E:\> docker run --gpus all -it --rm -v `pwd`/GPTSAN-2.8B-spout_is_uniform:/tf/GPTSAN/GPTSAN-2.8B-spout_is_uniform <container ID> python run_generate.py --model GPTSAN-2.8B-spout_is_uniform/ --context "武田信玄は、戦国 時代ファンならぜひ押さえておきたい名将の一人。天下統一を目指し勢いに乗る織田信長からも、一目置かれていたと"
docker: Error response from daemon: create pwd/GPTSAN-2.8B-spout_is_uniform: "pwd/GPTSAN-2.8B-spout_is_uniform" includes invalid characters for a local volume name, only "[a-zA-Z0-9][a-zA-Z0-9_.-]" are allowed. If you intended to pass a host directory, use absolute path.
See 'docker run --help'.
```
As shown in the image, I have placed the files downloaded from Google Drive directly under the drive root.
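The daemon error shows the backtick form was passed through literally: PowerShell does not expand `` `pwd` `` the way a Unix shell does, so Docker received the string "pwd" instead of a path. A likely fix, assuming PowerShell, is `${PWD}` or a spelled-out absolute path (context string abbreviated here):

```shell
# In PowerShell, ${PWD} expands to the current directory, so -v receives an
# absolute host path instead of the literal string "pwd".
docker run --gpus all -it --rm -v ${PWD}/GPTSAN-2.8B-spout_is_uniform:/tf/GPTSAN/GPTSAN-2.8B-spout_is_uniform <container ID> python run_generate.py --model GPTSAN-2.8B-spout_is_uniform/ --context "..."
```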
I think I managed to get it running on Google Colaboratory, but the following error appeared.

```
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Traceback (most recent call last):
  File "/content/GPTSAN/run_generate.py", line 259, in <module>
    main()
  File "/content/GPTSAN/run_generate.py", line 39, in main
    assert os.path.isfile(args.vocabulary), f'vocabulary file not found; {args.vocabulary}'
AssertionError: vocabulary file not found; ja-swe36k.txt
```
I installed the packages like this.

```
!pip install --upgrade pip
!pip install numpy
!pip install tensorflow
!pip install mesh-tensorflow
!pip uninstall -y protobuf ortools
!pip install protobuf==3.20
!pip install ortools==9.1.9490
```
```
  File "/content/GPTSAN/run_generate.py", line 39, in main
    assert os.path.isfile(args.vocabulary), f'vocabulary file not found; {args.vocabulary}'
AssertionError: vocabulary file not found; ja-swe36k.txt
```
So it seems the vocabulary is missing.
Is it correct to interpret this as running out of VRAM?
I am on the free tier right now, and I would like to understand this error before deciding whether to switch to the paid plan.
Sorry to bother you when you are busy.
Thank you in advance.
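For the record, the assertion that fired only checks whether the vocabulary file exists on disk, so this is a missing-file problem, not a VRAM problem. A minimal sketch of the same check (assuming the default `--vocabulary` value is `ja-swe36k.txt` and that it must sit in the working directory or be pointed to explicitly):

```python
import os

vocabulary = "ja-swe36k.txt"  # default --vocabulary value (an assumption)

# run_generate.py asserts on exactly this condition; the check runs before
# any GPU work, so the failure cannot be caused by insufficient VRAM.
if not os.path.isfile(vocabulary):
    print(f"vocabulary file not found; {vocabulary}")
    print("fix: place ja-swe36k.txt next to run_generate.py, or pass --vocabulary with its path")
```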
I want to train the model that was uploaded to Hugging Face.
https://huggingface.co/Tanrei/GPTSAN-japanese
Could you provide code to train and fine-tune it?
To summarize briefly:
I would like to fine-tune the PyTorch GPTSAN on Hugging Face, so could you publish code somewhere that performs the layer manipulation and trains the added layers?
I think the process would be roughly as above. (Not sure, though.)
Sorry for the trouble, and thank you in advance...
Incidentally, I also made a Space. (It takes a while due to performance and the number of generated characters.)
https://huggingface.co/spaces/OzoneAsai/GPTsan2
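Until such code is published, the usual pattern for "layer manipulation plus training only added layers" is to freeze the pretrained parameters and optimize a small new head. A sketch of that pattern on a stand-in module (the real model would be loaded with transformers' `from_pretrained`; the tiny backbone below is a placeholder, not the actual GPTSAN architecture):

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained backbone (in practice, the Hugging Face GPTSAN
# model loaded with from_pretrained; this tiny module is just a placeholder).
backbone = nn.Sequential(nn.Embedding(100, 32), nn.Linear(32, 32))
head = nn.Linear(32, 100)  # the newly added layer that will be trained

# Freeze every pretrained parameter so only the new head receives gradients.
for p in backbone.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)

tokens = torch.randint(0, 100, (2, 8))  # dummy token ids
hidden = backbone(tokens)               # frozen forward pass
loss = head(hidden).mean()              # toy loss, for illustration only
loss.backward()
optimizer.step()

# Only the head accumulated gradients; the backbone is untouched.
assert all(p.grad is None for p in backbone.parameters())
assert all(p.grad is not None for p in head.parameters())
```

The same freeze-then-train loop applies unchanged once the placeholder is swapped for the real checkpoint.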
Hello.
I used it with:
Windows 11
Python 3.10
miniconda3
Currently, the following error occurs and I cannot use it.
```
PS C:\Users\****\GPTSAN-main> py -3.10 run_generate.py --model GPTSAN-2.8B-spout_is_uniform/ --context "武田信玄は、戦国 時代ファンならぜひ押さえておきたい名将の一人。天下統一を目指し勢いに乗る織田信長からも、一目置かれていたと"
Traceback (most recent call last):
  File "C:\Users\****\GPTSAN-main\run_generate.py", line 12, in <module>
    import modeling
  File "C:\Users\****\GPTSAN-main\modeling.py", line 5, in <module>
    from mesh_tensorflow.auto_mtf.api import layout as auto_layout
  File "C:\Users\****\AppData\Local\Programs\Python\Python310\lib\site-packages\mesh_tensorflow\auto_mtf\__init__.py", line 22, in <module>
    from mesh_tensorflow.auto_mtf import api
  File "C:\Users\****\AppData\Local\Programs\Python\Python310\lib\site-packages\mesh_tensorflow\auto_mtf\api.py", line 40, in <module>
    from mesh_tensorflow.auto_mtf import layout_optimizer
  File "C:\Users\****\AppData\Local\Programs\Python\Python310\lib\site-packages\mesh_tensorflow\auto_mtf\layout_optimizer.py", line 41, in <module>
    from ortools.sat.python import cp_model
  File "C:\Users\****\AppData\Local\Programs\Python\Python310\lib\site-packages\ortools\sat\python\cp_model.py", line 53, in <module>
    from ortools.sat import cp_model_pb
  File "C:\Users\****\AppData\Local\Programs\Python\Python310\lib\site-packages\ortools\sat\cp_model_pb2.py", line 5, in <module>
    from google.protobuf.internal import builder as _builder
ImportError: cannot import name 'builder' from 'google.protobuf.internal' (C:\Users\****\AppData\Local\Programs\Python\Python310\lib\site-packages\google\protobuf\internal\__init__.py)

PS C:\Users\****\GPTSAN-main> py -3.10 -m pip install --upgrade protobuf
Requirement already satisfied: protobuf in c:\users\****\appdata\local\programs\python\python310\lib\site-packages (3.19.6)
Collecting protobuf
  Using cached protobuf-4.21.12-cp310-abi3-win_amd64.whl (527 kB)
Installing collected packages: protobuf
  Attempting uninstall: protobuf
    Found existing installation: protobuf 3.19.6
    Uninstalling protobuf-3.19.6:
      Successfully uninstalled protobuf-3.19.6
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-intel 2.11.0 requires protobuf<3.20,>=3.9.2, but you have protobuf 4.21.12 which is incompatible.
tensorboard 2.11.0 requires protobuf<4,>=3.9.2, but you have protobuf 4.21.12 which is incompatible.
Successfully installed protobuf-4.21.12
```
I machine-translated the error.
It seems a protobuf version from 3.9.2 up to (but below) 3.20 is required.
But I only have the newer protobuf 4.21.12.
It would be good if I could get an older protobuf somewhere.
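For reference, the `cannot import name 'builder'` error comes from a mismatch between the installed protobuf and the generated code that ortools ships, while tensorflow 2.11 additionally pins protobuf below 4. The version pair used elsewhere in this thread (protobuf 3.20 together with the older ortools 9.1.9490) is the usual workaround; treat the exact pins as an assumption:

```shell
# Remove the conflicting pair, then install the versions used elsewhere in
# this thread: protobuf 3.20 with the older ortools 9.1.9490 whose generated
# code matches it.
py -3.10 -m pip uninstall -y protobuf ortools
py -3.10 -m pip install protobuf==3.20 ortools==9.1.9490
```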
Also, this program did not run on Python 3.11, because the following errors occurred:

```
File "E:\GPTSAN-main\run_generate.py", line 10, in <module>
    import tensorflow
ModuleNotFoundError: No module named 'tensorflow'
```

and

```
Traceback (most recent call last):
  File "E:\GPTSAN-main\run_generate.py", line 12, in <module>
    import modeling
  File "E:\GPTSAN-main\modeling.py", line 4, in <module>
    import mesh_tensorflow as mtf
ModuleNotFoundError: No module named 'mesh_tensorflow'
```
If you know a good solution, please let me know.
Thank you in advance.