blacksamorez / ebanko

NLP based telegram bot

License: Apache License 2.0

Python 9.44% Jupyter Notebook 89.95% Dockerfile 0.60%

docker-compose grafana nlp telegram-bot

ebanko's Introduction

ebanko

Who is ebanko?

Ebanko is a conversational Telegram bot trained on 2ch.hk/b/.

Infrastructure

The containers run on a shared network managed by docker-compose. They include:

Backend: NLP tasks run inside a Flask server that is only accessible from inside the network.
Telegram API: an asynchronous Telegram API client that sends requests to the backend for processing.
Metrics collector: Prometheus collects metrics from Telegraf and Flask.
Metrics board: Grafana on port 3000.
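
As a rough illustration of how the Telegram side could hand a message to the backend, here is a minimal sketch using only the standard library. The /predict route and port 8080 appear in the logs under Issues; the service hostname app, the JSON field names text and temp, and both function names are assumptions, not the repository's actual code.

```python
import json
import urllib.request

# Assumption: the backend service is reachable as "app" inside the
# docker-compose network; port 8080 and /predict are taken from the logs.
BACKEND_URL = "http://app:8080/predict"


def build_predict_request(text, temp=1.0, url=BACKEND_URL):
    """Build a POST request carrying the user's message as JSON."""
    # Assumption: the backend expects the fields "text" and "temp".
    payload = json.dumps({"text": text, "temp": temp}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_backend(text, temp=1.0, timeout=60):
    """Send the request and return the raw response body as a string."""
    request = build_predict_request(text, temp)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Because the backend is only reachable from inside the network, ask_backend would only work when called from another container in the same docker-compose project.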

How to run

Insert your bot's token into bot.py.
Download the model (see Availability) and place it in app/app/model.
From the app directory, run:

docker-compose up --build

No GPU is needed.
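
Before building, it can help to confirm that the model files landed in the right place. The following is a small sketch of such a check; the function name is hypothetical, and it only looks for config.json, which a local Hugging Face model directory needs, without verifying the weights.

```python
from pathlib import Path


def model_dir_ready(model_dir="app/model"):
    """Return True if model_dir exists and contains a config.json file.

    Assumption: the default path is relative to the app directory, where
    docker-compose is run; adjust it if you call this from elsewhere.
    """
    path = Path(model_dir)
    return path.is_dir() and (path / "config.json").is_file()


if __name__ == "__main__":
    if model_dir_ready():
        print("Model directory looks ready.")
    else:
        print("config.json not found; download the model first.")
```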

Availability

The fine-tuned model is available on Hugging Face. The dataset is also available there.

Inference speedup

None (for now)

ebanko's Issues

Error during dialogue

bot_1 | 2022-07-10 17:53:19,497 - main - INFO - Saving
bot_1 | 2022-07-10 17:53:19,501 - main - INFO - Saved
app_1 | Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
app_1 | Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
app_1 | [2022-07-10 17:53:40 +0000] [9] [CRITICAL] WORKER TIMEOUT (pid:17)
app_1 | [2022-07-10 17:53:40 +0000] [17] [INFO] Worker exiting (pid: 17)
app_1 | [2022-07-10 17:53:41 +0000] [9] [WARNING] Worker with pid 17 was terminated due to signal 9
app_1 | [2022-07-10 17:53:41 +0000] [33] [INFO] Booting worker with pid: 33
app_1 | Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
app_1 | Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
bot_1 | /usr/local/lib/python3.7/site-packages/telegram/ext/utils/promise.py:96: TelegramDeprecationWarning: The @run_async decorator is deprecated. Use the run_async parameter of your Handler or Dispatcher.run_async instead.
bot_1 | self._result = self.pooled_function(*self.args, **self.kwargs)
bot_1 | 2022-07-10 17:53:53,797 - main - INFO - Entering private toxification
app_1 | [2022-07-10 17:54:00,113] ERROR in app: Exception on /predict [POST]
app_1 | Traceback (most recent call last):
app_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2073, in wsgi_app
app_1 | response = self.full_dispatch_request()
app_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1518, in full_dispatch_request
app_1 | rv = self.handle_user_exception(e)
app_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1516, in full_dispatch_request
app_1 | rv = self.dispatch_request()
app_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1502, in dispatch_request
app_1 | return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 710, in func
app_1 | return current_app.handle_user_exception(ex)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 708, in func
app_1 | raise exception
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 666, in func
app_1 | response = current_app.handle_user_exception(ex)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 663, in func
app_1 | response = f(*args, **kwargs)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 710, in func
app_1 | return current_app.handle_user_exception(ex)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 708, in func
app_1 | raise exception
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 666, in func
app_1 | response = current_app.handle_user_exception(ex)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 663, in func
app_1 | response = f(*args, **kwargs)
app_1 | File "/app/server.py", line 23, in predict
app_1 | result = model.toxify(text, temp)
app_1 | File "/app/Ebanko.py", line 21, in toxify
app_1 | bad_words_ids=[[tokenizer.pad_token_id]],
app_1 | NameError: name 'tokenizer' is not defined
bot_1 | 2022-07-10 17:54:00,127 - telegram.ext.dispatcher - ERROR - A promise with deactivated error handling raised an error.
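
Separately from the NameError, the log above shows a gunicorn WORKER TIMEOUT followed by the worker being killed and rebooted. If generation on CPU simply takes longer than the configured worker timeout, one option is to raise it. Below is a minimal sketch of a gunicorn configuration file; the file itself and the chosen values are assumptions, and only the bind address is taken from the logs. It does not address the NameError.

```python
# gunicorn.conf.py (hypothetical file; the repository may configure
# gunicorn differently, for example through command-line flags)

# Address taken from the "Listening at" line in the logs.
bind = "0.0.0.0:8080"

# Assumption: a single worker, so only one copy of the model is in memory.
workers = 1

# Seconds a worker may stay silent before the master kills it.
# Gunicorn's default is 30; the value here is an illustrative guess.
timeout = 120
```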

Ability to control message frequency and length

Is it possible to add a function or command that controls how often the bot replies? For example, a random number from 1 to 100 could be drawn for each message: if the number is greater than 80, the bot answers; otherwise it stays silent. If possible, it would also be useful to control the length of the generated message.
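
A minimal sketch of the behaviour requested above, assuming nothing about the bot's existing handlers. Both function names and the length bounds are hypothetical; with a threshold of 80 the bot would answer roughly one message in five.

```python
import random


def should_reply(threshold=80, rng=random):
    """Draw a number from 1 to 100 and reply only if it exceeds threshold."""
    return rng.randint(1, 100) > threshold


def clamp_length(requested, lowest=1, highest=64):
    """Keep a user-requested reply length within [lowest, highest] tokens.

    Assumption: the bounds are illustrative; the result could be added to
    the prompt length when setting the generation limit.
    """
    return max(lowest, min(highest, requested))
```

A message handler could call should_reply() first and return early when it is False, and a chat command could store the clamped length per chat.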

Error during dialogue

bot_1 | /usr/local/lib/python3.7/site-packages/telegram/ext/utils/promise.py:96: TelegramDeprecationWarning: The @run_async decorator is deprecated. Use the run_async parameter of your Handler or Dispatcher.run_async instead.
bot_1 | self._result = self.pooled_function(*self.args, **self.kwargs)
bot_1 | 2022-07-10 12:03:03,210 - main - INFO - Entering private toxification
app_1 | [2022-07-10 12:03:11 +0000] [9] [CRITICAL] WORKER TIMEOUT (pid:17)
app_1 | [2022-07-10 12:03:11 +0000] [17] [INFO] Worker exiting (pid: 17)
app_1 | [2022-07-10 12:03:12 +0000] [9] [WARNING] Worker with pid 17 was terminated due to signal 9
app_1 | [2022-07-10 12:03:12 +0000] [33] [INFO] Booting worker with pid: 33
app_1 | Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
app_1 | Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
app_1 | [2022-07-10 12:03:30,276] ERROR in app: Exception on /predict [POST]
app_1 | Traceback (most recent call last):
app_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2073, in wsgi_app
app_1 | response = self.full_dispatch_request()
app_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1518, in full_dispatch_request
app_1 | rv = self.handle_user_exception(e)
app_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1516, in full_dispatch_request
app_1 | rv = self.dispatch_request()
app_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1502, in dispatch_request
app_1 | return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 710, in func
app_1 | return current_app.handle_user_exception(ex)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 708, in func
app_1 | raise exception
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 666, in func
app_1 | response = current_app.handle_user_exception(ex)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 663, in func
app_1 | response = f(*args, **kwargs)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 710, in func
app_1 | return current_app.handle_user_exception(ex)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 708, in func
app_1 | raise exception
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 666, in func
app_1 | response = current_app.handle_user_exception(ex)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 663, in func
app_1 | response = f(*args, **kwargs)
app_1 | File "/app/server.py", line 23, in predict
app_1 | result = model.toxify(text, temp)
app_1 | File "/app/Ebanko.py", line 19, in toxify
app_1 | max_length=len(input_ids) + 32,
app_1 | NameError: name 'input_ids' is not defined
bot_1 | 2022-07-10 12:03:30,290 - telegram.ext.dispatcher - ERROR - A promise with deactivated error handling raised an error.
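
Both NameError tracebacks above point at names (tokenizer and input_ids) that toxify reads without having bound them in its own scope. One way to avoid this is to keep the tokenizer and model on the instance and compute input_ids inside the method. The sketch below shows that pattern under stated assumptions: the class name, the constructor arguments and the sampling settings are hypothetical, and only the toxify(text, temp) call, the "+ 32" length margin and the bad_words_ids argument come from the tracebacks.

```python
class EbankoSketch:
    """Hypothetical rewrite of the wrapper seen in the tracebacks."""

    def __init__(self, tokenizer, model):
        # Assumption: a Hugging Face tokenizer and causal LM are passed in,
        # e.g. loaded once at startup with from_pretrained("model").
        self.tokenizer = tokenizer
        self.model = model

    def toxify(self, text, temperature=1.0):
        # Bind input_ids locally so the name always exists in this scope.
        input_ids = self.tokenizer.encode(text, return_tensors="pt")
        prompt_length = input_ids.shape[-1]

        output_ids = self.model.generate(
            input_ids,
            max_length=prompt_length + 32,
            do_sample=True,
            temperature=temperature,
            # Reach the tokenizer through self instead of a global name.
            bad_words_ids=[[self.tokenizer.pad_token_id]],
        )

        # A causal LM returns prompt + continuation; keep the continuation.
        reply_ids = output_ids[0][prompt_length:]
        return self.tokenizer.decode(reply_ids, skip_special_tokens=True)
```

Note that this sketch measures the prompt with input_ids.shape[-1] rather than len(input_ids), since the latter returns the batch size for a batched tensor.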

bot start error

When using the large model (https://huggingface.co/BlackSamorez/ebanko-large), the bot fails on startup with the following error:

app_1 | self.load_wsgi()
app_1 | File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/base.py", line 146, in load_wsgi
app_1 | self.wsgi = self.app.wsgi()
app_1 | File "/usr/local/lib/python3.7/site-packages/gunicorn/app/base.py", line 67, in wsgi
app_1 | self.callable = self.load()
app_1 | File "/usr/local/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 58, in load
app_1 | return self.load_wsgiapp()
app_1 | File "/usr/local/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
app_1 | return util.import_app(self.app_uri)
app_1 | File "/usr/local/lib/python3.7/site-packages/gunicorn/util.py", line 359, in import_app
app_1 | mod = importlib.import_module(module)
app_1 | File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module
app_1 | return _bootstrap._gcd_import(name[level:], package, level)
app_1 | File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
app_1 | File "<frozen importlib._bootstrap>", line 983, in _find_and_load
app_1 | File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
app_1 | File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
app_1 | File "<frozen importlib._bootstrap_external>", line 728, in exec_module
app_1 | File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
app_1 | File "/app/server.py", line 13, in <module>
app_1 | model = Ebanko()
app_1 | File "/app/Ebanko.py", line 11, in __init__
app_1 | self.model = AutoModelForCausalLM.from_pretrained("model").to(DEVICE)
app_1 | File "/usr/local/lib/python3.7/site-packages/transformers/models/auto/auto_factory.py", line 449, in from_pretrained
app_1 | f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
app_1 | ValueError: Unrecognized configuration class <class 'transformers.models.t5.configuration_t5.T5Config'> for this kind of AutoModel: AutoModelForCausalLM.
app_1 | Model type should be one of XGLMConfig, QDQBertConfig, TrOCRConfig, GPTJConfig, PLBartConfig, RemBertConfig, RoFormerConfig, BigBirdPegasusConfig, GPTNeoConfig, BigBirdConfig, Speech2Text2Config, BlenderbotSmallConfig, BertGenerationConfig, CamembertConfig, XLMRobertaXLConfig, XLMRobertaConfig, PegasusConfig, MarianConfig, MBartConfig, MegatronBertConfig, BartConfig, BlenderbotConfig, ReformerConfig, RobertaConfig, BertConfig, OpenAIGPTConfig, GPT2Config, TransfoXLConfig, XLNetConfig, XLMProphetNetConfig, ProphetNetConfig, XLMConfig, CTRLConfig, ElectraConfig, Data2VecTextConfig.
app_1 | [2022-10-21 16:42:59 +0000] [456] [INFO] Worker exiting (pid: 456)
app_1 | [2022-10-21 16:43:00 +0000] [453] [INFO] Shutting down: Master
app_1 | [2022-10-21 16:43:00 +0000] [453] [INFO] Reason: Worker failed to boot.
app_1 | 2022-10-21 16:43:00,226 INFO exited: app (exit status 3; not expected)
app_1 | 2022-10-21 16:43:01,229 INFO spawned: 'app' with pid 466
app_1 | [2022-10-21 16:43:01 +0000] [466] [INFO] Starting gunicorn 20.1.0
app_1 | [2022-10-21 16:43:01 +0000] [466] [INFO] Listening at: http://0.0.0.0:8080 (466)
app_1 | [2022-10-21 16:43:01 +0000] [466] [INFO] Using worker: sync
app_1 | [2022-10-21 16:43:01 +0000] [469] [INFO] Booting worker with pid: 469
app_1 | 2022-10-21 16:43:02,597 INFO success: app entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
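
The ValueError above says the downloaded checkpoint has a T5 configuration, which AutoModelForCausalLM does not accept; T5 is an encoder-decoder model and is loaded through AutoModelForSeq2SeqLM in transformers. A minimal sketch of picking the loader from the checkpoint's config follows. The function names are hypothetical, and this is one possible workaround rather than the project's own fix.

```python
def auto_class_name(is_encoder_decoder):
    """Name of the transformers Auto class matching the architecture."""
    if is_encoder_decoder:
        return "AutoModelForSeq2SeqLM"
    return "AutoModelForCausalLM"


def load_model(path="model"):
    """Load a checkpoint with the Auto class that fits its config.

    Assumption: path points at a local Hugging Face model directory, as in
    the from_pretrained("model") call shown in the traceback.
    """
    # Imported lazily so the helper above can be used without transformers.
    import transformers

    config = transformers.AutoConfig.from_pretrained(path)
    auto_class = getattr(transformers, auto_class_name(config.is_encoder_decoder))
    return auto_class.from_pretrained(path)
```

Loading is only part of the change: an encoder-decoder model returns just the generated sequence from generate, so any code that strips the prompt from the output would likely need adjusting as well.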

Link to Grossmend's model

Do you by any chance still have Grossmend's model? Its page has since been deleted: https://huggingface.co/Grossmend/rudialogpt3_medium_based_on_gpt2