ebanko

Who is ebanko?

Ebanko is a conversational Telegram bot trained on 2ch.hk/b/.

Infrastructure

The container network is run with docker-compose. The containers are:

  • Backend: NLP tasks run inside a Flask server, which is accessible only from inside the network
  • Telegram API: an asynchronous Telegram API client that sends requests to the backend for processing
  • Metrics collector: Prometheus collects metrics from Telegraf and Flask
  • Metrics board: Grafana on port 3000
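As a rough illustration of the bot-to-backend hop described above, here is a minimal sketch of how a request to the backend could be built. The /predict route and port 8080 appear in the logs further down; the host name `app`, the JSON field names `text` and `temp`, and the helper `build_predict_request` are assumptions made for this example, not taken from the repository.

```python
import json
import urllib.request

# Assumption: the backend is reachable as "app" on the compose network.
# Port 8080 and the /predict route are taken from the logs in the issues.
BACKEND_URL = "http://app:8080/predict"


def build_predict_request(text, temperature=1.0, url=BACKEND_URL):
    """Build (but do not send) a POST request for the backend.

    The JSON field names are hypothetical; the real server may expect others.
    """
    body = json.dumps({"text": text, "temp": temperature}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request("hello", 0.7)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` from a container on the same network, since the backend is not exposed outside it.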

How to run

  • Insert your bot's token in bot.py
  • Download the model (see Availability) and place it in app/app/model
  • From the app directory, run:
docker-compose up --build

No GPU is needed.
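The steps above have the token pasted directly into bot.py. As a possible alternative, the sketch below reads it from an environment variable instead, which keeps the secret out of the source file. The variable name `TELEGRAM_BOT_TOKEN` and the helper `load_bot_token` are hypothetical and do not come from the repository.

```python
import os


def load_bot_token(env=os.environ, var_name="TELEGRAM_BOT_TOKEN"):
    """Return the bot token from the environment, or fail with a clear message.

    `var_name` is an assumed name; use whatever your deployment defines.
    """
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set {var_name} before starting the bot")
    return token


# Example with an explicit mapping instead of the real environment.
example_token = load_bot_token({"TELEGRAM_BOT_TOKEN": " 123:abc "})
```

With docker-compose, such a variable would have to be passed to the bot container through its environment settings.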

Availability

The fine-tuned model is available on Hugging Face. The dataset is also available there.

Inference speedup

  • None (for now)

ebanko's Issues

Error during dialogue

bot_1 | 2022-07-10 17:53:19,497 - main - INFO - Saving
bot_1 | 2022-07-10 17:53:19,501 - main - INFO - Saved
app_1 | Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
app_1 | Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
app_1 | [2022-07-10 17:53:40 +0000] [9] [CRITICAL] WORKER TIMEOUT (pid:17)
app_1 | [2022-07-10 17:53:40 +0000] [17] [INFO] Worker exiting (pid: 17)
app_1 | [2022-07-10 17:53:41 +0000] [9] [WARNING] Worker with pid 17 was terminated due to signal 9
app_1 | [2022-07-10 17:53:41 +0000] [33] [INFO] Booting worker with pid: 33
app_1 | Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
app_1 | Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
bot_1 | /usr/local/lib/python3.7/site-packages/telegram/ext/utils/promise.py:96: TelegramDeprecationWarning: The @run_async decorator is deprecated. Use the run_async parameter of your Handler or Dispatcher.run_async instead.
bot_1 | self._result = self.pooled_function(*self.args, **self.kwargs)
bot_1 | 2022-07-10 17:53:53,797 - main - INFO - Entering private toxification
app_1 | [2022-07-10 17:54:00,113] ERROR in app: Exception on /predict [POST]
app_1 | Traceback (most recent call last):
app_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2073, in wsgi_app
app_1 | response = self.full_dispatch_request()
app_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1518, in full_dispatch_request
app_1 | rv = self.handle_user_exception(e)
app_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1516, in full_dispatch_request
app_1 | rv = self.dispatch_request()
app_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1502, in dispatch_request
app_1 | return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 710, in func
app_1 | return current_app.handle_user_exception(ex)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 708, in func
app_1 | raise exception
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 666, in func
app_1 | response = current_app.handle_user_exception(ex)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 663, in func
app_1 | response = f(*args, **kwargs)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 710, in func
app_1 | return current_app.handle_user_exception(ex)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 708, in func
app_1 | raise exception
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 666, in func
app_1 | response = current_app.handle_user_exception(ex)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 663, in func
app_1 | response = f(*args, **kwargs)
app_1 | File "/app/server.py", line 23, in predict
app_1 | result = model.toxify(text, temp)
app_1 | File "/app/Ebanko.py", line 21, in toxify
app_1 | bad_words_ids=[[tokenizer.pad_token_id]],
app_1 | NameError: name 'tokenizer' is not defined
bot_1 | 2022-07-10 17:54:00,127 - telegram.ext.dispatcher - ERROR - A promise with deactivated error handling raised an error.
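The traceback above ends with `tokenizer` being an undefined name inside `toxify`. Ebanko.py itself is not shown, so the cause can only be guessed; one common pattern that produces this error is storing an object on the instance in `__init__` and then using it as a bare name in a method. The sketch below illustrates that fix under this assumption. The class body, `FakeTokenizer`, and `FakeModel` are hypothetical stand-ins that only mimic the call shape seen in the traceback, not the real transformers objects.

```python
class Ebanko:
    """Hypothetical reconstruction; the real Ebanko.py is not shown."""

    def __init__(self, tokenizer, model):
        # Keep both objects on the instance so every method can reach them.
        self.tokenizer = tokenizer
        self.model = model

    def toxify(self, text, temperature=1.0):
        # Bind input_ids locally before it is used in the generate call.
        input_ids = self.tokenizer.encode(text)
        output_ids = self.model.generate(
            input_ids,
            max_length=len(input_ids) + 32,
            temperature=temperature,
            # self.tokenizer, not a bare `tokenizer`, which raises NameError.
            bad_words_ids=[[self.tokenizer.pad_token_id]],
        )
        return self.tokenizer.decode(output_ids)


class FakeTokenizer:
    """Character-level stand-in used only to exercise the sketch."""

    pad_token_id = 0

    def encode(self, text):
        return [ord(ch) for ch in text]

    def decode(self, ids):
        return "".join(chr(i) for i in ids)


class FakeModel:
    """Echoes its input, truncated to max_length; it does no real generation."""

    def generate(self, input_ids, max_length, temperature, bad_words_ids):
        return list(input_ids)[:max_length]


bot = Ebanko(FakeTokenizer(), FakeModel())
reply = bot.toxify("hi")
```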

Ability to control message frequency and length

Is it possible to add a function or command that controls how often the bot replies? For example, a random number from 1 to 100 could be drawn for each message: if the number is greater than 80, the bot answers; otherwise it stays silent. If possible, it would also be useful to control the length of the message.
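The request above can be sketched in a few lines. This is only an illustration of the idea as described, not code from the bot; `should_reply`, `clip_reply`, and their parameters are hypothetical names, and length is limited here by word count rather than by model tokens.

```python
import random


def should_reply(rng=random, threshold=80):
    """Draw a number from 1 to 100 and reply only if it exceeds the threshold."""
    roll = rng.randint(1, 100)
    return roll > threshold


def clip_reply(text, max_words=30):
    """Limit a generated reply to at most max_words whitespace-separated words."""
    words = text.split()
    return " ".join(words[:max_words])


class FixedRoll:
    """Deterministic stand-in for the random module, used for the example."""

    def __init__(self, value):
        self.value = value

    def randint(self, low, high):
        return self.value


answers_at_81 = should_reply(FixedRoll(81))
answers_at_80 = should_reply(FixedRoll(80))
short_reply = clip_reply("one two three four", max_words=2)
```

With a threshold of 80 the bot would answer roughly one message in five; exposing `threshold` and `max_words` through a chat command would give the control the request asks for.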

Error during dialogue

bot_1 | /usr/local/lib/python3.7/site-packages/telegram/ext/utils/promise.py:96: TelegramDeprecationWarning: The @run_async decorator is deprecated. Use the run_async parameter of your Handler or Dispatcher.run_async instead.
bot_1 | self._result = self.pooled_function(*self.args, **self.kwargs)
bot_1 | 2022-07-10 12:03:03,210 - main - INFO - Entering private toxification
app_1 | [2022-07-10 12:03:11 +0000] [9] [CRITICAL] WORKER TIMEOUT (pid:17)
app_1 | [2022-07-10 12:03:11 +0000] [17] [INFO] Worker exiting (pid: 17)
app_1 | [2022-07-10 12:03:12 +0000] [9] [WARNING] Worker with pid 17 was terminated due to signal 9
app_1 | [2022-07-10 12:03:12 +0000] [33] [INFO] Booting worker with pid: 33
app_1 | Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
app_1 | Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
app_1 | [2022-07-10 12:03:30,276] ERROR in app: Exception on /predict [POST]
app_1 | Traceback (most recent call last):
app_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2073, in wsgi_app
app_1 | response = self.full_dispatch_request()
app_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1518, in full_dispatch_request
app_1 | rv = self.handle_user_exception(e)
app_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1516, in full_dispatch_request
app_1 | rv = self.dispatch_request()
app_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1502, in dispatch_request
app_1 | return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 710, in func
app_1 | return current_app.handle_user_exception(ex)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 708, in func
app_1 | raise exception
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 666, in func
app_1 | response = current_app.handle_user_exception(ex)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 663, in func
app_1 | response = f(*args, **kwargs)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 710, in func
app_1 | return current_app.handle_user_exception(ex)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 708, in func
app_1 | raise exception
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 666, in func
app_1 | response = current_app.handle_user_exception(ex)
app_1 | File "/usr/local/lib/python3.7/site-packages/prometheus_flask_exporter/__init__.py", line 663, in func
app_1 | response = f(*args, **kwargs)
app_1 | File "/app/server.py", line 23, in predict
app_1 | result = model.toxify(text, temp)
app_1 | File "/app/Ebanko.py", line 19, in toxify
app_1 | max_length=len(input_ids) + 32,
app_1 | NameError: name 'input_ids' is not defined
bot_1 | 2022-07-10 12:03:30,290 - telegram.ext.dispatcher - ERROR - A promise with deactivated error handling raised an error.
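Separately from the NameError, the log above shows gunicorn killing a worker with WORKER TIMEOUT. gunicorn's default worker timeout is 30 seconds, and CPU-only text generation can plausibly take longer than the configured limit. Whether this deployment uses the default is not visible in the log, so the following is only a sketch of a gunicorn configuration file that raises the limit. The file name gunicorn.conf.py and the value 120 are assumptions; the bind address matches the one printed in the startup log of a later issue.

```python
# gunicorn.conf.py (hypothetical file; gunicorn config files are plain Python)

# Address the server listens on, as printed in the startup log.
bind = "0.0.0.0:8080"

# Seconds a worker may stay silent before the master kills and restarts it.
# 120 is an arbitrary example; tune it to the slowest expected generation.
timeout = 120
```

The same setting is also available on the command line as `--timeout 120`.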

bot start error

When using the large model (https://huggingface.co/BlackSamorez/ebanko-large), the bot fails with an error on startup:

app_1 | self.load_wsgi()
app_1 | File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/base.py", line 146, in load_wsgi
app_1 | self.wsgi = self.app.wsgi()
app_1 | File "/usr/local/lib/python3.7/site-packages/gunicorn/app/base.py", line 67, in wsgi
app_1 | self.callable = self.load()
app_1 | File "/usr/local/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 58, in load
app_1 | return self.load_wsgiapp()
app_1 | File "/usr/local/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
app_1 | return util.import_app(self.app_uri)
app_1 | File "/usr/local/lib/python3.7/site-packages/gunicorn/util.py", line 359, in import_app
app_1 | mod = importlib.import_module(module)
app_1 | File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module
app_1 | return _bootstrap._gcd_import(name[level:], package, level)
app_1 | File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
app_1 | File "<frozen importlib._bootstrap>", line 983, in _find_and_load
app_1 | File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
app_1 | File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
app_1 | File "<frozen importlib._bootstrap_external>", line 728, in exec_module
app_1 | File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
app_1 | File "/app/server.py", line 13, in <module>
app_1 | model = Ebanko()
app_1 | File "/app/Ebanko.py", line 11, in __init__
app_1 | self.model = AutoModelForCausalLM.from_pretrained("model").to(DEVICE)
app_1 | File "/usr/local/lib/python3.7/site-packages/transformers/models/auto/auto_factory.py", line 449, in from_pretrained
app_1 | f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
app_1 | ValueError: Unrecognized configuration class <class 'transformers.models.t5.configuration_t5.T5Config'> for this kind of AutoModel: AutoModelForCausalLM.
app_1 | Model type should be one of XGLMConfig, QDQBertConfig, TrOCRConfig, GPTJConfig, PLBartConfig, RemBertConfig, RoFormerConfig, BigBirdPegasusConfig, GPTNeoConfig, BigBirdConfig, Speech2Text2Config, BlenderbotSmallConfig, BertGenerationConfig, CamembertConfig, XLMRobertaXLConfig, XLMRobertaConfig, PegasusConfig, MarianConfig, MBartConfig, MegatronBertConfig, BartConfig, BlenderbotConfig, ReformerConfig, RobertaConfig, BertConfig, OpenAIGPTConfig, GPT2Config, TransfoXLConfig, XLNetConfig, XLMProphetNetConfig, ProphetNetConfig, XLMConfig, CTRLConfig, ElectraConfig, Data2VecTextConfig.
app_1 | [2022-10-21 16:42:59 +0000] [456] [INFO] Worker exiting (pid: 456)
app_1 | [2022-10-21 16:43:00 +0000] [453] [INFO] Shutting down: Master
app_1 | [2022-10-21 16:43:00 +0000] [453] [INFO] Reason: Worker failed to boot.
app_1 | 2022-10-21 16:43:00,226 INFO exited: app (exit status 3; not expected)
app_1 | 2022-10-21 16:43:01,229 INFO spawned: 'app' with pid 466
app_1 | [2022-10-21 16:43:01 +0000] [466] [INFO] Starting gunicorn 20.1.0
app_1 | [2022-10-21 16:43:01 +0000] [466] [INFO] Listening at: http://0.0.0.0:8080 (466)
app_1 | [2022-10-21 16:43:01 +0000] [466] [INFO] Using worker: sync
app_1 | [2022-10-21 16:43:01 +0000] [469] [INFO] Booting worker with pid: 469
app_1 | 2022-10-21 16:43:02,597 INFO success: app entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
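The ValueError above says the downloaded checkpoint has a T5 configuration, which `AutoModelForCausalLM` does not accept. T5 is an encoder-decoder model, and in transformers such models are loaded with `AutoModelForSeq2SeqLM`. A minimal sketch of choosing the loader from the checkpoint's config.json follows. The helper names are hypothetical, and the rest of the bot (prompt format, generation arguments) may also need changes for a seq2seq model, which this sketch does not cover.

```python
import json
from pathlib import Path


def pick_auto_class_name(config):
    """Choose a transformers auto class name from a parsed config.json dict."""
    if config.get("is_encoder_decoder") or config.get("model_type") == "t5":
        return "AutoModelForSeq2SeqLM"
    return "AutoModelForCausalLM"


def load_model(model_dir="model"):
    """Load the checkpoint with the auto class matching its configuration."""
    import transformers  # imported lazily so the helper above has no dependency

    config = json.loads((Path(model_dir) / "config.json").read_text())
    auto_class = getattr(transformers, pick_auto_class_name(config))
    return auto_class.from_pretrained(model_dir)


large_loader = pick_auto_class_name({"model_type": "t5"})
small_loader = pick_auto_class_name({"model_type": "gpt2"})
```

Here "model" is the same directory name passed to `from_pretrained` in the traceback.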
