pteichman / cobe
A Markov chain based text generation library and MegaHAL style chatbot
Home Page: http://teichman.org/blog/
License: MIT License
The version on PyPI (pip) is 2.1.0. The version on GitHub (master branch) is 2.0.4.
The folder layout is noticeably different between the two - GitHub has additional files like search.py and counter.py. The copyright date is also more recent - commands.py is (C) 2012 on GitHub but (C) 2011 on PyPI.
However, the PyPI version has more available commands, like learn and set-stemmer. And of course, it has a higher version number.
I'm a bit stumped - which version should I use?
"The Jews steal our money through their Zionist occupied government and use the black man to bring drugs into our oppressed white minority communities." I've noticed that if you add this line to the corpus, this line will consistently show up around 85% of the time when the input line is short.
Not sure if this is a bug or just odd behavior. Anyway, does anyone have any idea why this line is rated so highly?
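A plausible explanation: cobe scores candidate replies partly by information content, so a reply built from tokens that are rare in the corpus scores very highly. A toy sketch of that idea (not cobe's actual scorer; the corpus and scoring function here are illustrative assumptions):

```python
import math
from collections import Counter

# Score a line by the total "surprise" of its tokens: rare tokens
# contribute more. A line whose tokens appear nowhere else in the
# corpus therefore beats ordinary lines almost every time.
corpus = ["the cat sat", "the dog sat", "zq xv wk"]
counts = Counter(w for line in corpus for w in line.split())
total = sum(counts.values())

def surprise(line):
    return sum(-math.log(counts[w] / total) for w in line.split())

print(max(corpus, key=surprise))  # the all-rare-token line wins
```

If the offending line's vocabulary appears nowhere else in the brain, this kind of scoring would explain why it dominates short inputs.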
On version 2.1.0. This is the output:
~$ cobe irc-client -s irc.freenode.net -c "##esperanto" -n babilemulo
INFO: connected to irc.freenode.net:6667
Traceback (most recent call last):
File "/usr/bin/cobe", line 9, in <module>
load_entry_point('cobe==2.1.0', 'console_scripts', 'cobe')()
File "/usr/lib/python2.7/dist-packages/cobe/control.py", line 42, in main
args.run(args)
File "/usr/lib/python2.7/dist-packages/cobe/commands.py", line 244, in run
Runner().run(b, args)
File "/usr/lib/python2.7/dist-packages/cobe/irc.py", line 118, in run
bot.start()
File "/usr/lib/python2.7/dist-packages/irc/client.py", line 1221, in start
self.ircobj.process_forever()
File "/usr/lib/python2.7/dist-packages/irc/client.py", line 267, in process_forever
self.process_once(timeout)
File "/usr/lib/python2.7/dist-packages/irc/client.py", line 248, in process_once
self.process_data(i)
File "/usr/lib/python2.7/dist-packages/irc/client.py", line 213, in process_data
c.process_data()
File "/usr/lib/python2.7/dist-packages/irc/client.py", line 628, in process_data
self._handle_event(Event(command, NickMask(prefix), target, [m]))
File "/usr/lib/python2.7/dist-packages/irc/client.py", line 650, in _handle_event
self.irclibobj._handle_event(self, event)
File "/usr/lib/python2.7/dist-packages/irc/client.py", line 387, in _handle_event
result = handler.callback(connection, event)
File "/usr/lib/python2.7/dist-packages/cobe/irc.py", line 34, in _dispatcher
irc.client.SimpleIRCClient._dispatcher(self, c, e)
File "/usr/lib/python2.7/dist-packages/irc/client.py", line 1184, in _dispatcher
method(connection, event)
File "/usr/lib/python2.7/dist-packages/cobe/irc.py", line 66, in on_pubmsg
user = irc.client.NickMask(event.source()).nick
TypeError: 'NickMask' object is not callable
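For what it's worth, this crash looks like an API change in the irc library: newer versions expose event.source as an attribute rather than a method, so cobe's event.source() call raises TypeError. A minimal sketch of the idea, assuming that API change (the helper below is hypothetical and just mirrors what NickMask(...).nick extracts):

```python
def nick_from_prefix(prefix):
    """Extract the nick from an IRC prefix like 'nick!user@host'.

    Equivalent to what cobe uses irc.client.NickMask(...).nick for.
    With a newer irc library, the likely fix in cobe/irc.py is to drop
    the parentheses: user = irc.client.NickMask(event.source).nick
    """
    return prefix.split("!", 1)[0]

print(nick_from_prefix("babilemulo!user@example.net"))  # babilemulo
```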
I'm running the cobe API on CentOS 7, Python 2.7.9, sqlite3 version 2.6.0.
I'm seeing the error "Error creating stemmer: encode() argument 1 must be string without null bytes, not unicode".
I traced the problem back to the fact that sqlite3 is returning a unicode object. Here's what I see (it's the same in Python 2.7.5, btw):
>>> import sqlite3
>>> conn = sqlite3.connect('example.brain')
>>> conn.cursor().execute("SELECT text FROM info WHERE attribute = ?", ("stemmer", )).fetchone()
(u'english',)
Since PyStemmer requires a string, could a suitable text_factory be added?
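For reference, sqlite3's text_factory controls the Python type returned for TEXT columns, so cobe could set it before the stemmer lookup. A small sketch against a throwaway database (bytes is used so the effect is visible on Python 3 as well; on Python 2 the equivalent fix would be conn.text_factory = str, which returns the byte strings PyStemmer accepts):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE info (attribute TEXT, text TEXT)")
conn.execute("INSERT INTO info VALUES ('stemmer', 'english')")

# text_factory decides what sqlite3 hands back for TEXT columns.
conn.text_factory = bytes
row = conn.execute(
    "SELECT text FROM info WHERE attribute = ?", ("stemmer",)).fetchone()
print(row)  # (b'english',)
```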
Is it possible to export the brain to a file to get the original data out?
I need to transfer all the data from the brain to the database. The order of the data is absolutely not important.
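Since the brain is just a SQLite file and the order doesn't matter, one schema-agnostic approach is to dump every table to text. A rough sketch (dump_brain is a hypothetical helper, not part of cobe):

```python
import sqlite3

def dump_brain(brain_path, out_path):
    """Dump every table of a SQLite-backed brain to a plain text file.

    Discovers table names from sqlite_master, so it works without
    knowing cobe's schema in advance.
    """
    conn = sqlite3.connect(brain_path)
    with open(out_path, "w") as out:
        for (name,) in conn.execute(
                "SELECT name FROM sqlite_master WHERE type='table'"):
            out.write("-- %s\n" % name)
            for row in conn.execute("SELECT * FROM %s" % name):
                out.write(repr(row) + "\n")
    conn.close()
```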
Cobe is really useful. It should be ported to Python 3. I can help with this.
So... I must have had a really old cobe. Not sure which version, but I'd like to get the data out of it, as it's reasonably important to me. This was back when the file was called "cobe.store" instead of .brain. The error is below. Any tips? Thanks!
sqlite3.OperationalError: no such table: info
Brain.reply() already has a max_len parameter. I would like to have a min_len parameter for it as well, if possible.
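Until something like that exists, a workaround is to ask for several replies and keep one that's long enough. A sketch assuming a cobe-style brain.reply(); min_len and attempts are made-up parameters:

```python
def reply_with_min_len(brain, text, min_len=20, attempts=10):
    """Return the first reply at least min_len characters long,
    falling back to the longest reply seen within `attempts` tries."""
    best = ""
    for _ in range(attempts):
        candidate = brain.reply(text)
        if len(candidate) >= min_len:
            return candidate
        if len(candidate) > len(best):
            best = candidate
    return best
```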
Traceback (most recent call last):
File "/usr/bin/cobe", line 9, in <module>
load_entry_point('cobe==2.1.0', 'console_scripts', 'cobe')()
File "build/bdist.linux-x86_64/egg/cobe/control.py", line 42, in main
File "build/bdist.linux-x86_64/egg/cobe/commands.py", line 244, in run
File "build/bdist.linux-x86_64/egg/cobe/bot.py", line 117, in run
File "build/bdist.linux-x86_64/egg/irc/client.py", line 1225, in start
File "build/bdist.linux-x86_64/egg/irc/client.py", line 267, in process_forever
File "build/bdist.linux-x86_64/egg/irc/client.py", line 248, in process_once
File "build/bdist.linux-x86_64/egg/irc/client.py", line 213, in process_data
File "build/bdist.linux-x86_64/egg/irc/client.py", line 561, in process_data
File "build/bdist.linux-x86_64/egg/irc/client.py", line 629, in _process_line
File "build/bdist.linux-x86_64/egg/irc/client.py", line 651, in _handle_event
File "build/bdist.linux-x86_64/egg/irc/client.py", line 387, in _handle_event
File "build/bdist.linux-x86_64/egg/cobe/bot.py", line 31, in _dispatcher
File "build/bdist.linux-x86_64/egg/irc/client.py", line 1188, in _dispatcher
File "build/bdist.linux-x86_64/egg/cobe/bot.py", line 105, in on_pubmsg
File "build/bdist.linux-x86_64/egg/irc/client.py", line 855, in privmsg
File "build/bdist.linux-x86_64/egg/irc/client.py", line 883, in send_raw
irc.client.MessageTooLong: Messages limited to 512 bytes including CR/LF
Any way to split up messages for sending to IRC to avoid this error?
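One workaround is to split long replies on word boundaries before sending. A sketch (the 400-character payload limit is a conservative assumption, since the 512-byte protocol limit also covers the command, target, and CR/LF, and multi-byte characters eat into it):

```python
def split_irc_message(text, limit=400):
    """Split text into chunks no longer than `limit` characters,
    breaking on word boundaries. A single word longer than the limit
    is left intact, so pick `limit` with headroom."""
    chunks, current = [], ""
    for word in text.split():
        if current and len(current) + 1 + len(word) > limit:
            chunks.append(current)
            current = word
        else:
            current = current + " " + word if current else word
    if current:
        chunks.append(current)
    return chunks
```

The bot would then call privmsg once per chunk instead of once per reply.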
Friendly greetings! This isn't a bug, but I couldn't find another way to contact you.
I forked the project and I'm planning to experiment with the new-model branch.
I have a huge dataset, so I'm expecting issues with sqlite (though it seems you're working on a LevelDB backend, is that right?) and with the "500ms" limit (easily patchable, however). I'm not done processing the dataset (I think it will be around ~100GB), so I'll start with a subset.
Could you please provide some more information about what you're planning to do with new-model and how I could eventually help?
Thank you, have fun :)
Is there a reason that argparse has a specific version requirement? The pinned version is causing install errors for me when I have cobe listed as a dependency.
Since argparse is in the standard Python library, its API should be pretty stable. Could the version pin be removed or replaced with a lower bound?
With the latest version of cobe, the program crashes when people use certain characters with certain encodings in IRC.
Example crash 1:
[user] not just lost, he's... JIMMYLOST™
cobe has quit (Read error: Connection reset by peer)
"user" in this case is using the "IRC" encoding, which is apparently ISO-8859-15 with UTF-8 as a fallback. Using the ™ character with UTF-8 as your encoding works fine.
Traceback:
http://pastebin.com/D7txeviV
[EDIT] Example crash 2 was my fault. I've fixed it. UTF-8 does not crash it anymore. Example crash 1 is still happening.
I made a tiny modification to bot.py that shouldn't affect the encoding. Line 122 is "bot.start()" in "class Runner". Here's my modified version just in case: http://pastebin.com/eNExv4t9
This issue confuses me because a bot I ran on another channel worked just fine for a while. It was able to handle all encodings and didn't crash. When I restarted the bot, however, it started crashing again. I made no modifications to the code in this time.
I get this error when installing cobe with pip:
Downloading/unpacking python-irclib==0.4.6 (from cobe)
Could not find any downloads that satisfy the requirement python-irclib==0.4.6 (from cobe)
According to PyPI, 0.4.8 is the current version of irclib.
I've tried almost every combination possible for "channel" (no quotes, ##, #, etc):
# cobe irc-client --server "irc.freenode.net" --port 6667 --channel "#cobefun" --nick "cobie"
INFO: connected to irc.freenode.net:6667
Although the bot never joins the channel... Any suggestions?
Another issue apart from that is when trying to use SSL port (+6697) the irc-client crashes immediately after connecting to the server:
# cobe irc-client --server "irc.freenode.net" --port 6697 --channel "#cobefun" --nick "cobie"
INFO: connected to irc.freenode.net:6697
Traceback (most recent call last):
File "/usr/bin/cobe", line 11, in <module>
sys.exit(main())
File "/usr/lib/python2.7/site-packages/cobe/control.py", line 42, in main
args.run(args)
File "/usr/lib/python2.7/site-packages/cobe/commands.py", line 244, in run
Runner().run(b, args)
File "/usr/lib/python2.7/site-packages/cobe/bot.py", line 115, in run
bot.start()
File "/usr/lib/python2.7/site-packages/irc/client.py", line 1273, in start
self.reactor.process_forever()
File "/usr/lib/python2.7/site-packages/irc/client.py", line 276, in process_forever
self.process_once(timeout)
File "/usr/lib/python2.7/site-packages/irc/client.py", line 257, in process_once
self.process_data(i)
File "/usr/lib/python2.7/site-packages/irc/client.py", line 214, in process_data
c.process_data()
File "/usr/lib/python2.7/site-packages/irc/client.py", line 570, in process_data
self.disconnect("Connection reset by peer")
File "/usr/lib/python2.7/site-packages/irc/client.py", line 783, in disconnect
self._handle_event(Event("disconnect", self.server, "", [message]))
File "/usr/lib/python2.7/site-packages/irc/client.py", line 677, in _handle_event
self.reactor._handle_event(self, event)
File "/usr/lib/python2.7/site-packages/irc/client.py", line 396, in _handle_event
result = handler.callback(connection, event)
File "/usr/lib/python2.7/site-packages/cobe/bot.py", line 34, in _dispatcher
irc.client.SimpleIRCClient._dispatcher(self, c, e)
File "/usr/lib/python2.7/site-packages/irc/client.py", line 1236, in _dispatcher
method(connection, event)
File "/usr/lib/python2.7/site-packages/cobe/bot.py", line 56, in on_disconnect
self._check_connection()
File "/usr/lib/python2.7/site-packages/cobe/bot.py", line 49, in _check_connection
conn.username, conn.ircname, conn.localaddress,
AttributeError: 'ServerConnection' object has no attribute 'localaddress'
Would be good if it was possible to set some minor options.
I would like to have a dont_learn
parameter for Brain.reply() if possible. It will make cobe generate a sentence, as always, but it won't learn anything about the sentence that was feed as the text
parameter.
I'm trying to implement cobe as a Sopel module. I'm getting:
[2021-05-22 17:06:27] ERROR - Unexpected error (SQLite objects created in a thread can only be used in that same thread. The object was created in thread id 140491086620544 and this is thread id 140490980185856) from rolle at 2021-05-22 17:06:27.386482. Message was: kummitus: Päivää
Traceback (most recent call last):
File "/usr/local/lib/pypy3.6/dist-packages/sopel/bot.py", line 606, in call
exit_code = func(sopel, trigger)
File "/home/rolle/.sopel/modules/megahal.py", line 36, in talkbot
response = b.reply(request)
File "/usr/local/lib/pypy3.6/dist-packages/cobe/brain.py", line 205, in reply
input_ids = list(map(self.graph.get_token_by_text, tokens))
File "/usr/local/lib/pypy3.6/dist-packages/cobe/brain.py", line 525, in get_token_by_text
c = self.cursor()
File "/usr/local/lib/pypy3.6/dist-packages/cobe/brain.py", line 466, in cursor
return self._conn.cursor()
File "/usr/lib/pypy3/lib_pypy/_sqlite3.py", line 407, in cursor
self._check_thread()
File "/usr/lib/pypy3/lib_pypy/_sqlite3.py", line 312, in _check_thread
"is thread id %d" % (self.__thread_ident, threading.get_ident()))
Module source:
"""
megahal.py - MegaHAL for sopel IRC bot
Copyright 2021, Roni Laukkarinen [[email protected]]
Licensed under the WTFPL. Do whatever the fuck you want with this. You just
can't hold me responsible if it breaks something either.
A module for the Sopel IRC Bots.
"""
from cobe.brain import Brain
import sopel.module

b = Brain("cobe.brain")
# Note: Brain.learn() takes a string of text, not a filename, so this
# learns the literal path; read the file and learn its lines instead.
b.learn('./trainerfile.txt')

# Learn everything (for some reason this regex causes problems when someone says ":(" for example):
@sopel.module.rule(".*")
def megahal_all(bot, trigger):
    only_message_all_check_only = trigger.split(": ", 1)
    if len(only_message_all_check_only) >= 2 and only_message_all_check_only[1]:
        only_message_all = only_message_all_check_only[1]
        b.learn(only_message_all)

@sopel.module.nickname_commands(".*")
def megahal(bot, trigger):
    only_message_check_only = trigger.split(": ", 1)
    if len(only_message_check_only) >= 2 and only_message_check_only[1]:
        query = trigger.replace('!', '')
        only_message = query.split(": ", 1)[1]
        response = b.reply(only_message)
        bot.reply(response)
I'm not very experienced in python so all tips appreciated.
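The error happens because Sopel dispatches handlers on worker threads, while the Brain (and its SQLite connection) was created at import time on the main thread. One common fix is one connection per thread via thread-local storage; a minimal sketch using sqlite3 directly (with cobe you would construct the Brain inside the getter the same way):

```python
import sqlite3
import threading

_local = threading.local()

def get_conn(path="cobe.brain"):
    """Return a per-thread SQLite connection, creating it on first use.

    Because each thread opens its own connection, sqlite3 never sees
    an object cross a thread boundary.
    """
    if not hasattr(_local, "conn"):
        _local.conn = sqlite3.connect(path)
    return _local.conn
```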
Hello,
Sorry if this question has already been asked, but I didn't see anything in the issues or on Google.
I've tried resolving this myself, but I haven't succeeded.
The cobe console command works great and I don't have any problems with it.
I used your API example from the wiki and tried to do something like this:
#!/usr/bin/env python
import sys
sys.path.insert(1,"/usr/local/lib/python2.7/dist-packages/cobe-2.1.2-py2.7.egg") #Not in the path
from cobe.brain import Brain
b = Brain("cobe.brain")
print b.reply(sys.argv[1])
But I'm getting this error:
Traceback (most recent call last):
File "cobe.py", line 4, in <module>
from cobe.brain import Brain
File "/path/to/cobe.py", line 4, in <module>
from cobe.brain import Brain
ImportError: No module named brain
I also installed python-cobe and cobe-act via pip; that didn't work either.
And pydoc cobe.Brain gives me the same error:
pydoc cobe.Brain
problem in cobe - <type 'exceptions.ImportError'>: No module named brain
Infos :
python --version
Python 2.7.9
Operating System: Debian GNU/Linux 8 (jessie)
Kernel: Linux 3.19.3-xxxx-std-ipv6-64
Architecture: x86-64
Do you have any tips or a solution to help me resolve this? (I'm not a Python developer.)
And sorry for my English if there are any bizarre wordings.
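One thing worth checking, since the traceback shows the import resolving to /path/to/cobe.py: if the script itself is named cobe.py, then `import cobe` finds the script instead of the installed package. A quick way to see where a module actually resolves (demonstrated on a stdlib module; with cobe installed you'd call module_path("cobe")):

```python
import importlib

def module_path(name):
    """Return the file `import name` actually resolves to.

    If this points at your own script rather than site-packages, a
    local file is shadowing the installed package; rename the script
    (and delete any stale .pyc next to it).
    """
    module = importlib.import_module(name)
    return getattr(module, "__file__", "(built-in)")

print(module_path("json"))  # should live under the stdlib, not your project
```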
I am seeing a strange error:
cobe$ cobe init
Traceback (most recent call last):
File "/usr/local/bin/cobe", line 9, in <module>
load_entry_point('cobe==2.1.1', 'console_scripts', 'cobe')()
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 519, in load_entry_point
return get_distribution(dist).load_entry_point(group, name)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2630, in load_entry_point
return ep.load()
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2310, in load
return self.resolve()
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2316, in resolve
module = __import__(self.module_name, fromlist=['__name__'], level=0)
File "build/bdist.linux-i686/egg/cobe/control.py", line 6, in <module>
File "build/bdist.linux-i686/egg/cobe/commands.py", line 12, in <module>
File "build/bdist.linux-i686/egg/cobe/bot.py", line 3, in <module>
File "build/bdist.linux-i686/egg/irc/client.py", line 67, in <module>
ImportError: No module named itertools
However from the python console:
>>> import itertools
>>> itertools
<module 'itertools' (built-in)>
>>> itertools.count()
count(0)
The above result is after cloning the master branch and running $ python setup.py install
Doesn't seem like there's much going on here lately, but I just stood up a cobe instance as a SlackBot and am very pleased with the efficiency and ease of use of this package. The goal was to bring an old IRC MegaHAL client back to life after migrating to Slack and it's gone swimmingly. Your implementation is actually loads more efficient than the old one, which was dying under the weight of its massive brain. Granted, we're not up to that size yet and we've lost the old logs, but we've got several years' worth of chatter from Slack and it's handling it really well.
I've only got one beef with it at the moment, and it's that the replies seem to be of a nearly uniform length. They seem like an average of 5-8 words. Our old bot used to regularly start rambling for a short paragraph, so it's not feeling quite right yet. Is it a relatively simple matter to add in some variability in reply length? I've looked through the code, but haven't had any luck finding what governs it.
Input was bad utf-8: make sure this works properly both in the irc client's wrapper (this trace) and in cobe.Brain itself.
Traceback (most recent call last):
File "/home/peter/lib/cobe/cracklinhal/virtualenv/bin/cobe", line 8, in <module>
load_entry_point('cobe==2.0.4', 'console_scripts', 'cobe')()
File "/home/peter/lib/cobe/cracklinhal/virtualenv/lib/python2.6/site-packages/cobe/control.py", line 42, in main
args.run(args)
File "/home/peter/lib/cobe/cracklinhal/virtualenv/lib/python2.6/site-packages/cobe/commands.py", line 244, in run
Runner().run(b, args)
File "/home/peter/lib/cobe/cracklinhal/virtualenv/lib/python2.6/site-packages/cobe/irc.py", line 116, in run
bot.start()
File "/home/peter/lib/cobe/cracklinhal/virtualenv/lib/python2.6/site-packages/irclib.py", line 1114, in start
self.ircobj.process_forever()
File "/home/peter/lib/cobe/cracklinhal/virtualenv/lib/python2.6/site-packages/irclib.py", line 229, in process_forever
self.process_once(timeout)
File "/home/peter/lib/cobe/cracklinhal/virtualenv/lib/python2.6/site-packages/irclib.py", line 214, in process_once
self.process_data(i)
File "/home/peter/lib/cobe/cracklinhal/virtualenv/lib/python2.6/site-packages/irclib.py", line 183, in process_data
c.process_data()
File "/home/peter/lib/cobe/cracklinhal/virtualenv/lib/python2.6/site-packages/irclib.py", line 581, in process_data
self._handle_event(Event(command, prefix, target, [m]))
File "/home/peter/lib/cobe/cracklinhal/virtualenv/lib/python2.6/site-packages/irclib.py", line 604, in _handle_event
self.irclibobj._handle_event(self, event)
File "/home/peter/lib/cobe/cracklinhal/virtualenv/lib/python2.6/site-packages/irclib.py", line 325, in _handle_event
if handler[1](connection, event) == "NO MORE":
File "/home/peter/lib/cobe/cracklinhal/virtualenv/lib/python2.6/site-packages/cobe/irc.py", line 32, in _dispatcher
irclib.SimpleIRCClient._dispatcher(self, c, e)
File "/home/peter/lib/cobe/cracklinhal/virtualenv/lib/python2.6/site-packages/irclib.py", line 1049, in _dispatcher
getattr(self, m)(c, e)
File "/home/peter/lib/cobe/cracklinhal/virtualenv/lib/python2.6/site-packages/cobe/irc.py", line 99, in on_pubmsg
text = text.decode("utf-8").strip()
File "/home/peter/lib/cobe/cracklinhal/virtualenv/lib/python2.6/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 8-10: invalid data
I'm creating a bot, and
...as many candidate replies as it can in half a second.
is inconvenient, because my bot has its own scoring algorithm wrapping cobe. Could you add a method that uses a constant amount of reply CPU?
Hi,
Is there some type of command/API method to grab statistics from the brain like known tokens, etc?
The command line "init" command doesn't offer any way to specify the tokenizer, which means that MegaHAL cannot be specified.
Hi, I'm trying to use Brain.reply() from multiple threads, and I'm getting the following:
Exception in thread Thread-5:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/home/root/cobe/server.py", line 71, in run
self.params[param] = brain.reply("")
File "/home/root/cobe/cobe/brain.py", line 199, in reply
pivot_set = self._filter_pivots(input_ids)
File "/home/root/cobe/cobe/brain.py", line 332, in _filter_pivots
filtered = self.graph.get_word_tokens(tokens)
File "/home/root/cobe/cobe/brain.py", line 542, in get_word_tokens
rows = self._conn.execute(q)
ProgrammingError: SQLite objects created in a thread can only be used in that same thread.The object was created in thread id 139801400686400 and this is thread id 139801076410112
My code, for reference, is merely:
elif form_params[param] == 1: # if checkbox is selected...
self.params[param] = brain.reply("") # spit back a cobe reply
Each thread runs that code. All I need to do is somehow get random cobe replies into thread output. Any ideas?
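Short of patching cobe, one pattern that works is confining the Brain to a single dedicated thread and feeding it requests through a queue. A sketch with the Brain stubbed out by a plain function (with cobe you would construct Brain("cobe.brain") and call its reply inside the worker):

```python
import queue
import threading

requests = queue.Queue()

def brain_worker():
    reply = lambda text: "reply to %r" % text  # stand-in for brain.reply
    while True:
        text, out = requests.get()
        if text is None:  # shutdown sentinel
            break
        out.put(reply(text))

threading.Thread(target=brain_worker, daemon=True).start()

def threadsafe_reply(text):
    """Callable from any thread; blocks until the brain thread answers."""
    out = queue.Queue()
    requests.put((text, out))
    return out.get()

print(threadsafe_reply("hello"))  # reply to 'hello'
```

The SQLite connection then only ever lives on the worker thread, which is exactly what the sqlite3 module requires.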
Right now all I see in the README.md is https://i.imgur.com/cwGfyca.png
It only covers how to start cobe in the terminal. There's nothing about using cobe from Python, which is odd.
"cobe creates a directed graph of n-grams (default n=3)"
Is there any way to increase n?
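For intuition, here is what a larger n buys you: each step is conditioned on the previous n-1 tokens, so a bigger n means more context (and a sparser model). A toy order-n chain, a sketch of the general idea rather than cobe's actual storage format:

```python
from collections import defaultdict

def build_chain(tokens, n=3):
    """Map each (n-1)-token context to the tokens that follow it."""
    chain = defaultdict(list)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        chain[context].append(tokens[i + n - 1])
    return chain

tokens = "the cat sat on the mat".split()
chain = build_chain(tokens, n=3)
print(chain[("the", "cat")])  # ['sat']
```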
When running the IRC client I get the following
File "/usr/local/bin/cobe", line 11, in <module>
load_entry_point('cobe==3.0.0', 'console_scripts', 'cobe')()
File "/usr/local/lib/python3.7/dist-packages/cobe-3.0.0-py3.7.egg/cobe/control.py", line 41, in main
File "/usr/local/lib/python3.7/dist-packages/cobe-3.0.0-py3.7.egg/cobe/commands.py", line 249, in run
File "/usr/local/lib/python3.7/dist-packages/cobe-3.0.0-py3.7.egg/cobe/bot.py", line 111, in run
File "/usr/local/lib/python3.7/dist-packages/cobe-3.0.0-py3.7.egg/cobe/bot.py", line 16, in __init__
AttributeError: module 'irc' has no attribute 'buffer'
The pip install cobe command is not working because a dependency cannot be installed.
When cobe doesn't recognize any word in reply()'s input, it generates replies based on random tokens in the database. The resulting replies are likely to be much longer than replies seeded with a known word.
This might be:
Sorry to bother again, but one other thing that has been bothering me:
[20:51] <Porkrinds> Cobe: very little.
[20:51] <Cobe> Porkrinds: very little.
[20:52] <Porkrinds> Cobe: thou shalt not steal
[20:52] <Cobe> Porkrinds: thou shalt not steal
This seems to happen often, usually when the input is new, and it's easily recreated on a fresh brain.
I'm making a bot, and this interferes with tqdm, a progress bar module.
Given that COBE has just finished its transition to the world of Python 3 (and welcome, by the way! :D), there are still, inevitably, some rough spots here and there.
This issue seeks to coordinate and group any of my smaller contributions meant to help COBE perform better in the world of Python 3, as well as serve as a little list of the things I actually did in it.
Since each pull request would be a batch of little changes, each almost indistinguishable from the one before and the one after, please try to keep relevant discussion in this issue :)
Ahh, if only we could split an issue's discussion into multiple threads.
––
Changes:
Removed str.encode(....) where it is unnecessary: most Python 3 libraries and functions accept str, unless they specifically perform bytewise or binary operations, such as manipulating raw data (like base64) or reading a struct (like, well, struct).
PRs: none yet. See https://github.com/Gustavo6046/cobe/tree/py3-refinement