zhenya_bot's People

Contributors

fimad, keneanung, numberten

Forkers

keneanung

zhenya_bot's Issues

!be is slow the first time it is run

N-gram distributions are computed lazily, which makes zbot start faster, but it means the first !be call takes a couple of seconds to impersonate someone :(

Proposed solution: either compute the distributions strictly on load, or compute them in a background thread.

An Onion Component

Idea for a component that echoes the names of Onion articles to the channel.

ircWrite newline error

ircWrite breaks on newlines as shown:

23:27 @jesse !define london
23:27 @zhenya_bot london: The Capital of the world, only rival New York. Incorporates the best of both Europe and America. Unlike in New York the Tube stations are
Clearly signposted. Unlike New York the streets are all squigley and it is really really old. South of the river Thames is a mythical land that those
on the North talk about in nervous whispers, but it actually isn't that bad and is fast becoming the only place in the city besides cardboard boxes th
23:27 @zhenya_bot at is affordable to live in. Stand in the middle of the Millenium footbridge and turn around in a 360 degree circle. Go on the London eye. Don't
visit the London Dungeons. Go shopping on portabello road, or in Camden, not in Covent Garden. Go to the opera in Regent's park, and to speaker's
corner in Hyde park on a sunday afternoon. Trafalger Square in the evening, Leicester square at mid-day. Karl Marx and Charles Dickens are buried in
Highga
23:27 @zhenya_bot te cemetary. Ealing is queen of the suburbs.
23:27 @jesse now..
23:27 @jesse !define 666
23:27 @zhenya_bot 666: The Mark of the beast, and the men who follow the beast of the earth will bear this mark to display their allegiance to the beast. It has been
disputed by some scholars that certain numeric ciphers would translate it over to certain words, names, etc... For example, one of the more popular
translations was Nero Caesar, and before that, Lateinos, etc...
23:27 @zhenya_bot ect number to represent the impurities of man and their evildoings and sinful ways.
23:27 @jesse it drops some stuff
23:27 @jesse usually it stops reading on new line breaks
23:27 @jesse !define trogdor
23:27 @zhenya_bot trogdor: Trogdor:
23:27 @zhenya_bot e countryside and the peasants who inhabit it, he is also known by his full name, "Trogdor the Burninator".
23:27 @will ah
23:27 @jesse o.o
23:27 @jesse anything with newlines

Maximum line length

PRIVMSG character limit is truncating long lines.

From RFC 1459:

IRC messages are always lines of characters terminated with a CR-LF
(Carriage Return - Line Feed) pair, and these messages shall not
exceed 512 characters in length, counting all characters including
the trailing CR-LF. Thus, there are 510 characters maximum allowed
for the command and its parameters. There is no provision for
continuation message lines.
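
A minimal sketch of one way to cope with both this limit and the newline problem above: split the text on newlines, then cut each line into pieces that fit. The names splitMessage, chunksOf and maxPayload are hypothetical, and the 400 figure is only an assumption that leaves room for the "PRIVMSG <target> :" prefix and whatever the server prepends when relaying. The sketch counts characters, not bytes, so multi-byte text would need a smaller budget.

```haskell
import Data.List (unfoldr)

-- Hypothetical payload budget, well under the 510 allowed by RFC 1459.
maxPayload :: Int
maxPayload = 400

-- Break a string into pieces of at most n characters.
chunksOf :: Int -> String -> [String]
chunksOf n
  | n <= 0    = const []
  | otherwise = unfoldr step
  where
    step [] = Nothing
    step xs = Just (splitAt n xs)

-- Drop carriage returns, split on newlines, skip empty lines, then
-- chunk every remaining line so each piece fits in one PRIVMSG.
splitMessage :: Int -> String -> [String]
splitMessage n =
  concatMap (chunksOf n) . filter (not . null) . lines . filter (/= '\r')
```

Each element of `splitMessage maxPayload text` could then be sent as its own PRIVMSG, subject to rate limiting.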

NAMES tracking

Currently the bot keeps track of currentNicks, the most recent list of nicks retrieved from a NAMES request. NAMES queries are only sent when joining a channel and when the !oprah command is used.

The problem: the NAMES request sent before an !oprah command doesn't update state until after currentNicks has been read to send ops, so ops are handed out based on a stale list.

Stalker component queries aren't being rate limited

The bot gets kicked from the server for the same excess flood problem seen in #35 when enough people join or quit the channel at once.

This is pretty common on servers like freenode that have frequent netsplits, or if the bot is in multiple channels.

Solution: all messages sent to the server (in this case NAMES) should be rate limited client-side, as we already do for PRIVMSGs. The exception is PONGs, as mentioned in #35.

!grep: last n matches

Idea to add a flag to !grep that returns the last n matches.

usage: !grep [-c int] [-n nick] [-m matches] regex
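
A rough sketch of the "last n matches" idea, assuming the history is stored oldest-first. lastMatches and takeLast are hypothetical names, and a plain predicate stands in for the regex match the real component performs.

```haskell
import Data.List (isInfixOf)

-- Keep only the final n elements of a list (everything if n is larger
-- than the list, nothing if n is zero or negative).
takeLast :: Int -> [a] -> [a]
takeLast n xs = drop (length xs - n) xs

-- Filter the history with the given predicate and keep the n most
-- recent hits, still in chronological order.
lastMatches :: Int -> (String -> Bool) -> [String] -> [String]
lastMatches n matches = takeLast n . filter matches

-- Example predicate: a substring test standing in for a regex.
containing :: String -> String -> Bool
containing needle line = needle `isInfixOf` line
```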

Improve the nick clustering algorithm

New issue for the discussion raised in #3.

"Seems like that fixes this issue, but opens up other problems with our clustering algorithm.

Since nick clusters represent a single identity, it doesn't really make sense for a new cluster to be formed with the pieces of others. Assuming the current clusters are correct (every nick is clustered into the cluster representing the correct identity), then all new aliases should either be grouped with a previously existing cluster, or form a group of their own.

My suggestion is that any new clustering algorithm obey this invariant:

  • After clustering, every cluster group must be a superset of its previous iteration.

Assuming that any clustering algorithm has some likelihood of clustering incorrectly, I think this would allow for the most ease of use (as it would only require the rearrangement of a single nick). What we've got now could cause any number of distortions to the group clusters, per nick introduction, depending upon similarities between groups."
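
The invariant is easy to state as a check that could run after every re-clustering. This is only a sketch: clusters are modelled as plain lists of nicks, and preservesClusters is a hypothetical name, not something in the codebase.

```haskell
type Cluster = [String]

-- True when every nick in xs also appears in ys.
subsetOf :: Cluster -> Cluster -> Bool
subsetOf xs ys = all (`elem` ys) xs

-- The proposed invariant: every old cluster must survive intact inside
-- some new cluster. Brand new clusters are allowed.
preservesClusters :: [Cluster] -> [Cluster] -> Bool
preservesClusters old new = all (\c -> any (c `subsetOf`) new) old
```

With old clusters [will, will_] and [Zhenya, Zhenya_home], adding z to the second cluster passes the check, while a result that mixes will and Zhenya into one cluster and leaves will_ in another fails it.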

Regex components - multiple matches

Something in the regex component causes actions to fire multiple times. This arises when Kleene stars/pluses are used in patterns.

This isn't a crash-inducing bug, since most actions fail cleanly when run with invalid input.

Current workarounds:

  • making sure all actions handle invalid input correctly.
  • avoiding the use of very generalized patterns

Substitution Component

Component that actually rewrites typoed phrases after a vim-esque find and replace command.

Ex:

<jesse> yeah man!1!
<jesse> s/1/!
<zhenya_bot> jesse: yeah man!!!
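
A minimal sketch of the two pieces this would need: parsing the command and rewriting the previous line. parseSub and replaceFirst are hypothetical names. The sketch does plain string matching, replaces only the first occurrence (as vim does without the g flag), and does not handle escaped slashes.

```haskell
import Data.List (stripPrefix)

-- Parse "s/old/new" into (old, new). A trailing '/' is tolerated and an
-- empty search string is rejected.
parseSub :: String -> Maybe (String, String)
parseSub cmd = do
  rest <- stripPrefix "s/" cmd
  let (old, afterOld) = break (== '/') rest
  new <- stripPrefix "/" afterOld
  if null old then Nothing else Just (old, takeWhile (/= '/') new)

-- Replace the first occurrence of old in the line, if there is one.
replaceFirst :: String -> String -> String -> String
replaceFirst old new = go
  where
    go [] = []
    go s@(c:cs) = case stripPrefix old s of
      Just rest -> new ++ rest
      Nothing   -> c : go cs
```

The component would look up the speaker's previous line in the history and reply with `replaceFirst old new previousLine` whenever `parseSub` succeeds.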

Dictionary Service

As brought up in #37, it could be very useful to have a dictionary service that stores a list of words and information associated with them, such as syllables, part of speech, etc. This would make it far easier for other components to make use of word-specific data.

Existing components that could benefit from this:

  • NGram
  • Grep

Future components that could benefit from this:

Generalize stateful, et al. to allow for stacking multiple Monad Transformers.

Currently the type for stateful is
stateful :: (String -> StateT s Bot ()) -> Bot s -> Bot BotComponent

It would be nice to have something like
statefulT :: MonadBot b => (String -> StateT s b ()) -> b s -> Bot BotComponent

This would allow for stacking multiple stateful components in the way that commandT does. It would be nice if the solution allowed component combinators to work on these more complex bot components.

Pongs are getting rate limited

If there is a long queue of messages to be privmsged and the bot receives a ping, it puts the pong at the end of its queue, which leads to the server disconnecting it. Seen in fimad/ggircd#26.

Bot shouldn't rate limit ping responses.
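
One possible shape for this, sketched as a pure data structure: keep two lanes, and always drain the urgent one first. Outbox, enqueue and next are hypothetical names, not part of the bot, and classifying by the "PONG" prefix is an assumption about how outgoing lines are represented.

```haskell
import Data.List (isPrefixOf)

-- Outgoing traffic split into an urgent lane and a rate-limited lane.
data Outbox = Outbox
  { urgent  :: [String]  -- sent immediately, e.g. PONG replies
  , limited :: [String]  -- drained at the rate-limited pace
  } deriving (Eq, Show)

emptyOutbox :: Outbox
emptyOutbox = Outbox [] []

-- Route a raw IRC line to the right lane.
enqueue :: String -> Outbox -> Outbox
enqueue line box
  | "PONG" `isPrefixOf` line = box { urgent = urgent box ++ [line] }
  | otherwise                = box { limited = limited box ++ [line] }

-- Pick the next line to send: urgent lines always win.
next :: Outbox -> Maybe (String, Outbox)
next (Outbox (u:us) ls) = Just (u, Outbox us ls)
next (Outbox [] (l:ls)) = Just (l, Outbox [] ls)
next (Outbox [] [])     = Nothing
```

The sending loop would apply its delay only when the line returned by `next` came from the limited lane.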

Summarize Component

A component for catching up on backlog! This component would introduce two new commands as listed.

  • !note <number>
  • !summarize

!note would bookmark the last <number> lines spoken.
!summarize would have the bot privmsg all of the notes taken since the last time you spoke in the channel.

dictionary component implementation - error catching

After a failed attempt to !define a word, the next zhenya_bot trigger is ignored.
Things known to cause !define to fail:

  • no urbandictionary definition for lookup phrase
  • certain pages having varying page source

Note: This effect has been seen in other components upon failure.

!grep: ignore !grep commands

The only thing more annoying than messing up a !grep is having your next, corrected !grep command match the previous failed attempt. This alongside #14 should fix a lot of the component's most frustrating problems.

Add lists to zbot

I want to be able to do:

!list add ggpj write basic ircd

!list

ggpj
list2

!list show ggpj

  1. foo
  2. bar

!list rm ggpj 1

  1. bar

!list rmlist ggpj

A guide for making your own component

As mentioned in #24, a guide for 'rolling your own component' would be useful.

Things it could include:

  • an example component written in literate haskell
  • an overview of the different BotComponent constructors (stateful, regex, command, etc)
  • a section about using passive components (nickcluster, history, etc)

Selectiveness for Reboot/Op Components

Some components should only respond to approved users. For example, no random person should be able to make the bot !quit, or be given ops via !ascend.

Reputation

This bot needs rep.

-1 bot for no rep

Should restart if the server goes down.

Currently the bot does not detect server crashes. Zhenya_bot should either detect a broken socket or periodically send pings and restart itself if the server does not send a pong message in a reasonable amount of time.

Enforce usage messages

For the sake of transparency for end users, it is best practice for all active components to have a usage message when invoked without any parameters. It would be convenient if this were enforced by the type system, instead of leaving it up to the component's implementer to remember to catch the edge case.
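
A sketch of how the type system could carry this, under the assumption that a command can be modelled as a record. Command, runCommand and grepCommand are hypothetical and not the existing BotComponent API. The usage string is a strict field, so GHC rejects a record construction that omits it, and the handler only ever sees a non-empty argument list, so the no-arguments case cannot be forgotten.

```haskell
import Data.List.NonEmpty (NonEmpty, nonEmpty)
import qualified Data.List.NonEmpty as NE

-- A command cannot be built without a usage message, and its handler
-- cannot be called with zero arguments.
data Command = Command
  { usage   :: !String
  , handler :: !(NonEmpty String -> String)
  }

-- The only way to run a command: no arguments means the usage message.
runCommand :: Command -> [String] -> String
runCommand cmd args = maybe (usage cmd) (handler cmd) (nonEmpty args)

-- Example command; the handler body is a placeholder.
grepCommand :: Command
grepCommand = Command
  { usage   = "usage: !grep [-c int] [-n nick] [-m matches] regex"
  , handler = \args -> "searching for " ++ NE.last args
  }
```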

Command to remove bad aliases.

It's possible to troll the nick clustering service into grouping all names together by creating a sequence of nicks of small edit distance. As of now the only recourse is to manually edit the persistent nick cluster file.

Example:
22:54 -!- Zhenya is now known as z
22:55 < z> nvm
22:55 @jesse almost
22:55 @jesse !alias z
22:55 < zhenya_bot> Zhenya, Zhenya_home, Zhenya_work, Zhenya, will, will_, z, zbot_will
22:55 <@jesse> oh weird
22:55 <@jesse> !alias Zhenya
22:55 < zhenya_bot> Zhenya, Zhenya_home, Zhenya_work,Zhenya, will, will_, z, zbot_will
22:55 @jesse lol
22:55 @jesse you made a z name
22:55 @jesse and it combined your group with wills
22:55 @jesse you ruined all of name clustering
22:55 @jesse hahahaha

Client-side rate limiting

As the Wikipedia article below mentions, IRCds keep a buffer for privmsgs that they use as a queue to rate limit you when you send more than they allow in a given interval. If you send more than the buffer can hold, you get kicked from the server with an "Excess Flood" quit message.

Solution: We need client-side rate limiting, since the server can't be relied upon to handle arbitrarily sized groups of privmsgs.

http://en.wikipedia.org/wiki/Internet_Relay_Chat_flood
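
One common way to do this is a token bucket; the sketch below is a pure version of it, not what the bot currently does. Bucket, refill and trySend are hypothetical names, timestamps are plain seconds passed in by the caller, and the right capacity and refill rate depend on the IRCd, so any concrete numbers are assumptions.

```haskell
-- Token bucket: at most `capacity` messages in a burst, refilled at
-- `refillPerSec` tokens per second.
data Bucket = Bucket
  { capacity     :: Double
  , refillPerSec :: Double
  , tokens       :: Double
  , lastSeen     :: Double  -- timestamp of the last update, in seconds
  } deriving (Eq, Show)

-- Credit the tokens earned since the last update, capped at capacity.
refill :: Double -> Bucket -> Bucket
refill now b = b { tokens = min (capacity b) earned, lastSeen = now }
  where
    earned = tokens b + (now - lastSeen b) * refillPerSec b

-- Try to send at time `now`: Just the updated bucket if a token was
-- available, Nothing if the caller should wait and retry later.
trySend :: Double -> Bucket -> Maybe Bucket
trySend now b
  | tokens b' >= 1 = Just b' { tokens = tokens b' - 1 }
  | otherwise      = Nothing
  where
    b' = refill now b
```

A bucket with capacity 2 and a refill rate of 1 allows two back-to-back messages, refuses a third in the same instant, and allows it again one second later.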

!define: substitute randomly defined word with search term

When !define can't find a query's definition, it picks a definition at random. It would be better if all occurrences of the randomly chosen word were replaced with the queried word.

Example:
user> !define partici-pants
bot> partici-pants: The name of a really hot guy, beloved by all. Dylan is the best name ever, unlike every other urbandictionary definition of someone's name.

Better reply:
bot> partici-pants: The name of a really hot guy, beloved by all. partici-pants is the best name ever, unlike every other urbandictionary definition of someone's name.
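
The replacement itself is a small pure function. The sketch below assumes the component knows which word the random definition belongs to ("Dylan" in the example); replaceAll is a hypothetical name, and the match is case-sensitive with no notion of word boundaries.

```haskell
import Data.List (stripPrefix)

-- Replace every occurrence of `old` with `new`. An empty `old` leaves
-- the text untouched.
replaceAll :: String -> String -> String -> String
replaceAll old new
  | null old  = id
  | otherwise = go
  where
    go [] = []
    go s@(c:cs) = case stripPrefix old s of
      Just rest -> new ++ go rest
      Nothing   -> c : go cs
```

Applied as `replaceAll "Dylan" "partici-pants" definition`, this turns "Dylan is the best name ever" into "partici-pants is the best name ever".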

Rhyming Component

Implement rhyme matching so zhenya_bot can detect rhyming words.

Originally I thought it would be neat to scrape random songs off Rap Genius and pull rhyming lines from different songs together to create a new rap.

It could also be interesting to apply the rhyming component to the IRC history too though.

non-responsive github component

Github component doesn't seem to be working. The bot isn't updating itself post webhook.

Likely because github changed their privmsg, breaking our regex.

!queens: denial of service

Large inputs force zhenya_bot to freeze as it calculates an answer.

Solution: put a hard limit on valid values of n.

!queens: pretty printing

Pretty Printing Ideas:

  • It would be cool if the !queens component had a pretty printer that displayed a grid of its solution.
  • The bot should have error messages for the cases that don't have answers (n=2, n=3) as well as cases where answers are too computationally expensive (#19).
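
Both ideas fit in a few lines. The sketch below is only an illustration: validateN, renderBoard and maxN are hypothetical names, the cap of 12 is a placeholder to be tuned against the real solver, and a solution is assumed to be a list giving the queen's 0-based column for each row.

```haskell
-- Hypothetical cap on the board size.
maxN :: Int
maxN = 12

-- Reject board sizes that have no solution or are too expensive.
validateN :: Int -> Either String Int
validateN n
  | n < 1            = Left "n must be at least 1"
  | n == 2 || n == 3 = Left ("no solution exists for n=" ++ show n)
  | n > maxN         = Left ("n is limited to " ++ show maxN)
  | otherwise        = Right n

-- Render a solution as a grid, one string per row, with 'Q' marking
-- the queen and '.' marking empty squares.
renderBoard :: [Int] -> [String]
renderBoard cols = map row cols
  where
    n = length cols
    row c = [if i == c then 'Q' else '.' | i <- [0 .. n - 1]]
```

For the 4-queens solution [1, 3, 0, 2], renderBoard gives the rows ".Q..", "...Q", "Q..." and "..Q.", which the bot could send as four separate lines.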

!define component failing to capitalize queries

This didn't use to be a problem, but it looks like our queries are no longer being automatically capitalized as needed.

ex:

10:50 <jesse> !define haskell
10:50 <zhenya_bot> haskell: A hit-single (well, sort of,) by rapper Sage the Gemini. It's actually really catchy after it sinks in. Also, there is a dance to 
                   it (and although it's controversial, I'd rather see a party girl do it than a little kid, because that's just really bad and wrong for them 
                   to know how to do such a promiscuous dance.) haskell is essentially referring to a girl shaking her plump ass the way a pitbull shakes its 
                   body (after getting wet, et
10:50 <zhenya_bot> c.) and preferably, he names a haskell pitbull (any form of pittie with a pink or, "red" nose.) Sage is actually really fine, too! I think 
                   it's the eyes...
10:51 <jesse> !define Haskell
10:51 <zhenya_bot> Haskell: 1 -- A general purpose, polymorphicly typed, lazy functional programming language largely based on lambda calculus.   2 -- A 
                   constant source of frustration for those who have been brainwashed by the OO paradigm.

Idea for manual/help component

As brought up in #30:

We should have a component that lists all other components and information about their use.

This would be similar to lambdabot's @help command.

README is out of date

README could use a redo.

Things we're missing:

  • installation guide
  • up-to-date syntax for commands
  • a license
  • compatibility

!define: breaks on hyperlinks

When the dictionary component finds a hyperlink in a definition it short circuits.

Ex:

<jesse> !define poway
<zhenya_bot> poway: A rich and poor suburban town of 

Where the expected definition is: "A rich and poor suburban town of San Diego. I.E. Upper Windmill Poway (rich) and Lower Windmill Poway (poor). Formally known as the "City In The Country". Also known for the hottest bitchest girls around, milfs, and professional athletes. Topping it off with nearly nothing to do except; see movies or do something illegal. Come stare at huge houses with huge price tags come to Poway "

Where the strings "San Diego" and "milfs" are links to other definitions.

Persistent components crash if their data file does not exist.

See:

> NICK rick
> USER rick 0 * :Greatest Guys bot
> WHO rick
> JOIN #waffles
ERROR: NickCluster Error: Data.Clustering.Hierarchical.Internal.DistanceMatrix.dendrogram': empty input list
zhenya_bot: data/log.txt: openFile: does not exist (No such file or directory)
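
For the openFile failure, one option is to treat a missing data file as empty state. The sketch below is an assumption about how a fix could look, with readFileOr as a hypothetical helper; it only covers the missing-file case and does not address the NickCluster "empty input list" error, which would need its own guard.

```haskell
import Control.Exception (catch, throwIO)
import System.IO.Error (isDoesNotExistError)

-- Read a data file, falling back to a default when it does not exist.
-- Any other I/O error (permissions, etc.) is rethrown unchanged.
readFileOr :: String -> FilePath -> IO String
readFileOr fallback path = readFile path `catch` handler
  where
    handler e
      | isDoesNotExistError e = pure fallback
      | otherwise             = throwIO e
```

A persistent component could then start from `readFileOr "" "data/log.txt"` and create the file on its first write.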

buttbot component

We all remember the wonderful buttbot. If you don't, check out https://code.google.com/p/buttbot/

Basically, at random, buttbot will take a sentence and replace a word with 'butt'. Results can be hilarious.

Only issue I ever had with buttbot was that sometimes it would be a little too active. I'm thinking maybe a !butt command will take the last said line and butt-ify it.

Haiku Component

Would be cool if zhenya_bot could generate haikus. This would be rather easy by explicitly tagging history messages with their syllable length and then sampling from the history component.

Alternatively/also it would be cool if zhenya_bot could recognize haikus spoken in the channel. First three lines of example taken from actual backlog excerpt:

19:08 < Cosmo> because it's intense!
19:09 <@jesse> do core monsters shoot at you?
19:18 < Cosmo> haha, sadly no
19:18 <@zhenya_bot> Nice haiku guys!
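
Once history lines carry a syllable count, recognizing a haiku is a sliding-window check for the 5-7-5 pattern. The sketch below assumes the tags already exist (counting syllables is the hard part and is left out); Tagged and haikus are hypothetical names.

```haskell
import Data.List (tails)

-- A history line tagged with its syllable count.
type Tagged = (Int, String)

-- Find every run of three consecutive lines with 5, 7 and 5 syllables.
haikus :: [Tagged] -> [[String]]
haikus history =
  [ map snd window
  | window <- map (take 3) (tails history)
  , map fst window == [5, 7, 5]
  ]
```

The bot would run this over the tail of the history after each message and reply when the newest three lines form a match.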

Support for super users

I think we should add some formal support for 'super users' or higher tiered users. This would allow for things like #44. Any ideas for a good way of doing this?

I was thinking a nice way of doing this would be passing a list of approved nicks as command line args and storing them as bot state accessible to any component. I'm unsure how nick clusters would be treated though. Not respecting clusters would be annoying, but respecting them could be a security hole.
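
A tiny sketch of the approved-nicks idea, taking the cautious side of the cluster question: only exact nicks from the approved list count, so an alias like jesse_ is not trusted. Nick, isSuperUser and guarded are hypothetical names, and the comparison is case-sensitive, which a real version would probably want to relax.

```haskell
type Nick = String

-- Exact-match check against the approved list (e.g. from command line
-- args). Nick clusters are deliberately ignored here.
isSuperUser :: [Nick] -> Nick -> Bool
isSuperUser approved nick = nick `elem` approved

-- Hand back a privileged action only when the speaker is approved.
guarded :: [Nick] -> Nick -> a -> Maybe a
guarded approved nick action
  | isSuperUser approved nick = Just action
  | otherwise                 = Nothing
```

Components like !quit or !ascend would wrap their action in `guarded` and do nothing (or reply with a refusal) on Nothing.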
