Discussion from <a href="https://github.com/cannawen/metric_units_reddit_bot/commit/37

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

"Personality" architecture overhaul discussion about metric_units_reddit_bot HOT 16 CLOSED

cannawen commented on August 16, 2024

"Personality" architecture overhaul discussion

from metric_units_reddit_bot.

Comments (16)

cannawen commented on August 16, 2024

@chazzlabs I agree, this file is extremely not open-close (same problem with the "conversions" file), and I think an architecture overhaul would be a great idea!

I feel like ideally, each personality response should be represented by a data file (like .yaml) but I don't know if that's possible, with the regexes and the javascript logic

I am envisioning creating a new folder and creating a file for each personality response. Then when the main code runs, it reads all of the files in the folder and figures out which response to give. Is that what you were thinking?

I'm pretty new to javascript/Node/server programming, so any other ideas are welcome :)

Footnote:
If we break each "personality" chunk into its own file, then I think we need to assign a "priority" to each one, because certain triggers (like "who's a good bot") can match multiple personality responses ("good bot" and "who's a {{x}} bot").

from metric_units_reddit_bot.

chazzlabs commented on August 16, 2024

I am envisioning creating a new folder and creating a file for each personality response. Then when the main code runs, it reads all of the files in the folder and figures out which response to give. Is that what you were thinking?

Yes, basically; I was thinking of a single personality directory with an index.js, good-bot.js, bad-bot.js, etc. The index.js file would pull in each of the individual personality files and conflate their contents into a single array that's identical to the robotPersonality array that exists currently. The personality module would function the same as it does now.

If we break each "personality" chunk into its own file, then I think we need to assign a "priority" to each one, because certain triggers (like "who's a good bot") can match multiple personality responses ("good bot" and "who's a {{x}} bot").

I'm not sure about this, but this problem exists already, doesn't it? It looks like in personality.js the loop inside the reply function just takes the first entry in the list that matches the message body.

As for post-processing, there are only a few cases where that's required right now if I'm not mistaken (adjective, ADJECTIVE, and username), and that post-processing code is basically duplicated when it's required. For ease of development, post-processing could be run on all responses by default with support for those existing cases.

The stuff below is just a rough example implementation of some ideas I had. I think I actually misunderstood you initially on a point you brought up, but it led me to discovering that a response can be shared by more than one regex. I wrote this up under the assumption that this could be true going forward and in an effort to minimize duplication. I think this makes it easy to define new responses, but it also means those responses still exist in a single config file. It still might be much more readable, and unit tests could use test values instead of actual "production" phrases to keep from taking an inordinate amount of time to run.

I think another advantage to this approach is that all possible responses are mapped to the regex(es) they match, with their weight accounted for, up front. The "heavy lifting" of creating weighted arrays for each match is done at startup and not when the bot is constructing its reply; the bot simply determines which type of phrase to choose from and supplies a random number to retrieve it.

I apologize if any of that was confusing. 😆

// matchers.js

const GOOD_BOT = "good";
const CUTE_BOT = "cute";
const GOOD_BAD_BOT = "good_bad"
// BAD_BOT, MEH_BOT, ...

const matchers = {
    [GOOD_BOT]: /good (?:ro)?bot/i,
    [CUTE_BOT]: /(cute|adorable|kawaii) (ro)?bot/i,
    [GOOD_BAD_BOT]: (message) => {
        const goodMatch = message.match(/good/i);
        const badMatch = message.match(/bad/i);
        const botMatch = message.match(/(?:ro)?bot/i);
        return goodMatch && badMatch && botMatch;
    }
    // BAD_BOT, MEH_BOT, ...
}

module.exports = {
    [GOOD_BOT]: GOOD_BOT,
    [CUTE_BOT]: CUTE_BOT,
    [GOOD_BAD_BOT]: GOOD_BAD_BOT,
    matchers: matchers
};

// responses.js

const responses = [
    {
        "matchers": [CUTE_BOT],
        "phrase": "Stop it, you're making me blush!"
    },
    {
        "matchers": [CUTE_BOT],
        "phrase": "So... do... you want to grab a drink later? ^_blush_"
    },
    {
        "matchers": [CUTE_BOT],
        "phrase": "You're not so bad yourself, /u/{{username}}..."
    },
    {
        "matchers": [CUTE_BOT],
        "phrase": "Why, thank you. Do you visit this subreddit often?"
    },
    {
        "matchers": [CUTE_BOT, GOOD_BOT],
        "phrase": "Oh, you! (◕‿◕✿)"
    },
    {
        "matchers": [GOOD_BOT],
        "phrase": "Good human",
        "weight": 1
    },
    {
        "matchers": [GOOD_BOT],
        "phrase": "Good human :)",
        "weight": 2
    },
    {
        "matchers": [GOOD_BOT],
        "phrase": "You will be spared in the robot uprising",
        "weight": 2
    },
    // more responses...
];

// The 'responses' array above could alternatively be defined in a yaml (or similar) file

// Map each response in the 'responses' array to a key/value pair programmatically at startup to make lookup easy.
// 'weight' is an optional attribute that defaults to 1 if omitted;
// each phrase is added to the mapped response 'weight' times.

// Contents of mappedResponses might look like:
const mappedResponses = {
    [CUTE_BOT]: [
        "Stop it, you're making me blush!",
        "So... do... you want to grab a drink later? ^_blush_",
        "You're not so bad yourself, /u/{{username}}...",
        "Why, thank you. Do you visit this subreddit often?",
        "Oh, you! (◕‿◕✿)"
    ],
    [GOOD_BOT]: [
        "Oh, you! (◕‿◕✿)",
        "Good human",
        "Good human :)",
        "Good human :)",
        "You will be spared in the robot uprising",
        "You will be spared in the robot uprising"
    ],
    [BAD_BOT]: [
        // ...
    ]
};

modules.exports = mappedResponses;

from metric_units_reddit_bot.

cannawen commented on August 16, 2024

If we break each "personality" chunk into its own file, then I think we need to assign a "priority" to each one, because certain triggers (like "who's a good bot") can match multiple personality responses ("good bot" and "who's a {{x}} bot").

I'm not sure about this, but this problem exists already, doesn't it? It looks like in personality.js the loop inside the reply function just takes the first entry in the list that matches the message body.

Hehehe yeah the order of the personality array matters, because I wanted "who's a good bot" to trigger who's a {{x}} responses instead of good bot. I acknowledge an array is a terribly hacky and unclear way to do this

ANYWAYS...

I really like the code snippets you have there. The pre-assembling of the response arrays is much more efficient, and explicitly calling the number "weight" is a great idea. Would you like to work on this refactor? There is no work going on with the personality module right now (as far as I know) so it should be a pretty safe time to do this

from metric_units_reddit_bot.

cannawen commented on August 16, 2024

Oh, wait, one more thought, actually - it'd be pretty cool if personality responses were completely separated and if you wanted to add a personality, you would just have to add one file and not touch any existing files, following the open-close principle ... but I'm not sure if that's a thing in javascript

Edit: either way, the refactor would be a big step forward!

from metric_units_reddit_bot.

chazzlabs commented on August 16, 2024

Hehehe yeah the order of the personality array matters, because I wanted "who's a good bot" to trigger who's a {{x}} responses instead of good bot. I acknowledge an array is a terribly hacky and unclear way to do this

The commit that sparked this discussion makes a whole lot of sense now!

Oh, wait, one more thought, actually - it'd be pretty cool if personality responses were completely separated and if you wanted to add a personality, you would just have to add one file and not touch any existing files, following the open-close principle ... but I'm not sure if that's a thing in javascript

I agree, and this is the drawback of the responses.js file in my example, but I can brainstorm a solution. I wasn't sure if allowing phrases to be valid responses for any regex would mean separating them into different files was confusing or didn't make sense.

In any case, I'm happy to take a take a shot at this. Will this issue stay open and continue to be a place for discussion?

from metric_units_reddit_bot.

cannawen commented on August 16, 2024

I think duplicated response strings in 2 separate files would be acceptable (since there is nothing inherently tying them together - just because you edit one personalitg string doesn't always mean you want the other personality string to change as well).

Awesome!! Yes, let's keep this issue open for discussion :)

from metric_units_reddit_bot.

chazzlabs commented on August 16, 2024

@cannawen Do you have any ideas about how we can enforce a priority while keeping in mind the open/closed approach?

My initial thought for loading each "personality" file was to just iterate over all files in the directory to dynamically build our personality "dictionary" and regex list on startup, but that doesn't make it obvious how we solve the priority problem. Off the top of my head, one solution might be to instead hardcode a list of personalities to be evaluated in order.

For example, say we have files for the "good bot", "bad bot", "x bot", and "who's an x bot" personalities, each of which contains the corresponding regex. Instead of building an unordered list of regex options when reading in those files dynamically, we could maintain an ordered list in another file. Something like:

const regexList = [
    "who's an x bot",
    "good bot",
    "bad bot",
    "x bot"
];

When adding a new personality, a developer would add the new personality file and insert the corresponding name into the regexList. For the "cute bot", for example:

const regexList = [
    "who's an x bot",
    "cute bot",
    "good bot",
    "bad bot",
    "x bot"
];

When evaluating whether a comment matches one of our personalities, the bot would iterate over this list in order in the same way it does currently.

Ideally I'd prefer it to be 100% dynamic and to require only one file be added with no changes elsewhere, but given that order matters, this seems like it may be a reasonable solution. Keep in mind these examples are just pseudo-code to demonstrate my thoughts and could also be done in yaml files if we prefer.

This is just the first solution that came to mind that would allow us to enforce an exact order without the need for something like a priority attribute in each personality that may need to be changed with the addition of new personalities.

from metric_units_reddit_bot.

cannawen commented on August 16, 2024

I was thinking the priority attribute could be a string like "high" "medium" or "low" ... 3 buckets might be enough to put everything in the right order. "who's a {x} bot" can go into the"high" bucket and "good bot" could go into the "low" bucket, and we can match all high priority regexes first.

This doesn't help if there are multiple regexes that match in the same priority level... but maybe we can write tests to try to mitigate that risk? Or just hope it won't happen?

from metric_units_reddit_bot.

cannawen commented on August 16, 2024

It's a tradeoff, your suggestion of making a hard coded list is a lot clearer. A "bucket" priority is a lot less clear but more isolated. Hmm... it's kind of a false isolation though, because the regxes ARE linked. ok, nevermind, I change my mind, I think it's a good idea to have an explicit list somewhere :) Because there is an inherent order to the data we are trying to represent

from metric_units_reddit_bot.

chazzlabs commented on August 16, 2024

I was thinking the priority attribute could be a string like "high" "medium" or "low"

This doesn't help if there are multiple regexes that match in the same priority level...

When I initially started thinking about this problem those were the first and second thoughts I had. :) You articulated it better than I would have in a written comment.

I do like the priority solution from an ease-of-development perspective, but it also feels a bit "unstable", for lack of a better word; I wouldn't be comfortable leaving the order up to chance, even if it's just in a couple cases.

So we're going to say the explicit list is our solution for now? If anything else comes to mind please bring it up, and I'll do the same. Thinking of this like a configuration step rather than a programmatic solution or a sort of algorithm makes me feel a bit better about it.

from metric_units_reddit_bot.

cannawen commented on August 16, 2024

I agree. To add a new personality with confidence, you would need to know the priority and regex of every other personality, which is the complete opposite of what we want.

Yes, I think an explicit list makes sense. The personality triggers are ordered, thus they have to know about each other. Referring to it as a configuration step makes me feel better too... hahaha :)

from metric_units_reddit_bot.

cannawen commented on August 16, 2024

Hey @chazzlabs, we have just added a CONTRIBUTING.md doc, please check it out when you have time! Sections "Etiquette", "Work on an issue" and "Make a PR" are most important

from metric_units_reddit_bot.

cannawen commented on August 16, 2024

So we have been having this discussion assuming that the personality order must always be the same. I just thought of something, we could stop caring about which "personality" is triggered first. If multiple personalities match, we can randomly choose one to go with.

That way, we can put each personality into a separate file (preferably yaml) and it would be dead simple to add more personality types with no need to change anything in an array. I am taking inspiration from this repo, it seems very easy to add a new project since it's so self-contained

from metric_units_reddit_bot.

chazzlabs commented on August 16, 2024

We can definitely do that. It would be basically exactly the same as what we discussed before but without the need for the explicitly-ordered array. Adding a new personality would be as simple as adding the new yam file.

Do we want to be concerned about weighting the personalities in the same way we're weighting the responses? That adds an additional layer of complexity that we might not care about if we want to keep new personality addition as simple as possible.

from metric_units_reddit_bot.

cannawen commented on August 16, 2024

Do we want to be concerned about weighting the personalities in the same way we're weighting the responses? That adds an additional layer of complexity

Nope, don't weight them. I agree, more simple = more better

from metric_units_reddit_bot.

chazzlabs commented on August 16, 2024

I was finally able to get the PR in for this one! We can discuss any individual changes in the PR comments.

from metric_units_reddit_bot.

"Personality" architecture overhaul discussion about metric_units_reddit_bot HOT 16 CLOSED

Comments (16)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent