Comments (10)

ssokolow commented on September 21, 2024 4

This isn't an exhaustive list of all bad strings; it's a list of examples.

You test your code with each of these to try to shake out bugs and, once you have identified the bugs, you write a properly comprehensive fix.

(e.g. if Æ were included in the list as an example of a Unicode character and your program didn't support Unicode, checking .contains() for Æ wouldn't solve your problem for the other 128,000+ Unicode characters. The proper solution would depend on how you were using the text.)

from big-list-of-naughty-strings.

btseytlin commented on September 21, 2024 4

I wonder how to protect myself from the human injection

parshap commented on September 21, 2024 2

This list is useful for automated testing, not for runtime input validation/sanitization. Here's an example: https://github.com/parshap/node-sanitize-filename/blob/ef1e8ad58e95eb90f8a01f209edf55cd4176e9c8/test.js#L259-L262

jyounus commented on September 21, 2024

Yeah, that's what I was thinking with the example you provided; I'm looking for something more generic.

I'm sure this is a huge and common thing that needs to be implemented in different systems. Aren't there well-established libraries available that help you out with this sort of thing? Maybe I'm thinking of something different here (input sanitisation? input validation?)

ssokolow commented on September 21, 2024

The BLNS is for testing your input sanitization.

Unfortunately, I don't write web apps in Node.js (I use Python, PHP, and various other languages), so I can't suggest an input sanitization library/framework off the top of my head.

jfinkhaeuser commented on September 21, 2024

One of the issues with automating this kind of thing is that it makes the number of test cases explode.

Let's say you have a hundred bad strings. That's a hundred test cases, right? Well, no... the only way to make sure that your input doesn't break anything is to test every combination of bad inputs across all parameters of a form/API call.

That means your number of test cases is (number of bad strings)^(number of string parameters) for each such form/API call.
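To make that growth concrete, here is a small Python sketch of the count. The strings and parameter names are made up for illustration (they are not taken from the list or from any real API); `itertools.product` enumerates every assignment of one bad string to each string parameter.

```python
from itertools import product

# Illustrative stand-ins, not entries copied from the actual list.
bad_strings = ["", "Æ", "100%", "<script>"]
# Hypothetical string parameters of a single form/API call.
parameters = ["title", "description", "tag"]

# Every way of assigning one bad string to each parameter.
cases = [
    dict(zip(parameters, combo))
    for combo in product(bad_strings, repeat=len(parameters))
]

# The count matches (number of bad strings)^(number of string parameters).
assert len(cases) == len(bad_strings) ** len(parameters)
print(len(cases))  # 4 ** 3 = 64
```

With a hundred strings and three string parameters, the same calculation gives 100^3 = 1,000,000 cases for a single call.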

Very few people take the time to test through that, even if the test cases can be generated automatically from some kind of spec.

That said, yes, it would be nice to see something like this, right?

ssokolow commented on September 21, 2024

@euphe Careful UI design. It really depends on the specific case.

JC5 commented on September 21, 2024

Testing frameworks such as PHPUnit can use this list as a "data provider". Here's some code for PHPUnit:

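    /**
     * Data provider: yields each naughty string as a single-argument test case.
     *
     * @return array
     */
    public function naughtyStringProvider()
    {
        // blns.base64.json is a JSON array of base64-encoded strings.
        $path    = realpath(__DIR__ . '/../resources/tests/blns.base64.json');
        $content = file_get_contents($path);
        $array   = json_decode($content);
        $return  = [];
        foreach ($array as $entry) {
            // PHPUnit expects one array of arguments per test case.
            $return[] = [base64_decode($entry)];
        }

        return $return;
    }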

When you have a specific function that should accept user input and not break somehow, you can do this (again, in PHPUnit):

    /**
     * @covers       \FireflyIII\Http\Controllers\Transaction\SingleController::store
     * @dataProvider naughtyStringProvider
     */
    public function testStoreNaughty(string $description)
    {
        // ...
    }

This test is called 400+ times automatically, each time with a different string from the naughty list.

It is worth knowing, however, that this specific test (depending on how you set it up) would only test whether your application accepts these strings. It might as well accept them, because many of the strings in the naughty list aren't very naughty per se; they're just inconvenient to read. If a user wants to give a description that's emoticons only, well, sure. That's not a problem per se.

My test case is just an example to show you how you could use this list. It's by no means the only way.
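For anyone not on PHPUnit, the same idea can be sketched in plain Python. This is only an illustration under a few assumptions: `SAMPLE_JSON` is a tiny inline stand-in for the contents of blns.base64.json, the entries are assumed to decode as UTF-8, and `store_description` is a hypothetical function under test, not anything from Firefly III.

```python
import base64
import json

# Inline stand-in for blns.base64.json (a JSON array of base64-encoded
# strings); a real test would read the file instead.
SAMPLE_JSON = '["", "w4Y=", "MTAwJQ=="]'


def load_naughty_strings(raw_json):
    """Decode a JSON array of base64 strings into plain text (UTF-8 assumed)."""
    return [base64.b64decode(entry).decode("utf-8") for entry in json.loads(raw_json)]


def store_description(description):
    """Hypothetical function under test: stores the text and returns the record."""
    return {"description": description}


def run_all(strings):
    """Call the function once per string; collect strings that crash or change."""
    failures = []
    for s in strings:
        try:
            if store_description(s)["description"] != s:
                failures.append(s)
        except Exception:
            failures.append(s)
    return failures


naughty = load_naughty_strings(SAMPLE_JSON)
print(run_all(naughty))  # prints []
```

Unlike a bare "does it accept the string" test, `run_all` also compares what comes back with what went in, which is one way to notice input that was changed along the way.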

ChipWolf commented on September 21, 2024

I suggest including the type of string (reason for error) beside each value.

ssokolow commented on September 21, 2024

@JC5

You'd be surprised. For example, Fanfiction.net's got this stupid overzealous string sanitization which silently strips all percent signs from input, so a chapter containing "I'm 100% woman" would become "I'm 100 woman" without a single warning.

The punctuation used in "plaintext" emotes has its own Scunthorpe problem.
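A round-trip check is one way to catch that kind of silent loss. The Python sketch below is only an illustration: `strip_percent` is a made-up stand-in for the behaviour described above, not Fanfiction.net's actual code, and the sample strings are invented.

```python
def strip_percent(text):
    """Hypothetical overzealous sanitizer: silently drops percent signs."""
    return text.replace("%", "")


def silently_altered(sanitize, samples):
    """Return the samples that come back changed after sanitizing."""
    return [s for s in samples if sanitize(s) != s]


samples = ["I'm 100% woman", "plain text", ":-)"]
print(silently_altered(strip_percent, samples))  # prints ["I'm 100% woman"]
```

Whether a changed string is a bug still depends on the application; the check only makes the change visible instead of silent.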
