Comments (10)
This isn't an exhaustive list of all bad strings; it's a list of examples.
You test your code with each of these to shake out bugs and, once you have identified the bugs, you write a properly comprehensive fix.
(E.g., if Æ was included in the list as an example of a Unicode character and your program didn't support Unicode, checking .contains() for Æ wouldn't solve your problem for the other 128,000+ Unicode characters. The proper solution would depend on how you were using the text.)
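As a rough illustration of the point above, here is a minimal Python sketch. It assumes the underlying bug is an ASCII-only storage or transport step, which is only one possible cause. The helper name `survives_round_trip` and the sample strings are made up for this example. Instead of special-casing Æ, it checks whether a string survives the encode/decode step at all.

```python
def survives_round_trip(text: str, encoding: str) -> bool:
    """Return True if text can be encoded and decoded again without loss."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeError:
        return False


# Hypothetical sample inputs; only one of them is the character from the list.
samples = ["plain ascii", "Æ", "日本語", "🖖"]

# The narrow "fix": only notices the one character that happened to be listed.
flagged_by_contains = [s for s in samples if "Æ" in s]

# The broader check: notices every string an ASCII-only step would reject.
flagged_by_round_trip = [s for s in samples if not survives_round_trip(s, "ascii")]

print(flagged_by_contains)
print(flagged_by_round_trip)
```

Whether rejecting, transcoding, or fully supporting such input is the right fix still depends on how the text is used.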
from big-list-of-naughty-strings.
I wonder how to protect myself from human injection.
from big-list-of-naughty-strings.
This list is useful for automated testing, not for runtime input validation/sanitization. Here's an example: https://github.com/parshap/node-sanitize-filename/blob/ef1e8ad58e95eb90f8a01f209edf55cd4176e9c8/test.js#L259-L262
from big-list-of-naughty-strings.
Yeah, that's what I was thinking with the example you provided, but I was looking for something more generic.
I'm sure this is a huge and common thing that needs to be implemented in different systems. Aren't there well-established libraries available that help you out with this sort of thing? Maybe I'm thinking of something different here (input sanitisation? input validation?).
from big-list-of-naughty-strings.
The BLNS is for testing your input sanitization.
Unfortunately, I don't code web apps in NodeJS (I use Python, PHP, and various other languages), so I can't suggest an input sanitization library/framework off the top of my head.
from big-list-of-naughty-strings.
One of the issues with automating this kind of thing is that the number of test cases explodes.
Let's say you have a hundred bad strings. That's a hundred test cases, right? Well, no... the only way to make sure that your input doesn't break anything is to test all inputs on all parameters of a form/API call.
That means your number of test cases is (number of bad strings)^(number of string parameters) for each such form/API call.
Very few people take the time to test through that, even if the test cases can be generated automatically from some kind of spec.
That said, yes, it would be nice to see something like this, right?
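To put numbers on that formula, here is a small Python sketch. The function names, the parameter names, and the three-string sample list are all hypothetical. It computes the count and generates the combinations lazily, so nothing is materialised until a test actually consumes it.

```python
from itertools import islice, product


def count_test_cases(num_bad_strings: int, num_string_params: int) -> int:
    """(number of bad strings) ^ (number of string parameters)."""
    return num_bad_strings ** num_string_params


def iter_test_cases(bad_strings, param_names):
    """Lazily yield one dict per combination of bad strings across all parameters."""
    for combo in product(bad_strings, repeat=len(param_names)):
        yield dict(zip(param_names, combo))


# Hypothetical form with two string parameters and a tiny sample list.
bad_strings = ["", "Æ", "100%"]
params = ["title", "description"]

first_two = list(islice(iter_test_cases(bad_strings, params), 2))

print(count_test_cases(len(bad_strings), len(params)))  # 3 ** 2 = 9
print(count_test_cases(100, 3))                         # 100 ** 3 = 1000000
```

With a hundred strings and only three string parameters, that is already a million cases per form/API call.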
from big-list-of-naughty-strings.
@euphe Careful UI design. It really depends on the specific case.
from big-list-of-naughty-strings.
Testing frameworks such as PHPUnit can use this list as a "data provider". Here's some code for PHPUnit:
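/**
 * Data provider: supplies each naughty string as a single-argument test case.
 *
 * @return array
 */
public function naughtyStringProvider()
{
    // Load the base64-encoded variant of the list.
    $path    = realpath(__DIR__ . '/../resources/tests/blns.base64.json');
    $content = file_get_contents($path);
    $array   = json_decode($content);
    $return  = [];
    foreach ($array as $entry) {
        // PHPUnit expects one array of arguments per test invocation.
        $return[] = [base64_decode($entry)];
    }

    return $return;
}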
When you have a specific function that should accept user input and not break somehow, you can do this (again, in PHPUnit):
/**
 * @covers \FireflyIII\Http\Controllers\Transaction\SingleController::store
 * @dataProvider naughtyStringProvider
 */
public function testStoreNaughty(string $description)
{
    // ...
}
This test is run 400+ times automatically, each time with a different string from the naughty list.
It is worth knowing, however, that this specific test (depending on how you set it up) would only test whether your application accepts these strings. It might as well accept them, because many of the strings in the naughty list aren't very naughty per se; they're just inconvenient to read. If a user wants to give a description that's emoticons only, well, sure. That's not a problem per se.
My test case is just an example to show you how you could use this list. It's by no means the only way.
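For readers not using PHPUnit, the same idea can be sketched in plain Python. This is a rough analogue rather than the approach used above. `store_description` is a hypothetical stand-in for the code under test, `find_crashes` is a made-up helper, and the inline sample replaces reading the real blns.base64.json from disk. Decoding with `errors="replace"` is an assumption, so an entry that is not valid UTF-8 does not abort loading.

```python
import base64
import json
from typing import Callable, List, Tuple


def load_naughty_strings(blns_base64_json: str) -> List[str]:
    """Decode the text of a blns.base64.json-style file into a list of strings."""
    return [
        base64.b64decode(entry).decode("utf-8", errors="replace")
        for entry in json.loads(blns_base64_json)
    ]


def store_description(description: str) -> str:
    """Hypothetical stand-in for the function that accepts user input."""
    return description.strip()


def find_crashes(func: Callable[[str], object], inputs: List[str]) -> List[Tuple[str, Exception]]:
    """Call func once per input and collect every input that raises."""
    failures = []
    for value in inputs:
        try:
            func(value)
        except Exception as exc:  # any exception counts as a failure here
            failures.append((value, exc))
    return failures


# Inline sample in the same shape as blns.base64.json (a JSON array of base64 strings).
sample = json.dumps([
    base64.b64encode(s.encode("utf-8")).decode("ascii")
    for s in ["undefined", "Æ", "<script>alert(1)</script>"]
])

naughty = load_naughty_strings(sample)
crashes = find_crashes(store_description, naughty)
print(len(naughty), "strings tried,", len(crashes), "crashes")
```

Like the PHPUnit version, this only shows that the function does not blow up. It does not check that the stored or rendered result is actually correct.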
from big-list-of-naughty-strings.
I suggest including the type of string (reason for error) beside each value.
from big-list-of-naughty-strings.
You'd be surprised. For example, Fanfiction.net's got this stupid overzealous string sanitization which silently strips all percent signs from input, so a chapter containing "I'm 100% woman" would become "I'm 100 woman" without a single warning.
The punctuation used in "plaintext" emotes has its own Scunthorpe problem.
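A small Python sketch of why silent stripping is lossy. It assumes, purely for illustration, that the percent sign is a problem because the text ends up in a URL. Nothing here is known about how Fanfiction.net actually works, and the function names are made up. Encoding for the target context keeps the information, while stripping destroys it.

```python
from urllib.parse import quote, unquote


def strip_percent(text: str) -> str:
    """The lossy approach described above: silently drop every percent sign."""
    return text.replace("%", "")


def encode_for_url(text: str) -> str:
    """Context-specific encoding: percent-encode instead of deleting characters."""
    return quote(text, safe="")


original = "I'm 100% woman"

stripped = strip_percent(original)   # information is gone for good
encoded = encode_for_url(original)   # safe to embed in a URL
restored = unquote(encoded)          # decodes back to the original text

print(stripped)
print(encoded)
print(restored == original)
```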
from big-list-of-naughty-strings.