mrdogebro / content_filter
A basic but robust content filter for Python.
License: MIT License
Is the default word list empty?
Or is there an option in the library to see the full list of words that will be filtered?
Convert foreign characters into their English equivalents to allow the filter to detect words even when letters have been replaced with foreign lookalikes. Also replace non-QWERTY characters with QWERTY characters. This would mean that a lookalike of e would be converted to e, ç to c, etc.
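One way this request could be approached is sketched below, using only the standard library. It is not the library's implementation: the function name `to_ascii_lookalike` and the `HOMOGLYPHS` table are hypothetical, and the table covers only three Cyrillic letters for illustration.

```python
import unicodedata

# Hypothetical table for lookalikes that Unicode decomposition does not cover
# (here a few Cyrillic letters); a real table would need to be much larger.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
})


def to_ascii_lookalike(text: str) -> str:
    """Map accented and lookalike letters to plain equivalents (sketch)."""
    # NFKD splits a letter such as "ç" into "c" plus a combining cedilla.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace the remaining lookalikes listed in the table.
    return stripped.translate(HOMOGLYPHS)
```

Decomposition handles accented Latin letters, while letters from other scripts that merely look similar need an explicit table like the one assumed here.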
Update the module and its functions to comply with PEP 8.
Make the filter ignore non-printing characters so that they do not interfere with filtering as they do now.
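A minimal sketch of such a preprocessing step, assuming "non-printing" means Unicode control and format characters (categories `Cc` and `Cf`). The function name `strip_non_printing` is hypothetical and not part of the library.

```python
import unicodedata


def strip_non_printing(text: str) -> str:
    """Remove control (Cc) and format (Cf) characters from text (sketch)."""
    # Cc covers characters such as NUL, tab and newline;
    # Cf covers invisible characters such as the zero-width space (U+200B).
    return "".join(
        ch for ch in text if unicodedata.category(ch) not in ("Cc", "Cf")
    )
```

Note that this also removes tabs and newlines, so it would suit a copy of the text used only for matching, not the text shown to users.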
With Content Filter's upcoming introduction into Carberretta (Carberra/Carberretta#137), making the project typed would ensure some potential CI issues are eradicated. It would also provide better autocompletes for those using and contributing to Content Filter.
This would involve:
- mypy --strict standards
- black happy

This would force a drop in support for Python 3.4, but considering it reached end-of-life in March 2019, that's no big deal.
I'll PR these changes in.
Could you add the ability to replace a matching word with the configured censored word? For example, say you have this line in a JSON file:
{ "find": "find", "word": "word", "censored": "censored" }
It would be very useful to have a function that replaced each matching word found with the censored alternative string here or a setting to do this while filtering.
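The requested behaviour might look roughly like the sketch below. The `censor` function is hypothetical and not part of the library, and it assumes that `find` holds the text to look for and `censored` holds its replacement.

```python
import re


def censor(text, entries):
    """Replace each whole-word match of entry["find"] with entry["censored"]."""
    for entry in entries:
        # Assumption: "find" is a literal word, matched case-insensitively.
        pattern = re.compile(
            r"\b" + re.escape(entry["find"]) + r"\b", re.IGNORECASE
        )
        # A function is used as the replacement so that backslashes in the
        # censored string are inserted literally.
        text = pattern.sub(lambda match, e=entry: e["censored"], text)
    return text


entries = [{"find": "find", "word": "word", "censored": "censored"}]
result = censor("You can find it here", entries)
```

With the JSON line above as the only entry, `result` is `"You can censored it here"`.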
Convert the module to be class-based. For example, there would be a filter class that would hold the functions for the filter. This would allow for more object-oriented programming as well as allow multiple different filter configs to be used in a single file.
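As a rough illustration of the idea, a class-based design could look like the following. The class name `Filter`, its constructor argument, and the `check` method are all hypothetical and do not reflect the library's actual API.

```python
class Filter:
    """Hypothetical filter object that holds its own word list."""

    def __init__(self, words):
        # Each instance keeps its own configuration.
        self.words = {word.lower() for word in words}

    def check(self, text):
        """Return the filtered words found in text, in order of appearance."""
        return [word for word in text.lower().split() if word in self.words]


# Two independent configurations used side by side in one file.
strict = Filter(["darn", "heck"])
lenient = Filter(["darn"])
```

Because the word list lives on the instance rather than in module-level state, `strict.check("what the heck")` and `lenient.check("what the heck")` can give different results in the same program.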
I noticed that when I check a string and it returns a match, the indexes don't match the original string. It actually looks like they are the indexes into the string without any whitespace (or at least without any spaces).
For example, say I have the string "hello world censor hello world", and I am trying to match the word "censor". as_list will tell me that the match is at (10, 16). However, if I use those indexes to find the word to replace it, it returns "d cens". So the indexes appear to be offset by 2 in this case, which corresponds to the number of spaces before the word "censor" in the string.
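One possible fix, sketched here under the assumption that the filter searches a whitespace-free copy of the input, is to record each kept character's original position and translate the match back. The function `find_in_original` is hypothetical and not part of the library.

```python
def find_in_original(text, word):
    """Find word in text ignoring whitespace; return original (start, end)."""
    # Keep each non-whitespace character together with its original index.
    kept = [(i, ch) for i, ch in enumerate(text) if not ch.isspace()]
    compact = "".join(ch for _, ch in kept)

    start = compact.find(word)
    if start == -1:
        return None
    end = start + len(word)  # exclusive end within the compact string

    # Translate both ends back to positions in the original string.
    return kept[start][0], kept[end - 1][0] + 1


text = "hello world censor hello world"
span = find_in_original(text, "censor")
```

For the string from the report, the match sits at 10 in the compact copy and maps back to `(12, 18)`, so `text[12:18]` is `"censor"`.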