marksweb / django-bleach

Bleach is a Python module that takes any HTML input, and returns valid, sanitised HTML that contains only an allowed subset of HTML tags, attributes and styles. django-bleach is a Django app that makes using bleach extremely easy.

License: MIT License

Python 96.92% HTML 3.08%

django-bleach's Introduction

django-bleach - Bleach and sanitise user HTML

Read the documentation here.

Setup

  1. Install django-bleach via pip:

    pip install django-bleach
    
  2. Add django-bleach to your INSTALLED_APPS:

    INSTALLED_APPS = [
        # ...
        'django_bleach',
        # ...
    ]
  3. Select some sensible defaults for the allowed tags, attributes and styles, and the behaviour when unknown tags are encountered. Each of these is optional and defaults to the bleach default. See the bleach documentation:

    # Which HTML tags are allowed
    BLEACH_ALLOWED_TAGS = ['p', 'b', 'i', 'u', 'em', 'strong', 'a']
    
    # Which HTML attributes are allowed
    BLEACH_ALLOWED_ATTRIBUTES = ['href', 'title', 'style']
    
    # Which CSS properties are allowed in 'style' attributes (assuming
    # style is an allowed attribute)
    BLEACH_ALLOWED_STYLES = [
        'font-family', 'font-weight', 'text-decoration', 'font-variant']
    
    # Strip unknown tags if True, replace with HTML escaped characters if
    # False
    BLEACH_STRIP_TAGS = True
    
    # Strip comments, or leave them in.
    BLEACH_STRIP_COMMENTS = False
  4. Select the default widget for bleach fields. This defaults to django.forms.Textarea, but you will probably want to replace it with a WYSIWYG editor, or something similar:

    # Use a WYSIWYG widget for bleached HTML fields
    BLEACH_DEFAULT_WIDGET = 'wysiwyg.widgets.WysiwygWidget'

    I use django-ckeditor in my projects, but what you use is up to you.
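A dotted path such as the one in BLEACH_DEFAULT_WIDGET has to be resolved into a class at runtime. The sketch below shows that kind of resolution using only the standard library. `resolve_dotted_path` is a hypothetical helper written for illustration (Django ships `django.utils.module_loading.import_string` for the same job), and how django-bleach itself loads the widget is not shown in this README.

```python
from importlib import import_module


def resolve_dotted_path(dotted_path):
    """Import 'package.module.ClassName' and return the named attribute."""
    module_path, _, attr_name = dotted_path.rpartition(".")
    if not module_path:
        raise ImportError(f"{dotted_path!r} does not look like a module path")
    module = import_module(module_path)
    try:
        return getattr(module, attr_name)
    except AttributeError as exc:
        raise ImportError(
            f"Module {module_path!r} does not define {attr_name!r}"
        ) from exc


# A standard-library class stands in for a widget class here.
ordered_dict_cls = resolve_dotted_path("collections.OrderedDict")
```

A misspelled path fails with an ImportError, which is worth checking for when a configured widget does not show up.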

Usage

In your models

django-bleach provides three ways of creating bleached output. The simplest way of including user-editable HTML content that is automatically sanitised is by using the BleachField model field:

# in app/models.py

from django.db import models
from django_bleach.models import BleachField

class Post(models.Model):

    title = models.CharField(max_length=255)  # max_length is required; 255 is an arbitrary choice
    content = BleachField()

    # ...

BleachField takes the following arguments, to customise the output of bleach. See the bleach documentation for their use:

  • allowed_tags
  • allowed_attributes
  • strip_tags
  • strip_comments
  • css_sanitizer

The following argument will be deprecated in the near future:

  • allowed_styles

In addition to the bleach-specific arguments, the BleachField model field accepts all of the normal field attributes. Behind the scenes, it is a TextField, and accepts all the same arguments as the default TextField does.

The BleachField model field sanitises its value before it is saved to the database and is marked safe so it can be immediately rendered in a template without further intervention.

In model forms, BleachField model fields are represented with the BleachField form field by default.

In your forms

A BleachField form field is provided. This field sanitises HTML input from the user and presents safe, clean HTML to your Django application. The returned value is marked safe for immediate rendering.

In your templates

If you have a piece of content from somewhere that needs to be printed in a template, you can use the bleach filter:

{% load bleach_tags %}

{{ some_unsafe_content|bleach }}

If the filter has no arguments, it uses the default settings defined in your application settings. You can override the allowed tags by passing them as a parameter to the filter:

{{ some_unsafe_content|bleach:"p,span" }}
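As a rough illustration of the argument format, a comma-separated string like "p,span" can be turned into a list of tag names as shown below. `parse_tag_argument` is a hypothetical helper, not part of django-bleach. Stripping whitespace and dropping empty entries are added conveniences here, and the real filter may treat them differently.

```python
def parse_tag_argument(argument):
    """Split a comma-separated filter argument into a list of tag names."""
    return [tag.strip() for tag in argument.split(",") if tag.strip()]


allowed = parse_tag_argument("p,span")
```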

There is also bleach_linkify, which uses the linkify function of bleach to convert URL-like strings in an HTML fragment to links.

This function converts strings that look like URLs, domain names and email addresses in text that may be an HTML fragment to links, while preserving:

  1. links already in the string
  2. urls found in attributes
  3. email addresses

django-bleach's People

Contributors

ad-m, adamchainz, alirezaja1384, arnoutdemooij, askeyt, blag, chrisgrande, codacy-badger, debdolph, denisroldan, dependabot[bot], dxist, joopeed, kz26, laityned, marksweb, mrkgrgsn, mx-moth, pegler, physicistsouravdas, pre-commit-ci[bot], seler, spenserblack, wsvincent, zubux

django-bleach's Issues

Allow tag filtering

Please add a way to filter only a select few HTML tags. I want to use this to filter out only script tags and I end up having to list out every HTML tag except the script tag.

Form field kwargs get ignored

Describe the bug
If you define kwargs for a form field, models.BleachField.formfield doesn't always pass on those kwargs.

To Reproduce
Set a custom widget for a BleachField with no choices defined

Expected behavior
The kwargs should be updated, not overridden so that anything set outside of django-bleach isn't ignored & lost.
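The "update, don't override" behaviour requested here can be sketched with plain dictionaries. `build_formfield_kwargs` and the example values are hypothetical and only illustrate the merge order; this is not the django-bleach implementation.

```python
def build_formfield_kwargs(field_defaults, **kwargs):
    """Merge caller kwargs over the field's defaults without losing either."""
    merged = dict(field_defaults)  # copy so the defaults are not mutated
    merged.update(kwargs)          # caller-supplied values take precedence
    return merged


defaults = {"label": "Content", "widget": "Textarea"}
merged = build_formfield_kwargs(
    defaults, widget="CustomWidget", help_text="HTML allowed"
)
```

With this order, a widget passed in from outside replaces the default one, while defaults the caller did not mention are kept.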

Screenshots
You can see here, a widget is passed in through kwargs but then kwargs aren't used in the returned field.

Screenshot 2021-07-27 at 11 11 58

Codecov not reporting correctly

Describe the bug
Coverage from tox reports all the files in the application correctly, however codecov which is used by github actions only includes the __init__.py and forms.py files;
https://app.codecov.io/gh/marksweb/django-bleach/

To Reproduce
Steps to reproduce the behavior:

  1. Go to https://app.codecov.io/gh/marksweb/django-bleach/
  2. Scroll down to the files section at the bottom
  3. See list of files included in results

Expected behavior
A list of all files in the app, similar to;

__init__.py
forms.py
models.py
templatetags/__init__.py
templatetags/bleach_tags.py
utils.py

Screenshots
Screenshot 2021-06-15 at 11 12 22

Setting `BLEACH_DEFAULT_WIDGET` does not change the widget

Describe the bug

I would like to use django-bleach along with django-ckeditor. I added the following line in my settings:

BLEACH_DEFAULT_WIDGET = "ckeditor_uploader.widgets.CKEditorUploadingWidget"

However, the widget displayed in django admin is still the text area widget.

To Reproduce
Steps to reproduce the behavior:

  1. Install django-bleach and django-ckeditor
  2. Add the following line in settings:
    BLEACH_DEFAULT_WIDGET = "ckeditor_uploader.widgets.CKEditorUploadingWidget"
  3. Add a model with BleachField and display it in django-admin
  4. The widget for the field is a regular textarea

Expected behavior
I should see the CKEditor widget:

Screenshots
image

Additional context
I debugged through the fields and it appears that the correct widget is set here:

self.widget = get_default_widget()

But kwargs contains a widget value set to django.contrib.admin.widgets.AdminTextareaWidget, and self.widget is then reset to that widget at this line:

super().__init__(*args, **kwargs)

Form field doesn't respect empty_value argument

Describe the bug
The form field always returns None if no value was submitted and if it's for a model field with null=False the DB raises an IntegrityError.

To Reproduce

class MyModel(Model):
    bleached = models.BleachField(blank=True, null=False)

class MyModelForm(ModelForm):
    class Meta:
        model = MyModel
        fields = "__all__"

form = MyModelForm(data={}, instance=my_model)
form.save()

Expected behavior
I expect that BleachField.to_python returns self.empty_value if the value is considered empty.

The empty_value argument controls whether CharField objects return None or an empty string when no value was provided by the user and, for model forms, ensures that the field returns the correct value for the model field configuration and chosen DB.
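A minimal sketch of the expected behaviour, written without Django so it can be run on its own. `SketchBleachField`, `EMPTY_VALUES` and the injected `clean` callable are stand-ins introduced for illustration; the real form field would call bleach and inherit its empty values from Django.

```python
# Values treated as "nothing submitted" in this sketch.
EMPTY_VALUES = (None, "", [], (), {})


class SketchBleachField:
    """Stand-in for a form field that honours empty_value."""

    def __init__(self, empty_value="", clean=None):
        self.empty_value = empty_value
        # clean is a placeholder for the real sanitising function.
        self._clean = clean or (lambda text: text)

    def to_python(self, value):
        # Return the configured empty value instead of cleaning nothing.
        if value in EMPTY_VALUES:
            return self.empty_value
        return self._clean(value)
```

With `empty_value=""` a missing value becomes an empty string, which a column declared with null=False can store; with `empty_value=None` it stays None.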

Form field marks cleaned data as template safe

Is your feature request related to a problem? Please describe.
As I mentioned in the description of #25, when displaying a confirmation page at the end of a django-formtools wizard, the wizard renders the submitted form values to the page. When using the BleachField form field, these values have been bleached and thus are safe for rendering without escaping, however they are not marked as safe.

Describe the solution you'd like
I propose that the form field marks values as safe after bleaching so they can be rendered without explicit intervention by calling mark_safe.

Describe alternatives you've considered
The implementation has several options:

  1. Provide a child of the BleachField model field that performs the mark_safe.
  2. The existing BleachField form field always performs mark_safe after bleaching.
  3. The existing BleachField form field accepts an optional argument that directs it to perform mark_safe after bleaching.

Option 1 is backwards compatible but would require projects to override the form field class for model forms using Meta.field_classes, which will be fine once #25 is fixed.

Option 2 changes types, although I would expect this to have no impact, and would provide the benefits of the new behaviour to projects without local changes. However, projects that already escape the value, perhaps to render the markup itself, would be affected. I can't imagine what the use case for doing this would be, so perhaps it's purely hypothetical.

Option 3 is also backwards compatible and would also require projects to modify model forms to get the new behaviour, but the changes are more convoluted than for Option 1, because you would need to declare the whole form field on the model form instead of just providing a new class in Meta.field_classes.

I would prefer Option 2 because it suits my project but maybe there is some risk of breaking current usage with it, so the risk averse choice would be Option 1.

Bleach v. 5.0 is not compatible with django-bleach 1.0

Describe the bug
An error is raised while saving a model with a django_bleach.models.BleachField.
Our code was working until bleach released v. 5.0 yesterday (7th of April 2022).

To Reproduce
Steps to reproduce the behavior:

  1. Get a fresh venv with django-bleach 1.0.0 and its dependencies.
  2. Add a django-bleach field to a model.
  3. Save a value in the field.

Expected behavior
Saving doesn't raise the error.

Stacktrace

  File "/usr/local/lib/python3.8/site-packages/django/db/models/base.py", line 739, in save
    self.save_base(using=using, force_insert=force_insert,
  File "/usr/local/lib/python3.8/site-packages/model_utils/tracker.py", line 375, in inner
    return original(instance, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/django/db/models/base.py", line 776, in save_base
    updated = self._save_table(
  File "/usr/local/lib/python3.8/site-packages/django/db/models/base.py", line 881, in _save_table
    results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
  File "/usr/local/lib/python3.8/site-packages/django/db/models/base.py", line 919, in _do_insert
    return manager._insert(
  File "/usr/local/lib/python3.8/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 1270, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
  File "/usr/local/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1415, in execute_sql
    for sql, params in self.as_sql():
  File "/usr/local/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1358, in as_sql
    value_rows = [
  File "/usr/local/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1359, in <listcomp>
    [self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields]
  File "/usr/local/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1359, in <listcomp>
    [self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields]
  File "/usr/local/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1310, in pre_save_val
    return field.pre_save(obj, add=True)
  File "/usr/local/lib/python3.8/site-packages/django_bleach/models.py", line 55, in pre_save
    clean_value = clean(data, **self.bleach_kwargs) if data else ""
TypeError: clean() got an unexpected keyword argument 'styles'

Work around
Force the use of bleach 4.1.0 in your requirements.txt

django-bleach
bleach==4.1.0

Support for bleach 6

Bleach 6 has been released and is due to be the final version of bleach because it is now deprecated due to html5lib being unmaintained. Further details here.

The significant changes required to support version 6 are;

  • bleach.clean, bleach.sanitizer.Cleaner,
    bleach.html5lib_shim.BleachHTMLParser: the tags and protocols
    arguments were changed from lists to sets.
    bleach.clean(
        "some text",
        tags={"a", "p", "img"},
        #    ^               ^ set
        protocols={"http", "https"},
        #         ^               ^ set
    )
  • bleach.linkify, bleach.linkifier.Linker: the skip_tags and
    recognized_tags arguments were changed from lists to sets.
     bleach.linkify(
         "some text",
         skip_tags={"pre"},
         #         ^     ^ set
     )
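One way to bridge list-based settings and the set-based arguments is to convert at the point where the bleach keyword arguments are built. `as_set` and `clean_kwargs` are hypothetical names used only for this sketch, and the settings values are examples.

```python
def as_set(value):
    """Return an iterable of names as a set, leaving None untouched."""
    if value is None:
        return None
    return set(value)


# Settings are often written as lists, as in the Setup section.
BLEACH_ALLOWED_TAGS = ["p", "b", "i", "a"]
BLEACH_ALLOWED_PROTOCOLS = ["http", "https"]  # example value

clean_kwargs = {
    "tags": as_set(BLEACH_ALLOWED_TAGS),
    "protocols": as_set(BLEACH_ALLOWED_PROTOCOLS),
}
```

Note that `set("pre")` would split a bare string into characters, so a single tag should still be given inside a list or set.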

Release 0.5.0 has no entry in CHANGELOG

Describe the bug
In CHANGELOG.md, there is no mention of release 0.5.0 that was just published.

To Reproduce

  1. View CHANGELOG.md
  2. See that there is no mention of 0.5.0

Expected behavior
Explanation of what changed in 0.5.0 in CHANGELOG.md

ckeditor

Hello,

I am using CKEditor for RichTextField.
How should I change the command below to use it for CKEditor? Is then this used by default for this RichTextField?
Should I do anything else?

Thank you!
Martin

BLEACH_DEFAULT_WIDGET = 'wysiwyg.widgets.WysiwygWidget'

BleachField form field tries to clean None values

Describe the bug
If the BleachField.to_python() is passed a value of None it passes this straight to bleach.clean() which doesn't accept None since 2.1 (ref) and raises a TypeError.

To Reproduce
Using Python 2.7.12, Django 1.11.20 and django_bleach 0.5.1.

Steps to reproduce the behavior:

  1. Get a form that contains a BleachField
  2. Use the browser's dev tools to delete the textarea for the BleachField
  3. Submit the form
  4. See error

Expected behavior
BleachField.to_python() returns self.empty_value if value is None. This would be consistent with Django's CharField (code).

Model form fields don't respect BLEACH_DEFAULT_WIDGET setting

Describe the bug
Model form fields instantiated without a widget argument always have a widget of class Textarea regardless of the value of BLEACH_DEFAULT_WIDGET.

To Reproduce
Here is a test that exposes the issue:

from django import forms
from django.test import TestCase, override_settings

from testproject.models import Person
from testproject.forms import CustomBleachWidget

class TestBleachField(TestCase):

    @override_settings(BLEACH_DEFAULT_WIDGET='testproject.forms.CustomBleachWidget')
    def test_model_form_widget_type(self):
        form = forms.modelform_factory(Person, fields='__all__')()
        self.assertIsInstance(form.fields['biography'].widget, CustomBleachWidget)

Expected behavior
As documented, BleachField form fields should use a widget of the class defined in BLEACH_DEFAULT_WIDGET, unless it is overridden.

Additional context
I think this was a regression in #26, I'm sorry (in my defense, none of the existing tests for custom widgets work as intended). That change delegated form field instantiation to upstream classes which apply a default Textarea widget. Patching BleachField.formfield() to set the widget kwarg to the custom setting should be all that is required.

Replace Bleach by NH3

Since Bleach 6.0 will be the last release and we want the awesome Django Bleach to stay alive...

Bleach could perhaps be replaced by NH3?
https://github.com/messense/nh3

NH3 is the Python Binding for Ammonia.
Ammonia is a whitelist-based HTML sanitization library.
Ammonia is written in Rust and a little Benchmark showed 15x faster performance.
https://github.com/rust-ammonia/ammonia

Looks like they were inspired by Bleach to do it, it could be our solution, what do you think ? ^_^

Use forms.BleachField when models.BleachField is used

When defining a model field using models.BleachField, it won't automatically reflect on the ModelForm field. Instead, the form field has to be defined separately using forms.BleachField. I found this confusing initially given Django's "don't repeat yourself" style.

This can be solved by defining a formfield method in the models.BleachField class (documentation).

Is there a reason why things are the way they are? If you are happy with this change, I will submit a pull request.

Model field formfield method raises exception if passed a form_class argument

Describe the bug
The model field formfield method raises an exception if passed a form_class argument:

   File "ve/lib/python3.6/site-packages/django_bleach/models.py", line 47, in formfield
    return forms.BleachField(**kwargs)
  File "ve/lib/python3.6/site-packages/django_bleach/forms.py", line 51, in __init__
    super(BleachField, self).__init__(*args, **kwargs)
  File "ve/lib/python3.6/site-packages/django/forms/fields.py", line 214, in __init__
    super().__init__(**kwargs)
TypeError: __init__() got an unexpected keyword argument 'form_class'

This prevents overriding the form field used for a BleachField model field in model forms.

To Reproduce
The bug can be reproduced by instantiating a model form and overriding the default form field class for a bleach model field:

class MyModel(Model):
    bleached = models.BleachField(...)

class MyBleachFormField(forms.BleachField):
    ...

class MyModelForm(ModelForm):
    class Meta:
        model = MyModel
        field_classes = {"bleached": MyBleachFormField}

Expected behavior
I expect the form_class argument to be used the same way it is by core Django model fields, ie, it overrides the default form field. See the base definition of formfield.
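The expected handling can be sketched with stand-in classes: `form_class` is consumed by `formfield()` to pick the class and is never forwarded to the form field's constructor. All class names below are hypothetical and no Django code is involved.

```python
class DefaultFormField:
    """Stand-in for the default form field."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class CustomFormField(DefaultFormField):
    """Stand-in for a project-specific subclass."""


class SketchModelField:
    """Stand-in for a model field whose formfield() honours form_class."""

    def formfield(self, form_class=None, **kwargs):
        # form_class selects the class; it is not passed on as a kwarg,
        # so the form field's __init__ never sees it.
        field_class = form_class or DefaultFormField
        return field_class(**kwargs)


field = SketchModelField().formfield(form_class=CustomFormField, label="Bio")
```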

Additional context
This bug impacts my project because the project uses a custom form field, a child of BleachField that marks the submitted value in cleaned_data as template safe so it can be used in templates without having to explicitly mark it as safe (we try to avoid ad-hoc calls to mark_safe in our codebase). The use case is a django-formtools wizard confirmation page, where the wizard renders the submitted form values to the page for the user to check before the forms are saved and committed.

I'm happy to write a patch.

Propose that the model field marks retrieved data as template safe

Is your feature request related to a problem? Please describe.
The problem here is similar to #27 in that when writing templates that render model instances with bleached data fields you need to explicitly mark each field as template safe. I think ad-hoc use of mark_safe is a bad practice and better practice would be to do it systematically.

I would have proposed this in #27 but at the time I was thinking that the model field already did this. I realised afterward that this is actually a local modification. Sorry for the spam!

Describe the solution you'd like
I would like the BleachField model field to mark all data retrieved from the DB as template safe in from_db_value(). The data is bleached before saving so anything read back is safe.

Describe alternatives you've considered
The default position is that the field doesn't mark anything safe and you need to mark bleached content as template safe every time it's used in a template. This introduces an unnecessary human element in my view.

Make `bleach_linkify` safe

Is your feature request related to a problem? Please describe.

Right now, html|bleach is marked as safe, but html|bleach_linkify is an unsafe, raw string.

I guess this is intended behavior given the test linked below, but this seems like strange behavior to me

'{{ link_this|bleach_linkify|safe }}'

Describe the solution you'd like

Just like the bleach template tag, I'd like bleach_linkify to return a safestring, not a raw string.

Describe alternatives you've considered

As the linked test shows, html|bleach_linkify|safe seems to be OK (in my usage I actually had to use html|bleach|bleach_linkify|safe).

Additional context
AFAIK bleach does support a way to both sanitize HTML and linkify in one pass. Maybe that's a possibility?

Model formfield() does not honor model field blank attribute making form field always required

Describe the bug

When blank=True is set on the model field, field.formfield() is expected to return a form field with required=False. However, that is not honored.

To Reproduce

Steps to reproduce the behavior:

  1. Create model
  2. Add bleached field with blank=True
  3. Add it to django admin
  4. Check in django admin if field is required

Expected behavior

It should appear as not required but it appears as required

Context

Looks like it's related to

if not self.choices:
    return forms.BleachField(
        label=self.verbose_name,
        max_length=self.max_length,
        allowed_tags=self.bleach_kwargs.get("tags"),
        allowed_attributes=self.bleach_kwargs.get("attributes"),
        allowed_styles=self.bleach_kwargs.get("styles"),
        allowed_protocols=self.bleach_kwargs.get("protocols"),
        strip_tags=self.bleach_kwargs.get("strip"),
        strip_comments=self.bleach_kwargs.get("strip_comments"),
    )

It was recently changed to return BleachField directly, which bypasses the super() call where the base Field correctly interprets other Django model field attributes such as blank:

https://github.com/django/django/blob/56f9579105c324ff15250423bf9f8bdf1634cfb4/django/db/models/fields/__init__.py#L908-L914

(copy from link since it does not render inline)

    def formfield(self, form_class=None, choices_form_class=None, **kwargs):
        """Return a django.forms.Field instance for this field."""
        defaults = {
            'required': not self.blank,
            'label': capfirst(self.verbose_name),
            'help_text': self.help_text,
        }

ModuleNotFoundError

I have followed all steps to install and setup and I get this error:

ModuleNotFoundError: No module named 'django_bleachhealth_check'
