morty's Introduction

Morty

Build Status License: AGPL v3 Docker Pulls

Web content sanitizer proxy as a service

Morty rewrites web pages to exclude malicious HTML tags and attributes. It also replaces external resource references to prevent third party information leaks.

The main goal of morty is to provide a result proxy for searx, but it can be used as a standalone sanitizer service too.

Features:

  • HTML sanitization
  • Rewrites HTML/CSS external references to locals
  • JavaScript blocking
  • No Cookies forwarded
  • No Referrers
  • No Caching/Etag
  • Supports GET/POST forms and IFrames
  • Optional HMAC URL verifier key to prevent service abuse

Installation and setup

Requirement: Go version 1.10 or higher.

$ go get github.com/asciimoo/morty
$ "$GOPATH/bin/morty" --help

Usage

  -debug
        Debug mode (default true)
  -followredirect
        Follow HTTP GET redirect
  -ipv6
        Allow IPv6 HTTP requests (default true)
  -key string
        HMAC url validation key (base64 encoded) - leave blank to disable validation
  -listen string
        Listen address (default "127.0.0.1:3000")
  -proxy string
        Use the specified HTTP proxy (ie: '[user:pass@]hostname:port'). Overrides -socks5, -ipv6.
  -proxyenv
        Use a HTTP proxy as set in the environment (HTTP_PROXY, HTTPS_PROXY and NO_PROXY). Overrides -proxy, -socks5, -ipv6.
  -socks5 string
        Use a SOCKS5 proxy (ie: 'hostname:port'). Overrides -ipv6.
  -timeout uint
        Request timeout (default 5)
  -version
        Show version

Environment variables

Morty can additionally be configured using the following environment variables:

  • MORTY_ADDRESS: Listen address (default to 127.0.0.1:3000)
  • MORTY_KEY: HMAC url validation key (base64 encoded) to prevent direct URL opening. Leave blank to disable validation. Use openssl rand -base64 33 to generate.
  • DEBUG: Enable/disable proxy and redirection logs (default to true). Set to false to disable.

Docker

docker run -e DEBUG=false -e MORTY_ADDRESS=0.0.0.0:3000 dalf/morty
docker run -e DEBUG=false dalf/morty -listen 0.0.0.0:3000

Test

$ cd "$GOPATH/src/github.com/asciimoo/morty"
$ go test

Benchmark

$ cd "$GOPATH/src/github.com/asciimoo/morty"
$ go test -benchmem -bench .

Bugs

Bugs or suggestions? Visit the issue tracker.

morty's People

Contributors

asciimoo, aureq, aveao, dalf, equim-chan, josch, madmath03, pataquets, polys

morty's Issues

Please add support for a configuration file

morty has finally been accepted into Debian! https://tracker.debian.org/news/961042/accepted-morty-010-1-source-amd64-into-unstable-unstable/

The package includes the systemd script in /lib/systemd/system/morty.service with the following content:

[Unit]
Description=morty proxy
Documentation=man:morty(1)

[Service]
User=morty
Group=morty
ExecStart=/usr/bin/morty -listen 127.0.0.1:3000

[Install]
WantedBy=multi-user.target

The listen address is a good default because it doesn't expose morty to the public web. The problem is that, to expose morty to the web, we need to supply a secret key. Currently, the only way to supply that key is to edit the ExecStart line in /lib/systemd/system/morty.service. This is problematic because the content of that file is reset every time the package is updated.

The normal way to handle this situation is to put configuration parameters in /etc/, as these files are not touched by package upgrades. But currently, morty doesn't support a configuration file. Thus, in this issue I want to ask the morty developers to add support for a simple configuration file where the admin can put at least the secret key and the listen address.

Thanks!

Support lazy loading of images

If there is no JavaScript, is it possible to load images that are normally lazy loaded?
Usually there is a data-*src attribute.

Another case: some images are referenced as style="background-image: url("image.jpg");"

MortyProxy specific Donation & current development state

Dear developers, first of all thank you for the great work!

I am considering a donation to MortyProxy, as it is one of my favorite features in searx.
Is that even possible, to specifically donate for MortyProxy? if yes, then please tell me how to proceed.

What about its current development state? Recently it has not been working on "searx.me", and the "proxied" option isn't displayed at all on other public searx instances.
Whenever I try to use "proxied" on "searx.me" I get this error page (pastebin link below):
https://privatebin.net/?cd718dc9016f8811#M/W3WhtNiOtiXxh4ui38VbCy75+0UEjQXeI7WuJF/g0=

Looking forward to reading your reply and comments.

Can we have a release?

Hi,

I'm the Debian maintainer of the searx package. Since morty works well with searx, I'd love to also package it and make it such that it works well in combination with searx.

But packaging it would be much easier and more convenient if morty made a release.

Would you consider making one?

Thanks!

cheers, josch

fonts.googleapis.com : different responses according to the user agent

according to https://developers.google.com/fonts/docs/technical_considerations :

When a browser sends a request for a Fonts API stylesheet (as specified in a tag in your web page), the Fonts API serves a stylesheet generated for the specific user agent making the request.

The Web Font Loader is a JavaScript library that gives you more control over font loading than the Google Fonts API provides. The Web Font Loader also lets you use multiple web font providers.

TODO : check if everything is ok with the current Firefox user agent.

Improve CSS url() regex

Hi,

I found some ways to smuggle CSS url() references past morty so that the user's browser still requests the third-party resources, thus breaking the privacy expectation. I found the following ways:

background-image: url( 'http://127.0.0.1:8000/test11.jpg' );
background-image: \75 \72 \6C ('http://127.0.0.1:8000/test3.jpg');
background-image: \75\72\6C ('http://127.0.0.1:8000/test13.jpg');
background-image: \75r\6C ('http://127.0.0.1:8000/test14.jpg');

Notice the space after the opening bracket in the first example. The other three write some or all letters of url as CSS hex escapes, optionally followed by a space.
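A minimal sketch of one way to catch these cases, assuming the CSS is handled as a plain string: decode CSS hex escapes first, then match url( with optional whitespace. The names decodeCSSEscapes and findCSSURLs are hypothetical and are not morty's actual code; the regular expressions are illustrative and do not cover the full CSS grammar.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// cssEscape matches a backslash, 1-6 hex digits, and one optional
// trailing whitespace character (which CSS treats as part of the escape).
var cssEscape = regexp.MustCompile(`\\([0-9a-fA-F]{1,6})[ \t\n\r\f]?`)

// cssURL matches url( with optional whitespace around the parenthesis
// and an optional opening quote, capturing the URL itself.
var cssURL = regexp.MustCompile(`(?i)url\s*\(\s*['"]?\s*([^'")\s]+)`)

// decodeCSSEscapes replaces hex escapes such as `\75 ` with the
// character they encode (hypothetical helper, not morty's API).
func decodeCSSEscapes(s string) string {
	return cssEscape.ReplaceAllStringFunc(s, func(m string) string {
		sub := cssEscape.FindStringSubmatch(m)
		n, err := strconv.ParseUint(sub[1], 16, 32)
		if err != nil || n == 0 || n > 0x10FFFF {
			return "\uFFFD"
		}
		return string(rune(n))
	})
}

// findCSSURLs returns every URL referenced through url(...) after
// escape decoding (hypothetical helper).
func findCSSURLs(css string) []string {
	var urls []string
	for _, m := range cssURL.FindAllStringSubmatch(decodeCSSEscapes(css), -1) {
		urls = append(urls, m[1])
	}
	return urls
}

func main() {
	samples := []string{
		`background-image: url( 'http://127.0.0.1:8000/test11.jpg' );`,
		`background-image: \75 \72 \6C ('http://127.0.0.1:8000/test3.jpg');`,
		`background-image: \75\72\6C ('http://127.0.0.1:8000/test13.jpg');`,
		`background-image: \75r\6C ('http://127.0.0.1:8000/test14.jpg');`,
	}
	for _, s := range samples {
		fmt.Println(findCSSURLs(s))
	}
}
```

Decoding before matching keeps the URL pattern simple; a real fix would also have to write the proxified URL back into the original text.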

Thanks!

cheers, josch

Remove req.SetConnectionClose()

Use case example: display the searx from wikipedia (infobox in searx).

This involves 3 requests (2 redirects):

Even with #76, the response time could be decreased without closing the HTTP connections, especially on the second request.

I am not sure about resource usage over time (memory, sockets).

A quick question ...

On my private instance of Searx (localhost behind a VPN) the Proxy Morty works on some sites displaying only hyperlinks, text, and some pictures, as it should do. But often I receive this following message:

MortyProxy
Error: timeout
Warning! This instance does not support direct URL opening.

I am just wondering if it's a problem I can solve.

Thank you.

Content-Type:application/xhtml+xml and closing elements

When the content type is application/xhtml+xml, the document should be XML compliant, not SGML/HTML. Browsers (Firefox and Chrome at least) display an error instead of the page.

Example : http://eu.battle.net/forums/fr/wow/

The browser sees nested elements here:

<link rel="shortcut icon" type="image/x-icon" href="./?mortyurl=http%3A%2F%2Feu.battle.net%2Fforums%2Fstatic%2Fimages%2Ficons%2Fwow-favicon.ico">
<link rel="stylesheet" type="text/css" media="all" href="./?mortyurl=http%3A%2F%2Feu.battle.net%2Fforums%2Fstatic%2Fcss%2Fnav-client%2Fnav-client.css%3Fv%3D84">

It should be (with a / at the end):

<link rel="shortcut icon" type="image/x-icon" href="..." />
<link rel="stylesheet" type="text/css" media="all" href="..." />

Moreover, the namespace of the html element is removed by morty. The original is:

<html xmlns="http://www.w3.org/1999/xhtml">

A dirty way to fix the problem is to replace Content-Type:application/xhtml+xml with Content-Type:text/html, but the rendering may be different.

Option to remove some filtering.

It would be nice to have an option that still allows JavaScript and doesn't block any DOM elements, only rewriting the URLs, so that some sites still work.

Installation issue: unrecognized import path "math/bits"

zlayton@dell $ go version
go version go1.7.4 linux/amd64
zlayton@dell $ go get github.com/asciimoo/morty
package math/bits: unrecognized import path "math/bits" (import path does not begin with hostname)

Debian Stretch uses Go 1.7.4 by default, but math/bits requires Go >= 1.9. I recommend specifying the minimum Go version requirement in the Installation section of the readme and in Travis.

IE conditional comments

CSS and HTML in IE conditional comments are not proxified.
They can be harmful if IE is used.

Should morty remove them or parse the content ?

<meta http-equiv='refresh'...> : URL with quote doesn't work.

The standard way is:

<meta http-equiv="refresh" content="0; url=news-nojs.php">

The uncommon way, which still works in browsers:

<meta http-equiv="refresh" content="0; url='news-nojs.php'">

with quotes around the URL.

Example: http://www.bhaskar.com/news/NAT-NAN-demonetisation-500-1000-notes-it-department-sends-notice-news-hindi-5462937-NOR.html (it seems the website uses a reverse proxy with PageSpeed from Google; I don't know whether the redirect is created by this module. If it is, this issue affects all websites using PageSpeed with a similar configuration).
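A minimal sketch of how the content attribute could be parsed so that both forms yield the same URL. The function name refreshURL is hypothetical and this is not morty's actual implementation; it only handles the simple "delay; url=..." shape shown above.

```go
package main

import (
	"fmt"
	"strings"
)

// refreshURL extracts the target URL from the content attribute of a
// <meta http-equiv="refresh"> element, stripping one pair of matching
// single or double quotes if present (hypothetical helper).
func refreshURL(content string) (string, bool) {
	parts := strings.SplitN(content, ";", 2)
	if len(parts) != 2 {
		return "", false // delay only, no URL part
	}
	rest := strings.TrimSpace(parts[1])
	if len(rest) < 4 || !strings.EqualFold(rest[:3], "url") {
		return "", false
	}
	rest = strings.TrimSpace(rest[3:])
	if !strings.HasPrefix(rest, "=") {
		return "", false
	}
	rest = strings.TrimSpace(rest[1:])
	// Remove surrounding quotes only when both ends use the same quote.
	if len(rest) >= 2 && (rest[0] == '\'' || rest[0] == '"') && rest[len(rest)-1] == rest[0] {
		rest = rest[1 : len(rest)-1]
	}
	return rest, rest != ""
}

func main() {
	for _, c := range []string{
		"0; url=news-nojs.php",
		"0; url='news-nojs.php'",
	} {
		u, ok := refreshURL(c)
		fmt.Println(u, ok)
	}
}
```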

How to handle other protocols than http/https ?

Other protocols :


One idea: morty proxifies these URLs, but when the user asks to display the content, morty displays an HTML page containing:

  • the unproxified URL
  • an explanation that the user is about to leave for unproxified content

Links that use the javascript protocol are deleted ( <a href="">... ).

If we had plenty of contributors and plenty of time, morty could proxy the ftp protocol.

[documentation] Purpose of Morty (in Searx context only)

Hello, I have self-hosted Searx with no public instance (LAN only)

Could someone help me understand what Morty does exactly, but only in the context of Searx. I'm not asking about Morty functionality for opening a page link from Searx; only about the added privacy TO Searx via Morty.

  • Does Morty make Search Engine user tracking more difficult? How?
  • Does a local Morty proxy rely on other Morty instances, connecting to them so that Search Engines have millions of queries coming from same IP?
  • What does Morty do that Searx does not already do for Searx traffic?

Thanks for any help.

License?

I'm looking to package Morty as an Arch Linux package but I can't find anything about the code license.

Thanks

Support "safe" js libraries

I'm mainly thinking about frameworks like Bootstrap, which require JS for some types of menus and other graphic elements. Those scripts are often controlled through CSS classes, which means that executing website-specific code is not required to use them.

If morty can detect that they are used, it can deliver a (sanitized) version of the specific library to improve the usability of proxied websites without running unsafe or untrusted code in the browser.

Installing morty for Searx

I have installed the Docker app on my iMac 2015 10.12.6.

How do I install Morty on macOS?

I have installed Searx via Kitematic and can use Searx on my localhost in my Firefox Browser.

However, I am not sure how to install Morty https://github.com/asciimoo/morty

When I type $ go get github.com/asciimoo/morty (as instructed) it returns the following: -bash: go: command not found

Does anybody know what I am doing wrong?

Is there an easy guide to installing and using Morty with Searx?

Any help or suggestion is really appreciated.

Thanks

Some URLs in CSS are not proxified

If there is a space after url(' the URL is not proxified.
Example from instagram.com: url(' //instagramstatic-a.akamaihd.net....

Hide proxied resources in the logs

Unless I missed something in the configuration, Morty currently logs every proxied URL. That is not clean in terms of privacy, as an admin could see everything that has been searched by all users of the instance.

Logging proxied resources is of course useful for development and debugging, but we should have a way to disable it.

For me, this is a blocking issue for using Morty in production.

self-closing unsafe tags break documents

Hi,

I noticed that when the input contains <svg /> or <applet />, the output is cut off at the point where the tag is located. Writing <svg></svg> or <applet></applet> instead gets these unsafe tags filtered out correctly.

morty should also filter out the self-closing versions of unsafe tags without cutting off the document.

Thanks!

cheers, josch

key parameter is not hexadecimal encoded

Hi,

the key parameter is not hexadecimal encoded as the help text suggests:

  -key string
    	HMAC url validation key (hexadecimal encoded) - leave blank to disable

proof:

$ morty -listen 127.0.0.1:3000 -key foobar
$ echo -n 'http://127.0.0.1:8000/' | openssl dgst -sha256 -hmac foobar
$ curl 'http://127.0.0.1:3000/?mortyurl=http://127.0.0.1:8000/&mortyhash=047a8c0a42af40750448bc8b72221e70751d23b82bd973feae03207be0630650'

This suggests that the value of -key is not hexadecimal encoded but just taken as its raw binary value. Another indication for this conclusion is this bug I filed against searx: searx/searx#1310. To give searx the right key, I had to base64-encode the ASCII representation of the key and did not need to convert a hexadecimal key to binary first.
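The openssl command in the proof above computes an HMAC-SHA256 of the URL with the raw key bytes and prints it as hex. A minimal Go sketch of the same computation follows; mortyHash and verifyHash are hypothetical names, and whether morty decodes the key (hex or base64) before use depends on the version, so the key is passed here as raw bytes.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// mortyHash returns the hex-encoded HMAC-SHA256 of target using key as
// raw bytes, mirroring `openssl dgst -sha256 -hmac <key>` (hypothetical helper).
func mortyHash(key []byte, target string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(target))
	return hex.EncodeToString(mac.Sum(nil))
}

// verifyHash checks a hex-encoded hash in constant time (hypothetical helper).
func verifyHash(key []byte, target, hash string) bool {
	got, err := hex.DecodeString(hash)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(target))
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	key := []byte("foobar") // raw key, as in the proof above
	target := "http://127.0.0.1:8000/"
	h := mortyHash(key, target)
	fmt.Println(h, verifyHash(key, target, h))
}
```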

How to deal with URLs in meta information?

Some HTML elements carry meta information for crawlers, social websites, etc.
Some of these HTML elements contain URLs. They are not used by the browser.
Examples :

What should morty do with them?
For now, they contain unproxified URLs.

security bug report

Hi Team,
I found a security bug in your application. Where can I report it so that we can fix it ASAP?
Do you have a bug bounty program or a security team that I can contact?

MortyProxy fail on twitter

Error: error when reading response headers: small read buffer. 
Increase ReadBufferSize. Buffer size=4096, 
contents: "HTTP/1.1 200 OK\r\nCache-Control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0\r\nConnection: close\r\nContent-Length: 418686\r\nContent-Security-Policy: script-src https://connect.facebook."..."om https://twitter.com https://*.twimg.com https://translate.googleapis.com https://ton.twitter.com 'unsafe-inline' https://platform.twitter.com https://maxcdn.bootstrapcdn.com https://netdna.bootstra"

Warning! This instance does not support direct URL opening.

Ubuntu/Debian start script

How is morty supposed to run on Debian/Ubuntu? Some more details on the suggested setup would be helpful:

  • Should it run as a different user?
  • Start with a systemd script? Maybe a template would be great (I am no expert on startup scripts)
  • How do I see if this works? Is there some debugging option that shows if searx works with morty?

Appreciate your time,
RaspyVotan

Morty not doing anything

I have searx installed as a local instance and it is running fine, and also have installed morty and adjusted the /etc/searx/settings.yaml accordingly.

Searching works but I cannot tell if morty is doing anything. There's no output in the terminal morty is running in and the searx output doesn't show anything morty related. Also, if I attach strace to the morty PID, morty remains idle and does nothing when I search.

Is there some logging I'm missing somewhere? What should I be seeing if morty is working correctly?

Content-Disposition header

For MIME types that are not HTML, image or CSS, should the Content-Disposition header be forwarded or set?

Suggestion for URL https://f-droid.org/FDroid.apk :

Content-Disposition: inline; filename="FDroid.apk"

FDroid.apk would be the last part of the URL if the header is not already set.

Opposite direction : cancel the download.

Resolve HTTP redirection

The main goal of morty is to provide a result proxy for searx.
There are two use cases :

  • HTML page proxy
  • image proxy

In the second use case, when an info box includes an image from wikimedia, there are two redirections before the final media is reached.

It would be nice if morty could resolve the redirections itself: it would decrease the response time of searx by avoiding two extra round trips between morty and the browser.

Return HTTP error 504 when there is a timeout?

According to https://en.wikipedia.org/wiki/List_of_HTTP_status_codes , error 504 is described as "Gateway Time-out. The server was acting as a gateway or proxy and did not receive a timely response from the upstream server."

Should morty return a 504 error when there is a timeout? It would make it easier to understand what is going on than seeing a 404 error.

I agree, it's mostly for debugging purposes, but I don't think it's harmful for privacy.

@import url

Using morty, the content of https://desktop.github.com/stylesheets/main.css is:

@import ./?mortyhash=...&mortyurl=https%3A%2F%2Fdesktop.github.com%2Fstylesheets%2Furl%2528octicons%2Focticons.css);
@import ./?mortyhash=...&mortyurl=https%3A%2F%2Fdesktop.github.com%2Fstylesheets%2Furl%2528reset.css);
@import ./?mortyhash=...&mortyurl=https%3A%2F%2Fdesktop.github.com%2Fstylesheets%2Furl%2528base.css);
@import ./?mortyhash=...&mortyurl=https%3A%2F%2Fdesktop.github.com%2Fstylesheets%2Furl%2528mac.css);
@import ./?mortyhash=...&mortyurl=https%3A%2F%2Fdesktop.github.com%2Fstylesheets%2Furl%2528windows.css);

image fetching timeout

Several days ago, without making any changes to my configurations, image results in searx stopped working, giving 504 errors. This is an example of what morty prints to the console:

2018/04/05 18:09:17 getting https://tse2.mm.bing.net/th?id=OIP.sH7AGV34A5Ytw7sROAp2JQHaLD&w=201&h=299&pid=1.1
2018/04/05 18:09:18 error: timeout

However, if I do a wget on one of the urls, it works fine. There doesn't seem to be any debug mode for morty, so I'm not sure where to go from here. Suggestions welcome.

Disallow some MIME types

Unless morty can parse the content of these MIME types, they shouldn't be allowed:

  • multipart/* (each part can contain any MIME type)
  • image/svg+xml (javascript)
  • application/mathml+xml (javascript)

About web fonts: some browsers include https://github.com/khaledhosny/ots to sanitize fonts, but I am not sure that all of them do.

(I don't include CSS)
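A minimal sketch of such a deny list, assuming the check is done on the upstream Content-Type header value. The names deniedTypes and isAllowed are hypothetical; morty's contenttype package may express this differently.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// deniedTypes lists exact media types from the issue above that can
// carry script content.
var deniedTypes = map[string]bool{
	"image/svg+xml":          true,
	"application/mathml+xml": true,
}

// isAllowed reports whether a Content-Type header value may be passed
// through. Unparsable values are rejected (hypothetical helper).
func isAllowed(contentType string) bool {
	mt, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false
	}
	// multipart/* can wrap parts of any MIME type.
	if strings.HasPrefix(mt, "multipart/") {
		return false
	}
	return !deniedTypes[mt]
}

func main() {
	for _, ct := range []string{
		"image/png",
		"image/svg+xml; charset=utf-8",
		"multipart/mixed; boundary=abc",
	} {
		fmt.Println(ct, isAllowed(ct))
	}
}
```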

Travis build failing: "Errorf format %s has arg testCase.Input of wrong type contenttype.Filter"

From current (and past, somehow not failing) Travis build:

$ go test -v ./...
# github.com/asciimoo/morty/contenttype
contenttype/contenttype_test.go:220: Errorf format %s has arg testCase.Input of wrong type contenttype.Filter
contenttype/contenttype_test.go:225: Errorf format %s has arg testCase.Input of wrong type contenttype.Filter
contenttype/contenttype_test.go:244: Errorf format %s has arg testCase.Filter of wrong type map[string]bool
=== RUN   TestAttrSanitizer
--- PASS: TestAttrSanitizer (0.00s)
=== RUN   TestSanitizeURI
--- PASS: TestSanitizeURI (0.00s)
=== RUN   TestURLProxifier
--- PASS: TestURLProxifier (0.00s)
PASS
ok  	github.com/asciimoo/morty	0.005s
FAIL	github.com/asciimoo/morty/contenttype [build failed]
The command "go test -v ./..." exited with 2.
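The vet errors above come from passing a struct and a map to the %s verb. A minimal sketch of one possible fix is to use %v, which accepts any type. The Filter type below is a hypothetical stand-in and is not the real contenttype.Filter; describe is likewise an illustrative name.

```go
package main

import "fmt"

// Filter is a hypothetical stand-in for a non-string type that is
// passed to a format verb in a test message.
type Filter struct {
	Type    string
	SubType string
}

// describe formats both values with %v, which works for structs and
// maps, whereas %s is flagged by go vet for these types.
func describe(input Filter, filter map[string]bool) string {
	return fmt.Sprintf("input=%v filter=%v", input, filter)
}

func main() {
	fmt.Println(describe(Filter{"text", "html"}, map[string]bool{"html": true}))
}
```

An alternative is to give the type a String() method, after which %s is accepted as well.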
