tecnativa / docker-whitelist

A socat service to whitelist network connections

License: Apache License 2.0

Docker Whitelister

What?

A whitelist proxy that uses socat. 🔌😼

Why?

tl;dr: To work around moby/moby#36174.

Basically, Docker supports internal networks; but when you use them, you simply cannot open ports from those services, which is not very convenient: you either have full isolation or none.

This proxy allows some whitelisted endpoints to have network connectivity. It can be used for:

  • Allowing connection only to some APIs, but not to the rest of the WWW.
  • Exposing ports from a container while still not letting the container access the WWW.

How?

Use these environment variables:

TARGET

Required. The host name to which incoming connections are redirected.

HTTP_HEALTHCHECK

Default: 0

Set to 1 to enable a healthcheck that uses pycurl HTTP requests. This is useful if the target uses a deployment where the IP of the service changes frequently (e.g. accounts.google.com) and you are using PRE_RESOLVE.

Automatically restarting unhealthy proxies

When the HTTP healthcheck is enabled and fails, the container marks itself as unhealthy, but nothing else happens (see moby/moby#22719).

If you want to restart your proxies automatically, you can use https://github.com/willfarrell/docker-autoheal.

HTTP_HEALTHCHECK_URL

Default: http://$TARGET/

URL to use in HTTP_HEALTHCHECK if enabled. $TARGET gets replaced inside the URL by the configured TARGET.
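
The placeholder replacement can be pictured with a minimal Python sketch. This is an illustration only, not the project's actual implementation; the function name `expand_healthcheck_url` is hypothetical.

```python
from string import Template


def expand_healthcheck_url(url_template: str, target: str) -> str:
    """Replace the $TARGET placeholder with the configured target host.

    safe_substitute leaves any other "$..." text in the URL untouched.
    """
    return Template(url_template).safe_substitute(TARGET=target)


print(expand_healthcheck_url("http://$TARGET/", "fonts.googleapis.com"))
```

With the default template and TARGET set to fonts.googleapis.com, the healthcheck would request http://fonts.googleapis.com/.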

HTTP_HEALTHCHECK_TIMEOUT_MS

Default: 2000

Timeout in milliseconds for the HTTP healthcheck. It is applied both to connecting and to receiving an answer, so a check may take up to twice this time.

MODE

Default: tcp

Set to udp to proxy in UDP mode.

MAX_CONNECTIONS

Default: 100

Limits the maximum number of accepted connections at once per port.

Setting "unlimited" connections

For each port and open connection, a subprocess is spawned. Setting the number too high might make your host system unresponsive and prevent you from logging in to it, so be very careful when setting this to a large number.

A typical Linux system can handle up to 32768 processes, so if you need many more parallel open connections, make sure to also raise the corresponding limits on your host system. See https://stackoverflow.com/questions/6294133/maximum-pid-in-linux for reference. Then divide this number by at least the number of ports you are running through docker-whitelist.
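
The division described above can be sketched in Python. This is a rough sizing aid under stated assumptions, not part of docker-whitelist: the function names and the `reserve` safety margin are hypothetical, and `/proc/sys/kernel/pid_max` is read on the assumption that the script runs on the Linux host.

```python
from pathlib import Path

DEFAULT_PID_MAX = 32768  # common Linux default mentioned above


def read_pid_max(path: str = "/proc/sys/kernel/pid_max") -> int:
    """Return the host's PID limit, falling back to the common default."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return DEFAULT_PID_MAX


def max_connections_per_port(pid_max: int, ports: int, reserve: int = 4096) -> int:
    """Upper bound for MAX_CONNECTIONS so all ports together stay below pid_max.

    `reserve` is a hypothetical margin that leaves PIDs for the rest of the host.
    """
    if ports < 1:
        raise ValueError("at least one port is required")
    return max((pid_max - reserve) // ports, 0)


# Two proxied ports (the default "80 443") on a host with the default limit
print(max_connections_per_port(DEFAULT_PID_MAX, ports=2))
```

The result is only an upper bound; other processes on the host and other docker-whitelist containers share the same PID space.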

What happens when the limit is hit?

docker-whitelist basically starts socat so the behaviour is the same. In case no more subprocesses can be forked:

  • UDP mode: You won't see a difference on the connecting side. But no more packets are forwarded for new connections until the number of connections for this port is reduced.
  • TCP mode: docker-whitelist no longer accepts the connection and your connection will wait until the number of connections for this port is reduced. Your connection may time out.

NAMESERVERS

Default: 208.67.222.222 8.8.8.8 208.67.220.220 8.8.4.4 to use OpenDNS and Google DNS resolution servers by default.

Only used when pre-resolving is enabled.

PORT

Default: 80 443. If you're proxying HTTP/S services, no need to specify!

The port where this service will listen, and where the target service is also expected to be listening.
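
As an illustration of "same port on both sides", the sketch below builds one socat command per port. The command shape is copied from the `top` output quoted in the issues further down (`socat tcp-listen:8069,fork,reuseaddr tcp-connect:odoo:8069`). How the image really assembles its commands, and what it does in UDP mode, is not shown here; `socat_commands` is a hypothetical helper.

```python
import shlex


def socat_commands(target: str, ports: str = "80 443") -> list[list[str]]:
    """Build one TCP socat command per whitespace-separated port."""
    commands = []
    for port in ports.split():
        if not port.isdigit() or not 0 < int(port) < 65536:
            raise ValueError(f"invalid port: {port!r}")
        commands.append([
            "socat",
            f"tcp-listen:{port},fork,reuseaddr",  # listen locally on this port
            f"tcp-connect:{target}:{port}",       # forward to the same port on TARGET
        ])
    return commands


for cmd in socat_commands("fonts.googleapis.com"):
    print(shlex.join(cmd))
```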

PRE_RESOLVE

Default: 0

Set to 1 to force using the specified nameservers to resolve the target before proxying.

This is especially useful when using a network alias to whitelist an external API.

SMTP_HEALTHCHECK

Default: 0

Set to 1 to enable a healthcheck that uses pycurl SMTP requests. This is useful if the target uses a deployment where the IP of the service changes frequently (e.g. smtp.eu.sparkpostmail.com) and you are using PRE_RESOLVE.

Automatically restarting unhealthy proxies

See HTTP_HEALTHCHECK.

SMTP_HEALTHCHECK_URL

Default: smtp://$TARGET/

URL to use in SMTP_HEALTHCHECK if enabled. $TARGET gets replaced inside the URL by the configured TARGET.

SMTP_HEALTHCHECK_COMMAND

Default: HELP

Changes the healthcheck command for servers that do not support HELP (e.g. for MailHog you can use QUIT).

SMTP_HEALTHCHECK_TIMEOUT_MS

Default: 2000

Timeout in milliseconds for the SMTP healthcheck. It is applied both to connecting and to receiving an answer, so a check may take up to twice this time.

UDP_ANSWERS

Default: 1

1 means the process will wait for an answer from the server before the forked child process terminates (until this happens, the connection counts towards the connection limit). Set to 0 if no answers are expected from the server; this prevents subprocesses from waiting for an answer indefinitely.

Setting to 0 is recommended if you are using this to connect to a syslog server like Graylog.
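
To see why a syslog-style client never produces answers, here is a minimal fire-and-forget UDP sender in Python. It is a sketch only: the function name is hypothetical, port 514 is the conventional syslog UDP port, and the priority `<14>` encodes facility "user" with severity "info".

```python
import socket


def send_syslog_udp(message: str, host: str, port: int = 514) -> None:
    """Send one syslog-style datagram and return without waiting for a reply."""
    payload = f"<14>{message}".encode("utf-8")  # <14> = facility user (1) * 8 + info (6)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


if __name__ == "__main__":
    # Hypothetical destination; in a real setup this would be the proxy's alias.
    send_syslog_udp("coolapp started", "127.0.0.1")
```

Because the sender never reads from the socket and the server never replies, a proxy child that waits for an answer would keep occupying a connection slot, which is the situation UDP_ANSWERS set to 0 avoids.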

VERBOSE

Default: 0

Set to 1 to log all connections.

Example

So say you have a production app called coolapp that sends and reads emails, and uses Google Font APIs to render some PDF reports.

It is defined in a docker-compose.yaml file like this:

# Production deployment
version: "2.0"
services:
    app:
        image: Tecnativa/coolapp
        ports:
            - "80:80"
        environment:
            DB_HOST: db
        depends_on:
            - db

    db:
        image: postgres:alpine
        volumes:
            - dbvol:/var/lib/postgresql/data:z

volumes:
    dbvol:

Now you want to set up a staging environment for your QA team, which includes a fresh copy of the production database. To prevent the app from sending or reading emails, you put everything into a safe internal network:

# Staging deployment
version: "2.0"
services:
    proxy:
        image: traefik
        networks:
            default:
            public:
        ports:
            - "8080:8080"
        volumes:
            # Here you redirect incoming connections to the app container
            - /etc/traefik/traefik.toml

    app:
        image: Tecnativa/coolapp
        environment:
            DB_HOST: db
        depends_on:
            - db

    db:
        image: postgres:alpine

networks:
    default:
        internal: true
    public:

Now, it turns out your QA team detects font problems. Of course! app cannot contact fonts.google.com. Yikes! What to do? 🤷

tecnativa/whitelist to the rescue!! 💪🤠

# Staging deployment
version: "2.0"
services:
    fonts_googleapis_proxy:
        image: tecnativa/whitelist
        environment:
            TARGET: fonts.googleapis.com
            PRE_RESOLVE: 1 # Otherwise it would resolve to localhost
        networks:
            # Containers in default restricted network will ask here for fonts
            default:
                aliases:
                    - fonts.googleapis.com
            # We need public access to "open the door"
            public:

    fonts_gstatic_proxy:
        image: tecnativa/whitelist
        networks:
            default:
                aliases:
                    - fonts.gstatic.com
            public:
        environment:
            TARGET: fonts.gstatic.com
            PRE_RESOLVE: 1

    proxy:
        image: traefik
        networks:
            default:
            public:
        ports:
            - "8080:8080"
        volumes:
            # Here you redirect incoming connections to the app container
            - /etc/traefik/traefik.toml

    app:
        image: Tecnativa/coolapp
        environment:
            DB_HOST: db
        depends_on:
            - db

    db:
        image: postgres:alpine

networks:
    default:
        internal: true
    public:

And voilà! app has fonts, but nothing more. ✋👮
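
One way to check the result from inside the app container is a small TCP probe. This is a sketch, assuming Python is available in that container; `can_connect` is a hypothetical helper, and example.com stands in for any host that is not whitelisted.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refused connections and timeouts
        return False


if __name__ == "__main__":
    # Expected inside the staging app container: fonts reachable, the rest not.
    for host in ("fonts.googleapis.com", "fonts.gstatic.com", "example.com"):
        print(host, can_connect(host, 443))
```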

Development

All the dependencies you need to develop this project (apart from Docker itself) are managed with poetry.

To set up your development environment, run:

pip install pipx  # If you don't have pipx installed
pipx install poetry  # Install poetry itself
poetry install  # Install the Python dependencies and set up the development environment

Testing

To run the tests locally, add --prebuild to autobuild the image before testing:

poetry run pytest --prebuild

By default, the image that the tests use (and optionally prebuild) is named test:docker-whitelist. If you prefer, you can build it separately before testing, and remove the --prebuild flag, to run the tests with that image you built:

docker image build -t test:docker-whitelist .
poetry run pytest

If you want to use a different image, pass the --image command line argument with the name you want:

# To build it automatically
poetry run pytest --prebuild --image my_custom_image

# To prebuild it separately
docker image build -t my_custom_image .
poetry run pytest --image my_custom_image

Contributors

ap-wtioit, joao-p-marques, josep-tecnativa, pcatinean, pedrobaeza, yajo

docker-whitelist's Issues

Support multiple aliases at the same time?

I inherited a project that uses this container. The setup looks as follows:

  proxy_i_vimeocdn_com:
    image: tecnativa/whitelist
    environment:
      TARGET: i.vimeocdn.com
      PRE_RESOLVE: 1 # Otherwise it would resolve to localhost
    networks:
      e2e-test-suite-network:
        aliases:
          - i.vimeocdn.com

  proxy_f_vimeocdn_com:
    image: tecnativa/whitelist
    environment:
      TARGET: f.vimeocdn.com
      PRE_RESOLVE: 1 # Otherwise it would resolve to localhost
    networks:
      e2e-test-suite-network:
        aliases:
          - f.vimeocdn.com

# More here (lots more)

I was wondering if there's a way to have TARGET contain a CSV host list? 🤔
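
The README above documents TARGET as a single host name. Until a list is supported, one possible workaround is to generate the repetitive service blocks instead of writing them by hand. This is an assumption on my part, not a maintainer-endorsed answer; `whitelist_services` is a hypothetical helper and the network name is taken from the snippet above.

```python
import json


def whitelist_services(hosts, network="e2e-test-suite-network"):
    """Return one tecnativa/whitelist service definition per host."""
    services = {}
    for host in hosts:
        name = "proxy_" + host.replace(".", "_").replace("-", "_")
        services[name] = {
            "image": "tecnativa/whitelist",
            "environment": {"TARGET": host, "PRE_RESOLVE": 1},
            "networks": {network: {"aliases": [host]}},
        }
    return services


# JSON is also valid YAML, so the output can be pasted under "services:".
print(json.dumps(whitelist_services(["i.vimeocdn.com", "f.vimeocdn.com"]), indent=2))
```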

How to use to expose ports

How does this solve the

"Exposing ports from a container while still not letting the container access the WWW"

usage listed in the README? I see traefik used for that in the example instead.

Errors on AWS / RDS

We deployed some staging servers without any issues, but suddenly our RDS (Postgres) won't answer through the proxy anymore. It is almost the same setup on all our Amazon EC2 instances.

The globalwhitelist in our setup looks like this:

docker-compose.txt

It seems to "work", as I can ping and resolve the RDS host:

$ cd /opt/odoo/docker ; docker-compose exec odoo bash

odoo@hobbii:/opt/odoo$ ping staging.amazon.rds
PING staging.amazon.rds (192.168.0.4) 56(84) bytes of data.
64 bytes from globalwhitelist_amazon_rds_stageing_1.globalwhitelist_shared (192.168.0.4): icmp_seq=1 ttl=64 time=0.036 ms
64 bytes from globalwhitelist_amazon_rds_stageing_1.globalwhitelist_shared (192.168.0.4): icmp_seq=2 ttl=64 time=0.024 ms
64 bytes from globalwhitelist_amazon_rds_stageing_1.globalwhitelist_shared (192.168.0.4): icmp_seq=3 ttl=64 time=0.022 ms

odoo@hobbii:/opt/odoo$ telnet staging.amazon.rds 5432
Trying 192.168.0.4...
Connected to staging.amazon.rds.
Escape character is '^]'.

But this fails:

odoo@hobbii:/opt/odoo$ echo $PGHOST
staging.amazon.rds
odoo@hobbii:/opt/odoo$ echo $PGUSER
odoo
odoo@hobbii:/opt/odoo$ echo $PGPASSWORD
********************

psql -h staging.amazon.rds -U odoo  # Times out

In our Docker instance (odoo) I get the following message:

hjess@odoo-staging-v2:/opt/odoo/docker$ docker-compose logs -f
Attaching to docker_odoo_1, docker_smtp_1
odoo_1  | doodba INFO: Waiting until postgres is listening at staging.amazon.rds...
odoo_1  | doodba INFO: Waiting until postgres is listening at staging.amazon.rds...

But a connection never happens from our Docker instance to the RDS.

Any thoughts on what could be happening here? Our setup normally works with this configuration.

Socat uses 100% when odoo is stopped

When using this in doodba-scaffolding and stopping odoo, socat sometimes uses 100% of the CPU:

top - 10:06:13 up  1:59,  4 users,  load average: 3,31, 4,00, 4,01
Tasks: 530 total,   5 running, 393 sleeping,   0 stopped,   0 zombie
%Cpu(s):  4,8 us,  7,3 sy,  0,0 ni, 87,9 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
KiB Mem : 32937164 total,  9793444 free, 10869704 used, 12274016 buff/cache
KiB Swap: 33484796 total, 33483260 free,     1536 used. 20933540 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                 
26513 root      20   0   14584    360      0 R 100,0  0,0  10:40.81 socat tcp-listen:8069,fork,reuseaddr tcp-connect:odoo:8069   

Currently the image is using socat version 1.7.3.1 (from 2016):

bigbear3001@wt-io-it-bigbear3001:~$ docker run tecnativa/whitelist socat -V
socat by Gerhard Rieger - see www.dest-unreach.org
socat version 1.7.3.1 on Apr 29 2016 22:10:44
   running on Linux version #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019, release 4.15.0-54-generic, machine x86_64
features:
...

It seems that 1.7.3.2 (from 2017) would fix this issue: http://www.dest-unreach.org/socat/.
Would it be OK to have a patch where socat is installed from a tar.gz instead of from the Alpine repos? Or to upgrade to Alpine 3.6, which would have this version, I guess:
https://pkgs.alpinelinux.org/package/v3.6/main/ppc64le/socat

Cannot start on a mac M1

The Odoo image was built successfully, but at the last step of the doodba setup, when I run 'docker-compose up -d', I get the following error. Has anyone found a workaround?

Digest: sha256:e6e1d1d41fb7087250176b38c73666c1205816e8b8d2d8f8f4a69ce23f7635b3
Status: Downloaded newer image for kozea/wdb:latest
Pulling cdnjs_cloudflare_proxy (tecnativa/whitelist:)...
latest: Pulling from tecnativa/whitelist
ERROR: no matching manifest for linux/arm64/v8 in the manifest list entries
