Comments (16)

jdstrand commented on July 18, 2024

This one should work with the understanding that id -Gn will not have duplicates in the output:

# honor groups supplied via 'docker run --group-add ...' but drop 'root' (the sed
# removes 'telegraf' since we unconditionally add it and don't want it listed twice)
groups="telegraf"
extra_groups="$(id -Gn | sed \
   -e 's/ /,/g' \
   -e 's/,\(root\|telegraf\),/,/g' \
   -e 's/^\(root\|telegraf\),//g' \
   -e 's/,\(root\|telegraf\)$//g' \
   -e 's/^\(root\|telegraf\)$//g')"
if [ -n "$extra_groups" ]; then
    groups="$groups,$extra_groups"
fi
exec setpriv --reuid telegraf --regid telegraf --groups "$groups" "$@"

It handles the group being the only entry, the first, the last, or in the middle of the list. It preserves groups like 'groot', 'rootbeer' and 'frooty'. The extra_groups check handles the case where the output of id piped through sed comes up empty.
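For comparison, the same filtering can be sketched as a plain shell loop instead of sed. This is only an illustration, not what the entrypoint ships: filter_groups is a hypothetical helper name, and it assumes group names contain no spaces or glob characters. Matching whole words also sidesteps the case where root and telegraf sit next to each other in the middle of the list, which a single-pass s///g can miss because matches cannot overlap.

```shell
#!/bin/sh
# filter_groups is a hypothetical helper (not part of the real entrypoint).
# It takes a space-separated group list, the format printed by `id -Gn`,
# and prints a comma-separated list without exact matches of 'root' and
# 'telegraf'. Names that merely contain 'root' (groot, rootbeer) are kept.
filter_groups() {
    result=""
    for g in $1; do
        case "$g" in
            root|telegraf) continue ;;
        esac
        # Prefix a comma only when result already holds an entry.
        result="${result:+$result,}$g"
    done
    printf '%s\n' "$result"
}

filter_groups "root"                                # prints an empty line
filter_groups "root docker"                         # prints: docker
filter_groups "a root telegraf b"                   # prints: a,b
filter_groups "groot root rootbeer frooty telegraf" # prints: groot,rootbeer,frooty
```

In the snippet above it would be used as extra_groups="$(filter_groups "$(id -Gn)")", with the rest left unchanged.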

from influxdata-docker.

powersj commented on July 18, 2024

Hi,

Can you provide some more details about how you are trying to do this, such that it is no longer working?

Looking at the previous issue, I used my previous suggestion of --user telegraf:$(stat -c '%g' /var/run/docker.sock) and it appears to work as expected:

The group I specified is still part of the telegraf user:

telegraf@4027e6d38705:/$ groups
groups: cannot find name for group ID 961
961

And collects stats as expected:

docker run --user telegraf:$(stat -c '%g' /var/run/docker.sock) -v /var/run/docker.sock:/var/run/docker.sock -v $PWD/config.toml:/etc/telegraf/telegraf.conf telegraf:latest
2024-02-22T14:46:27Z I! Loading config: /etc/telegraf/telegraf.conf
2024-02-22T14:46:27Z W! DeprecationWarning: Option "perdevice" of plugin "inputs.docker" deprecated since version 1.18.0 and will be removed in 2.0.0: use 'perdevice_include' instead
2024-02-22T14:46:27Z I! Starting Telegraf 1.29.5 brought to you by InfluxData the makers of InfluxDB
2024-02-22T14:46:27Z I! Available plugins: 241 inputs, 9 aggregators, 30 processors, 24 parsers, 60 outputs, 6 secret-stores
2024-02-22T14:46:27Z I! Loaded inputs: docker
2024-02-22T14:46:27Z I! Loaded aggregators: 
2024-02-22T14:46:27Z I! Loaded processors: 
2024-02-22T14:46:27Z I! Loaded secretstores: 
2024-02-22T14:46:27Z I! Loaded outputs: file
2024-02-22T14:46:27Z I! Tags enabled: host=71b9c160f000
2024-02-22T14:46:27Z W! Deprecated inputs: 0 and 1 options
2024-02-22T14:46:27Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"71b9c160f000", Flush Interval:10s
2024-02-22T14:46:27Z D! [agent] Initializing plugins
2024-02-22T14:46:27Z D! [agent] Connecting outputs
2024-02-22T14:46:27Z D! [agent] Attempting connection to [outputs.file]
2024-02-22T14:46:27Z D! [agent] Successfully connected to outputs.file
2024-02-22T14:46:27Z D! [agent] Starting service inputs
2024-02-22T14:46:37Z D! [outputs.file] Wrote batch of 7 metrics in 86.761µs
2024-02-22T14:46:37Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
docker,engine_host=ryzen,host=71b9c160f000,server_version=25.0.2 n_images=1i,n_goroutines=58i,n_listener_events=0i,n_containers_running=1i,n_used_file_descriptors=33i,n_containers=4i,n_containers_stopped=3i,n_containers_paused=0i,n_cpus=32i 1708613190000000000
docker,engine_host=ryzen,host=71b9c160f000,server_version=25.0.2 memory_total=67333791744i 1708613190000000000
docker_container_status,container_image=telegraf,container_name=nice_grothendieck,container_status=running,container_version=latest,engine_host=ryzen,host=71b9c160f000,server_version=25.0.2 pid=19710i,exitcode=0i,restart_count=0i,container_id="71b9c160f000f32ebadf435a26e1b4363867ed4f20bf0d0e67d343ced8bcad4c",started_at=1708613187806809377i,uptime_ns=3209109177i,oomkilled=false 1708613191000000000
docker_container_mem,container_image=telegraf,container_name=nice_grothendieck,container_status=running,container_version=latest,engine_host=ryzen,host=71b9c160f000,server_version=25.0.2 usage_percent=0.3771173691887432,inactive_anon=0i,inactive_file=151552i,pgfault=6138i,pgmajfault=66i,unevictable=0i,max_usage=0i,usage=253927424i,container_id="71b9c160f000f32ebadf435a26e1b4363867ed4f20bf0d0e67d343ced8bcad4c",active_anon=41172992i,active_file=210833408i,limit=67333791744i 1708613191000000000
<snip>

moorglade commented on July 18, 2024

Thanks for the response, now I see the difference in my configuration: instead of --user telegraf:$(stat -c '%g' /var/run/docker.sock) I am using --user root:$(stat -c '%g' /var/run/docker.sock).

The reason for this is that if I set the user to telegraf instead of root, the entrypoint does not set the required capabilities on the telegraf binary, and some other plugins stop working (#560 (comment)).

Example configuration:

[[inputs.docker]]
  endpoint = "unix:///host/var/run/docker.sock"

[[inputs.ping]]
  urls = ["github.com"]
  method = "native"

[[outputs.file]]
  files = ["stdout"]

Error for --user telegraf:$(stat -c '%g' /var/run/docker.sock):

[inputs.ping] ping failed: permission changes required, enable CAP_NET_RAW capabilities (refer to the ping plugin's README.md for more info)

Error for --user root:$(stat -c '%g' /var/run/docker.sock) (this used to work for me with previous container images):

[inputs.docker] Error in plugin: permission denied while trying to connect to the Docker daemon socket at unix:///host/var/run/docker.sock: Get "http://%2Fhost%2Fvar%2Frun%2Fdocker.sock/v1.24/info": dial unix /host/var/run/docker.sock: connect: permission denied

I checked the ping input's README, and it seems this can be worked around by using method = "exec" instead of method = "native".

If this is expected and I should just not use ping's method = "native" together with docker input, feel free to close the issue.

powersj commented on July 18, 2024

@jdstrand,

This scenario seems like a regression in behavior, so I want your thoughts on this. The scenario is when a user is trying to monitor docker via the socket + use ping.

To monitor docker, the user needs to pass an additional group to telegraf to have permission to use the socket. To use ping, we previously set capabilities on the telegraf binary in the entrypoint, but only if you are root.

When running as root, now that we are dropping all groups, including user-specified groups, the user can no longer do both at the same time.
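For anyone reproducing this, a rough way to check whether the current process still holds the socket's group is sketched below. has_group_of is a hypothetical helper name, and stat -c assumes GNU coreutils, as in the docker run commands in this thread. Holding the group is necessary for group-based access but not sufficient on its own, since the socket's mode bits also have to allow it.

```shell
#!/bin/sh
# has_group_of is a hypothetical helper for illustration only.
# Returns 0 if the calling process holds the group that owns FILE,
# 1 if it does not, and 2 if FILE cannot be inspected.
has_group_of() {
    # Numeric gid of the file (GNU stat syntax, an assumption here).
    file_gid="$(stat -c '%g' "$1")" || return 2

    # `id -G` lists every numeric gid of the process, including
    # supplementary groups such as those passed via --group-add.
    for gid in $(id -G); do
        [ "$gid" = "$file_gid" ] && return 0
    done
    return 1
}
```

Running has_group_of /var/run/docker.sock inside the container, once as the entrypoint's starting user and once after the privilege drop, should show whether the group survived.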

Working with v1.29.4:

$ docker run --rm --user root:$(stat -c '%g' /var/run/docker.sock) -v /var/run/docker.sock:/var/run/docker.sock -v $PWD/config.toml:/etc/telegraf/telegraf.conf telegraf:1.29.4
Unable to find image 'telegraf:1.29.4' locally
1.29.4: Pulling from library/telegraf
7bb465c29149: Already exists 
2b9b41aaa3c5: Already exists 
c7c71dd3592a: Already exists 
9140cc5510d6: Already exists 
aab5bc94bab0: Pull complete 
6396348f0ac2: Pull complete 
Digest: sha256:d883b097fbbb1ed1db5fb1430a2d767ab72b423cf3cbb065bb274ff030d6311d
Status: Downloaded newer image for telegraf:1.29.4
2024-02-23T15:00:00Z I! Loading config: /etc/telegraf/telegraf.conf
2024-02-23T15:00:00Z W! DeprecationWarning: Option "perdevice" of plugin "inputs.docker" deprecated since version 1.18.0 and will be removed in 2.0.0: use 'perdevice_include' instead
2024-02-23T15:00:00Z I! Starting Telegraf 1.29.4 brought to you by InfluxData the makers of InfluxDB
2024-02-23T15:00:00Z I! Available plugins: 241 inputs, 9 aggregators, 30 processors, 24 parsers, 60 outputs, 6 secret-stores
2024-02-23T15:00:00Z I! Loaded inputs: docker ping
2024-02-23T15:00:00Z I! Loaded aggregators: 
2024-02-23T15:00:00Z I! Loaded processors: 
2024-02-23T15:00:00Z I! Loaded secretstores: 
2024-02-23T15:00:00Z I! Loaded outputs: file
2024-02-23T15:00:00Z I! Tags enabled: host=6d2e075490ba
2024-02-23T15:00:00Z W! Deprecated inputs: 0 and 1 options
2024-02-23T15:00:00Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"6d2e075490ba", Flush Interval:10s
2024-02-23T15:00:00Z D! [agent] Initializing plugins
2024-02-23T15:00:00Z D! [agent] Connecting outputs
2024-02-23T15:00:00Z D! [agent] Attempting connection to [outputs.file]
2024-02-23T15:00:00Z D! [agent] Successfully connected to outputs.file
2024-02-23T15:00:00Z D! [agent] Starting service inputs
ping,host=6d2e075490ba,url=192.168.1.1 result_code=0i,packets_transmitted=1i,maximum_response_ms=0.841078,packets_received=1i,ttl=63i,percent_packet_loss=0,minimum_response_ms=0.841078,average_response_ms=0.841078,standard_deviation_ms=0 1708700410000000000
docker,engine_host=ryzen,host=6d2e075490ba,server_version=25.0.2 n_cpus=32i,n_containers_paused=0i,n_images=2i,n_goroutines=58i,n_listener_events=0i,n_used_file_descriptors=31i,n_containers=1i,n_containers_running=1i,n_containers_stopped=0i 1708700410000000000
docker,engine_host=ryzen,host=6d2e075490ba,server_version=25.0.2 memory_total=67333787648i 1708700410000000000
2024-02-23T15:00:10Z D! [outputs.file] Wrote batch of 3 metrics in 56.241µs

and now with latest:

$ docker run --rm --user root:$(stat -c '%g' /var/run/docker.sock) -v /var/run/docker.sock:/var/run/docker.sock -v $PWD/config.toml:/etc/telegraf/telegraf.conf telegraf:1.29.5
2024-02-23T15:00:47Z I! Loading config: /etc/telegraf/telegraf.conf
2024-02-23T15:00:47Z W! DeprecationWarning: Option "perdevice" of plugin "inputs.docker" deprecated since version 1.18.0 and will be removed in 2.0.0: use 'perdevice_include' instead
2024-02-23T15:00:47Z I! Starting Telegraf 1.29.5 brought to you by InfluxData the makers of InfluxDB
2024-02-23T15:00:47Z I! Available plugins: 241 inputs, 9 aggregators, 30 processors, 24 parsers, 60 outputs, 6 secret-stores
2024-02-23T15:00:47Z I! Loaded inputs: docker ping
2024-02-23T15:00:47Z I! Loaded aggregators: 
2024-02-23T15:00:47Z I! Loaded processors: 
2024-02-23T15:00:47Z I! Loaded secretstores: 
2024-02-23T15:00:47Z I! Loaded outputs: file
2024-02-23T15:00:47Z I! Tags enabled: host=7e9459d14147
2024-02-23T15:00:47Z W! Deprecated inputs: 0 and 1 options
2024-02-23T15:00:47Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"7e9459d14147", Flush Interval:10s
2024-02-23T15:00:47Z D! [agent] Initializing plugins
2024-02-23T15:00:47Z D! [agent] Connecting outputs
2024-02-23T15:00:47Z D! [agent] Attempting connection to [outputs.file]
2024-02-23T15:00:47Z D! [agent] Successfully connected to outputs.file
2024-02-23T15:00:47Z D! [agent] Starting service inputs
2024-02-23T15:00:50Z E! [inputs.docker] Error in plugin: permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": dial unix /var/run/docker.sock: connect: permission denied
2024-02-23T15:00:50Z E! [inputs.docker] Error in plugin: permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/json?filters=%7B%22status%22%3A%7B%22running%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied
2024-02-23T15:00:57Z D! [outputs.file] Wrote batch of 1 metrics in 42.41µs
2024-02-23T15:00:57Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
ping,host=7e9459d14147,url=192.168.1.1 maximum_response_ms=0.321623,result_code=0i,packets_transmitted=1i,packets_received=1i,percent_packet_loss=0,minimum_response_ms=0.321623,ttl=63i,average_response_ms=0.321623,standard_deviation_ms=0 1708700450000000000

jdstrand commented on July 18, 2024

While I agree this is a regression, it underscores how there was a problem before. I think the path forward is deciding on what groups to keep. Can we somehow detect what groups docker gave (eg, is this in an env var)? If not, perhaps we could honor an env var that the user can set? Eg (untested):

groups="telegraf"
if [ -n "$TELEGRAF_GROUPS" ]; then
    groups="$groups,$TELEGRAF_GROUPS"
fi
exec setpriv --reuid telegraf --regid telegraf --groups "$groups" "$@"

powersj commented on July 18, 2024

While I agree this is a regression, it underscores how there was a problem before.

Can you reiterate what you found here, since that was in a private issue? My understanding was that it was due to the fact that we retained the root group.

jdstrand commented on July 18, 2024

While I agree this is a regression, it underscores how there was a problem before.

Can you reiterate what you found here, since that was in a private issue? My understanding was that it was due to the fact that we retained the root group.

The issue was that the setpriv command was intending to drop privileges to the telegraf user, but it didn't drop group membership correctly (so, 'yes', root would've been retained). The root group grants a lot of privileges and in the case of this issue, it showed that it gave access to the docker socket, which for this plugin was a good thing, but for all others, it would not be. There is a lot more that group membership would give access to (not least of which, DAC checks within the kernel for various file and non-file access checks). Since there is clear intent to drop privileges when the container is started as root (an excellent thing to do!), we need to do it right and really drop, so we made this change.

That said, this issue shows there are cases that we need to handle when the user wants a specific behavior from the container related to group membership, so I put forth a couple of ideas on how to do that.

powersj commented on July 18, 2024

Since there is clear intent to drop privileges when the container is started as root (an excellent thing to do!), we need to do it right and really drop, so we made this change

When I made the root change the goal was around not running as root. Dropping everything, including groups passed in by a user was certainly not the intent. If a user provides a group via the user or group-add argument, I would expect that group to get passed on.

Can we somehow detect what groups docker gave (eg, is this in an env var)?

The only place I saw the groups show up was via id; there was nothing in /etc/group or the env.

If not, perhaps we could honor an env var that the user can set? Eg (untested):

We are now in a state where if you start up as the telegraf user, you can pass in groups and have them work as expected. However, if you start up as the root user, you cannot. Can we drop the root group, as we are dropping the command from the root user, but not drop the other groups a user has asked to apply?

In effect, does this not end up doing the same thing as passing a list of groups via an environment variable? Except not breaking our users and requiring them to make changes to their deployments.

jdstrand commented on July 18, 2024

If not, perhaps we could honor an env var that the user can set? Eg (untested):

We are now in a state where if you start up as the telegraf user, you can pass in groups and have them work as expected. However, if you start up as the root user, you cannot. Can we drop the root group, as we are dropping the command from the root user, but not drop the other groups a user has asked to apply?

In effect, does this not end up doing the same thing as passing a list of groups via an environment variable? Except not breaking our users and requiring them to make changes to their deployments.

Yes, assuming that the group membership in the container was very intentional, which AIUI can happen in one of two ways: 1. during container build (through Dockerfile USER or modifying /etc/group in the container) or 2. using docker run --group-add foo. We control '1', so we don't have to worry about that. For '2', the only way to detect that is via id. I'm slightly uncomfortable with using id within the container, but I'm not sure there is a better choice. Perhaps this:

# honor groups supplied via 'docker run --group-add ...' but drop 'root' (the sed
# removes 'telegraf' since we unconditionally add it and don't want it listed twice)
groups="telegraf"
extra_groups="$(id -Gn | sed \
   -e 's/^\(root\|telegraf\)$//g' \
   -e 's/^\(root\|telegraf\) //g' \
   -e 's/ \(root\|telegraf\)$//g' \
   -e 's/ \(root\|telegraf\)//g' \
   -e 's/ /,/g')"
if [ -n "$extra_groups" ]; then
    groups="$groups,$extra_groups"
fi
exec setpriv --reuid telegraf --regid telegraf --groups "$groups" "$@"

That sed is ugly since it needs to backslash-escape (, ) and |, but it tries to handle the group being the only entry, the first, the last, or in the middle. The extra_groups check handles the case where the output of id piped through sed comes up empty.

jdstrand commented on July 18, 2024

That sed isn't quite right. I'll give a better one.

powersj commented on July 18, 2024

Thank you for putting that together! I'll give it a look over this week so we can get this change in for v1.30.

powersj commented on July 18, 2024

Thanks again @jdstrand for the work on this.

I've put up #727 which makes the change to the nightly image. I've been playing with it a bit and I think it looks good, but wanted to get some additional feedback before making the changes to the other images.

Would you be opposed to landing that first and then we can land a change to the other images next week?

jdstrand commented on July 18, 2024

Would you be opposed to landing that first and then we can land a change to the other images next week?

That's fine by me

powersj commented on July 18, 2024

Re-opening until we land this in other releases.

powersj commented on July 18, 2024

These changes will go out with the release of v1.30 on or around next Monday. The changes will apply to older images as well.

wz2b commented on July 18, 2024

This issue is closed so I'm not sure anybody will even see this, but should this:

extra_groups="$(id -Gn || true)"

be changed to:

extra_groups="$(id -Gn telegraf || true)"

because without that it doesn't work right in a docker compose / buildx script. At the point that it runs id it is still root. I think what it really means to do there is to add the extra groups of the telegraf user.
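One thing worth keeping in mind when comparing the two forms: id -Gn with no argument reports the groups held by the calling process, while id -Gn telegraf looks the user up in the group database. As noted earlier in the thread, groups passed on the docker command line only showed up via id and not in /etc/group, so the two can differ. A small diagnostic along these lines (show_group_sources is a hypothetical name, and this is only a sketch) prints both so they can be compared in a given setup:

```shell
#!/bin/sh
# show_group_sources is a hypothetical diagnostic helper.
show_group_sources() {
    # Groups held by this process, which is where groups added at
    # container start would appear.
    printf 'process: %s\n' "$(id -Gn 2>/dev/null)"

    # Groups the user database lists for the same user name, if the
    # current uid has a name at all.
    user="$(id -un 2>/dev/null)" || user=""
    if [ -n "$user" ]; then
        printf 'database: %s\n' "$(id -Gn "$user" 2>/dev/null)"
    else
        printf 'database: (no passwd entry for uid %s)\n' "$(id -u)"
    fi
}

show_group_sources
```

If the process line contains a group that the database line lacks, switching to the id -Gn telegraf form would presumably drop that group.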
