Giter Site home page Giter Site logo

Comments (20)

rhatdan avatar rhatdan commented on July 18, 2024 1

@kolyshkin Could you allow the entrpoint rule and see what AVC's get created?

#279

from container-selinux.

kolyshkin avatar kolyshkin commented on July 18, 2024

Alas, I was unable to repro this with a separate runc binary and runc integration tests.

from container-selinux.

rhatdan avatar rhatdan commented on July 18, 2024

Is this still happening? Does runc write out a binary file and then execute it?

from container-selinux.

rhatdan avatar rhatdan commented on July 18, 2024

This looks like runc creates the binary, and then sets the SELinux label on the container at which point it starts executing the binary. Can we set the SELinux label later, only when it executes PID1 process?

from container-selinux.

kolyshkin avatar kolyshkin commented on July 18, 2024

@rhatdan this is an intermediate init binary (which does nothing else except executing the container init process); the binary is created in container's state directory (/run/runc by default). This directory is not being labeled by runc.

Alas, I can't repro this without cri-o.

from container-selinux.

kolyshkin avatar kolyshkin commented on July 18, 2024

I have a repro now: opencontainers/runc#4053.

I also have a workaround (disable dmz if selinux is in enforcing mode and process.selinuxLabel is set in config), but I am not a huge fan of doing things that way.

from container-selinux.

kolyshkin avatar kolyshkin commented on July 18, 2024

Filed runc issue: opencontainers/runc#4057 which for some reason shows slightly different avc denials than this issue.

from container-selinux.

cyphar avatar cyphar commented on July 18, 2024

I suggested setting the security.selinux xattr of the memfd to the container's label, but you get -EACCES from setxattr. I guess it's possible this would not solve the issue even if you could set the label, but it seems like it should.

@rhatdan Is this something that should be possible or something we can enable, or is there something I'm doing wrong? Since we set the process label to the container label with the same privileges, it seems strange to be unable to set a file to have the same label. The only other alternative is to not use runc-dmz on SELinux-enforcing systems, but I'd prefer not to do that if possible.

from container-selinux.

rhatdan avatar rhatdan commented on July 18, 2024

If you call label.SetFSCreateLabel(mount_label) before creating the object, then the container_t should be able to use it.

from container-selinux.

kolyshkin avatar kolyshkin commented on July 18, 2024

If you call label.SetFSCreateLabel(mount_label) before creating the object, then the container_t should be able to use it.

For some reason it doesn't work. Maybe it's not working with memfd_create?

(will provide logs a bit later)

from container-selinux.

kolyshkin avatar kolyshkin commented on July 18, 2024

I checked the logs and they are the same with and without the patch that adds SetFSCreateLabel (except for obvious differences in pids, inode etc).

Here's the log from a run on CentOS 9, before the proposed fix:

runc run tst (status=1):
writing sync procError: write sync: file already closed
execveat: permission denied
----
type=PROCTITLE msg=audit(10/10/2023 00:23:28.967:10933) : proctitle=/tmp/bats-run-6AjJKB/runc.9vKatN/bundle/runc init
type=SYSCALL msg=audit(10/10/2023 00:23:28.967:10933) : arch=x86_64 syscall=execveat success=no exit=EACCES(Permission denied) a0=0x6 a1=0xc0000c511a a2=0xc0000575c0 a3=0xc000024520 items=0 ppid=105366 pid=105377 auid=root uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=pts0 ses=8 comm=runc:[2:INIT] exe=/tmp/bats-run-6AjJKB/runc.9vKatN/bundle/runc subj=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 key=(null)
type=AVC msg=audit(10/10/2023 00:23:28.967:10933) : avc:  denied  { entrypoint } for  pid=105377 comm=runc:[2:INIT] path=/memfd:runc_cloned:runc-dmz (deleted) dev="tmpfs" ino=271 scontext=system_u:system_r:container_t:s0:c4,c5 tcontext=unconfined_u:object_r:container_runtime_tmpfs_t:s0 tclass=file permissive=0

Here is one after the fix (i.e. with SetFSCreateLabel added):

runc run tst (status=1):
writing sync procError: write sync: file already closed
execveat: permission denied
----
type=PROCTITLE msg=audit(10/09/2023 23:44:03.603:10929) : proctitle=/tmp/bats-run-FSYIY5/runc.HCqujT/bundle/runc init
type=SYSCALL msg=audit(10/09/2023 23:44:03.603:10929) : arch=x86_64 syscall=execveat success=no exit=EACCES(Permission denied) a0=0x6 a1=0xc00012b0fa a2=0xc0001146a0 a3=0xc000024660 items=0 ppid=105403 pid=105414 auid=root uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=pts0 ses=8 comm=runc:[2:INIT] exe=/tmp/bats-run-FSYIY5/runc.HCqujT/bundle/runc subj=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 key=(null)
type=AVC msg=audit(10/09/2023 23:44:03.603:10929) : avc:  denied  { entrypoint } for  pid=105414 comm=runc:[2:INIT] path=/memfd:runc_cloned:runc-dmz (deleted) dev="tmpfs" ino=3341 scontext=system_u:system_r:container_t:s0:c4,c5 tcontext=unconfined_u:object_r:container_runtime_tmpfs_t:s0 tclass=file permissive=0

I assume that it just doesn't work for memfd_create.

from container-selinux.

kolyshkin avatar kolyshkin commented on July 18, 2024

Indeed, it appears that the fd created by memfd_create does not take fscreatecon into account. I added a check in the code to see if the memfd label is as expected, and it is not:

time="2023-10-10T07:10:57Z" level=error msg="runc run failed: unable to create new parent process: runc-dmz file label mismatch: want \"system_u:system_r:container_t:s0:c4,c5\", got \"unconfined_u:object_r:container_runtime_tmpfs_t:s0\""

Here's the code that does it: opencontainers/runc@650a51b

from container-selinux.

rhatdan avatar rhatdan commented on July 18, 2024

That quite possibly is a kernel issue. I will ping the SELinux kernel developers on this. For now we can add allow rules to container-selinux policy, and then over time remove the rules, once we have a fixed kernel.

from container-selinux.

rhatdan avatar rhatdan commented on July 18, 2024

https://bugzilla.redhat.com/show_bug.cgi?id=2243055

from container-selinux.

kolyshkin avatar kolyshkin commented on July 18, 2024

@kolyshkin Could you allow the entrpoint rule and see what AVC's get created?

With the entrypoint allowed (in opencontainers/runc@48336b6), I get this on CentOS 9:

type=PROCTITLE msg=audit(10/10/2023 19:38:49.181:10952) : proctitle=(null)
type=SYSCALL msg=audit(10/10/2023 19:38:49.181:10952) : arch=x86_64 syscall=execveat success=no exit=EACCES(Permission denied) a0=0x6 a1=0xc00014f11a a2=0xc0000575c0 a3=0xc000024520 items=0 ppid=105361 pid=105372 auid=root uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=pts0 ses=8 comm=6 exe=/memfd:runc_cloned:runc-dmz (deleted) subj=system_u:system_r:container_t:s0:c4,c5 key=(null)
type=AVC msg=audit(10/10/2023 19:38:49.181:10952) : avc:  denied  { map } for  pid=105372 comm=6 path=/memfd:runc_cloned:runc-dmz (deleted) dev="tmpfs" ino=3312 scontext=system_u:system_r:container_t:s0:c4,c5 tcontext=unconfined_u:object_r:container_runtime_tmpfs_t:s0 tclass=file permissive=0

It's similar for CentOS 8 and Fedora 38.

On CentOS 7 I see:

# type=PROCTITLE msg=audit(10/10/2023 19:37:00.763:712) : proctitle=(null)
# type=SYSCALL msg=audit(10/10/2023 19:37:00.763:712) : arch=x86_64 syscall=execve success=no exit=EACCES(Permission denied) a0=0xc0000b9730 a1=0xc0000a28d0 a2=0xc0000247e0 a3=0x0 items=0 ppid=550 pid=563 auid=root uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=pts0 ses=7 comm=6 exe=/memfd:runc_cloned:runc-dmz (deleted) subj=system_u:system_r:container_t:s0:c4,c5 key=(null)
# type=AVC msg=audit(10/10/2023 19:37:00.763:712) : avc:  denied  { read execute } for  pid=563 comm=6 path=/memfd:runc_cloned:runc-dmz (deleted) dev="tmpfs" ino=146083 scontext=system_u:system_r:container_t:s0:c4,c5 tcontext=unconfined_u:object_r:container_runtime_tmpfs_t:s0 tclass=file permissive=0

from container-selinux.

kolyshkin avatar kolyshkin commented on July 18, 2024

With these rules

allow container_t container_runtime_tmpfs_t:file entrypoint;
allow container_t container_runtime_tmpfs_t:file map;
allow container_t container_runtime_tmpfs_t:file { execute read };

it works on all distros we test with selinux (centos 7,8,9 and latest fedora).

I am going to also test it with cri-o ci (where the problem was initially discovered), as I'm not sure my tests cover all the bases.

from container-selinux.

kolyshkin avatar kolyshkin commented on July 18, 2024

I am going to also test it with cri-o ci (where the problem was initially discovered), as I'm not sure my tests cover all the bases.

Tested; looks OK (cri-o/cri-o#7359)

from container-selinux.

cyphar avatar cyphar commented on July 18, 2024

I would prefer if we allowed the label to be changed by the runtime (setting the security.selinux xattr to the container's label), rather than allowing the container label to execute stuff outside of its label -- unless that is less safe than just allowing execution (each container gets its own label, I don't know if you can just allow container_runtime_t -> container_t relabelto operations).

We can set fscreatecon as well, so kernels with this behaviour fixed don't need us to change the label.

from container-selinux.

rhatdan avatar rhatdan commented on July 18, 2024

It is not a huge loosening of security to allow containers to read/execute content created in a tmpfs by the container runtimes. Although I would like to remove it at some point.

from container-selinux.

kolyshkin avatar kolyshkin commented on July 18, 2024

Fixed in https://github.com/containers/container-selinux/releases/tag/v2.224.0

from container-selinux.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.