Comments (20)
@kolyshkin Could you allow the entrypoint rule and see what AVCs get created?
from container-selinux.
Alas, I was unable to repro this with a separate runc binary and runc integration tests.
from container-selinux.
Is this still happening? Does runc write out a binary file and then execute it?
from container-selinux.
This looks like runc creates the binary and then sets the SELinux label on the container, at which point it starts executing the binary. Can we set the SELinux label later, only when it executes the PID 1 process?
from container-selinux.
@rhatdan this is an intermediate init binary (which does nothing except execute the container init process); the binary is created in the container's state directory (/run/runc by default). This directory is not being labeled by runc.
Alas, I can't repro this without cri-o.
from container-selinux.
I have a repro now: opencontainers/runc#4053.
I also have a workaround (disable dmz if selinux is in enforcing mode and process.selinuxLabel is set in config), but I am not a huge fan of doing things that way.
from container-selinux.
Filed runc issue: opencontainers/runc#4057 which for some reason shows slightly different avc denials than this issue.
from container-selinux.
I suggested setting the security.selinux xattr of the memfd to the container's label, but you get -EACCES from setxattr. I guess it's possible this would not solve the issue even if you could set the label, but it seems like it should.
@rhatdan Is this something that should be possible or something we can enable, or is there something I'm doing wrong? Since we set the process label to the container label with the same privileges, it seems strange to be unable to set a file to have the same label. The only other alternative is to not use runc-dmz on SELinux-enforcing systems, but I'd prefer not to do that if possible.
from container-selinux.
If you call label.SetFSCreateLabel(mount_label) before creating the object, then the container_t should be able to use it.
from container-selinux.
If you call label.SetFSCreateLabel(mount_label) before creating the object, then the container_t should be able to use it.
For some reason it doesn't work. Maybe it's not working with memfd_create?
(will provide logs a bit later)
from container-selinux.
I checked the logs and they are the same with and without the patch that adds SetFSCreateLabel (except for obvious differences in pids, inode etc).
Here's the log from a run on CentOS 9, before the proposed fix:
runc run tst (status=1):
writing sync procError: write sync: file already closed
execveat: permission denied
----
type=PROCTITLE msg=audit(10/10/2023 00:23:28.967:10933) : proctitle=/tmp/bats-run-6AjJKB/runc.9vKatN/bundle/runc init
type=SYSCALL msg=audit(10/10/2023 00:23:28.967:10933) : arch=x86_64 syscall=execveat success=no exit=EACCES(Permission denied) a0=0x6 a1=0xc0000c511a a2=0xc0000575c0 a3=0xc000024520 items=0 ppid=105366 pid=105377 auid=root uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=pts0 ses=8 comm=runc:[2:INIT] exe=/tmp/bats-run-6AjJKB/runc.9vKatN/bundle/runc subj=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 key=(null)
type=AVC msg=audit(10/10/2023 00:23:28.967:10933) : avc: denied { entrypoint } for pid=105377 comm=runc:[2:INIT] path=/memfd:runc_cloned:runc-dmz (deleted) dev="tmpfs" ino=271 scontext=system_u:system_r:container_t:s0:c4,c5 tcontext=unconfined_u:object_r:container_runtime_tmpfs_t:s0 tclass=file permissive=0
Here is one after the fix (i.e. with SetFSCreateLabel added):
runc run tst (status=1):
writing sync procError: write sync: file already closed
execveat: permission denied
----
type=PROCTITLE msg=audit(10/09/2023 23:44:03.603:10929) : proctitle=/tmp/bats-run-FSYIY5/runc.HCqujT/bundle/runc init
type=SYSCALL msg=audit(10/09/2023 23:44:03.603:10929) : arch=x86_64 syscall=execveat success=no exit=EACCES(Permission denied) a0=0x6 a1=0xc00012b0fa a2=0xc0001146a0 a3=0xc000024660 items=0 ppid=105403 pid=105414 auid=root uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=pts0 ses=8 comm=runc:[2:INIT] exe=/tmp/bats-run-FSYIY5/runc.HCqujT/bundle/runc subj=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 key=(null)
type=AVC msg=audit(10/09/2023 23:44:03.603:10929) : avc: denied { entrypoint } for pid=105414 comm=runc:[2:INIT] path=/memfd:runc_cloned:runc-dmz (deleted) dev="tmpfs" ino=3341 scontext=system_u:system_r:container_t:s0:c4,c5 tcontext=unconfined_u:object_r:container_runtime_tmpfs_t:s0 tclass=file permissive=0
I assume that it just doesn't work for memfd_create.
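For readers decoding the AVC lines above, here is a rough sketch of how a denial maps to a policy rule, similar in spirit to what audit2allow does. It is an illustration only: the regular expression and function names are assumptions, and it handles just the fields shown in these logs.

```python
import re

# Matches the "{ perms }", scontext, tcontext and tclass fields of an AVC denial.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?"
    r"scontext=(?P<scontext>\S+)\s+tcontext=(?P<tcontext>\S+)\s+tclass=(?P<tclass>\S+)"
)


def selinux_type(context):
    """Return the type field of a user:role:type[:level] context."""
    return context.split(":")[2]


def avc_to_allow(line):
    """Turn one AVC denial line into an allow rule, or None if it does not match."""
    match = AVC_RE.search(line)
    if match is None:
        return None
    perms = sorted(match.group("perms").split())
    perm_text = perms[0] if len(perms) == 1 else "{ " + " ".join(perms) + " }"
    return "allow {} {}:{} {};".format(
        selinux_type(match.group("scontext")),
        selinux_type(match.group("tcontext")),
        match.group("tclass"),
        perm_text,
    )
```

Applied to the denial above, this yields a rule allowing container_t the entrypoint permission on container_runtime_tmpfs_t files, which shows that the memfd kept the runtime's tmpfs type instead of the container's.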
from container-selinux.
Indeed, it appears that the fd created by memfd_create does not take fscreatecon into account. I added a check in the code to see if the memfd label is as expected, and it is not:
time="2023-10-10T07:10:57Z" level=error msg="runc run failed: unable to create new parent process: runc-dmz file label mismatch: want \"system_u:system_r:container_t:s0:c4,c5\", got \"unconfined_u:object_r:container_runtime_tmpfs_t:s0\""
Here's the code that does it: opencontainers/runc@650a51b
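To make the mismatch in that error message easier to read, the sketch below splits two contexts into their fields and reports which ones differ. It is not the check from the linked commit (which is Go); the class and function names are hypothetical.

```python
from typing import NamedTuple


class Context(NamedTuple):
    user: str
    role: str
    type: str
    level: str


def parse_context(text):
    """Split an SELinux context into user, role, type and level.

    The level can itself contain colons (e.g. "s0:c4,c5"), so split
    at most three times.
    """
    parts = text.split(":", 3)
    if len(parts) < 3:
        raise ValueError("not an SELinux context: %r" % text)
    level = parts[3] if len(parts) == 4 else ""
    return Context(parts[0], parts[1], parts[2], level)


def differing_fields(want, got):
    """Return the names of the context fields that differ between want and got."""
    w, g = parse_context(want), parse_context(got)
    return [name for name in Context._fields if getattr(w, name) != getattr(g, name)]
```

For the two labels in the log line above, every field differs; the type (container_t versus container_runtime_tmpfs_t) is the one the entrypoint denial is about.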
from container-selinux.
That quite possibly is a kernel issue. I will ping the SELinux kernel developers on this. For now we can add allow rules to container-selinux policy, and then over time remove the rules, once we have a fixed kernel.
from container-selinux.
https://bugzilla.redhat.com/show_bug.cgi?id=2243055
from container-selinux.
@kolyshkin Could you allow the entrypoint rule and see what AVCs get created?
With the entrypoint allowed (in opencontainers/runc@48336b6), I get this on CentOS 9:
type=PROCTITLE msg=audit(10/10/2023 19:38:49.181:10952) : proctitle=(null)
type=SYSCALL msg=audit(10/10/2023 19:38:49.181:10952) : arch=x86_64 syscall=execveat success=no exit=EACCES(Permission denied) a0=0x6 a1=0xc00014f11a a2=0xc0000575c0 a3=0xc000024520 items=0 ppid=105361 pid=105372 auid=root uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=pts0 ses=8 comm=6 exe=/memfd:runc_cloned:runc-dmz (deleted) subj=system_u:system_r:container_t:s0:c4,c5 key=(null)
type=AVC msg=audit(10/10/2023 19:38:49.181:10952) : avc: denied { map } for pid=105372 comm=6 path=/memfd:runc_cloned:runc-dmz (deleted) dev="tmpfs" ino=3312 scontext=system_u:system_r:container_t:s0:c4,c5 tcontext=unconfined_u:object_r:container_runtime_tmpfs_t:s0 tclass=file permissive=0
It's similar for CentOS 8 and Fedora 38.
On CentOS 7 I see:
# type=PROCTITLE msg=audit(10/10/2023 19:37:00.763:712) : proctitle=(null)
# type=SYSCALL msg=audit(10/10/2023 19:37:00.763:712) : arch=x86_64 syscall=execve success=no exit=EACCES(Permission denied) a0=0xc0000b9730 a1=0xc0000a28d0 a2=0xc0000247e0 a3=0x0 items=0 ppid=550 pid=563 auid=root uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=pts0 ses=7 comm=6 exe=/memfd:runc_cloned:runc-dmz (deleted) subj=system_u:system_r:container_t:s0:c4,c5 key=(null)
# type=AVC msg=audit(10/10/2023 19:37:00.763:712) : avc: denied { read execute } for pid=563 comm=6 path=/memfd:runc_cloned:runc-dmz (deleted) dev="tmpfs" ino=146083 scontext=system_u:system_r:container_t:s0:c4,c5 tcontext=unconfined_u:object_r:container_runtime_tmpfs_t:s0 tclass=file permissive=0
from container-selinux.
With these rules:
allow container_t container_runtime_tmpfs_t:file entrypoint;
allow container_t container_runtime_tmpfs_t:file map;
allow container_t container_runtime_tmpfs_t:file { execute read };
it works on all distros we test with SELinux (CentOS 7, 8, 9 and the latest Fedora).
I am going to also test it with cri-o ci (where the problem was initially discovered), as I'm not sure my tests cover all the bases.
from container-selinux.
I am going to also test it with cri-o ci (where the problem was initially discovered), as I'm not sure my tests cover all the bases.
Tested; looks OK (cri-o/cri-o#7359)
from container-selinux.
I would prefer if we allowed the label to be changed by the runtime (setting the security.selinux xattr to the container's label), rather than allowing the container label to execute stuff outside of its label -- unless that is less safe than just allowing execution (each container gets its own label, I don't know if you can just allow container_runtime_t -> container_t relabelto operations).
We can set fscreatecon as well, so kernels with this behaviour fixed don't need us to change the label.
from container-selinux.
It is not a huge loosening of security to allow containers to read/execute content created in a tmpfs by the container runtimes, although I would like to remove it at some point.
from container-selinux.
Fixed in https://github.com/containers/container-selinux/releases/tag/v2.224.0
from container-selinux.