Comments (12)
container-selinux is not necessarily going to work with systemd-nspawn unless all of the content in the image is labeled container_file_t.
container-selinux is written for traditional Podman/Docker containers: it expects all content inside the container to carry a single label, and it runs all of the container's processes with a single label as well.
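To check the "single label" requirement on an nspawn image tree, something like the following could be used. It is only a minimal sketch: it assumes Linux, where the SELinux context is stored in the security.selinux extended attribute, and the names find_mislabeled and read_selinux_label are made up for this illustration. It does not handle files that have no label at all.

```python
import os
from typing import Callable, List, Tuple


def read_selinux_label(path: str) -> str:
    """Return the SELinux context stored in the security.selinux xattr."""
    raw = os.getxattr(path, "security.selinux", follow_symlinks=False)
    return raw.decode().rstrip("\x00")


def selinux_type(context: str) -> str:
    """Extract the type field from a user:role:type:level context."""
    return context.split(":")[2]


def find_mislabeled(
    root: str,
    expected_type: str = "container_file_t",
    get_label: Callable[[str], str] = read_selinux_label,
) -> List[Tuple[str, str]]:
    """List (path, context) pairs under root whose type is not expected_type."""
    # Check the root itself, then every directory entry below it exactly once.
    paths = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        paths.extend(os.path.join(dirpath, name) for name in dirnames + filenames)

    mismatches = []
    for path in paths:
        context = get_label(path)
        if selinux_type(context) != expected_type:
            mismatches.append((path, context))
    return mismatches


if __name__ == "__main__":
    # Hypothetical image location; adjust to the machine being checked.
    for path, context in find_mislabeled("/var/lib/machines/msstest"):
        print(f"{path}: {context}")
```

The get_label parameter exists only so the walk can be exercised on a system without SELinux by passing in a stand-in function.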
from container-selinux.
I have labeled the content under /var/lib/machines/msstest with container_file_t, and created the following entries with semanage fcontext:
/usr/bin/systemd-nspawn system_u:object_r:container_runtime_exec_t:s0
/var/lib/machines/msstest(/.*)? system_u:object_r:container_file_t:s0:c73
I began work on this a couple of years ago, but I admit I get lost in SELinux sometimes. I have been running FreeIPA in systemd-nspawn containers for a while, albeit with only audit2allow-style fixes.
My work started from this example in the systemd-nspawn man page:
Example 7. Run a container with SELinux sandbox security contexts
# chcon system_u:object_r:svirt_sandbox_file_t:s0:c0,c1 -R /srv/container
# systemd-nspawn -L system_u:object_r:svirt_sandbox_file_t:s0:c0,c1 \
-Z system_u:system_r:svirt_lxc_net_t:s0:c0,c1 -D /srv/container /bin/sh
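The c0,c1 part of those contexts is the MCS category set, and the process label passed with -Z carries the same categories as the file label passed with -L. As a rough illustration of why they are kept in sync, here is a small sketch that parses the level field and checks whether one context holds every category of another. It is a simplified model only: it assumes a single level (no ranges such as s0-s0:c0.c1023), the function names are invented here, and it does not consult the actual constraints in the loaded policy.

```python
from typing import FrozenSet, Tuple


def parse_level(context: str) -> Tuple[str, FrozenSet[int]]:
    """Split a context into its sensitivity and its set of MCS categories."""
    # user:role:type:level, where the level itself may contain ':' (s0:c0,c1).
    level = context.split(":", 3)[3]
    sensitivity, _, cats = level.partition(":")
    categories = set()
    for part in filter(None, cats.split(",")):
        if "." in part:
            # A category range such as c0.c3 means c0 through c3 inclusive.
            low, high = part.split(".")
            categories.update(range(int(low[1:]), int(high[1:]) + 1))
        else:
            categories.add(int(part[1:]))
    return sensitivity, frozenset(categories)


def categories_dominate(source: str, target: str) -> bool:
    """True if the source context holds every category set on the target."""
    return parse_level(source)[1] >= parse_level(target)[1]


if __name__ == "__main__":
    process = "system_u:system_r:svirt_lxc_net_t:s0:c0,c1"
    content = "system_u:object_r:svirt_sandbox_file_t:s0:c0,c1"
    print(categories_dominate(process, content))
```

A category check like this is separate from type enforcement; an access also needs a matching allow rule between the two types.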
I find that the number of AVC denials is so low that "we've got to be close" to SELinux containerization for systemd-nspawn, and I am often inspired by things like your recent addition of container_init_t, which seems to be designed for systemd containers (according to the comment in the container.te file).
Would it be possible to "clone" one of the container types in this container-selinux package and build the modifications on top of that? Essentially, how can I create my own policy that inherits from container_t or the like?
from container-selinux.
After the relabeling, what are the AVCs?
from container-selinux.
The AVCs are the same as I originally reported:
AVC avc: denied { write } for pid=41646 comm="systemd-machine" name="msstest" dev="dm-1" ino=2367529 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:object_r:container_file_t:s0:c73 tclass=dir permissive=1
AVC avc: denied { mounton } for pid=53564 comm="(networkd)" path="/" dev="dm-1" ino=2367529 scontext=system_u:system_r:container_userns_t:s0:c73 tcontext=system_u:object_r:container_file_t:s0:c73 tclass=dir permissive=1
AVC avc: denied { remount } for pid=53564 comm="(networkd)" scontext=system_u:system_r:container_userns_t:s0:c73 tcontext=system_u:object_r:fs_t:s0 tclass=filesystem permissive=1
AVC avc: denied { mounton } for pid=53569 comm="(modprobe)" path="/run/systemd/unit-root/proc/sys/kernel/domainname" dev="proc" ino=1123384 scontext=system_u:system_r:container_userns_t:s0:c73 tcontext=system_u:object_r:sysctl_kernel_t:s0 tclass=file permissive=1
AVC avc: denied { mounton } for pid=53571 comm="(r-launch)" path="/tmp/namespace-dev-cNaOYN/dev/pts" dev="tmpfs" ino=1120151 scontext=system_u:system_r:container_userns_t:s0:c73 tcontext=system_u:object_r:container_file_t:s0 tclass=dir permissive=1
AVC avc: denied { unmount } for pid=53571 comm="(r-launch)" scontext=system_u:system_r:container_userns_t:s0:c73 tcontext=system_u:object_r:tmpfs_t:s0 tclass=filesystem permissive=1
AVC avc: denied { sendto } for pid=53547 comm="systemd" path="/systemd/nspawn/notify" scontext=system_u:system_r:container_userns_t:s0:c73 tcontext=system_u:system_r:container_runtime_t:s0 tclass=unix_dgram_socket permissive=1
AVC avc: denied { kill } for pid=41646 comm="systemd-machine" capability=5 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:systemd_machined_t:s0 tclass=cap_userns permissive=1
I'm going to investigate the PrivateTmp directives in the service definitions inside the container. It may be that systemd itself needs some modifications to propagate the SELinux type and MCS to those things as well, especially for the mounton, unmount, and remount permissions.
The issues related to systemd_machined_t are less concerning, as probably just a few bits would need to be added to allow the machinectl and systemd-nspawn binaries to control SELinux-constrained containers.
from container-selinux.
Modifying the service units inside the container makes a difference. I've put a new Fedora 32 container together, since I wanted to include the latest near-stable changes in Fedora's systemd unit files. With the #-prefixed lines in the unit files below commented out, I've made some progress. In permissive mode, the container starts and reports only:
AVC avc: denied { sendto } for pid=8968 comm="systemd" path="/systemd/nspawn/notify" scontext=system_u:system_r:container_userns_t:s0:c32 tcontext=system_u:system_r:container_runtime_t:s0 tclass=unix_dgram_socket permissive=1
In enforcing mode,
AVC avc: denied { mounton } for pid=8980 comm="(networkd)" path="/" dev="dm-1" ino=2361090 scontext=system_u:system_r:container_userns_t:s0:c32 tcontext=system_u:object_r:container_file_t:s0:c32 tclass=dir permissive=0
AVC avc: denied { sendto } for pid=8968 comm="systemd" path="/systemd/nspawn/notify" scontext=system_u:system_r:container_userns_t:s0:c32 tcontext=system_u:system_r:container_runtime_t:s0 tclass=unix_dgram_socket permissive=0
Stopping the container...
USER_AVC pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='avc: denied { stop } for auid=n/a uid=0 gid=0 path="/usr/lib/systemd/system/[email protected]" cmdline="/usr/lib/systemd/systemd-machined" scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:object_r:systemd_unit_file_t:s0 tclass=service permissive=0
systemd-networkd.service
[Service]
AmbientCapabilities=CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_NET_BROADCAST CAP_NET_RAW
CapabilityBoundingSet=CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_NET_BROADCAST CAP_NET_RAW
DeviceAllow=char-* rw
ExecStart=!!/usr/lib/systemd/systemd-networkd
LockPersonality=yes
MemoryDenyWriteExecute=yes
NoNewPrivileges=yes
ProtectControlGroups=yes
ProtectHome=yes
#ProtectKernelModules=yes
#ProtectKernelLogs=yes
#ProtectSystem=strict
Restart=on-failure
RestartSec=0
RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_INET AF_INET6 AF_PACKET AF_ALG
RestrictNamespaces=yes
RestrictRealtime=yes
RestrictSUIDSGID=yes
RuntimeDirectory=systemd/netif
#RuntimeDirectoryPreserve=yes
SystemCallArchitectures=native
SystemCallErrorNumber=EPERM
SystemCallFilter=@system-service
Type=notify
RestartKillSignal=SIGUSR2
User=systemd-network
systemd-homed.service
[Service]
BusName=org.freedesktop.home1
CapabilityBoundingSet=CAP_SYS_ADMIN CAP_CHOWN CAP_DAC_OVERRIDE CAP_FOWNER CAP_FSETID CAP_SETGID CAP_SETUID
DeviceAllow=/dev/loop-control rw
DeviceAllow=/dev/mapper/control rw
DeviceAllow=block-* rw
ExecStart=/usr/lib/systemd/systemd-homed
IPAddressDeny=any
KillMode=mixed
LimitNOFILE=524288
LockPersonality=yes
MemoryDenyWriteExecute=yes
NoNewPrivileges=yes
#PrivateNetwork=yes
RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_ALG
#RestrictNamespaces=mnt
RestrictRealtime=yes
StateDirectory=systemd/home
SystemCallArchitectures=native
SystemCallErrorNumber=EPERM
SystemCallFilter=@system-service @mount
dbus-broker.service
[Service]
Type=notify
Sockets=dbus.socket
OOMScoreAdjust=-900
LimitNOFILE=16384
#ProtectSystem=full
#PrivateTmp=true
#PrivateDevices=true
ExecStart=/usr/bin/dbus-broker-launch --scope system --audit
ExecReload=/usr/bin/busctl call org.freedesktop.DBus /org/freedesktop/DBus org.freedesktop.DBus ReloadConfig
from container-selinux.
The remaining parts are...
# ausearch -m user_avc -m avc -ts recent | audit2allow -m fixcont
module fixcont 1.0;

require {
	type container_userns_t;
	type systemd_unit_file_t;
	type systemd_machined_t;
	type container_file_t;
	type container_runtime_t;
	class service stop;
	class dir mounton;
	class unix_dgram_socket sendto;
}

#============= container_userns_t ==============
allow container_userns_t container_file_t:dir mounton;
allow container_userns_t container_runtime_t:unix_dgram_socket sendto;

#============= systemd_machined_t ==============
allow systemd_machined_t systemd_unit_file_t:service stop;
And to control start/stop of the machine with machinectl, we need:
#============= systemd_machined_t ==============
allow systemd_machined_t container_file_t:dir write;
allow systemd_machined_t self:cap_userns kill;
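For anyone following along, the mapping from an AVC line to an allow rule is mechanical: the source type comes from scontext, the target type from tcontext, and the class and permissions are taken as-is. Below is a minimal sketch of that mapping. It is not how audit2allow is implemented, only an illustration; the regex and function names are my own, it reads plain log lines rather than the audit subsystem, and it ignores MCS categories, booleans, and dontaudit rules.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Matches the fields of interest in both AVC and USER_AVC records.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]+?)\s*\}"
    r".*?scontext=(?P<scontext>\S+)"
    r".*?tcontext=(?P<tcontext>\S+)"
    r".*?tclass=(?P<tclass>\S+)"
)


def context_type(context: str) -> str:
    """Extract the type field from a user:role:type:level context."""
    return context.split(":")[2]


def allow_rules(lines: Iterable[str]) -> List[str]:
    """Turn AVC denial lines into de-duplicated, sorted allow rules."""
    grouped: Dict[Tuple[str, str, str], Set[str]] = defaultdict(set)
    for line in lines:
        match = AVC_RE.search(line)
        if match is None:
            continue  # not a denial record
        source = context_type(match["scontext"])
        target = context_type(match["tcontext"])
        if source == target:
            target = "self"  # a domain acting on itself is written as self
        grouped[(source, target, match["tclass"])].update(match["perms"].split())

    rules = []
    for (source, target, tclass), perms in sorted(grouped.items()):
        ordered = sorted(perms)
        perm_text = ordered[0] if len(ordered) == 1 else "{ " + " ".join(ordered) + " }"
        rules.append(f"allow {source} {target}:{tclass} {perm_text};")
    return rules


if __name__ == "__main__":
    # Two denials copied from earlier in this thread.
    sample = [
        'AVC avc: denied { write } for pid=41646 comm="systemd-machine" name="msstest" dev="dm-1" ino=2367529 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:object_r:container_file_t:s0:c73 tclass=dir permissive=1',
        'AVC avc: denied { kill } for pid=41646 comm="systemd-machine" capability=5 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:systemd_machined_t:s0 tclass=cap_userns permissive=1',
    ]
    for rule in allow_rules(sample):
        print(rule)
```

Run on the two systemd_machined_t denials reported earlier, this produces the same two allow rules listed above. Rules generated this way still deserve review before loading, since a denial may point at a labeling problem rather than a missing rule.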
from container-selinux.
I pushed fixes in container-selinux 2.130.0 for the first two.
The systemd_machined_t AVCs need to be fixed in the fedora selinux policy package.
@wrabcak PTAL
from container-selinux.
Thank you @rhatdan. I was able to do more research last night, and there does appear to be a systemd_nspawn_t domain in the reference policy (https://github.com/SELinuxProject/refpolicy/tree/master/policy/modules/system) that isn't part of the fedora-selinux policy. In fact, the two seem to have diverged quite a bit. I was looking to lift the systemd_nspawn_t domain from the reference policy and build it as a module, but a large number of interfaces in the reference policy don't exist (or have changed) in the fedora-selinux policy, so that will take me some time.
Is the reference policy useful? I was surprised that Fedora, which seems to tout both systemd and SELinux, doesn't have the systemd_nspawn_t part of the policy, which was committed 3 years ago.
I did find something interesting about running systemd-nspawn containers with your container-selinux policy: it seems to "fix" systemd/systemd#680. I don't have this issue with macvlans when running under any of the container_*_t policies, but I do have it when using the default Fedora policy.
from container-selinux.
This is as much of a start as I can get on bringing the bits from the refpolicy in line with Fedora policy before I head back to work. The systemd_nspawn.if file defines the interfaces for which I couldn't identify the right match in the current Fedora policy, so at least the module compiles.
It may be all out of date and not helpful. It also doesn't seem to support userns, but it's been a good learning experience so far to read through all this.
I'll try running a simple systemd-nspawn machine with this tomorrow or Wednesday. Thanks again for your guidance.
systemd_nspawn.fc
/usr/bin/systemd-nspawn -- gen_context(system_u:object_r:systemd_nspawn_exec_t,s0)
/run/systemd/nspawn(/.*)? gen_context(system_u:object_r:systemd_nspawn_runtime_t,s0)
systemd_nspawn.if
## <summary>systemd-nspawn policy</summary>
########################################
## <summary>
## Mount filesystems in the tmp directory (/tmp)
## </summary>
## <param name="domain">
## <summary>
## Domain allowed access.
## </summary>
## </param>
#
interface(`files_mounton_tmp',`
gen_require(`
type tmp_t;
')
allow $1 tmp_t:dir mounton;
')
########################################
## <summary>
## mounton a /var/run directory.
## </summary>
## <param name="domain">
## <summary>
## Domain allowed access.
## </summary>
## </param>
#
interface(`files_mounton_pid_dirs',`
gen_require(`
type var_run_t;
')
allow $1 var_run_t:dir mounton;
')
########################################
## <summary>
## Mount on tmpfs files.
## </summary>
## <param name="domain">
## <summary>
## Domain allowed access.
## </summary>
## </param>
#
interface(`fs_mounton_tmpfs_files',`
gen_require(`
type tmpfs_t;
')
allow $1 tmpfs_t:file mounton;
')
########################################
## <summary>
## remount the proc filesystem.
## </summary>
## <param name="domain">
## <summary>
## Domain allowed access.
## </summary>
## </param>
#
interface(`kernel_remount_proc',`
gen_require(`
type proc_t;
')
allow $1 proc_t:filesystem remount;
')
########################################
## <summary>
## Allow getting init_t rlimit
## </summary>
## <param name="domain">
## <summary>
## Source domain
## </summary>
## </param>
#
interface(`init_getrlimit',`
gen_require(`
type init_t;
')
allow $1 init_t:process getrlimit;
')
systemd_nspawn.te
policy_module(systemd_nspawn, 0.0.1)
#########################################
#
# Declarations
#
## <desc>
## <p>
## Allow systemd-nspawn to create a labelled namespace with the same types
## as parent environment
## </p>
## </desc>
gen_tunable(systemd_nspawn_labeled_namespace, false)
type systemd_nspawn_t;
type systemd_nspawn_exec_t;
init_system_domain(systemd_nspawn_t, systemd_nspawn_exec_t)
mcs_killall(systemd_nspawn_t)
type systemd_nspawn_runtime_t alias systemd_nspawn_var_run_t;
files_pid_file(systemd_nspawn_runtime_t)
type systemd_nspawn_tmp_t;
files_tmp_file(systemd_nspawn_tmp_t)
# Added...
require {
type systemd_journal_t;
type systemd_machined_t;
class dbus send_msg;
}
########################################
#
# Nspawn local policy
#
allow systemd_nspawn_t self:process { signal getcap setcap setfscreate setrlimit sigkill };
allow systemd_nspawn_t self:capability { dac_override dac_read_search fsetid mknod net_admin setgid setuid setpcap sys_admin sys_chroot };
allow systemd_nspawn_t self:capability2 wake_alarm;
allow systemd_nspawn_t self:unix_dgram_socket connected_socket_perms;
allow systemd_nspawn_t self:unix_stream_socket create_stream_socket_perms;
allow systemd_nspawn_t systemd_journal_t:dir search;
allow systemd_nspawn_t systemd_machined_t:dbus send_msg;
allow systemd_nspawn_t systemd_nspawn_runtime_t:dir manage_dir_perms;
allow systemd_nspawn_t systemd_nspawn_runtime_t:file manage_file_perms;
init_pid_filetrans(systemd_nspawn_t, systemd_nspawn_runtime_t, dir)
files_tmp_filetrans(systemd_nspawn_t, systemd_nspawn_tmp_t, { dir file })
allow systemd_nspawn_t systemd_nspawn_tmp_t:dir manage_dir_perms;
allow systemd_nspawn_t systemd_nspawn_tmp_t:dir mounton;
# for /tmp/.#inaccessible*
allow systemd_nspawn_t systemd_nspawn_tmp_t:file manage_file_perms;
# for /run/systemd/nspawn/incoming in chroot
allow systemd_nspawn_t systemd_nspawn_runtime_t:dir mounton;
kernel_mount_proc(systemd_nspawn_t)
kernel_mounton_systemd_ProtectKernelTunables(systemd_nspawn_t)
kernel_mounton_messages(systemd_nspawn_t)
kernel_read_kernel_sysctls(systemd_nspawn_t)
kernel_read_system_state(systemd_nspawn_t)
kernel_remount_proc(systemd_nspawn_t)
corecmd_exec_shell(systemd_nspawn_t)
corecmd_search_bin(systemd_nspawn_t)
corenet_rw_tun_tap_dev(systemd_nspawn_t)
dev_getattr_fs(systemd_nspawn_t)
dev_manage_sysfs_dirs(systemd_nspawn_t)
dev_mounton_sysfs(systemd_nspawn_t)
dev_mount_sysfs_fs(systemd_nspawn_t)
dev_read_rand(systemd_nspawn_t)
dev_read_urand(systemd_nspawn_t)
files_getattr_tmp_dirs(systemd_nspawn_t)
files_manage_etc_files(systemd_nspawn_t)
files_manage_mnt_dirs(systemd_nspawn_t)
files_mounton_mnt(systemd_nspawn_t)
files_mounton_rootfs(systemd_nspawn_t)
files_mounton_tmp(systemd_nspawn_t)
files_read_kernel_symbol_table(systemd_nspawn_t)
files_setattr_pid_dirs(systemd_nspawn_t)
fs_getattr_tmpfs(systemd_nspawn_t)
fs_manage_tmpfs_chr_files(systemd_nspawn_t)
fs_mount_tmpfs(systemd_nspawn_t)
fs_remount_tmpfs(systemd_nspawn_t)
fs_remount_xattr_fs(systemd_nspawn_t)
fs_read_cgroup_files(systemd_nspawn_t)
term_getattr_generic_ptys(systemd_nspawn_t)
term_getattr_pty_fs(systemd_nspawn_t)
term_mount_pty_fs(systemd_nspawn_t)
term_search_ptys(systemd_nspawn_t)
term_setattr_generic_ptys(systemd_nspawn_t)
term_use_ptmx(systemd_nspawn_t)
init_domtrans_script(systemd_nspawn_t)
#init_getrlimit(systemd_nspawn_t)
init_sigkill_script(systemd_nspawn_t)
init_read_state(systemd_nspawn_t)
init_search_pid_dirs(systemd_nspawn_t)
init_write_pid_socket(systemd_nspawn_t)
init_spec_domtrans_script(systemd_nspawn_t)
miscfiles_manage_localization(systemd_nspawn_t)
# for writing inside chroot
sysnet_manage_config(systemd_nspawn_t)
userdom_manage_user_home_dirs(systemd_nspawn_t)
tunable_policy(`systemd_nspawn_labeled_namespace',`
corecmd_exec_bin(systemd_nspawn_t)
corecmd_exec_shell(systemd_nspawn_t)
dev_mounton(systemd_nspawn_t)
dev_setattr_generic_dirs(systemd_nspawn_t)
# manage etc symlinks for /etc/localtime
files_manage_etc_symlinks(systemd_nspawn_t)
files_mounton_pid_dirs(systemd_nspawn_t)
files_search_home(systemd_nspawn_t)
fs_getattr_cgroup(systemd_nspawn_t)
fs_manage_cgroup_dirs(systemd_nspawn_t)
fs_manage_tmpfs_dirs(systemd_nspawn_t)
fs_manage_tmpfs_files(systemd_nspawn_t)
fs_manage_tmpfs_symlinks(systemd_nspawn_t)
fs_mount_cgroup(systemd_nspawn_t)
fs_mounton_cgroup(systemd_nspawn_t)
fs_mounton_tmpfs(systemd_nspawn_t)
fs_mounton_tmpfs_files(systemd_nspawn_t)
fs_remount_cgroup(systemd_nspawn_t)
fs_search_tmpfs(systemd_nspawn_t)
fs_unmount_cgroup(systemd_nspawn_t)
fs_write_cgroup_files(systemd_nspawn_t)
selinux_getattr_fs(systemd_nspawn_t)
selinux_remount_fs(systemd_nspawn_t)
# Does not compile
#selinux_search_fs(systemd_nspawn_t)
init_domtrans(systemd_nspawn_t)
logging_search_logs(systemd_nspawn_t)
seutil_search_default_contexts(systemd_nspawn_t)
')
optional_policy(`
allow systemd_machined_t systemd_nspawn_t:dbus send_msg;
dbus_system_bus_client(systemd_nspawn_t)
optional_policy(`
unconfined_dbus_send(systemd_machined_t)
')
')
from container-selinux.
You should open this discussion on the fedora-selinux GitHub. This has nothing to do with container-selinux.
from container-selinux.
Yes, could you please move this discussion to fedora-selinux/selinux-policy repo?
FYI @zpytela
from container-selinux.
Ok. I have moved this discussion: fedora-selinux/selinux-policy#344. Thanks.
from container-selinux.