
Comments (12)

rhatdan avatar rhatdan commented on July 18, 2024

container-selinux is not necessarily going to work with systemd-nspawn, unless all of the content in the image has a label of container_file_t.

container-selinux is for traditional Podman/Docker containers, and expects all content inside of the container to have a single label. It runs all of the processes with a single label as well.

from container-selinux.

amessina avatar amessina commented on July 18, 2024

I have labeled the content under /var/lib/machines/msstest with container_file_t and created the following entries with semanage fcontext:

/usr/bin/systemd-nspawn    system_u:object_r:container_runtime_exec_t:s0
/var/lib/machines/msstest(/.*)?    system_u:object_r:container_file_t:s0:c73
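
For readers who want to reproduce entries like the two above, here is a minimal sketch of how they might be created. The helper name label_machine and the RUN dry-run switch are illustrative assumptions, not part of any SELinux tool; the machine name and MCS category are taken from this comment. By default the commands are only printed, since running them needs root and changes the local file-context database. If a matching local entry already exists, semanage may need -m instead of -a.

```shell
# Sketch: print (or run) the commands that add local file-context entries
# and relabel the machine tree. label_machine and RUN are hypothetical names.
label_machine() {
    name=$1
    mcs=$2
    run=${RUN-echo}   # unset RUN: dry run; RUN= (empty): execute for real
    $run semanage fcontext -a -t container_runtime_exec_t /usr/bin/systemd-nspawn &&
    $run semanage fcontext -a -t container_file_t -r "$mcs" "/var/lib/machines/$name(/.*)?" &&
    $run restorecon -v /usr/bin/systemd-nspawn &&
    $run restorecon -R -v "/var/lib/machines/$name"
}

label_machine msstest s0:c73
```

Calling it as `RUN= label_machine msstest s0:c73` runs the commands instead of printing them.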

I began work on this a couple years ago but I admit I get lost in SELinux sometimes. I have been running FreeIPA for a while in systemd-nspawn containers, albeit with only audit2allow style fixes.
My work started from the example in the systemd-nspawn man page:

       Example 7. Run a container with SELinux sandbox security contexts

           # chcon system_u:object_r:svirt_sandbox_file_t:s0:c0,c1 -R /srv/container
           # systemd-nspawn -L system_u:object_r:svirt_sandbox_file_t:s0:c0,c1 \
                 -Z system_u:system_r:svirt_lxc_net_t:s0:c0,c1 -D /srv/container /bin/sh

I find that the number of AVC denials is so low that "we've got to be close" to SELinux containerization for systemd-nspawn and I am often inspired by things like your recent addition of container_init_t which seem to be designed for systemd containers (according to the comment in the container.te file).

Would it be possible to "clone" one of the container types in this container-selinux package and build the modifications on top of that? Essentially, how can I create my own policy that inherits container_t or the like?


rhatdan avatar rhatdan commented on July 18, 2024

After the relabeling, what are the AVCs?


amessina avatar amessina commented on July 18, 2024

The AVCs are the same as I originally reported:

AVC avc:  denied  { write } for  pid=41646 comm="systemd-machine" name="msstest" dev="dm-1" ino=2367529 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:object_r:container_file_t:s0:c73 tclass=dir permissive=1
AVC avc:  denied  { mounton } for  pid=53564 comm="(networkd)" path="/" dev="dm-1" ino=2367529 scontext=system_u:system_r:container_userns_t:s0:c73 tcontext=system_u:object_r:container_file_t:s0:c73 tclass=dir permissive=1
AVC avc:  denied  { remount } for  pid=53564 comm="(networkd)" scontext=system_u:system_r:container_userns_t:s0:c73 tcontext=system_u:object_r:fs_t:s0 tclass=filesystem permissive=1
AVC avc:  denied  { mounton } for  pid=53569 comm="(modprobe)" path="/run/systemd/unit-root/proc/sys/kernel/domainname" dev="proc" ino=1123384 scontext=system_u:system_r:container_userns_t:s0:c73 tcontext=system_u:object_r:sysctl_kernel_t:s0 tclass=file permissive=1
AVC avc:  denied  { mounton } for  pid=53571 comm="(r-launch)" path="/tmp/namespace-dev-cNaOYN/dev/pts" dev="tmpfs" ino=1120151 scontext=system_u:system_r:container_userns_t:s0:c73 tcontext=system_u:object_r:container_file_t:s0 tclass=dir permissive=1
AVC avc:  denied  { unmount } for  pid=53571 comm="(r-launch)" scontext=system_u:system_r:container_userns_t:s0:c73 tcontext=system_u:object_r:tmpfs_t:s0 tclass=filesystem permissive=1
AVC avc:  denied  { sendto } for  pid=53547 comm="systemd" path="/systemd/nspawn/notify" scontext=system_u:system_r:container_userns_t:s0:c73 tcontext=system_u:system_r:container_runtime_t:s0 tclass=unix_dgram_socket permissive=1
AVC avc:  denied  { kill } for  pid=41646 comm="systemd-machine" capability=5  scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:systemd_machined_t:s0 tclass=cap_userns permissive=1

I'm going to investigate the PrivateTmp directives in the service definitions inside the container. It may be that systemd itself needs some modifications to propagate the SELinux type and MCS label to the mounts it sets up for those directives, especially given the mounton, unmount, and remount denials.

The issues related to systemd_machined_t are less concerning, as probably only a few rules would need to be added to allow the machinectl and systemd-nspawn binaries to control SELinux-constrained containers.
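
To see at a glance which source type is denied which permission on which target, the AVC lines quoted in this comment can be condensed to one line per distinct denial. The following is a rough sketch using sed; summarize_avc is a hypothetical helper name, and it only approximates the grouping audit2allow does, so it is not a replacement for that tool.

```shell
# Sketch: reduce raw AVC lines on stdin to "source_type target_type:class perms".
# summarize_avc is a hypothetical helper, not part of the audit or SELinux tools.
summarize_avc() {
    sed -n 's/.*denied  *{ \([^}]*\) }.*scontext=[^:]*:[^:]*:\([^:]*\):.*tcontext=[^:]*:[^:]*:\([^:]*\):.*tclass=\([a-z0-9_]*\).*/\2 \3:\4 \1/p' |
        sort -u
}
```

On a live system it could be fed from the audit log, for example `ausearch -m avc -ts recent | summarize_avc`.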


amessina avatar amessina commented on July 18, 2024

Modifying the service units inside the container makes a difference. I've put a new Fedora 32 container together since I wanted to include the latest near-stable changes in Fedora's systemd unit files. With the lines commented out with # in the units below, I've made some progress. In permissive mode, the container starts and only reports:

AVC avc:  denied  { sendto } for  pid=8968 comm="systemd" path="/systemd/nspawn/notify" scontext=system_u:system_r:container_userns_t:s0:c32 tcontext=system_u:system_r:container_runtime_t:s0 tclass=unix_dgram_socket permissive=1

In enforcing mode,

AVC avc:  denied  { mounton } for  pid=8980 comm="(networkd)" path="/" dev="dm-1" ino=2361090 scontext=system_u:system_r:container_userns_t:s0:c32 tcontext=system_u:object_r:container_file_t:s0:c32 tclass=dir permissive=0
AVC avc:  denied  { sendto } for  pid=8968 comm="systemd" path="/systemd/nspawn/notify" scontext=system_u:system_r:container_userns_t:s0:c32 tcontext=system_u:system_r:container_runtime_t:s0 tclass=unix_dgram_socket permissive=0

Stopping the container...

USER_AVC pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='avc:  denied  { stop } for auid=n/a uid=0 gid=0 path="/usr/lib/systemd/system/[email protected]" cmdline="/usr/lib/systemd/systemd-machined" scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:object_r:systemd_unit_file_t:s0 tclass=service permissive=0

systemd-networkd.service

[Service]
AmbientCapabilities=CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_NET_BROADCAST CAP_NET_RAW
CapabilityBoundingSet=CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_NET_BROADCAST CAP_NET_RAW
DeviceAllow=char-* rw
ExecStart=!!/usr/lib/systemd/systemd-networkd
LockPersonality=yes
MemoryDenyWriteExecute=yes
NoNewPrivileges=yes
ProtectControlGroups=yes
ProtectHome=yes
#ProtectKernelModules=yes
#ProtectKernelLogs=yes
#ProtectSystem=strict
Restart=on-failure
RestartSec=0
RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_INET AF_INET6 AF_PACKET AF_ALG
RestrictNamespaces=yes
RestrictRealtime=yes
RestrictSUIDSGID=yes
RuntimeDirectory=systemd/netif
#RuntimeDirectoryPreserve=yes
SystemCallArchitectures=native
SystemCallErrorNumber=EPERM
SystemCallFilter=@system-service
Type=notify
RestartKillSignal=SIGUSR2
User=systemd-network

systemd-homed.service

[Service]
BusName=org.freedesktop.home1
CapabilityBoundingSet=CAP_SYS_ADMIN CAP_CHOWN CAP_DAC_OVERRIDE CAP_FOWNER CAP_FSETID CAP_SETGID CAP_SETUID
DeviceAllow=/dev/loop-control rw
DeviceAllow=/dev/mapper/control rw
DeviceAllow=block-* rw
ExecStart=/usr/lib/systemd/systemd-homed
IPAddressDeny=any
KillMode=mixed
LimitNOFILE=524288
LockPersonality=yes
MemoryDenyWriteExecute=yes
NoNewPrivileges=yes
#PrivateNetwork=yes
RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_ALG
#RestrictNamespaces=mnt
RestrictRealtime=yes
StateDirectory=systemd/home
SystemCallArchitectures=native
SystemCallErrorNumber=EPERM
SystemCallFilter=@system-service @mount

dbus-broker.service

[Service]
Type=notify
Sockets=dbus.socket
OOMScoreAdjust=-900
LimitNOFILE=16384
#ProtectSystem=full
#PrivateTmp=true
#PrivateDevices=true
ExecStart=/usr/bin/dbus-broker-launch --scope system --audit
ExecReload=/usr/bin/busctl call org.freedesktop.DBus /org/freedesktop/DBus org.freedesktop.DBus ReloadConfig
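
As an alternative to commenting lines out of the packaged unit files, the same directives could be switched off through drop-in files inside the container tree, which survive package updates. This is a different technique from the edits shown above, offered only as a sketch: write_dropin is a hypothetical helper, and the container root path passed to it is an assumption.

```shell
# Sketch: write <root>/etc/systemd/system/<unit>.d/override.conf containing
# a [Service] section with the given directives.
# write_dropin is a hypothetical helper name.
write_dropin() {
    root=$1
    unit=$2
    shift 2
    dir="$root/etc/systemd/system/$unit.d"
    mkdir -p "$dir" || return 1
    {
        printf '[Service]\n'
        printf '%s\n' "$@"
    } > "$dir/override.conf"
}
```

For the dbus-broker unit above this might be called as `write_dropin /var/lib/machines/msstest dbus-broker.service ProtectSystem=no PrivateTmp=no PrivateDevices=no`, and similarly for systemd-networkd.service with `ProtectKernelModules=no ProtectKernelLogs=no ProtectSystem=no RuntimeDirectoryPreserve=no`.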


amessina avatar amessina commented on July 18, 2024

The remaining parts are...

# ausearch -m user_avc -m avc -ts recent | audit2allow -m fixcont

module fixcont 1.0;

require {
        type container_userns_t;
        type systemd_unit_file_t;
        type systemd_machined_t;
        type container_file_t;
        type container_runtime_t;
        class service stop;
        class dir mounton;
        class unix_dgram_socket sendto;
}

#============= container_userns_t ==============
allow container_userns_t container_file_t:dir mounton;
allow container_userns_t container_runtime_t:unix_dgram_socket sendto;

#============= systemd_machined_t ==============
allow systemd_machined_t systemd_unit_file_t:service stop;

And to control start/stop of the machine with machinectl, we also need:

#============= systemd_machined_t ==============
allow systemd_machined_t container_file_t:dir write;
allow systemd_machined_t self:cap_userns kill;
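
A module generated by audit2allow -m, like the one above, is plain policy source without refpolicy macros, so it can be compiled with checkmodule and semodule_package. The sketch below assumes the output was saved as fixcont.te (the filename is an assumption); build_module and the RUN dry-run switch are hypothetical names. By default the commands are only printed, because loading a module needs root and changes the running policy.

```shell
# Sketch: compile, package, and load a local policy module from <name>.te.
# build_module and RUN are hypothetical names.
build_module() {
    name=$1
    run=${RUN-echo}   # unset RUN: dry run; RUN= (empty): execute for real
    $run checkmodule -M -m -o "$name.mod" "$name.te" &&
    $run semodule_package -o "$name.pp" -m "$name.mod" &&
    $run semodule -i "$name.pp"
}

build_module fixcont
```

Piping the denials into `audit2allow -M fixcont` instead of `-m` produces fixcont.pp directly, leaving only the `semodule -i` step.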


rhatdan avatar rhatdan commented on July 18, 2024

I pushed fixes in container-selinux 2.130.0 for the first two.
The systemd_machined_t AVCs need to be fixed in the Fedora SELinux policy package.
@wrabcak PTAL


amessina avatar amessina commented on July 18, 2024

Thank you @rhatdan. I was able to do more research last night, and there does appear to be a systemd_nspawn_t domain configured in the reference policy https://github.com/SELinuxProject/refpolicy/tree/master/policy/modules/system that isn't a part of the fedora-selinux policy. In fact, the two seem to have diverged quite a bit. I was looking to extract the systemd_nspawn_t domain from the reference policy and build it as a module, but a large number of interfaces in the reference policy don't exist (or have changed) in the fedora-selinux policy, so that will take me some time.

Is the reference policy useful? I was surprised that Fedora, which seems to tout systemd and SELinux, doesn't have the systemd_nspawn_t part of the policy that was committed 3 years ago.

I did find something interesting about running systemd-nspawn containers with your container-selinux policy -- it seems to "fix" systemd/systemd#680 -- I don't have this issue with macvlans when running under any of the container_*_t policies, but I do have the issue when using the default Fedora policy.


amessina avatar amessina commented on July 18, 2024

This is as much of a start as I can get on bringing in the bits from the refpolicy in line with Fedora policy before I head back to work. The systemd_nspawn.if contains the interfaces for which I couldn't identify the right match in current Fedora policy, so at least it compiles.

It may be all out of date and not helpful. It also doesn't seem to support userns, but it's been a good learning experience so far to read through all this.

I'll try running a simple systemd-nspawn machine with this tomorrow or Wednesday. Thanks again for your guidance.

systemd_nspawn.fc

/usr/bin/systemd-nspawn			--	gen_context(system_u:object_r:systemd_nspawn_exec_t,s0)
/run/systemd/nspawn(/.*)?	gen_context(system_u:object_r:systemd_nspawn_runtime_t,s0)

systemd_nspawn.if

## <summary>systemd-nspawn policy</summary> 

########################################
## <summary>
##	Mount filesystems in the tmp directory (/tmp)
## </summary>
## <param name="domain">
##	<summary>
##	Domain allowed access.
##	</summary>
## </param>
#
interface(`files_mounton_tmp',`
	gen_require(`
		type tmp_t;
	')

	allow $1 tmp_t:dir mounton;
')

########################################
## <summary>
##	mounton a /var/run directory.
## </summary>
## <param name="domain">
##	<summary>
##	Domain allowed access.
##	</summary>
## </param>
#
interface(`files_mounton_pid_dirs',`
	gen_require(`
		type var_run_t;
	')

	allow $1 var_run_t:dir mounton;
')

########################################
## <summary>
##	Mount on tmpfs files.
## </summary>
## <param name="domain">
##	<summary>
##	Domain allowed access.
##	</summary>
## </param>
#
interface(`fs_mounton_tmpfs_files',`
	gen_require(`
		type tmpfs_t;
	')

	allow $1 tmpfs_t:file mounton;
')

########################################
## <summary>
##	remount the proc filesystem.
## </summary>
## <param name="domain">
##	<summary>
##	Domain allowed access.
##	</summary>
## </param>
#
interface(`kernel_remount_proc',`
	gen_require(`
		type proc_t;
	')

	allow $1 proc_t:filesystem remount;
')

########################################
## <summary>
##      Allow getting init_t rlimit
## </summary>
## <param name="domain">
##      <summary>
##      Source domain
##      </summary>
## </param>
#
interface(`init_getrlimit',`
	gen_require(`
		type init_t;
	')

	allow $1 init_t:process getrlimit;
')

systemd_nspawn.te

policy_module(systemd_nspawn, 0.0.1)

#########################################
#
# Declarations
#

## <desc>
## <p>
## Allow systemd-nspawn to create a labelled namespace with the same types
## as parent environment
## </p>
## </desc>
gen_tunable(systemd_nspawn_labeled_namespace, false)

type systemd_nspawn_t;
type systemd_nspawn_exec_t;
init_system_domain(systemd_nspawn_t, systemd_nspawn_exec_t)
mcs_killall(systemd_nspawn_t)

type systemd_nspawn_runtime_t alias systemd_nspawn_var_run_t;
files_pid_file(systemd_nspawn_runtime_t)

type systemd_nspawn_tmp_t;
files_tmp_file(systemd_nspawn_tmp_t)

# Added...
require {
  type systemd_journal_t;
  type systemd_machined_t;
  class dbus send_msg;
}

########################################
#
# Nspawn local policy
#

allow systemd_nspawn_t self:process { signal getcap setcap setfscreate setrlimit sigkill };
allow systemd_nspawn_t self:capability { dac_override dac_read_search fsetid mknod net_admin setgid setuid setpcap sys_admin sys_chroot };
allow systemd_nspawn_t self:capability2 wake_alarm;
allow systemd_nspawn_t self:unix_dgram_socket connected_socket_perms;
allow systemd_nspawn_t self:unix_stream_socket create_stream_socket_perms;

allow systemd_nspawn_t systemd_journal_t:dir search;

allow systemd_nspawn_t systemd_machined_t:dbus send_msg;

allow systemd_nspawn_t systemd_nspawn_runtime_t:dir manage_dir_perms;
allow systemd_nspawn_t systemd_nspawn_runtime_t:file manage_file_perms;
init_pid_filetrans(systemd_nspawn_t, systemd_nspawn_runtime_t, dir)

files_tmp_filetrans(systemd_nspawn_t, systemd_nspawn_tmp_t, { dir file })
allow systemd_nspawn_t systemd_nspawn_tmp_t:dir manage_dir_perms;
allow systemd_nspawn_t systemd_nspawn_tmp_t:dir mounton;
# for /tmp/.#inaccessible*
allow systemd_nspawn_t systemd_nspawn_tmp_t:file manage_file_perms;

# for /run/systemd/nspawn/incoming in chroot
allow systemd_nspawn_t systemd_nspawn_runtime_t:dir mounton;

kernel_mount_proc(systemd_nspawn_t)
kernel_mounton_systemd_ProtectKernelTunables(systemd_nspawn_t)
kernel_mounton_messages(systemd_nspawn_t)
kernel_read_kernel_sysctls(systemd_nspawn_t)
kernel_read_system_state(systemd_nspawn_t)
kernel_remount_proc(systemd_nspawn_t)

corecmd_exec_shell(systemd_nspawn_t)
corecmd_search_bin(systemd_nspawn_t)

corenet_rw_tun_tap_dev(systemd_nspawn_t)

dev_getattr_fs(systemd_nspawn_t)
dev_manage_sysfs_dirs(systemd_nspawn_t)
dev_mounton_sysfs(systemd_nspawn_t)
dev_mount_sysfs_fs(systemd_nspawn_t)
dev_read_rand(systemd_nspawn_t)
dev_read_urand(systemd_nspawn_t)

files_getattr_tmp_dirs(systemd_nspawn_t)
files_manage_etc_files(systemd_nspawn_t)
files_manage_mnt_dirs(systemd_nspawn_t)
files_mounton_mnt(systemd_nspawn_t)
files_mounton_rootfs(systemd_nspawn_t)
files_mounton_tmp(systemd_nspawn_t)
files_read_kernel_symbol_table(systemd_nspawn_t)
files_setattr_pid_dirs(systemd_nspawn_t)

fs_getattr_tmpfs(systemd_nspawn_t)
fs_manage_tmpfs_chr_files(systemd_nspawn_t)
fs_mount_tmpfs(systemd_nspawn_t)
fs_remount_tmpfs(systemd_nspawn_t)
fs_remount_xattr_fs(systemd_nspawn_t)
fs_read_cgroup_files(systemd_nspawn_t)

term_getattr_generic_ptys(systemd_nspawn_t)
term_getattr_pty_fs(systemd_nspawn_t)
term_mount_pty_fs(systemd_nspawn_t)
term_search_ptys(systemd_nspawn_t)
term_setattr_generic_ptys(systemd_nspawn_t)
term_use_ptmx(systemd_nspawn_t)

init_domtrans_script(systemd_nspawn_t)
#init_getrlimit(systemd_nspawn_t)
init_sigkill_script(systemd_nspawn_t)
init_read_state(systemd_nspawn_t)
init_search_pid_dirs(systemd_nspawn_t)
init_write_pid_socket(systemd_nspawn_t)
init_spec_domtrans_script(systemd_nspawn_t)

miscfiles_manage_localization(systemd_nspawn_t)

# for writing inside chroot
sysnet_manage_config(systemd_nspawn_t)

userdom_manage_user_home_dirs(systemd_nspawn_t)

tunable_policy(`systemd_nspawn_labeled_namespace',`
	corecmd_exec_bin(systemd_nspawn_t)
	corecmd_exec_shell(systemd_nspawn_t)

	dev_mounton(systemd_nspawn_t)
	dev_setattr_generic_dirs(systemd_nspawn_t)

	# manage etc symlinks for /etc/localtime
	files_manage_etc_symlinks(systemd_nspawn_t)
	files_mounton_pid_dirs(systemd_nspawn_t)
	files_search_home(systemd_nspawn_t)

	fs_getattr_cgroup(systemd_nspawn_t)
	fs_manage_cgroup_dirs(systemd_nspawn_t)
	fs_manage_tmpfs_dirs(systemd_nspawn_t)
	fs_manage_tmpfs_files(systemd_nspawn_t)
	fs_manage_tmpfs_symlinks(systemd_nspawn_t)
	fs_mount_cgroup(systemd_nspawn_t)
	fs_mounton_cgroup(systemd_nspawn_t)
	fs_mounton_tmpfs(systemd_nspawn_t)
	fs_mounton_tmpfs_files(systemd_nspawn_t)
	fs_remount_cgroup(systemd_nspawn_t)
	fs_search_tmpfs(systemd_nspawn_t)
	fs_unmount_cgroup(systemd_nspawn_t)
	fs_write_cgroup_files(systemd_nspawn_t)

	selinux_getattr_fs(systemd_nspawn_t)
	selinux_remount_fs(systemd_nspawn_t)
	
	# Does not compile
	#selinux_search_fs(systemd_nspawn_t)

	init_domtrans(systemd_nspawn_t)

	logging_search_logs(systemd_nspawn_t)

	seutil_search_default_contexts(systemd_nspawn_t)
')

optional_policy(`
	allow systemd_machined_t systemd_nspawn_t:dbus send_msg;

	dbus_system_bus_client(systemd_nspawn_t)

	optional_policy(`
		unconfined_dbus_send(systemd_machined_t)
	')
')
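
Because the three files above use refpolicy macros (policy_module, interface, gen_tunable), they cannot be compiled with checkmodule alone. A common route, assuming the selinux-policy-devel package is installed and the files sit in the current directory, is the Makefile that package ships. The sketch below only prints the commands by default; build_refpolicy_module and the RUN switch are hypothetical names.

```shell
# Sketch: build and load a refpolicy-style module (.te/.if/.fc) using the
# Makefile from selinux-policy-devel (assumed to be installed).
# build_refpolicy_module and RUN are hypothetical names.
build_refpolicy_module() {
    name=$1
    run=${RUN-echo}   # unset RUN: dry run; RUN= (empty): execute for real
    $run make -f /usr/share/selinux/devel/Makefile "$name.pp" &&
    $run semodule -i "$name.pp"
}

build_refpolicy_module systemd_nspawn
```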


rhatdan avatar rhatdan commented on July 18, 2024

You should open this discussion on the fedora-selinux github. This has nothing to do with container-selinux.


wrabcak avatar wrabcak commented on July 18, 2024

Yes, could you please move this discussion to fedora-selinux/selinux-policy repo?

FYI @zpytela


amessina avatar amessina commented on July 18, 2024

Ok. I have moved this discussion: fedora-selinux/selinux-policy#344. Thanks.

