[PATCH] host/rootfs: Sandbox crosvm
This means that a breach of crosvm is not guaranteed to be fatal.

The Wayland socket is still only accessible by root, so crosvm must run
as root. The known container escape via /proc/self/exe is blocked by
bwrap being on a read-only filesystem. Container escapes via /proc are
blocked by remounting /proc read-only. Crosvm does not have
CAP_SYS_ADMIN so it cannot change mounts.

The two remaining steps are:

- Run crosvm as an unprivileged user.
- Enable seccomp to block most system calls.

The latter should be done from within crosvm itself.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
 host/rootfs/default.nix                            |  6 +++---
 .../template/data/service/vhost-user-gpu/run       | 17 ++++++++++++++++-
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/host/rootfs/default.nix b/host/rootfs/default.nix
index b441a517f3bbb78f84d8566ca6dfd9181d0302be..81e12b6c2e98ca789d2d14e56dd2b7175296c1e8 100644
--- a/host/rootfs/default.nix
+++ b/host/rootfs/default.nix
@@ -10,7 +10,7 @@ pkgsMusl.callPackage (

 { spectrum-host-tools
 , lib, stdenvNoCC, nixos, runCommand, writeClosure, erofs-utils, s6-rc
-, busybox, cloud-hypervisor, cosmic-files, crosvm, cryptsetup
+, bubblewrap, busybox, cloud-hypervisor, cosmic-files, crosvm, cryptsetup
 , dejavu_fonts, dbus, execline, foot, fuse3, iproute2, inotify-tools
 , jq, kmod, mdevd, mesa, s6, s6-linux-init, socat, systemd
 , util-linuxMinimal, virtiofsd, westonLite, xdg-desktop-portal
@@ -25,8 +25,8 @@ let
   trivial;

   packages = [
-    btrfs-progs cloud-hypervisor cosmic-files crosvm cryptsetup dbus
-    execline fuse3 inotify-tools iproute2 jq kmod mdevd s6 s6-linux-init
+    bubblewrap btrfs-progs cloud-hypervisor cosmic-files crosvm cryptsetup
+    dbus execline fuse3 inotify-tools iproute2 jq kmod mdevd s6 s6-linux-init
     s6-rc socat spectrum-host-tools util-linuxMinimal virtiofsd
     xdg-desktop-portal-spectrum-host
diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run
index 0b4f6a00bc7aed0e721454d584d3bcd47fb18e2a..4838199a859cfadb45c23fb314f4651c6a6b3041 100755
--- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run
+++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run
@@ -1,10 +1,25 @@
 #!/bin/execlineb -P
 # SPDX-License-Identifier: EUPL-1.2+
 # SPDX-FileCopyrightText: 2025 Alyssa Ross <hi@alyssa.is>
+# SPDX-FileCopyrightText: 2025 Demi Marie Obenour <demiobenour@gmail.com>

 s6-ipcserver -1a 0700 -C 1 -b 1 env/crosvm.sock

-crosvm --no-syslog device gpu
+bwrap
+  --unshare-all
+  --unshare-user
+  --bind /run/user/0/wayland-1 /run/user/0/wayland-1
+  --ro-bind /usr /usr
+  --ro-bind /lib /lib
+  --tmpfs /tmp
+  --dev /dev
+  --tmpfs /dev/shm
+  --ro-bind /nix /nix
+  --disable-userns
+  --proc /proc
+  --remount-ro /proc
+  --
+  crosvm --no-syslog device gpu
   --fd 0
   --wayland-sock /run/user/0/wayland-1
   --params "{\"context-types\":\"cross-domain\"}"

---
base-commit: 965f5706764edb1b4fea147683b5ab803dd6df5e
change-id: 20251129-sandbox-5a42a6a41b59

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
This restricts the access of these programs to the system. Seccomp is
not enabled, though, and the processes still run as root. Full
sandboxing needs additional work. In particular, Cloud Hypervisor
should receive access to VFIO devices via file descriptor passing.

Sandboxing Cloud Hypervisor requires the use of sh, as there is no s6
or execline program to increase hard resource limits.

D-Bus and the portal are not sandboxed. They have full access to all
user files by design, so a breach of either is catastrophic no matter
what. Furthermore, sandboxing them even slightly proved very difficult.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
Changes in v2:
- Sandbox Cloud Hypervisor, virtiofsd, and the router
- Link to v1: https://spectrum-os.org/lists/archives/spectrum-devel/20251129-sandbox-v1-1-...

---
Demi Marie Obenour (4):
      host/rootfs: Sandbox crosvm
      host/rootfs: Sandbox router
      host/rootfs: Sandbox virtiofsd
      host/rootfs: Sandbox Cloud Hypervisor

 host/rootfs/default.nix                            |  4 +--
 .../template/data/service/spectrum-router/run      | 19 +++++++++++--
 .../template/data/service/vhost-user-fs/run        | 28 ++++++++++++++++--
 .../template/data/service/vhost-user-gpu/run       | 24 +++++++++++++++-
 .../image/etc/udev/rules.d/99-spectrum.rules       |  3 ++
 host/rootfs/image/usr/bin/run-vmm                  | 33 +++++++++++++++++++---
 6 files changed, 98 insertions(+), 13 deletions(-)
---
base-commit: 44f32b7a4b3cfbb4046447318e6753dd0eb71add
change-id: 20251129-sandbox-5a42a6a41b59

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
This means that a breach of crosvm is not guaranteed to be fatal.

The Wayland socket is still only accessible by root, so crosvm must run
as root. The known container escape via /proc/self/exe is blocked by
bwrap being on a read-only filesystem. Container escapes via /proc are
blocked by remounting /proc read-only. Crosvm does not have
CAP_SYS_ADMIN so it cannot change mounts.

The two remaining steps are:

- Run crosvm as an unprivileged user.
- Enable seccomp to block most system calls.

The latter should be done from within crosvm itself.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
 host/rootfs/default.nix                            |  4 ++--
 .../template/data/service/vhost-user-gpu/run       | 24 +++++++++++++++++++++-
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/host/rootfs/default.nix b/host/rootfs/default.nix
index ca2084f26d58be5e0e1695634e125032c50f82b2..4716bb7298515b2940cad09bb55e42c196ce7ebc 100644
--- a/host/rootfs/default.nix
+++ b/host/rootfs/default.nix
@@ -10,7 +10,7 @@ pkgsMusl.callPackage (

 { spectrum-host-tools, spectrum-router
 , lib, stdenvNoCC, nixos, runCommand, writeClosure, erofs-utils, s6-rc
-, btrfs-progs, busybox, cloud-hypervisor, cosmic-files, crosvm
+, btrfs-progs, bubblewrap, busybox, cloud-hypervisor, cosmic-files, crosvm
 , cryptsetup, dejavu_fonts, dbus, execline, foot, fuse3, iproute2
 , inotify-tools, jq, kmod, mdevd, mesa, mount-flatpak, s6
 , s6-linux-init, socat, systemd, util-linuxMinimal, virtiofsd
@@ -25,7 +25,7 @@ let
   trivial;

   packages = [
-    btrfs-progs cloud-hypervisor cosmic-files crosvm cryptsetup dbus
+    btrfs-progs bubblewrap cloud-hypervisor cosmic-files crosvm cryptsetup dbus
     execline fuse3 inotify-tools iproute2 jq kmod mdevd mount-flatpak s6
     s6-linux-init s6-rc socat spectrum-host-tools spectrum-router
     util-linuxMinimal virtiofsd xdg-desktop-portal-spectrum-host
diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run
index 0b4f6a00bc7aed0e721454d584d3bcd47fb18e2a..9b5dfad91944bd2c6c8994f387ab91394c68c1df 100755
--- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run
+++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run
@@ -1,10 +1,32 @@
 #!/bin/execlineb -P
 # SPDX-License-Identifier: EUPL-1.2+
 # SPDX-FileCopyrightText: 2025 Alyssa Ross <hi@alyssa.is>
+# SPDX-FileCopyrightText: 2025 Demi Marie Obenour <demiobenour@gmail.com>

 s6-ipcserver -1a 0700 -C 1 -b 1 env/crosvm.sock

-crosvm --no-syslog device gpu
+bwrap
+  --unshare-all
+  --unshare-user
+  --bind /run/user/0/wayland-1 /run/user/0/wayland-1
+  --ro-bind /usr /usr
+  --ro-bind /lib /lib
+  --tmpfs /tmp
+  --dev /dev
+  --tmpfs /dev/shm
+  --ro-bind /nix /nix
+  --disable-userns
+  --proc /proc
+  --remount-ro /proc
+  --ro-bind /dev/null /proc/timer_list
+  --tmpfs /proc/scsi
+  --remount-ro /proc/scsi
+  --ro-bind /dev/null /proc/kcore
+  --ro-bind /dev/null /proc/sysrq-trigger
+  --tmpfs /proc/acpi
+  --remount-ro /proc/acpi
+  --
+  crosvm --no-syslog device gpu
   --fd 0
   --wayland-sock /run/user/0/wayland-1
   --params "{\"context-types\":\"cross-domain\"}"
-- 
2.52.0
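The /proc hardening in this patch follows a repeating pattern: single files are hidden by bind-mounting /dev/null over them read-only, and directories are hidden behind an empty tmpfs that is then remounted read-only. As a rough sketch of that pattern only, the argument list could be generated as below. The helper name `proc_mask_args` is made up for illustration and is not part of the patch, and the real service scripts are execline rather than sh.

```sh
#!/bin/sh
# Hypothetical helper, not part of the patch: print the bwrap arguments that
# mask the /proc entries listed in the diff above, one argument per line.
proc_mask_args() {
    # Files: cover each one with a read-only bind mount of /dev/null.
    for f in timer_list kcore sysrq-trigger; do
        printf '%s\n' --ro-bind /dev/null "/proc/$f"
    done
    # Directories: cover each one with an empty tmpfs, then make it read-only.
    for d in scsi acpi; do
        printf '%s\n' --tmpfs "/proc/$d" --remount-ro "/proc/$d"
    done
}

proc_mask_args
```

bwrap applies its mount options in the order given, so each `--remount-ro` has to come after the `--tmpfs` it refers to, and all of these have to come after `--proc /proc`.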
Demi Marie Obenour <demiobenour@gmail.com> writes:
This means that a breach of crosvm is not guaranteed to be fatal.
The Wayland socket is still only accessible by root, so crosvm must run as root. The known container escape via /proc/self/exe is blocked by bwrap being on a read-only filesystem. Container escapes via /proc are blocked by remounting /proc read-only. Crosvm does not have CAP_SYS_ADMIN so it cannot change mounts.
The two remaining steps are:
- Run crosvm as an unprivileged user. - Enable seccomp to block most system calls.
The latter should be done from within crosvm itself.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> --- host/rootfs/default.nix | 4 ++-- .../template/data/service/vhost-user-gpu/run | 24 +++++++++++++++++++++- 2 files changed, 25 insertions(+), 3 deletions(-)
diff --git a/host/rootfs/default.nix b/host/rootfs/default.nix index ca2084f26d58be5e0e1695634e125032c50f82b2..4716bb7298515b2940cad09bb55e42c196ce7ebc 100644 --- a/host/rootfs/default.nix +++ b/host/rootfs/default.nix @@ -10,7 +10,7 @@ pkgsMusl.callPackage (
{ spectrum-host-tools, spectrum-router , lib, stdenvNoCC, nixos, runCommand, writeClosure, erofs-utils, s6-rc -, btrfs-progs, busybox, cloud-hypervisor, cosmic-files, crosvm +, btrfs-progs, bubblewrap, busybox, cloud-hypervisor, cosmic-files, crosvm , cryptsetup, dejavu_fonts, dbus, execline, foot, fuse3, iproute2 , inotify-tools, jq, kmod, mdevd, mesa, mount-flatpak, s6 , s6-linux-init, socat, systemd, util-linuxMinimal, virtiofsd @@ -25,7 +25,7 @@ let trivial;
packages = [ - btrfs-progs cloud-hypervisor cosmic-files crosvm cryptsetup dbus + btrfs-progs bubblewrap cloud-hypervisor cosmic-files crosvm cryptsetup dbus execline fuse3 inotify-tools iproute2 jq kmod mdevd mount-flatpak s6 s6-linux-init s6-rc socat spectrum-host-tools spectrum-router util-linuxMinimal virtiofsd xdg-desktop-portal-spectrum-host diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run index 0b4f6a00bc7aed0e721454d584d3bcd47fb18e2a..9b5dfad91944bd2c6c8994f387ab91394c68c1df 100755 --- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run +++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run @@ -1,10 +1,32 @@ #!/bin/execlineb -P # SPDX-License-Identifier: EUPL-1.2+ # SPDX-FileCopyrightText: 2025 Alyssa Ross <hi@alyssa.is> +# SPDX-FileCopyrightText: 2025 Demi Marie Obenour <demiobenour@gmail.com>
You add a copyright line here, but not in subsequent patches. Is that on purpose?
s6-ipcserver -1a 0700 -C 1 -b 1 env/crosvm.sock
-crosvm --no-syslog device gpu +bwrap + --unshare-all + --unshare-user
--unshare-all doesn't imply --unshare-user?
+ --bind /run/user/0/wayland-1 /run/user/0/wayland-1 + --ro-bind /usr /usr + --ro-bind /lib /lib + --tmpfs /tmp + --dev /dev + --tmpfs /dev/shm + --ro-bind /nix /nix + --disable-userns + --proc /proc + --remount-ro /proc + --ro-bind /dev/null /proc/timer_list + --tmpfs /proc/scsi + --remount-ro /proc/scsi + --ro-bind /dev/null /proc/kcore + --ro-bind /dev/null /proc/sysrq-trigger + --tmpfs /proc/acpi + --remount-ro /proc/acpi + -- + crosvm --no-syslog device gpu
No indent necessary here. This is a chain-loading program like many others we use in execline scripts. We don't indent for those or the rightwards drift would be ridiculous!
--fd 0 --wayland-sock /run/user/0/wayland-1 --params "{\"context-types\":\"cross-domain\"}"
On 12/3/25 07:43, Alyssa Ross wrote:
Demi Marie Obenour <demiobenour@gmail.com> writes:
This means that a breach of crosvm is not guaranteed to be fatal.
The Wayland socket is still only accessible by root, so crosvm must run as root. The known container escape via /proc/self/exe is blocked by bwrap being on a read-only filesystem. Container escapes via /proc are blocked by remounting /proc read-only. Crosvm does not have CAP_SYS_ADMIN so it cannot change mounts.
The two remaining steps are:
- Run crosvm as an unprivileged user. - Enable seccomp to block most system calls.
The latter should be done from within crosvm itself.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> --- host/rootfs/default.nix | 4 ++-- .../template/data/service/vhost-user-gpu/run | 24 +++++++++++++++++++++- 2 files changed, 25 insertions(+), 3 deletions(-)
diff --git a/host/rootfs/default.nix b/host/rootfs/default.nix index ca2084f26d58be5e0e1695634e125032c50f82b2..4716bb7298515b2940cad09bb55e42c196ce7ebc 100644 --- a/host/rootfs/default.nix +++ b/host/rootfs/default.nix @@ -10,7 +10,7 @@ pkgsMusl.callPackage (
{ spectrum-host-tools, spectrum-router , lib, stdenvNoCC, nixos, runCommand, writeClosure, erofs-utils, s6-rc -, btrfs-progs, busybox, cloud-hypervisor, cosmic-files, crosvm +, btrfs-progs, bubblewrap, busybox, cloud-hypervisor, cosmic-files, crosvm , cryptsetup, dejavu_fonts, dbus, execline, foot, fuse3, iproute2 , inotify-tools, jq, kmod, mdevd, mesa, mount-flatpak, s6 , s6-linux-init, socat, systemd, util-linuxMinimal, virtiofsd @@ -25,7 +25,7 @@ let trivial;
packages = [ - btrfs-progs cloud-hypervisor cosmic-files crosvm cryptsetup dbus + btrfs-progs bubblewrap cloud-hypervisor cosmic-files crosvm cryptsetup dbus execline fuse3 inotify-tools iproute2 jq kmod mdevd mount-flatpak s6 s6-linux-init s6-rc socat spectrum-host-tools spectrum-router util-linuxMinimal virtiofsd xdg-desktop-portal-spectrum-host diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run index 0b4f6a00bc7aed0e721454d584d3bcd47fb18e2a..9b5dfad91944bd2c6c8994f387ab91394c68c1df 100755 --- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run +++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run @@ -1,10 +1,32 @@ #!/bin/execlineb -P # SPDX-License-Identifier: EUPL-1.2+ # SPDX-FileCopyrightText: 2025 Alyssa Ross <hi@alyssa.is> +# SPDX-FileCopyrightText: 2025 Demi Marie Obenour <demiobenour@gmail.com>
You add a copyright line here, but not in subsequent patches. Is that on purpose?
No.
s6-ipcserver -1a 0700 -C 1 -b 1 env/crosvm.sock
-crosvm --no-syslog device gpu +bwrap + --unshare-all + --unshare-user
--unshare-all doesn't imply --unshare-user?
It implies --unshare-user-try, but I want it to fail if it can't create a user namespace.
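For reference, bwrap(1) documents `--unshare-all` as currently equivalent to a fixed list of flags in which the user namespace is only attempted. A minimal sketch of that difference follows; `unshare_all_expansion` is a hypothetical helper used only to spell the list out, not something bwrap or this patch provides.

```sh
#!/bin/sh
# Hypothetical helper: print what bwrap(1) says --unshare-all currently
# expands to, one flag per line.
unshare_all_expansion() {
    printf '%s\n' \
        --unshare-user-try \
        --unshare-ipc \
        --unshare-pid \
        --unshare-net \
        --unshare-uts \
        --unshare-cgroup-try
}

# The expansion only *tries* to create a user namespace ...
unshare_all_expansion | grep -x -e --unshare-user-try

# ... so the strict flag must be passed explicitly if failure should be fatal.
unshare_all_expansion | grep -qx -e --unshare-user \
    || echo "--unshare-user is not implied; pass it explicitly"
```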
+ --bind /run/user/0/wayland-1 /run/user/0/wayland-1 + --ro-bind /usr /usr + --ro-bind /lib /lib + --tmpfs /tmp + --dev /dev + --tmpfs /dev/shm + --ro-bind /nix /nix + --disable-userns + --proc /proc + --remount-ro /proc + --ro-bind /dev/null /proc/timer_list + --tmpfs /proc/scsi + --remount-ro /proc/scsi + --ro-bind /dev/null /proc/kcore + --ro-bind /dev/null /proc/sysrq-trigger + --tmpfs /proc/acpi + --remount-ro /proc/acpi + -- + crosvm --no-syslog device gpu
No indent necessary here. This is a chain-loading program like many others we use in execline scripts. We don't indent for those or the rightwards drift would be ridiculous!
Should I indent the parameters above it?
--fd 0 --wayland-sock /run/user/0/wayland-1 --params "{\"context-types\":\"cross-domain\"}"
-- Sincerely, Demi Marie Obenour (she/her/hers)
Demi Marie Obenour <demiobenour@gmail.com> writes:
On 12/3/25 07:43, Alyssa Ross wrote:
Demi Marie Obenour <demiobenour@gmail.com> writes:
This means that a breach of crosvm is not guaranteed to be fatal.
The Wayland socket is still only accessible by root, so crosvm must run as root. The known container escape via /proc/self/exe is blocked by bwrap being on a read-only filesystem. Container escapes via /proc are blocked by remounting /proc read-only. Crosvm does not have CAP_SYS_ADMIN so it cannot change mounts.
The two remaining steps are:
- Run crosvm as an unprivileged user. - Enable seccomp to block most system calls.
The latter should be done from within crosvm itself.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> --- host/rootfs/default.nix | 4 ++-- .../template/data/service/vhost-user-gpu/run | 24 +++++++++++++++++++++- 2 files changed, 25 insertions(+), 3 deletions(-)
diff --git a/host/rootfs/default.nix b/host/rootfs/default.nix index ca2084f26d58be5e0e1695634e125032c50f82b2..4716bb7298515b2940cad09bb55e42c196ce7ebc 100644 --- a/host/rootfs/default.nix +++ b/host/rootfs/default.nix @@ -10,7 +10,7 @@ pkgsMusl.callPackage (
{ spectrum-host-tools, spectrum-router , lib, stdenvNoCC, nixos, runCommand, writeClosure, erofs-utils, s6-rc -, btrfs-progs, busybox, cloud-hypervisor, cosmic-files, crosvm +, btrfs-progs, bubblewrap, busybox, cloud-hypervisor, cosmic-files, crosvm , cryptsetup, dejavu_fonts, dbus, execline, foot, fuse3, iproute2 , inotify-tools, jq, kmod, mdevd, mesa, mount-flatpak, s6 , s6-linux-init, socat, systemd, util-linuxMinimal, virtiofsd @@ -25,7 +25,7 @@ let trivial;
packages = [ - btrfs-progs cloud-hypervisor cosmic-files crosvm cryptsetup dbus + btrfs-progs bubblewrap cloud-hypervisor cosmic-files crosvm cryptsetup dbus execline fuse3 inotify-tools iproute2 jq kmod mdevd mount-flatpak s6 s6-linux-init s6-rc socat spectrum-host-tools spectrum-router util-linuxMinimal virtiofsd xdg-desktop-portal-spectrum-host diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run index 0b4f6a00bc7aed0e721454d584d3bcd47fb18e2a..9b5dfad91944bd2c6c8994f387ab91394c68c1df 100755 --- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run +++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run @@ -1,10 +1,32 @@ #!/bin/execlineb -P # SPDX-License-Identifier: EUPL-1.2+ # SPDX-FileCopyrightText: 2025 Alyssa Ross <hi@alyssa.is> +# SPDX-FileCopyrightText: 2025 Demi Marie Obenour <demiobenour@gmail.com>
You add a copyright line here, but not in subsequent patches. Is that on purpose?
No.
s6-ipcserver -1a 0700 -C 1 -b 1 env/crosvm.sock
-crosvm --no-syslog device gpu +bwrap + --unshare-all + --unshare-user
--unshare-all doesn't imply --unshare-user?
It implies --unshare-user-try, but I want it to fail if it can't create a user namespace.
Aha! Makes sense.
+ --bind /run/user/0/wayland-1 /run/user/0/wayland-1 + --ro-bind /usr /usr + --ro-bind /lib /lib + --tmpfs /tmp + --dev /dev + --tmpfs /dev/shm + --ro-bind /nix /nix + --disable-userns + --proc /proc + --remount-ro /proc + --ro-bind /dev/null /proc/timer_list + --tmpfs /proc/scsi + --remount-ro /proc/scsi + --ro-bind /dev/null /proc/kcore + --ro-bind /dev/null /proc/sysrq-trigger + --tmpfs /proc/acpi + --remount-ro /proc/acpi + -- + crosvm --no-syslog device gpu
No indent necessary here. This is a chain-loading program like many others we use in execline scripts. We don't indent for those or the rightwards drift would be ridiculous!
Should I indent the parameters above it?
Yeah, I think that helps make it clear which exec they're scoped to.
Demi Marie Obenour <demiobenour@gmail.com> writes:
On 12/3/25 07:43, Alyssa Ross wrote:
Demi Marie Obenour <demiobenour@gmail.com> writes:
This means that a breach of crosvm is not guaranteed to be fatal.
The Wayland socket is still only accessible by root, so crosvm must run as root. The known container escape via /proc/self/exe is blocked by bwrap being on a read-only filesystem. Container escapes via /proc are blocked by remounting /proc read-only. Crosvm does not have CAP_SYS_ADMIN so it cannot change mounts.
The two remaining steps are:
- Run crosvm as an unprivileged user. - Enable seccomp to block most system calls.
The latter should be done from within crosvm itself.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> --- host/rootfs/default.nix | 4 ++-- .../template/data/service/vhost-user-gpu/run | 24 +++++++++++++++++++++- 2 files changed, 25 insertions(+), 3 deletions(-)
diff --git a/host/rootfs/default.nix b/host/rootfs/default.nix index ca2084f26d58be5e0e1695634e125032c50f82b2..4716bb7298515b2940cad09bb55e42c196ce7ebc 100644 --- a/host/rootfs/default.nix +++ b/host/rootfs/default.nix @@ -10,7 +10,7 @@ pkgsMusl.callPackage (
{ spectrum-host-tools, spectrum-router , lib, stdenvNoCC, nixos, runCommand, writeClosure, erofs-utils, s6-rc -, btrfs-progs, busybox, cloud-hypervisor, cosmic-files, crosvm +, btrfs-progs, bubblewrap, busybox, cloud-hypervisor, cosmic-files, crosvm , cryptsetup, dejavu_fonts, dbus, execline, foot, fuse3, iproute2 , inotify-tools, jq, kmod, mdevd, mesa, mount-flatpak, s6 , s6-linux-init, socat, systemd, util-linuxMinimal, virtiofsd @@ -25,7 +25,7 @@ let trivial;
packages = [ - btrfs-progs cloud-hypervisor cosmic-files crosvm cryptsetup dbus + btrfs-progs bubblewrap cloud-hypervisor cosmic-files crosvm cryptsetup dbus execline fuse3 inotify-tools iproute2 jq kmod mdevd mount-flatpak s6 s6-linux-init s6-rc socat spectrum-host-tools spectrum-router util-linuxMinimal virtiofsd xdg-desktop-portal-spectrum-host diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run index 0b4f6a00bc7aed0e721454d584d3bcd47fb18e2a..9b5dfad91944bd2c6c8994f387ab91394c68c1df 100755 --- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run +++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run @@ -1,10 +1,32 @@ #!/bin/execlineb -P # SPDX-License-Identifier: EUPL-1.2+ # SPDX-FileCopyrightText: 2025 Alyssa Ross <hi@alyssa.is> +# SPDX-FileCopyrightText: 2025 Demi Marie Obenour <demiobenour@gmail.com>
You add a copyright line here, but not in subsequent patches. Is that on purpose?
No.
So which should it be, for all these changes?
This needs very little access to the system.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
 .../template/data/service/spectrum-router/run      | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run
index 7b3e3db3b109ba1c8d195c7c47d50d0cfbc30bd5..ef68cd638c092b53cc714a5d65bbfa3b49585346 100755
--- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run
+++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run
@@ -4,6 +4,19 @@

 importas -i VM VM

-export RUST_LOG spectrum-router=debug,info
-spectrum-router --app-listen-path ${VM}/router-app.sock --driver-listen-path ${VM}/router-driver.sock
-
+bwrap
+  --unshare-all
+  --unshare-user
+  --dev-bind / /
+  --setenv RUST_LOG spectrum-router=debug,info
+  --tmpfs /tmp
+  --dev /dev
+  --tmpfs /dev/shm
+  --ro-bind /nix /nix
+  --ro-bind /etc /etc
+  --tmpfs /run
+  --ro-bind /usr /usr
+  --ro-bind /lib /lib
+  --bind $VM $VM
+  --
+  spectrum-router --app-listen-path ${VM}/router-app.sock --driver-listen-path ${VM}/router-driver.sock
-- 
2.52.0
virtiofsd needs no write access to anything outside of its shared
directory, and no network or abstract socket access.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
 .../template/data/service/vhost-user-fs/run        | 28 +++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-fs/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-fs/run
index a9bbd8ea43a8c0a1a664f88b8593f794d07333cc..1a77385fd26726723b00b3e4feec26d08c992579 100755
--- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-fs/run
+++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-fs/run
@@ -8,8 +8,30 @@ if { fdmove 1 3 echo }
 fdmove -c 3 0
 redirfd -r 0 /dev/null
-export TMPDIR /run
-
 importas -i VM VM
 nsenter --mount=${VM}/mount
-virtiofsd --fd 3 --shared-dir ${VM}/fs
+
+bwrap
+  --unshare-all
+  --unshare-user
+  --setenv TMPDIR /tmp
+  --dev /dev
+  --tmpfs /tmp
+  --tmpfs /dev/shm
+  --tmpfs /run
+  --ro-bind ${VM}/fs ${VM}/fs
+  --ro-bind /nix /nix
+  --ro-bind /usr /usr
+  --ro-bind /lib /lib
+  --ro-bind /etc /etc
+  --proc /proc
+  --remount-ro /proc
+  --ro-bind /dev/null /proc/timer_list
+  --tmpfs /proc/scsi
+  --remount-ro /proc/scsi
+  --ro-bind /dev/null /proc/kcore
+  --ro-bind /dev/null /proc/sysrq-trigger
+  --tmpfs /proc/acpi
+  --remount-ro /proc/acpi
+  --
+  /usr/bin/virtiofsd --fd 3 --shared-dir ${VM}/fs
-- 
2.52.0
Cloud Hypervisor only needs access to a small number of resources.
Unfortunately, it needs access to /dev/vfio right now. This should be
fixed by using file descriptor passing instead.

Furthermore, Cloud Hypervisor needs to be able to lock memory. Running
in a user namespace prevents it from having CAP_IPC_LOCK. Therefore, it
is necessary to increase RLIMIT_MEMLOCK before running Cloud
Hypervisor. The s6-softlimit program can only increase the soft limit,
not the hard one. Therefore, use Busybox sh to increase the hard limit.

Given that sh must be used anyway, take the opportunity to use shell
conditionals and redirection instead of a few external commands.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
 .../image/etc/udev/rules.d/99-spectrum.rules       |  3 ++
 host/rootfs/image/usr/bin/run-vmm                  | 33 +++++++++++++++++++---
 2 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules b/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules
index 337bbe47dbbc6f3828722d8244f2689a39f3090f..de0f682aa40f8481dc3c25a90c695e2326536316 100644
--- a/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules
+++ b/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules
@@ -3,3 +3,6 @@

 # systemd-udevd unsets PATH, so fix that.
 ACTION!="remove", ENV{PCI_CLASS}=="2????", RUN+="/usr/bin/env PATH=/usr/bin /usr/libexec/net-add"
+
+# make /dev/kvm world-accessible
+KERNEL=="kvm", MODE="0666"
diff --git a/host/rootfs/image/usr/bin/run-vmm b/host/rootfs/image/usr/bin/run-vmm
index ba8b59c2677408acdd01c2eda3cf2dd60992d881..5fb0678b5ca7b6bcf49bf362a9355113892e4030 100755
--- a/host/rootfs/image/usr/bin/run-vmm
+++ b/host/rootfs/image/usr/bin/run-vmm
@@ -49,8 +49,33 @@ background -d {
   ch-remote --api-socket /run/vm/by-id/${router_id}/vmm add-net id=router,vhost_user=on,socket=/run/vm/by-id/${router_id}/router-driver.sock,mac=02:01:00:00:00:01
 }
 unexport !
-fdmove -c 3 0
-redirfd -r 0 /dev/null
+# I am not aware of an execlineb command to increase the hard limit, so do it in sh.
+# Given that sh is in use, do a few things with it that would need external commands otherwise.
+sh -c "exec 3>&0 >/dev/null && ulimit -l unlimited && udevadm wait /dev/kvm && exec \"$""@\"" sh

-if { udevadm wait /dev/kvm }
-cloud-hypervisor --api-socket fd=3
+bwrap
+  --unshare-all
+  --unshare-user
+  --dev /dev
+  --dev-bind /dev/kvm /dev/kvm
+  --dev-bind /dev/vfio /dev/vfio
+  --tmpfs /dev/shm
+  --tmpfs /tmp
+  --tmpfs /var/tmp
+  --ro-bind /etc /etc
+  --ro-bind /lib /lib
+  --ro-bind /nix /nix
+  --ro-bind /usr /usr
+  --bind /sys /sys
+  --bind /run /run
+  --proc /proc
+  --remount-ro /proc
+  --ro-bind /dev/null /proc/timer_list
+  --tmpfs /proc/scsi
+  --remount-ro /proc/scsi
+  --ro-bind /dev/null /proc/kcore
+  --ro-bind /dev/null /proc/sysrq-trigger
+  --tmpfs /proc/acpi
+  --remount-ro /proc/acpi
+  --
+  cloud-hypervisor --api-socket fd=3
-- 
2.52.0
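The commit message above turns on the difference between the soft and hard memory-lock limits. As a small sketch of that difference only (this is not code from the patch, and `ulimit -l` is a shell extension found in Busybox ash, dash and bash rather than strict POSIX), the two limits can be inspected and adjusted like this:

```sh
#!/bin/sh
# Sketch: soft vs. hard memory-lock limit as seen from a shell.

echo "soft: $(ulimit -S -l)  hard: $(ulimit -H -l)"

# Any process may raise its soft limit up to the current hard limit.
ulimit -S -l "$(ulimit -H -l)"

# Raising the hard limit needs CAP_SYS_RESOURCE, so try it in a subshell and
# only report whether it was permitted; the parent shell is left unchanged.
if (ulimit -H -l unlimited) 2>/dev/null; then
    echo "hard limit can be raised here"
else
    echo "hard limit cannot be raised without privilege"
fi
```

In the patch, the `sh -c` wrapper runs before bwrap, so the limit is raised while the process is still root outside the sandbox, and Cloud Hypervisor inherits the raised limit.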
Demi Marie Obenour <demiobenour@gmail.com> writes:
This restricts the access of these programs to the system. Seccomp is not enabled, though, and the processes still run as root. Full sandboxing needs additional work. In particular, Cloud Hypervisor should receive access to VFIO devices via file descriptor passing.
Sandboxing Cloud Hypervisor requires the use of sh, as there is no s6 or execline program to increase hard resource limits.
Yes there is! It's poorly named though — presumably the hard limit functionality was added later. We now do this in application VMs for Pipewire: https://spectrum-os.org/git/spectrum/tree/img/app/image/usr/bin/init?id=decd...
D-Bus and the portal are not sandboxed. They have full access to all user files by design, so a breach of either is catastrophic no matter what. Furthermore, sandboxing them even slightly proved very difficult.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> --- Changes in v2: - Sandbox Cloud Hypervisor, virtiofsd, and the router - Link to v1: https://spectrum-os.org/lists/archives/spectrum-devel/20251129-sandbox-v1-1-...
--- Demi Marie Obenour (4): host/rootfs: Sandbox crosvm host/rootfs: Sandbox router host/rootfs: Sandbox virtiofsd host/rootfs: Sandbox Cloud Hypervisor
host/rootfs/default.nix | 4 +-- .../template/data/service/spectrum-router/run | 19 +++++++++++-- .../template/data/service/vhost-user-fs/run | 28 ++++++++++++++++-- .../template/data/service/vhost-user-gpu/run | 24 +++++++++++++++- .../image/etc/udev/rules.d/99-spectrum.rules | 3 ++ host/rootfs/image/usr/bin/run-vmm | 33 +++++++++++++++++++--- 6 files changed, 98 insertions(+), 13 deletions(-) --- base-commit: 44f32b7a4b3cfbb4046447318e6753dd0eb71add change-id: 20251129-sandbox-5a42a6a41b59
-- Sincerely, Demi Marie Obenour (she/her/hers)
This restricts the access of these programs to the system. Seccomp is
not enabled, though, and the processes still run as root. Full
sandboxing needs additional work. In particular, Cloud Hypervisor
should receive access to VFIO devices via file descriptor passing.

D-Bus and the portal are not sandboxed. They have full access to all
user files by design, so a breach of either is catastrophic no matter
what. Furthermore, sandboxing them even slightly proved very difficult.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
Changes in v3:
- Protect bus daemon and portal from other services.
- Use s6-softlimit instead of sh to set hard RLIMIT_MEMLOCK.
- Link to v2: https://spectrum-os.org/lists/archives/spectrum-devel/20251201-sandbox-v2-0-...

Changes in v2:
- Sandbox Cloud Hypervisor, virtiofsd, and the router
- Link to v1: https://spectrum-os.org/lists/archives/spectrum-devel/20251129-sandbox-v1-1-...

---
Demi Marie Obenour (5):
      host/rootfs: Sandbox crosvm
      host/rootfs: Sandbox router
      host/rootfs: Unshare a few more namespaces in virtiofsd
      host/rootfs: Sandbox Cloud Hypervisor
      host/rootfs: Try to protect the portal and dbus daemon

 host/rootfs/default.nix                            |  4 +--
 .../vm-services/template/data/service/dbus/run     |  1 +
 .../template/data/service/spectrum-router/run      | 19 +++++++++++--
 .../template/data/service/vhost-user-fs/run        |  2 +-
 .../template/data/service/vhost-user-gpu/run       | 29 +++++++++++++++++++
 .../image/etc/udev/rules.d/99-spectrum.rules       |  3 ++
 host/rootfs/image/usr/bin/run-vmm                  | 33 +++++++++++++++++++++-
 7 files changed, 84 insertions(+), 7 deletions(-)
---
base-commit: 36d857a937900f85b460e9b3db89cf79737bd72c
change-id: 20251129-sandbox-5a42a6a41b59

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
This means that a breach of crosvm is not guaranteed to be fatal. The Wayland socket is still only accessible by root, so crosvm must run as root. The known container escape via /proc/self/exe is blocked by bwrap being on a read-only filesystem. Container escapes via /proc are blocked by remounting /proc read-only. Crosvm does not have CAP_SYS_ADMIN so it cannot change mounts. The two remaining steps are: - Run crosvm as an unprivileged user. - Enable seccomp to block most system calls. The latter should be done from within crosvm itself. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> --- host/rootfs/default.nix | 4 +-- .../template/data/service/vhost-user-gpu/run | 29 ++++++++++++++++++++++ 2 files changed, 31 insertions(+), 2 deletions(-) diff --git a/host/rootfs/default.nix b/host/rootfs/default.nix index ca2084f26d58be5e0e1695634e125032c50f82b2..4716bb7298515b2940cad09bb55e42c196ce7ebc 100644 --- a/host/rootfs/default.nix +++ b/host/rootfs/default.nix @@ -10,7 +10,7 @@ pkgsMusl.callPackage ( { spectrum-host-tools, spectrum-router , lib, stdenvNoCC, nixos, runCommand, writeClosure, erofs-utils, s6-rc -, btrfs-progs, busybox, cloud-hypervisor, cosmic-files, crosvm +, btrfs-progs, bubblewrap, busybox, cloud-hypervisor, cosmic-files, crosvm , cryptsetup, dejavu_fonts, dbus, execline, foot, fuse3, iproute2 , inotify-tools, jq, kmod, mdevd, mesa, mount-flatpak, s6 , s6-linux-init, socat, systemd, util-linuxMinimal, virtiofsd @@ -25,7 +25,7 @@ let trivial; packages = [ - btrfs-progs cloud-hypervisor cosmic-files crosvm cryptsetup dbus + btrfs-progs bubblewrap cloud-hypervisor cosmic-files crosvm cryptsetup dbus execline fuse3 inotify-tools iproute2 jq kmod mdevd mount-flatpak s6 s6-linux-init s6-rc socat spectrum-host-tools spectrum-router util-linuxMinimal virtiofsd xdg-desktop-portal-spectrum-host diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run 
b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run
index 0b4f6a00bc7aed0e721454d584d3bcd47fb18e2a..19d5a61388e6f49c7f722814d6de47227b02da01 100755
--- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run
+++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run
@@ -1,9 +1,38 @@
 #!/bin/execlineb -P
 # SPDX-License-Identifier: EUPL-1.2+
 # SPDX-FileCopyrightText: 2025 Alyssa Ross <hi@alyssa.is>
+# SPDX-FileCopyrightText: 2025 Demi Marie Obenour <demiobenour@gmail.com>
 s6-ipcserver -1a 0700 -C 1 -b 1 env/crosvm.sock
+bwrap
+ --unshare-all
+ # --unshare-all only implies --unshare-user-try.
+ # Make this more than a "try".
+ --unshare-user
+ --bind /run/user/0/wayland-1 /run/user/0/wayland-1
+ --ro-bind /usr /usr
+ --ro-bind /lib /lib
+ --tmpfs /tmp
+ --dev /dev
+ --tmpfs /dev/shm
+ --ro-bind /nix /nix
+ --disable-userns
+ --proc /proc
+ --ro-bind /proc/sys /proc/sys
+ --tmpfs /proc/scsi
+ --remount-ro /proc/scsi
+ --tmpfs /proc/acpi
+ --remount-ro /proc/acpi
+ --tmpfs /proc/fs
+ --remount-ro /proc/fs
+ --tmpfs /proc/irq
+ --remount-ro /proc/irq
+ --ro-bind /dev/null /proc/timer_list
+ --ro-bind /dev/null /proc/kcore
+ --ro-bind /dev/null /proc/kallsyms
+ --ro-bind /dev/null /proc/sysrq-trigger
+ --
 crosvm --no-syslog device gpu --fd 0 --wayland-sock /run/user/0/wayland-1
--
2.52.0
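The /proc hardening flags in the bwrap invocation above follow a regular pattern: each sensitive directory is covered by an empty tmpfs that is then remounted read-only, and each sensitive file is covered by a read-only bind of /dev/null. The following Python sketch only regenerates that argument list to make the pattern explicit. It is not part of the patch, and the names `MASKED_DIRS`, `MASKED_FILES`, and `proc_hardening_args` are made up for illustration.

```python
# Illustrative only: rebuild the /proc-related bwrap arguments from the patch.
# The helper and constant names are hypothetical, not taken from Spectrum.

# Directories hidden behind an empty, read-only tmpfs.
MASKED_DIRS = ["/proc/scsi", "/proc/acpi", "/proc/fs", "/proc/irq"]

# Files hidden behind a read-only bind mount of /dev/null.
MASKED_FILES = [
    "/proc/timer_list",
    "/proc/kcore",
    "/proc/kallsyms",
    "/proc/sysrq-trigger",
]


def proc_hardening_args():
    """Return the bwrap arguments that mount and then restrict /proc."""
    # Fresh procfs first, then a read-only view of /proc/sys on top of it.
    args = ["--proc", "/proc", "--ro-bind", "/proc/sys", "/proc/sys"]
    for directory in MASKED_DIRS:
        args += ["--tmpfs", directory, "--remount-ro", directory]
    for path in MASKED_FILES:
        args += ["--ro-bind", "/dev/null", path]
    return args


if __name__ == "__main__":
    print(" ".join(proc_hardening_args()))
```

The order matters: the masks are only meaningful once `--proc /proc` has mounted the fresh procfs they sit on top of.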
This needs very little access to the system.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
 .../template/data/service/spectrum-router/run | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run
index 7b3e3db3b109ba1c8d195c7c47d50d0cfbc30bd5..ef68cd638c092b53cc714a5d65bbfa3b49585346 100755
--- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run
+++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run
@@ -4,6 +4,19 @@
 importas -i VM VM
-export RUST_LOG spectrum-router=debug,info
-spectrum-router --app-listen-path ${VM}/router-app.sock --driver-listen-path ${VM}/router-driver.sock
-
+bwrap
+ --unshare-all
+ --unshare-user
+ --dev-bind / /
+ --setenv RUST_LOG spectrum-router=debug,info
+ --tmpfs /tmp
+ --dev /dev
+ --tmpfs /dev/shm
+ --ro-bind /nix /nix
+ --ro-bind /etc /etc
+ --tmpfs /run
+ --ro-bind /usr /usr
+ --ro-bind /lib /lib
+ --bind $VM $VM
+ --
+ spectrum-router --app-listen-path ${VM}/router-app.sock --driver-listen-path ${VM}/router-driver.sock
--
2.52.0
On 12/3/25 16:54, Demi Marie Obenour wrote:
This needs very little access to the system.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> --- .../template/data/service/spectrum-router/run | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run index 7b3e3db3b109ba1c8d195c7c47d50d0cfbc30bd5..ef68cd638c092b53cc714a5d65bbfa3b49585346 100755 --- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run +++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run @@ -4,6 +4,19 @@
importas -i VM VM
-export RUST_LOG spectrum-router=debug,info
-spectrum-router --app-listen-path ${VM}/router-app.sock --driver-listen-path ${VM}/router-driver.sock
-
+bwrap
+ --unshare-all
+ --unshare-user
+ --dev-bind / /
+ --setenv RUST_LOG spectrum-router=debug,info
+ --tmpfs /tmp
+ --dev /dev
+ --tmpfs /dev/shm
+ --ro-bind /nix /nix
+ --ro-bind /etc /etc
+ --tmpfs /run

This won't work. The router sets up unix sockets in /run which are accessed by the vmm.

+ --ro-bind /usr /usr
+ --ro-bind /lib /lib
+ --bind $VM $VM
+ --
+ spectrum-router --app-listen-path ${VM}/router-app.sock --driver-listen-path ${VM}/router-driver.sock
Please make sure the integration tests still pass after this.
Yureka <yuka@yuka.dev> writes:
On 12/3/25 16:54, Demi Marie Obenour wrote:
This needs very little access to the system.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> --- .../template/data/service/spectrum-router/run | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run index 7b3e3db3b109ba1c8d195c7c47d50d0cfbc30bd5..ef68cd638c092b53cc714a5d65bbfa3b49585346 100755 --- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run +++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run @@ -4,6 +4,19 @@
importas -i VM VM
-export RUST_LOG spectrum-router=debug,info
-spectrum-router --app-listen-path ${VM}/router-app.sock --driver-listen-path ${VM}/router-driver.sock
-
+bwrap
+ --unshare-all
+ --unshare-user
+ --dev-bind / /
+ --setenv RUST_LOG spectrum-router=debug,info
+ --tmpfs /tmp
+ --dev /dev
+ --tmpfs /dev/shm
+ --ro-bind /nix /nix
+ --ro-bind /etc /etc
+ --tmpfs /run

This won't work. The router sets up unix sockets in /run which are accessed by the vmm.

+ --ro-bind /usr /usr
+ --ro-bind /lib /lib
+ --bind $VM $VM
Doesn't this line cover the sockets, or are there more outside of this directory?
+ -- + spectrum-router --app-listen-path ${VM}/router-app.sock --driver-listen-path ${VM}/router-driver.sock
On 12/3/25 17:11, Alyssa Ross wrote:
Yureka <yuka@yuka.dev> writes:
On 12/3/25 16:54, Demi Marie Obenour wrote:
This needs very little access to the system.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> --- .../template/data/service/spectrum-router/run | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run index 7b3e3db3b109ba1c8d195c7c47d50d0cfbc30bd5..ef68cd638c092b53cc714a5d65bbfa3b49585346 100755 --- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run +++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run @@ -4,6 +4,19 @@
importas -i VM VM
-export RUST_LOG spectrum-router=debug,info
-spectrum-router --app-listen-path ${VM}/router-app.sock --driver-listen-path ${VM}/router-driver.sock
-
+bwrap
+ --unshare-all
+ --unshare-user
+ --dev-bind / /
+ --setenv RUST_LOG spectrum-router=debug,info
+ --tmpfs /tmp
+ --dev /dev
+ --tmpfs /dev/shm
+ --ro-bind /nix /nix
+ --ro-bind /etc /etc
+ --tmpfs /run

This won't work. The router sets up unix sockets in /run which are accessed by the vmm.

+ --ro-bind /usr /usr
+ --ro-bind /lib /lib
+ --bind $VM $VM

Doesn't this line cover the sockets, or are there more outside of this directory?
True.
+ -- + spectrum-router --app-listen-path ${VM}/router-app.sock --driver-listen-path ${VM}/router-driver.sock
Ok from me if it passes the integration tests: Reviewed-by: Yureka Lilian <yureka@cyberchaos.dev>
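The exchange above turns on mount ordering: bwrap applies its filesystem operations in the order they appear on the command line, so the later `--bind $VM $VM` ends up on top of the earlier `--tmpfs /run`. The Python sketch below is a simplified model of that rule, not how bwrap is implemented. It assumes that `$VM` lives somewhere under /run, which the thread implies but does not state, and the path `/run/vm/example` and the names `router_mounts` and `covering_mount` are hypothetical.

```python
import posixpath


def router_mounts(vm_dir):
    """Mount operations from the router patch, in command-line order."""
    return [
        ("dev-bind", "/"),
        ("tmpfs", "/tmp"),
        ("dev", "/dev"),
        ("tmpfs", "/dev/shm"),
        ("ro-bind", "/nix"),
        ("ro-bind", "/etc"),
        ("tmpfs", "/run"),
        ("ro-bind", "/usr"),
        ("ro-bind", "/lib"),
        ("bind", vm_dir),
    ]


def covering_mount(mounts, path):
    """Return the last (kind, dest) whose destination contains path.

    Simplified model: a later mount shadows earlier ones at or below
    its destination, so the last matching entry is the visible one.
    """
    path = posixpath.normpath(path)
    winner = None
    for kind, dest in mounts:
        dest = posixpath.normpath(dest)
        if dest == "/" or path == dest or path.startswith(dest + "/"):
            winner = (kind, dest)
    return winner


if __name__ == "__main__":
    vm = "/run/vm/example"  # hypothetical value of $VM
    mounts = router_mounts(vm)
    print(covering_mount(mounts, vm + "/router-app.sock"))
    print(covering_mount(mounts, "/run/other.sock"))
```

Under this model the router's two sockets, which are created inside `$VM`, land on the host-visible bind mount, while anything else written under /run stays on the private tmpfs.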
It doesn't need to share IPC, UTS, or cgroup namespaces.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
 .../service/vm-services/template/data/service/vhost-user-fs/run | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-fs/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-fs/run
index bfe66f4607ab07884488df35691ba1c202b26e8e..6bd69ad944a464294ad9a3268c8a63482c7e8040 100755
--- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-fs/run
+++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-fs/run
@@ -13,6 +13,6 @@ export TMPDIR /run
 importas -i VM VM
 nsenter --mount=${VM}/mount
-unshare -U --map-user 1000 --map-group 1000
+unshare -U --map-user 1000 --map-group 1000 --uts --ipc --cgroup
 virtiofsd --fd 3 --shared-dir ${VM}/fs
--
2.52.0
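One way to check that a change like this took effect is to compare the namespace links under /proc for the service and for its parent: each link target names the namespace by inode, so a differing target means the namespace was unshared. The Python sketch below illustrates that comparison. It is not part of the patch, and the function names `read_namespaces` and `unshared` are hypothetical.

```python
import os

# Namespaces touched by the unshare invocation in the patch
# (-U for user, plus --uts, --ipc and --cgroup).
NAMESPACES = ("user", "uts", "ipc", "cgroup")


def read_namespaces(pid="self", names=NAMESPACES):
    """Map each namespace name to its link target, e.g. 'uts:[4026531838]'."""
    return {name: os.readlink(f"/proc/{pid}/ns/{name}") for name in names}


def unshared(parent, child):
    """Return the namespace names whose identifiers differ between snapshots."""
    return sorted(
        name for name in parent if name in child and parent[name] != child[name]
    )


if __name__ == "__main__":
    # On a Linux host, a process compared with itself shares every namespace.
    snapshot = read_namespaces()
    print(unshared(snapshot, snapshot))
```

In practice `read_namespaces` would be called once with the supervisor's PID and once with the virtiofsd PID. Reading another process's links may require matching privileges.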
It only needs access to a small number of resources. Unfortunately, it needs access to /dev/vfio right now. This should be fixed by using file descriptor passing instead.

Furthermore, Cloud Hypervisor needs to be able to lock memory. Running in a user namespace prevents it from having CAP_IPC_LOCK. Therefore, it is necessary to increase RLIMIT_MEMLOCK before running Cloud Hypervisor.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
 .../image/etc/udev/rules.d/99-spectrum.rules | 3 ++
 host/rootfs/image/usr/bin/run-vmm | 33 +++++++++++++++++++++-
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules b/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules
index 337bbe47dbbc6f3828722d8244f2689a39f3090f..de0f682aa40f8481dc3c25a90c695e2326536316 100644
--- a/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules
+++ b/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules
@@ -3,3 +3,6 @@
 # systemd-udevd unsets PATH, so fix that.
 ACTION!="remove", ENV{PCI_CLASS}=="2????", RUN+="/usr/bin/env PATH=/usr/bin /usr/libexec/net-add"
+
+# make /dev/kvm world-accessible
+KERNEL=="kvm", MODE="0666"
diff --git a/host/rootfs/image/usr/bin/run-vmm b/host/rootfs/image/usr/bin/run-vmm
index ba8b59c2677408acdd01c2eda3cf2dd60992d881..24c3d607bfcf6fea6196b61d2941141486d33fd6 100755
--- a/host/rootfs/image/usr/bin/run-vmm
+++ b/host/rootfs/image/usr/bin/run-vmm
@@ -52,5 +52,36 @@ unexport !
 fdmove -c 3 0
 redirfd -r 0 /dev/null
+s6-softlimit -H -l 18446744073709551615
 if { udevadm wait /dev/kvm }
-cloud-hypervisor --api-socket fd=3
+bwrap
+ --unshare-all
+ --unshare-user
+ --dev /dev
+ --dev-bind /dev/kvm /dev/kvm
+ --dev-bind /dev/vfio /dev/vfio
+ --tmpfs /dev/shm
+ --tmpfs /tmp
+ --tmpfs /var/tmp
+ --ro-bind /etc /etc
+ --ro-bind /lib /lib
+ --ro-bind /nix /nix
+ --ro-bind /usr /usr
+ --ro-bind /sys /sys
+ --bind /run /run
+ --proc /proc
+ --ro-bind /proc/sys /proc/sys
+ --tmpfs /proc/scsi
+ --remount-ro /proc/scsi
+ --tmpfs /proc/acpi
+ --remount-ro /proc/acpi
+ --tmpfs /proc/fs
+ --remount-ro /proc/fs
+ --tmpfs /proc/irq
+ --remount-ro /proc/irq
+ --ro-bind /dev/null /proc/timer_list
+ --ro-bind /dev/null /proc/kcore
+ --ro-bind /dev/null /proc/kallsyms
+ --ro-bind /dev/null /proc/sysrq-trigger
+ --
+ cloud-hypervisor --api-socket fd=3
--
2.52.0
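The udev rule in this patch sets the mode of /dev/kvm to 0666, that is, readable and writable by every user. The commit message does not spell out the reason; presumably it keeps the device usable from inside the sandbox. The Python sketch below only shows how such a mode can be checked and is not part of the patch. The function name `world_read_write` is hypothetical.

```python
import os
import stat


def world_read_write(mode):
    """True if the permission bits grant both read and write to 'other'."""
    perms = stat.S_IMODE(mode)
    return perms & 0o006 == 0o006


if __name__ == "__main__":
    path = "/dev/kvm"
    if os.path.exists(path):
        print(path, "world read/write:", world_read_write(os.stat(path).st_mode))
    else:
        print(path, "is not present on this machine")
```

`stat.S_IMODE` strips the file-type bits, so the same check works on the full `st_mode` of a character device.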
This tries to protect the portal and D-Bus daemon from other processes. Unfortunately, this protection is extremely limited: it currently only switches network and cgroup namespaces.

The single biggest improvement that could be made, by far, is to make all mounts that the portal and bus daemon have access to 'nosymfollow', except for the root filesystem. Unfortunately, I am not aware of how to enforce this on mounts that appear after the service starts.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
 .../run-image/service/vm-services/template/data/service/dbus/run | 1 +
 1 file changed, 1 insertion(+)

diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/dbus/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/dbus/run
index 9b2319265024ab51934157834b280be869afa9b9..4e100ad39e11c802f875ac318c2d908b5e6dd9b8 100755
--- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/dbus/run
+++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/dbus/run
@@ -6,6 +6,7 @@
 importas -i VM VM
 nsenter --mount=${VM}/mount
+unshare --net --ipc
 dbus-daemon --config-file /usr/share/dbus-1/session.conf --print-address 3
--
2.52.0
This restricts the access of these programs to the system. Seccomp is not enabled, though, and the processes still run as root.

Full sandboxing needs additional work. In particular, Cloud Hypervisor should receive access to VFIO devices via file descriptor passing.

D-Bus, the portals, and Weston only unshare cgroup, IPC, network, and UTS namespaces. Unsharing mount namespaces breaks the file portal.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
Changes in v4:
- Unshare cgroup, IPC, network, and UTS namespaces from Weston.
- Unshare cgroup and UTS namespaces from D-Bus.
- Link to v3: https://spectrum-os.org/lists/archives/spectrum-devel/20251203-sandbox-v3-0-...

Changes in v3:
- Protect bus daemon and portal from other services.
- Use s6-softlimit instead of sh to set hard RLIMIT_MEMLOCK.
- Link to v2: https://spectrum-os.org/lists/archives/spectrum-devel/20251201-sandbox-v2-0-...

Changes in v2:
- Sandbox Cloud Hypervisor, virtiofsd, and the router
- Link to v1: https://spectrum-os.org/lists/archives/spectrum-devel/20251129-sandbox-v1-1-...

---
Demi Marie Obenour (6):
      host/rootfs: Sandbox crosvm
      host/rootfs: Sandbox router
      host/rootfs: Unshare a few more namespaces in virtiofsd
      host/rootfs: Sandbox Cloud Hypervisor
      host/rootfs: Try to protect the portal and dbus daemon
      host/rootfs: "Sandbox" Weston

 host/rootfs/default.nix | 4 +--
 .../vm-services/template/data/service/dbus/run | 5 ++++
 .../template/data/service/spectrum-router/run | 19 +++++++++++--
 .../template/data/service/vhost-user-fs/run | 2 +-
 .../template/data/service/vhost-user-gpu/run | 29 +++++++++++++++++++
 host/rootfs/image/etc/s6-rc/weston/run | 5 ++++
 .../image/etc/udev/rules.d/99-spectrum.rules | 3 ++
 host/rootfs/image/usr/bin/run-vmm | 33 +++++++++++++++++++++-
 8 files changed, 93 insertions(+), 7 deletions(-)
---
base-commit: de3a8808f390bdce421077a62107f1d8bdeff22c
change-id: 20251129-sandbox-5a42a6a41b59

--
Sincerely,
Demi Marie Obenour (she/her/hers)
This means that a breach of crosvm is not guaranteed to be fatal. The Wayland socket is still only accessible by root, so crosvm must run as root. The known container escape via /proc/self/exe is blocked by bwrap being on a read-only filesystem. Container escapes via /proc are blocked by remounting /proc read-only. Crosvm does not have CAP_SYS_ADMIN so it cannot change mounts. The two remaining steps are: - Run crosvm as an unprivileged user. - Enable seccomp to block most system calls. The latter should be done from within crosvm itself. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> --- host/rootfs/default.nix | 4 +-- .../template/data/service/vhost-user-gpu/run | 29 ++++++++++++++++++++++ 2 files changed, 31 insertions(+), 2 deletions(-) diff --git a/host/rootfs/default.nix b/host/rootfs/default.nix index ca2084f26d58be5e0e1695634e125032c50f82b2..4716bb7298515b2940cad09bb55e42c196ce7ebc 100644 --- a/host/rootfs/default.nix +++ b/host/rootfs/default.nix @@ -10,7 +10,7 @@ pkgsMusl.callPackage ( { spectrum-host-tools, spectrum-router , lib, stdenvNoCC, nixos, runCommand, writeClosure, erofs-utils, s6-rc -, btrfs-progs, busybox, cloud-hypervisor, cosmic-files, crosvm +, btrfs-progs, bubblewrap, busybox, cloud-hypervisor, cosmic-files, crosvm , cryptsetup, dejavu_fonts, dbus, execline, foot, fuse3, iproute2 , inotify-tools, jq, kmod, mdevd, mesa, mount-flatpak, s6 , s6-linux-init, socat, systemd, util-linuxMinimal, virtiofsd @@ -25,7 +25,7 @@ let trivial; packages = [ - btrfs-progs cloud-hypervisor cosmic-files crosvm cryptsetup dbus + btrfs-progs bubblewrap cloud-hypervisor cosmic-files crosvm cryptsetup dbus execline fuse3 inotify-tools iproute2 jq kmod mdevd mount-flatpak s6 s6-linux-init s6-rc socat spectrum-host-tools spectrum-router util-linuxMinimal virtiofsd xdg-desktop-portal-spectrum-host diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run 
b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run
index 0b4f6a00bc7aed0e721454d584d3bcd47fb18e2a..19d5a61388e6f49c7f722814d6de47227b02da01 100755
--- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run
+++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-gpu/run
@@ -1,9 +1,38 @@
 #!/bin/execlineb -P
 # SPDX-License-Identifier: EUPL-1.2+
 # SPDX-FileCopyrightText: 2025 Alyssa Ross <hi@alyssa.is>
+# SPDX-FileCopyrightText: 2025 Demi Marie Obenour <demiobenour@gmail.com>
 s6-ipcserver -1a 0700 -C 1 -b 1 env/crosvm.sock
+bwrap
+ --unshare-all
+ # --unshare-all only implies --unshare-user-try.
+ # Make this more than a "try".
+ --unshare-user
+ --bind /run/user/0/wayland-1 /run/user/0/wayland-1
+ --ro-bind /usr /usr
+ --ro-bind /lib /lib
+ --tmpfs /tmp
+ --dev /dev
+ --tmpfs /dev/shm
+ --ro-bind /nix /nix
+ --disable-userns
+ --proc /proc
+ --ro-bind /proc/sys /proc/sys
+ --tmpfs /proc/scsi
+ --remount-ro /proc/scsi
+ --tmpfs /proc/acpi
+ --remount-ro /proc/acpi
+ --tmpfs /proc/fs
+ --remount-ro /proc/fs
+ --tmpfs /proc/irq
+ --remount-ro /proc/irq
+ --ro-bind /dev/null /proc/timer_list
+ --ro-bind /dev/null /proc/kcore
+ --ro-bind /dev/null /proc/kallsyms
+ --ro-bind /dev/null /proc/sysrq-trigger
+ --
 crosvm --no-syslog device gpu --fd 0 --wayland-sock /run/user/0/wayland-1
--
2.52.0
This patch has been committed as 62590b86c60e01e20d6cb01ba75b4ef24d99fe73, which can be viewed online at https://spectrum-os.org/git/spectrum/commit/?id=62590b86c60e01e20d6cb01ba75b.... This is an automated message. Send comments/questions/requests to: Alyssa Ross <hi@alyssa.is>
This needs very little access to the system.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
 .../template/data/service/spectrum-router/run | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run
index 7b3e3db3b109ba1c8d195c7c47d50d0cfbc30bd5..ef68cd638c092b53cc714a5d65bbfa3b49585346 100755
--- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run
+++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/spectrum-router/run
@@ -4,6 +4,19 @@
 importas -i VM VM
-export RUST_LOG spectrum-router=debug,info
-spectrum-router --app-listen-path ${VM}/router-app.sock --driver-listen-path ${VM}/router-driver.sock
-
+bwrap
+ --unshare-all
+ --unshare-user
+ --dev-bind / /
+ --setenv RUST_LOG spectrum-router=debug,info
+ --tmpfs /tmp
+ --dev /dev
+ --tmpfs /dev/shm
+ --ro-bind /nix /nix
+ --ro-bind /etc /etc
+ --tmpfs /run
+ --ro-bind /usr /usr
+ --ro-bind /lib /lib
+ --bind $VM $VM
+ --
+ spectrum-router --app-listen-path ${VM}/router-app.sock --driver-listen-path ${VM}/router-driver.sock
--
2.52.0
This patch has been committed as a53b58f6ac1a6f54efae17980939ac20f0c5d99b, which can be viewed online at https://spectrum-os.org/git/spectrum/commit/?id=a53b58f6ac1a6f54efae17980939.... This is an automated message. Send comments/questions/requests to: Alyssa Ross <hi@alyssa.is>
It doesn't need to share IPC, UTS, or cgroup namespaces.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
 .../service/vm-services/template/data/service/vhost-user-fs/run | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-fs/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-fs/run
index bfe66f4607ab07884488df35691ba1c202b26e8e..6bd69ad944a464294ad9a3268c8a63482c7e8040 100755
--- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-fs/run
+++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/vhost-user-fs/run
@@ -13,6 +13,6 @@ export TMPDIR /run
 importas -i VM VM
 nsenter --mount=${VM}/mount
-unshare -U --map-user 1000 --map-group 1000
+unshare -U --map-user 1000 --map-group 1000 --uts --ipc --cgroup
 virtiofsd --fd 3 --shared-dir ${VM}/fs
--
2.52.0
This patch has been committed as 0710918c9edb48247cbffbe20a849a840d332ca6, which can be viewed online at https://spectrum-os.org/git/spectrum/commit/?id=0710918c9edb48247cbffbe20a84.... This is an automated message. Send comments/questions/requests to: Alyssa Ross <hi@alyssa.is>
It only needs access to a small number of resources. Unfortunately, it needs access to /dev/vfio right now. This should be fixed by using file descriptor passing instead.

Furthermore, Cloud Hypervisor needs to be able to lock memory. Running in a user namespace prevents it from having CAP_IPC_LOCK. Therefore, it is necessary to increase RLIMIT_MEMLOCK before running Cloud Hypervisor.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
 .../image/etc/udev/rules.d/99-spectrum.rules | 3 ++
 host/rootfs/image/usr/bin/run-vmm | 33 +++++++++++++++++++++-
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules b/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules
index 337bbe47dbbc6f3828722d8244f2689a39f3090f..de0f682aa40f8481dc3c25a90c695e2326536316 100644
--- a/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules
+++ b/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules
@@ -3,3 +3,6 @@
 # systemd-udevd unsets PATH, so fix that.
 ACTION!="remove", ENV{PCI_CLASS}=="2????", RUN+="/usr/bin/env PATH=/usr/bin /usr/libexec/net-add"
+
+# make /dev/kvm world-accessible
+KERNEL=="kvm", MODE="0666"
diff --git a/host/rootfs/image/usr/bin/run-vmm b/host/rootfs/image/usr/bin/run-vmm
index ba8b59c2677408acdd01c2eda3cf2dd60992d881..24c3d607bfcf6fea6196b61d2941141486d33fd6 100755
--- a/host/rootfs/image/usr/bin/run-vmm
+++ b/host/rootfs/image/usr/bin/run-vmm
@@ -52,5 +52,36 @@ unexport !
 fdmove -c 3 0
 redirfd -r 0 /dev/null
+s6-softlimit -H -l 18446744073709551615
 if { udevadm wait /dev/kvm }
-cloud-hypervisor --api-socket fd=3
+bwrap
+ --unshare-all
+ --unshare-user
+ --dev /dev
+ --dev-bind /dev/kvm /dev/kvm
+ --dev-bind /dev/vfio /dev/vfio
+ --tmpfs /dev/shm
+ --tmpfs /tmp
+ --tmpfs /var/tmp
+ --ro-bind /etc /etc
+ --ro-bind /lib /lib
+ --ro-bind /nix /nix
+ --ro-bind /usr /usr
+ --ro-bind /sys /sys
+ --bind /run /run
+ --proc /proc
+ --ro-bind /proc/sys /proc/sys
+ --tmpfs /proc/scsi
+ --remount-ro /proc/scsi
+ --tmpfs /proc/acpi
+ --remount-ro /proc/acpi
+ --tmpfs /proc/fs
+ --remount-ro /proc/fs
+ --tmpfs /proc/irq
+ --remount-ro /proc/irq
+ --ro-bind /dev/null /proc/timer_list
+ --ro-bind /dev/null /proc/kcore
+ --ro-bind /dev/null /proc/kallsyms
+ --ro-bind /dev/null /proc/sysrq-trigger
+ --
+ cloud-hypervisor --api-socket fd=3
--
2.52.0
Demi Marie Obenour <demiobenour@gmail.com> writes:
It only needs access to a small number of resources. Unfortunately, it needs access to /dev/vfio right now. This should be fixed by using file descriptor passing instead.
Furthermore, Cloud Hypervisor needs to be able to lock memory. Running in a user namespace prevents it from having CAP_IPC_LOCK. Therefore, it is necessary to increase RLIMIT_MEMLOCK before running Cloud Hypervisor.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> --- .../image/etc/udev/rules.d/99-spectrum.rules | 3 ++ host/rootfs/image/usr/bin/run-vmm | 33 +++++++++++++++++++++- 2 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules b/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules index 337bbe47dbbc6f3828722d8244f2689a39f3090f..de0f682aa40f8481dc3c25a90c695e2326536316 100644 --- a/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules +++ b/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules @@ -3,3 +3,6 @@
# systemd-udevd unsets PATH, so fix that. ACTION!="remove", ENV{PCI_CLASS}=="2????", RUN+="/usr/bin/env PATH=/usr/bin /usr/libexec/net-add" + +# make /dev/kvm world-accessible +KERNEL=="kvm", MODE="0666" diff --git a/host/rootfs/image/usr/bin/run-vmm b/host/rootfs/image/usr/bin/run-vmm index ba8b59c2677408acdd01c2eda3cf2dd60992d881..24c3d607bfcf6fea6196b61d2941141486d33fd6 100755 --- a/host/rootfs/image/usr/bin/run-vmm +++ b/host/rootfs/image/usr/bin/run-vmm @@ -52,5 +52,36 @@ unexport ! fdmove -c 3 0 redirfd -r 0 /dev/null
+s6-softlimit -H -l 18446744073709551615
The s6-softlimit documentation says that hard limits should generally only be set once, at boot, and that's what we now do for PipeWire in img/app. Is the idea here that it would be undesirable to increase the hard limit for all processes, so only do it for Cloud Hypervisor?
if { udevadm wait /dev/kvm } -cloud-hypervisor --api-socket fd=3 +bwrap + --unshare-all + --unshare-user + --dev /dev + --dev-bind /dev/kvm /dev/kvm + --dev-bind /dev/vfio /dev/vfio + --tmpfs /dev/shm + --tmpfs /tmp + --tmpfs /var/tmp + --ro-bind /etc /etc + --ro-bind /lib /lib + --ro-bind /nix /nix + --ro-bind /usr /usr + --ro-bind /sys /sys + --bind /run /run + --proc /proc + --ro-bind /proc/sys /proc/sys + --tmpfs /proc/scsi + --remount-ro /proc/scsi + --tmpfs /proc/acpi + --remount-ro /proc/acpi + --tmpfs /proc/fs + --remount-ro /proc/fs + --tmpfs /proc/irq + --remount-ro /proc/irq + --ro-bind /dev/null /proc/timer_list + --ro-bind /dev/null /proc/kcore + --ro-bind /dev/null /proc/kallsyms + --ro-bind /dev/null /proc/sysrq-trigger + -- + cloud-hypervisor --api-socket fd=3
-- 2.52.0
On 12/4/25 09:35, Alyssa Ross wrote:
Demi Marie Obenour <demiobenour@gmail.com> writes:
It only needs access to a small number of resources. Unfortunately, it needs access to /dev/vfio right now. This should be fixed by using file descriptor passing instead.
Furthermore, Cloud Hypervisor needs to be able to lock memory. Running in a user namespace prevents it from having CAP_IPC_LOCK. Therefore, it is necessary to increase RLIMIT_MEMLOCK before running Cloud Hypervisor.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> --- .../image/etc/udev/rules.d/99-spectrum.rules | 3 ++ host/rootfs/image/usr/bin/run-vmm | 33 +++++++++++++++++++++- 2 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules b/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules index 337bbe47dbbc6f3828722d8244f2689a39f3090f..de0f682aa40f8481dc3c25a90c695e2326536316 100644 --- a/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules +++ b/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules @@ -3,3 +3,6 @@
# systemd-udevd unsets PATH, so fix that. ACTION!="remove", ENV{PCI_CLASS}=="2????", RUN+="/usr/bin/env PATH=/usr/bin /usr/libexec/net-add" + +# make /dev/kvm world-accessible +KERNEL=="kvm", MODE="0666" diff --git a/host/rootfs/image/usr/bin/run-vmm b/host/rootfs/image/usr/bin/run-vmm index ba8b59c2677408acdd01c2eda3cf2dd60992d881..24c3d607bfcf6fea6196b61d2941141486d33fd6 100755 --- a/host/rootfs/image/usr/bin/run-vmm +++ b/host/rootfs/image/usr/bin/run-vmm @@ -52,5 +52,36 @@ unexport ! fdmove -c 3 0 redirfd -r 0 /dev/null
+s6-softlimit -H -l 18446744073709551615
The s6-softlimit documentation says that hard limits should generally only be set once, at boot, and that's what we now do for PipeWire in img/app. Is the idea here that it would be undesirable to increase the hard limit for all processes, so only do it for Cloud Hypervisor?
s6-softlimit -H also increases the soft limit. Allowing every process on the system to lock an unlimited amount of memory doesn't seem ideal. For interactive logins, soft limits will be set via PAM, but Spectrum doesn't use PAM yet. This keeps the change localized, rather than having to bump the hard limit everywhere and then undo the change elsewhere. -- Sincerely, Demi Marie Obenour (she/her/hers)
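The value passed to s6-softlimit in the patch, 18446744073709551615, is 2^64 - 1, which is how "unlimited" (RLIM_INFINITY) is represented for resource limits on 64-bit Linux. The Python sketch below illustrates the relationship between soft and hard limits that the reply relies on: a soft limit can only be raised as far as the hard limit allows, so the hard limit has to be raised first, while the process is still privileged. It uses the standard `resource` module; the helper names `is_u64_max` and `can_raise_soft_to` are hypothetical.

```python
import resource

# The argument given to `s6-softlimit -H -l` in the patch.
UNLIMITED_ARG = 18446744073709551615


def is_u64_max(value):
    """True if value is the largest unsigned 64-bit integer."""
    return value == 2**64 - 1


def can_raise_soft_to(target, hard, unlimited=resource.RLIM_INFINITY):
    """Whether a soft limit of `target` is permitted under hard limit `hard`.

    `unlimited` is the sentinel the platform uses for "no limit"; it is a
    parameter so the rule can be exercised independently of the platform.
    """
    if hard == unlimited:
        return True
    if target == unlimited:
        return False
    return target <= hard


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    print("RLIMIT_MEMLOCK soft:", soft, "hard:", hard)
    print("argument is 2**64 - 1:", is_u64_max(UNLIMITED_ARG))
```

Because the limit is set on the process that then executes bwrap and Cloud Hypervisor, only that process tree is affected, which matches the "keeps the change localized" reasoning in the reply.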
Demi Marie Obenour <demiobenour@gmail.com> writes:
On 12/4/25 09:35, Alyssa Ross wrote:
Demi Marie Obenour <demiobenour@gmail.com> writes:
It only needs access to a small number of resources. Unfortunately, it needs access to /dev/vfio right now. This should be fixed by using file descriptor passing instead.
Furthermore, Cloud Hypervisor needs to be able to lock memory. Running in a user namespace prevents it from having CAP_IPC_LOCK. Therefore, it is necessary to increase RLIMIT_MEMLOCK before running Cloud Hypervisor.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> --- .../image/etc/udev/rules.d/99-spectrum.rules | 3 ++ host/rootfs/image/usr/bin/run-vmm | 33 +++++++++++++++++++++- 2 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules b/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules index 337bbe47dbbc6f3828722d8244f2689a39f3090f..de0f682aa40f8481dc3c25a90c695e2326536316 100644 --- a/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules +++ b/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules @@ -3,3 +3,6 @@
# systemd-udevd unsets PATH, so fix that. ACTION!="remove", ENV{PCI_CLASS}=="2????", RUN+="/usr/bin/env PATH=/usr/bin /usr/libexec/net-add" + +# make /dev/kvm world-accessible +KERNEL=="kvm", MODE="0666" diff --git a/host/rootfs/image/usr/bin/run-vmm b/host/rootfs/image/usr/bin/run-vmm index ba8b59c2677408acdd01c2eda3cf2dd60992d881..24c3d607bfcf6fea6196b61d2941141486d33fd6 100755 --- a/host/rootfs/image/usr/bin/run-vmm +++ b/host/rootfs/image/usr/bin/run-vmm @@ -52,5 +52,36 @@ unexport ! fdmove -c 3 0 redirfd -r 0 /dev/null
+s6-softlimit -H -l 18446744073709551615
The s6-softlimit documentation says that hard limits should generally only be set once, at boot, and that's what we now do for PipeWire in img/app. Is the idea here that it would be undesirable to increase the hard limit for all processes, so only do it for Cloud Hypervisor?
s6-softlimit -H also increases the soft limit. Allowing every process on the system to lock an unlimited amount of memory doesn't seem ideal. For interactive logins, soft limits will be set via PAM, but Spectrum doesn't use PAM yet. This keeps the change localized, rather than having to bump the hard limit everywhere and then undo the change elsewhere.
I wonder why the documentation says that, then. I suppose that's something I should take up with skarnet rather than you?
On 12/6/25 12:46, Alyssa Ross wrote:
Demi Marie Obenour <demiobenour@gmail.com> writes:
On 12/4/25 09:35, Alyssa Ross wrote:
Demi Marie Obenour <demiobenour@gmail.com> writes:
It only needs access to a small number of resources. Unfortunately, it needs access to /dev/vfio right now. This should be fixed by using file descriptor passing instead.
Furthermore, Cloud Hypervisor needs to be able to lock memory. Running in a user namespace prevents it from having CAP_IPC_LOCK. Therefore, it is necessary to increase RLIMIT_MEMLOCK before running Cloud Hypervisor.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> --- .../image/etc/udev/rules.d/99-spectrum.rules | 3 ++ host/rootfs/image/usr/bin/run-vmm | 33 +++++++++++++++++++++- 2 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules b/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules index 337bbe47dbbc6f3828722d8244f2689a39f3090f..de0f682aa40f8481dc3c25a90c695e2326536316 100644 --- a/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules +++ b/host/rootfs/image/etc/udev/rules.d/99-spectrum.rules @@ -3,3 +3,6 @@
# systemd-udevd unsets PATH, so fix that. ACTION!="remove", ENV{PCI_CLASS}=="2????", RUN+="/usr/bin/env PATH=/usr/bin /usr/libexec/net-add" + +# make /dev/kvm world-accessible +KERNEL=="kvm", MODE="0666" diff --git a/host/rootfs/image/usr/bin/run-vmm b/host/rootfs/image/usr/bin/run-vmm index ba8b59c2677408acdd01c2eda3cf2dd60992d881..24c3d607bfcf6fea6196b61d2941141486d33fd6 100755 --- a/host/rootfs/image/usr/bin/run-vmm +++ b/host/rootfs/image/usr/bin/run-vmm @@ -52,5 +52,36 @@ unexport ! fdmove -c 3 0 redirfd -r 0 /dev/null
+s6-softlimit -H -l 18446744073709551615
The s6-softlimit documentation says that hard limits should generally only be set once, at boot, and that's what we now do for PipeWire in img/app. Is the idea here that it would be undesirable to increase the hard limit for all processes, so only do it for Cloud Hypervisor?
s6-softlimit -H also increases the soft limit. Allowing every process on the system to lock an unlimited amount of memory doesn't seem ideal. For interactive logins, soft limits will be set via PAM, but Spectrum doesn't use PAM yet. This keeps the change localized, rather than having to bump the hard limit everywhere and then undo the change elsewhere.
I wonder why the documentation says that, then. I suppose that's something I should take up with skarnet rather than you?
I think so. I suspect it's subjective but am not sure. -- Sincerely, Demi Marie Obenour (she/her/hers)
Demi Marie Obenour <demiobenour@gmail.com> writes:
I think so. I suspect it's subjective but am not sure.
Okay. I have enquired.
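The "keep the change localized" idea discussed above can be illustrated with a minimal Python sketch. It is not the patch's mechanism (the patch chain-loads through s6-softlimit in an execline script); it only shows the same pattern of changing RLIMIT_MEMLOCK in a single child between fork and exec, so the parent and unrelated processes keep their limits. The helper names are hypothetical. Unlike the patch, the sketch only raises the soft limit up to the existing hard limit, because raising the hard limit itself normally requires privilege (CAP_SYS_RESOURCE).

```python
import resource
import subprocess
import sys


def _raise_memlock_soft_to_hard() -> None:
    """Runs in the child between fork and exec: lift the soft limit to the hard limit."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    # An unprivileged process may raise its soft limit up to its hard limit.
    resource.setrlimit(resource.RLIMIT_MEMLOCK, (hard, hard))


def run_with_raised_memlock(argv: list[str]) -> subprocess.CompletedProcess:
    """Hypothetical helper: run argv with RLIMIT_MEMLOCK raised for that child only."""
    return subprocess.run(
        argv,
        preexec_fn=_raise_memlock_soft_to_hard,
        check=True,
        capture_output=True,
        text=True,
    )


# Child program that reports whether its own soft limit equals its hard limit.
REPORT = (
    "import resource;"
    "s, h = resource.getrlimit(resource.RLIMIT_MEMLOCK);"
    "print(s == h)"
)

if __name__ == "__main__":
    before = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    result = run_with_raised_memlock([sys.executable, "-c", REPORT])
    print("child soft == hard:", result.stdout.strip())
    print("parent unchanged:", before == resource.getrlimit(resource.RLIMIT_MEMLOCK))
```

Because the limit is changed only after the fork, nothing has to be undone elsewhere, which is the property the reply above argues for.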
This tries to protect the portal and D-Bus daemon from other processes. Unfortunately, this protection is extremely limited: it currently only unshares cgroup, IPC, network, and UTS namespaces.

The single biggest improvement that could be made, by far, is to make all mounts that the portal and bus daemon have access to 'nosymfollow', except for the root filesystem. Unfortunately, I am not aware of how to enforce this on mounts that appear after the service starts.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
 .../run-image/service/vm-services/template/data/service/dbus/run | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/dbus/run b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/dbus/run
index 9b2319265024ab51934157834b280be869afa9b9..3a7dd49415538f1872b984bcc791ef754b6922aa 100755
--- a/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/dbus/run
+++ b/host/rootfs/image/etc/s6-linux-init/run-image/service/vm-services/template/data/service/dbus/run
@@ -6,6 +6,11 @@ importas -i VM VM nsenter --mount=${VM}/mount
+unshare
+ --cgroup
+ --ipc
+ --net
+ --uts
 dbus-daemon --config-file /usr/share/dbus-1/session.conf --print-address 3
--
2.52.0
This patch has been committed as 03c2b4ba23d8086a235367acccb965860132f8b4, which can be viewed online at https://spectrum-os.org/git/spectrum/commit/?id=03c2b4ba23d8086a235367acccb9.... This is an automated message. Send comments/questions/requests to: Alyssa Ross <hi@alyssa.is>
This is not a true sandbox, but it does protect Weston from other code that tries to connect to an abstract namespace socket or System V IPC object. Cgroup, IPC, network, and UTS namespaces are unshared.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
---
 host/rootfs/image/etc/s6-rc/weston/run | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/host/rootfs/image/etc/s6-rc/weston/run b/host/rootfs/image/etc/s6-rc/weston/run
index c1bce8505944b68c75c1b87d1ae736ff655e0f07..12e5d702b976c165249ac9f8078ce6434fbb43b1 100644
--- a/host/rootfs/image/etc/s6-rc/weston/run
+++ b/host/rootfs/image/etc/s6-rc/weston/run
@@ -18,4 +18,9 @@ redirfd -r 0 /dev/tty1 importas -i home HOME cd $home if { udevadm wait /dev/dri/card0 }
+unshare
+ --cgroup
+ --ipc
+ --net
+ --uts
 weston
--
2.52.0
This patch has been committed as a13d3403c1ddbb8dbbbdb05416350b2846162ed1, which can be viewed online at https://spectrum-os.org/git/spectrum/commit/?id=a13d3403c1ddbb8dbbbdb0541635.... This is an automated message. Send comments/questions/requests to: Alyssa Ross <hi@alyssa.is>
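As background for the two unshare patches above, here is a small Python sketch of why abstract namespace sockets are a concern. It assumes Linux; the helper names and the socket name are hypothetical and are not part of Spectrum. An abstract socket has no filesystem path, so file permissions do not protect it: any process in the same network namespace that knows the name can connect. Abstract names are scoped to a network namespace, which is why unsharing the network namespace hides such sockets from outside processes.

```python
import os
import socket


def listen_abstract(name: str) -> socket.socket:
    """Bind and listen on an abstract-namespace Unix socket (Linux only)."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    # A leading NUL byte selects the abstract namespace instead of a path.
    srv.bind("\0" + name)
    srv.listen(1)
    return srv


def can_connect(name: str) -> bool:
    """Return True if some listener in this network namespace owns the name."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        try:
            client.connect("\0" + name)
        except OSError:
            return False
        return True


if __name__ == "__main__":
    demo_name = f"abstract-demo-{os.getpid()}"  # hypothetical name
    server = listen_abstract(demo_name)
    # Same network namespace: the connection succeeds with no permission check.
    print("reachable while listening:", can_connect(demo_name))
    server.close()
    print("reachable after close:", can_connect(demo_name))
```

Running `can_connect` from a process started under `unshare --net` (which needs suitable privileges) would be expected to fail for a name bound in the original namespace; the sketch does not attempt that step so that it stays runnable without privileges.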