Demi Marie Obenour <demiobenour@gmail.com> writes:
On 9/17/25 07:27, Alyssa Ross wrote:
Demi Marie Obenour <demiobenour@gmail.com> writes:
On 9/10/25 11:11, Alyssa Ross wrote:
• These services are part of our TCB anyway. Sandboxing only gets us defense in depth. With that in mind, it's basically never going to be worth adding sandboxing if it adds any amount of attack surface. One example of that would be user namespaces. They've been a consistent source of kernel security issues, and it might be better to turn them off entirely than to use them for sandboxing stuff that's trusted anyway.
Sandboxing virtiofsd is going to be really annoying and will definitely come at a performance cost. The most efficient way to use virtiofsd is to give it CAP_DAC_READ_SEARCH in the initial user namespace and delegate _all_ access control to it. This allows virtiofsd to use open_by_handle_at() for all filesystem access. Unfortunately, this also allows virtiofsd to open any file on the filesystem, ignoring all discretionary access control checks. I don't think Landlock would work either. SELinux or SMACK might work, but using them is significantly more complicated.
If one wants to sandbox virtiofsd, one either needs to use --cache=never or run into an effective resource leak (https://gitlab.com/virtio-fs/virtiofsd/-/issues/194). My hope is that in the future the problem will be solved by DAX and an in-kernel shrinker that is aware of the host resources it is using. Denial of service would be prevented by cgroups on the host, addressing the objection mentioned in the issue comments.
Do we not trust virtiofsd's built-in sandboxing?
I do trust it, provided that it is verifiable (by dumping the state of the process at runtime). However, unrestricted open_by_handle_at() lets the process open any file on the system, conditioned only on the filesystem supporting open_by_handle_at(). Therefore, sandboxing and using handles for all filesystem access are incompatible.
Wouldn't it be limited to only files on the same filesystem, since you have to pass a mount FD to open_by_handle_at()? That's still bad though. So then to start with we just want to make sure it doesn't have CAP_DAC_READ_SEARCH, and then we hope that something comes along to address the limitations of that?
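For the "make sure it doesn't have CAP_DAC_READ_SEARCH" part, a minimal sketch of a runtime check, assuming the capability masks are read from the CapPrm and CapEff lines of /proc/<pid>/status on Linux. The function names are made up for illustration, and the bitmask alone does not say which user namespace the capability applies to, so treat this as a first sanity check rather than proof of anything:

```python
# Sketch only: helper names are hypothetical, not part of virtiofsd or any library.
CAP_DAC_READ_SEARCH = 2  # capability number from <linux/capability.h>


def parse_cap_sets(status_text):
    """Parse the Cap* lines of /proc/<pid>/status into {name: bitmask}."""
    caps = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in ("CapInh", "CapPrm", "CapEff", "CapBnd", "CapAmb"):
            caps[key] = int(value.strip(), 16)  # masks are printed as hex
    return caps


def has_cap(mask, cap):
    """True if bit number `cap` is set in the capability bitmask."""
    return bool((mask >> cap) & 1)


def holds_dac_read_search(status_text):
    """True if the capability is in the permitted or effective set."""
    caps = parse_cap_sets(status_text)
    return any(
        has_cap(caps[name], CAP_DAC_READ_SEARCH) for name in ("CapPrm", "CapEff")
    )
```

Usage would be something like holds_dac_read_search(open(f"/proc/{pid}/status").read()), where pid is whatever PID the running virtiofsd has.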